[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14132
  
Thank you so much always, @gatorsmile !


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14132
  
I do not know the exact context in which you need SQL generation. Tomorrow, I 
will review your PR (https://github.com/apache/spark/pull/14116) to understand 
your issues. 





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14132
  
Oh, now I get it. The problem was the terminology I used. :)





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14132
  
You know, Spark decided to use `createOrReplaceTempView` and deprecated 
`registerTempTable` in 2.0.





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14132
  
That is right. TempView -> Temporary View. 

`Temporal views` is the wrong term. 





[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14235
  
I can understand the advantage of comparing SQL statement strings, but the 
cost is higher than the benefit, especially for the small group of key developers 
who maintain the Spark code every day. 

Some commercial RDBMS vendors have a whole team for developing and 
maintaining SQL generation. However, I am not sure the Spark community can 
afford that. You know, SQL generation is only used for native view support so far.





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14132
  
I do not mean creating a real view, which is your concern. What I need SQL 
generation to produce a view definition for is the result of 
```
spark.range(10).createOrReplaceTempView("t")
new org.apache.spark.sql.catalyst.SQLBuilder(sql("select * from t")).toSQL
```
I do not disagree with your opinion here.





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14132
  
? `createOrReplaceTempView` => `TempView` => Temporary View.
I mean this, @gatorsmile . So, I called it a Spark native temporary view. 
It does not exist in Hive or the external catalog.





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14132
  
Really? Can you show me the files? I think we should correct them.





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14132
  
Yep. I know why you said that. I feel the same way, but Spark 2.0 
uses that term in its API.





[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14235
  
:) Sorry, but I don't think so.
* At every level, we need to prove correctness. The tolerance you want is 
the result of unpredictable removals by the Optimizer.
* Also, this is `LogicalPlanToSQLSuite`. SQL statement comparison is not a 
good way, but it is the only correct way for this module.





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14132
  
I see. That is a temporary table. Temporal table is a term with a different 
meaning. 

The amount of work is not small if we want to support temporary views created 
from any DataFrame/Dataset. 





[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14235
  
Comparing the SQL statement strings looks horrible to me. 

I have a different approach to verify the **correctness**. How about 
comparing the optimized plans, which can tolerate slight changes that do not 
affect the correctness? 
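
The difference between the two comparison strategies can be sketched with a toy expression tree. This is only an illustration under stated assumptions: `Expr`, `optimize`, and `toSQL` below are hypothetical stand-ins written for this note, not Catalyst classes or the suite's real helpers.

```scala
object PlanComparison {
  // Hypothetical miniature expression tree (not Catalyst's Expression).
  sealed trait Expr
  case class Col(name: String) extends Expr
  case class Lit(value: Int) extends Expr
  case class Add(left: Expr, right: Expr) extends Expr

  // Toy "optimizer": folds constant additions and drops additions of zero.
  def optimize(e: Expr): Expr = e match {
    case Add(l, r) =>
      (optimize(l), optimize(r)) match {
        case (Lit(a), Lit(b)) => Lit(a + b)
        case (x, Lit(0))      => x
        case (Lit(0), x)      => x
        case (x, y)           => Add(x, y)
      }
    case other => other
  }

  // Toy SQL generator: renders the tree verbatim, without simplifying it.
  def toSQL(e: Expr): String = e match {
    case Col(n)    => s"`${n}`"
    case Lit(v)    => v.toString
    case Add(l, r) => s"(${toSQL(l)} + ${toSQL(r)})"
  }
}
```

In this sketch, `Add(Col("id"), Add(Lit(1), Lit(2)))` and `Add(Col("id"), Lit(3))` render to different SQL strings, yet `optimize` maps both to the same tree. A string comparison would flag the difference, while a comparison of optimized trees would not.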





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14132
  
Yep. The above example, `spark.range(10).createOrReplaceTempView("t")`.





[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14235
  
First of all, you can update the whole query set with one flag. So 
maintainability is no longer a difficult issue.

I made this PR because `SQL generation` is currently fragile, as [you said 
yesterday](https://github.com/apache/spark/pull/14132#issuecomment-233107976).

We need to prevent unintentional and accidental changes to it before either 
Hint or [SPARK-16576](https://issues.apache.org/jira/browse/SPARK-16576).

IMO, this is a **correctness** issue we should resolve. I hope this PR 
protects Spark from **me**. :)





[GitHub] spark issue #14194: [SPARK-16485][DOC][ML] Fixed several inline formatting i...

2016-07-16 Thread hhbyyh
Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/14194
  
Thanks for finding this.





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14132
  
What are Spark temporal views? Temporal tables?





[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14235#discussion_r71076593
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala ---
@@ -76,7 +79,29 @@ class LogicalPlanToSQLSuite extends SQLBuilderTest with SQLTestUtils {
 }
   }
 
-  private def checkHiveQl(hiveQl: String): Unit = {
+  // Used for generating new query answer files by saving
+  private val saveQuery = false
--- End diff --

Hi, @gatorsmile .
This is my answer to that. :)





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14132
  
Thank you for replying, @gatorsmile . Yes. Right.

In fact, that limitation is important to me. To complete my 
`INFORMATION_SCHEMA` PR (https://github.com/apache/spark/pull/14116), I will need 
view support for Spark temporal views some day. Of course, in that PR I have 
currently removed the VIEW DEFINITION column implementation for many 
reasons.





[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14235
  
It sounds like the latest changes do not resolve Reynold's concern: 
how do we regenerate all the expected SQL queries in bulk? 

You know, for native view support, we are not generating optimal SQL 
statements. You can see the generated SQL is verbose and not readable. However, 
these SQL statements can be optimized by the Catalyst optimizer at runtime. Thus, I 
have the same concern. This approach is not maintainable. 





[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14235
  
**[Test build #62423 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62423/consoleFull)**
 for PR 14235 at commit 
[`9f20a63`](https://github.com/apache/spark/commit/9f20a63d9be027bcb25e1eeecb28658976cdae98).





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14132
  
In the JIRA https://issues.apache.org/jira/browse/SPARK-11012, we 
documented the current limitation of SQL generation:
>Note that not all resolved logical query plan can be converted back to SQL 
query string. Either because it consists of some language structure that has 
not been supported yet, or it doesn't have a SQL representation inherently 
(e.g. query plans built on top of local Scala collections).

The purpose of SQL generation is better native view support. The target 
logical plan must be parsed from a valid HiveQL query statement. Thus, the two 
examples you mentioned above are not supported.
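
The limitation quoted above can be illustrated with a minimal sketch, assuming a hypothetical toy plan type. `Plan`, `Table`, `LocalData`, `Filter`, and this `toSQL` are invented for the illustration and are not Spark's classes; returning `Option` is also only a choice made for the sketch, not a statement about how Spark reports unsupported plans.

```scala
object SqlGenSketch {
  // Hypothetical toy logical plans (not Spark's LogicalPlan hierarchy).
  sealed trait Plan
  case class Table(name: String) extends Plan                   // a catalog table
  case class LocalData(rows: Seq[Int]) extends Plan             // built from a local Scala collection
  case class Filter(condition: String, child: Plan) extends Plan

  // Returns Some(sql) when the plan has a SQL representation, None otherwise.
  def toSQL(plan: Plan): Option[String] = plan match {
    case Table(name)  => Some(s"SELECT * FROM `${name}`")
    case LocalData(_) => None // no table name to refer to, so no SQL text exists
    case Filter(cond, child) =>
      toSQL(child).map(sub => s"SELECT * FROM (${sub}) AS sub WHERE ${cond}")
  }
}
```

In this sketch a filter over a catalog table converts back to SQL, while any plan that contains `LocalData` anywhere in its tree yields `None`, which mirrors the "no SQL representation inherently" case from the JIRA note.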





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14177
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14177
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62422/
Test PASSed.





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14177
  
**[Test build #62422 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62422/consoleFull)**
 for PR 14177 at commit 
[`12899c5`](https://github.com/apache/spark/commit/12899c516547a2f5064639386c5a42530a345ec6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14177
  
**[Test build #62422 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62422/consoleFull)**
 for PR 14177 at commit 
[`12899c5`](https://github.com/apache/spark/commit/12899c516547a2f5064639386c5a42530a345ec6).





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14177
  
Agreed - this could be a bug in SQL/Hive. I'd be interested in digging into 
it a bit more later next week.





[GitHub] spark issue #14223: [SPARK-10614][CORE] Change SystemClock to derive time fr...

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14223
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62421/
Test FAILed.





[GitHub] spark issue #14223: [SPARK-10614][CORE] Change SystemClock to derive time fr...

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14223
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14223: [SPARK-10614][CORE] Change SystemClock to derive time fr...

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14223
  
**[Test build #62421 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62421/consoleFull)**
 for PR 14223 at commit 
[`3dec78a`](https://github.com/apache/spark/commit/3dec78a28147588ce7d88eb4e5790265af8ca28b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14217: [SPARK-16562][SQL] Do not allow downcast in INT32...

2016-07-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/14217#discussion_r71076055
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala ---
@@ -169,6 +169,19 @@ class ParquetIOSuite extends QueryTest with ParquetTest with SharedSQLContext {
 }
   }
 
+  test("SPARK-16562 Do not allow downcast in INT32 based types for normal Parquet reader") {
+    withSQLConf(SQLConf.PARQUET_VECTORIZED_READER_ENABLED.key -> "false") {
+      withTempPath { file =>
+        (1 to 4).map(Tuple1(_)).toDF("a").write.parquet(file.getAbsolutePath)
--- End diff --

Cool, thanks!





[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14235
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14235
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62420/
Test PASSed.





[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14235
  
**[Test build #62420 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62420/consoleFull)**
 for PR 14235 at commit 
[`a3bb306`](https://github.com/apache/spark/commit/a3bb306af576834d93394d4ac636c5b52b6d745a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14223: [SPARK-10614][CORE] Change SystemClock to derive time fr...

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14223
  
**[Test build #62421 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62421/consoleFull)**
 for PR 14223 at commit 
[`3dec78a`](https://github.com/apache/spark/commit/3dec78a28147588ce7d88eb4e5790265af8ca28b).





[GitHub] spark issue #14223: [SPARK-10614][CORE] Change SystemClock to derive time fr...

2016-07-16 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/14223
  
retest this please





[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14235
  
Oh, thank you for the fast review!
Yep. I will update it like that.
The purpose of this PR is to have a stronger `LogicalPlanToSQLSuite`.





[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-16 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14235
  
Hm, the problem with this approach is that we'd need to spend a lot of time 
updating test cases whenever we change SQL generation slightly. I think in 
order to do this, we should put the generated SQL in files and then have a way 
to regenerate all the SQL queries in bulk.
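
A minimal sketch of that golden-file idea, written in Python for brevity rather than in the suite's Scala. The helper name `check_golden`, the `.sql` file layout, and the `regenerate` flag are all hypothetical and are not part of Spark or of this PR:

```python
from pathlib import Path


def check_golden(name, generated_sql, golden_dir, regenerate=False):
    """Compare generated SQL against a stored golden file (hypothetical helper).

    With regenerate=True the golden file is (re)written instead of checked,
    which allows refreshing every expected query in one bulk run.
    """
    path = Path(golden_dir) / f"{name}.sql"
    if regenerate:
        # Bulk-regeneration mode: overwrite the expected output.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(generated_sql)
        return True
    # Normal mode: the generated SQL must match the stored file exactly.
    return path.read_text() == generated_sql
```

Under this assumption, a change to SQL generation would be handled by running the suite once with `regenerate=True` and reviewing the resulting file diffs, instead of editing each test case by hand.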





[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14235
  
**[Test build #62420 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62420/consoleFull)**
 for PR 14235 at commit 
[`a3bb306`](https://github.com/apache/spark/commit/a3bb306af576834d93394d4ac636c5b52b6d745a).





[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-16 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/14235

[SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite to check generated 
SQL directly

## What changes were proposed in this pull request?

This issue improves `LogicalPlanToSQLSuite` to check the generated SQL 
directly by **structure**. So far, `LogicalPlanToSQLSuite` relies on 
`checkHiveQl` to ensure **successful SQL generation** and **answer 
equality**. However, it does not guarantee that the generated SQL is the same 
or that it will not change unnoticed.

The following is an example result of this issue. 
```scala
-checkHiveQl("SELECT * FROM parquet_t0 TABLESAMPLE(0.1 PERCENT) WHERE 1=0")
+checkHiveQl("SELECT * FROM parquet_t0 TABLESAMPLE(0.1 PERCENT) WHERE 1=0",
+  """
+|SELECT `gen_attr` AS `id`
+|FROM (SELECT `gen_attr`
+|  FROM (SELECT `id` AS `gen_attr`
+|FROM `default`.`parquet_t0`
+|TABLESAMPLE(0.1 PERCENT))
+|AS gen_subquery_0
+|  WHERE (1 = 0))
+|  AS parquet_t0
+  """.stripMargin)
```


## How was this patch tested?

Pass the Jenkins tests. This is only a test suite change.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark SPARK-16590

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14235.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14235


commit a3bb306af576834d93394d4ac636c5b52b6d745a
Author: Dongjoon Hyun 
Date:   2016-07-17T01:39:18Z

[SPARK-16590][SQL] Improve LogicalPlanToSQLSuite to check generated SQL 
directly







[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14177
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14177
  
**[Test build #62419 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62419/consoleFull)**
 for PR 14177 at commit 
[`bec4b33`](https://github.com/apache/spark/commit/bec4b3372d8e861d8b3f7c04cf4675a02918808f).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14177
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62419/
Test FAILed.





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread shivaram
Github user shivaram commented on the issue:

https://github.com/apache/spark/pull/14177
  
I just realized that my local build was not using the hive profile. If this 
fails on Jenkins, let's just go back to the original PR. Also, I wonder if this 
is something we should notify the SQL committers about.





[GitHub] spark pull request #14179: [SPARK-16055][SPARKR] warning added while using s...

2016-07-16 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14179#discussion_r71074245
  
--- Diff: R/pkg/R/sparkR.R ---
@@ -155,6 +155,10 @@ sparkR.sparkContext <- function(
 
   existingPort <- Sys.getenv("EXISTING_SPARKR_BACKEND_PORT", "")
   if (existingPort != "") {
+if (length(sparkPackages) != 0) {
--- End diff --

checking 
```
if (length(packages) != 0)
```

sounds like a much better idea!





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14177
  
**[Test build #62419 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62419/consoleFull)**
 for PR 14177 at commit 
[`bec4b33`](https://github.com/apache/spark/commit/bec4b3372d8e861d8b3f7c04cf4675a02918808f).





[GitHub] spark pull request #13785: [SPARK-15613] set correct TimeZone in test cases ...

2016-07-16 Thread ckadner
Github user ckadner closed the pull request at:

https://github.com/apache/spark/pull/13785





[GitHub] spark issue #13785: [SPARK-15613] set correct TimeZone in test cases "to UTC...

2016-07-16 Thread ckadner
Github user ckadner commented on the issue:

https://github.com/apache/spark/pull/13785
  
Closing this PR since another more comprehensive fix was committed from PR 
#13784 





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14177
  
I didn't think that should be needed since SparkSession was created with 
enableHiveSupport = F. Let me try that too.






[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread shivaram
Github user shivaram commented on the issue:

https://github.com/apache/spark/pull/14177
  
Hmm, OK. The only difference in the patch I tried out locally is that I had 
the `sleep` in the loop test case. Did you remove that for some other reason?





[GitHub] spark issue #13785: [SPARK-15613] set correct TimeZone in test cases "to UTC...

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13785
  
Can one of the admins verify this patch?





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14177
  
Jenkins failed with:
```
123456789a.bcdefghijklmnopqrstuvwxyzABCDS...EFGHIJKLMNOPQRSTUVW
[Stage 61:> (0 + 0) 
/ 2]


XYZSError in invokeJava(isStatic = TRUE, className, methodName, 
...) : 
  java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at 
org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
at 
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:358)
at 
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:262)
at 
org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
at 
org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
at 
org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:46)
at org.apache.spark.sql.hive.HiveSharedState.externa
```





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14177
  
**[Test build #62418 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62418/consoleFull)**
 for PR 14177 at commit 
[`c7e7592`](https://github.com/apache/spark/commit/c7e7592090a2e3ef84029bc9112ce27abf085e40).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14177
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62418/
Test FAILed.





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14177
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14177
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62417/
Test FAILed.





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14177
  
**[Test build #62417 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62417/consoleFull)**
 for PR 14177 at commit 
[`03f163a`](https://github.com/apache/spark/commit/03f163a660f7a4abd9d99449524396ad91830e24).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14177
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14177
  
**[Test build #62418 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62418/consoleFull)**
 for PR 14177 at commit 
[`c7e7592`](https://github.com/apache/spark/commit/c7e7592090a2e3ef84029bc9112ce27abf085e40).





[GitHub] spark pull request #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, pu...

2016-07-16 Thread ahmed-mahran
Github user ahmed-mahran commented on a diff in the pull request:

https://github.com/apache/spark/pull/14234#discussion_r71073776
  
--- Diff: docs/structured-streaming-programming-guide.md ---
@@ -1093,12 +1067,10 @@ spark.streams().awaitAnyTermination()   # block 
until any one of them terminates
 
 
 
-Finally, for asynchronous monitoring of streaming queries, you can create 
and attach a `StreamingQueryListener` (

-[Scala](api/scala/index.html#org.apache.spark.sql.streaming.StreamingQueryListener)/

-[Java](api/java/org/apache/spark/sql/streaming/StreamingQueryListener.html) 
docs), which will give you regular callback-based updates when queries are 
started and terminated.
+Finally, for asynchronous monitoring of streaming queries, you can create 
and attach a `StreamingQueryListener` 
([Scala](api/scala/index.html#org.apache.spark.sql.streaming.StreamingQueryListener)/[Java](api/java/org/apache/spark/sql/streaming/StreamingQueryListener.html)
 docs), which will give you regular callback-based updates when queries are 
started and terminated.
 
 ## Recovering from Failures with Checkpointing 
-In case of a failure or intentional shutdown, you can recover the previous 
progress and state of a previous query, and continue where it left off. This is 
done using checkpointing and write ahead logs. You can configure a query with a 
checkpoint location, and the query will save all the progress information (i.e. 
range of offsets processed in each trigger), and the running aggregates (e.g. 
word counts in the quick example) will be saved the checkpoint location. As of 
Spark 2.0, this checkpoint location has to be a path in a HDFS compatible file 
system, and can be set as an option in the DataStreamWriter when [starting a 
query](#starting-streaming-queries). 
--- End diff --

Added anchor `[quick example](#quick-example)`





[GitHub] spark pull request #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, pu...

2016-07-16 Thread ahmed-mahran
Github user ahmed-mahran commented on a diff in the pull request:

https://github.com/apache/spark/pull/14234#discussion_r71073767
  
--- Diff: docs/structured-streaming-programming-guide.md ---
@@ -620,16 +603,14 @@ df.groupBy("type").count()
 ### Window Operations on Event Time
 Aggregations over a sliding event-time window are straightforward with 
Structured Streaming. The key idea to understand about window-based 
aggregations are very similar to grouped aggregations. In a grouped 
aggregation, aggregate values (e.g. counts) are maintained for each unique 
value in the user-specified grouping column. In case of window-based 
aggregations, aggregate values are maintained for each window the event-time of 
a row falls into. Let's understand this with an illustration. 
 
-Imagine our quick example is modified and the stream now contains lines 
along with the time when the line was generated. Instead of running word 
counts, we want to count words within 10 minute windows, updating every 5 
minutes. That is, word counts in words received between 10 minute windows 12:00 
- 12:10, 12:05 - 12:15, 12:10 - 12:20, etc. Note that 12:00 - 12:10 means data 
that arrived after 12:00 but before 12:10. Now, consider a word that was 
received at 12:07. This word should increment the counts corresponding to two 
windows 12:00 - 12:10 and 12:05 - 12:15. So the counts will be indexed by both, 
the grouping key (i.e. the word) and the window (can be calculated from the 
event-time).
--- End diff --

Added anchor `[quick example](#quick-example)`
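
For reference, the window assignment described in the quoted paragraph can be sketched as below. This is only an illustration in plain Python, not Structured Streaming code: the function name `windows_for` is hypothetical, and it assumes windows are half-open intervals `[start, end)` aligned to multiples of the slide interval.

```python
from datetime import datetime, timedelta


def windows_for(event_time,
                window=timedelta(minutes=10),
                slide=timedelta(minutes=5)):
    """Return every [start, end) window that contains event_time."""
    epoch = datetime(1970, 1, 1)
    # Latest window start at or before the event, aligned to the slide.
    last_start = event_time - ((event_time - epoch) % slide)
    result = []
    start = last_start
    # Walk backwards while the window still covers the event.
    while start + window > event_time:
        result.append((start, start + window))
        start -= slide
    return sorted(result)


# A word received at 12:07 belongs to two windows.
for start, end in windows_for(datetime(2016, 7, 16, 12, 7)):
    print(start.strftime("%H:%M"), "-", end.strftime("%H:%M"))
```

With these assumptions the 12:07 event lands in 12:00 - 12:10 and 12:05 - 12:15, so a count would be keyed by both the word and each of those windows.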





[GitHub] spark pull request #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, pu...

2016-07-16 Thread ahmed-mahran
Github user ahmed-mahran commented on a diff in the pull request:

https://github.com/apache/spark/pull/14234#discussion_r71073763
  
--- Diff: docs/structured-streaming-programming-guide.md ---
@@ -519,10 +502,10 @@ csvDF = spark \
 
 
 
-These examples generate streaming DataFrames that are untyped, meaning 
that the schema of the DataFrame is not checked at compile time, only checked 
at runtime when the query is submitted. Some operations like `map`, `flatMap`, 
etc. need the type to be known at compile time. To do those, you can convert 
these untyped streaming DataFrames to typed streaming Datasets using the same 
methods as static DataFrame. See the SQL Programming Guide for more details. 
Additionally, more details on the supported streaming sources are discussed 
later in the document.
--- End diff --

Added link `[SQL Programming Guide](sql-programming-guide.html)`





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14177
  
**[Test build #62417 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62417/consoleFull)**
 for PR 14177 at commit 
[`03f163a`](https://github.com/apache/spark/commit/03f163a660f7a4abd9d99449524396ad91830e24).





[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-16 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14177
  
No luck, but I pushed that change to see if it works better in Jenkins. Is 
that what you are referring to?

I'm consistently getting these errors:
```
Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
  java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
at 
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:358)
at 
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:262)
at 
org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
at 
org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
at 
org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:46)
at org.apache.spark.sql.hive.HiveSharedState.externa
Calls: test_package ... with_reporter -> force -> source_file -> eval -> 
eval

java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
at 
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:358)
at 
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:262)
at 
org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
at 
org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
at 
org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:46)
at 
org.apache.spark.sql.hive.HiveSharedState.externalCatalog(HiveSharedState.scala:45)
at 
org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:50)
at 
org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
at 
org.apache.spark.sql.hive.HiveSessionState$$anon$1.<init>(HiveSessionState.scala:63)
at 
org.apache.spark.sql.hive.HiveSessionState.analyzer$lzycompute(HiveSessionState.scala:63)
at 
org.apache.spark.sql.hive.HiveSessionState.analyzer(HiveSessionState.scala:62)
at 
org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
at 
org.apache.spark.sql.SparkSession.createDataFrame(SparkSession.scala:527)
at 
org.apache.spark.sql.SparkSession.createDataFrame(SparkSession.scala:291)
at org.apache.spark.sql.api.r.SQLUtils$.createDF(SQLUtils.scala:139)
at org.apache.spark.sql.api.r.SQLUtils.createDF(SQLUtils.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:141)
at 
org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:86)
at 
org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:38)
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
at 

[GitHub] spark pull request #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, pu...

2016-07-16 Thread ahmed-mahran
Github user ahmed-mahran commented on a diff in the pull request:

https://github.com/apache/spark/pull/14234#discussion_r71073751
  
--- Diff: docs/structured-streaming-programming-guide.md ---
@@ -439,7 +422,7 @@ Here are some examples.
 
 
 {% highlight scala %}
-val spark: SparkSession = … 
--- End diff --

Using the same convention; it is `...` everywhere else.





[GitHub] spark pull request #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, pu...

2016-07-16 Thread ahmed-mahran
Github user ahmed-mahran commented on a diff in the pull request:

https://github.com/apache/spark/pull/14234#discussion_r71073746
  
--- Diff: docs/structured-streaming-programming-guide.md ---
@@ -410,26 +398,21 @@ see how this model handles event-time based 
processing and late arriving data.
 ## Handling Event-time and Late Data
 Event-time is the time embedded in the data itself. For many applications, 
you may want to operate on this event-time. For example, if you want to get the 
number of events generated by IoT devices every minute, then you probably want 
to use the time when the data was generated (that is, event-time in the data), 
rather than the time Spark receives them. This event-time is very naturally 
expressed in this model -- each event from the devices is a row in the table, 
and event-time is a column value in the row. This allows window-based 
aggregations (e.g. number of event every minute) to be just a special type of 
grouping and aggregation on the even-time column -- each time window is a group 
and each row can belong to multiple windows/groups. Therefore, such 
event-time-window-based aggregation queries can be defined consistently on both 
a static dataset (e.g. from collected device events logs) as well as on a data 
stream, making the life of the user much easier.
 
-Furthermore this model naturally handles data that has arrived later than 
expected based on its event-time. Since Spark is updating the Result Table, it 
has full control over updating/cleaning up the aggregates when there is late 
data. While not yet implemented in Spark 2.0, event-time watermarking will be 
used to manage this data. These are explained later in more details in the 
[Window Operations](#window-operations-on-event-time) section.
+Furthermore, this model naturally handles data that has arrived later than 
expected based on its event-time. Since Spark is updating the Result Table, it 
has full control over updating/cleaning up the aggregates when there is late 
data. While not yet implemented in Spark 2.0, event-time watermarking will be 
used to manage this data. These are explained later in more details in the 
[Window Operations](#window-operations-on-event-time) section.
 
 ## Fault Tolerance Semantics
 Delivering end-to-end exactly-once semantics was one of key goals behind 
the design of Structured Streaming. To achieve that, we have designed the 
Structured Streaming sources, the sinks and the execution engine to reliably 
track the exact progress of the processing so that it can handle any kind of 
failure by restarting and/or reprocessing. Every streaming source is assumed to 
have offsets (similar to Kafka offsets, or Kinesis sequence numbers)
 to track the read position in the stream. The engine uses checkpointing 
and write ahead logs to record the offset range of the data being processed in 
each trigger. The streaming sinks are designed to be idempotent for handling 
reprocessing. Together, using replayable sources and idempotant sinks, 
Structured Streaming can ensure **end-to-end exactly-once semantics** under any 
failure.
 
 # API using Datasets and DataFrames
-Since Spark 2.0, DataFrames and Datasets can represent static, bounded 
data, as well as streaming, unbounded data. Similar to static 
Datasets/DataFrames, you can use the common entry point `SparkSession` (
-[Scala](api/scala/index.html#org.apache.spark.sql.SparkSession)/
-[Java](api/java/org/apache/spark/sql/SparkSession.html)/
-[Python](api/python/pyspark.sql.html#pyspark.sql.SparkSession) docs) to 
create streaming DataFrames/Datasets from streaming sources, and apply the same 
operations on them as static DataFrames/Datasets. If you are not familiar with 
Datasets/DataFrames, you are strongly advised to familiarize yourself with them 
using the 
+Since Spark 2.0, DataFrames and Datasets can represent static, bounded 
data, as well as streaming, unbounded data. Similar to static 
Datasets/DataFrames, you can use the common entry point `SparkSession` 
([Scala](api/scala/index.html#org.apache.spark.sql.SparkSession)/[Java](api/java/org/apache/spark/sql/SparkSession.html)/[Python](api/python/pyspark.sql.html#pyspark.sql.SparkSession)
 docs) to create streaming DataFrames/Datasets from streaming sources, and 
apply the same operations on them as static DataFrames/Datasets. If you are not 
familiar with Datasets/DataFrames, you are strongly advised to familiarize 
yourself with them using the 
 [DataFrame/Dataset Programming Guide](sql-programming-guide.html).
 
 ## Creating streaming DataFrames and streaming Datasets
 Streaming DataFrames can be created through the `DataStreamReader` 
interface 

-([Scala](api/scala/index.html#org.apache.spark.sql.streaming.DataStreamReader)/
-[Java](api/java/org/apache/spark/sql/streaming/DataStreamReader.html)/


[GitHub] spark pull request #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, pu...

2016-07-16 Thread ahmed-mahran
Github user ahmed-mahran commented on a diff in the pull request:

https://github.com/apache/spark/pull/14234#discussion_r71073739
  
--- Diff: docs/structured-streaming-programming-guide.md ---
@@ -410,26 +398,21 @@ see how this model handles event-time based 
processing and late arriving data.
 ## Handling Event-time and Late Data
 Event-time is the time embedded in the data itself. For many applications, 
you may want to operate on this event-time. For example, if you want to get the 
number of events generated by IoT devices every minute, then you probably want 
to use the time when the data was generated (that is, event-time in the data), 
rather than the time Spark receives them. This event-time is very naturally 
expressed in this model -- each event from the devices is a row in the table, 
and event-time is a column value in the row. This allows window-based 
aggregations (e.g. number of event every minute) to be just a special type of 
grouping and aggregation on the even-time column -- each time window is a group 
and each row can belong to multiple windows/groups. Therefore, such 
event-time-window-based aggregation queries can be defined consistently on both 
a static dataset (e.g. from collected device events logs) as well as on a data 
stream, making the life of the user much easier.
 
-Furthermore this model naturally handles data that has arrived later than 
expected based on its event-time. Since Spark is updating the Result Table, it 
has full control over updating/cleaning up the aggregates when there is late 
data. While not yet implemented in Spark 2.0, event-time watermarking will be 
used to manage this data. These are explained later in more details in the 
[Window Operations](#window-operations-on-event-time) section.
+Furthermore, this model naturally handles data that has arrived later than 
expected based on its event-time. Since Spark is updating the Result Table, it 
has full control over updating/cleaning up the aggregates when there is late 
data. While not yet implemented in Spark 2.0, event-time watermarking will be 
used to manage this data. These are explained later in more details in the 
[Window Operations](#window-operations-on-event-time) section.
 
 ## Fault Tolerance Semantics
 Delivering end-to-end exactly-once semantics was one of key goals behind 
the design of Structured Streaming. To achieve that, we have designed the 
Structured Streaming sources, the sinks and the execution engine to reliably 
track the exact progress of the processing so that it can handle any kind of 
failure by restarting and/or reprocessing. Every streaming source is assumed to 
have offsets (similar to Kafka offsets, or Kinesis sequence numbers)
 to track the read position in the stream. The engine uses checkpointing 
and write ahead logs to record the offset range of the data being processed in 
each trigger. The streaming sinks are designed to be idempotent for handling 
reprocessing. Together, using replayable sources and idempotant sinks, 
Structured Streaming can ensure **end-to-end exactly-once semantics** under any 
failure.
 
 # API using Datasets and DataFrames
-Since Spark 2.0, DataFrames and Datasets can represent static, bounded 
data, as well as streaming, unbounded data. Similar to static 
Datasets/DataFrames, you can use the common entry point `SparkSession` (
-[Scala](api/scala/index.html#org.apache.spark.sql.SparkSession)/
--- End diff --

There are many cases like this later in the file.





[GitHub] spark pull request #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, pu...

2016-07-16 Thread ahmed-mahran
Github user ahmed-mahran commented on a diff in the pull request:

https://github.com/apache/spark/pull/14234#discussion_r71073736
  
--- Diff: docs/structured-streaming-programming-guide.md ---
@@ -410,26 +398,21 @@ see how this model handles event-time based 
processing and late arriving data.
 ## Handling Event-time and Late Data
 Event-time is the time embedded in the data itself. For many applications, 
you may want to operate on this event-time. For example, if you want to get the 
number of events generated by IoT devices every minute, then you probably want 
to use the time when the data was generated (that is, event-time in the data), 
rather than the time Spark receives them. This event-time is very naturally 
expressed in this model -- each event from the devices is a row in the table, 
and event-time is a column value in the row. This allows window-based 
aggregations (e.g. number of event every minute) to be just a special type of 
grouping and aggregation on the even-time column -- each time window is a group 
and each row can belong to multiple windows/groups. Therefore, such 
event-time-window-based aggregation queries can be defined consistently on both 
a static dataset (e.g. from collected device events logs) as well as on a data 
stream, making the life of the user much easier.
 
-Furthermore this model naturally handles data that has arrived later than 
expected based on its event-time. Since Spark is updating the Result Table, it 
has full control over updating/cleaning up the aggregates when there is late 
data. While not yet implemented in Spark 2.0, event-time watermarking will be 
used to manage this data. These are explained later in more details in the 
[Window Operations](#window-operations-on-event-time) section.
+Furthermore, this model naturally handles data that has arrived later than 
expected based on its event-time. Since Spark is updating the Result Table, it 
has full control over updating/cleaning up the aggregates when there is late 
data. While not yet implemented in Spark 2.0, event-time watermarking will be 
used to manage this data. These are explained later in more details in the 
[Window Operations](#window-operations-on-event-time) section.
 
 ## Fault Tolerance Semantics
 Delivering end-to-end exactly-once semantics was one of key goals behind 
the design of Structured Streaming. To achieve that, we have designed the 
Structured Streaming sources, the sinks and the execution engine to reliably 
track the exact progress of the processing so that it can handle any kind of 
failure by restarting and/or reprocessing. Every streaming source is assumed to 
have offsets (similar to Kafka offsets, or Kinesis sequence numbers)
 to track the read position in the stream. The engine uses checkpointing 
and write ahead logs to record the offset range of the data being processed in 
each trigger. The streaming sinks are designed to be idempotent for handling 
reprocessing. Together, using replayable sources and idempotant sinks, 
Structured Streaming can ensure **end-to-end exactly-once semantics** under any 
failure.
 
 # API using Datasets and DataFrames
-Since Spark 2.0, DataFrames and Datasets can represent static, bounded 
data, as well as streaming, unbounded data. Similar to static 
Datasets/DataFrames, you can use the common entry point `SparkSession` (
-[Scala](api/scala/index.html#org.apache.spark.sql.SparkSession)/
--- End diff --

Changed `( Scala/ Java/ Python docs)` to `(Scala/Java/Python docs)`.





[GitHub] spark pull request #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, pu...

2016-07-16 Thread ahmed-mahran
Github user ahmed-mahran commented on a diff in the pull request:

https://github.com/apache/spark/pull/14234#discussion_r71073721
  
--- Diff: docs/structured-streaming-programming-guide.md ---
@@ -223,7 +211,7 @@ $ ./bin/run-example 
org.apache.spark.examples.sql.streaming.JavaStructuredNetwor
 {% endhighlight %}
 
 
- {% highlight bash %}   
--- End diff --

The trailing spaces add an unnecessary line to the snippet.





[GitHub] spark pull request #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, pu...

2016-07-16 Thread ahmed-mahran
Github user ahmed-mahran commented on a diff in the pull request:

https://github.com/apache/spark/pull/14234#discussion_r71073714
  
--- Diff: docs/structured-streaming-programming-guide.md ---
@@ -65,11 +51,13 @@ val words = lines.as[String].flatMap(_.split(" "))
 val wordCounts = words.groupBy("value").count()
 {% endhighlight %}
 
-This `lines` DataFrame represents an unbounded table containing the 
streaming text data. This table contains one column of strings named 
“value”, and each line in the streaming text data becomes a row in the 
table. Note, that this is not currently receiving any data as we are just 
setting up the transformation, and have not yet started it. Next, we have 
converted the DataFrame to a  Dataset of String using `.as(Encoders.STRING())`, 
so that we can apply the `flatMap` operation to split each line into multiple 
words. The resultant `words` Dataset contains all the words. Finally, we have 
defined the `wordCounts` DataFrame by grouping by the unique values in the 
Dataset and counting them. Note that this is a streaming DataFrame which 
represents the running word counts of the stream.
--- End diff --

Changed `.as(Encoders.STRING())` (the Java form) to `.as[String]` (the Scala form).





[GitHub] spark pull request #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, pu...

2016-07-16 Thread ahmed-mahran
Github user ahmed-mahran commented on a diff in the pull request:

https://github.com/apache/spark/pull/14234#discussion_r71073700
  
--- Diff: docs/structured-streaming-programming-guide.md ---
@@ -82,8 +70,6 @@ SparkSession spark = SparkSession
 .builder()
 .appName("JavaStructuredNetworkWordCount")
 .getOrCreate();
-
-import spark.implicits._
--- End diff --

Moved to the `Scala` snippet.





[GitHub] spark pull request #14173: [SPARKR][SPARK-16507] Add a CRAN checker, fix Rd ...

2016-07-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14173





[GitHub] spark pull request #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, pu...

2016-07-16 Thread ahmed-mahran
Github user ahmed-mahran commented on a diff in the pull request:

https://github.com/apache/spark/pull/14234#discussion_r71073691
  
--- Diff: docs/structured-streaming-programming-guide.md ---
@@ -14,29 +14,13 @@ Structured Streaming is a scalable and fault-tolerant 
stream processing engine b
 
 # Quick Example
 Let’s say you want to maintain a running word count of text data 
received from a data server listening on a TCP socket. Let’s see how you can 
express this using Structured Streaming. You can see the full code in 

-[Scala]({{site.SPARK_GITHUB_URL}}/blob/master/examples/src/main/scala/org/apache/spark/examples/sql/streaming/StructuredNetworkWordCount.scala)/

-[Java]({{site.SPARK_GITHUB_URL}}/blob/master/examples/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredNetworkWordCount.java)/

-[Python]({{site.SPARK_GITHUB_URL}}/blob/master/examples/src/main/python/sql/streaming/structured_network_wordcount.py).
 And if you 
-[download Spark](http://spark.apache.org/downloads.html), you can directly 
run the example. In any case, let’s walk through the example step-by-step and 
understand how it works. First, we have to import the necessary classes and 
create a local SparkSession, the starting point of all functionalities related 
to Spark.

+[Scala]({{site.SPARK_GITHUB_URL}}/blob/master/examples/src/main/scala/org/apache/spark/examples/sql/streaming/StructuredNetworkWordCount.scala)/[Java]({{site.SPARK_GITHUB_URL}}/blob/master/examples/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredNetworkWordCount.java)/[Python]({{site.SPARK_GITHUB_URL}}/blob/master/examples/src/main/python/sql/streaming/structured_network_wordcount.py).
 And if you 
+[download Spark](http://spark.apache.org/downloads.html), you can directly 
run the example. In any case, let’s walk through the example step-by-step and 
understand how it works.
 
 
 
 
-
--- End diff --

Removing empty `` elements





[GitHub] spark issue #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, punctuati...

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14234
  
Can one of the admins verify this patch?





[GitHub] spark pull request #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, pu...

2016-07-16 Thread ahmed-mahran
GitHub user ahmed-mahran opened a pull request:

https://github.com/apache/spark/pull/14234

[MINOR][SQL][STREAMING][DOCS] Fix minor typos, punctuations and grammar

## What changes were proposed in this pull request?

Minor fixes correcting some typos, punctuations, grammar.
Adding more anchors for easy navigation.
Fixing minor issues with code snippets.


## How was this patch tested?

`jekyll serve`


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ahmed-mahran/spark b-struct-streaming-docs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14234.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14234


commit 4b566b1b5d21af24032701c17b41c7e411659b92
Author: Ahmed Mahran 
Date:   2016-07-16T23:58:02Z

Fix minor typos, punctuations and grammar

Minor fixes correcting some typos, punctuations, grammar. Adding
more anchors for easy navigation.







[GitHub] spark issue #14173: [SPARKR][SPARK-16507] Add a CRAN checker, fix Rd aliases

2016-07-16 Thread shivaram
Github user shivaram commented on the issue:

https://github.com/apache/spark/pull/14173
  
Cool. Merging this to master, branch-2.0





[GitHub] spark pull request #14179: [SPARK-16055][SPARKR] warning added while using s...

2016-07-16 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/14179#discussion_r71073655
  
--- Diff: R/pkg/R/sparkR.R ---
@@ -155,6 +155,10 @@ sparkR.sparkContext <- function(
 
   existingPort <- Sys.getenv("EXISTING_SPARKR_BACKEND_PORT", "")
   if (existingPort != "") {
+if (length(sparkPackages) != 0) {
--- End diff --

I looked into this, and we get the unit test error because the check here 
isn't correct. We get `sparkPackages` as `""`, and in R the length of an empty 
string is not 0 but 1. 
```
> length("")
[1] 1
```
I think we should instead check whether the length of `packages` is zero. 
(`packages` is a list created by splitting the input in `processSparkPackages`.)
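
To make the point above concrete, here is a minimal sketch in plain R of why a `length()` check on the raw string misfires and how a check on the split result behaves instead. The helper names `split_packages`, `has_packages_naive`, and `has_packages` are hypothetical and only illustrate the idea; they are not the actual `processSparkPackages` code.

```r
# Hypothetical helper (an assumption for illustration, not SparkR code):
# split a comma-separated package string into a character vector and
# drop empty entries.
split_packages <- function(sparkPackages) {
  parts <- trimws(unlist(strsplit(sparkPackages, ",", fixed = TRUE)))
  parts[nzchar(parts)]
}

# Naive check: counts vector elements, so "" (one empty string) passes.
has_packages_naive <- function(sparkPackages) {
  length(sparkPackages) != 0
}

# Check on the split result: an empty input yields zero packages.
has_packages <- function(sparkPackages) {
  length(split_packages(sparkPackages)) != 0
}

has_packages_naive("")               # TRUE, because length("") is 1
has_packages("")                     # FALSE
has_packages("com.example:pkg:1.0")  # TRUE
```

An alternative with the same effect for a single string would be `nzchar(sparkPackages)`, which tests for a non-empty string rather than a non-empty vector.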





[GitHub] spark pull request #14090: [SPARK-16112][SparkR] Programming guide for gappl...

2016-07-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14090





[GitHub] spark issue #14090: [SPARK-16112][SparkR] Programming guide for gapply/gappl...

2016-07-16 Thread shivaram
Github user shivaram commented on the issue:

https://github.com/apache/spark/pull/14090
  
Merging this to master, branch-2.0





[GitHub] spark issue #14090: [SPARK-16112][SparkR] Programming guide for gapply/gappl...

2016-07-16 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14090
  
LGTM. thanks for putting this together!





[GitHub] spark issue #14179: [SPARK-16055][SPARKR] warning added while using sparkPac...

2016-07-16 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14179
  
A couple of test files reuse an existing SparkSession/SparkContext by 
calling `sparkR.sparkContext`. I think they are now somehow hitting the warning 
statement added in this PR. I'm not sure which ones are calling 
`sparkR.sparkContext` with `sparkPackages` set. If you find them, you can add 
`suppressWarnings()` around the call.

```
SerDe functionality : Error in sparkR.sparkContext(master, appName, 
sparkHome, sparkConfigMap,  : 
  (converted from warning) sparkPackages has no effect when using 
spark-submit or sparkR shell,please use the --packages commandline instead
Calls: test_package ... eval -> eval -> sparkR.session -> 
sparkR.sparkContext
```






[GitHub] spark issue #14173: [SPARKR][SPARK-16507] Add a CRAN checker, fix Rd aliases

2016-07-16 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14173
  
Yap - I'll check if there's anything left as per SPARK-16508 





[GitHub] spark pull request #14210: [SPARK-16556] [SPARK-16559] [SQL] Fix Two Bugs in...

2016-07-16 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request:

https://github.com/apache/spark/pull/14210#discussion_r71072892
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala 
---
@@ -1264,6 +1265,29 @@ class DDLSuite extends QueryTest with 
SharedSQLContext with BeforeAndAfterEach {
 }
   }
 
+  test("create table using cluster by without schema specification") {
--- End diff --

s/cluster/clustered? Or even the uppercase variant `CLUSTERED BY`, as in the 
other descriptions.





[GitHub] spark pull request #14210: [SPARK-16556] [SPARK-16559] [SQL] Fix Two Bugs in...

2016-07-16 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request:

https://github.com/apache/spark/pull/14210#discussion_r71072862
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/CreateTableAsSelectSuite.scala
 ---
@@ -199,7 +200,7 @@ class CreateTableAsSelectSuite extends DataSourceTest 
with SharedSQLContext with
 }
   }
 
-  test("create table using as select - with bucket") {
+  test("create table using as select - with non-zero bucket") {
--- End diff --

s/bucket/buckets?





[GitHub] spark pull request #14210: [SPARK-16556] [SPARK-16559] [SQL] Fix Two Bugs in...

2016-07-16 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request:

https://github.com/apache/spark/pull/14210#discussion_r71072854
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/CreateTableAsSelectSuite.scala
 ---
@@ -212,7 +213,23 @@ class CreateTableAsSelectSuite extends DataSourceTest 
with SharedSQLContext with
   )
   val table = catalog.getTableMetadata(TableIdentifier("t"))
   assert(DDLUtils.getBucketSpecFromTableProperties(table) ==
-Some(BucketSpec(5, Seq("a"), Seq("b"))))
+Option(BucketSpec(5, Seq("a"), Seq("b"))))
+}
+  }
+
+  test("create table using as select - with zero bucket") {
--- End diff --

s/bucket/buckets?





[GitHub] spark pull request #14210: [SPARK-16556] [SPARK-16559] [SQL] Fix Two Bugs in...

2016-07-16 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request:

https://github.com/apache/spark/pull/14210#discussion_r71072842
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala 
---
@@ -1264,6 +1265,29 @@ class DDLSuite extends QueryTest with 
SharedSQLContext with BeforeAndAfterEach {
 }
   }
 
+  test("create table using cluster by without schema specification") {
+import testImplicits._
+withTempPath { tempDir =>
+  withTable("jsonTable") {
+(("a", "b") :: Nil).toDF().toJSON.rdd.saveAsTextFile(tempDir.getCanonicalPath)
--- End diff --

Why don't you use `toDF.write.json` instead?





[GitHub] spark pull request #14217: [SPARK-16562][SQL] Do not allow downcast in INT32...

2016-07-16 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request:

https://github.com/apache/spark/pull/14217#discussion_r71072773
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala
 ---
@@ -169,6 +169,19 @@ class ParquetIOSuite extends QueryTest with 
ParquetTest with SharedSQLContext {
 }
   }
 
+  test("SPARK-16562 Do not allow downcast in INT32 based types for normal Parquet reader") {
+withSQLConf(SQLConf.PARQUET_VECTORIZED_READER_ENABLED.key -> "false") {
+  withTempPath { file =>
+(1 to 4).map(Tuple1(_)).toDF("a").write.parquet(file.getAbsolutePath)
--- End diff --

Why do you use `map(Tuple1(_))`? Why not `(1 to 4).toDF("a")` instead?





[GitHub] spark issue #14090: [SPARK-16112][SparkR] Programming guide for gapply/gappl...

2016-07-16 Thread shivaram
Github user shivaram commented on the issue:

https://github.com/apache/spark/pull/14090
  
Thanks @NarineK. I tried it on a fresh Ubuntu VM and the docs rendered fine, 
so I think the problem has something to do with the ruby / jekyll versions.

LGTM. @felixcheung Could you also take one final look?





[GitHub] spark issue #14098: [SPARK-16380][SQL][Example]:Update SQL examples and prog...

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14098
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14098: [SPARK-16380][SQL][Example]:Update SQL examples and prog...

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14098
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62416/
Test PASSed.





[GitHub] spark issue #14098: [SPARK-16380][SQL][Example]:Update SQL examples and prog...

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14098
  
**[Test build #62416 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62416/consoleFull)** for PR 14098 at commit [`68e65fa`](https://github.com/apache/spark/commit/68e65fa5675eed0359e7d8c36020cabc9b2cea47).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14098: [SPARK-16380][SQL][Example]:Update SQL examples and prog...

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14098
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62415/
Test PASSed.





[GitHub] spark issue #14098: [SPARK-16380][SQL][Example]:Update SQL examples and prog...

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14098
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14098: [SPARK-16380][SQL][Example]:Update SQL examples and prog...

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14098
  
**[Test build #62415 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62415/consoleFull)** for PR 14098 at commit [`91a2e10`](https://github.com/apache/spark/commit/91a2e1067b24df48f25bb540248d27bd9cc57f9c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14179: [SPARK-16055][SPARKR] warning added while using sparkPac...

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14179
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62414/
Test FAILed.





[GitHub] spark issue #14179: [SPARK-16055][SPARKR] warning added while using sparkPac...

2016-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14179
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14179: [SPARK-16055][SPARKR] warning added while using sparkPac...

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14179
  
**[Test build #62414 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62414/consoleFull)** for PR 14179 at commit [`b425540`](https://github.com/apache/spark/commit/b425540018881be644697c3a1468a3f5bebb9d9d).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14098: [SPARK-16380][SQL][Example]:Update SQL examples and prog...

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14098
  
**[Test build #62416 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62416/consoleFull)** for PR 14098 at commit [`68e65fa`](https://github.com/apache/spark/commit/68e65fa5675eed0359e7d8c36020cabc9b2cea47).





[GitHub] spark issue #14098: [SPARK-16380][SQL][Example]:Update SQL examples and prog...

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14098
  
**[Test build #62415 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62415/consoleFull)** for PR 14098 at commit [`91a2e10`](https://github.com/apache/spark/commit/91a2e1067b24df48f25bb540248d27bd9cc57f9c).





[GitHub] spark issue #14179: [SPARK-16055][SPARKR] warning added while using sparkPac...

2016-07-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14179
  
**[Test build #62414 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62414/consoleFull)** for PR 14179 at commit [`b425540`](https://github.com/apache/spark/commit/b425540018881be644697c3a1468a3f5bebb9d9d).




