[GitHub] spark pull request: [SPARK-3787] Assembly jar name is wrong when w...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2647#issuecomment-57896058 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21286/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3787] Assembly jar name is wrong when w...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2647#issuecomment-57896055 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21286/consoleFull) for PR 2647 at commit [`5fc1259`](https://github.com/apache/spark/commit/5fc12597afe5964c7b9f688fd2919426b928b3ec).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3790][MLlib] CosineSimilarity Example
Github user rezazadeh commented on the pull request: https://github.com/apache/spark/pull/2622#issuecomment-57896484 Parameters are now configurable. Added approximation error reporting. Added JIRA.
[GitHub] spark pull request: [SPARK-3790][MLlib] CosineSimilarity Example
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2622#issuecomment-57896544 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21287/consoleFull) for PR 2622 at commit [`eca3dfd`](https://github.com/apache/spark/commit/eca3dfd62c1ce3643ef03b44f79c3e840b27a390).
* This patch merges cleanly.
[GitHub] spark pull request: [Spark-3525] Adding gradient boosting
Github user epahomov closed the pull request at: https://github.com/apache/spark/pull/2394
[GitHub] spark pull request: [SPARK-3713][SQL] Uses JSON to serialize DataT...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2563#issuecomment-57897337 @davis Thanks for all the suggestions, really makes things a lot cleaner!
[GitHub] spark pull request: [SPARK-3720][SQL]initial support ORC in spark ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2576#issuecomment-57897374 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21288/consoleFull) for PR 2576 at commit [`f928657`](https://github.com/apache/spark/commit/f92865707782387bb59c2c66a73d68eb8b030fa8).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3713][SQL] Uses JSON to serialize DataT...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2563#issuecomment-57897375 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21289/consoleFull) for PR 2563 at commit [`54c46ce`](https://github.com/apache/spark/commit/54c46ce607c521df4bea390d3cac7d42a6f006f8).
* This patch **does not** merge cleanly!
[GitHub] spark pull request: [SPARK-3720][SQL]initial support ORC in spark ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2576#issuecomment-57897385 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21288/
[GitHub] spark pull request: [SPARK-3713][SQL] Uses JSON to serialize DataT...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2563#issuecomment-57897538 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21291/consoleFull) for PR 2563 at commit [`785b683`](https://github.com/apache/spark/commit/785b6834e4f0ea24a3b5be4c55d675b8687b12c9).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3720][SQL]initial support ORC in spark ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2576#issuecomment-57897540 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21290/consoleFull) for PR 2576 at commit [`1505af4`](https://github.com/apache/spark/commit/1505af48becdca9f17d84d9c7a0ef7f03dbc4e8a).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3720][SQL]initial support ORC in spark ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2576#issuecomment-57897553 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21290/
[GitHub] spark pull request: [SPARK-3790][MLlib] CosineSimilarity Example
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2622#issuecomment-57897798 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21287/consoleFull) for PR 2622 at commit [`eca3dfd`](https://github.com/apache/spark/commit/eca3dfd62c1ce3643ef03b44f79c3e840b27a390).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3790][MLlib] CosineSimilarity Example
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2622#issuecomment-57897801 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21287/
[GitHub] spark pull request: [SPARK-3713][SQL] Uses JSON to serialize DataT...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2563#issuecomment-57898946 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21291/consoleFull) for PR 2563 at commit [`785b683`](https://github.com/apache/spark/commit/785b6834e4f0ea24a3b5be4c55d675b8687b12c9).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3713][SQL] Uses JSON to serialize DataT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2563#issuecomment-57898948 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21291/
[GitHub] spark pull request: [SPARK-3713][SQL] Uses JSON to serialize DataT...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2563#issuecomment-57899622 **[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21289/consoleFull)** after a configured wait of `120m`.
[GitHub] spark pull request: [SPARK-3713][SQL] Uses JSON to serialize DataT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2563#issuecomment-57899624 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21289/
[GitHub] spark pull request: [SPARK-3713][SQL] Uses JSON to serialize DataT...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2563#issuecomment-57900055 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/265/consoleFull) for PR 2563 at commit [`785b683`](https://github.com/apache/spark/commit/785b6834e4f0ea24a3b5be4c55d675b8687b12c9).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3645][SQL] Makes table caching eager by...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2513#issuecomment-57902212 Rebased to the master, with the new `CACHE LAZY TABLE t` syntax.
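The eager-vs-lazy distinction behind the new `CACHE LAZY TABLE t` syntax can be sketched generically. This is a hypothetical illustration of the semantics (not Spark's implementation): eager caching materializes at statement time, lazy caching defers until the first scan.

```python
class CachedTable:
    """Hypothetical sketch: eager caching materializes the table at
    creation time; lazy caching defers work until the first scan."""

    def __init__(self, loader, lazy=False):
        self._loader = loader      # function that produces the table data
        self._data = None
        if not lazy:
            self._data = loader()  # eager (the default this PR introduces)

    @property
    def materialized(self):
        return self._data is not None

    def scan(self):
        # lazy: materialize on first access, then reuse the cached data
        if self._data is None:
            self._data = self._loader()
        return self._data
```

Under this sketch, `CACHE TABLE t` corresponds to `lazy=False` and `CACHE LAZY TABLE t` to `lazy=True`.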
[GitHub] spark pull request: [SPARK-3645][SQL] Makes table caching eager by...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2513#issuecomment-57902286 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21292/consoleFull) for PR 2513 at commit [`fe92287`](https://github.com/apache/spark/commit/fe922870ec9b16d053621d37ddb847a89502087c).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3720][SQL]initial support ORC in spark ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2576#issuecomment-57902624 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21293/consoleFull) for PR 2576 at commit [`1db30b1`](https://github.com/apache/spark/commit/1db30b1d9e9b24fb5bd0933855b65f99f8ae715b).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3007][SQL] Adds dynamic partitioning su...
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/2616#issuecomment-57902670

Hi, @liancheng, master branch tests failed on my machine for all dynamic partition suites:

[info] - dynamic_partition *** FAILED ***
[info] - Dynamic partition folder layout *** FAILED ***
[info] - dynamic_partition_skip_default *** FAILED ***
[info] - load_dyn_part1 *** FAILED ***
[info] - load_dyn_part10 *** FAILED ***
[info] - load_dyn_part11 *** FAILED ***
[info] - load_dyn_part12 *** FAILED ***
[info] - load_dyn_part13 *** FAILED ***
[info] - load_dyn_part14 *** FAILED ***
[info] - load_dyn_part14_win *** FAILED ***
[info] - load_dyn_part2 *** FAILED ***
[info] - load_dyn_part3 *** FAILED ***
[info] - load_dyn_part4 *** FAILED ***
[info] - load_dyn_part5 *** FAILED ***
[info] - load_dyn_part6 *** FAILED ***
[info] - load_dyn_part8 *** FAILED ***
[info] - load_dyn_part9 *** FAILED ***
[info] *** 17 TESTS FAILED ***

Detail log:

[info] - dynamic_partition *** FAILED ***
[info] Failed to execute query using catalyst:
[info] Error: get partition: Value for key partcol1 is null or empty
[info] org.apache.hadoop.hive.ql.metadata.HiveException: get partition: Value for key partcol1 is null or empty
[info] at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1585)
[info] at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1556)
[info] at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1189)
[GitHub] spark pull request: [SPARK-3792][SQL]enable JavaHiveQLSuite
GitHub user scwf opened a pull request: https://github.com/apache/spark/pull/2652

[SPARK-3792][SQL] Enable JavaHiveQLSuite

Do not use TestSQLContext in JavaHiveQLSuite, as that may lead to two SparkContexts, and enable JavaHiveQLSuite.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/scwf/spark fix-JavaHiveQLSuite

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2652.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2652

commit be35c919f2e4701775a69c1ce09831cb205037de
Author: scwf wangf...@huawei.com
Date: 2014-10-04T07:18:39Z

enable JavaHiveQLSuite
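The "two SparkContext" hazard this PR description mentions can be illustrated with a generic singleton-style guard. This is a hypothetical sketch mirroring SparkContext's one-active-context rule, not the actual Spark test code; a test suite that calls `get_or_create` reuses the shared instance instead of constructing a conflicting second one:

```python
class SharedContext:
    """Hypothetical stand-in for a shared test context: constructing a
    second live instance is an error, mirroring the one-active-
    SparkContext restriction the PR works around."""

    _active = None

    def __init__(self, name):
        if SharedContext._active is not None:
            raise RuntimeError("only one active context is allowed")
        self.name = name
        SharedContext._active = self

    @classmethod
    def get_or_create(cls, name):
        # Test suites should reuse the existing context rather than
        # instantiating a new one -- the fix the PR applies in spirit.
        return cls._active if cls._active is not None else cls(name)
```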
[GitHub] spark pull request: [SPARK-3792][SQL]enable JavaHiveQLSuite
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2652#issuecomment-57903264 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-3645][SQL] Makes table caching eager by...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2513#issuecomment-57903715 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21292/
[GitHub] spark pull request: [SPARK-3645][SQL] Makes table caching eager by...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2513#issuecomment-57903713 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21292/consoleFull) for PR 2513 at commit [`fe92287`](https://github.com/apache/spark/commit/fe922870ec9b16d053621d37ddb847a89502087c).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class CacheTableCommand(tableName: String, plan: Option[LogicalPlan], isLazy: Boolean)`
  * `case class UncacheTableCommand(tableName: String) extends Command`
  * `case class CacheTableCommand(`
  * `case class UncacheTableCommand(tableName: String) extends LeafNode with Command`
  * `case class DescribeCommand(child: SparkPlan, output: Seq[Attribute])(`
[GitHub] spark pull request: [SPARK-3720][SQL]initial support ORC in spark ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2576#issuecomment-57905623 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21293/
[GitHub] spark pull request: [SPARK-2750] support https in spark web ui
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-57905663 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21294/consoleFull) for PR 1980 at commit [`baaa1ce`](https://github.com/apache/spark/commit/baaa1ce05fcb426de7d3002a5cc30f18ae119d34).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-57905819 This looks really interesting. Is there a blocker for supporting generic keys (or at least say `String`), or is that a performance issue?
[GitHub] spark pull request: SPARK-3770: Make userFeatures accessible from ...
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/2636#issuecomment-57905979 Can we use the existing `pairRDDToPython` function? https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/api/python/SerDeUtil.scala#L120
[GitHub] spark pull request: [SPARK-3773][PySpark][Doc] Sphinx build warnin...
GitHub user cocoatomo opened a pull request: https://github.com/apache/spark/pull/2653

[SPARK-3773][PySpark][Doc] Sphinx build warning

When building Sphinx documents for PySpark, we have 12 warnings. Their causes are almost all docstrings in broken ReST format. To reproduce this issue, run the following commands on commit 6e27cb630de69fa5acb510b4e2f6b980742b1957:

```bash
$ cd ./python/docs
$ make clean html
...
/Users/user/MyRepos/Scala/spark/python/pyspark/__init__.py:docstring of pyspark.SparkContext.sequenceFile:4: ERROR: Unexpected indentation.
/Users/user/MyRepos/Scala/spark/python/pyspark/__init__.py:docstring of pyspark.RDD.saveAsSequenceFile:4: ERROR: Unexpected indentation.
/Users/user/MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithSGD.train:14: ERROR: Unexpected indentation.
/Users/user/MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithSGD.train:16: WARNING: Definition list ends without a blank line; unexpected unindent.
/Users/user/MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithSGD.train:17: WARNING: Block quote ends without a blank line; unexpected unindent.
/Users/user/MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:14: ERROR: Unexpected indentation.
/Users/user/MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:16: WARNING: Definition list ends without a blank line; unexpected unindent.
/Users/user/MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:17: WARNING: Block quote ends without a blank line; unexpected unindent.
/Users/user/MyRepos/Scala/spark/python/docs/pyspark.mllib.rst:50: WARNING: missing attribute mentioned in :members: or __all__: module pyspark.mllib.regression, attribute RidgeRegressionModelLinearRegressionWithSGD
/Users/user/MyRepos/Scala/spark/python/pyspark/mllib/tree.py:docstring of pyspark.mllib.tree.DecisionTreeModel.predict:3: ERROR: Unexpected indentation.
...
checking consistency... /Users/user/MyRepos/Scala/spark/python/docs/modules.rst:: WARNING: document isn't included in any toctree
...
copying static files... WARNING: html_static_path entry u'/Users/user/MyRepos/Scala/spark/python/docs/_static' does not exist
...
build succeeded, 12 warnings.
```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cocoatomo/spark issues/3773-sphinx-build-warnings

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2653.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2653

commit 6f656618d7a6fe3f9977f6a1fb15350577388f06
Author: cocoatomo cocoatom...@gmail.com
Date: 2014-10-04T14:07:20Z

[SPARK-3773][PySpark][Doc] Sphinx build warning

Remove all warnings on document building
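The "Unexpected indentation" and "Definition list ends without a blank line" messages are typical of ReST field lists that follow a paragraph without a separating blank line. A hypothetical before/after docstring (not taken from the PR) with a crude separator check:

```python
# Hypothetical docstrings (not from the PR) showing the class of ReST
# breakage Sphinx complains about: a field list must be separated from
# the preceding paragraph by a blank line.

BROKEN = """Train a model.
:param data: the training set,
    given as an RDD of LabeledPoint
:param iterations: number of passes"""

FIXED = """Train a model.

:param data: the training set,
    given as an RDD of LabeledPoint
:param iterations: number of passes"""


def field_list_separated(docstring):
    """True if the first ':param' field is preceded by a blank line,
    as ReST requires between a paragraph and a field list."""
    lines = docstring.splitlines()
    for i, line in enumerate(lines):
        if line.startswith(":param"):
            return i > 0 and not lines[i - 1].strip()
    return True  # no field list at all
```

Running `make clean html` again after inserting the missing blank lines is how one would confirm each of the 12 warnings is gone.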
[GitHub] spark pull request: [SPARK-3773][PySpark][Doc] Sphinx build warnin...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2653#issuecomment-57906805 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18429655

--- Diff: sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim.scala ---
@@ -0,0 +1,158 @@

```scala
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.sql.hive

import java.util.Properties
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hive.common.StatsSetupConst
import org.apache.hadoop.hive.common.`type`.HiveDecimal
import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.ql.Context
import org.apache.hadoop.hive.ql.metadata.{Table, Hive, Partition}
import org.apache.hadoop.hive.ql.plan.{FileSinkDesc, TableDesc}
import org.apache.hadoop.hive.ql.processors.CommandProcessorFactory
import org.apache.hadoop.hive.serde2.{ColumnProjectionUtils, Deserializer}
import org.apache.hadoop.mapred.InputFormat
import org.apache.spark.Logging
import org.apache.hadoop.{io => hadoopIo}
import scala.collection.JavaConversions._
import scala.language.implicitConversions

/**
 * A compatibility layer for interacting with Hive version 0.13.1.
 */
private[hive] object HiveShim {
  val version = "0.13.1"
  /*
   * TODO: hive-0.13 supports DECIMAL(precision, scale); DECIMAL in hive-0.12 is actually DECIMAL(10,0).
   * Full support of the new decimal feature needs to be done in a separate PR.
   */
  val metastoreDecimal = "decimal(10,0)"

  def getTableDesc(
      serdeClass: Class[_ <: Deserializer],
      inputFormatClass: Class[_ <: InputFormat[_, _]],
      outputFormatClass: Class[_],
      properties: Properties) = {
    new TableDesc(inputFormatClass, outputFormatClass, properties)
  }

  def getStatsSetupConstTotalSize = StatsSetupConst.TOTAL_SIZE

  def createDefaultDBIfNeeded(context: HiveContext) = {
    context.runSqlHive("CREATE DATABASE default")
    context.runSqlHive("USE default")
  }

  /** The string used to denote an empty comments field in the schema. */
  def getEmptyCommentsFieldValue = ""

  def getCommandProcessor(cmd: Array[String], conf: HiveConf) = {
    CommandProcessorFactory.get(cmd, conf)
  }

  def createDecimal(bd: java.math.BigDecimal): HiveDecimal = {
    HiveDecimal.create(bd)
  }

  /*
   * This function became private in hive-0.13, but we have to do this to work around a Hive bug.
   */
  private def appendReadColumnNames(conf: Configuration, cols: Seq[String]) {
    val old: String = conf.get(ColumnProjectionUtils.READ_COLUMN_NAMES_CONF_STR, "")
    val result: StringBuilder = new StringBuilder(old)
    var first: Boolean = old.isEmpty

    for (col <- cols) {
      if (first) {
        first = false
      }
      else {
        result.append(',')
      }
      result.append(col)
    }
    conf.set(ColumnProjectionUtils.READ_COLUMN_NAMES_CONF_STR, result.toString)
  }

  /*
   * Cannot use ColumnProjectionUtils.appendReadColumns directly, if ids is null or empty
   */
  def appendReadColumns(conf: Configuration, ids: Seq[Integer], names: Seq[String]) {
    if (ids != null && ids.size > 0) {
      ColumnProjectionUtils.appendReadColumns(conf, ids)
    }
    appendReadColumnNames(conf, names)
  }

  def getExternalTmpPath(context: Context, path: Path) = {
    context.getExternalTmpPath(path.toUri)
  }

  def getDataLocationPath(p: Partition) = p.getDataLocation

  def getAllPartitionsOf(client: Hive, tbl: Table) = client.getAllPartitionsOf(tbl)

  /*
   * Bug introduced in hive-0.13: FileSinkDesc is serializable, but its member path is not.
   * Fix it through a wrapper.
```

--- End diff --

I am pretty confused about it. I think Hive needs to serialize FileSinkDesc when the query plan
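The bug being discussed here (a `Serializable` class holding a non-serializable field) has a standard workaround, which the shim's wrapper follows: keep a serializable representation of the field and rebuild the real object after deserialization. A minimal standalone sketch of the idea — the `RawPath`/`PathWrapper` names are illustrative, not Spark's actual `ShimFileSinkDesc`:

```scala
// Hypothetical stand-in for a non-serializable class such as Hadoop's Path.
class RawPath(val uri: String) {
  override def toString: String = uri
}

// Fully serializable wrapper: only the String representation travels over
// the wire; the RawPath itself is rebuilt lazily after deserialization.
class PathWrapper(pathString: String) extends Serializable {
  // @transient keeps the non-serializable field out of the serialized form.
  @transient private var cached: RawPath = _

  def path: RawPath = {
    if (cached == null) {
      cached = new RawPath(pathString) // rebuilt on first access
    }
    cached
  }
}
```

A serialization round trip through `ObjectOutputStream` then works, because only the `String` field is written; the wrapper reconstructs the path object on the receiving side.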
[GitHub] spark pull request: [SPARK-2750] support https in spark web ui
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-57907416 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21294/
[GitHub] spark pull request: [SPARK-2750] support https in spark web ui
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-57907414 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21294/consoleFull) for PR 1980 at commit [`baaa1ce`](https://github.com/apache/spark/commit/baaa1ce05fcb426de7d3002a5cc30f18ae119d34). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3007][SQL] Adds dynamic partitioning su...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2616#issuecomment-57907458 @scwf Can you elaborate on what configuration you're using? Details such as compilation flags, environment variables, and the build process would be helpful. I've been tracking this failure over the last few days but couldn't reproduce it either locally or on the Jenkins PR builder.
[GitHub] spark pull request: [SPARK-3645][SQL] Makes table caching eager by...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/2513#discussion_r18429700

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala ---
@@ -91,42 +92,42 @@ class SqlParser extends StandardTokenParsers with PackratParsers {

```
   protected val IN = Keyword("IN")
   protected val INNER = Keyword("INNER")
   protected val INSERT = Keyword("INSERT")
+  protected val INTERSECT = Keyword("INTERSECT")
   protected val INTO = Keyword("INTO")
   protected val IS = Keyword("IS")
   protected val JOIN = Keyword("JOIN")
+  protected val LAST = Keyword("LAST")
+  protected val LAZY = Keyword("LAZY")
   protected val LEFT = Keyword("LEFT")
+  protected val LIKE = Keyword("LIKE")
   protected val LIMIT = Keyword("LIMIT")
+  protected val LOWER = Keyword("LOWER")
   protected val MAX = Keyword("MAX")
   protected val MIN = Keyword("MIN")
   protected val NOT = Keyword("NOT")
   protected val NULL = Keyword("NULL")
   protected val ON = Keyword("ON")
   protected val OR = Keyword("OR")
```

--- End diff --

Added keyword `LAZY` and sorted all the keywords in alphabetical order here. This list was once sorted, but the ordering was broken by later additions.
[GitHub] spark pull request: [SPARK-3645][SQL] Makes table caching eager by...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/2513#discussion_r18429702

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala ---
@@ -91,42 +92,42 @@ class SqlParser extends StandardTokenParsers with PackratParsers {

```
   protected val IN = Keyword("IN")
   protected val INNER = Keyword("INNER")
   protected val INSERT = Keyword("INSERT")
+  protected val INTERSECT = Keyword("INTERSECT")
   protected val INTO = Keyword("INTO")
   protected val IS = Keyword("IS")
   protected val JOIN = Keyword("JOIN")
+  protected val LAST = Keyword("LAST")
+  protected val LAZY = Keyword("LAZY")
   protected val LEFT = Keyword("LEFT")
+  protected val LIKE = Keyword("LIKE")
   protected val LIMIT = Keyword("LIMIT")
+  protected val LOWER = Keyword("LOWER")
   protected val MAX = Keyword("MAX")
   protected val MIN = Keyword("MIN")
   protected val NOT = Keyword("NOT")
   protected val NULL = Keyword("NULL")
   protected val ON = Keyword("ON")
   protected val OR = Keyword("OR")
-  protected val OVERWRITE = Keyword("OVERWRITE")
-  protected val LIKE = Keyword("LIKE")
-  protected val RLIKE = Keyword("RLIKE")
-  protected val UPPER = Keyword("UPPER")
-  protected val LOWER = Keyword("LOWER")
-  protected val REGEXP = Keyword("REGEXP")
   protected val ORDER = Keyword("ORDER")
   protected val OUTER = Keyword("OUTER")
+  protected val OVERWRITE = Keyword("OVERWRITE")
+  protected val REGEXP = Keyword("REGEXP")
   protected val RIGHT = Keyword("RIGHT")
+  protected val RLIKE = Keyword("RLIKE")
   protected val SELECT = Keyword("SELECT")
   protected val SEMI = Keyword("SEMI")
+  protected val SQRT = Keyword("SQRT")
   protected val STRING = Keyword("STRING")
+  protected val SUBSTR = Keyword("SUBSTR")
+  protected val SUBSTRING = Keyword("SUBSTRING")
   protected val SUM = Keyword("SUM")
   protected val TABLE = Keyword("TABLE")
   protected val TIMESTAMP = Keyword("TIMESTAMP")
   protected val TRUE = Keyword("TRUE")
   protected val UNCACHE = Keyword("UNCACHE")
   protected val UNION = Keyword("UNION")
+  protected val UPPER = Keyword("UPPER")
   protected val WHERE = Keyword("WHERE")
```

--- End diff --

Added keyword `LAZY` and sorted all the keywords in alphabetical order here. This list was once sorted, but the ordering was broken by later additions.
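A keep-it-sorted convention like the one restored above is cheap to enforce mechanically. A small guard of this kind (illustrative only — not part of Spark's actual test suite) would have caught the ordering drift the moment a keyword was appended out of place:

```scala
// A slice of the parser's keyword table, as plain strings (illustrative subset).
val keywords = Seq(
  "IN", "INNER", "INSERT", "INTERSECT", "INTO", "IS", "JOIN",
  "LAST", "LAZY", "LEFT", "LIKE", "LIMIT", "LOWER",
  "MAX", "MIN", "NOT", "NULL", "ON", "OR", "ORDER", "OUTER",
  "OVERWRITE", "REGEXP", "RIGHT", "RLIKE", "SELECT", "SEMI",
  "SQRT", "STRING", "SUBSTR", "SUBSTRING", "SUM", "TABLE",
  "TIMESTAMP", "TRUE", "UNCACHE", "UNION", "UPPER", "WHERE")

// Fails fast if anyone inserts a keyword out of alphabetical order.
require(keywords == keywords.sorted, "keyword table is no longer sorted")
```

Running the check as part of the build turns a style convention into an invariant, so the list cannot silently drift again.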
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18429712

--- Diff: sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim.scala ---
@@ -0,0 +1,158 @@

[... same Shim.scala diff as quoted above; the comment targets these lines ...]

```scala
  /*
   * Bug introduced in hive-0.13: FileSinkDesc is serializable, but its member path is not.
   * Fix it through a wrapper.
   */
  implicit def wrapperToFileSinkDesc(w: ShimFileSinkDesc): FileSinkDesc = {
    var f = new
```
[GitHub] spark pull request: [SPARK-3007][SQL] Adds dynamic partitioning su...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2616#issuecomment-57907663 @scwf Or could you please describe the steps to reproduce this failure from a newly checked out master branch? I guess once you can reproduce it, it happens deterministically.
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18429736

--- Diff: sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim.scala ---
@@ -0,0 +1,158 @@

[... same Shim.scala diff as quoted above; the comment targets these lines ...]

```scala
  def getDataLocationPath(p: Partition) = p.getDataLocation

  def getAllPartitionsOf(client: Hive, tbl: Table) = client.getAllPartitionsOf(tbl)

  /*
   * Bug introduced in hive-0.13: FileSinkDesc is serializable, but its member path is not.
   * Fix it through a wrapper.
   */
  implicit def wrapperToFileSinkDesc(w: ShimFileSinkDesc): FileSinkDesc = {
```

--- End diff --

If we
[GitHub] spark pull request: [SPARK-3007][SQL] Adds dynamic partitioning su...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2616#issuecomment-57907852 Ah, just found out that I can reproduce it with `-Phive`; I had been using `-Phive,hadoop-2.4` all the time and just couldn't reproduce this, thanks!
[GitHub] spark pull request: [SPARK-3007][SQL] Adds dynamic partitioning su...
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/2616#issuecomment-57908014 Yes, I will use `-Phive,hadoop-2.4` to see whether it has the problem.
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18429781

--- Diff: sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim.scala ---
@@ -0,0 +1,158 @@

[... same Shim.scala diff as quoted above; the comment targets these lines ...]

```scala
    for (col <- cols) {
      if (first) {
        first = false
      }
      else {
```

--- End diff --

```
if () {
  ...
} else {
  ...
}
```
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18429783

--- Diff: sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim.scala ---
@@ -0,0 +1,158 @@

[... same Shim.scala diff as quoted above; the comment targets these lines ...]

```scala
  def appendReadColumns(conf: Configuration, ids: Seq[Integer], names: Seq[String]) {
    if (ids != null && ids.size > 0) {
      ColumnProjectionUtils.appendReadColumns(conf, ids)
    }
    appendReadColumnNames(conf, names)
```

--- End diff --

Why is there no null/empty check here?
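The review question above is about symmetry: `ids` is guarded against null/empty, but `names` is passed through unchecked. A guarded version of the idea can be sketched standalone — here Hadoop's `Configuration` is replaced by a plain mutable map so the sketch runs anywhere, and the key names are illustrative:

```scala
import scala.collection.mutable

// Stand-in for Hadoop's Configuration: a plain string-to-string map.
val conf = mutable.Map.empty[String, String]

// Append items to a comma-separated value stored under `key`.
def appendCsv(key: String, items: Seq[Any]): Unit = {
  val old = conf.getOrElse(key, "")
  val added = items.mkString(",")
  conf(key) = if (old.isEmpty) added else old + "," + added
}

// Symmetric guards: both `ids` and `names` are checked for null/empty
// before the configuration is touched.
def appendReadColumns(ids: Seq[Int], names: Seq[String]): Unit = {
  if (ids != null && ids.nonEmpty) appendCsv("read.column.ids", ids)
  if (names != null && names.nonEmpty) appendCsv("read.column.names", names)
}
```

With both branches guarded, calls such as `appendReadColumns(null, Seq.empty)` become harmless no-ops instead of writing stray separators into the configuration.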
[GitHub] spark pull request: [Minor] Trivial fix to make codes more readabl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2654#issuecomment-57908521 Can one of the admins verify this patch?
[GitHub] spark pull request: [Minor] Trivial fix to make codes more readabl...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/2654 [Minor] Trivial fix to make codes more readable It should just use `maxResults` there. You can merge this pull request into a Git repository by running: $ git pull https://github.com/viirya/spark-1 trivial_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2654.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2654 commit 13622893af0a67e21d149792aaee47fcdaf427ca Author: Liang-Chi Hsieh vii...@gmail.com Date: 2014-10-04T15:07:09Z Trivial fix to make codes more readable.
[GitHub] spark pull request: [SPARK-3793][SQL]use hiveconf when parse hive ...
GitHub user scwf opened a pull request: https://github.com/apache/spark/pull/2655

[SPARK-3793][SQL] use hiveconf when parsing hive ql

This PR makes the hive ql parser more general and compatible with both hive-0.12 and hive-0.13. In hive-0.13 we may need a hiveconf (or hivecontext) when parsing a (quoted) sql statement. For example, running the following sql without a hiveconf results in an NPE:

    createQueryTest("quoted alias.attr", "SELECT `a`.`key` FROM src a ORDER BY key LIMIT 1")

    [info] - quoted alias.attr *** FAILED ***
    [info]   org.apache.spark.sql.hive.HiveQl$ParseException: Failed to parse: SELECT `a`.`key` FROM src a ORDER BY key LIMIT 1
    [info]   at org.apache.spark.sql.hive.HiveQl$.parseSql(HiveQl.scala:221)
    [info]   at org.apache.spark.sql.hive.test.TestHiveContext$HiveQLQueryExecution.logical$lzycompute(TestHive.scala:143)
    [info]   at org.apache.spark.sql.hive.test.TestHiveContext$HiveQLQueryExecution.logical(TestHive.scala:143)
    [info]   at org.apache.spark.sql.hive.test.TestHiveContext$QueryExecution.analyzed$lzycompute(TestHive.scala:153)
    [info]   at org.apache.spark.sql.hive.test.TestHiveContext$QueryExecution.analyzed(TestHive.scala:152)
    [info]   at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan$lzycompute(SQLContext.scala:403)
    [info]   at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan(SQLContext.scala:403)
    [info]   at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:407)
    [info]   at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:405)
    [info]   at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:411)
    [info]   ...
    [info] Cause: java.lang.NullPointerException:
    [info]   at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:1295)
    [info]   at org.apache.hadoop.hive.ql.parse.HiveLexer.allowQuotedId(HiveLexer.java:342)
    [info]   at org.apache.hadoop.hive.ql.parse.HiveLexer$DFA21.specialStateTransition(HiveLexer.java:10945)
    [info]   at org.antlr.runtime.DFA.predict(DFA.java:80)
    [info]   at org.apache.hadoop.hive.ql.parse.HiveLexer.mIdentifier(HiveLexer.java:7925)
    [info]   at org.apache.hadoop.hive.ql.parse.HiveLexer.mTokens(HiveLexer.java:10818)
    [info]   at org.antlr.runtime.Lexer.nextToken(Lexer.java:89)
    [info]   at org.antlr.runtime.BufferedTokenStream.fetch(BufferedTokenStream.java:133)
    [info]   at org.antlr.runtime.BufferedTokenStream.sync(BufferedTokenStream.java:127)
    [info]   at org.antlr.runtime.CommonTokenStream.consume(CommonTokenStream.java:70)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/scwf/spark addconf-to-hiveql

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2655.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2655

commit 51e61621469392c3b357781230ef2909cf98b7a8
Author: scwf wangf...@huawei.com
Date: 2014-10-04T15:09:18Z

    add hiveconf when parse hive ql
[GitHub] spark pull request: [SPARK-3007][SQL] Adds dynamic partitioning su...
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/2616#issuecomment-57908927 Using `-Phive,hadoop-2.4` is OK on my local machine.
[GitHub] spark pull request: [SPARK-3793][SQL]use hiveconf when parse hive ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2655#issuecomment-57908961 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-2750] support https in spark web ui
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-57909017 Updated and fixed the conflict. @JoshRosen, you can test this; refer to my last comment.
[GitHub] spark pull request: [SPARK-3758] [Windows] Wrong EOL character in ...
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2612#issuecomment-57909026

> I'm worried since most of us develop on mac/linux that this will end up in a weird state where there are mixed EOL characters.

I'd worry about this, too. If one of us edits one of these files on a Mac, we might unknowingly recreate the issue this PR is trying to fix. Does it make sense to add some kind of style check to `dev/run-tests` that validates that all `.cmd` files use Windows-style newlines?
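The style check suggested above is straightforward to sketch. The following Python snippet (a hypothetical helper, not part of `dev/run-tests`; the function name and integration point are assumptions) flags any `.cmd` file that contains a bare LF line ending:

```python
from pathlib import Path

def check_cmd_newlines(root="."):
    """Return the .cmd files under `root` that contain bare LF line endings.

    A file is considered well-formed only if every newline is a CRLF pair:
    after removing all CRLF pairs, no lone '\n' bytes may remain.
    """
    bad = []
    for path in Path(root).rglob("*.cmd"):
        data = path.read_bytes()
        # Strip CRLF pairs first; any remaining '\n' is a Unix-style ending.
        if b"\n" in data.replace(b"\r\n", b""):
            bad.append(str(path))
    return sorted(bad)
```

A wrapper script could exit non-zero when the returned list is non-empty, failing the build the same way the other style checks do.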
[GitHub] spark pull request: [SPARK-3007][SQL] Adds dynamic partitioning su...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2616#issuecomment-57909183

So this bug can be triggered by lower versions of Hadoop, e.g. 1.0.3; I haven't validated the exact range yet. In `Hive.loadDynamicPartitions`, Hive calls `o.a.h.h.q.e.Utilities.getFileStatusRecurse` to glob the temporary directory for data files. It seems that lower versions of Hadoop don't filter out files like `_SUCCESS`, which causes the problem.

Within Hive, `loadDynamicPartitions` is only used in operations like `LOAD`. At the end of a normal insertion into a dynamically partitioned table, `FileSinkOperator` calls `Utilities.mvFileToFinalPath` to move the entire temporary directory to the target location, and thus doesn't hit this problem. `Utilities.mvFileToFinalPath` is more efficient than `Hive.loadDynamicPartitions` since it doesn't parse and validate partition specs, but it requires some internal Hive data structures like `DynamicPartitionCtx`. I'll try to see whether I can mock these data structures and use `mvFileToFinalPath` instead.
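The filtering that newer Hadoop versions apply, and that the older globbing misses, follows the usual Hadoop/Hive convention that a file whose base name starts with `_` or `.` is a hidden marker file. A minimal Python sketch of that convention (the function name is hypothetical; the real fix lives in Hadoop/Hive's path filters, not in code like this):

```python
def visible_data_files(paths):
    """Filter out Hadoop marker/hidden files such as _SUCCESS and .crc sidecars.

    Follows the common convention that a file is hidden when its base name
    starts with '_' or '.'.
    """
    def is_hidden(path):
        name = path.rstrip("/").rsplit("/", 1)[-1]
        return name.startswith(("_", "."))

    return [p for p in paths if not is_hidden(p)]
```

Applying such a filter to the glob results before handing them to the partition-loading logic would mimic what the newer Hadoop versions already do.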
[GitHub] spark pull request: [SPARK-3777] Display Executor ID for Tasks i...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2642#issuecomment-57909187 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21295/consoleFull) for PR 2642 at commit [`37945af`](https://github.com/apache/spark/commit/37945af9defd4dbf450f1391ca621b9c4c63030f). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3007][SQL] Adds dynamic partitioning su...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2616#issuecomment-57909238 @scwf Thanks for all the information you provided offline :)
[GitHub] spark pull request: [SPARK-3777] Display Executor ID for Tasks i...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/2642#issuecomment-57909316

> As horizontal space is precious for including more metrics, might it make sense to combine Address / Executor and Executor ID into a single Executor column, with values like 1 / 10.37.129.2.

Agreed. I updated the PR to put them into one column. I use `host` because I feel `Address` should include the `port` number, but here we only have `host`. The new screenshot is as follows: ![executor_id_host](https://cloud.githubusercontent.com/assets/1000778/4515998/f84fdc82-4bdb-11e4-81a3-659cd28a4b43.png)

> Also, is including the port still worthwhile now that we have the ID?

`TaskInfo` does not have a `port` field, and I cannot find an easy way to add it. However, I think `Executor ID` is enough. If the executor ID is provided, I can use `ps -ef | grep spark | grep executor_id` to find the process ID. It doesn't look like `port` would help find the process ID any more easily.
[GitHub] spark pull request: [SPARK-3007][SQL] Adds dynamic partitioning su...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2616#issuecomment-57909423 According to previous failed Jenkins builds ([1](https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/752/), [2](https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/753/), etc.), Hadoop 1.0.3 and 2.0 are affected, while 2.2 and above are OK. That explains why this PR together with #2226 always passes Jenkins -- the PR builder uses Hadoop 2.3.
[GitHub] spark pull request: [SPARK-3007][SQL] Adds dynamic partitioning su...
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/2616#issuecomment-57909534 Got it.
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18430042

--- Diff: sql/hive/v0.12.0/src/main/scala/org/apache/spark/sql/hive/Shim.scala ---
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import java.net.URI
+import java.util.Properties
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.Path
+import org.apache.hadoop.hive.common.`type`.HiveDecimal
+import org.apache.hadoop.hive.conf.HiveConf
+import org.apache.hadoop.hive.ql.Context
+import org.apache.hadoop.hive.ql.metadata.{Hive, Partition, Table}
+import org.apache.hadoop.hive.ql.plan.{FileSinkDesc, TableDesc}
+import org.apache.hadoop.hive.ql.processors._
+import org.apache.hadoop.hive.ql.stats.StatsSetupConst
+import org.apache.hadoop.hive.serde2.{Deserializer, ColumnProjectionUtils}
+import org.apache.hadoop.{io => hadoopIo}
+import org.apache.hadoop.mapred.InputFormat
+import scala.collection.JavaConversions._
+import scala.language.implicitConversions
+
+/**
+ * A compatibility layer for interacting with Hive version 0.12.0.
+ */
+private[hive] object HiveShim {
+  val version = "0.12.0"
+  val metastoreDecimal = "decimal"
+
+  def getTableDesc(
+      serdeClass: Class[_ <: Deserializer],
+      inputFormatClass: Class[_ <: InputFormat[_, _]],
+      outputFormatClass: Class[_],
+      properties: Properties) = {
+    new TableDesc(serdeClass, inputFormatClass, outputFormatClass, properties)
--- End diff --

Is it necessary?
[GitHub] spark pull request: [SPARK-3777] Display Executor ID for Tasks i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2642#issuecomment-57911257 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21295/
[GitHub] spark pull request: [SPARK-3777] Display Executor ID for Tasks i...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2642#issuecomment-57911254 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21295/consoleFull) for PR 2642 at commit [`37945af`](https://github.com/apache/spark/commit/37945af9defd4dbf450f1391ca621b9c4c63030f). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18430235

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TestHive.scala ---
@@ -369,6 +371,7 @@ class TestHiveContext(sc: SparkContext) extends HiveContext(sc) {
    * tests.
    */
   protected val originalUdfs: JavaSet[String] = FunctionRegistry.getFunctionNames
+  HiveShim.createDefaultDBIfNeeded(this)
--- End diff --

Can you add a comment here to explain why it is necessary?
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18430262

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TestHive.scala ---
@@ -78,6 +79,7 @@ class TestHiveContext(sc: SparkContext) extends HiveContext(sc) {
   // For some hive test case which contain ${system:test.tmp.dir}
   System.setProperty("test.tmp.dir", testTempDir.getCanonicalPath)
+  CommandProcessorFactory.clean(hiveconf)
--- End diff --

Since this is cleanup work, it seems better placed after `System.clearProperty("spark.hostPort")`. Also, please add a comment about what this call is doing and why it is needed.
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18430322

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala ---
@@ -80,8 +81,10 @@ class StatisticsSuite extends QueryTest with BeforeAndAfterAll {
     sql("INSERT INTO TABLE analyzeTable SELECT * FROM src").collect()
     sql("INSERT INTO TABLE analyzeTable SELECT * FROM src").collect()
-    assert(queryTotalSize("analyzeTable") === defaultSizeInBytes)
-
+    // TODO: How it works? needs to add it back for other hive version.
+    if (HiveShim.version == "0.12.0") {
--- End diff --

For Hive 0.13, will the table always be updated after `INSERT INTO`? When we added this test, the table size was not updated by the `INSERT INTO` command.
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18430463

--- Diff: sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim.scala ---
@@ -0,0 +1,158 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import java.util.Properties
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.Path
+import org.apache.hadoop.hive.common.StatsSetupConst
+import org.apache.hadoop.hive.common.`type`.{HiveDecimal}
+import org.apache.hadoop.hive.conf.HiveConf
+import org.apache.hadoop.hive.ql.Context
+import org.apache.hadoop.hive.ql.metadata.{Table, Hive, Partition}
+import org.apache.hadoop.hive.ql.plan.{FileSinkDesc, TableDesc}
+import org.apache.hadoop.hive.ql.processors.CommandProcessorFactory
+import org.apache.hadoop.hive.serde2.{ColumnProjectionUtils, Deserializer}
+import org.apache.hadoop.mapred.InputFormat
+import org.apache.spark.Logging
+import org.apache.hadoop.{io => hadoopIo}
+import scala.collection.JavaConversions._
+import scala.language.implicitConversions
+
+/**
+ * A compatibility layer for interacting with Hive version 0.13.1.
+ */
+private[hive] object HiveShim {
+  val version = "0.13.1"
+  /*
+   * TODO: hive-0.13 supports DECIMAL(precision, scale); DECIMAL in hive-0.12 is actually DECIMAL(10,0)
--- End diff --

Can you double check it? I am not sure DECIMAL in hive-0.12 is actually DECIMAL(10,0). From the code, it seems precision is unbounded.
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18430480

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala ---
@@ -212,7 +214,18 @@ private[hive] object HiveQl {
   /**
    * Returns the AST for the given SQL string.
    */
-  def getAst(sql: String): ASTNode = ParseUtils.findRootNonNullToken((new ParseDriver).parse(sql))
+  def getAst(sql: String): ASTNode = {
+    /*
+     * Context has to be passed in in hive0.13.1.
--- End diff --

in in
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18430557

--- Diff: sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim.scala ---
@@ -0,0 +1,158 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import java.util.Properties
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.Path
+import org.apache.hadoop.hive.common.StatsSetupConst
+import org.apache.hadoop.hive.common.`type`.{HiveDecimal}
+import org.apache.hadoop.hive.conf.HiveConf
+import org.apache.hadoop.hive.ql.Context
+import org.apache.hadoop.hive.ql.metadata.{Table, Hive, Partition}
+import org.apache.hadoop.hive.ql.plan.{FileSinkDesc, TableDesc}
+import org.apache.hadoop.hive.ql.processors.CommandProcessorFactory
+import org.apache.hadoop.hive.serde2.{ColumnProjectionUtils, Deserializer}
+import org.apache.hadoop.mapred.InputFormat
+import org.apache.spark.Logging
+import org.apache.hadoop.{io => hadoopIo}
+import scala.collection.JavaConversions._
+import scala.language.implicitConversions
+
+/**
+ * A compatibility layer for interacting with Hive version 0.13.1.
+ */
+private[hive] object HiveShim {
+  val version = "0.13.1"
+  /*
+   * TODO: hive-0.13 supports DECIMAL(precision, scale); DECIMAL in hive-0.12 is actually DECIMAL(10,0).
+   * Full support of the new decimal feature needs to be fixed in a separate PR.
+   */
+  val metastoreDecimal = "decimal(10,0)"
--- End diff --

Let's say we connect to an existing Hive 0.13 metastore. If there is a decimal column with a user-defined precision and scale, will we see a parsing error?
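The concern above is about matching metastore type strings: Hive 0.13 can emit `decimal(p,s)` with any user-defined precision and scale, while a shim pinned to the literal string `decimal(10,0)` only matches one of them. A Python sketch of a more general parser illustrates the distinction (the function name is hypothetical; `(10, 0)` is Hive's documented default precision and scale for a bare `DECIMAL`):

```python
import re

# Matches 'decimal' or 'decimal(p,s)' with optional whitespace around the comma.
_DECIMAL_RE = re.compile(r"^decimal(?:\((\d+)\s*,\s*(\d+)\))?$")

def parse_decimal_type(type_string):
    """Parse a Hive metastore decimal type string into (precision, scale).

    A bare 'decimal' gets Hive 0.13's default of (10, 0); anything that is
    not a decimal type raises ValueError.
    """
    m = _DECIMAL_RE.match(type_string.strip().lower())
    if not m:
        raise ValueError("not a decimal type: %r" % type_string)
    if m.group(1) is None:
        return (10, 0)
    return (int(m.group(1)), int(m.group(2)))
```

With a parser like this, `decimal(20,5)` from an existing 0.13 metastore would round-trip instead of failing an exact string comparison.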
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18430590

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala ---
@@ -557,11 +557,14 @@ class HiveQuerySuite extends HiveComparisonTest {
         |WITH serdeproperties('s1'='9')""".stripMargin)
   }
-  sql(s"ADD JAR $testJar")
-  sql(
-    """ALTER TABLE alter1 SET SERDE 'org.apache.hadoop.hive.serde2.TestSerDe'
-      |WITH serdeproperties('s1'='9')""".stripMargin)
+  // now only verify 0.12.0, and ignore other versions due to binary compatibility
--- End diff --

Can you explain it a little bit more?
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18430600

--- Diff: sql/hive/v0.12.0/src/main/scala/org/apache/spark/sql/hive/Shim.scala ---
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import java.net.URI
+import java.util.Properties
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.Path
+import org.apache.hadoop.hive.common.`type`.HiveDecimal
+import org.apache.hadoop.hive.conf.HiveConf
+import org.apache.hadoop.hive.ql.Context
+import org.apache.hadoop.hive.ql.metadata.{Hive, Partition, Table}
+import org.apache.hadoop.hive.ql.plan.{FileSinkDesc, TableDesc}
+import org.apache.hadoop.hive.ql.processors._
+import org.apache.hadoop.hive.ql.stats.StatsSetupConst
+import org.apache.hadoop.hive.serde2.{Deserializer, ColumnProjectionUtils}
+import org.apache.hadoop.{io => hadoopIo}
+import org.apache.hadoop.mapred.InputFormat
+import scala.collection.JavaConversions._
+import scala.language.implicitConversions
+
+/**
+ * A compatibility layer for interacting with Hive version 0.12.0.
+ */
+private[hive] object HiveShim {
+  val version = "0.12.0"
+  val metastoreDecimal = "decimal"
+
+  def getTableDesc(
+      serdeClass: Class[_ <: Deserializer],
+      inputFormatClass: Class[_ <: InputFormat[_, _]],
+      outputFormatClass: Class[_],
+      properties: Properties) = {
+    new TableDesc(serdeClass, inputFormatClass, outputFormatClass, properties)
+  }
+
+  def getStatsSetupConstTotalSize = StatsSetupConst.TOTAL_SIZE
+
+  def createDefaultDBIfNeeded(context: HiveContext) = { }
+
+  /** The string used to denote an empty comments field in the schema. */
+  def getEmptyCommentsFieldValue = None
+
+  def getCommandProcessor(cmd: Array[String], conf: HiveConf) = {
+    CommandProcessorFactory.get(cmd(0), conf)
+  }
+
+  def createDecimal(bd: java.math.BigDecimal): HiveDecimal = {
+    new HiveDecimal(bd)
+  }
+
+  def appendReadColumns(conf: Configuration, ids: Seq[Integer], names: Seq[String]) {
+    ColumnProjectionUtils.appendReadColumnIDs(conf, ids)
+    ColumnProjectionUtils.appendReadColumnNames(conf, names)
+  }
+
+  def getExternalTmpPath(context: Context, uri: URI): String = {
--- End diff --

It will be good to make the return type consistent.
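The `HiveShim` objects above implement a compile-time shim: one object per Hive version, selected by build profile, all exposing the same members. The same pattern can be sketched at runtime in Python (class and function names here are hypothetical illustrations, not Spark APIs):

```python
class Shim12:
    """Compatibility surface for Hive 0.12.0."""
    version = "0.12.0"
    metastore_decimal = "decimal"

class Shim13:
    """Compatibility surface for Hive 0.13.1."""
    version = "0.13.1"
    metastore_decimal = "decimal(10,0)"

_SHIMS = {cls.version: cls for cls in (Shim12, Shim13)}

def load_shim(hive_version):
    """Pick the compatibility shim for a given Hive version string."""
    try:
        return _SHIMS[hive_version]()
    except KeyError:
        raise ValueError("unsupported Hive version: %s" % hive_version)
```

The rest of the code base then talks only to the shim's shared surface, so version differences stay confined to one module — which is exactly what the per-version `Shim.scala` files accomplish at build time.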
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18430657

--- Diff: sql/hive/pom.xml ---
@@ -119,6 +83,74 @@
   <profiles>
     <profile>
+      <id>hive-default</id>
+      <activation>
+        <property>
+          <name>!hive.version</name>
--- End diff --

If we use modified hive dependencies, can we avoid this error and simplify pom changes?
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18431391

--- Diff: sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim.scala ---
@@ -0,0 +1,158 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import java.util.Properties
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.Path
+import org.apache.hadoop.hive.common.StatsSetupConst
+import org.apache.hadoop.hive.common.`type`.{HiveDecimal}
+import org.apache.hadoop.hive.conf.HiveConf
+import org.apache.hadoop.hive.ql.Context
+import org.apache.hadoop.hive.ql.metadata.{Table, Hive, Partition}
+import org.apache.hadoop.hive.ql.plan.{FileSinkDesc, TableDesc}
+import org.apache.hadoop.hive.ql.processors.CommandProcessorFactory
+import org.apache.hadoop.hive.serde2.{ColumnProjectionUtils, Deserializer}
+import org.apache.hadoop.mapred.InputFormat
+import org.apache.spark.Logging
+import org.apache.hadoop.{io => hadoopIo}
+import scala.collection.JavaConversions._
+import scala.language.implicitConversions
+
+/**
+ * A compatibility layer for interacting with Hive version 0.13.1.
+ */
+private[hive] object HiveShim {
+  val version = "0.13.1"
+  /*
+   * TODO: hive-0.13 supports DECIMAL(precision, scale); DECIMAL in hive-0.12 is actually DECIMAL(10,0)
--- End diff --

Yeah, I think you are right: it is unbounded in Hive 12. Spark SQL also will use unbounded precision decimals internally, so when it's not specified that's what we should assume.
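The precision point in the TODO can be made concrete with a small sketch. This is an assumed model of the semantics being discussed, not Hive's or Spark SQL's code: an "unbounded" decimal keeps whatever precision the value carries, while a `DECIMAL(precision, scale)` column must clamp the scale and reject values whose digits overflow the declared precision. The `DecimalFit` object and `fitDecimal` helper are invented names.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

object DecimalFit {
  // Fit a value into a declared DECIMAL(precision, scale) type.
  // Returning None models "overflow becomes null" semantics; an
  // unbounded decimal would simply keep the value as-is.
  def fitDecimal(v: JBigDecimal, precision: Int, scale: Int): Option[JBigDecimal] = {
    val scaled = v.setScale(scale, RoundingMode.HALF_UP)
    if (scaled.precision > precision) None // too many digits for the declared type
    else Some(scaled)
  }
}
```

For example, `123.456` fits a `DECIMAL(10, 2)` after rounding to `123.46`, but overflows a `DECIMAL(4, 2)`, which only has room for two integer digits.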
[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-57917807

@MLnick It's a slight performance issue, since we currently use PrimitiveKeyOpenHashMap which optimizes for primitive keys by avoiding null tracking, but I think the performance loss is worth it and I'm working on adding this ([SPARK-3668](https://issues.apache.org/jira/browse/SPARK-3668)).
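The "null tracking" cost mentioned above can be illustrated with a toy open-addressing map. This is not Spark's actual `PrimitiveKeyOpenHashMap`; the `ToyLongMap` class and its internals are invented for the example. The idea shown: with primitive `Long` keys, the default value `0L` doubles as the empty-slot marker, so a map that must also accept key `0` needs an out-of-band slot for it and an extra branch on every access.

```scala
// Toy open-addressing map with primitive Long keys (linear probing,
// no resizing; assumes fewer entries than `capacity`).
class ToyLongMap[V](capacity: Int) {
  private val keys   = new Array[Long](capacity)   // 0L doubles as "empty"
  private val values = new Array[Any](capacity)
  private var zeroValue: Option[V] = None          // key 0 tracked out of band

  def update(k: Long, v: V): Unit =
    if (k == 0L) zeroValue = Some(v)
    else {
      var i = ((k % capacity).abs).toInt
      while (keys(i) != 0L && keys(i) != k) i = (i + 1) % capacity
      keys(i) = k
      values(i) = v
    }

  def get(k: Long): Option[V] =
    if (k == 0L) zeroValue                         // the extra branch on the hot path
    else {
      var i = ((k % capacity).abs).toInt
      while (keys(i) != 0L && keys(i) != k) i = (i + 1) % capacity
      if (keys(i) == k) Some(values(i).asInstanceOf[V]) else None
    }
}
```

Dropping the zero-key special case (as a primitive-key-optimized map can, when key 0 is known never to occur) removes that branch from every lookup, which is the small win being traded away.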
[GitHub] spark pull request: [SPARK-3798][SQL] Store the output of a genera...
GitHub user marmbrus opened a pull request: https://github.com/apache/spark/pull/2656

[SPARK-3798][SQL] Store the output of a generator in a val

This prevents it from changing during serialization, leading to corrupted results.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/marmbrus/spark generateBug

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2656.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2656

commit efa32eb7b2122a97c8ea309da9acdcccd462ec12
Author: Michael Armbrust mich...@databricks.com
Date: 2014-10-04T21:26:25Z

    Store the output of a generator in a val. This prevents it from changing during serialization.
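The class of bug this PR guards against can be sketched in isolation. This is a toy model, not Spark's `Generator` code: a member computed with `def` is re-evaluated on every access, so the value observed after a serialization round trip need not match what was observed before, whereas a `val` is computed once and shipped with the object. The `Ids` and `RoundTrip` names are invented for the example.

```scala
import java.io._
import scala.util.Random

class Ids extends Serializable {
  def freshId: Long = Random.nextLong()  // recomputed on every call
  val pinnedId: Long = Random.nextLong() // fixed at construction, serialized
}

object RoundTrip {
  // Java-serialize an object and read it back.
  def copy[T <: Serializable](obj: T): T = {
    val buf = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(buf)
    oos.writeObject(obj)
    oos.close()
    new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
      .readObject().asInstanceOf[T]
  }
}
```

After `val b = RoundTrip.copy(a)`, `b.pinnedId == a.pinnedId` always holds, while `freshId` yields an unrelated value on each access; if downstream code assumed such a member was stable across serialization, results would silently diverge.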
[GitHub] spark pull request: [SQL] Add type checking debugging functions
GitHub user marmbrus opened a pull request: https://github.com/apache/spark/pull/2657

[SQL] Add type checking debugging functions

Adds some functions that were very useful when trying to track down the bug from #2656.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/marmbrus/spark debugging

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2657.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2657

commit 1d0c2da90fe605ae81267477d07704725d4ac132
Author: Michael Armbrust mich...@databricks.com
Date: 2014-10-04T21:30:05Z

    Add typeChecking debugging functions
[GitHub] spark pull request: [Minor] Trivial fix to make codes more readabl...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2654#issuecomment-57919211

ok to test
[GitHub] spark pull request: [SPARK-3798][SQL] Store the output of a genera...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2656#issuecomment-57919209

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21296/consoleFull) for PR 2656 at commit [`efa32eb`](https://github.com/apache/spark/commit/efa32eb7b2122a97c8ea309da9acdcccd462ec12).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3792][SQL]enable JavaHiveQLSuite
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2652#issuecomment-57919225

ok to test
[GitHub] spark pull request: [SPARK-1655][MLLIB] Add option for distributed...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/2491#issuecomment-57919277

@mengxr Ok, updated to address your suggestions.
[GitHub] spark pull request: [Minor] Trivial fix to make codes more readabl...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2654#issuecomment-57919347

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21299/consoleFull) for PR 2654 at commit [`1362289`](https://github.com/apache/spark/commit/13622893af0a67e21d149792aaee47fcdaf427ca).
* This patch merges cleanly.
[GitHub] spark pull request: [SQL] Add type checking debugging functions
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2657#issuecomment-57919342

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21297/consoleFull) for PR 2657 at commit [`1d0c2da`](https://github.com/apache/spark/commit/1d0c2da90fe605ae81267477d07704725d4ac132).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-1655][MLLIB] Add option for distributed...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2491#issuecomment-57919346

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21300/consoleFull) for PR 2491 at commit [`e535d8b`](https://github.com/apache/spark/commit/e535d8b7f08fb848ee5687881cbfc6e4c9e798cd).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3792][SQL]enable JavaHiveQLSuite
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2652#issuecomment-57919345

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21298/consoleFull) for PR 2652 at commit [`be35c91`](https://github.com/apache/spark/commit/be35c919f2e4701775a69c1ce09831cb205037de).
* This patch merges cleanly.
[GitHub] spark pull request: [SQL] Add type checking debugging functions
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2657#issuecomment-57919377

Test FAILed.
Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21297/
Test FAILed.
[GitHub] spark pull request: [SQL] Add type checking debugging functions
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2657#issuecomment-57919376

[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21297/consoleFull) for PR 2657 at commit [`1d0c2da`](https://github.com/apache/spark/commit/1d0c2da90fe605ae81267477d07704725d4ac132).
* This patch **fails** unit tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class TypeCheck(child: SparkPlan) extends SparkPlan`
[GitHub] spark pull request: [Minor] Trivial fix to make codes more readabl...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2654#issuecomment-57920602

[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21299/consoleFull) for PR 2654 at commit [`1362289`](https://github.com/apache/spark/commit/13622893af0a67e21d149792aaee47fcdaf427ca).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3792][SQL]enable JavaHiveQLSuite
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2652#issuecomment-57920595

[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21298/consoleFull) for PR 2652 at commit [`be35c91`](https://github.com/apache/spark/commit/be35c919f2e4701775a69c1ce09831cb205037de).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3773][PySpark][Doc] Sphinx build warnin...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2653#issuecomment-57920629

Jenkins, this is ok to test.
[GitHub] spark pull request: [SPARK-3798][SQL] Store the output of a genera...
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2656#issuecomment-57920689

Is there any way to add a unit test for this?
[GitHub] spark pull request: [SPARK-1655][MLLIB] Add option for distributed...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2491#issuecomment-57920705

[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21300/consoleFull) for PR 2491 at commit [`e535d8b`](https://github.com/apache/spark/commit/e535d8b7f08fb848ee5687881cbfc6e4c9e798cd).
* This patch **fails** unit tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `abstract class NaiveBayesModel extends ClassificationModel with Serializable`
[GitHub] spark pull request: [SPARK-1655][MLLIB] Add option for distributed...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2491#issuecomment-57920707

Test FAILed.
Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21300/
Test FAILed.
[GitHub] spark pull request: [SPARK-3773][PySpark][Doc] Sphinx build warnin...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2653#issuecomment-57920747

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21301/consoleFull) for PR 2653 at commit [`6f65661`](https://github.com/apache/spark/commit/6f656618d7a6fe3f9977f6a1fb15350577388f06).
* This patch merges cleanly.
[GitHub] spark pull request: [Minor] Trivial fix to make codes more readabl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2654#issuecomment-57920603

Test PASSed.
Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21299/
Test PASSed.
[GitHub] spark pull request: [SPARK-3792][SQL]enable JavaHiveQLSuite
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2652#issuecomment-57920598

Test PASSed.
Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21298/
Test PASSed.
[GitHub] spark pull request: [SPARK-3798][SQL] Store the output of a genera...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2656#issuecomment-57920778

Unfortunately, I haven't found a way to reproduce it deterministically.
[GitHub] spark pull request: [SPARK-1655][MLLIB] Add option for distributed...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/2491#issuecomment-57920855

Again, python tests failed because the python interface is disabled in order to focus on the scala implementation first.
[GitHub] spark pull request: [SPARK-3597][Mesos] Implement `killTask`.
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2453#issuecomment-57921063

This looks good to me. I tested this patch against Mesos 0.20.1 running in Docker, with a modified version of a test from Spark's JobCancellationSuite. @tnachen commented on this over at https://github.com/apache/spark/pull/1940#issuecomment-56246740: "On Mesos side if you call killTask on a non-existing Task all you get is a LOG(WARNING)."
[GitHub] spark pull request: [SPARK-3765][Doc] add testing with sbt to docs
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/2629#issuecomment-57921350

@JoshRosen, can you test this?
[GitHub] spark pull request: [SPARK-3765][Doc] add testing with sbt to docs
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2629#issuecomment-57921539

Jenkins, this is ok to test.
[GitHub] spark pull request: [SPARK-3765][Doc] add testing with sbt to docs
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2629#issuecomment-57921605

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21302/consoleFull) for PR 2629 at commit [`fd9cf29`](https://github.com/apache/spark/commit/fd9cf297865e4e5bd5ba375b56094c68beb7287a).
* This patch merges cleanly.