[GitHub] spark pull request: [SPARK-15415][SQL] Fix BroadcastHint when auto...

2016-05-21 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/13244#discussion_r64144623
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala ---
@@ -32,4 +32,4 @@ package org.apache.spark.sql.catalyst.plans.logical
  * @param sizeInBytes Physical size in bytes. For leaf operators this defaults to 1, otherwise it
  *                    defaults to the product of children's `sizeInBytes`.
  */
-private[sql] case class Statistics(sizeInBytes: BigInt)
+private[sql] case class Statistics(sizeInBytes: BigInt, isBroadcastable: Boolean = false)
--- End diff --

It would be good to document `isBroadcastable` in the classdoc.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15470] [SQL] Unify the Configuration In...

2016-05-21 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/13247#issuecomment-220815712
  
@cloud-fan Based on my understanding, the runtime conf (`class RuntimeConfig`)
is designed as the public/external interface through which users access the
internal conf. If users want to change a configuration at runtime, they must use
`RuntimeConfig`. In the future, could we enhance it further to block external
users from changing the internal conf? Would it also be easier to manage the
Hadoop configuration in `RuntimeConfig`?

`SQLConf` will be just an internal implementation of the configuration. We do
not expect external users to access it directly.

You know, this is just my understanding. : ) 
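
A minimal sketch of the layering described above, assuming a public facade that wraps an internal store. `InternalConf`, `PublicRuntimeConfig`, and the `spark.sql.internal.` prefix are hypothetical names chosen for illustration; they are not Spark's actual classes or key-naming rules. The facade is the only object handed to users, so a restriction on internal settings could be added in one place.

```python
class InternalConf:
    """Hypothetical stand-in for an internal configuration store."""

    def __init__(self):
        self._settings = {}

    def set_conf_string(self, key, value):
        self._settings[key] = value

    def get_conf_string(self, key, default=None):
        return self._settings.get(key, default)

    def unset_conf(self, key):
        self._settings.pop(key, None)


class PublicRuntimeConfig:
    """Hypothetical public facade; user code only ever sees this class."""

    # Assumed naming rule for keys that external users may not modify.
    _INTERNAL_PREFIX = "spark.sql.internal."

    def __init__(self, internal_conf):
        self._conf = internal_conf

    def set(self, key, value):
        if key.startswith(self._INTERNAL_PREFIX):
            raise ValueError("cannot modify internal setting: " + key)
        # Values are stored as strings; booleans become "true"/"false".
        text = str(value).lower() if isinstance(value, bool) else str(value)
        self._conf.set_conf_string(key, text)

    def get(self, key, default=None):
        return self._conf.get_conf_string(key, default)

    def unset(self, key):
        self._conf.unset_conf(key)
```

With this shape, `PublicRuntimeConfig(InternalConf()).set("spark.sql.internal.foo", "x")` raises `ValueError`, while ordinary keys pass through to the internal store.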





[GitHub] spark pull request: [SPARK-15194] [ML] Add Python ML API for Multi...

2016-05-21 Thread praveendareddy21
GitHub user praveendareddy21 opened a pull request:

https://github.com/apache/spark/pull/13248

[SPARK-15194] [ML] Add Python ML API for MultivariateGaussian

## What changes were proposed in this pull request?

Added `MultivariateGaussian` to PySpark ML to match Scala's ML API.


## How was this patch tested?

Tested locally, and also added test cases from Scala's test suite.





You can merge this pull request into a Git repository by running:

$ git pull https://github.com/praveendareddy21/spark local_branch

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13248.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13248


commit a7250b4dd538be255f8220de20277d69edbeebac
Author: red 
Date:   2016-05-22T05:22:05Z

added Multivariate gaussian in ML Pyspark

commit 0c58e8866498d4e42af0542819fca8a6d76af08a
Author: red 
Date:   2016-05-22T05:33:56Z

added testcase for python multivariate







[GitHub] spark pull request: [SPARK-15194] [ML] Add Python ML API for Multi...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13248#issuecomment-220815663
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-15285][SQL] Generated SpecificSafeProje...

2016-05-21 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/13243#issuecomment-220815424
  
The fallback approach doesn't look that simple or clean. Can you try splitting
the generated code like we did in `CreateExternalRow`?
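
To illustrate what splitting generated code means in general: the JVM limits the bytecode of a single method to 64 KB, so a code generator can emit many small helper methods plus one driver method that calls them in order. The sketch below is a Python toy that produces Java-like source text, not Spark's code generator; `split_into_methods`, the `apply` prefix, and the `InternalRow i` parameter are assumptions made for the example.

```python
def split_into_methods(statements, max_per_method=100, prefix="apply"):
    """Group generated statements into small helper methods plus one driver.

    Returns Java-like source text. All emitted names are hypothetical.
    """
    if max_per_method < 1:
        raise ValueError("max_per_method must be positive")

    # Cut the flat statement list into fixed-size chunks.
    chunks = [statements[i:i + max_per_method]
              for i in range(0, len(statements), max_per_method)]

    methods = []
    calls = []
    for index, chunk in enumerate(chunks):
        name = "{}_{}".format(prefix, index)
        body = "\n".join("  " + statement for statement in chunk)
        methods.append(
            "private void {}(InternalRow i) {{\n{}\n}}".format(name, body))
        calls.append("  {}(i);".format(name))

    # The driver only calls the helpers, so it stays small.
    driver = "public void {}(InternalRow i) {{\n{}\n}}".format(
        prefix, "\n".join(calls))
    return "\n\n".join(methods + [driver])
```

Each helper holds at most `max_per_method` statements, so no single emitted method grows with the total number of fields.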





[GitHub] spark pull request: [SPARK-15470] [SQL] Unify the Configuration In...

2016-05-21 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/13247#issuecomment-220815210
  
What's the difference between runtime conf and normal conf?





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220814644
  
Unfortunately, `Dataset` (or `DataFrame`) does not seem suitable for achieving
the goal in Python.
```python
>>> spark.parallelize(range(1, 10)).toDS()
...
AttributeError: 'RDD' object has no attribute 'toDS'
>>> spark.parallelize(range(1, 10)).toDF()
...
TypeError: Can not infer schema for type: 
```

I'll think about this more until tomorrow and close this if I cannot find a 
neat solution.





[GitHub] spark pull request: [SPARK-15470] [SQL] Unify the Configuration In...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13247#issuecomment-220814526
  
**[Test build #59090 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59090/consoleFull)** for PR 13247 at commit [`f443064`](https://github.com/apache/spark/commit/f443064bfabb9e1055d75b7ee1b33085d72b1a3f).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-15470] [SQL] Unify the Configuration In...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13247#issuecomment-220814528
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59090/
Test FAILed.





[GitHub] spark pull request: [SPARK-15470] [SQL] Unify the Configuration In...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13247#issuecomment-220814527
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-15396] [SQL] [DOC] It can't connect hiv...

2016-05-21 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/13225#issuecomment-220814492
  
@rxin @jameszhouyi Do you think the document changes in this PR are clear? 
Please let me know if anything is missing or inappropriate. Thanks!

Also CCing all the committers who changed the related code: @yhuai
@andrewor14 @cloud-fan @liancheng





[GitHub] spark pull request: [SPARK-15470] [SQL] Unify the Configuration In...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13247#issuecomment-220814370
  
**[Test build #59090 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59090/consoleFull)** for PR 13247 at commit [`f443064`](https://github.com/apache/spark/commit/f443064bfabb9e1055d75b7ee1b33085d72b1a3f).





[GitHub] spark pull request: [SPARK-15312] [SQL] Detect Duplicate Key in Pa...

2016-05-21 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/13095#issuecomment-220814326
  
Thank you, @cloud-fan !





[GitHub] spark pull request: [SPARK-15470] [SQL] Unify the Configuration In...

2016-05-21 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/13247

[SPARK-15470] [SQL] Unify the Configuration Interface in SQLContext

## What changes were proposed in this pull request?
We introduced `RuntimeConfig` in `SQLContext` in PR
https://github.com/apache/spark/pull/12669. Now `SQLContext` has both `conf`
and `runtimeConf`. `SQLContext` is being replaced by `SparkSession`. As with
`SparkSession`, we should not have two configuration interfaces, which means we
should not expose `conf` to external users.

This PR contains three major parts:

1. removed `conf` from `SQLContext`. 
2. added the missing functions into `RuntimeConfig`, including two `set` 
functions and one `clear` function.
3. fixed the test cases in `SparkSessionBuilderSuite.scala`. Without this 
fix, we are unable to individually run the test cases. All the test cases 
require `initialSession`.  

@rxin @andrewor14 @yhuai @cloud-fan  Do you think this PR is valid? Thanks!

## How was this patch tested?
Existing test cases cover it.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark configNew

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13247.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13247


commit f5708f52171ef5ce04eb4358d101d55862cc2294
Author: gatorsmile 
Date:   2016-05-21T21:25:38Z

initial fix.

commit 0808fa13f04a228377be8f1d17d0aa7da4a47aee
Author: gatorsmile 
Date:   2016-05-22T03:29:38Z

update the test suites.

commit f443064bfabb9e1055d75b7ee1b33085d72b1a3f
Author: gatorsmile 
Date:   2016-05-22T04:33:35Z

remove conf from SQLContext







[GitHub] spark pull request: [SPARK-15206][SQL] add testcases for distinct ...

2016-05-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/12984





[GitHub] spark pull request: [SPARK-15312] [SQL] Detect Duplicate Key in Pa...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13095#issuecomment-220814047
  
**[Test build #59089 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59089/consoleFull)** for PR 13095 at commit [`d7c2420`](https://github.com/apache/spark/commit/d7c2420cd21e812e08bdea7aa27adf42fe534b98).





[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13216#issuecomment-220814048
  
**[Test build #59088 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59088/consoleFull)** for PR 13216 at commit [`d677105`](https://github.com/apache/spark/commit/d67710504723ef42b6719d2b242aa0527cad2584).





[GitHub] spark pull request: [SPARK-15206][SQL] add testcases for distinct ...

2016-05-21 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/12984#issuecomment-220814026
  
thanks, merging to master and 2.0!





[GitHub] spark pull request: [SPARK-15415][SQL] Fix BroadcastHint when auto...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13244#issuecomment-220813993
  
**[Test build #3009 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3009/consoleFull)** for PR 13244 at commit [`8b9bf51`](https://github.com/apache/spark/commit/8b9bf515423fa422d3c8436097acd87c4d09b733).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-15379][SQL] check special invalid date

2016-05-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13169#discussion_r64143999
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala ---
@@ -353,6 +353,20 @@ class DateTimeUtilsSuite extends SparkFunSuite {
 c.getTimeInMillis * 1000 + 123456)
   }
 
+  test("SPARK-15379: special invalid date string") {
+    // Test stringToDate
+    assert(stringToDate(
+      UTF8String.fromString("2015-02-29 00:00:00")).isEmpty)
--- End diff --

Can we try a date string (without the timestamp part) here?
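
For illustration only, a Python analogue of the check being requested. 2015 is not a leap year, so February 29 is invalid whether or not a time part follows the date. `string_to_date` is a hypothetical helper built on `datetime.strptime`; it is not Spark's `stringToDate`, and it accepts only the two formats shown.

```python
from datetime import date, datetime


def string_to_date(text):
    """Return a `date` for a valid string, or None if it is not a real date.

    Accepts "YYYY-MM-DD" with an optional " HH:MM:SS" suffix.
    """
    for fmt in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d"):
        try:
            # strptime raises ValueError for a format mismatch and for
            # impossible calendar dates such as February 29 in a non-leap year.
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None
```

Testing both `string_to_date("2015-02-29")` and `string_to_date("2015-02-29 00:00:00")` covers the date-only path as well as the timestamp path, which is the gap the review comment points at.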





[GitHub] spark pull request: [SPARK-15312] [SQL] Detect Duplicate Key in Pa...

2016-05-21 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/13095#issuecomment-220813962
  
LGTM, pending jenkins





[GitHub] spark pull request: [SPARK-15312] [SQL] Detect Duplicate Key in Pa...

2016-05-21 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/13095#issuecomment-220813959
  
retest this please





[GitHub] spark pull request: [SPARK-15428][SQL] Disable multiple streaming ...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13210#issuecomment-220813852
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59087/
Test PASSed.





[GitHub] spark pull request: [SPARK-15428][SQL] Disable multiple streaming ...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13210#issuecomment-220813851
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-15428][SQL] Disable multiple streaming ...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13210#issuecomment-220813817
  
**[Test build #59087 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59087/consoleFull)** for PR 13210 at commit [`abc12a5`](https://github.com/apache/spark/commit/abc12a5ee606282b069ff0c326a2f32d4ed2fbe2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220813421
  
I see. Thank you!





[GitHub] spark pull request: [SPARK-15464][ML][MLlib][SQL][Tests] Replace S...

2016-05-21 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/13242#issuecomment-220812010
  
cc @andrewor14 





[GitHub] spark pull request: [SPARK-15415][SQL] Fix BroadcastHint when auto...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13244#issuecomment-220811992
  
**[Test build #3009 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3009/consoleFull)** for PR 13244 at commit [`8b9bf51`](https://github.com/apache/spark/commit/8b9bf515423fa422d3c8436097acd87c4d09b733).





[GitHub] spark pull request: [SPARK-15415][SQL] Fix BroadcastHint when auto...

2016-05-21 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/13244#issuecomment-220811981
  
Looks good at high level. Will take a closer look later!






[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220811972
  
Hm, we are trying to avoid returning RDDs in the new APIs. One thing we could
do is introduce a `parallelize` API that returns a Dataset?






[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command

2016-05-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13121





[GitHub] spark pull request: [SPARK-15428][SQL] Disable multiple streaming ...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13210#issuecomment-220811819
  
**[Test build #59087 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59087/consoleFull)** for PR 13210 at commit [`abc12a5`](https://github.com/apache/spark/commit/abc12a5ee606282b069ff0c326a2f32d4ed2fbe2).





[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command

2016-05-21 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/13121#issuecomment-220811816
  
Merging in master/2.0. Thanks.






[GitHub] spark pull request: [SPARK-15468] [SQL] some some typos

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13246#issuecomment-220810589
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59086/
Test PASSed.





[GitHub] spark pull request: [SPARK-15468] [SQL] some some typos

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13246#issuecomment-220810588
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-15468] [SQL] some some typos

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13246#issuecomment-220810568
  
**[Test build #59086 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59086/consoleFull)**
 for PR 13246 at commit 
[`3a57975`](https://github.com/apache/spark/commit/3a5797544792557a6a143784277753f4d93dd031).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11696] [ML, MLlib] Optimization: Extend...

2016-05-21 Thread NarineK
Github user NarineK closed the pull request at:

https://github.com/apache/spark/pull/9667





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220809759
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59082/
Test FAILed.





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220809758
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220809739
  
**[Test build #59082 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59082/consoleFull)**
 for PR 13245 at commit 
[`65f9746`](https://github.com/apache/spark/commit/65f9746362ac6fb227a5c8ff59717852b5ae87c4).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220809417
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220809418
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59085/
Test PASSed.





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220809398
  
**[Test build #59085 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59085/consoleFull)**
 for PR 13245 at commit 
[`4f6a69e`](https://github.com/apache/spark/commit/4f6a69e75d3c96f3b2ed9d93edf8d1bf958acf1c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-15468] [SQL] some some typos

2016-05-21 Thread bomeng
Github user bomeng commented on a diff in the pull request:

https://github.com/apache/spark/pull/13246#discussion_r64142270
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala ---
@@ -227,8 +227,8 @@ object IntegerIndex {
  *  - Unnamed grouping expressions are named so that they can be referred to across phases of
  *    aggregation
  *  - Aggregations that appear multiple times are deduplicated.
- *  - The compution of the aggregations themselves is separated from the final result. For example,
- *    the `count` in `count + 1` will be split into an [[AggregateExpression]] and a final
+ *  - The computation of the aggregations themselves is separated from the final result. For
+ *    example, the `count` in `count + 1` will be split into an [[AggregateExpression]] and a final
--- End diff --

This change is only needed to stay within the 100-character line limit after
the typo fix on the previous line.





[GitHub] spark pull request: [SPARK-15468] [SQL] some some typos

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13246#issuecomment-220807733
  
**[Test build #59086 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59086/consoleFull)**
 for PR 13246 at commit 
[`3a57975`](https://github.com/apache/spark/commit/3a5797544792557a6a143784277753f4d93dd031).





[GitHub] spark pull request: [SPARK-15468] [SQL] some some typos

2016-05-21 Thread bomeng
GitHub user bomeng opened a pull request:

https://github.com/apache/spark/pull/13246

[SPARK-15468] [SQL] some some typos

## What changes were proposed in this pull request?

Fix some typos found while browsing the code.

## How was this patch tested?

None; the changes are trivial typo fixes.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bomeng/spark typo

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13246.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13246


commit ff73a8ddc036e1d8edf7eaa3be2e39db4b17d67f
Author: bomeng 
Date:   2016-05-19T01:32:27Z

fix typo

commit 6b05bc95623483f96757a917508fc3737b20bc90
Author: Bo Meng 
Date:   2016-05-20T18:48:17Z

Merge remote-tracking branch 'upstream/master' into typo

commit 3a5797544792557a6a143784277753f4d93dd031
Author: Bo Meng 
Date:   2016-05-21T22:32:12Z

Merge remote-tracking branch 'upstream/master' into typo







[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13216#issuecomment-220807530
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13216#issuecomment-220807531
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59084/
Test FAILed.





[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13216#issuecomment-220807510
  
**[Test build #59084 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59084/consoleFull)**
 for PR 13216 at commit 
[`c29acae`](https://github.com/apache/spark/commit/c29acaeccc5342b51f645449ee75e8e513c89c36).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-15430][SQL] Fix potential ConcurrentMod...

2016-05-21 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/13211#discussion_r64142098
  
--- Diff: core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala ---
@@ -437,7 +438,9 @@ class ListAccumulator[T] extends AccumulatorV2[T, java.util.List[T]] {
       s"Cannot merge ${this.getClass.getName} with ${other.getClass.getName}")
   }
 
-  override def value: java.util.List[T] = java.util.Collections.unmodifiableList(_list)
+  override def value: java.util.List[T] = _list.synchronized {
+    java.util.Collections.unmodifiableList(new ArrayList[T](_list))
--- End diff --

I think so. Allowing users to modify the list does not seem like a good idea.
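
The idea in the diff above (copy the backing list while holding a lock, then
hand out a read-only result) can be illustrated outside Spark. The following
is only a plain-Python sketch of that pattern, not Spark's implementation:
the class name `SnapshotListAccumulator` and its methods are hypothetical,
and a `tuple` stands in for Java's unmodifiable list.

```python
import threading


class SnapshotListAccumulator:
    """Hypothetical stand-in for a list accumulator; not a Spark class."""

    def __init__(self):
        self._list = []
        self._lock = threading.Lock()

    def add(self, v):
        # Writers take the same lock that readers use for copying.
        with self._lock:
            self._list.append(v)

    @property
    def value(self):
        # Copy under the lock so a concurrent add() cannot interleave with
        # the copy; the tuple is immutable, so callers cannot modify it.
        with self._lock:
            return tuple(self._list)


acc = SnapshotListAccumulator()
acc.add(1)
snapshot = acc.value
acc.add(2)
print(snapshot)   # prints (1,)   the snapshot ignores the later add
print(acc.value)  # prints (1, 2)
```

The trade-off is one copy per read: the caller gets a stable result, at the
cost of allocating a new collection each time `value` is accessed.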





[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13121#issuecomment-220806369
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59083/
Test PASSed.





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220806446
  
**[Test build #59085 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59085/consoleFull)**
 for PR 13245 at commit 
[`4f6a69e`](https://github.com/apache/spark/commit/4f6a69e75d3c96f3b2ed9d93edf8d1bf958acf1c).





[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13121#issuecomment-220806368
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13121#issuecomment-220806334
  
**[Test build #59083 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59083/consoleFull)**
 for PR 13121 at commit 
[`5b759da`](https://github.com/apache/spark/commit/5b759da0800c06037283d553978bac33717e71a1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220805756
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59081/
Test FAILed.





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220805754
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220805723
  
**[Test build #59081 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59081/consoleFull)**
 for PR 13245 at commit 
[`810f08a`](https://github.com/apache/spark/commit/810f08a666c5d14a2178e329b7c1727603be485e).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13216#issuecomment-220805471
  
**[Test build #59084 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59084/consoleFull)**
 for PR 13216 at commit 
[`c29acae`](https://github.com/apache/spark/commit/c29acaeccc5342b51f645449ee75e8e513c89c36).





[GitHub] spark pull request: [SPARK-15280] [Input/Output] Refactored OrcOut...

2016-05-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13066





[GitHub] spark pull request: [SPARK-15280] [Input/Output] Refactored OrcOut...

2016-05-21 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/13066#issuecomment-220805144
  
Merging to master and branch 2.0.





[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command

2016-05-21 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/13121#issuecomment-220804464
  
LGTM pending tests.






[GitHub] spark pull request: [SPARK-14554][SQL] disable whole stage codegen...

2016-05-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/12322#discussion_r64141139
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala ---
@@ -620,6 +620,12 @@ class DatasetSuite extends QueryTest with SharedSQLContext {
     val df = streaming.join(static, Seq("b"))
     assert(df.isStreaming, "streaming Dataset returned false for 'isStreaming'.")
   }
+
+  test("SPARK-14554: Dataset.map may generate wrong java code for wide table") {
+    val wideDF = sqlContext.range(10).select(Seq.tabulate(1000) {i => ('id + i).as(s"c$i")} : _*)
+    // Make sure the generated code for this plan can compile and execute.
+    wideDF.map(_.getLong(0)).collect()
--- End diff --

Do you know why this test case is so slow? It took more than 5 minutes to
finish. Is this expected?

```
- SPARK-14554: Dataset.map may generate wrong java code for wide table (5 minutes, 20 seconds)
```

See the link: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59079/consoleFull





[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13121#issuecomment-220803288
  
**[Test build #59083 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59083/consoleFull)**
 for PR 13121 at commit 
[`5b759da`](https://github.com/apache/spark/commit/5b759da0800c06037283d553978bac33717e71a1).





[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command

2016-05-21 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/13121#issuecomment-220803118
  
retest this please





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220802722
  
**[Test build #59082 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59082/consoleFull)**
 for PR 13245 at commit 
[`65f9746`](https://github.com/apache/spark/commit/65f9746362ac6fb227a5c8ff59717852b5ae87c4).





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220801908
  
**[Test build #59081 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59081/consoleFull)**
 for PR 13245 at commit 
[`810f08a`](https://github.com/apache/spark/commit/810f08a666c5d14a2178e329b7c1727603be485e).





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request:

https://github.com/apache/spark/pull/13245#issuecomment-220801877
  
Hi, @rxin.
I'd like to hear your opinion about this PR.





[GitHub] spark pull request: [SPARK-15466][SQL] Make `SparkSession` as the ...

2016-05-21 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/13245

[SPARK-15466][SQL] Make `SparkSession` as the entry point to programming 
with RDD too

## What changes were proposed in this pull request?

`SparkSession` greatly reduces the number of concepts that Spark users
must know. Currently, `SparkSession` is defined as the entry point to
programming Spark with the Dataset and DataFrame API, and an `RDD` can easily
be obtained by calling `Dataset.rdd` or `DataFrame.rdd`.

However, much existing code (including the examples) extracts
`SparkSession.sparkContext` and keeps it in a separate variable just to call
`parallelize`.

If `SparkSession` supported RDDs seamlessly too, usability would improve. We
can do this by simply adding a `parallelize` API.

**Example**
```scala
 object SparkPi {
   def main(args: Array[String]) {
     val spark = SparkSession
       .builder
       .appName("Spark Pi")
       .getOrCreate()
-    val sc = spark.sparkContext
     val slices = if (args.length > 0) args(0).toInt else 2
     val n = math.min(10L * slices, Int.MaxValue).toInt // avoid overflow
-    val count = sc.parallelize(1 until n, slices).map { i =>
+    val count = spark.parallelize(1 until n, slices).map { i =>
       val x = random * 2 - 1
       val y = random * 2 - 1
       if (x*x + y*y < 1) 1 else 0
     }.reduce(_ + _)
     println("Pi is roughly " + 4.0 * count / n)
     spark.stop()
   }
 }
```

```python
 spark = SparkSession\
   .builder\
   .appName("PythonPi")\
   .getOrCreate()

- sc = spark._sc
-
 partitions = int(sys.argv[1]) if len(sys.argv) > 1 else 2
 n = 10 * partitions

 def f(_):
   x = random() * 2 - 1
   y = random() * 2 - 1
   return 1 if x ** 2 + y ** 2 < 1 else 0

-count = sc.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
+count = spark.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
 print("Pi is roughly %f" % (4.0 * count / n))

 spark.stop()
```
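
For readers unfamiliar with the computation in both examples: each sample
draws a point uniformly from the square [-1, 1] x [-1, 1], and the fraction of
points that fall inside the unit circle approximates pi/4. Below is a minimal
Spark-free sketch of that map/reduce in plain Python. The helper name
`estimate_pi` and its `seed` parameter are illustrative assumptions and are
not part of this PR.

```python
import random
from functools import reduce
from operator import add


def estimate_pi(n, seed=None):
    """Monte Carlo estimate of pi from n samples (n must be positive)."""
    rng = random.Random(seed)  # a seeded generator keeps runs reproducible

    def f(_):
        # Same test as in the examples above: is (x, y) inside the unit circle?
        x = rng.random() * 2 - 1
        y = rng.random() * 2 - 1
        return 1 if x ** 2 + y ** 2 < 1 else 0

    # Local counterpart of parallelize(...).map(f).reduce(add).
    count = reduce(add, map(f, range(1, n + 1)), 0)
    return 4.0 * count / n


print("Pi is roughly %f" % estimate_pi(100000, seed=42))
```

With only `10 * partitions` samples, as in the snippet above, the estimate is
very coarse; the statistical error shrinks roughly with the square root of the
sample count.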

## How was this patch tested?

Passes the Jenkins tests (including a new Python test), plus manual testing.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark SPARK-15466

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13245.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13245


commit 810f08a666c5d14a2178e329b7c1727603be485e
Author: Dongjoon Hyun 
Date:   2016-05-21T21:36:04Z

[SPARK-15466][SQL] Make `SparkSession` as the entry point to programming 
with RDD too







[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13121#issuecomment-220801790
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13121#issuecomment-220801786
  
**[Test build #59080 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59080/consoleFull)**
 for PR 13121 at commit 
[`5b759da`](https://github.com/apache/spark/commit/5b759da0800c06037283d553978bac33717e71a1).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13121#issuecomment-220801791
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59080/
Test FAILed.





[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13121#issuecomment-220801477
  
**[Test build #59080 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59080/consoleFull)**
 for PR 13121 at commit 
[`5b759da`](https://github.com/apache/spark/commit/5b759da0800c06037283d553978bac33717e71a1).





[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command

2016-05-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/13121#discussion_r64140446
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfSuite.scala ---
@@ -107,6 +107,53 @@ class SQLConfSuite extends QueryTest with 
SharedSQLContext {
 }
   }
 
+  test("reset - public conf") {
+spark.sqlContext.conf.clear()
+val original = spark.conf.get(SQLConf.GROUP_BY_ORDINAL)
+try{
--- End diff --

Thanks, let me fix it. 





[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13216#issuecomment-220799366
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13216#issuecomment-220799367
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59079/
Test FAILed.





[GitHub] spark pull request: [SPARK-15315][SQL] Adding error check to the C...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13105#issuecomment-220799325
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59078/
Test PASSed.





[GitHub] spark pull request: [SPARK-15315][SQL] Adding error check to the C...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13105#issuecomment-220799323
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13216#issuecomment-220799329
  
**[Test build #59079 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59079/consoleFull)**
 for PR 13216 at commit 
[`bc0d10a`](https://github.com/apache/spark/commit/bc0d10a5103c4e82dce725be792530120c9f6ff6).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class TruncateTable(`





[GitHub] spark pull request: [SPARK-15315][SQL] Adding error check to the C...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13105#issuecomment-220799279
  
**[Test build #59078 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59078/consoleFull)**
 for PR 13105 at commit 
[`87c6f27`](https://github.com/apache/spark/commit/87c6f27e8755c6f72e4821cf5cd1b77baf74ed4b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-15430][SQL] Fix potential ConcurrentMod...

2016-05-21 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/13211#discussion_r64139990
  
--- Diff: core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala ---
@@ -437,7 +438,9 @@ class ListAccumulator[T] extends AccumulatorV2[T, 
java.util.List[T]] {
   s"Cannot merge ${this.getClass.getName} with 
${other.getClass.getName}")
   }
 
-  override def value: java.util.List[T] = 
java.util.Collections.unmodifiableList(_list)
+  override def value: java.util.List[T] = _list.synchronized {
+java.util.Collections.unmodifiableList(new ArrayList[T](_list))
--- End diff --

One last thought: now that this is cloned, does it need to be in an 
unmodifiable wrapper? Maybe it's still a good idea, so that the caller doesn't 
somehow think that modifying the list modifies the accumulator.
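
To make the trade-off concrete, here is a minimal sketch of the "copy under a lock, then hand out a read-only view" idea. It is a Python analogue, not the Scala code in the diff: the class and its method names are hypothetical stand-ins, a `threading.Lock` plays the role of `_list.synchronized`, and a `tuple` plays the role of the unmodifiable wrapper around the cloned list.

```python
import threading


class ListAccumulator:
    """Hypothetical Python analogue of the accumulator in the diff (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def add(self, item):
        # Writers take the same lock that readers use for copying.
        with self._lock:
            self._items.append(item)

    @property
    def value(self):
        # Copy while holding the lock so a concurrent add() cannot disturb the
        # copy, and return a tuple so the caller cannot mutate the result and
        # mistake it for the accumulator's live state.
        with self._lock:
            return tuple(self._items)


acc = ListAccumulator()
acc.add(1)
snapshot = acc.value
acc.add(2)

print(snapshot)   # the earlier snapshot is unaffected by the later add
print(acc.value)  # a fresh call sees both elements
```

The copy alone already protects the accumulator's internal list; the read-only wrapper only changes what the caller sees when trying to mutate the result, which turns a silent no-op into an explicit error.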





[GitHub] spark pull request: [SPARK-15459][SQL] Make Range logical and phys...

2016-05-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13239#discussion_r64139787
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -359,8 +359,8 @@ private[sql] abstract class SparkStrategies extends 
QueryPlanner[SparkPlan] {
   generator, join = join, outer = outer, g.output, 
planLater(child)) :: Nil
   case logical.OneRowRelation =>
 execution.RDDScanExec(Nil, singleRowRdd, "OneRowRelation") :: Nil
-  case r @ logical.Range(start, end, step, numSlices, output) =>
-execution.RangeExec(start, step, numSlices, r.numElements, output) 
:: Nil
+  case r : logical.Range =>
--- End diff --

nit. 'case r :' -> 'case r:' ?





[GitHub] spark pull request: [SPARK-15452][SQL] Mark aggregator API as expe...

2016-05-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13226





[GitHub] spark pull request: [SPARK-15452][SQL] Mark aggregator API as expe...

2016-05-21 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/13226#issuecomment-220796810
  
Merging in master/2.0.






[GitHub] spark pull request: [SPARK-15459][SQL] Make Range logical and phys...

2016-05-21 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/13239#issuecomment-220796771
  
cc @marmbrus 





[GitHub] spark pull request: [SPARK-15327] [SQL] fix split expression in wh...

2016-05-21 Thread jurriaan
Github user jurriaan commented on the pull request:

https://github.com/apache/spark/pull/13235#issuecomment-220796543
  
Thanks! I'm not sure I understand it correctly, but what happens when the 
whole-stage codegen generates code longer than 64k? I thought about 
fixing this issue by passing some of the variables to the generated functions 
(but was not sure how to do that exactly).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15430][SQL] Fix potential ConcurrentMod...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13211#issuecomment-220796529
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15430][SQL] Fix potential ConcurrentMod...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13211#issuecomment-220796487
  
**[Test build #59074 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59074/consoleFull)**
 for PR 13211 at commit 
[`3af08bd`](https://github.com/apache/spark/commit/3af08bd7c417520854971d2d14fb4ae608a9522a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13216#issuecomment-220796420
  
**[Test build #59079 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59079/consoleFull)**
 for PR 13216 at commit 
[`bc0d10a`](https://github.com/apache/spark/commit/bc0d10a5103c4e82dce725be792530120c9f6ff6).





[GitHub] spark pull request: [SPARK-15327] [SQL] fix split expression in wh...

2016-05-21 Thread jurriaan
Github user jurriaan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13235#discussion_r64139152
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -2477,6 +2477,30 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
 }
   }
 
+  test("SPARK-15327: fail to compile generated code with complex data 
structure") {
+withTempDir{ dir =>
+  val json =
+"""
+  |{"h": {"b": {"c": [{"e": "adfgd"}], "a": [{"e": "testing", 
"count": 3}],
+  |"b": [{"e": "test", "count": 1}]}}, "d": {"b": {"c": [{"e": 
"adfgd"}],
+  |"a": [{"e": "testing", "count": 3}], "b": [{"e": "test", 
"count": 1}]}},
+  |"c": {"b": {"c": [{"e": "adfgd"}], "a": [{"count": 3}],
+  |"b": [{"e": "test", "count": 1}]}}, "a": {"b": {"c": [{"e": 
"adfgd"}],
+  |"a": [{"count": 3}], "b": [{"e": "test", "count": 1}]}},
+  |"e": {"b": {"c": [{"e": "adfgd"}], "a": [{"e": "testing", 
"count": 3}],
+  |"b": [{"e": "test", "count": 1}]}}, "g": {"b": {"c": [{"e": 
"adfgd"}],
+  |"a": [{"e": "testing", "count": 3}], "b": [{"e": "test", 
"count": 1}]}},
+  |"f": {"b": {"c": [{"e": "adfgd"}], "a": [{"e": "testing", 
"count": 3}],
+  |"b": [{"e": "test", "count": 1}]}}, "b": {"b": {"c": [{"e": 
"adfgd"}],
+  |"a": [{"count": 3}], "b": [{"e": "test", "count": 1}]}}}'
--- End diff --

Nice fixture, haha!





[GitHub] spark pull request: [SPARK-15452][SQL] Mark aggregator API as expe...

2016-05-21 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/13226#issuecomment-220796372
  
LGTM





[GitHub] spark pull request: [SPARK-15434][SQL] improve EmbedSerializerInFi...

2016-05-21 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/13216#issuecomment-220796197
  
retest this please





[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13243#issuecomment-220796216
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...

2016-05-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13243#issuecomment-220796188
  
**[Test build #59077 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59077/consoleFull)**
 for PR 13243 at commit 
[`59b0a76`](https://github.com/apache/spark/commit/59b0a76a2dc2ed6484f005880001c273536088ae).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-15258][SQL] Nested/Chained case stateme...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13243#issuecomment-220796217
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59077/
Test FAILed.





[GitHub] spark pull request: [SPARK-15415][SQL] Fix BroadcastHint when auto...

2016-05-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13244#issuecomment-220796189
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-15078][SQL] Add all TPCDS 1.4 benchmark...

2016-05-21 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/13188#issuecomment-220796056
  
I'm going to cherry-pick this into 2.0 since it has caused confusion and 
people thought 2.0 couldn't run the queries.






[GitHub] spark pull request: [SPARK-15282][SQL] PushDownPredicate should no...

2016-05-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request:

https://github.com/apache/spark/pull/13087#issuecomment-220796055
  
Thank you, @cloud-fan.
By the way, to be clear, should we revert the change to 
`PushDownPredicate`, too?
I think that is a separate issue.

Once the decision on that is finalized too, I can update this PR again.

Thank you for the fast decision, @marmbrus, @markhamstra, @thunterdb, 
@cloud-fan.





[GitHub] spark pull request: [SPARK-15415][SQL] Fix BroadcastHint when auto...

2016-05-21 Thread jurriaan
Github user jurriaan commented on the pull request:

https://github.com/apache/spark/pull/13244#issuecomment-220796042
  
@rxin Could you take a look at this? 





[GitHub] spark pull request: [SPARK-15415][SQL] Fix BroadcastHint when auto...

2016-05-21 Thread jurriaan
GitHub user jurriaan opened a pull request:

https://github.com/apache/spark/pull/13244

[SPARK-15415][SQL] Fix BroadcastHint when autoBroadcastJoinThreshold is 0 
or -1

## What changes were proposed in this pull request?

This PR makes BroadcastHint more deterministic by using a special 
`isBroadcastable` property instead of setting the `sizeInBytes` to 1.

See https://issues.apache.org/jira/browse/SPARK-15415
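
As a rough illustration of why the size trick breaks, here is a small Python sketch. It is not the planner code: the `Statistics` stand-in and both helper functions are hypothetical, and the comparison `size <= threshold` is an assumption about how the planner decides whether a relation fits under `autoBroadcastJoinThreshold`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statistics:
    # Hypothetical Python stand-in for the Scala case class touched by this PR.
    size_in_bytes: int
    is_broadcastable: bool = False


def can_broadcast_by_size(stats: Statistics, threshold: int) -> bool:
    # Assumed size-only rule: broadcast when the estimate fits the threshold.
    return stats.size_in_bytes <= threshold


def can_broadcast(stats: Statistics, threshold: int) -> bool:
    # Assumed rule with the flag: an explicit hint wins regardless of threshold.
    return stats.is_broadcastable or stats.size_in_bytes <= threshold


# Old approach: the hint pretends the relation is 1 byte.
hinted_by_size = Statistics(size_in_bytes=1)
# New approach: the real (large) size is kept and the hint is carried as a flag.
hinted_by_flag = Statistics(size_in_bytes=10**9, is_broadcastable=True)

for threshold in (10 * 1024 * 1024, 0, -1):
    print(
        threshold,
        can_broadcast_by_size(hinted_by_size, threshold),
        can_broadcast(hinted_by_flag, threshold),
    )
```

Under these assumptions the size-only check honors the hint for a normal threshold but not for 0 or -1, because 1 is larger than either value, while the flag-based check honors the hint in all three cases.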

## How was this patch tested?

Added testcases to test if the broadcast hash join is included in the plan 
when the BroadcastHint is supplied and also tests for propagation of the joins.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jurriaan/spark broadcast-hint

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13244.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13244


commit 8b9bf515423fa422d3c8436097acd87c4d09b733
Author: Jurriaan Pruis 
Date:   2016-05-21T19:18:54Z

[SPARK-15415][SQL] Fix BroadcastHint when autoBroadcastJoinThreshold is low 
or disabled

This makes BroadcastHint more deterministic by using a special 
isBroadcastable property
instead of setting the sizeInBytes to 1







[GitHub] spark pull request: [SPARK-15330] [SQL] Implement Reset Command

2016-05-21 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/13121#issuecomment-220795764
  
LGTM




