[GitHub] spark pull request #16383: [SPARK-18980][SQL] implement Aggregator with Type...

2016-12-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16383#discussion_r93724428 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/interfaces.scala --- @@ -505,19 +511,18 @@ abstract class

[GitHub] spark pull request #16383: [SPARK-18980][SQL] implement Aggregator with Type...

2016-12-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16383#discussion_r93724370 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/interfaces.scala --- @@ -471,23 +471,29 @@ abstract class

[GitHub] spark issue #16294: [SPARK-18669][SS][DOCS] Update Apache docs for Structure...

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16294 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16371 @hvanhovell Got it. Thanks for review.

[GitHub] spark pull request #15212: [SPARK-17645][MLLIB][ML]add feature selector meth...

2016-12-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15212#discussion_r93726194 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala --- @@ -255,19 +288,22 @@ class ChiSqSelector @Since("2.1.0") ()

[GitHub] spark pull request #15212: [SPARK-17645][MLLIB][ML]add feature selector meth...

2016-12-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15212#discussion_r93725579 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/ChiSqSelector.scala --- @@ -92,8 +92,36 @@ private[feature] trait ChiSqSelectorParams extends

[GitHub] spark pull request #15212: [SPARK-17645][MLLIB][ML]add feature selector meth...

2016-12-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15212#discussion_r93726073 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala --- @@ -171,11 +171,14 @@ object ChiSqSelectorModel extends

[GitHub] spark issue #16228: [SPARK-17076] [SQL] Cardinality estimation for join base...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16228 **[Test build #70533 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70533/testReport)** for PR 16228 at commit

[GitHub] spark issue #16368: [SPARK-18958][SPARKR] R API toJSON on DataFrame

2016-12-22 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16368 I kept getting an error with the merge script - not sure if it went through. We are likely having some sync issue with GitHub?

[GitHub] spark issue #16119: [SPARK-18687][Pyspark][SQL]Backward compatibility - crea...

2016-12-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16119 @vijoshi do you mind updating your PR according to the discussion? i.e. simplify the fix and test

[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2016-12-22 Thread lirui-intel
Github user lirui-intel commented on the issue: https://github.com/apache/spark/pull/12775 Not sure if my patch makes the tests unstable. But I can't figure out why. @kayousterhout @mridulm any ideas?

[GitHub] spark pull request #13909: [SPARK-16213][SQL] Reduce runtime overhead of a p...

2016-12-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/13909#discussion_r93726522 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala --- @@ -56,33 +58,100 @@ case class

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #70531 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70531/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70531/ Test FAILed. ---

[GitHub] spark issue #15666: [SPARK-11421] [Core][Python][R] Added ability for addJar...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15666 **[Test build #70534 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70534/testReport)** for PR 15666 at commit

[GitHub] spark issue #16312: [SPARK-18862][SPARKR][ML] Split SparkR mllib.R into mult...

2016-12-22 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16312 ah, thank you @shivaram. sorry I couldn't get around to investigate earlier. @yanboliang It looks like that is the design in the trait BaseReadWrite

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16386 > the corrupt column will contain the filename instead of the literal JSON if there is a parsing failure I am worried of changing the behaviour. I understand why it had to be here as

[GitHub] spark issue #15212: [SPARK-17645][MLLIB][ML]add feature selector method base...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15212 **[Test build #70536 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70536/testReport)** for PR 15212 at commit

[GitHub] spark issue #15212: [SPARK-17645][MLLIB][ML]add feature selector method base...

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15212 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70536/ Test PASSed. ---

[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15996 Merged build finished. Test FAILed.

[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15996 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70532/ Test FAILed. ---

[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15996 **[Test build #70532 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70532/testReport)** for PR 15996 at commit

[GitHub] spark pull request #13909: [SPARK-16213][SQL] Reduce runtime overhead of a p...

2016-12-22 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/13909#discussion_r93732043 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala --- @@ -56,33 +58,100 @@ case class

[GitHub] spark issue #16387: [SPARK-18986][Core] ExternalAppendOnlyMap shouldn't fail...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16387 **[Test build #70538 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70538/testReport)** for PR 16387 at commit

[GitHub] spark pull request #16387: [SPARK-18986][Core] ExternalAppendOnlyMap shouldn...

2016-12-22 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/16387 [SPARK-18986][Core] ExternalAppendOnlyMap shouldn't fail when forced to spill before calling its iterator ## What changes were proposed in this pull request?

[GitHub] spark pull request #16386: [SPARK-18352][SQL] Support parsing multiline json...

2016-12-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16386#discussion_r93733483 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonFileFormat.scala --- @@ -36,29 +31,31 @@ import

[GitHub] spark issue #15666: [SPARK-11421] [Core][Python][R] Added ability for addJar...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15666 **[Test build #70534 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70534/testReport)** for PR 15666 at commit

[GitHub] spark issue #16337: [SPARK-18871][SQL] New test cases for IN/NOT IN subquery

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16337 **[Test build #70535 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70535/testReport)** for PR 16337 at commit

[GitHub] spark issue #16294: [SPARK-18669][SS][DOCS] Update Apache docs for Structure...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16294 **[Test build #70530 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70530/testReport)** for PR 16294 at commit

[GitHub] spark issue #16294: [SPARK-18669][SS][DOCS] Update Apache docs for Structure...

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16294 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70530/ Test PASSed. ---

[GitHub] spark issue #14627: [SPARK-16975][SQL][FOLLOWUP] Do not duplicately check fi...

2016-12-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14627 @rxin, it does not fix any bug but just gets rid of duplicated logic. I will try to open a separate JIRA in this case in the future to prevent confusion. Thank you.

[GitHub] spark pull request #16383: [SPARK-18980][SQL] implement Aggregator with Type...

2016-12-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16383#discussion_r93725196 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TypedAggregateExpression.scala --- @@ -143,15 +197,96 @@ case class

[GitHub] spark issue #16337: [SPARK-18871][SQL] New test cases for IN/NOT IN subquery

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16337 **[Test build #70535 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70535/testReport)** for PR 16337 at commit

[GitHub] spark issue #16337: [SPARK-18871][SQL] New test cases for IN/NOT IN subquery

2016-12-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16337 Retest this please.

[GitHub] spark issue #16368: [SPARK-18958][SPARKR] R API toJSON on DataFrame

2016-12-22 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/16368 Merging this into master, branch-2.0

[GitHub] spark issue #16312: [SPARK-18862][SPARKR][ML] Split SparkR mllib.R into mult...

2016-12-22 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/16312 I looked at this more closely and I think I found the problem - Not sure it's easy to fix though. What I traced here is: - When we call sparkR.session.stop and sparkR.session the same JVM

[GitHub] spark issue #15212: [SPARK-17645][MLLIB][ML]add feature selector method base...

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15212 Merged build finished. Test PASSed.

[GitHub] spark issue #15666: [SPARK-11421] [Core][Python][R] Added ability for addJar...

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15666 Merged build finished. Test PASSed.

[GitHub] spark issue #16337: [SPARK-18871][SQL] New test cases for IN/NOT IN subquery

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16337 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70535/ Test PASSed. ---

[GitHub] spark issue #16337: [SPARK-18871][SQL] New test cases for IN/NOT IN subquery

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16337 Merged build finished. Test PASSed.

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13909 **[Test build #70540 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70540/testReport)** for PR 13909 at commit

[GitHub] spark issue #15666: [SPARK-11421] [Core][Python][R] Added ability for addJar...

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15666 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70534/ Test PASSed. ---

[GitHub] spark pull request #16323: [SPARK-18911] [SQL] Define CatalogStatistics to i...

2016-12-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16323#discussion_r93726972 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala --- @@ -41,13 +41,13 @@ import

[GitHub] spark issue #16228: [SPARK-17076] [SQL] Cardinality estimation for join base...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16228 **[Test build #70533 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70533/testReport)** for PR 16228 at commit

[GitHub] spark issue #16228: [SPARK-17076] [SQL] Cardinality estimation for join base...

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16228 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70533/ Test FAILed. ---

[GitHub] spark issue #16228: [SPARK-17076] [SQL] Cardinality estimation for join base...

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16228 Merged build finished. Test FAILed.

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13909 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70537/ Test FAILed. ---

[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14452 **[Test build #70541 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70541/testReport)** for PR 14452 at commit

[GitHub] spark issue #16361: [SPARK-18952] Regex strings not properly escaped in code...

2016-12-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16361 it seems to me that the grouping key alias is only used for execution (logical Aggregate node doesn't need grouping expression to be named), can we just alias them with k1,k2, ... with avoid this

[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15996 **[Test build #70532 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70532/testReport)** for PR 15996 at commit

[GitHub] spark pull request #15212: [SPARK-17645][MLLIB][ML]add feature selector meth...

2016-12-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15212#discussion_r93726203 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala --- @@ -255,19 +288,22 @@ class ChiSqSelector @Since("2.1.0") ()

[GitHub] spark pull request #15212: [SPARK-17645][MLLIB][ML]add feature selector meth...

2016-12-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15212#discussion_r93726092 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala --- @@ -245,6 +264,20 @@ class ChiSqSelector @Since("2.1.0") () extends

[GitHub] spark pull request #15212: [SPARK-17645][MLLIB][ML]add feature selector meth...

2016-12-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15212#discussion_r93725098 --- Diff: docs/ml-features.md --- @@ -1423,12 +1423,12 @@ for more details on the API. `ChiSqSelector` stands for Chi-Squared feature selection. It

[GitHub] spark pull request #15212: [SPARK-17645][MLLIB][ML]add feature selector meth...

2016-12-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15212#discussion_r93725546 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/ChiSqSelector.scala --- @@ -92,8 +92,36 @@ private[feature] trait ChiSqSelectorParams extends

[GitHub] spark pull request #15212: [SPARK-17645][MLLIB][ML]add feature selector meth...

2016-12-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15212#discussion_r93725173 --- Diff: docs/ml-features.md --- @@ -1423,12 +1423,12 @@ for more details on the API. `ChiSqSelector` stands for Chi-Squared feature selection. It

[GitHub] spark pull request #15212: [SPARK-17645][MLLIB][ML]add feature selector meth...

2016-12-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15212#discussion_r93726320 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/feature/ChiSqSelectorSuite.scala --- @@ -27,61 +27,240 @@ class ChiSqSelectorSuite extends

[GitHub] spark pull request #15212: [SPARK-17645][MLLIB][ML]add feature selector meth...

2016-12-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15212#discussion_r93726048 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/ChiSqSelector.scala --- @@ -111,11 +139,14 @@ private[feature] trait ChiSqSelectorParams

[GitHub] spark pull request #15212: [SPARK-17645][MLLIB][ML]add feature selector meth...

2016-12-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15212#discussion_r93725408 --- Diff: docs/mllib-feature-extraction.md --- @@ -227,11 +227,13 @@ both speed and statistical learning behavior.

[GitHub] spark pull request #15212: [SPARK-17645][MLLIB][ML]add feature selector meth...

2016-12-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15212#discussion_r93726001 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/ChiSqSelector.scala --- @@ -92,8 +92,36 @@ private[feature] trait ChiSqSelectorParams extends

[GitHub] spark issue #16291: [SPARK-18838][CORE] Use separate executor service for ea...

2016-12-22 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/16291 I agree with @markhamstra and @vanzin - having ability to tag listeners into groups (default = spark listener group) and preserving current synchronized behavior within group would be ensure

[GitHub] spark pull request #16384: [BUILD] make-distribution support alternate pytho...

2016-12-22 Thread felixcheung
Github user felixcheung closed the pull request at: https://github.com/apache/spark/pull/16384

[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-22 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15996 LGTM. Can you update the comment to address my last comment (https://github.com/apache/spark/pull/15996#discussion_r93730700)?

[GitHub] spark pull request #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTable...

2016-12-22 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15996#discussion_r93730700 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala --- @@ -643,6 +644,14 @@ class DataFrameReaderWriterSuite

[GitHub] spark issue #16368: [SPARK-18958][SPARKR] R API toJSON on DataFrame

2016-12-22 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/16368 Hmm looks like this is merged but not reflected on github ?

[GitHub] spark pull request #16386: [SPARK-18352][SQL] Support parsing multiline json...

2016-12-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16386#discussion_r93732800 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonDataSource.scala --- @@ -0,0 +1,204 @@ +/* + * Licensed

[GitHub] spark issue #16368: [SPARK-18958][SPARKR] R API toJSON on DataFrame

2016-12-22 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16368 ah, it was merged https://git-wip-us.apache.org/repos/asf?p=spark.git

[GitHub] spark issue #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15211 **[Test build #70539 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70539/testReport)** for PR 15211 at commit

[GitHub] spark issue #16337: [SPARK-18871][SQL] New test cases for IN/NOT IN subquery

2016-12-22 Thread kevinyu98
Github user kevinyu98 commented on the issue: https://github.com/apache/spark/pull/16337 I just ran build/sbt "test-only org.apache.spark.sql.streaming.StreamSuite" on my local machine, and also the whole sql suite; it works fine. Can you re-run the test? Thanks

[GitHub] spark pull request #15666: [SPARK-11421] [Core][Python][R] Added ability for...

2016-12-22 Thread mariusvniekerk
Github user mariusvniekerk commented on a diff in the pull request: https://github.com/apache/spark/pull/15666#discussion_r93730314 --- Diff: core/src/main/scala/org/apache/spark/TestUtils.scala --- @@ -164,6 +164,27 @@ private[spark] object TestUtils {

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-22 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93731229 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,525 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #15212: [SPARK-17645][MLLIB][ML]add feature selector method base...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15212 **[Test build #70536 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70536/testReport)** for PR 15212 at commit

[GitHub] spark pull request #16386: [SPARK-18352][SQL] Support parsing multiline json...

2016-12-22 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16386#discussion_r93731259 --- Diff: python/pyspark/sql/readwriter.py --- @@ -155,21 +155,24 @@ def load(self, path=None, format=None, schema=None, **options): return

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13909 **[Test build #70537 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70537/testReport)** for PR 13909 at commit

[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-22 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15996 ah https://github.com/apache/spark/commit/9a1ad71db44558bb6eb380dc23a1a1abbc2f3e98 failed.

[GitHub] spark issue #16232: [SPARK-18800][SQL] Correct the assert in UnsafeKVExterna...

2016-12-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16232 ping @davies

[GitHub] spark pull request #16386: [SPARK-18352][SQL] Support parsing multiline json...

2016-12-22 Thread NathanHowell
GitHub user NathanHowell opened a pull request: https://github.com/apache/spark/pull/16386 [SPARK-18352][SQL] Support parsing multiline json files ## What changes were proposed in this pull request? If a new option `wholeFile` is set to `true` the JSON reader will parse
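
To illustrate the distinction the PR title refers to, here is a minimal Python sketch using only the standard library `json` module. It is not Spark's implementation and does not use the `wholeFile` option; the function names `parse_line_delimited` and `parse_whole_file` are hypothetical, chosen only for this illustration. It contrasts parsing one JSON document per line with parsing an entire text as a single document that may span several lines.

```python
import json


def parse_line_delimited(text):
    """Hypothetical helper: parse one JSON document per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def parse_whole_file(text):
    """Hypothetical helper: parse the whole text as a single JSON document."""
    return json.loads(text)


# One record per line: line-by-line parsing works.
line_delimited = '{"id": 1}\n{"id": 2}\n'
records = parse_line_delimited(line_delimited)

# A pretty-printed document spans several lines: it must be parsed as a whole.
multiline = '{\n  "id": 1,\n  "tags": ["a", "b"]\n}\n'
document = parse_whole_file(multiline)

# Feeding the multi-line document to the line-by-line parser fails,
# because the first line on its own ("{") is not valid JSON.
try:
    parse_line_delimited(multiline)
    line_mode_failed = False
except json.JSONDecodeError:
    line_mode_failed = True
```

Under these assumptions, a reader limited to line-by-line parsing cannot handle pretty-printed files, which is presumably the gap a whole-file mode is meant to close.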

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #70531 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70531/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 Hello recent JacksonGenerator.scala committers, please take a look. cc/ @rxin @hvanhovell @clockfly @hyukjinkwon @cloud-fan

[GitHub] spark pull request #16323: [SPARK-18911] [SQL] Define CatalogStatistics to i...

2016-12-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16323#discussion_r93726768 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -237,6 +239,38 @@ case class CatalogTable( }

[GitHub] spark pull request #15666: [SPARK-11421] [Core][Python][R] Added ability for...

2016-12-22 Thread mariusvniekerk
Github user mariusvniekerk commented on a diff in the pull request: https://github.com/apache/spark/pull/15666#discussion_r93729928 --- Diff: core/src/main/scala/org/apache/spark/TestUtils.scala --- @@ -164,6 +164,27 @@ private[spark] object TestUtils {

[GitHub] spark pull request #16387: [SPARK-18986][Core] ExternalAppendOnlyMap shouldn...

2016-12-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16387#discussion_r93732158 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala --- @@ -192,12 +193,16 @@ class ExternalAppendOnlyMap[K, V, C](

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13909 **[Test build #70537 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70537/testReport)** for PR 13909 at commit

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13909 Merged build finished. Test FAILed.

[GitHub] spark issue #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15211 **[Test build #70539 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70539/testReport)** for PR 15211 at commit

[GitHub] spark issue #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15211 Merged build finished. Test PASSed.

[GitHub] spark issue #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15211 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70539/ Test PASSed. ---

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-22 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13909 Jenkins, retest this please

[GitHub] spark issue #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-22 Thread hhbyyh
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/15211 I've sent a new update addressing most of the comments. The only exception is about `SetWeightCol` in `LinearSVCModel`. cc @jkbradley.

[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...

2016-12-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14452 Revisiting this after rebasing on master. BTW, of the 500+ LOC changed, 200+ LOC are test cases.

[GitHub] spark pull request #13909: [SPARK-16213][SQL] Reduce runtime overhead of a p...

2016-12-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13909#discussion_r93721954 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala --- @@ -56,33 +58,100 @@ case class

[GitHub] spark pull request #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTable...

2016-12-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15996#discussion_r93722074 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -364,48 +366,162 @@ final class DataFrameWriter[T] private[sql](ds:

[GitHub] spark pull request #16294: [SPARK-18669][SS][DOCS] Update Apache docs for St...

2016-12-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16294#discussion_r93722076 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -1493,13 +1493,13 @@ SparkSession spark = ... spark.streams.addListener(new

[GitHub] spark issue #16294: [SPARK-18669][SS][DOCS] Update Apache docs for Structure...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16294 **[Test build #70530 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70530/testReport)** for PR 16294 at commit

[GitHub] spark issue #16294: [SPARK-18669][SS][DOCS] Update Apache docs for Structure...

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16294 Merged build finished. Test PASSed.

[GitHub] spark issue #16294: [SPARK-18669][SS][DOCS] Update Apache docs for Structure...

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16294 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70529/ Test PASSed. ---

[GitHub] spark pull request #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTable...

2016-12-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15996#discussion_r93722231 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -364,48 +366,162 @@ final class DataFrameWriter[T] private[sql](ds:

[GitHub] spark issue #16294: [SPARK-18669][SS][DOCS] Update Apache docs for Structure...

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16294 **[Test build #70529 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70529/testReport)** for PR 16294 at commit

[GitHub] spark pull request #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTable...

2016-12-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15996#discussion_r93722277 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -363,48 +365,125 @@ final class DataFrameWriter[T] private[sql](ds:

[GitHub] spark pull request #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTable...

2016-12-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15996#discussion_r93722334 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala --- @@ -140,153 +140,55 @@ case class
