[GitHub] spark issue #14987: [SPARK-17372][SQL][STREAMING] Avoid serialization issues...

2016-09-06 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/14987 LGTM pending tests --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so,

[GitHub] spark pull request #14987: [SPARK-17372][SQL][STREAMING] Avoid serialization...

2016-09-06 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/14987#discussion_r77744980 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala --- @@ -49,6 +49,10 @@ import

[GitHub] spark issue #14834: [SPARK-17163][ML][WIP] Unified LogisticRegression interf...

2016-09-06 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/14834 @sethah Thank you for coming up with a PR with detailed documentation. For option 2, if a two-class model is trained with the multinomial family, how do you store it? I was thinking about maybe we could

[GitHub] spark issue #14987: [SPARK-17372][SQL][STREAMING] Avoid serialization issues...

2016-09-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14987 **[Test build #65015 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65015/consoleFull)** for PR 14987 at commit

[GitHub] spark issue #14671: [SPARK-17091][SQL] ParquetFilters rewrite IN to OR of Eq

2016-09-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14671 @davies Do you mind if I ask whether it is sensible to perform a benchmark and try to submit a PR to disable this (maybe with adding an extra option to enable/disable this but false by

[GitHub] spark issue #14671: [SPARK-17091][SQL] ParquetFilters rewrite IN to OR of Eq

2016-09-06 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/14671 @andreweduffy Good point, but we still use the parquet-mr when there is any complex type in the schema.

[GitHub] spark issue #14987: [SPARK-17372][SQL][STREAMING] Avoid serialization issues...

2016-09-06 Thread tdas
Github user tdas commented on the issue: https://github.com/apache/spark/pull/14987 @yhuai @zsxwing Can you take a look?

[GitHub] spark issue #14671: [SPARK-17091][SQL] ParquetFilters rewrite IN to OR of Eq

2016-09-06 Thread andreweduffy
Github user andreweduffy commented on the issue: https://github.com/apache/spark/pull/14671 @davies Row-level filtering doesn't occur with the vectorized reader, which is now enabled by default

[GitHub] spark pull request #14987: [SPARK-17372][SQL][STREAMING] Avoid serialization...

2016-09-06 Thread tdas
GitHub user tdas opened a pull request: https://github.com/apache/spark/pull/14987 [SPARK-17372][SQL][STREAMING] Avoid serialization issues by using Arrays to save file names in FileStreamSource ## What changes were proposed in this pull request? When we create a

[GitHub] spark issue #14919: [SPARK-17354][SQL] Partitioning by dates/timestamps shou...

2016-09-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14919 Thanks @sameeragarwal !

[GitHub] spark pull request #14978: [SPARK-17408] [TEST] Flaky test: org.apache.spark...

2016-09-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14978

[GitHub] spark issue #14978: [SPARK-17408] [TEST] Flaky test: org.apache.spark.sql.hi...

2016-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14978 thanks, merging to master!

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77740936 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77740858 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14973: [SPARK-17356][SQL][1.6] Fix out of memory issue w...

2016-09-06 Thread clockfly
Github user clockfly closed the pull request at: https://github.com/apache/spark/pull/14973

[GitHub] spark issue #10970: [SPARK-13067][SQL] workaround for a weird scala reflecti...

2016-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/10970 @atronchi can you create a JIRA and put the code that can reproduce the bug? thanks!

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77740557 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77740392 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14932: [SPARK-17371] Resubmitted shuffle outputs can get...

2016-09-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14932

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77740084 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark issue #14932: [SPARK-17371] Resubmitted shuffle outputs can get delete...

2016-09-06 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/14932 @ericl and I discussed this and decided to address the file cleanup issues that I mentioned above in a separate PR: the issues that I outlined above can generally only occur in cases where we've

[GitHub] spark pull request #14983: [SPARK-17316][Core]Fix the 'ask' type parameter i...

2016-09-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14983

[GitHub] spark issue #14983: [SPARK-17316][Core]Fix the 'ask' type parameter in 'remo...

2016-09-06 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/14983 Merging to master, 2.0 and 1.6

[GitHub] spark issue #14671: [SPARK-17091][SQL] ParquetFilters rewrite IN to OR of Eq

2016-09-06 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/14671 Before we disable the record-level filter in the parquet reader, I think pushing more inefficient predicates into the parquet reader will be even worse, right?
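
The rewrite named in this PR's title, turning an IN predicate into an OR of equality predicates, can be illustrated with a small sketch. This is Python rather than Spark's Scala, and it is not the ParquetFilters code: the tuple-based filter representation and the name `rewrite_in_to_or` are assumptions made only for illustration.

```python
from functools import reduce
from typing import Any, Sequence, Tuple

# Hypothetical filter representation: ("eq", column, value) or ("or", left, right).
Filter = Tuple[Any, ...]


def rewrite_in_to_or(column: str, values: Sequence[Any]) -> Filter:
    """Rewrite `column IN (v1, v2, ...)` as a left-nested OR of equality filters."""
    if not values:
        # An empty IN list has no equality terms to combine.
        raise ValueError("IN list must not be empty")
    equalities = [("eq", column, value) for value in values]
    # Fold the equality filters into a single OR tree.
    return reduce(lambda left, right: ("or", left, right), equalities)
```

Each value in the IN list adds one predicate to the tree, which is why the cost of pushing such a filter down grows with the size of the list.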

[GitHub] spark issue #14931: [SPARK-17370] Shuffle service files not invalidated when...

2016-09-06 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/14931 This looks ok from what I read of the standalone code, but probably someone more familiar with standalone should take a look. @JoshRosen ?

[GitHub] spark issue #14983: [SPARK-17316][Core]Fix the 'ask' type parameter in 'remo...

2016-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14983 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65010/ Test PASSed.

[GitHub] spark issue #14983: [SPARK-17316][Core]Fix the 'ask' type parameter in 'remo...

2016-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14983 Merged build finished. Test PASSed.

[GitHub] spark issue #14983: [SPARK-17316][Core]Fix the 'ask' type parameter in 'remo...

2016-09-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14983 **[Test build #65010 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65010/consoleFull)** for PR 14983 at commit

[GitHub] spark issue #14671: [SPARK-17091][SQL] ParquetFilters rewrite IN to OR of Eq

2016-09-06 Thread andreweduffy
Github user andreweduffy commented on the issue: https://github.com/apache/spark/pull/14671 cool, ping to @davies @cloud-fan would either of you be able to look at this?

[GitHub] spark issue #14976: [SPARK-17306] [SQL] QuantileSummaries doesn't compress

2016-09-06 Thread thunterdb
Github user thunterdb commented on the issue: https://github.com/apache/spark/pull/14976 By the way, you gave some great advice here. Is there a page on the wiki where we collect all this internal knowledge?

[GitHub] spark issue #14986: [WIP] [SPARK-17421] Don't use -XX:MaxPermSize option whe...

2016-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14986 Can one of the admins verify this patch?

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-06 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r77737542 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -446,6 +463,20 @@ private[ml] object DefaultParamsReader { val cls =

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-06 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r77737428 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -446,6 +463,20 @@ private[ml] object DefaultParamsReader { val cls =

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-06 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r77737335 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/KMeansSuite.scala --- @@ -139,16 +146,61 @@ class KMeansSuite extends SparkFunSuite with

[GitHub] spark pull request #14986: [WIP] [SPARK-17421] Don't use -XX:MaxPermSize opt...

2016-09-06 Thread frreiss
GitHub user frreiss opened a pull request: https://github.com/apache/spark/pull/14986 [WIP] [SPARK-17421] Don't use -XX:MaxPermSize option when Java version >= 8 ## What changes were proposed in this pull request? Modifies the `build/mvn` and `build/sbt-launch-lib.bash`

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-06 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r77737026 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -303,6 +322,29 @@ class KMeans @Since("1.5.0") ( @Since("1.5.0")

[GitHub] spark issue #14816: [SPARK-17245] [SQL] [BRANCH-1.6] Do not rely on Hive's s...

2016-09-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14816 **[Test build #65014 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65014/consoleFull)** for PR 14816 at commit

[GitHub] spark issue #14985: [SPARK-17396][core] Share the task support between Union...

2016-09-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14985 **[Test build #65012 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65012/consoleFull)** for PR 14985 at commit

[GitHub] spark issue #14984: [SPARK-17296][SQL] Simplify parser join processing [BACK...

2016-09-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14984 **[Test build #65013 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65013/consoleFull)** for PR 14984 at commit

[GitHub] spark pull request #14985: [SPARK-17396][core] Share the task support betwee...

2016-09-06 Thread rdblue
GitHub user rdblue opened a pull request: https://github.com/apache/spark/pull/14985 [SPARK-17396][core] Share the task support between UnionRDD instances. ## What changes were proposed in this pull request? Share the ForkJoinTaskSupport between UnionRDD instances to avoid

[GitHub] spark pull request #14984: [SPARK-17296][SQL] Simplify parser join processing [BACK...

2016-09-06 Thread hvanhovell
GitHub user hvanhovell opened a pull request: https://github.com/apache/spark/pull/14984 [SPARK-17296 ## What changes were proposed in this pull request? This PR backports https://github.com/apache/spark/pull/14867 to branch-2.0. It fixes a number of join ordering bugs. ##

[GitHub] spark issue #14932: [SPARK-17371] Resubmitted shuffle outputs can get delete...

2016-09-06 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/14932 Context for other reviewers: As of #9610 (Spark 1.5.3+), map tasks write their output to temporary files and then atomically rename those files to put them at the final destination path.
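
The write-then-rename pattern described in this comment can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not Spark's shuffle code: the function name `write_atomically` is hypothetical, and the sketch relies on `os.replace`, which is atomic only when the temporary file and the destination are on the same filesystem (hence the temp file is created in the destination directory).

```python
import os
import tempfile


def write_atomically(dest_path: str, data: bytes) -> None:
    """Write data to a temp file next to dest_path, then rename it into place."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Create the temp file in the destination directory so the rename
    # stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(data)
            tmp_file.flush()
            os.fsync(tmp_file.fileno())
        # Readers see either the old file or the complete new one,
        # never a partially written file.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Clean up the temp file if the write or the rename failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A second call with the same destination replaces the earlier output in one step, which is the property the comment relies on when several attempts of a task may write the same file.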

[GitHub] spark issue #14816: [SPARK-17245] [SQL] [BRANCH-1.6] Do not rely on Hive's s...

2016-09-06 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/14816 test this please

[GitHub] spark issue #14976: [SPARK-17306] [SQL] QuantileSummaries doesn't compress

2016-09-06 Thread thunterdb
Github user thunterdb commented on the issue: https://github.com/apache/spark/pull/14976 @srowen LGTM, thanks!

[GitHub] spark pull request #14976: [SPARK-17306] [SQL] QuantileSummaries doesn't com...

2016-09-06 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/14976#discussion_r77735611 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/QuantileSummaries.scala --- @@ -236,7 +240,7 @@ object QuantileSummaries {

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-06 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r77734912 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -318,6 +327,14 @@ private[ml] object DefaultParamsWriter { val

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-06 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r77734849 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -318,6 +327,14 @@ private[ml] object DefaultParamsWriter { val

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77734883 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77734719 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77734604 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark issue #14702: [SPARK-15694] Implement ScriptTransformation in sql/core...

2016-09-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14702 **[Test build #65011 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65011/consoleFull)** for PR 14702 at commit

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77734509 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14976: [SPARK-17306] [SQL] QuantileSummaries doesn't com...

2016-09-06 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/14976#discussion_r77734419 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/QuantileSummaries.scala --- @@ -59,9 +59,14 @@ class QuantileSummaries(

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77733682 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77734337 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77734227 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77734065 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14943: [SPARK-15891][yarn] Clean up some logging in the ...

2016-09-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14943

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77733949 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77733925 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77733824 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark issue #14943: [SPARK-15891][yarn] Clean up some logging in the YARN AM...

2016-09-06 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/14943 Merging to master.

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77733377 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77733194 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77733088 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77732888 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -0,0 +1,853 @@ +--- +title: "SparkR - Practical Guide" +output: + html_document:

[GitHub] spark pull request #14867: [SPARK-17296][SQL] Simplify parser join processin...

2016-09-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14867

[GitHub] spark issue #14867: [SPARK-17296][SQL] Simplify parser join processing.

2016-09-06 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/14867 Merging to master/2.0. Thanks for the review!

[GitHub] spark issue #14961: [SPARK-17379] [BUILD] Upgrade netty-all to 4.0.41 final ...

2016-09-06 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/14961 I think we can binary search the first broken netty version. It would be easy to find out the real issue.
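
The binary search suggested here could look roughly like the sketch below. It is written in Python under two assumptions that the thread does not confirm: the candidate versions are listed oldest to newest, and once a version is broken every later version is broken too. The names `first_broken` and `is_broken` are hypothetical; in practice `is_broken` would build against the given version and run the failing test.

```python
from typing import Callable, Optional, Sequence


def first_broken(
    versions: Sequence[str], is_broken: Callable[[str], bool]
) -> Optional[str]:
    """Return the earliest broken version, or None if every version works.

    Assumes `versions` is ordered oldest to newest and that breakage is
    monotonic: after the first broken version, all later ones are broken.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_broken(versions[mid]):
            # The first broken version is at mid or earlier.
            hi = mid
        else:
            # Everything up to and including mid works.
            lo = mid + 1
    return versions[lo] if lo < len(versions) else None
```

With n candidate versions this needs about log2(n) builds instead of n, which matters when each check means a full test run.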

[GitHub] spark issue #14979: [SPARK-17415][SQL] Better error message for driver-side ...

2016-09-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14979 **[Test build #3250 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3250/consoleFull)** for PR 14979 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-06 Thread yinxusen
Github user yinxusen commented on the issue: https://github.com/apache/spark/pull/11119 @dbtsai It's ready for your review.

[GitHub] spark pull request #14982: [WIP] Not re-use dictionary across batches

2016-09-06 Thread sameeragarwal
Github user sameeragarwal closed the pull request at: https://github.com/apache/spark/pull/14982

[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...

2016-09-06 Thread lresende
Github user lresende commented on the issue: https://github.com/apache/spark/pull/14981 Spark kinesis has a dependency on the kinesis client, which is category-x: <dependency> <groupId>com.amazonaws</groupId> <artifactId>amazon-kinesis-client</artifactId> <version>${aws.kinesis.client.version}</version> </dependency> Thus

[GitHub] spark pull request #14944: [SPARK-16334][BACKPORT] Reusing same dictionary c...

2016-09-06 Thread sameeragarwal
Github user sameeragarwal closed the pull request at: https://github.com/apache/spark/pull/14944

[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...

2016-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14981 Merged build finished. Test PASSed.

[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...

2016-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14981 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65008/ Test PASSed.

[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...

2016-09-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14981 **[Test build #65008 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65008/consoleFull)** for PR 14981 at commit

[GitHub] spark pull request #14952: [SPARK-17110] Fix StreamCorruptionException in Bl...

2016-09-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14952

[GitHub] spark issue #14952: [SPARK-17110] Fix StreamCorruptionException in BlockMana...

2016-09-06 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/14952 I'm going to merge this into master and branch-2.0 as an immediate fix for the PySpark caching issue.

[GitHub] spark issue #14982: [WIP] Not re-use dictionary across batches

2016-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14982 Merged build finished. Test PASSed.

[GitHub] spark issue #14982: [WIP] Not re-use dictionary across batches

2016-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14982 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65009/ Test PASSed.

[GitHub] spark pull request #14976: [SPARK-17306] [SQL] QuantileSummaries doesn't com...

2016-09-06 Thread clockfly
Github user clockfly commented on a diff in the pull request: https://github.com/apache/spark/pull/14976#discussion_r77727133 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/QuantileSummaries.scala --- @@ -59,9 +59,14 @@ class QuantileSummaries( *

[GitHub] spark issue #14982: [WIP] Not re-use dictionary across batches

2016-09-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14982 **[Test build #65009 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65009/consoleFull)** for PR 14982 at commit

[GitHub] spark pull request #14976: [SPARK-17306] [SQL] QuantileSummaries doesn't com...

2016-09-06 Thread clockfly
Github user clockfly commented on a diff in the pull request: https://github.com/apache/spark/pull/14976#discussion_r77726882 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/QuantileSummaries.scala --- @@ -59,9 +59,14 @@ class QuantileSummaries( *

[GitHub] spark pull request #14926: [SPARK-17365][Core] Remove/Kill multiple executor...

2016-09-06 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/14926#discussion_r77724521 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -392,10 +397,36 @@ private[spark] class ExecutorAllocationManager(

[GitHub] spark pull request #14926: [SPARK-17365][Core] Remove/Kill multiple executor...

2016-09-06 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/14926#discussion_r77723977 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -392,10 +397,36 @@ private[spark] class ExecutorAllocationManager(

[GitHub] spark issue #14961: [SPARK-17379] [BUILD] Upgrade netty-all to 4.0.41 final ...

2016-09-06 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/14961 > Is the lesson here to not bother with pooling and use the UnpooledByteBufAllocator? Not sure. Pooling is for improving the performance because allocating direct buffers is pretty slow.

[GitHub] spark pull request #14943: [SPARK-15891][yarn] Clean up some logging in the ...

2016-09-06 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/14943#discussion_r77722666 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala --- @@ -59,43 +58,46 @@ private[yarn] class ExecutorRunnable(

[GitHub] spark issue #14961: [SPARK-17379] [BUILD] Upgrade netty-all to 4.0.41 final ...

2016-09-06 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14961 Aha, possibly this: https://groups.google.com/forum/#!topic/netty/3BoF7q34Z4I Is the lesson here to not bother with pooling and use the UnpooledByteBufAllocator?

[GitHub] spark issue #14943: [SPARK-15891][yarn] Clean up some logging in the YARN AM...

2016-09-06 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/14943 +1 feel free to commit.

[GitHub] spark pull request #14943: [SPARK-15891][yarn] Clean up some logging in the ...

2016-09-06 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/14943#discussion_r77722010 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala --- @@ -59,43 +58,46 @@ private[yarn] class ExecutorRunnable(

[GitHub] spark issue #14834: [SPARK-17163][ML][WIP] Unified LogisticRegression interf...

2016-09-06 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/14834 @jkbradley Thanks for your input. Let's see what @dbtsai thinks as well :)

[GitHub] spark issue #14927: [SPARK-16922] [SPARK-17211] [SQL] make the address of va...

2016-09-06 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/14927 LGTM

[GitHub] spark pull request #14972: [Minor] Remove unnecessary check in MLSerDe

2016-09-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14972

[GitHub] spark pull request #14924: [SPARK-17299] TRIM/LTRIM/RTRIM should not strips ...

2016-09-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14924

[GitHub] spark issue #14972: [Minor] Remove unnecessary check in MLSerDe

2016-09-06 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/14972 I'm OK with this not having a JIRA, and I agree the code path is tested sufficiently. (But wanting these is a good sentiment!) LGTM Merging with master

[GitHub] spark issue #14924: [SPARK-17299] TRIM/LTRIM/RTRIM should not strips charact...

2016-09-06 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14924 Merged to master/2.0
