[GitHub] spark issue #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspark for ...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15817 **[Test build #68534 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68534/consoleFull)** for PR 15817 at commit [`d589515`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15546: [SPARK-17982][SQL] SQLBuilder should wrap the generated ...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15546 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68528/ Test PASSed. ---

[GitHub] spark issue #15546: [SPARK-17982][SQL] SQLBuilder should wrap the generated ...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15546 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature e

[GitHub] spark issue #15546: [SPARK-17982][SQL] SQLBuilder should wrap the generated ...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15546 **[Test build #68528 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68528/consoleFull)** for PR 15546 at commit [`604414e`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15848: [SPARK-9487] Use the same num. worker threads in Java/Sc...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15848 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68529/ Test FAILed. ---

[GitHub] spark issue #15848: [SPARK-9487] Use the same num. worker threads in Java/Sc...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15848 Merged build finished. Test FAILed.

[GitHub] spark issue #15848: [SPARK-9487] Use the same num. worker threads in Java/Sc...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15848 **[Test build #68529 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68529/consoleFull)** for PR 15848 at commit [`cc9cbdc`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread techaddict
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/15843 @holdenk updated the description.

[GitHub] spark issue #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspark for ...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15817 **[Test build #68534 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68534/consoleFull)** for PR 15817 at commit [`d589515`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #15790: [SPARK-18264][SPARKR] build vignettes with package, upda...

2016-11-11 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/15790 Problem is the required and generated `vignette.rds` RDS file is a binary file? I'm not sure about checking in binaries in git, that would show up in a source-only release? Maybe create-

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15843 So this change looks good to me, but it seems like it fixes more than just the bug described in the JIRA & PR description with @jkbradley's change integrated (namely the issue with param copy which

[GitHub] spark issue #15852: Spark 18187

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15852 **[Test build #68533 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68533/consoleFull)** for PR 15852 at commit [`96d2dfe`](https://github.com/apache/spark/commit/9

[GitHub] spark issue #15851: [SPARK-18412][SPARKR][ML] Fix exception for some SparkR ...

2016-11-11 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/15851 Thanks for testing and fixing this. A couple of minor comments.

[GitHub] spark pull request #15851: [SPARK-18412][SPARKR][ML] Fix exception for some ...

2016-11-11 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/15851#discussion_r87646073 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/RWrapperUtils.scala --- @@ -41,5 +41,29 @@ object RWrapperUtils extends Logging { s"usin

[GitHub] spark pull request #15851: [SPARK-18412][SPARKR][ML] Fix exception for some ...

2016-11-11 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/15851#discussion_r87645242 --- Diff: R/pkg/inst/tests/testthat/test_mllib.R --- @@ -1039,6 +1045,12 @@ test_that("spark.gbt", { expect_equal(iris2$NumericSpecies, as.double(c

[GitHub] spark issue #15801: [SPARK-18337] Complete mode memory sinks should be able ...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15801 Merged build finished. Test PASSed.

[GitHub] spark issue #15801: [SPARK-18337] Complete mode memory sinks should be able ...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15801 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68527/ Test PASSed. ---

[GitHub] spark issue #15801: [SPARK-18337] Complete mode memory sinks should be able ...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15801 **[Test build #68527 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68527/consoleFull)** for PR 15801 at commit [`52ea3f1`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #15851: [SPARK-18412][SPARKR][ML] Fix exception for some ...

2016-11-11 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/15851#discussion_r87643716 --- Diff: R/pkg/inst/tests/testthat/test_mllib.R --- @@ -971,10 +971,15 @@ test_that("spark.randomForest Classification", { predictions <- collect(

[GitHub] spark pull request #15851: [SPARK-18412][SPARKR][ML] Fix exception for some ...

2016-11-11 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/15851#discussion_r87643171 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/RWrapperUtils.scala --- @@ -41,5 +41,29 @@ object RWrapperUtils extends Logging { s"usin

[GitHub] spark issue #15659: [SPARK-1267][SPARK-18129] Allow PySpark to be pip instal...

2016-11-11 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15659 Thanks @davies :) I think at this point we've probably got things pretty well covered (@viirya & @nchammas also have done a lot of review and @minrk took a look at the initial PR / directi

[GitHub] spark issue #15659: [SPARK-1267][SPARK-18129] Allow PySpark to be pip instal...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15659 **[Test build #68532 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68532/consoleFull)** for PR 15659 at commit [`210c9d4`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #15659: [SPARK-1267][SPARK-18129] Allow PySpark to be pip instal...

2016-11-11 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/15659 This looks good to me in general, but I did not check all the details.

[GitHub] spark pull request #15659: [SPARK-1267][SPARK-18129] Allow PySpark to be pip...

2016-11-11 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/15659#discussion_r87642242 --- Diff: python/MANIFEST.in --- @@ -0,0 +1,23 @@ +#!/usr/bin/env python --- End diff -- oh yah not needed, probably from copying the license

[GitHub] spark pull request #15659: [SPARK-1267][SPARK-18129] Allow PySpark to be pip...

2016-11-11 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/15659#discussion_r87641702 --- Diff: python/MANIFEST.in --- @@ -0,0 +1,23 @@ +#!/usr/bin/env python --- End diff -- ?

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-11-11 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13909 @cloud-fan @hvanhovell could you please review this since there is no failure now?

[GitHub] spark issue #15763: [SPARK-17348][SQL] Incorrect results from subquery trans...

2016-11-11 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/15763 Thanks for the tip on the code. I will work on it.

[GitHub] spark issue #15852: Spark 18187

2016-11-11 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15852 (Update the title please; see others for format)

[GitHub] spark issue #15848: [SPARK-9487] Use the same num. worker threads in Java/Sc...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15848 **[Test build #68531 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68531/consoleFull)** for PR 15848 at commit [`9540f70`](https://github.com/apache/spark/commit/9

[GitHub] spark issue #15763: [SPARK-17348][SQL] Incorrect results from subquery trans...

2016-11-11 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15763 I think it is also better that we start whitelisting operators instead of blacklisting them.

[GitHub] spark issue #15763: [SPARK-17348][SQL] Incorrect results from subquery trans...

2016-11-11 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15763 @nsyca it should be relatively straightforward to implement this here: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#

[GitHub] spark issue #15826: [SPARK-14077][ML][FOLLOW-UP] Minor refactor and cleanup ...

2016-11-11 Thread thunterdb
Github user thunterdb commented on the issue: https://github.com/apache/spark/pull/15826 @yanboliang that looks great, thank you. LGTM.

[GitHub] spark pull request #15683: [SPARK-18166][MLlib] Fix Poisson GLM bug due to w...

2016-11-11 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/15683#discussion_r87639131 --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/GeneralizedLinearRegressionSuite.scala --- @@ -88,6 +89,12 @@ class GeneralizedLinearRegression

[GitHub] spark pull request #15410: [SPARK-17843][Web UI] Indicate event logs pending...

2016-11-11 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15410

[GitHub] spark issue #15410: [SPARK-17843][Web UI] Indicate event logs pending for pr...

2016-11-11 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/15410 +1

[GitHub] spark issue #15763: [SPARK-17348][SQL] Incorrect results from subquery trans...

2016-11-11 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/15763 @hvanhovell Then we will need to walk from the top of the operator hosting the outer reference to the operator hosting the correlation to ensure there is no Aggregate or Window operator. If t

[GitHub] spark issue #15852: Spark 18187

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15852 **[Test build #68530 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68530/consoleFull)** for PR 15852 at commit [`6901eac`](https://github.com/apache/spark/commit/6

[GitHub] spark pull request #15172: [SPARK-13331] AES support for over-the-wire encry...

2016-11-11 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15172

[GitHub] spark pull request #15852: Spark 18187

2016-11-11 Thread tcondie
GitHub user tcondie opened a pull request: https://github.com/apache/spark/pull/15852 Spark 18187 ## What changes were proposed in this pull request? CompactibleFileStreamLog relies on "compactInterval" to detect a compaction batch. If the "compactInterval" is reset by user, Comp

[GitHub] spark issue #15172: [SPARK-13331] AES support for over-the-wire encryption

2016-11-11 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/15172 Merging to master.

[GitHub] spark issue #15546: [SPARK-17982][SQL] SQLBuilder should wrap the generated ...

2016-11-11 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/15546 Thank you, @gatorsmile.

[GitHub] spark issue #15763: [SPARK-17348][SQL] Incorrect results from subquery trans...

2016-11-11 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15763 @nsyca I feel that the current approach is too restrictive. I would prefer to just close the gap for Window and Aggregate.

[GitHub] spark issue #15848: [SPARK-9487v2] Use the same num. worker threads in Scala...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15848 **[Test build #68529 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68529/consoleFull)** for PR 15848 at commit [`cc9cbdc`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #15848: [SPARK-9487v2] Use the same num. worker threads in Scala...

2016-11-11 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15848 Also please fix the title to [SPARK-9487], and remove Python reference

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13909 Merged build finished. Test PASSed.

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13909 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68526/ Test PASSed. ---

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13909 **[Test build #68526 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68526/consoleFull)** for PR 13909 at commit [`a82ed38`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15848: [SPARK-9487v2] Use the same num. worker threads in Scala...

2016-11-11 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15848 Jenkins test this please

[GitHub] spark issue #15848: [SPARK-9487v2] Use the same num. worker threads in Scala...

2016-11-11 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15848 Jenkins add to whitelist

[GitHub] spark issue #15683: [SPARK-18166][MLlib] Fix Poisson GLM bug due to wrong re...

2016-11-11 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/15683 @sethah Thanks for your review and suggestion. I have made a new commit reflecting your comments. @srowen Thanks for all the suggestions. When do you think this change could be merged

[GitHub] spark issue #15848: [SPARK-9487v2] Use the same num. worker threads in Scala...

2016-11-11 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15848 @skanjila rebases are a fact of life, but they're not hard. I doubt it will be an issue here. No, you shouldn't make other changes in a separate PR. If TestSQLContext needs to be changed too plea

[GitHub] spark issue #15546: [SPARK-17982][SQL] SQLBuilder should wrap the generated ...

2016-11-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15546 retest this please

[GitHub] spark issue #15546: [SPARK-17982][SQL] SQLBuilder should wrap the generated ...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15546 **[Test build #68528 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68528/consoleFull)** for PR 15546 at commit [`604414e`](https://github.com/apache/spark/commit/6

[GitHub] spark issue #15546: [SPARK-17982][SQL] SQLBuilder should wrap the generated ...

2016-11-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15546 LGTM

[GitHub] spark issue #15801: [SPARK-18337] Complete mode memory sinks should be able ...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15801 **[Test build #68527 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68527/consoleFull)** for PR 15801 at commit [`52ea3f1`](https://github.com/apache/spark/commit/5

[GitHub] spark issue #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION should sup...

2016-11-11 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/15704 Thank you, @viirya. I added logic to prevent that case. Now the PR works more like Hive. Hi, @hvanhovell. Could you review when you have some time?

[GitHub] spark issue #15801: [SPARK-18337] Complete mode memory sinks should be able ...

2016-11-11 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15801 @tdas Addressed your comments. Test time increased to 2.5 seconds though, fyi.

[GitHub] spark pull request #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspa...

2016-11-11 Thread techaddict
Github user techaddict commented on a diff in the pull request: https://github.com/apache/spark/pull/15817#discussion_r87621123 --- Diff: python/pyspark/ml/feature.py --- @@ -1163,9 +1184,11 @@ class QuantileDiscretizer(JavaEstimator, HasInputCol, HasOutputCol, JavaMLReadab

[GitHub] spark pull request #15847: [SPARK-18387] [SQL] Add serialization to checkEva...

2016-11-11 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/15847#discussion_r87620051 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -1431,43 +1431,49 @@ case class FormatNumber(x

[GitHub] spark pull request #15847: [SPARK-18387] [SQL] Add serialization to checkEva...

2016-11-11 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/15847#discussion_r87619721 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala --- @@ -36,7 +36,7 @@ import org.apache.spark.unsafe.ty

[GitHub] spark pull request #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspa...

2016-11-11 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/15817#discussion_r87617539 --- Diff: python/pyspark/ml/feature.py --- @@ -158,21 +158,28 @@ class Bucketizer(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadable, Jav

[GitHub] spark pull request #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspa...

2016-11-11 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/15817#discussion_r87617849 --- Diff: python/pyspark/ml/feature.py --- @@ -1163,9 +1184,11 @@ class QuantileDiscretizer(JavaEstimator, HasInputCol, HasOutputCol, JavaMLReadab

[GitHub] spark issue #15763: [SPARK-17348][SQL] Incorrect results from subquery trans...

2016-11-11 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/15763 @hvanhovell ping... Is there anything I need to do to close this incorrect result problem?

[GitHub] spark issue #15800: [SPARK-18334] MinHash should use binary hash distance

2016-11-11 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/15800 @jkbradley Thanks for clarifying, I see your argument now. I agree that it makes sense from a statistical perspective. Still, I have not seen a single paper that describes anything quite exactly like

[GitHub] spark issue #15410: [SPARK-17843][Web UI] Indicate event logs pending for pr...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15410 Merged build finished. Test PASSed.

[GitHub] spark issue #15410: [SPARK-17843][Web UI] Indicate event logs pending for pr...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15410 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68524/ Test PASSed. ---

[GitHub] spark issue #15410: [SPARK-17843][Web UI] Indicate event logs pending for pr...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15410 **[Test build #68524 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68524/consoleFull)** for PR 15410 at commit [`46357ee`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15683: [SPARK-18166][MLlib] Fix Poisson GLM bug due to wrong re...

2016-11-11 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/15683 @actuaryzhang Thanks a lot for correcting this! I just had a small comment to make the additional test shorter.

[GitHub] spark pull request #15683: [SPARK-18166][MLlib] Fix Poisson GLM bug due to w...

2016-11-11 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/15683#discussion_r87609107 --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/GeneralizedLinearRegressionSuite.scala --- @@ -453,6 +464,56 @@ class GeneralizedLinearRegressi

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13909 **[Test build #68526 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68526/consoleFull)** for PR 13909 at commit [`a82ed38`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #15563: [SPARK-16759][CORE] Add a configuration property to pass...

2016-11-11 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/15563 +1. @mridulm do you have any further comments?

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13909 Merged build finished. Test FAILed.

[GitHub] spark pull request #15835: [SPARK-17059][SQL] Allow FileFormat to specify pa...

2016-11-11 Thread schlosna
Github user schlosna commented on a diff in the pull request: https://github.com/apache/spark/pull/15835#discussion_r87597499 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -478,23 +491,33 @@ case class FileSourceScanExec(

[GitHub] spark pull request #15835: [SPARK-17059][SQL] Allow FileFormat to specify pa...

2016-11-11 Thread schlosna
Github user schlosna commented on a diff in the pull request: https://github.com/apache/spark/pull/15835#discussion_r87597457 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -478,23 +491,33 @@ case class FileSourceScanExec(

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13909 **[Test build #68523 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68523/consoleFull)** for PR 13909 at commit [`357171f`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13909 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68523/ Test FAILed. ---

[GitHub] spark issue #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspark for ...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15817 Merged build finished. Test PASSed.

[GitHub] spark issue #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspark for ...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15817 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68525/ Test PASSed. ---

[GitHub] spark issue #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspark for ...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15817 **[Test build #68525 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68525/consoleFull)** for PR 15817 at commit [`234d165`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15835: [SPARK-17059][SQL] Allow FileFormat to specify partition...

2016-11-11 Thread pwoody
Github user pwoody commented on the issue: https://github.com/apache/spark/pull/15835 I've pushed up the ability to configure this feature being enabled as well. Here is a benchmark when writing out 200 files with this code: ``` withSQLConf(ParquetOutputFormat.ENABLE_JOB_S

[GitHub] spark issue #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspark for ...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15817 **[Test build #68525 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68525/consoleFull)** for PR 15817 at commit [`234d165`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspark for ...

2016-11-11 Thread techaddict
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/15817 @MLnick thanks for the review, addressed your comments.

[GitHub] spark pull request #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspa...

2016-11-11 Thread techaddict
Github user techaddict commented on a diff in the pull request: https://github.com/apache/spark/pull/15817#discussion_r87593705 --- Diff: python/pyspark/ml/feature.py --- @@ -1194,21 +1217,30 @@ class QuantileDiscretizer(JavaEstimator, HasInputCol, HasOutputCol, JavaMLReadab

[GitHub] spark pull request #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspa...

2016-11-11 Thread techaddict
Github user techaddict commented on a diff in the pull request: https://github.com/apache/spark/pull/15817#discussion_r87593693 --- Diff: python/pyspark/ml/feature.py --- @@ -158,19 +158,26 @@ class Bucketizer(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadable, Jav

[GitHub] spark issue #15840: [SPARK-18398][SQL] Fix nullabilities of MapObjects and o...

2016-11-11 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/15840 Does this approach avoid generating the code `convertedArray[loopIndex] = null;` if `lambdaFunction.nullable` is `true`? Is this by design? I know that it works correctly in this case since an array allocat

[GitHub] spark issue #15410: [SPARK-17843][Web UI] Indicate event logs pending for pr...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15410 **[Test build #68524 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68524/consoleFull)** for PR 15410 at commit [`46357ee`](https://github.com/apache/spark/commit/4

[GitHub] spark issue #15835: [SPARK-17059][SQL] Allow FileFormat to specify partition...

2016-11-11 Thread pwoody
Github user pwoody commented on the issue: https://github.com/apache/spark/pull/15835 Cool - I've added the caching, fixed style issues, and added pruning to the bucketed reads.

[GitHub] spark issue #15410: [SPARK-17843][Web UI] Indicate event logs pending for pr...

2016-11-11 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/15410 Jenkins, test this please

[GitHub] spark issue #14789: [SPARK-17209][YARN] Add the ability to manually update c...

2016-11-11 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/14789 So I agree these are separate cases, but I think it makes sense for the API to be very similar, or at least to live in the same sort of class. I don't think we want a public end-user API in SparkHadoopUtil.updat

[GitHub] spark pull request #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspa...

2016-11-11 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/15817#discussion_r87586676 --- Diff: python/pyspark/ml/feature.py --- @@ -1194,21 +1217,30 @@ class QuantileDiscretizer(JavaEstimator, HasInputCol, HasOutputCol, JavaMLReadab

[GitHub] spark pull request #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspa...

2016-11-11 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/15817#discussion_r87586787 --- Diff: python/pyspark/ml/feature.py --- @@ -1194,21 +1217,30 @@ class QuantileDiscretizer(JavaEstimator, HasInputCol, HasOutputCol, JavaMLReadab

[GitHub] spark pull request #15817: [SPARK-18366][PYSPARK] Add handleInvalid to Pyspa...

2016-11-11 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/15817#discussion_r87586948 --- Diff: python/pyspark/ml/feature.py --- @@ -158,19 +158,26 @@ class Bucketizer(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadable, Jav

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13909 **[Test build #68523 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68523/consoleFull)** for PR 13909 at commit [`357171f`](https://github.com/apache/spark/commit/3

[GitHub] spark pull request #13909: [SPARK-16213][SQL] Reduce runtime overhead of a p...

2016-11-11 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/13909#discussion_r87586194 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -43,11 +43,38 @@ trait ExpressionEvalHelper

[GitHub] spark issue #15800: [SPARK-18334] MinHash should use binary hash distance

2016-11-11 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/15800 @Yunni Spark DF should have a `posexplode`: ``` scala> val df = Seq((0, Array(Vectors.dense(1, 2), Vectors.dense(5, 4))), (1, Array(Vectors.dense(3, 2), Vectors.dense(1, 2.toDF("id",

[GitHub] spark issue #15851: [SPARK-18412][SPARKR][ML] Fix exception for some SparkR ...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15851 Merged build finished. Test PASSed.

[GitHub] spark issue #15851: [SPARK-18412][SPARKR][ML] Fix exception for some SparkR ...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15851 **[Test build #68522 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68522/consoleFull)** for PR 15851 at commit [`d0d7c28`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15851: [SPARK-18412][SPARKR][ML] Fix exception for some SparkR ...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15851 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68522/ Test PASSed. ---

[GitHub] spark issue #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION should sup...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15704 Merged build finished. Test PASSed.

[GitHub] spark issue #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION should sup...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15704 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68521/ Test PASSed. ---
