[GitHub] spark issue #18694: [SPARK-21492][SQL] Memory leak in SortMergeJoin

2017-07-26 Thread zhzhan
Github user zhzhan commented on the issue: https://github.com/apache/spark/pull/18694 Close the PR and will work on adding close interface for the iterator used in SparkSQL to remove extra overhead. --- If your project is set up for it, you can reply to this email and have your

[GitHub] spark pull request #18694: [SPARK-21492][SQL] Memory leak in SortMergeJoin

2017-07-26 Thread zhzhan
Github user zhzhan closed the pull request at: https://github.com/apache/spark/pull/18694

[GitHub] spark pull request #18679: [SPARK-21319][SQL] Fix memory leak in sorter

2017-07-26 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18679#discussion_r129751794 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java --- @@ -138,15 +147,20 @@ private UnsafeExternalSorter(

[GitHub] spark issue #17180: [SPARK-19839][Core]release longArray in BytesToBytesMap

2017-07-26 Thread zhzhan
Github user zhzhan commented on the issue: https://github.com/apache/spark/pull/17180 Will fix the unit test.

[GitHub] spark pull request #18739: [WIP][SPARK-21539][CORE] Job should not be aborte...

2017-07-26 Thread caneGuy
Github user caneGuy commented on a diff in the pull request: https://github.com/apache/spark/pull/18739#discussion_r129753610 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -665,10 +667,15 @@ private[spark] class TaskSetManager(

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-07-26 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 jenkins retest this please

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #79987 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79987/testReport)** for PR 18664 at commit

[GitHub] spark issue #18746: [ML][Python] Implemented UnaryTransformer in Python

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18746 **[Test build #79988 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79988/testReport)** for PR 18746 at commit

[GitHub] spark pull request #18746: Implemented UnaryTransformer in Python

2017-07-26 Thread ajaysaini725
GitHub user ajaysaini725 opened a pull request: https://github.com/apache/spark/pull/18746 Implemented UnaryTransformer in Python ## What changes were proposed in this pull request? Implemented UnaryTransformer in Python (Please fill in changes proposed in this

[GitHub] spark issue #18746: Implemented UnaryTransformer in Python

2017-07-26 Thread ajaysaini725
Github user ajaysaini725 commented on the issue: https://github.com/apache/spark/pull/18746 @jkbradley @thunterdb @MrBago Could you please review this?

[GitHub] spark issue #17180: [SPARK-19839][Core]release longArray in BytesToBytesMap

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17180 **[Test build #79989 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79989/testReport)** for PR 17180 at commit

[GitHub] spark issue #18746: [ML][Python] Implemented UnaryTransformer in Python

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18746 Merged build finished. Test PASSed.

[GitHub] spark issue #18746: [ML][Python] Implemented UnaryTransformer in Python

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18746 **[Test build #79988 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79988/testReport)** for PR 18746 at commit

[GitHub] spark issue #18746: [ML][Python] Implemented UnaryTransformer in Python

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18746 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79988/ Test PASSed. ---

[GitHub] spark issue #18744: [SPARK-21538][SQL] Fix attribute resolution inconsistenc...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18744 Merged build finished. Test PASSed.

[GitHub] spark issue #18744: [SPARK-21538][SQL] Fix attribute resolution inconsistenc...

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18744 **[Test build #79986 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79986/testReport)** for PR 18744 at commit

[GitHub] spark issue #18744: [SPARK-21538][SQL] Fix attribute resolution inconsistenc...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18744 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79986/ Test PASSed. ---

[GitHub] spark issue #18305: [SPARK-20988][ML] Logistic regression uses aggregator hi...

2017-07-26 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/18305 Merged to master. Thanks @sethah, and thanks all for reviews.

[GitHub] spark issue #18305: [SPARK-20988][ML] Logistic regression uses aggregator hi...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18305 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79962/ Test PASSed. ---

[GitHub] spark issue #18305: [SPARK-20988][ML] Logistic regression uses aggregator hi...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18305 Merged build finished. Test PASSed.

[GitHub] spark issue #18731: [SPARK-20990][SQL] Read all JSON documents in files when...

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18731 **[Test build #79963 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79963/testReport)** for PR 18731 at commit

[GitHub] spark pull request #18554: [SPARK-21306][ML] OneVsRest should support setWei...

2017-07-26 Thread facaiy
Github user facaiy commented on a diff in the pull request: https://github.com/apache/spark/pull/18554#discussion_r129562237 --- Diff: python/pyspark/ml/tests.py --- @@ -1255,6 +1255,24 @@ def test_output_columns(self): output = model.transform(df)

[GitHub] spark pull request #18610: [SPARK-21386] ML LinearRegression supports warm s...

2017-07-26 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18610#discussion_r129574777 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -309,6 +313,23 @@ private[ml] object DefaultParamsWriter { val

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18655 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79955/ Test PASSed. ---

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18655 Merged build finished. Test PASSed.

[GitHub] spark issue #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a multip...

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18503 **[Test build #79956 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79956/testReport)** for PR 18503 at commit

[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18709 **[Test build #79959 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79959/testReport)** for PR 18709 at commit

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18655 **[Test build #79957 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79957/testReport)** for PR 18655 at commit

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18655 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79957/ Test FAILed. ---

[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18709 Merged build finished. Test FAILed.

[GitHub] spark issue #18738: Typo in comment

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18738 Can one of the admins verify this patch?

[GitHub] spark pull request #18695: [SPARK-12717][PYTHON] Adding thread-safe broadcas...

2017-07-26 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18695#discussion_r129516461 --- Diff: python/pyspark/context.py --- @@ -195,7 +195,7 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize, #

[GitHub] spark pull request #18659: [SPARK-21404][PYSPARK][WIP] Simple Python Vectori...

2017-07-26 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r129522956 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala --- @@ -132,6 +135,61 @@ private[sql] object ArrowConverters {

[GitHub] spark issue #18554: [SPARK-21306][ML] OneVsRest should support setWeightCol

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18554 **[Test build #79964 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79964/testReport)** for PR 18554 at commit

[GitHub] spark pull request #18554: [SPARK-21306][ML] OneVsRest should support setWei...

2017-07-26 Thread facaiy
Github user facaiy commented on a diff in the pull request: https://github.com/apache/spark/pull/18554#discussion_r129562189 --- Diff: python/pyspark/ml/classification.py --- @@ -1517,20 +1517,22 @@ class OneVsRest(Estimator, OneVsRestParams, MLReadable, MLWritable):

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-07-26 Thread jreback
Github user jreback commented on the issue: https://github.com/apache/spark/pull/18664 I cannot repro this; can you show what ``item['timezone']`` is?

[GitHub] spark issue #17180: [SPARK-19839][Core]release longArray in BytesToBytesMap

2017-07-26 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/17180 Is it better to fix this test instead of remove it?

[GitHub] spark issue #18305: [SPARK-20988][ML] Logistic regression uses aggregator hi...

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18305 **[Test build #79962 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79962/testReport)** for PR 18305 at commit

[GitHub] spark pull request #18305: [SPARK-20988][ML] Logistic regression uses aggreg...

2017-07-26 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18305

[GitHub] spark pull request #18739: [WIP][SPARK-21539][CORE] Job should not be aborte...

2017-07-26 Thread caneGuy
GitHub user caneGuy opened a pull request: https://github.com/apache/spark/pull/18739 [WIP][SPARK-21539][CORE] Job should not be aborted when dynamic allocation is en… …abled or spark.executor.instances larger then current allocated number by yarn ## What changes were

[GitHub] spark issue #18739: [WIP][SPARK-21539][CORE] Job should not be aborted when ...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18739 Can one of the admins verify this patch?

[GitHub] spark issue #18737: [SPARK-21536][R] Remove the workaroud to allow dots in f...

2017-07-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18737 cc @felixcheung and @shivaram, I first was worried of this behaviour change but I guess this was rather be a workaround for a known bug that should be removed out and we have warned properly so

[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18709 **[Test build #79959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79959/testReport)** for PR 18709 at commit

[GitHub] spark issue #18652: [SPARK-21497][SQL][WIP] Pull non-deterministic equi join...

2017-07-26 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18652 @gatorsmile When the flag is enabled, we don't follow Hive on non-deterministic join conditions. The difference are: * Hive allows non-deterministic expressions in equi join keys

[GitHub] spark issue #18737: [SPARK-21536][R] Remove the workaroud to allow dots in f...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18737 Merged build finished. Test FAILed.

[GitHub] spark issue #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a multip...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18503 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79956/ Test FAILed. ---

[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18709 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79959/ Test FAILed. ---

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18655 Merged build finished. Test FAILed.

[GitHub] spark issue #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a multip...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18503 Merged build finished. Test FAILed.

[GitHub] spark issue #18737: [WIP][SPARK-21536][R] Remove the workaroud to allow dots...

2017-07-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18737 Hm.. this is a bigger change than I thought .. I mean the change itself here should be correct as we support dots in columns in Scala side but it looks there are few bugs related with dots in

[GitHub] spark issue #18725: [SPARK-21520][SQL]Hivetable scan for all the columns the...

2017-07-26 Thread heary-cao
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/18725 @viirya @baibaichen thank your for review it. I made a comparison test: ``` select k,k,sum(id) from (select d004 as id, floor(c010 * 1) as k, ceil(c010) as cceila from

[GitHub] spark issue #18725: [SPARK-21520][SQL]Hivetable scan for all the columns the...

2017-07-26 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18725 @heary-cao, is the better performance with your fix? e.g. changing RDG's deterministic property from false to true? ``` override def deterministic: Boolean = true ```

[GitHub] spark issue #18725: [SPARK-21520][SQL]Hivetable scan for all the columns the...

2017-07-26 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18725 @heary-cao your fix is wrong.

[GitHub] spark pull request #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConvert...

2017-07-26 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18655#discussion_r129487716 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowWriter.scala --- @@ -0,0 +1,383 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #18652: [SPARK-21497][SQL][WIP] Pull non-deterministic equi join...

2017-07-26 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18652 It is a good question. Based on previous discussion, I think Join operator has no unique result in the non-deterministic case. The migration issue from Hive is because this kind of queries can't run

[GitHub] spark issue #18725: [SPARK-21520][SQL]Hivetable scan for all the columns the...

2017-07-26 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18725 @baibaichen I agree. Looks correct.

[GitHub] spark issue #18632: [SPARK-21412][SQL] Reset BufferHolder while initialize a...

2017-07-26 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/18632 @cloud-fan You are right, thanks. I will close this PR.

[GitHub] spark pull request #18738: Typo in comment

2017-07-26 Thread nahoj
GitHub user nahoj opened a pull request: https://github.com/apache/spark/pull/18738 Typo in comment - You can merge this pull request into a Git repository by running: $ git pull https://github.com/nahoj/spark patch-1 Alternatively you can review and apply these changes as

[GitHub] spark issue #18725: [SPARK-21520][SQL]Hivetable scan for all the columns the...

2017-07-26 Thread heary-cao
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/18725 @baibaichen Okay, I try to modify this particular scenario by split it to two Projects. thanks.

[GitHub] spark issue #18738: Typo in comment

2017-07-26 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18738 Can you have a look for similar typos, or others in this file? we encourage people to submit more than just one minor typo fix in a PR if possible

[GitHub] spark issue #18738: Typo in comment

2017-07-26 Thread nahoj
Github user nahoj commented on the issue: https://github.com/apache/spark/pull/18738 Sorry, I don't have time to proof-read the docs, I just saw this one typo as it is in the summary of this much-used class.

[GitHub] spark pull request #18728: [SPARK-21524] [ML] unit test fix: ValidatorParams...

2017-07-26 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18728

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18655 **[Test build #79960 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79960/testReport)** for PR 18655 at commit

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18655 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79960/ Test PASSed. ---

[GitHub] spark issue #18305: [SPARK-20988][ML] Logistic regression uses aggregator hi...

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18305 **[Test build #79962 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79962/testReport)** for PR 18305 at commit

[GitHub] spark pull request #18554: [SPARK-21306][ML] OneVsRest should support setWei...

2017-07-26 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18554#discussion_r129532746 --- Diff: python/pyspark/ml/classification.py --- @@ -1517,20 +1517,22 @@ class OneVsRest(Estimator, OneVsRestParams, MLReadable, MLWritable):

[GitHub] spark pull request #18554: [SPARK-21306][ML] OneVsRest should support setWei...

2017-07-26 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18554#discussion_r129533677 --- Diff: python/pyspark/ml/tests.py --- @@ -1255,6 +1255,24 @@ def test_output_columns(self): output = model.transform(df)

[GitHub] spark issue #18736: [SPARK-21481][ML] Add indexOf method for ml.feature.Hash...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18736 Can one of the admins verify this patch?

[GitHub] spark pull request #18737: [SPARK-21536][R] Remove the workaroud to allow do...

2017-07-26 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/18737 [SPARK-21536][R] Remove the workaroud to allow dots in field names in R's createDataFame ## What changes were proposed in this pull request? This PR removes the workaround for dots in

[GitHub] spark issue #18554: [SPARK-21306][ML] OneVsRest should support setWeightCol

2017-07-26 Thread facaiy
Github user facaiy commented on the issue: https://github.com/apache/spark/pull/18554 ping @holdenk @yanboliang

[GitHub] spark issue #18731: [SPARK-20990][SQL] Read all JSON documents in files when...

2017-07-26 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/18731 I am debugging, thanks.

[GitHub] spark issue #18737: [WIP][SPARK-21536][R] Remove the workaroud to allow dots...

2017-07-26 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/18737 it's very likely we need to make sure a column name with `.` is specified with backtick, esp. when referenced in SQL expression...

[GitHub] spark issue #18513: [SPARK-13969][ML] Add FeatureHasher transformer

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18513 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79961/ Test PASSed. ---

[GitHub] spark issue #18513: [SPARK-13969][ML] Add FeatureHasher transformer

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18513 **[Test build #79961 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79961/testReport)** for PR 18513 at commit

[GitHub] spark issue #18305: [SPARK-20988][ML] Logistic regression uses aggregator hi...

2017-07-26 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/18305 Jenkins retest this please

[GitHub] spark pull request #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConvert...

2017-07-26 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18655#discussion_r129488362 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/arrow/ArrowConvertersSuite.scala --- @@ -857,6 +857,449 @@ class ArrowConvertersSuite

[GitHub] spark issue #18725: [SPARK-21520][SQL]Hivetable scan for all the columns the...

2017-07-26 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18725 The `HiveTableScans` strategy need `CatalogRelation`, but it's `LogicalRelation` in my case. Actually, the hive table is external table in my test, I guess that's the reason. I believe

[GitHub] spark pull request #18632: [SPARK-21412][SQL] Reset BufferHolder while initi...

2017-07-26 Thread gczsjdy
Github user gczsjdy closed the pull request at: https://github.com/apache/spark/pull/18632

[GitHub] spark issue #18725: [SPARK-21520][SQL]Hivetable scan for all the columns the...

2017-07-26 Thread heary-cao
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/18725 @baibaichen yes, In my test environment `Time taken: 557.276 seconds, Fetched 1 row(s)` VS `Time taken: 5997.238 seconds, Fetched 1 row(s)` But I'm not sure about the

[GitHub] spark issue #18728: [SPARK-21524] [ML] unit test fix: ValidatorParamsSuiteHe...

2017-07-26 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18728 merged to master

[GitHub] spark issue #18737: [SPARK-21536][R] Remove the workaroud to allow dots in f...

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18737 **[Test build #79958 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79958/testReport)** for PR 18737 at commit

[GitHub] spark issue #18737: [SPARK-21536][R] Remove the workaroud to allow dots in f...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18737 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79958/ Test FAILed. ---

[GitHub] spark issue #18725: [SPARK-21520][SQL]Hivetable scan for all the columns the...

2017-07-26 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18725 I think it's a `HiveTableScan`, rather than `FileSourceScanExec`?

[GitHub] spark issue #18737: [WIP][SPARK-21536][R] Remove the workaroud to allow dots...

2017-07-26 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/18737 It's a breaking change, but IMO one we need since we have quite a bit of feedback on this. re: test failure ``` java.lang.IllegalArgumentException: Field "Sepal_Length" does not

[GitHub] spark issue #18731: [SPARK-20990][SQL] Read all JSON documents in files when...

2017-07-26 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/18731 The reason for the UT failure is that in these two UTs we are passing invalid JSON (mind the extra closing curly brace): -

[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...

2017-07-26 Thread heary-cao
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/18555 @gatorsmile @cloud-fan I added new test case again. except ``` DYN_ALLOCATION_MIN_EXECUTORS DYN_ALLOCATION_INITIAL_EXECUTORS DYN_ALLOCATION_MAX_EXECUTORS

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18655 Merged build finished. Test PASSed.

[GitHub] spark issue #18737: [SPARK-21536][R] Remove the workaroud to allow dots in f...

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18737 **[Test build #79958 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79958/testReport)** for PR 18737 at commit

[GitHub] spark issue #18652: [SPARK-21497][SQL][WIP] Pull non-deterministic equi join...

2017-07-26 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18652 Then, will this PR resolve the migration issue from Hive workloads?

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18655 **[Test build #79955 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79955/testReport)** for PR 18655 at commit

[GitHub] spark pull request #18737: [WIP][SPARK-21536][R] Remove the workaroud to all...

2017-07-26 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/18737

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18655 **[Test build #79957 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79957/testReport)** for PR 18655 at commit

[GitHub] spark pull request #18736: [SPARK-21481][ML] Add indexOf method for ml.featu...

2017-07-26 Thread facaiy
GitHub user facaiy opened a pull request: https://github.com/apache/spark/pull/18736 [SPARK-21481][ML] Add indexOf method for ml.feature.HashingTF ## What changes were proposed in this pull request? Add indexOf method for ml.feature.HashingTF. The PR is a hotfix by

[GitHub] spark issue #18731: [SPARK-20990][SQL] Read all JSON documents in files when...

2017-07-26 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18731 Could you fix the bug?

[GitHub] spark issue #18725: [SPARK-21520][SQL]Hivetable scan for all the columns the...

2017-07-26 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18725 It's another issue about non-deterministic. When generating SparkPlan in `FileSourceStrategy` , `PhysicalOperation` is used to extract projects and filters on top of relation. But with

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-26 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18655 Jenkins, retest this please.

[GitHub] spark issue #18737: [WIP][SPARK-21536][R] Remove the workaroud to allow dots...

2017-07-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18737 Yea. The test failure above itself is legitimate but while manually running and debugging few more tests with some more fixes, it printed: ``` Failed

[GitHub] spark issue #18337: [SPARK-21131][GraphX] Fix batch gradient bug in SVDPlusP...

2017-07-26 Thread daniellaah
Github user daniellaah commented on the issue: https://github.com/apache/spark/pull/18337 I also tested the SVDPlusPlus on movielens-100k dataset. The algorithm just diverged. And the mse on the dataset gets 2.14748364347152E9. I tested @lxmly 's code as well, it works but I

[GitHub] spark issue #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation generat...

2017-07-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18702 I guess it probably will take about a week more for my Apache account creation (according to the doc).

[GitHub] spark issue #18513: [SPARK-13969][ML] Add FeatureHasher transformer

2017-07-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18513 **[Test build #79961 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79961/testReport)** for PR 18513 at commit
