[GitHub] spark issue #19779: [SPARK-17920][SPARK-19580][SPARK-19878][SQL] Support wri...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19779 The fix looks good to me. You can address the comments left by @dongjoon-hyun --- - To unsubscribe, e-mail:

[GitHub] spark issue #19787: [SPARK-22541][SQL] Explicitly claim that Python udfs can...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19787 **[Test build #84056 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84056/testReport)** for PR 19787 at commit

[GitHub] spark issue #19787: [SPARK-22541][SQL] Explicitly claim that Python udfs can...

2017-11-20 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19787 Thanks @HyukjinKwon I've revised the doc for pandas_udf too. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #19779: [SPARK-17920][SPARK-19580][SPARK-19878][SQL] Supp...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19779#discussion_r152198714 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/VersionsSuite.scala --- @@ -841,6 +841,76 @@ class VersionsSuite extends

[GitHub] spark pull request #19787: [SPARK-22541][SQL] Explicitly claim that Python u...

2017-11-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19787#discussion_r152198789 --- Diff: python/pyspark/sql/functions.py --- @@ -2198,12 +2198,9 @@ def udf(f=None, returnType=StringType()): duplicate invocations may be

[GitHub] spark pull request #19787: [SPARK-22541][SQL] Explicitly claim that Python u...

2017-11-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19787#discussion_r152198691 --- Diff: python/pyspark/sql/functions.py --- @@ -2205,6 +2205,10 @@ def udf(f=None, returnType=StringType()): rows that do not satisfy the

[GitHub] spark pull request #19787: [SPARK-22541][SQL] Explicitly claim that Python u...

2017-11-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19787#discussion_r152198181 --- Diff: python/pyspark/sql/functions.py --- @@ -2198,12 +2198,9 @@ def udf(f=None, returnType=StringType()): duplicate invocations may be

[GitHub] spark issue #19737: [SPARK-22508][SQL] Fix 64KB JVM bytecode limit problem w...

2017-11-20 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19737 LGTM pending jenkins

[GitHub] spark issue #19790: [SPARK-22569] [SQL] Clean usage of addMutableState and s...

2017-11-20 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19790 LGTM, can you improve the PR description to describe the major change? e.g. ``` 1. replace hardcoded type string to ctx.JAVA_BOOLEAN etc. 2. create a default value of the initCode for

[GitHub] spark pull request #19790: [SPARK-22569] [SQL] Clean usage of addMutableStat...

2017-11-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19790#discussion_r152195602 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -205,27 +209,32 @@ case class

[GitHub] spark pull request #19787: [SPARK-22541][SQL] Explicitly claim that Python u...

2017-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19787#discussion_r152195436 --- Diff: python/pyspark/sql/functions.py --- @@ -2198,12 +2198,9 @@ def udf(f=None, returnType=StringType()): duplicate invocations may be

[GitHub] spark issue #19790: [SPARK-22569] [SQL] Clean usage of addMutableState and s...

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19790 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84045/ Test FAILed.

[GitHub] spark issue #19790: [SPARK-22569] [SQL] Clean usage of addMutableState and s...

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19790 Merged build finished. Test FAILed.

[GitHub] spark issue #19790: [SPARK-22569] [SQL] Clean usage of addMutableState and s...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19790 **[Test build #84045 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84045/testReport)** for PR 19790 at commit

[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19776 **[Test build #84055 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84055/testReport)** for PR 19776 at commit

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread jliwork
Github user jliwork commented on a diff in the pull request: https://github.com/apache/spark/pull/19776#discussion_r152193847 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala --- @@ -0,0 +1,306 @@ +/* + * Licensed

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread jliwork
Github user jliwork commented on a diff in the pull request: https://github.com/apache/spark/pull/19776#discussion_r152193833 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala --- @@ -0,0 +1,305 @@ +/* + * Licensed

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152193763 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -472,15 +475,66 @@ private[spark] class MapOutputTrackerMaster(

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

2017-11-20 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19730 @cloud-fan I see, I will create another PR to fix this global variable issue. @gatorsmile I will check other calls.

[GitHub] spark issue #19778: [SPARK-22550][SQL] Fix 64KB JVM bytecode limit problem w...

2017-11-20 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19778 LGTM pending jenkins

[GitHub] spark pull request #19788: [SPARK-9853][Core] Optimize shuffle fetch of cont...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19788#discussion_r152193203 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -812,10 +812,14 @@ private[spark] object MapOutputTracker extends Logging {

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

2017-11-20 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19730 I think this is a different issue and should be fixed with another PR. @kiszk how about we change the test to cast int to long to avoid this issue?

[GitHub] spark issue #19737: [SPARK-22508][SQL] Fix 64KB JVM bytecode limit problem w...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19737 **[Test build #84054 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84054/testReport)** for PR 19737 at commit

[GitHub] spark pull request #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "...

2017-11-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17436#discussion_r152192733 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -140,6 +140,13 @@ object SQLConf { .booleanConf

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19776#discussion_r152192018 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala --- @@ -0,0 +1,306 @@ +/* + *

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19776#discussion_r152191858 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala --- @@ -0,0 +1,305 @@ +/* + *

[GitHub] spark issue #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior of time...

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19607 Merged build finished. Test PASSed.

[GitHub] spark issue #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior of time...

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19607 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84044/ Test PASSed.

[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19776 **[Test build #84053 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84053/testReport)** for PR 19776 at commit

[GitHub] spark issue #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior of time...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19607 **[Test build #84044 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84044/testReport)** for PR 19607 at commit

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19776#discussion_r152191659 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala --- @@ -0,0 +1,305 @@ +/* + *

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread jliwork
Github user jliwork commented on a diff in the pull request: https://github.com/apache/spark/pull/19776#discussion_r152191430 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala --- @@ -0,0 +1,305 @@ +/* + * Licensed

[GitHub] spark issue #19778: [SPARK-22550][SQL] Fix 64KB JVM bytecode limit problem w...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19778 **[Test build #84052 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84052/testReport)** for PR 19778 at commit

[GitHub] spark issue #19389: [SPARK-22165][SQL] Resolve type conflicts between decima...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19389 **[Test build #84051 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84051/testReport)** for PR 19389 at commit

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19776#discussion_r152190564 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala --- @@ -0,0 +1,305 @@ +/* + *

[GitHub] spark pull request #19778: [SPARK-22550][SQL] Fix 64KB JVM bytecode limit pr...

2017-11-20 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19778#discussion_r152190136 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -841,6 +825,26 @@ class CodegenContext {

[GitHub] spark pull request #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "...

2017-11-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/17436#discussion_r152189670 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -140,6 +140,13 @@ object SQLConf { .booleanConf

[GitHub] spark issue #19389: [SPARK-22165][SQL] Resolve type conflicts between decima...

2017-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19389 I have just made a table to check the diff easily: **Before**: |InputA \

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19730 Yeah. We can fix them in this PR. BTW, could you check all the other calls of `addMutableState` and fix them too?

[GitHub] spark issue #19755: [SPARK-22524][SQL] Subquery shows reused on UI SQL tab e...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19755 I can't find a way to distinguish `reused` and `unreused` subquery. For example, in the `ReuseSubquery` rule, after seeing the 1st SubqueryExec(with `unreused` in name), it's buffered. When the

[GitHub] spark issue #19778: [SPARK-22550][SQL] Fix 64KB JVM bytecode limit problem w...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19778 **[Test build #84050 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84050/testReport)** for PR 19778 at commit

[GitHub] spark issue #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior of time...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19607 **[Test build #84048 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84048/testReport)** for PR 19607 at commit

[GitHub] spark issue #19389: [SPARK-22165][SQL] Resolve type conflicts between decima...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19389 **[Test build #84049 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84049/testReport)** for PR 19389 at commit

[GitHub] spark issue #19776: [SPARK-22548][SQL] Incorrect nested AND expression pushe...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19776 **[Test build #84047 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84047/testReport)** for PR 19776 at commit

[GitHub] spark pull request #19778: [SPARK-22550][SQL] Fix 64KB JVM bytecode limit pr...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19778#discussion_r152186300 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -841,6 +825,26 @@ class

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread jliwork
Github user jliwork commented on a diff in the pull request: https://github.com/apache/spark/pull/19776#discussion_r152186116 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite // The

[GitHub] spark issue #19746: [SPARK-22346][ML] VectorSizeHint Transformer for using V...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19746 **[Test build #84046 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84046/testReport)** for PR 19746 at commit

[GitHub] spark pull request #19737: [SPARK-22508][SQL] Fix 64KB JVM bytecode limit pr...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19737#discussion_r152185474 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeRowJoiner.scala --- @@ -154,7 +164,10 @@ object

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152185531 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "array in the

[GitHub] spark pull request #19737: [SPARK-22508][SQL] Fix 64KB JVM bytecode limit pr...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19737#discussion_r152185361 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeRowJoiner.scala --- @@ -88,8 +92,14 @@ object

[GitHub] spark issue #19790: [SPARK-22569] [SQL] Clean usage of addMutableState and s...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19790 **[Test build #84045 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84045/testReport)** for PR 19790 at commit

[GitHub] spark issue #19790: [SPARK-22569] [SQL] Clean usage of addMutableState and s...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19790 cc @cloud-fan @kiszk

[GitHub] spark pull request #19790: [SPARK-22569] [SQL] Clean usage of addMutableStat...

2017-11-20 Thread gatorsmile
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/19790 [SPARK-22569] [SQL] Clean usage of addMutableState and splitExpressions ## What changes were proposed in this pull request? This PR is to clean the usage of addMutableState and

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19439 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84042/ Test PASSed.

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19439 Merged build finished. Test PASSed.

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19439 **[Test build #84042 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84042/testReport)** for PR 19439 at commit

[GitHub] spark issue #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior of time...

2017-11-20 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19607 Well, I'd leave the config `spark.sql.execution.pandas.respectSessionTimeZone` as it is for now to be safe as @gatorsmile mentioned before. And as we discussed in dev list, I'll update this to

[GitHub] spark issue #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior of time...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19607 **[Test build #84044 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84044/testReport)** for PR 19607 at commit

[GitHub] spark pull request #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior ...

2017-11-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19607#discussion_r152174929 --- Diff: python/pyspark/worker.py --- @@ -150,7 +150,8 @@ def read_udfs(pickleSer, infile, eval_type): if eval_type ==

[GitHub] spark pull request #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior ...

2017-11-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19607#discussion_r152174935 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala --- @@ -58,6 +59,11 @@ class ArrowPythonRunner(

[GitHub] spark pull request #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior ...

2017-11-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19607#discussion_r152174813 --- Diff: python/pyspark/sql/dataframe.py --- @@ -1913,7 +1920,16 @@ def toPandas(self): for f, t in dtype.items():

[GitHub] spark issue #19753: [SPARK-22521][ML] VectorIndexerModel support handle unse...

2017-11-20 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19753 LGTM with two minor comments.

[GitHub] spark pull request #19753: [SPARK-22521][ML] VectorIndexerModel support hand...

2017-11-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19753#discussion_r152172693 --- Diff: python/pyspark/ml/feature.py --- @@ -2490,7 +2490,8 @@ def setParams(self, inputCols=None, outputCol=None): @inherit_doc

[GitHub] spark pull request #19753: [SPARK-22521][ML] VectorIndexerModel support hand...

2017-11-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19753#discussion_r152173508 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/VectorIndexer.scala --- @@ -55,7 +55,7 @@ private[ml] trait VectorIndexerParams extends Params

[GitHub] spark issue #19787: [SPARK-22541][SQL] Explicitly claim that Python udfs can...

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19787 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84043/ Test PASSed.

[GitHub] spark issue #19787: [SPARK-22541][SQL] Explicitly claim that Python udfs can...

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19787 Merged build finished. Test PASSed.

[GitHub] spark issue #19787: [SPARK-22541][SQL] Explicitly claim that Python udfs can...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19787 **[Test build #84043 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84043/testReport)** for PR 19787 at commit

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16578 Merged build finished. Test PASSed.

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16578 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84041/ Test PASSed.

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16578 **[Test build #84041 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84041/testReport)** for PR 16578 at commit

[GitHub] spark issue #19787: [SPARK-22541][SQL] Explicitly claim that Python udfs can...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19787 **[Test build #84043 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84043/testReport)** for PR 19787 at commit

[GitHub] spark pull request #19787: [SPARK-22541][SQL] Explicitly claim that Python u...

2017-11-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19787#discussion_r152170467 --- Diff: python/pyspark/sql/functions.py --- @@ -2205,6 +2205,10 @@ def udf(f=None, returnType=StringType()): rows that do not satisfy the

[GitHub] spark pull request #19781: [SPARK-22445][SQL][FOLLOW-UP] Respect children's ...

2017-11-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19781#discussion_r152168668 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala --- @@ -76,20 +76,23 @@ case class

[GitHub] spark issue #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - Basic Sc...

2017-11-20 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/19468 Sounds good to me, also cc @sameeragarwal @ueshin

[GitHub] spark pull request #19746: [SPARK-22346][ML] VectorSizeHint Transformer for ...

2017-11-20 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19746#discussion_r152166365 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorSizeHintSuite.scala --- @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #19381: [SPARK-10884][ML] Support prediction on single instance ...

2017-11-20 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/19381 I'll try to take a look but am pretty swamped currently. CC @yanboliang @MLnick @dbtsai @holdenk might you have time?

[GitHub] spark issue #19753: [SPARK-22521][ML] VectorIndexerModel support handle unse...

2017-11-20 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/19753 I'll try to take a look but am pretty swamped currently. CC @yanboliang @MLnick @dbtsai @holdenk might you have time?

[GitHub] spark issue #19770: [SPARK-21571][WEB UI] Spark history server leaves incomp...

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19770 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84040/ Test PASSed.

[GitHub] spark issue #19770: [SPARK-21571][WEB UI] Spark history server leaves incomp...

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19770 Merged build finished. Test PASSed.

[GitHub] spark issue #19770: [SPARK-21571][WEB UI] Spark history server leaves incomp...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19770 **[Test build #84040 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84040/testReport)** for PR 19770 at commit

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19439 **[Test build #84042 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84042/testReport)** for PR 19439 at commit

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-11-20 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/19439 @jkbradley done! rebased to latest master. Thanks!

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

2017-11-20 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19730 Working for this. I met another problem. [This

[GitHub] spark pull request #19746: [SPARK-22346][ML] VectorSizeHint Transformer for ...

2017-11-20 Thread MrBago
Github user MrBago commented on a diff in the pull request: https://github.com/apache/spark/pull/19746#discussion_r152159939 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorSizeHintSuite.scala --- @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #19746: [SPARK-22346][ML] VectorSizeHint Transformer for ...

2017-11-20 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19746#discussion_r152158938 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorSizeHintSuite.scala --- @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread jliwork
Github user jliwork commented on a diff in the pull request: https://github.com/apache/spark/pull/19776#discussion_r152158653 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite // The

[GitHub] spark pull request #19774: [SPARK-22475][SQL] show histogram in DESC COLUMN ...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19774#discussion_r152158010 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala --- @@ -689,6 +689,11 @@ case class DescribeColumnCommand(

[GitHub] spark pull request #19774: [SPARK-22475][SQL] show histogram in DESC COLUMN ...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19774#discussion_r152157864 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala --- @@ -689,6 +689,11 @@ case class DescribeColumnCommand(

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19776#discussion_r152156940 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite //

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19776#discussion_r152156397 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala --- @@ -497,7 +497,19 @@ object

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-11-20 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/19439 LGTM, except that it looks like this doesn't merge cleanly. Would you mind rebasing it on master?

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16578 **[Test build #84041 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84041/testReport)** for PR 16578 at commit

[GitHub] spark pull request #19777: [SPARK-22549][SQL] Fix 64KB JVM bytecode limit pr...

2017-11-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19777

[GitHub] spark issue #19777: [SPARK-22549][SQL] Fix 64KB JVM bytecode limit problem w...

2017-11-20 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19777 thanks, merging to master/2.2!

[GitHub] spark issue #19770: [SPARK-21571][WEB UI] Spark history server leaves incomp...

2017-11-20 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/19770 This looks a bit cleaner this time around. Since this is left off by default, it LGTM.

[GitHub] spark issue #19746: [SPARK-22346][ML] VectorSizeHint Transformer for using V...

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19746 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84039/ Test PASSed.

[GitHub] spark issue #19746: [SPARK-22346][ML] VectorSizeHint Transformer for using V...

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19746 Merged build finished. Test PASSed.

[GitHub] spark issue #19746: [SPARK-22346][ML] VectorSizeHint Transformer for using V...

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19746 **[Test build #84039 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84039/testReport)** for PR 19746 at commit

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-11-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19439 **[Test build #3988 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3988/testReport)** for PR 19439 at commit

[GitHub] spark issue #19370: [SPARK-22495] Fix setup of SPARK_HOME variable on Window...

2017-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19370 Merged build finished. Test PASSed.
