[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142586761 --- Diff: python/pyspark/worker.py --- @@ -74,17 +74,35 @@ def wrap_udf(f, return_type): def wrap_pandas_udf(f, return_type): -

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142598216 --- Diff: python/pyspark/sql/tests.py --- @@ -3106,8 +3106,9 @@ def assertFramesEqual(self, df_with_arrow, df_without):

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142586105 --- Diff: python/pyspark/sql/group.py --- @@ -194,6 +194,65 @@ def pivot(self, pivot_col, values=None): jgd =

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142593679 --- Diff: python/pyspark/worker.py --- @@ -74,17 +74,35 @@ def wrap_udf(f, return_type): def wrap_pandas_udf(f, return_type): -

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142591629 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -435,6 +435,33 @@ class RelationalGroupedDataset

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142591870 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -435,6 +435,33 @@ class RelationalGroupedDataset

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142585501 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -519,3 +519,18 @@ case class CoGroup(

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142592031 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -435,6 +435,33 @@ class RelationalGroupedDataset

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142588765 --- Diff: python/pyspark/worker.py --- @@ -74,17 +74,35 @@ def wrap_udf(f, return_type): def wrap_pandas_udf(f, return_type): -

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142584934 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,66 @@ def pivot(self, pivot_col, values=None): jgd =

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142586717 --- Diff: python/pyspark/worker.py --- @@ -74,17 +74,35 @@ def wrap_udf(f, return_type): def wrap_pandas_udf(f, return_type): -

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142591369 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -435,6 +435,33 @@ class RelationalGroupedDataset

[GitHub] spark issue #16618: [SPARK-14409][ML][WIP] Add RankingEvaluator

2017-10-04 Thread Kornel
Github user Kornel commented on the issue: https://github.com/apache/spark/pull/16618 @MLnick I'm wondering what's the status of this issue: seems closed, have you any plans on picking it up again? I might pick it up, but I'm not sure what's left: move from package mllib to

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142622117 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #19422: [SPARK-22193][SQL] Minor typo fix

2017-10-04 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19422 Merged to master --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19083 **[Test build #82446 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82446/testReport)** for PR 19083 at commit

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19083 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19083 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82446/ Test PASSed. ---

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-04 Thread akopich
Github user akopich commented on the issue: https://github.com/apache/spark/pull/18924 @WeichenXu123. thank you --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #19416: [SPARK-22187][SS] Update unsaferow format for sav...

2017-10-04 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/19416#discussion_r142583033 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/FlatMapGroupsWithState_StateManager.scala --- @@ -0,0 +1,143 @@ +/*

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82444 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82444/testReport)** for PR 19424 at commit

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142592915 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/FlatMapGroupsInPandasExec.scala --- @@ -0,0 +1,89 @@ +/* + * Licensed to

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142620788 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142624984 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19083 **[Test build #82445 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82445/testReport)** for PR 19083 at commit

[GitHub] spark pull request #19424: [SPARK-22197][SQL] push down operators to data so...

2017-10-04 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/19424 [SPARK-22197][SQL] push down operators to data source before planning ## What changes were proposed in this pull request? As we discussed in

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19424 cc @rdblue @gatorsmile --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19061 **[Test build #82441 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82441/testReport)** for PR 19061 at commit

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142610856 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/FlatMapGroupsInPandasExec.scala --- @@ -0,0 +1,95 @@ +/* + * Licensed to

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142632240 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142625490 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19083 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82445/ Test PASSed. ---

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19083 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17455: [Spark-20044][Web UI] Support Spark UI behind front-end ...

2017-10-04 Thread okoethibm
Github user okoethibm commented on the issue: https://github.com/apache/spark/pull/17455 @jiangxb1987 I've stopped working on Spark related efforts a while ago and will not be following up on this PR any more. Anyone interested is free to take over. ---

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19083 retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19083 **[Test build #82445 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82445/testReport)** for PR 19083 at commit

[GitHub] spark pull request #19426: [SPARK-22190][CORE] Add Spark executor task metri...

2017-10-04 Thread LucaCanali
GitHub user LucaCanali opened a pull request: https://github.com/apache/spark/pull/19426 [SPARK-22190][CORE] Add Spark executor task metrics to Dropwizard metrics ## What changes were proposed in this pull request? This proposed patch is about making Spark executor task

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82447 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82447/testReport)** for PR 19424 at commit

[GitHub] spark pull request #19427: Reset spark.driver.bindAddress when starting a Ch...

2017-10-04 Thread ssaavedra
GitHub user ssaavedra opened a pull request: https://github.com/apache/spark/pull/19427 Reset spark.driver.bindAddress when starting a Checkpoint ## What changes were proposed in this pull request? It seems that recovering from a checkpoint can replace the old driver

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82447 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82447/testReport)** for PR 19424 at commit

[GitHub] spark pull request #19428: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-04 Thread susanxhuynh
GitHub user susanxhuynh opened a pull request: https://github.com/apache/spark/pull/19428 [SPARK-22131][MESOS] Mesos driver secrets ## What changes were proposed in this pull request? The driver launches executors that have access to env or file-based secrets. Most

[GitHub] spark issue #19090: [SPARK-21877][DEPLOY, WINDOWS] Handle quotes in Windows ...

2017-10-04 Thread minixalpha
Github user minixalpha commented on the issue: https://github.com/apache/spark/pull/19090 @jsnowacki I have already add comments to explain the quotes, could you help me review the comments? Thanks. --- - To

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-04 Thread akopich
Github user akopich commented on the issue: https://github.com/apache/spark/pull/18924 @hhbyyh, this change does not target performance but scalability, and I am afraid, the change is beneficial only for huge datasets and the tests would require massive computational resources.

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-04 Thread akopich
Github user akopich commented on the issue: https://github.com/apache/spark/pull/18924 BTW. Seems like `updateLambda` method relies (in older version as well) on `batchSize` only because this is `an optimization to avoid batch.count`. Shouldn't we rather use `nonEmptyDocsN` instead

[GitHub] spark issue #19427: Reset spark.driver.bindAddress when starting a Checkpoin...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19427 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #19429: [SPARK-20055] [Docs] Added documentation for load...

2017-10-04 Thread jomach
GitHub user jomach opened a pull request: https://github.com/apache/spark/pull/19429 [SPARK-20055] [Docs] Added documentation for loading csv files into DataFrames ## What changes were proposed in this pull request? Added documentation for loading csv files

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18924 **[Test build #82448 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82448/testReport)** for PR 18924 at commit

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18924 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19428: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19428 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19428: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19428 **[Test build #82449 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82449/testReport)** for PR 19428 at commit

[GitHub] spark issue #19428: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19428 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82449/ Test FAILed. ---

[GitHub] spark issue #19090: [SPARK-21877][DEPLOY, WINDOWS] Handle quotes in Windows ...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19090 Thanks for reviewing @jsnowacki, let me try to take a final look. I also checked what I could all but let me double check. Just want to be careful as it's the entry point. ---

[GitHub] spark issue #19429: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19429 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #19419: [SPARK-22188] [CORE] Adding security headers for ...

2017-10-04 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19419#discussion_r142634158 --- Diff: conf/spark-defaults.conf.template --- @@ -19,9 +19,16 @@ # This is useful for setting default environmental settings. # Example:

[GitHub] spark pull request #19419: [SPARK-22188] [CORE] Adding security headers for ...

2017-10-04 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19419#discussion_r142634239 --- Diff: conf/spark-defaults.conf.template --- @@ -19,9 +19,16 @@ # This is useful for setting default environmental settings. # Example:

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18924 **[Test build #82448 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82448/testReport)** for PR 18924 at commit

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82447/ Test PASSed. ---

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18924 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82448/ Test PASSed. ---

[GitHub] spark issue #19428: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19428 **[Test build #82449 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82449/testReport)** for PR 19428 at commit

[GitHub] spark issue #19090: [SPARK-21877][DEPLOY, WINDOWS] Handle quotes in Windows ...

2017-10-04 Thread jsnowacki
Github user jsnowacki commented on the issue: https://github.com/apache/spark/pull/19090 I think the comments are fine and sufficiently explain extra quotes existence. --- - To unsubscribe, e-mail:

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142678914 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/FlatMapGroupsInPandasExec.scala --- @@ -0,0 +1,89 @@ +/* + * Licensed

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142689702 --- Diff: python/pyspark/sql/tests.py --- @@ -3106,8 +3106,9 @@ def assertFramesEqual(self, df_with_arrow, df_without):

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142690650 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,66 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142690602 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,66 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142694484 --- Diff: python/pyspark/worker.py --- @@ -74,17 +74,35 @@ def wrap_udf(f, return_type): def wrap_pandas_udf(f, return_type): -

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142694381 --- Diff: python/pyspark/worker.py --- @@ -74,17 +74,35 @@ def wrap_udf(f, return_type): def wrap_pandas_udf(f, return_type): -

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142697418 --- Diff: python/pyspark/worker.py --- @@ -74,17 +74,35 @@ def wrap_udf(f, return_type): def wrap_pandas_udf(f, return_type): -

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142692448 --- Diff: python/pyspark/sql/tests.py --- @@ -3106,8 +3106,9 @@ def assertFramesEqual(self, df_with_arrow, df_without):

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142695501 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala --- @@ -26,6 +26,25 @@ import

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142695129 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -519,3 +519,18 @@ case class CoGroup(

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142694835 --- Diff: python/pyspark/worker.py --- @@ -74,17 +74,35 @@ def wrap_udf(f, return_type): def wrap_pandas_udf(f, return_type): -

[GitHub] spark pull request #19419: [SPARK-22188] [CORE] Adding security headers for ...

2017-10-04 Thread krishna-pandey
Github user krishna-pandey commented on a diff in the pull request: https://github.com/apache/spark/pull/19419#discussion_r142701588 --- Diff: conf/spark-defaults.conf.template --- @@ -19,9 +19,16 @@ # This is useful for setting default environmental settings. #

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142691179 --- Diff: python/pyspark/sql/tests.py --- @@ -3106,8 +3106,9 @@ def assertFramesEqual(self, df_with_arrow, df_without):

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142704126 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -435,6 +435,33 @@ class RelationalGroupedDataset

[GitHub] spark pull request #19419: [SPARK-22188] [CORE] Adding security headers for ...

2017-10-04 Thread krishna-pandey
Github user krishna-pandey commented on a diff in the pull request: https://github.com/apache/spark/pull/19419#discussion_r142708896 --- Diff: conf/spark-defaults.conf.template --- @@ -19,9 +19,16 @@ # This is useful for setting default environmental settings. #

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142703487 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -435,6 +435,33 @@ class RelationalGroupedDataset

[GitHub] spark issue #19041: [SPARK-21097][CORE] Add option to recover cached data

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19041 **[Test build #82450 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82450/testReport)** for PR 19041 at commit

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142693843 --- Diff: python/pyspark/worker.py --- @@ -74,17 +74,35 @@ def wrap_udf(f, return_type): def wrap_pandas_udf(f, return_type): -

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142703829 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -435,6 +435,33 @@ class RelationalGroupedDataset

[GitHub] spark pull request #18704: [SPARK-20783][SQL] Create ColumnVector to abstrac...

2017-10-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18704 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142624246 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142624340 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142624093 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #18969: [SPARK-21520][SQL][FOLLOW-UP]fix a special case for non-...

2017-10-04 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18969 @heary-cao Maybe you can close this PR first? @jiangxb1987 will handle it in the previous PR. --- - To unsubscribe, e-mail:

[GitHub] spark issue #13893: [SPARK-14172][SQL] Hive table partition predicate not pa...

2017-10-04 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/13893 @heary-cao tried to resolve the same issue in https://github.com/apache/spark/pull/18969 ping @jiangxb1987 --- -

[GitHub] spark pull request #19251: [SPARK-22035][SQL]the value of statistical logica...

2017-10-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19251#discussion_r142722330 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/SizeInBytesOnlyStatsPlanVisitor.scala --- @@ -32,12

[GitHub] spark issue #19392: [SPARK-22169][SQL] support byte length literal as identi...

2017-10-04 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19392 LGTM pending Jenkins --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #19083: [SPARK-21871][SQL] Check actual bytecode size whe...

2017-10-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19083 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #19416: [SPARK-22187][SS] Update unsaferow format for saved stat...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19416 **[Test build #82454 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82454/testReport)** for PR 19416 at commit

[GitHub] spark issue #18747: [WIP][SPARK-20822][SQL] Generate code to directly get va...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18747 **[Test build #82451 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82451/testReport)** for PR 18747 at commit

[GitHub] spark issue #19392: [SPARK-22169][SQL] support byte length literal as identi...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19392 **[Test build #82452 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82452/testReport)** for PR 19392 at commit

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19083 Thanks! Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #19392: [SPARK-22169][SQL] support byte length literal as...

2017-10-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19392#discussion_r142731310 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -699,20 +699,30 @@ class AstBuilder(conf: SQLConf)

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142735696 --- Diff: python/pyspark/sql/group.py --- @@ -194,6 +194,65 @@ def pivot(self, pivot_col, values=None): jgd =

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142740947 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -435,6 +435,33 @@ class RelationalGroupedDataset

[GitHub] spark issue #19425: [SPARK-22196][Core] Combine multiple input splits into a...

2017-10-04 Thread vgankidi
Github user vgankidi commented on the issue: https://github.com/apache/spark/pull/19425 @davies Can you please take a look? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands,

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142720877 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/FlatMapGroupsInPandasExec.scala --- @@ -0,0 +1,89 @@ +/* + * Licensed to

[GitHub] spark issue #18747: [WIP][SPARK-20822][SQL] Generate code to directly get va...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18747 **[Test build #82451 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82451/testReport)** for PR 18747 at commit

[GitHub] spark issue #18747: [WIP][SPARK-20822][SQL] Generate code to directly get va...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18747 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

<    1   2   3   >