[GitHub] spark pull request #19433: [SPARK-3162] [MLlib][WIP] Add local tree training...

2017-10-09 Thread smurching
Github user smurching commented on a diff in the pull request: https://github.com/apache/spark/pull/19433#discussion_r143398990 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/LocalDecisionTreeUtils.scala --- @@ -0,0 +1,59 @@ +/* + * Licensed to the Apache Sof

[GitHub] spark issue #19363: [Minor]Override toString of KeyValueGroupedDataset

2017-10-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19363 shall we fix RelationalGroupedDataset too? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional co

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 @gatorsmile If my words make you upset, I'm sorry. It's your right to raise suspicion against any PRs. I do respect this right. Maybe I'm wrong and there actually is a possible regression. App

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19424 retest this please

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82547 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82547/testReport)** for PR 19424 at commit [`24f1a75`](https://github.com/apache/spark/commit/24

[GitHub] spark pull request #19438: [SPARK-22208] [SQL] Improve percentile_approx by ...

2017-10-09 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19438#discussion_r143406835 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/QuantileSummariesSuite.scala --- @@ -58,7 +58,7 @@ class QuantileSummariesSuite ext

[GitHub] spark pull request #19438: [SPARK-22208] [SQL] Improve percentile_approx by ...

2017-10-09 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19438#discussion_r143406940 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameStatSuite.scala --- @@ -157,21 +157,21 @@ class DataFrameStatSuite extends QueryTest with Sh

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19082 Yes, no need for a back-and-forth. @gatorsmile I think it's reasonable to ask for a little more detail on your comment.

[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-10-09 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/17357#discussion_r143407147 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverWrapper.scala --- @@ -23,14 +23,15 @@ import org.apache.commons.lang3.StringUtils

[GitHub] spark pull request #18748: [SPARK-20679][ML] Support recommending for a subs...

2017-10-09 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18748

[GitHub] spark issue #19363: [Minor]Override toString of KeyValueGroupedDataset

2017-10-09 Thread yaooqinn
Github user yaooqinn commented on the issue: https://github.com/apache/spark/pull/19363 OK, I will add a JIRA ticket and fix RelationalGroupedDataset

[GitHub] spark issue #19363: [Minor]Override toString of KeyValueGroupedDataset

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19363 **[Test build #82548 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82548/testReport)** for PR 19363 at commit [`7e7ed19`](https://github.com/apache/spark/commit/7e

[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-09 Thread skonto
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/19437#discussion_r143411366 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala --- @@ -170,9 +174,122 @@ private[

[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-09 Thread skonto
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/19437#discussion_r143413006 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/config.scala --- @@ -21,6 +21,39 @@ import java.util.concurrent.TimeUnit

[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-09 Thread skonto
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/19437#discussion_r143415419 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala --- @@ -170,9 +174,122 @@ private[

[GitHub] spark pull request #19266: [SPARK-22033][CORE] BufferHolder, other size chec...

2017-10-09 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19266#discussion_r143421894 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java --- @@ -35,6 +35,11 @@ * if the fields of row

[GitHub] spark issue #19433: [SPARK-3162] [MLlib][WIP] Add local tree training for de...

2017-10-09 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19433 @smurching Is it still WIP? If it is done, remove "[WIP]" and I will begin the review, thanks!

[GitHub] spark pull request #15770: [SPARK-15784][ML]:Add Power Iteration Clustering ...

2017-10-09 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/15770#discussion_r143426157 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala --- @@ -0,0 +1,216 @@ +/* + * Licensed to the Apa

[GitHub] spark pull request #19419: [SPARK-22188] [CORE] Adding security headers for ...

2017-10-09 Thread krishna-pandey
Github user krishna-pandey commented on a diff in the pull request: https://github.com/apache/spark/pull/19419#discussion_r143427428 --- Diff: core/src/main/scala/org/apache/spark/ui/JettyUtils.scala --- @@ -89,6 +92,9 @@ private[spark] object JettyUtils extends Logging {

[GitHub] spark issue #19106: [SPARK-21770][ML] ProbabilisticClassificationModel fix c...

2017-10-09 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19106 @srowen Any other comments? Thanks!

[GitHub] spark issue #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-09 Thread skonto
Github user skonto commented on the issue: https://github.com/apache/spark/pull/19437 @susanxhuynh this is a Mesos 1.4 feature; shouldn't we document this for users? https://issues.apache.org/jira/browse/MESOS-7418

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82547 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82547/testReport)** for PR 19424 at commit [`24f1a75`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Merged build finished. Test PASSed.

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82547/ Test PASSed. ---

[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-10-09 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r143439369 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala --- @@ -116,9 +116,10 @@ private [sql] object Gen

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19363 **[Test build #82548 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82548/testReport)** for PR 19363 at commit [`7e7ed19`](https://github.com/apache/spark/commit/7

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19363 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82548/ Test PASSed. ---

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19363 Merged build finished. Test PASSed.

[GitHub] spark pull request #19457: [SPARK] Misleading error message

2017-10-09 Thread pavel-sakun
GitHub user pavel-sakun opened a pull request: https://github.com/apache/spark/pull/19457 [SPARK] Misleading error message Fix misleading error message when argument is expected. ## What changes were proposed in this pull request? Change message to be accurate.

[GitHub] spark issue #19457: [SPARK] Misleading error message

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19457 Can one of the admins verify this patch?

[GitHub] spark pull request #19106: [SPARK-21770][ML] ProbabilisticClassificationMode...

2017-10-09 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19106#discussion_r143445323 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/ProbabilisticClassifier.scala --- @@ -230,21 +230,22 @@ private[ml] object ProbabilisticCl

[GitHub] spark pull request #19106: [SPARK-21770][ML] ProbabilisticClassificationMode...

2017-10-09 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19106#discussion_r143445232 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/ProbabilisticClassifier.scala --- @@ -230,21 +230,22 @@ private[ml] object ProbabilisticCl

[GitHub] spark issue #19457: [SPARK] Misleading error message for missing --proxy-use...

2017-10-09 Thread pavel-sakun
Github user pavel-sakun commented on the issue: https://github.com/apache/spark/pull/19457 Not aware ATM, this one handles missing value for args expecting one.

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread sohum2002
Github user sohum2002 commented on the issue: https://github.com/apache/spark/pull/19454 Would appreciate some help with the Python implementation of the `flatten` function, as I have never used pyspark. Could someone help me out?

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19454 Let's fix up the PR title from `[SPARK-18855 ][SQL]` to `[SPARK-18855][SQL]` BTW.

[GitHub] spark issue #19457: [SPARK] Misleading error message for missing --proxy-use...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19457 **[Test build #3946 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3946/testReport)** for PR 19457 at commit [`bc6d92e`](https://github.com/apache/spark/commit/b

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19363 **[Test build #82549 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82549/testReport)** for PR 19363 at commit [`f051c10`](https://github.com/apache/spark/commit/f0

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19454 **[Test build #82550 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82550/testReport)** for PR 19454 at commit [`cc08623`](https://github.com/apache/spark/commit/cc

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19454 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82550/ Test FAILed. ---

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19454 **[Test build #82550 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82550/testReport)** for PR 19454 at commit [`cc08623`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19454 Merged build finished. Test FAILed.

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-10-09 Thread DaimonPl
Github user DaimonPl commented on the issue: https://github.com/apache/spark/pull/16578 @mallman @viirya from my understanding, the current workaround is for the case of reading columns that are not in the file schema > Parquet-mr will throw an exception if we try to read a superset of

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19454 I think @srowen requested to fix it in a more performant way as well, for example, referring to https://github.com/apache/spark/pull/16276, if I understood correctly, and otherwise closing it.

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19454 BTW, for the answer to https://github.com/apache/spark/pull/19454#issuecomment-335138642, I think you should take a look at, for example, `flatMap` as a reference in `rdd.py` and related tests,

[GitHub] spark pull request #19250: [SPARK-12297] Table timezone correction for Times...

2017-10-09 Thread zivanfi
Github user zivanfi commented on a diff in the pull request: https://github.com/apache/spark/pull/19250#discussion_r143462649 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -266,6 +267,10 @@ final class DataFrameWriter[T] private[sql](ds: Datas

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-10-09 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19222 ping for review @hvanhovell @tejasapatil

[GitHub] spark pull request #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-10-09 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/17968#discussion_r143465176 --- Diff: python/pyspark/mllib/linalg/__init__.py --- @@ -1131,14 +1131,21 @@ def __getitem__(self, indices): return self.values[i + j

[GitHub] spark issue #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-10-09 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17968 ping @gglanzani This bug needs to be fixed ASAP. Can you update the code when you're free? Thanks.

[GitHub] spark issue #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-10-09 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17968 @gglanzani Also, `ml.linalg.DenseMatrix` looks to have the same bug. Can you also update it?

[GitHub] spark pull request #19438: [SPARK-22208] [SQL] Improve percentile_approx by ...

2017-10-09 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/19438#discussion_r143480931 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/QuantileSummariesSuite.scala --- @@ -58,7 +58,7 @@ class QuantileSummariesSuite exte

[GitHub] spark pull request #19438: [SPARK-22208] [SQL] Improve percentile_approx by ...

2017-10-09 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/19438#discussion_r143481416 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/QuantileSummariesSuite.scala --- @@ -58,7 +58,7 @@ class QuantileSummariesSuite exte

[GitHub] spark pull request #19438: [SPARK-22208] [SQL] Improve percentile_approx by ...

2017-10-09 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/19438#discussion_r143481784 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameStatSuite.scala --- @@ -157,21 +157,21 @@ class DataFrameStatSuite extends QueryTest with Sha

[GitHub] spark issue #19106: [SPARK-21770][ML] ProbabilisticClassificationModel fix c...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19106 **[Test build #82551 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82551/testReport)** for PR 19106 at commit [`1dfa4c1`](https://github.com/apache/spark/commit/1d

[GitHub] spark issue #16648: [SPARK-18016][SQL][CATALYST] Code Generation: Constant P...

2017-10-09 Thread bdrillard
Github user bdrillard commented on the issue: https://github.com/apache/spark/pull/16648 I'm blocking out time to prepare the part 2 PR for this issue starting today over this week, regarding compaction of excess primitive state. cc: @kiszk ---

[GitHub] spark pull request #19374: [SPARK-22145][MESOS] fix supervise with checkpoin...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request: https://github.com/apache/spark/pull/19374#discussion_r143361887 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala --- @@ -276,8 +276,8 @@ private[s

[GitHub] spark pull request #19374: [SPARK-22145][MESOS] fix supervise with checkpoin...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request: https://github.com/apache/spark/pull/19374#discussion_r143487275 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala --- @@ -276,8 +276,8 @@ private[s

[GitHub] spark pull request #19374: [SPARK-22145][MESOS] fix supervise with checkpoin...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request: https://github.com/apache/spark/pull/19374#discussion_r143484031 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala --- @@ -804,45 +814,52 @@ private

[GitHub] spark pull request #19374: [SPARK-22145][MESOS] fix supervise with checkpoin...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request: https://github.com/apache/spark/pull/19374#discussion_r143344688 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala --- @@ -374,6 +375,15 @@ private[

[GitHub] spark pull request #19438: [SPARK-22208] [SQL] Improve percentile_approx by ...

2017-10-09 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19438#discussion_r143492975 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/QuantileSummariesSuite.scala --- @@ -58,7 +58,7 @@ class QuantileSummariesSui

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19363 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82549/ Test PASSed. ---

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19363 Merged build finished. Test PASSed.

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855][SQL] Added flatten functions ...

2017-10-09 Thread sohum2002
Github user sohum2002 commented on the issue: https://github.com/apache/spark/pull/19454 @HyukjinKwon - Thank you for your comments and analysis of this PR. I will also try to improve on the `flatMap(identity)` approach, as mentioned by @srowen, and will add a Python implementation.
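
For readers following the `flatten` discussion in PR 19454: the sketch below is plain Python, not the PR's code and not Spark. It only illustrates why `flatMap(identity)` amounts to flattening one level of nesting. The helper names `flatten_once` and `flat_map` are hypothetical, chosen for this illustration.

```python
from itertools import chain


def flatten_once(nested):
    """Flatten exactly one level of nesting, e.g. [[1, 2], [3]] -> [1, 2, 3]."""
    return list(chain.from_iterable(nested))


def flat_map(func, items):
    """Apply func to each item and concatenate the resulting iterables."""
    return list(chain.from_iterable(func(item) for item in items))


def identity(x):
    """Return the argument unchanged."""
    return x


nested = [[1, 2], [3], [], [4, 5]]

# flat_map with the identity function is the same as flattening one level.
assert flat_map(identity, nested) == flatten_once(nested)
```

Assuming a PySpark version would follow the same idea, the `flatMap` method in `rdd.py` that HyukjinKwon points to is the natural place to look; how the PR finally implements it is not shown in this thread.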

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19363 **[Test build #82549 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82549/testReport)** for PR 19363 at commit [`f051c10`](https://github.com/apache/spark/commit/f

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19439 **[Test build #82552 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82552/testReport)** for PR 19439 at commit [`b0c4ace`](https://github.com/apache/spark/commit/b0

[GitHub] spark issue #19457: [SPARK] Misleading error message for missing --proxy-use...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19457 **[Test build #3946 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3946/testReport)** for PR 19457 at commit [`bc6d92e`](https://github.com/apache/spark/commit/

[GitHub] spark issue #19106: [SPARK-21770][ML] ProbabilisticClassificationModel fix c...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19106 **[Test build #82551 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82551/testReport)** for PR 19106 at commit [`1dfa4c1`](https://github.com/apache/spark/commit/1

[GitHub] spark issue #19106: [SPARK-21770][ML] ProbabilisticClassificationModel fix c...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19106 Merged build finished. Test PASSed.

[GitHub] spark issue #19106: [SPARK-21770][ML] ProbabilisticClassificationModel fix c...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19106 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82551/ Test PASSed. ---

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-09 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143506845 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,69 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82553 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82553/testReport)** for PR 18732 at commit [`876b118`](https://github.com/apache/spark/commit/87

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-09 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143507748 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -519,3 +519,18 @@ case class CoGroup( out

[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on the issue: https://github.com/apache/spark/pull/19374 @skonto One more question: in your screen shot of the History Server, I noticed the "Completed" time is 1969-12-31 for all the drivers (the original one, retry-1, and retry-2). Is that to be exp

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18853 **[Test build #82554 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82554/testReport)** for PR 18853 at commit [`2ada11a`](https://github.com/apache/spark/commit/2a

[GitHub] spark pull request #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-10-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19269#discussion_r143514294 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/sources/v2/JavaSimpleWritableDataSource.java --- @@ -0,0 +1,297 @@ +/* + * Licensed to

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143516384 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -106,6 +106,45 @@ private[feature] trait Word2VecBase extends Params

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143516496 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -106,6 +106,45 @@ private[feature] trait Word2VecBase extends Params

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143516595 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -106,6 +106,45 @@ private[feature] trait Word2VecBase extends Params

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143516772 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -106,6 +106,45 @@ private[feature] trait Word2VecBase extends Params

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-09 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143517490 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -274,19 +274,26 @@ abstract class SparkPlan extends QueryPlan[SparkPl

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19394 **[Test build #82555 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82555/testReport)** for PR 19394 at commit [`a2976fe`](https://github.com/apache/spark/commit/a2

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-09 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143517522 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -274,19 +274,26 @@ abstract class SparkPlan extends QueryPlan[SparkPl

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19439 **[Test build #82552 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82552/testReport)** for PR 19439 at commit [`b0c4ace`](https://github.com/apache/spark/commit/b

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19439 Merged build finished. Test PASSed.

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19439 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82552/ Test PASSed. ---

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19394 **[Test build #82556 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82556/testReport)** for PR 19394 at commit [`1a813ac`](https://github.com/apache/spark/commit/1a

[GitHub] spark issue #18460: [SPARK-21247][SQL] Type comparison should respect case-s...

2017-10-09 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18460 Gentle ping~, @gatorsmile . :)

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-09 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19061 Hi, @vanzin and @jerryshao . Could you review this `ConsoleProgressBar` issue when you have some time?

[GitHub] spark issue #19266: [SPARK-22033][CORE] BufferHolder, other size checks shou...

2017-10-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19266 @srowen Thanks! @liufengdb Could you submit a separate PR to fix the issues and also please include the test cases?

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19394 LGTM

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-09 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 @HyukjinKwon and @ueshin so with Arrow, the Pandas DataFrame from `toPandas()` timestamp columns will not have a timezone - are we going to do the same thing for `pandas_udf` Series? I was plan

[GitHub] spark pull request #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-10-09 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/19269#discussion_r143524057 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/sources/v2/JavaSimpleWritableDataSource.java --- @@ -0,0 +1,297 @@ +/* + * Licensed to th

[GitHub] spark pull request #19458: [SPARK-22227][CORE] DiskBlockManager.getAllBlocks...

2017-10-09 Thread superbobry
GitHub user superbobry opened a pull request: https://github.com/apache/spark/pull/19458 [SPARK-22227][CORE] DiskBlockManager.getAllBlocks now tolerates temp files ## What changes were proposed in this pull request? Prior to this commit getAllBlocks implicitly assumed that t

[GitHub] spark issue #19458: [SPARK-22227][CORE] DiskBlockManager.getAllBlocks now to...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19458 Can one of the admins verify this patch?

[GitHub] spark issue #19433: [SPARK-3162] [MLlib][WIP] Add local tree training for de...

2017-10-09 Thread smurching
Github user smurching commented on the issue: https://github.com/apache/spark/pull/19433 Thanks! I'll remove the WIP. To clear things up for the future, I'd thought [WIP] was the appropriate tag for a PR that's ready for review but not ready to be merged (based on https://spark.apache

[GitHub] spark pull request #19458: [SPARK-22227][CORE] DiskBlockManager.getAllBlocks...

2017-10-09 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19458#discussion_r143527010 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala --- @@ -100,7 +102,9 @@ private[spark] class DiskBlockManager(conf: SparkConf,

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-09 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143527821 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -279,11 +279,11 @@ class CodegenContext {

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143528173 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -171,20 +210,46 @@ final class Word2Vec @Since("1.4.0") ( @Sin

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143528339 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -171,20 +210,46 @@ final class Word2Vec @Since("1.4.0") ( @Sin

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143529286 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/Word2VecSuite.scala --- @@ -189,6 +305,136 @@ class Word2VecSuite extends SparkFunSuite wit

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143529694 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/Word2VecSuite.scala --- @@ -245,5 +508,28 @@ class Word2VecSuite extends SparkFunSuite with
