[GitHub] spark issue #16784: [SPARK-19382][ML]:Test sparse vectors in LinearSVCSuite

2017-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16784 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #16784: [SPARK-19382][ML]:Test sparse vectors in LinearSVCSuite

2017-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16784 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73606/ Test PASSed.

[GitHub] spark issue #17104: [MINOR][ML] Fix comments in LSH Examples and Python API

2017-02-28 Thread Yunni
Github user Yunni commented on the issue: https://github.com/apache/spark/pull/17104 @srowen The full name works. Just want to make the comments shorter so that it's easier to read.

[GitHub] spark issue #17105: [SPARK-19773][SparkR] SparkDataFrame should not allow du...

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17105 **[Test build #73609 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73609/testReport)** for PR 17105 at commit

[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-02-28 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/17059 @datumbox I've added some additional comments with regards to the fractional part, please take a look, thanks!

[GitHub] spark pull request #17059: [SPARK-19733][ML]Removed unnecessary castings and...

2017-02-28 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/17059#discussion_r103554415 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -82,12 +82,20 @@ private[recommendation] trait ALSModelParams

[GitHub] spark issue #17092: [SPARK-18450][ML] Scala API Change for LSH AND-amplifica...

2017-02-28 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/17092 @Yunni OK, let us discuss the further optimization step in another ticket. The current patch LGTM.

[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16959#discussion_r103560298 --- Diff: core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala --- @@ -195,6 +195,17 @@ class OutputCommitCoordinatorSuite

[GitHub] spark issue #16959: [SPARK-19631][CORE] OutputCommitCoordinator should not a...

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16959 **[Test build #73613 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73613/testReport)** for PR 16959 at commit

[GitHub] spark pull request #17107: [SPARK-19774] StreamExecution should call stop() ...

2017-02-28 Thread brkyvz
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/17107 [SPARK-19774] StreamExecution should call stop() on sources when a stream fails ## What changes were proposed in this pull request? We call stop() on a Structured Streaming Source only

[GitHub] spark pull request #17088: [SPARK-19753][CORE] All shuffle files on a host s...

2017-02-28 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/17088#discussion_r103541395 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1331,7 +1332,7 @@ class DAGScheduler( //

[GitHub] spark pull request #17059: [SPARK-19733][ML]Removed unnecessary castings and...

2017-02-28 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/17059#discussion_r103548602 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -82,12 +82,20 @@ private[recommendation] trait ALSModelParams

[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-02-28 Thread datumbox
Github user datumbox commented on the issue: https://github.com/apache/spark/pull/17059 @imatiach-msft Thanks mate. Saw it and replied. I really don't mind it either way to be honest. @MLnick @srowen I am not sure if this PR will make it to Spark 2.2 due to the upcoming

[GitHub] spark issue #17088: [SPARK-19753][CORE] Un-register all shuffle output on a ...

2017-02-28 Thread kayousterhout
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/17088 Why is this a no-op when the shuffle service isn't enabled? It looks like you mark the slave as lost in all cases?

[GitHub] spark pull request #16714: [SPARK-16333][Core] Enable EventLoggingListener t...

2017-02-28 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request: https://github.com/apache/spark/pull/16714#discussion_r103564868 --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala --- @@ -64,6 +64,12 @@ private[spark] class EventLoggingListener(

[GitHub] spark issue #17043: [SPARK-19719][SS][WIP] Kafka writer for both structured ...

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17043 **[Test build #73614 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73614/testReport)** for PR 17043 at commit

[GitHub] spark issue #16639: [SPARK-19276][CORE] Fetch Failure handling robust to use...

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16639 **[Test build #73607 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73607/testReport)** for PR 16639 at commit

[GitHub] spark issue #16856: [SPARK-19516][DOC] update public doc to use SparkSession...

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16856 **[Test build #73615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73615/testReport)** for PR 16856 at commit

[GitHub] spark issue #16782: [SPARK-19348][PYTHON][WIP] PySpark keyword_only decorato...

2017-02-28 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16782 > it leaves in place the static class variable for all other ML classes that use the wrapper, and those classes continue to use the static class variable. I think this was discussed

[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16959#discussion_r103541861 --- Diff: core/src/main/scala/org/apache/spark/scheduler/OutputCommitCoordinator.scala --- @@ -181,11 +185,20 @@ private[spark] class

[GitHub] spark pull request #17059: [SPARK-19733][ML]Removed unnecessary castings and...

2017-02-28 Thread datumbox
Github user datumbox commented on a diff in the pull request: https://github.com/apache/spark/pull/17059#discussion_r103550608 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -82,12 +82,20 @@ private[recommendation] trait ALSModelParams extends

[GitHub] spark issue #17032: [SPARK-19460][SparkR]:Update dataset used in R documenta...

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17032 **[Test build #73608 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73608/testReport)** for PR 17032 at commit

[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-02-28 Thread pwoody
Github user pwoody commented on a diff in the pull request: https://github.com/apache/spark/pull/16959#discussion_r103552762 --- Diff: core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala --- @@ -195,6 +195,17 @@ class OutputCommitCoordinatorSuite

[GitHub] spark issue #17105: [SPARK-19773][SparkR] SparkDataFrame should not allow du...

2017-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17105 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73609/ Test FAILed.

[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

2017-02-28 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/17056#discussion_r103555753 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala --- @@ -635,6 +636,16 @@ case class HiveHash(children:

[GitHub] spark pull request #17059: [SPARK-19733][ML]Removed unnecessary castings and...

2017-02-28 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/17059#discussion_r103559188 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -82,12 +82,20 @@ private[recommendation] trait ALSModelParams

[GitHub] spark pull request #17043: [SPARK-19719][SS][WIP] Kafka writer for both stru...

2017-02-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17043#discussion_r103564577 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSinkSuite.scala --- @@ -0,0 +1,413 @@ +/* + * Licensed to the

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-28 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15415 I'm going to go ahead and merge this after tests to make sure it's in 2.2, but can you please send a follow-up for my last 2 comments? Thanks!

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-02-28 Thread kayousterhout
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/16867 LGTM and @squito's JIRA re-reorging sounds perfect

[GitHub] spark issue #16784: [SPARK-19382][ML]:Test sparse vectors in LinearSVCSuite

2017-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16784 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73605/ Test PASSed.

[GitHub] spark issue #16784: [SPARK-19382][ML]:Test sparse vectors in LinearSVCSuite

2017-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16784 Merged build finished. Test PASSed.

[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16959#discussion_r103549134 --- Diff: core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala --- @@ -195,6 +195,17 @@ class OutputCommitCoordinatorSuite

[GitHub] spark issue #17105: [SPARK-19773][SparkR] SparkDataFrame should not allow du...

2017-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17105 Merged build finished. Test FAILed.

[GitHub] spark pull request #17088: [SPARK-19753][CORE] Un-register all shuffle outpu...

2017-02-28 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/17088#discussion_r103553254 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1331,7 +1332,7 @@ class DAGScheduler( // TODO:

[GitHub] spark issue #17105: [SPARK-19773][SparkR] SparkDataFrame should not allow du...

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17105 **[Test build #73609 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73609/testReport)** for PR 17105 at commit

[GitHub] spark issue #11956: [SPARK-14098][SQL] Generate Java code that gets a float/...

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/11956 **[Test build #73611 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73611/consoleFull)** for PR 11956 at commit

[GitHub] spark issue #17105: [SPARK-19773][SparkR] SparkDataFrame should not allow du...

2017-02-28 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/17105 @felixcheung Ahh, it seems that we have some conflicting design issues. 1. From the test in collect() and crossJoin, it seems to allow dup names in SparkDataFrame by design: ```

[GitHub] spark pull request #16714: [SPARK-16333][Core] Enable EventLoggingListener t...

2017-02-28 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request: https://github.com/apache/spark/pull/16714#discussion_r103565658 --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala --- @@ -97,61 +100,80 @@ private[spark] object JsonProtocol { case

[GitHub] spark pull request #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16987#discussion_r103562238 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -662,6 +665,154 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16987#discussion_r103538967 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -662,6 +665,154 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16987#discussion_r103565523 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -662,6 +665,154 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16987#discussion_r103538636 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamTest.scala --- @@ -208,6 +208,11 @@ trait StreamTest extends QueryTest with

[GitHub] spark pull request #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16987#discussion_r103567860 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -662,6 +665,154 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16987#discussion_r103565647 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -662,6 +665,154 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16987#discussion_r103565336 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -662,6 +665,154 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16987#discussion_r103569221 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -662,6 +665,154 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16987#discussion_r103567273 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -662,6 +665,154 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16987#discussion_r103565167 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -662,6 +665,154 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103563609 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -993,6 +993,12 @@ class DAGScheduler(

[GitHub] spark issue #17043: [SPARK-19719][SS][WIP] Kafka writer for both structured ...

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17043 **[Test build #73614 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73614/testReport)** for PR 17043 at commit

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103566031 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103569761 --- Diff: core/src/main/scala/org/apache/spark/scheduler/local/LocalSchedulerBackend.scala --- @@ -82,9 +88,15 @@ private[spark] class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103564111 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -454,33 +452,15 @@ private[spark] class TaskSetManager(

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103564565 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -195,6 +197,11 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103565779 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #17103: [Minor][Doc] Update GLM doc to include tweedie di...

2017-02-28 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17103

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103569795 --- Diff: core/src/test/scala/org/apache/spark/executor/ExecutorSuite.scala --- @@ -164,17 +164,18 @@ class ExecutorSuite extends SparkFunSuite with

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103147391 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -130,7 +152,7 @@ private[spark] object TaskDescription {

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103566378 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark issue #17090: [Spark-19535][ML] RecommendForAllUsers RecommendForAllIt...

2017-02-28 Thread sueann
Github user sueann commented on the issue: https://github.com/apache/spark/pull/17090 The output in https://github.com/apache/spark/pull/12574/ looks like a DataFrame with Row(srcCol: Int, "recommendations": Array[(Int, Float)]) so I think this PR as is matches the output type -

[GitHub] spark issue #16959: [SPARK-19631][CORE] OutputCommitCoordinator should not a...

2017-02-28 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16959 @kayousterhout > This commit makes me worried there are more bugs related to #16620. For example, what if a task was OK'ed to commit, but then DAGScheduler decides to ignore it because of the

[GitHub] spark pull request #17105: [SPARK-19773][SparkR] SparkDataFrame should not a...

2017-02-28 Thread actuaryzhang
GitHub user actuaryzhang opened a pull request: https://github.com/apache/spark/pull/17105 [SPARK-19773][SparkR] SparkDataFrame should not allow duplicate names ## What changes were proposed in this pull request? SparkDataFrame in SparkR seems to accept duplicate names at

[GitHub] spark pull request #16842: [SPARK-19304] [Streaming] [Kinesis] fix kinesis s...

2017-02-28 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/16842#discussion_r103547648 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisBackedBlockRDD.scala --- @@ -36,7 +36,8 @@ import

[GitHub] spark pull request #16842: [SPARK-19304] [Streaming] [Kinesis] fix kinesis s...

2017-02-28 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/16842#discussion_r103547760 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisBackedBlockRDD.scala --- @@ -204,10 +210,11 @@ class

[GitHub] spark pull request #17059: [SPARK-19733][ML]Removed unnecessary castings and...

2017-02-28 Thread datumbox
Github user datumbox commented on a diff in the pull request: https://github.com/apache/spark/pull/17059#discussion_r103557982 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -82,12 +82,20 @@ private[recommendation] trait ALSModelParams extends

[GitHub] spark issue #9524: [SPARK-10387][ML] Add code gen for gbt

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9524 **[Test build #73602 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73602/consoleFull)** for PR 9524 at commit

[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-02-28 Thread pwoody
Github user pwoody commented on a diff in the pull request: https://github.com/apache/spark/pull/16959#discussion_r103562268 --- Diff: core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala --- @@ -195,6 +195,17 @@ class OutputCommitCoordinatorSuite

[GitHub] spark issue #15821: [SPARK-13534][PySpark] Using Apache Arrow to increase pe...

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15821 **[Test build #73612 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73612/testReport)** for PR 15821 at commit

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73616 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73616/testReport)** for PR 15415 at commit

[GitHub] spark pull request #17106: [SPARK-19775][SQL] Remove an obsolete `partitionB...

2017-02-28 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/17106 [SPARK-19775][SQL] Remove an obsolete `partitionBy().insertInto()` test case ## What changes were proposed in this pull request? This issue removes [a test

[GitHub] spark issue #17088: [SPARK-19753][CORE] All shuffle files on a host should b...

2017-02-28 Thread kayousterhout
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/17088 Can you update the JIRA and PR description to say "un-register the output locations" (or similar) instead of "remove the files"? The current description is misleading since nothing is

[GitHub] spark issue #17100: [SPARK-13947][PYTHON] PySpark DataFrames: The error mess...

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17100 **[Test build #73604 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73604/testReport)** for PR 17100 at commit

[GitHub] spark issue #17100: [SPARK-13947][PYTHON] PySpark DataFrames: The error mess...

2017-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17100 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73604/ Test FAILed.

[GitHub] spark issue #17100: [SPARK-13947][PYTHON] PySpark DataFrames: The error mess...

2017-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17100 Merged build finished. Test FAILed.

[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

2017-02-28 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/17056#discussion_r103551662 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HashExpressionsSuite.scala --- @@ -371,6 +370,51 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103554187 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -52,8 +55,26 @@ private[spark] class TaskDescription( val

[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-02-28 Thread pwoody
Github user pwoody commented on a diff in the pull request: https://github.com/apache/spark/pull/16959#discussion_r103554224 --- Diff: core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala --- @@ -195,6 +195,17 @@ class OutputCommitCoordinatorSuite

[GitHub] spark issue #16856: [SPARK-19516][DOC] update public doc to use SparkSession...

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16856 **[Test build #73615 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73615/testReport)** for PR 16856 at commit

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-28 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15415 test this please

[GitHub] spark pull request #17090: [Spark-19535][ML] RecommendForAllUsers RecommendF...

2017-02-28 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17090#discussion_r103539599 --- Diff: mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala --- @@ -547,6 +550,45 @@ class ALSSuite ALS.train(ratings)

[GitHub] spark pull request #17090: [Spark-19535][ML] RecommendForAllUsers RecommendF...

2017-02-28 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17090#discussion_r103538498 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -248,18 +248,18 @@ class ALSModel private[ml] ( @Since("1.3.0")

[GitHub] spark pull request #16809: [SPARK-19463][SQL]refresh cache after the InsertI...

2017-02-28 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16809

[GitHub] spark pull request #17090: [Spark-19535][ML] RecommendForAllUsers RecommendF...

2017-02-28 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17090#discussion_r103538921 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/TopByKeyAggregator.scala --- @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #17105: [SPARK-19773][SparkR] SparkDataFrame should not allow du...

2017-02-28 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/17105 @felixcheung

[GitHub] spark issue #17032: [SPARK-19460][SparkR]:Update dataset used in R documenta...

2017-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17032 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73608/ Test PASSed.

[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-02-28 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16959#discussion_r103550855 --- Diff: core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala --- @@ -195,6 +195,17 @@ class OutputCommitCoordinatorSuite

[GitHub] spark issue #17032: [SPARK-19460][SparkR]:Update dataset used in R documenta...

2017-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17032 Merged build finished. Test PASSed.

[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-02-28 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16959#discussion_r103552112 --- Diff: core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala --- @@ -195,6 +195,17 @@ class OutputCommitCoordinatorSuite

[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-02-28 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/17059 @datumbox I added an additional comment -- since you believe we should keep the current code, we should add a test case and slightly modify the error message to let the user know what other

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103555964 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r10306 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103556809 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala --- @@ -961,56 +968,121 @@ class CSVSuite extends

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103555472 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark issue #16976: [SPARK-19610][SQL] Support parsing multiline CSV files

2017-02-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16976 thanks, merging to master!

[GitHub] spark issue #13326: [SPARK-15560] [Mesos] Queued/Supervise drivers waiting f...

2017-02-28 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue: https://github.com/apache/spark/pull/13326 Killed drivers have the same value as the successfully completed drivers when we are showing in the Web UI, users would get surprised when they don't see the driver suddenly which was there

[GitHub] spark pull request #16714: [SPARK-16333][Core] Enable EventLoggingListener t...

2017-02-28 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request: https://github.com/apache/spark/pull/16714#discussion_r103569094 --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala --- @@ -62,18 +62,21 @@ private[spark] object JsonProtocol { * JSON

[GitHub] spark pull request #17059: [SPARK-19733][ML]Removed unnecessary castings and...

2017-02-28 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/17059#discussion_r103547149 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -82,12 +82,20 @@ private[recommendation] trait ALSModelParams

[GitHub] spark pull request #17059: [SPARK-19733][ML]Removed unnecessary castings and...

2017-02-28 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/17059#discussion_r103547192 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -82,12 +82,20 @@ private[recommendation] trait ALSModelParams

[GitHub] spark pull request #17088: [SPARK-19753][CORE] All shuffle files on a host s...

2017-02-28 Thread sitalkedia
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/17088#discussion_r103552476 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1331,7 +1332,7 @@ class DAGScheduler( // TODO:

[GitHub] spark issue #17088: [SPARK-19753][CORE] All shuffle files on a host should b...

2017-02-28 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/17088 >> Also, does the issue here only arise when the shuffle service is enabled? That is correct. For the case when the shuffle service is not enabled, this change should be a no-op.
