[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17001#discussion_r103046202 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -408,7 +408,15 @@ private[spark] class HiveExternalCatalog(co

[GitHub] spark issue #14830: [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on...

2017-02-24 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/14830 Jenkins retest this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wis

[GitHub] spark issue #14830: [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on...

2017-02-24 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/14830 Let's do a Jenkins re-run just to make sure everything is up to date, and I'll try to get a final pass done soon. I think it would be good to improve our examples to be closer to PEP 8 style for the s

[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17001#discussion_r103045973 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -408,7 +408,15 @@ private[spark] class HiveExternalCatalog(co

[GitHub] spark pull request #16774: [SPARK-19357][ML] Adding parallel model evaluatio...

2017-02-24 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/16774#discussion_r103045855 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala --- @@ -121,6 +121,33 @@ class CrossValidatorSuite }

[GitHub] spark issue #16965: [Spark-18450][ML] Scala API Change for LSH AND-amplifica...

2017-02-24 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/16965 @Yunni Yes, we can use AND-OR to increase the possibility by having more numHashTables and numHashFunctions. For further user extension, if users have a hash function with lower poss

[GitHub] spark issue #16826: [SPARK-19540][SQL] Add ability to clone SparkSession whe...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16826 Merged build finished. Test PASSed.

[GitHub] spark issue #16826: [SPARK-19540][SQL] Add ability to clone SparkSession whe...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16826 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73442/ Test PASSed. ---

[GitHub] spark issue #16826: [SPARK-19540][SQL] Add ability to clone SparkSession whe...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16826 **[Test build #73442 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73442/testReport)** for PR 16826 at commit [`fd11ee2`](https://github.com/apache/spark/commit/f

[GitHub] spark issue #13072: [SPARK-15288] [Mesos] Mesos dispatcher should handle gra...

2017-02-24 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue: https://github.com/apache/spark/pull/13072 Thanks @mgummelt for the confirmation. It throws a SparkException with the bug SPARK-15359 (https://github.com/apache/spark/pull/13143).

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15415 I agree that, if the set of rules is small (1-2 GB max), then collecting and broadcasting it is best. But for larger sets of rules, we'd have to keep it distributed. I'm very surprised b

[GitHub] spark issue #16930: [SPARK-19597][CORE] test case for task deserialization e...

2017-02-24 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16930 thanks @kayousterhout !

[GitHub] spark pull request #16930: [SPARK-19597][CORE] test case for task deserializ...

2017-02-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16930

[GitHub] spark issue #16930: [SPARK-19597][CORE] test case for task deserialization e...

2017-02-24 Thread kayousterhout
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/16930 I merged this into master

[GitHub] spark issue #16930: [SPARK-19597][CORE] test case for task deserialization e...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16930 Merged build finished. Test PASSed.

[GitHub] spark issue #16930: [SPARK-19597][CORE] test case for task deserialization e...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73437/ Test PASSed. ---

[GitHub] spark issue #16930: [SPARK-19597][CORE] test case for task deserialization e...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16930 **[Test build #73437 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73437/testReport)** for PR 16930 at commit [`ce6bf9a`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #17043: [SPARK-19719][SS][WIP] Kafka writer for both structured ...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17043 **[Test build #73443 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73443/testReport)** for PR 17043 at commit [`e6b6dc1`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #17043: [SPARK-19719][SS][WIP] Kafka writer for both structured ...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17043 Merged build finished. Test PASSed.

[GitHub] spark issue #17043: [SPARK-19719][SS][WIP] Kafka writer for both structured ...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17043 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73443/ Test PASSed. ---

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-24 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r103027907 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSchemaInferenceSuite.scala --- @@ -0,0 +1,192 @@ +/* + * Licensed to the Apache Softw

[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-02-24 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16944 The ```assert()``` statements added to ```setupCaseSensitiveTable()``` in **HiveSchemaInferenceSuite** per earlier feedback got squashed somewhere in the course of updating this PR. I've added them ba

[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16944 **[Test build #73444 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73444/testReport)** for PR 16944 at commit [`e1ca7c8`](https://github.com/apache/spark/commit/e1

[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-02-24 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/16944 @cloud-fan

[GitHub] spark issue #17043: [SPARK-19719][SS][WIP] Kafka writer for both structured ...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17043 **[Test build #73443 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73443/testReport)** for PR 17043 at commit [`e6b6dc1`](https://github.com/apache/spark/commit/e6

[GitHub] spark issue #16592: [SPARK-19235] [SQL] [TESTS] Enable Test Cases in DDLSuit...

2017-02-24 Thread xwu0226
Github user xwu0226 commented on the issue: https://github.com/apache/spark/pull/16592 @gatorsmile Is this PR going to be merged soon? The ALTER ADD column PR #16626 also depends on it to create test cases for InMemoryCatalog. Thanks!

[GitHub] spark pull request #16626: [SPARK-19261][SQL] Alter add columns for Hive ser...

2017-02-24 Thread xwu0226
Github user xwu0226 commented on a diff in the pull request: https://github.com/apache/spark/pull/16626#discussion_r103024279 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -563,35 +574,47 @@ private[spark] class HiveExternalCatalog(con

[GitHub] spark issue #16826: [SPARK-19540][SQL] Add ability to clone SparkSession whe...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16826 **[Test build #73442 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73442/testReport)** for PR 16826 at commit [`fd11ee2`](https://github.com/apache/spark/commit/fd

[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-02-24 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16944 Thanks, @ericl. Is there anybody else you'd suggest pinging to take a look at this and ultimately get it merged? Re-pinging @viirya to review latest updates addressing his previous feedback.

[GitHub] spark issue #16929: [SPARK-19595][SQL] Support json array in from_json

2017-02-24 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/16929 Hmm, I'm not sure we want to change this to a generator. I think that has performance consequences as well as possibly being surprising. I would probably make it possible to handle arrays (when t

[GitHub] spark issue #16929: [SPARK-19595][SQL] Support json array in from_json

2017-02-24 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/16929 /cc @brkyvz

[GitHub] spark issue #17032: [SPARK-19460][SparkR]:Update dataset used in R documenta...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17032 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73440/ Test PASSed. ---

[GitHub] spark issue #17032: [SPARK-19460][SparkR]:Update dataset used in R documenta...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17032 Merged build finished. Test PASSed.

[GitHub] spark issue #17032: [SPARK-19460][SparkR]:Update dataset used in R documenta...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17032 **[Test build #73440 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73440/testReport)** for PR 17032 at commit [`5beca69`](https://github.com/apache/spark/commit/5

[GitHub] spark pull request #16892: [SPARK-19560] Improve DAGScheduler tests.

2017-02-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16892

[GitHub] spark pull request #16938: [SPARK-19583][SQL]CTAS for data source table with...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16938#discussion_r103018002 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala --- @@ -140,8 +140,8 @@ case class CreateDataSource

[GitHub] spark issue #16892: [SPARK-19560] Improve DAGScheduler tests.

2017-02-24 Thread kayousterhout
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/16892 I merged this into master -- thanks for the review @squito.

[GitHub] spark pull request #16938: [SPARK-19583][SQL]CTAS for data source table with...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16938#discussion_r103017698 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala --- @@ -140,8 +140,8 @@ case class CreateDataSource

[GitHub] spark pull request #16892: [SPARK-19560] Improve DAGScheduler tests.

2017-02-24 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/16892#discussion_r103017616 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -2031,6 +2051,11 @@ class DAGSchedulerSuite extends SparkFunSu

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103014004 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala --- @@ -0,0 +1,238 @@ +/* + * Licensed to th

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103014118 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala --- @@ -0,0 +1,238 @@ +/* + * Licensed to th

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103015057 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -269,3 +273,89 @@ private[csv] class Univoc

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103016252 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala --- @@ -961,56 +978,135 @@ class CSVSuite extends QueryTe

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103015384 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala --- @@ -961,56 +978,135 @@ class CSVSuite extends QueryTe

[GitHub] spark pull request #16639: [SPARK-19276][CORE] Fetch Failure handling robust...

2017-02-24 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/16639#discussion_r103014993 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -400,8 +410,16 @@ private[spark] class Executor( execBackend

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103012984 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/CodecStreams.scala --- @@ -86,4 +88,11 @@ object CodecStreams { .

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103012905 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/CodecStreams.scala --- @@ -86,4 +88,11 @@ object CodecStreams { .

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103015905 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala --- @@ -958,4 +975,77 @@ class CSVSuite extends QueryTest

[GitHub] spark pull request #16639: [SPARK-19276][CORE] Fetch Failure handling robust...

2017-02-24 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/16639#discussion_r103015714 --- Diff: core/src/main/scala/org/apache/spark/shuffle/FetchFailedException.scala --- @@ -45,6 +50,12 @@ private[spark] class FetchFailedException(

[GitHub] spark issue #13932: [SPARK-15354] [CORE] Topology aware block replication st...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13932 Merged build finished. Test PASSed.

[GitHub] spark issue #13932: [SPARK-15354] [CORE] Topology aware block replication st...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13932 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73435/ Test PASSed. ---

[GitHub] spark issue #13932: [SPARK-15354] [CORE] Topology aware block replication st...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13932 **[Test build #73435 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73435/testReport)** for PR 13932 at commit [`a35f673`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #17045: [SPARK-19373][MESOS] Base spark.scheduler.minRegisteredR...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17045 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73441/ Test PASSed. ---

[GitHub] spark issue #17045: [SPARK-19373][MESOS] Base spark.scheduler.minRegisteredR...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17045 Merged build finished. Test PASSed.

[GitHub] spark issue #17045: [SPARK-19373][MESOS] Base spark.scheduler.minRegisteredR...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17045 **[Test build #73441 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73441/testReport)** for PR 17045 at commit [`6818ae0`](https://github.com/apache/spark/commit/6

[GitHub] spark issue #16630: [SPARK-19270][ML] Add summary table to GLM summary

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16630 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73439/ Test PASSed. ---

[GitHub] spark issue #16630: [SPARK-19270][ML] Add summary table to GLM summary

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16630 Merged build finished. Test PASSed.

[GitHub] spark issue #16630: [SPARK-19270][ML] Add summary table to GLM summary

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16630 **[Test build #73439 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73439/testReport)** for PR 16630 at commit [`8e1c086`](https://github.com/apache/spark/commit/8

[GitHub] spark pull request #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16987#discussion_r103014135 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -662,6 +663,101 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16987#discussion_r103013496 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -243,13 +243,20 @@ case class DataSource(

[GitHub] spark pull request #16987: [SPARK-19633][SS] FileSource read from FileSink

2017-02-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16987#discussion_r103014293 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -662,6 +663,101 @@ class FileStreamSourceSuite extends

[GitHub] spark issue #17031: [SPARK-19702][MESOS] Add suppress/revive support to the ...

2017-02-24 Thread mgummelt
Github user mgummelt commented on the issue: https://github.com/apache/spark/pull/17031 @susanxhuynh I don't think it's worth documenting. It should be clear in the logs, which should be where an operator turns if they notice no jobs are launching.

[GitHub] spark issue #14807: [SPARK-17256][Deploy, Windows]Check before adding double...

2017-02-24 Thread tritab
Github user tritab commented on the issue: https://github.com/apache/spark/pull/14807 It would probably be a good idea to get some unit tests for these scenarios. Would anyone be willing to write some tests? On Feb 24, 2017 10:13 AM, "roryodonnell" wrote: > This c

[GitHub] spark issue #17031: [SPARK-19702][MESOS] Add suppress/revive support to the ...

2017-02-24 Thread mgummelt
Github user mgummelt commented on the issue: https://github.com/apache/spark/pull/17031 @susanxhuynh Mesos/Spark integration tests: https://github.com/typesafehub/mesos-spark-integration-tests. We run them as a subset of DC/OS Spark integration tests: https://github.com/mesosphere/s

[GitHub] spark issue #17045: [SPARK-19373][MESOS] Base spark.scheduler.minRegisteredR...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17045 **[Test build #73441 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73441/testReport)** for PR 17045 at commit [`6818ae0`](https://github.com/apache/spark/commit/68

[GitHub] spark issue #16824: [SPARK-18069][PYTHON] Make PySpark doctests for SQL self...

2017-02-24 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16824 I'm slightly against the work to make this change happen and would rather focus on some other Python PRs - I'm not sure it improves readability of the test cases as much as planned but I'll defer to

[GitHub] spark issue #17032: [SPARK-19460][SparkR]:Update dataset used in R documenta...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17032 **[Test build #73440 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73440/testReport)** for PR 17032 at commit [`5beca69`](https://github.com/apache/spark/commit/5b

[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-02-24 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/16944 LGTM

[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-02-24 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16944 @ericl: fixed the param doc string and tried to clean up ```createLogicalRelation()``` as you suggested.

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread hhbyyh
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/15415 Hi @jkbradley, after further performance comparison, I found that using broadcast gives much better performance for the transform. I tested with some public data from http://fimi.ua.ac.be/data

[GitHub] spark issue #17052: [SPARK-19690][SS] Join a streaming DataFrame with a batc...

2017-02-24 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17052 Thanks for doing this. I'm wondering if you can fix `isStreaming` instead. We added it to be able to distinguish batch and streaming dataframes. However, it doesn't work for batch DFs in a streaming

[GitHub] spark pull request #17032: [SPARK-19460][SparkR]:Update dataset used in R do...

2017-02-24 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17032#discussion_r103004945 --- Diff: examples/src/main/r/ml/glm.R --- @@ -25,12 +25,12 @@ library(SparkR) sparkR.session(appName = "SparkR-ML-glm-example") # $exam

[GitHub] spark pull request #16630: [SPARK-19270][ML] Add summary table to GLM summar...

2017-02-24 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/16630#discussion_r103004564 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -1152,4 +1173,33 @@ class GeneralizedLinearRegr

[GitHub] spark issue #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14412 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73431/ Test PASSed. ---

[GitHub] spark issue #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14412 Merged build finished. Test PASSed.

[GitHub] spark issue #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14412 **[Test build #73431 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73431/testReport)** for PR 14412 at commit [`212baab`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #16630: [SPARK-19270][ML] Add summary table to GLM summary

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16630 **[Test build #73439 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73439/testReport)** for PR 16630 at commit [`8e1c086`](https://github.com/apache/spark/commit/8e

[GitHub] spark pull request #16630: [SPARK-19270][ML] Add summary table to GLM summar...

2017-02-24 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/16630#discussion_r103003591 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/GeneralizedLinearRegressionWrapper.scala --- @@ -99,37 +95,23 @@ private[r] object GeneralizedLin

[GitHub] spark pull request #16630: [SPARK-19270][ML] Add summary table to GLM summar...

2017-02-24 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/16630#discussion_r103003515 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -34,6 +35,7 @@ import org.apache.spark.rdd.RDD

[GitHub] spark issue #17043: [SPARK-19719][SS][WIP] Kafka writer for both structured ...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17043 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73438/ Test FAILed. ---

[GitHub] spark issue #17043: [SPARK-19719][SS][WIP] Kafka writer for both structured ...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17043 Merged build finished. Test FAILed.

[GitHub] spark issue #17043: [SPARK-19719][SS][WIP] Kafka writer for both structured ...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17043 **[Test build #73438 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73438/testReport)** for PR 17043 at commit [`67e3c06`](https://github.com/apache/spark/commit/6

[GitHub] spark issue #17043: [SPARK-19719][SS][WIP] Kafka writer for both structured ...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17043 Merged build finished. Test PASSed.

[GitHub] spark issue #17043: [SPARK-19719][SS][WIP] Kafka writer for both structured ...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17043 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73436/ Test PASSed.

[GitHub] spark issue #17043: [SPARK-19719][SS][WIP] Kafka writer for both structured ...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17043 **[Test build #73436 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73436/testReport)** for PR 17043 at commit [`129cfcd`](https://github.com/apache/spark/commit/1

[GitHub] spark pull request #17051: [SPARK-17075][SQL] Follow up: fix file line endin...

2017-02-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17051

[GitHub] spark pull request #16594: [SPARK-17078] [SQL] Show stats when explain

2017-02-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16594

[GitHub] spark issue #17058: Refactored code to remove null representation

2017-02-24 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17058 @HarshSharma8 this is not an improvement, please close this. This is not the kind of PR that's worth submitting. See http://spark.apache.org/contributing.html

[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16594 thanks, merging to master!

[GitHub] spark issue #17051: [SPARK-17075][SQL] Follow up: fix file line ending and i...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17051 thanks, merging to master!

[GitHub] spark issue #16930: [SPARK-19597][CORE] test case for task deserialization e...

2017-02-24 Thread kayousterhout
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/16930 LGTM assuming tests pass

[GitHub] spark issue #17043: [SPARK-19719][SS][WIP] Kafka writer for both structured ...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17043 **[Test build #73438 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73438/testReport)** for PR 17043 at commit [`67e3c06`](https://github.com/apache/spark/commit/67

[GitHub] spark issue #17043: [SPARK-19719][SS][WIP] Kafka writer for both structured ...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17043 **[Test build #73436 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73436/testReport)** for PR 17043 at commit [`129cfcd`](https://github.com/apache/spark/commit/12

[GitHub] spark issue #16930: [SPARK-19597][CORE] test case for task deserialization e...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16930 **[Test build #73437 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73437/testReport)** for PR 16930 at commit [`ce6bf9a`](https://github.com/apache/spark/commit/ce

[GitHub] spark issue #14731: [SPARK-17159] [streaming]: optimise check for new files ...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14731 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73434/ Test PASSed.

[GitHub] spark issue #14731: [SPARK-17159] [streaming]: optimise check for new files ...

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14731 Merged build finished. Test PASSed.

[GitHub] spark issue #14731: [SPARK-17159] [streaming]: optimise check for new files ...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14731 **[Test build #73434 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73434/testReport)** for PR 14731 at commit [`724495b`](https://github.com/apache/spark/commit/7

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-02-24 Thread kayousterhout
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/15505 @witgo what's the status of this? I'd like to get this merged and am happy to take this over if you don't have time to work on it.

[GitHub] spark pull request #16892: [SPARK-19560] Improve DAGScheduler tests.

2017-02-24 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16892#discussion_r102996593 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -2031,6 +2051,11 @@ class DAGSchedulerSuite extends SparkFunSuite wit

[GitHub] spark pull request #17049: [SPARK-17495] [SQL] Add more tests for hive hash

2017-02-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17049
