[GitHub] spark issue #9158: [SPARK-9695] [ML] Add random seed Param to ML Pipeline

2016-10-16 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/9158 So also following up from @jkbradley's note on https://issues.apache.org/jira/browse/SPARK-7653 we don't currently expose HasSeed to the public so external models with seeds won't be able to

[GitHub] spark issue #14233: [SPARK-16490] [Examples] added a python example for chis...

2016-10-16 Thread rubenjanssen
Github user rubenjanssen commented on the issue: https://github.com/apache/spark/pull/14233 Updated with @sethah's suggestions. One is still pending, however, as some functionality needed for the cleanest solution appears to be missing from Python compared with Scala --- If your

[GitHub] spark issue #14233: [SPARK-16490] [Examples] added a python example for chis...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14233 **[Test build #67037 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67037/consoleFull)** for PR 14233 at commit

[GitHub] spark issue #14233: [SPARK-16490] [Examples] added a python example for chis...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14233 **[Test build #67037 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67037/consoleFull)** for PR 14233 at commit

[GitHub] spark issue #14233: [SPARK-16490] [Examples] added a python example for chis...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14233 **[Test build #67038 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67038/consoleFull)** for PR 14233 at commit

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-16 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15377 I think you can avoid it for now, and if that other issue is addressed, it can go back and use the `Utils` method again everywhere.

[GitHub] spark issue #15436: [SPARK-17875] [BUILD] Remove unneeded direct dependence ...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15436 **[Test build #3346 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3346/consoleFull)** for PR 15436 at commit

[GitHub] spark pull request #15398: [SPARK-17647][SQL][WIP] Fix backslash escaping in...

2016-10-16 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15398#discussion_r83552457 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala --- @@ -25,26 +25,25 @@ object StringUtils { //

[GitHub] spark pull request #15495: [SPARK-17620][SQL] Determine Serde by hive.defaul...

2016-10-16 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15495#discussion_r83552790 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -587,6 +594,30 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark issue #14233: [SPARK-16490] [Examples] added a python example for chis...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14233 **[Test build #67038 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67038/consoleFull)** for PR 14233 at commit

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15377 **[Test build #67039 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67039/consoleFull)** for PR 15377 at commit

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-16 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15377 @srowen done.

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-16 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15377 @weiqingy @srowen I see. So do you suggest avoiding `Utils.classForName` to get this one merged, or waiting for SPARK-17714 instead?

[GitHub] spark issue #14233: [SPARK-16490] [Examples] added a python example for chis...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14233 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67038/ Test PASSed.

[GitHub] spark issue #14233: [SPARK-16490] [Examples] added a python example for chis...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14233 Merged build finished. Test PASSed.

[GitHub] spark issue #15505: [WIP][SPARK-17931]taskScheduler has some unneeded serial...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15505 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67033/ Test FAILed.

[GitHub] spark issue #15505: [WIP][SPARK-17931]taskScheduler has some unneeded serial...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15505 Merged build finished. Test FAILed.

[GitHub] spark issue #15505: [WIP][SPARK-17931]taskScheduler has some unneeded serial...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15505 **[Test build #67033 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67033/consoleFull)** for PR 15505 at commit

[GitHub] spark pull request #15487: [SPARK-17940][SQL] Fixed a typo in LAST function ...

2016-10-16 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15487#discussion_r83551685 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Last.scala --- @@ -29,15 +29,18 @@ import

[GitHub] spark issue #15487: [SPARK-17940][SQL] Fixed a typo in LAST function and imp...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15487 **[Test build #67036 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67036/consoleFull)** for PR 15487 at commit

[GitHub] spark issue #15494: [SPARK-17947] [SQL] Add Doc and Comment about spark.sql....

2016-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15494 @rxin because we are assuming that all the hive hacks are isolated inside `HiveExternalCatalog`, but the debug mode will break it and spread the hive metastore specific table properties outside

[GitHub] spark pull request #14233: [SPARK-16490] [Examples] added a python example f...

2016-10-16 Thread rubenjanssen
Github user rubenjanssen commented on a diff in the pull request: https://github.com/apache/spark/pull/14233#discussion_r83552931 --- Diff: examples/src/main/python/mllib/chisq_selector_example.py --- @@ -0,0 +1,54 @@ +# +# Licensed to the Apache Software Foundation (ASF)

[GitHub] spark issue #14233: [SPARK-16490] [Examples] added a python example for chis...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14233 Merged build finished. Test FAILed.

[GitHub] spark issue #15361: [SPARK-17765][SQL] Support for writing out user-defined ...

2016-10-16 Thread chenghao-intel
Github user chenghao-intel commented on the issue: https://github.com/apache/spark/pull/15361 yes, please go ahead. :)

[GitHub] spark issue #14233: [SPARK-16490] [Examples] added a python example for chis...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14233 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67037/ Test FAILed.

[GitHub] spark issue #15487: [SPARK-17940][SQL] Fixed a typo in LAST function and imp...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15487 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67036/ Test PASSed.

[GitHub] spark issue #15487: [SPARK-17940][SQL] Fixed a typo in LAST function and imp...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15487 Merged build finished. Test PASSed.

[GitHub] spark issue #15505: [WIP][SPARK-17931]taskScheduler has some unneeded serial...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15505 **[Test build #67035 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67035/consoleFull)** for PR 15505 at commit

[GitHub] spark pull request #15504: [SPARK-17812][SQL][KAFKA] Assign and specific sta...

2016-10-16 Thread koeninger
Github user koeninger commented on a diff in the pull request: https://github.com/apache/spark/pull/15504#discussion_r83551653 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSource.scala --- @@ -232,6 +232,42 @@ private[kafka010] case class

[GitHub] spark issue #15398: [SPARK-17647][SQL][WIP] Fix backslash escaping in 'LIKE'...

2016-10-16 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15398 For escaping before a non-special character, I don't know whether DB2 is special, because when I tried it, MySQL behaved like PostgreSQL.

[GitHub] spark issue #15487: [SPARK-17940][SQL] Fixed a typo in LAST function and imp...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15487 **[Test build #67036 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67036/consoleFull)** for PR 15487 at commit

[GitHub] spark pull request #15504: [SPARK-17812][SQL][KAFKA] Assign and specific sta...

2016-10-16 Thread ofir-manor
Github user ofir-manor commented on a diff in the pull request: https://github.com/apache/spark/pull/15504#discussion_r83546052 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSource.scala --- @@ -232,6 +232,42 @@ private[kafka010] case class

[GitHub] spark issue #15218: [SPARK-17637][Scheduler]Packed scheduling for Spark task...

2016-10-16 Thread zhzhan
Github user zhzhan commented on the issue: https://github.com/apache/spark/pull/15218 @rxin Thanks a lot for the detailed review. I will update the patch.

[GitHub] spark issue #15432: [SPARK-17854][SQL] rand/randn allows null/long as input ...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15432 **[Test build #67030 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67030/consoleFull)** for PR 15432 at commit

[GitHub] spark issue #15503: Fix example of tf_idf with minDocFreq

2016-10-16 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15503 @maximerihouey that's just the bot asking us to confirm it can test. Yes, seems OK as a trivial fix.

[GitHub] spark issue #15503: Fix example of tf_idf with minDocFreq

2016-10-16 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15503 Jenkins test this please

[GitHub] spark issue #15432: [SPARK-17854][SQL] rand/randn allows null/long as input ...

2016-10-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15432 Added `extend` is printed as below: ```sql spark-sql> DESCRIBE FUNCTION EXTENDED rand; Function: rand Class: org.apache.spark.sql.catalyst.expressions.Rand Usage: rand(a)

[GitHub] spark issue #14627: [SPARK-16975][SQL][FOLLOWUP] Do not duplicately check fi...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14627 **[Test build #67027 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67027/consoleFull)** for PR 14627 at commit

[GitHub] spark issue #14627: [SPARK-16975][SQL][FOLLOWUP] Do not duplicately check fi...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14627 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67027/ Test PASSed.

[GitHub] spark issue #14627: [SPARK-16975][SQL][FOLLOWUP] Do not duplicately check fi...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14627 Merged build finished. Test PASSed.

[GitHub] spark issue #14451: [SPARK-16848][SQL] Make jdbc() and read.format("jdbc") c...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14451 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67028/ Test PASSed.

[GitHub] spark issue #14451: [SPARK-16848][SQL] Make jdbc() and read.format("jdbc") c...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14451 Merged build finished. Test PASSed.

[GitHub] spark pull request #15505: [WIP][SPARK-17931]taskScheduler has some unneeded...

2016-10-16 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/15505 [WIP][SPARK-17931]taskScheduler has some unneeded serialization ## What changes were proposed in this pull request? When taskScheduler instantiates TaskDescription, it calls

[GitHub] spark issue #15432: [SPARK-17854][SQL] rand/randn allows null/long as input ...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15432 Merged build finished. Test PASSed.

[GitHub] spark issue #15432: [SPARK-17854][SQL] rand/randn allows null/long as input ...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15432 **[Test build #67031 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67031/consoleFull)** for PR 15432 at commit

[GitHub] spark issue #15432: [SPARK-17854][SQL] rand/randn allows null/long as input ...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15432 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67031/ Test PASSed.

[GitHub] spark issue #15436: [SPARK-17875] [BUILD] Remove unneeded direct dependence ...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15436 **[Test build #3345 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3345/consoleFull)** for PR 15436 at commit

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67029/ Test PASSed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Merged build finished. Test PASSed.

[GitHub] spark issue #13036: [SPARK-15243][ML][SQL][PYSPARK] Param methods should use...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13036 **[Test build #67032 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67032/consoleFull)** for PR 13036 at commit

[GitHub] spark issue #15432: [SPARK-17854][SQL] rand/randn allows null/long as input ...

2016-10-16 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15432 @gatorsmile @HyukjinKwon As a general comment, Spark SQL doesn't claim a particular level of ANSI SQL compatibility. If anything, it tries to match "whatever Hive does" and that's probably the best

[GitHub] spark issue #15436: [SPARK-17875] [BUILD] Remove unneeded direct dependence ...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15436 **[Test build #3345 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3345/consoleFull)** for PR 15436 at commit

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #67029 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67029/consoleFull)** for PR 14124 at commit

[GitHub] spark issue #15505: [WIP][SPARK-17931]taskScheduler has some unneeded serial...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15505 **[Test build #67033 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67033/consoleFull)** for PR 15505 at commit

[GitHub] spark issue #15432: [SPARK-17854][SQL] rand/randn allows null/long as input ...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15432 Merged build finished. Test PASSed.

[GitHub] spark issue #15432: [SPARK-17854][SQL] rand/randn allows null/long as input ...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15432 **[Test build #67030 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67030/consoleFull)** for PR 15432 at commit

[GitHub] spark issue #15432: [SPARK-17854][SQL] rand/randn allows null/long as input ...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15432 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67030/ Test PASSed.

[GitHub] spark issue #14233: [SPARK-16490] [Examples] added a python example for chis...

2016-10-16 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/14233 Maybe we can loop in @davies to take a look once you've had a chance to resolve @sethah's comments, and then we can look at getting this in :)

[GitHub] spark issue #15432: [SPARK-17854][SQL] rand/randn allows null/long as input ...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15432 **[Test build #67031 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67031/consoleFull)** for PR 15432 at commit

[GitHub] spark issue #14451: [SPARK-16848][SQL] Make jdbc() and read.format("jdbc") c...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14451 **[Test build #67028 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67028/consoleFull)** for PR 14451 at commit

[GitHub] spark issue #13036: [SPARK-15243][ML][SQL][PYSPARK] Param methods should use...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13036 **[Test build #67032 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67032/consoleFull)** for PR 13036 at commit

[GitHub] spark issue #13036: [SPARK-15243][ML][SQL][PYSPARK] Param methods should use...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13036 Merged build finished. Test PASSed.

[GitHub] spark issue #13036: [SPARK-15243][ML][SQL][PYSPARK] Param methods should use...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13036 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67032/ Test PASSed.

[GitHub] spark issue #15503: Fix example of tf_idf with minDocFreq

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15503 **[Test build #67034 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67034/consoleFull)** for PR 15503 at commit

[GitHub] spark issue #15503: Fix example of tf_idf with minDocFreq

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15503 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67034/ Test PASSed.

[GitHub] spark issue #15503: Fix example of tf_idf with minDocFreq

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15503 Merged build finished. Test PASSed.

[GitHub] spark issue #15503: Fix example of tf_idf with minDocFreq

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15503 **[Test build #67034 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67034/consoleFull)** for PR 15503 at commit

[GitHub] spark issue #12913: [SPARK-928][CORE] Add support for Unsafe-based serialize...

2016-10-16 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/12913 Excellent - let's see, @mateiz was the creator of this issue originally, so we should check and see if he is available to review (although I suspect he is rather busy in general). If not we can

[GitHub] spark issue #12491: [SPARK-14712][ML]spark.ml.LogisticRegressionModel.toStri...

2016-10-16 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/12491 We can also try reaching out to @davies, who does a bunch of Python-related reviews. @davies can you give Jenkins the OK to run the tests?

[GitHub] spark issue #15505: [WIP][SPARK-17931]taskScheduler has some unneeded serial...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15505 **[Test build #67035 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67035/consoleFull)** for PR 15505 at commit

[GitHub] spark issue #15505: [WIP][SPARK-17931]taskScheduler has some unneeded serial...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15505 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67035/ Test FAILed.

[GitHub] spark issue #15505: [WIP][SPARK-17931]taskScheduler has some unneeded serial...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15505 Merged build finished. Test FAILed.

[GitHub] spark pull request #14233: [SPARK-16490] [Examples] added a python example f...

2016-10-16 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/14233#discussion_r8355 --- Diff: examples/src/main/python/mllib/chisq_selector_example.py --- @@ -0,0 +1,54 @@ +# +# Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request #14233: [SPARK-16490] [Examples] added a python example f...

2016-10-16 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/14233#discussion_r83556749 --- Diff: examples/src/main/python/mllib/chisq_selector_example.py --- @@ -0,0 +1,56 @@ +# +# Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark issue #15450: [SPARK-3261] [MLLIB] KMeans clusterer can return duplica...

2016-10-16 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/15450 @srowen I'm not against the change per se, I was just hoping to understand how duplicate centers arise. In the case of `initRandom` sampling with replacement makes it possible to select the same
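
The remark about sampling with replacement can be illustrated with a small standalone sketch. This is not Spark's `initRandom` implementation; the helper names (`sample_with_replacement`, `sample_without_replacement`, `has_duplicates`) and the toy points are assumptions made up for illustration. It only shows why drawing initial centers with replacement can yield the same point more than once, whereas drawing without replacement from distinct points cannot.

```python
import random


def sample_with_replacement(points, k, seed):
    """Draw k centers; each draw is independent, so repeats are possible."""
    rng = random.Random(seed)
    return [rng.choice(points) for _ in range(k)]


def sample_without_replacement(points, k, seed):
    """Draw k centers; each input position is picked at most once."""
    rng = random.Random(seed)
    return rng.sample(points, k)


def has_duplicates(centers):
    """True if any center appears more than once."""
    return len(set(centers)) < len(centers)


# Hypothetical toy data: three distinct 2-D points.
points = [(0.0, 0.0), (1.0, 1.0), (9.0, 8.0)]

# Asking for 4 centers from 3 points with replacement must repeat a point.
with_repl = sample_with_replacement(points, 4, seed=42)

# Without replacement, 3 centers from 3 distinct points are all different.
without_repl = sample_without_replacement(points, 3, seed=42)

print(has_duplicates(with_repl), has_duplicates(without_repl))
```

With fewer requested centers than points, duplicates under replacement are possible but not guaranteed; the four-from-three case is used here only because it makes the repeat certain.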

[GitHub] spark issue #15044: [SQL][SPARK-17490] Optimize SerializeFromObject() for a ...

2016-10-16 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/15044 @hvanhovell could you please review this again?

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15377 **[Test build #67039 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67039/consoleFull)** for PR 15377 at commit

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-16 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/14963 Also if it would be useful to have a quick chat off-line and then circle back to the PR with the result of our chat let me know - I'd really like to see us get something like this in so we can have

[GitHub] spark pull request #14233: [SPARK-16490] [Examples] added a python example f...

2016-10-16 Thread rubenjanssen
Github user rubenjanssen commented on a diff in the pull request: https://github.com/apache/spark/pull/14233#discussion_r83557977 --- Diff: examples/src/main/python/mllib/chisq_selector_example.py --- @@ -0,0 +1,56 @@ +# +# Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request #14233: [SPARK-16490] [Examples] added a python example f...

2016-10-16 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/14233#discussion_r83558440 --- Diff: examples/src/main/python/mllib/chisq_selector_example.py --- @@ -0,0 +1,56 @@ +# +# Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15377 Merged build finished. Test PASSed.

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15377 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67039/ Test PASSed.

[GitHub] spark issue #12135: [SPARK-14352][SQL] approxQuantile should support multi c...

2016-10-16 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/12135 Just re-pinging @MLnick, but we can also see if @davies or @marmbrus have some cycles to do the final review.

[GitHub] spark pull request #14233: [SPARK-16490] [Examples] added a python example f...

2016-10-16 Thread rubenjanssen
Github user rubenjanssen commented on a diff in the pull request: https://github.com/apache/spark/pull/14233#discussion_r83558453 --- Diff: examples/src/main/python/mllib/chisq_selector_example.py --- @@ -0,0 +1,54 @@ +# +# Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request #11105: [SPARK-12469][CORE] Data Property accumulators fo...

2016-10-16 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/11105#discussion_r83557436 --- Diff: core/src/main/scala/org/apache/spark/Accumulable.scala --- @@ -56,15 +61,26 @@ class Accumulable[R, T] private ( @transient private val

[GitHub] spark issue #15506: [MINOR][SQL] Add prettyName for current_database functio...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15506 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this

[GitHub] spark pull request #15506: [MINOR][SQL] Add prettyName for current_database ...

2016-10-16 Thread weiqingy
GitHub user weiqingy opened a pull request: https://github.com/apache/spark/pull/15506 [MINOR][SQL] Add prettyName for current_database function ## What changes were proposed in this pull request? Added a `prettyName` for current_database function. ## How was this patch

[GitHub] spark issue #14702: [SPARK-15694] Implement ScriptTransformation in sql/core...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14702 **[Test build #67040 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67040/consoleFull)** for PR 14702 at commit

[GitHub] spark issue #15503: Fix example of tf_idf with minDocFreq

2016-10-16 Thread maximerihouey
Github user maximerihouey commented on the issue: https://github.com/apache/spark/pull/15503 Indeed. The "kmeans_data.txt" file is not ideal for showcasing this feature. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If

[GitHub] spark pull request #14233: [SPARK-16490] [Examples] added a python example f...

2016-10-16 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/14233#discussion_r83560385 --- Diff: examples/src/main/python/mllib/chisq_selector_example.py --- @@ -0,0 +1,54 @@ +# +# Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request #15218: [SPARK-17637][Scheduler]Packed scheduling for Spa...

2016-10-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15218#discussion_r83561571 --- Diff: docs/configuration.md --- @@ -1334,6 +1334,17 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark pull request #14233: [SPARK-16490] [Examples] added a python example f...

2016-10-16 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/14233#discussion_r83560259 --- Diff: examples/src/main/python/mllib/chisq_selector_example.py --- @@ -0,0 +1,56 @@ +# +# Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark issue #15503: Fix example of tf_idf with minDocFreq

2016-10-16 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15503 this looks like a reasonable improvement (unfortunate that the examples don't have any with numDoc >2 but such is life). --- If your project is set up for it, you can reply to this email and have

[GitHub] spark pull request #15218: [SPARK-17637][Scheduler]Packed scheduling for Spa...

2016-10-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15218#discussion_r83562596 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala --- @@ -109,6 +109,72 @@ class TaskSchedulerImplSuite extends

[GitHub] spark pull request #15218: [SPARK-17637][Scheduler]Packed scheduling for Spa...

2016-10-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15218#discussion_r83562599 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala --- @@ -408,4 +474,5 @@ class TaskSchedulerImplSuite extends

[GitHub] spark issue #15218: [SPARK-17637][Scheduler]Packed scheduling for Spark task...

2016-10-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15218 The test case design is pretty good. It covers all the scenarios. - Could you add a check for the negative case? That means, when users do not provide the right TaskAssigner name, we

[GitHub] spark issue #14702: [SPARK-15694] Implement ScriptTransformation in sql/core...

2016-10-16 Thread tejasapatil
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/14702 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes

[GitHub] spark pull request #15218: [SPARK-17637][Scheduler]Packed scheduling for Spa...

2016-10-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15218#discussion_r83560546 --- Diff: docs/configuration.md --- @@ -1334,6 +1334,17 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark pull request #15218: [SPARK-17637][Scheduler]Packed scheduling for Spa...

2016-10-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15218#discussion_r83560691 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -61,6 +59,21 @@ private[spark] class TaskSchedulerImpl(

[GitHub] spark issue #14702: [SPARK-15694] Implement ScriptTransformation in sql/core...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14702 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67040/ Test FAILed. ---
