[GitHub] spark issue #18934: [SPARK-21721][SQL] Clear FileSystem deleteOnExit cache w...

2017-08-15 Thread yzheng616
Github user yzheng616 commented on the issue: https://github.com/apache/spark/pull/18934 Please try to fix it in 2.1 too. We have a product running on this version of Spark. Thanks a lot! --- If your project is set up for it, you can reply to this email and have your reply appear on

[GitHub] spark issue #18926: [SPARK-21712] [PySpark] Clarify type error for Column.su...

2017-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18926 Even if we plan to drop `long` in this PR, [the checking](https://github.com/nchammas/spark/blob/fc1d84f002f5bd66bcad038a5581a05ade8dbc35/python/pyspark/sql/column.py#L408) looks weird to me.

[GitHub] spark issue #18939: [SPARK-21724][SQL][DOC] Adds since information in the do...

2017-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18939 LGTM

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18902 **[Test build #80660 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80660/testReport)** for PR 18902 at commit

[GitHub] spark issue #18944: [SPARK-21732][SQL]Lazily init hive metastore client

2017-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18944 Thanks! Merging to master.

[GitHub] spark pull request #18939: [SPARK-21724][SQL][DOC] Adds since information in...

2017-08-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18939

[GitHub] spark issue #18315: [SPARK-21108] [ML] [WIP] convert LinearSVC to aggregator...

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18315 **[Test build #80661 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80661/testReport)** for PR 18315 at commit

[GitHub] spark issue #18939: [SPARK-21724][SQL][DOC] Adds since information in the do...

2017-08-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18939 Thank you @srowen, @dongjoon-hyun and @gatorsmile.

[GitHub] spark pull request #18640: [SPARK-21422][BUILD] Depend on Apache ORC 1.4.0

2017-08-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18640#discussion_r133130836 --- Diff: sql/core/pom.xml --- @@ -87,6 +87,16 @@ + org.apache.orc + orc-core + ${orc.classifier}

[GitHub] spark pull request #18849: [SPARK-21617][SQL] Store correct table metadata w...

2017-08-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18849#discussion_r133134000 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -342,6 +359,12 @@ private[spark] class

[GitHub] spark issue #18576: [SPARK-21351][SQL] Update nullability based on children'...

2017-08-15 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18576 @gatorsmile ping

[GitHub] spark issue #18930: [SPARK-21677][SQL] json_tuple throws NullPointException ...

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18930 **[Test build #80664 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80664/testReport)** for PR 18930 at commit

[GitHub] spark pull request #18855: [SPARK-3151] [Block Manager] DiskStore.getBytes f...

2017-08-15 Thread eyalfa
Github user eyalfa commented on a diff in the pull request: https://github.com/apache/spark/pull/18855#discussion_r133135659 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala --- @@ -1415,6 +1415,79 @@ class BlockManagerSuite extends SparkFunSuite with

[GitHub] spark pull request #18872: [SPARK-21723][ML] Fix writing LibSVM (key not fou...

2017-08-15 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18872#discussion_r133135541 --- Diff: mllib/src/test/scala/org/apache/spark/ml/source/libsvm/LibSVMRelationSuite.scala --- @@ -109,14 +112,15 @@ class LibSVMRelationSuite extends

[GitHub] spark pull request #18640: [SPARK-21422][BUILD] Depend on Apache ORC 1.4.0

2017-08-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18640#discussion_r133135473 --- Diff: sql/core/pom.xml --- @@ -87,6 +87,16 @@ + org.apache.orc + orc-core + ${orc.classifier}

[GitHub] spark issue #18947: [SPARK-21721][SQL][Backport-2.1] Clear FileSystem delete...

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18947 **[Test build #80665 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80665/testReport)** for PR 18947 at commit

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18902 **[Test build #80666 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80666/testReport)** for PR 18902 at commit

[GitHub] spark issue #18948: Add the validation of spark.cores.max under Streaming

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18948 Can one of the admins verify this patch?

[GitHub] spark pull request #18640: [SPARK-21422][BUILD] Depend on Apache ORC 1.4.0

2017-08-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18640#discussion_r133142740 --- Diff: pom.xml --- @@ -1678,6 +1681,44 @@ +org.apache.orc +orc-core +

[GitHub] spark pull request #18855: [SPARK-3151] [Block Manager] DiskStore.getBytes f...

2017-08-15 Thread eyalfa
Github user eyalfa commented on a diff in the pull request: https://github.com/apache/spark/pull/18855#discussion_r133144224 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala --- @@ -1415,6 +1415,79 @@ class BlockManagerSuite extends SparkFunSuite with

[GitHub] spark issue #18918: [SPARK-21707][SQL]Improvement a special case for non-det...

2017-08-15 Thread heary-cao
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/18918 @gatorsmile I have fixed it in PhysicalOperation and extracted a new object, FilterOperation, to handle it, but a unit test is not easy to add.

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18902 Merged build finished. Test PASSed.

[GitHub] spark issue #18736: [SPARK-21481][ML] Add indexOf method for ml.feature.Hash...

2017-08-15 Thread facaiy
Github user facaiy commented on the issue: https://github.com/apache/spark/pull/18736 Sure, @yanboliang . Thanks for your suggestion. I'll work on it later, perhaps next week. Is it OK?

[GitHub] spark issue #18948: Add the validation of spark.cores.max under Streaming

2017-08-15 Thread SOmeONee
Github user SOmeONee commented on the issue: https://github.com/apache/spark/pull/18948 Ok, sorry, I'll review it, thanks.

[GitHub] spark issue #18949: [SPARK-12961][CORE][FOLLOW-UP] Remove wrapper code for S...

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18949 **[Test build #80673 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80673/testReport)** for PR 18949 at commit

[GitHub] spark issue #18736: [SPARK-21481][ML] Add indexOf method for ml.feature.Hash...

2017-08-15 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18736 @facaiy Sure. Please open a JIRA to track it. Thanks.

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18902 **[Test build #80675 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80675/testReport)** for PR 18902 at commit

[GitHub] spark issue #18589: [SPARK-16872][ML] Add Gaussian NB

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18589 **[Test build #80677 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80677/testReport)** for PR 18589 at commit

[GitHub] spark issue #16763: [SPARK-19422][ML][WIP] Cache input data in algorithms

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16763 **[Test build #80678 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80678/testReport)** for PR 16763 at commit

[GitHub] spark pull request #18849: [SPARK-21617][SQL] Store correct table metadata w...

2017-08-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18849#discussion_r133134774 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -342,6 +359,12 @@ private[spark] class

[GitHub] spark pull request #18640: [SPARK-21422][BUILD] Depend on Apache ORC 1.4.0

2017-08-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18640#discussion_r133134666 --- Diff: sql/core/pom.xml --- @@ -87,6 +87,16 @@ + org.apache.orc + orc-core + ${orc.classifier}

[GitHub] spark pull request #18947: [SPARK-21721][SQL][Backport-2.1] Clear FileSystem...

2017-08-15 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/18947 [SPARK-21721][SQL][Backport-2.1] Clear FileSystem deleteOnExit cache when paths are successfully removed ## What changes were proposed in this pull request? Backport SPARK-21721 to branch

[GitHub] spark pull request #18640: [SPARK-21422][BUILD] Depend on Apache ORC 1.4.0

2017-08-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18640#discussion_r133137450 --- Diff: sql/core/pom.xml --- @@ -87,6 +87,16 @@ + org.apache.orc + orc-core + ${orc.classifier}

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18798 **[Test build #80668 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80668/testReport)** for PR 18798 at commit

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18798 **[Test build #80668 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80668/testReport)** for PR 18798 at commit

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18798 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80668/ Test FAILed.

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18798 Merged build finished. Test FAILed.

[GitHub] spark pull request #18640: [SPARK-21422][BUILD] Depend on Apache ORC 1.4.0

2017-08-15 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18640#discussion_r133142140 --- Diff: pom.xml --- @@ -1678,6 +1681,44 @@ +org.apache.orc +orc-core +

[GitHub] spark issue #18948: Add the validation of spark.cores.max under Streaming

2017-08-15 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18948 I don't think that's valid. It doesn't catch all cases where you need > 1 core, and isn't necessary for cases that don't involve receivers, I think.

[GitHub] spark pull request #18640: [SPARK-21422][BUILD] Depend on Apache ORC 1.4.0

2017-08-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18640#discussion_r133144317 --- Diff: pom.xml --- @@ -1678,6 +1681,44 @@ +org.apache.orc +orc-core +

[GitHub] spark issue #18640: [SPARK-21422][BUILD] Depend on Apache ORC 1.4.0

2017-08-15 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18640 LGTM

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18902 **[Test build #80663 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80663/testReport)** for PR 18902 at commit

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18902 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80663/ Test PASSed.

[GitHub] spark pull request #18949: [SPARK-12961][CORE][FOLLOW-UP] Remove wrapper cod...

2017-08-15 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/18949 [SPARK-12961][CORE][FOLLOW-UP] Remove wrapper code for SnappyOutputStream ## What changes were proposed in this pull request? This pr removed the wrapper code (commit:

[GitHub] spark pull request #18872: [SPARK-21723][ML] Fix writing LibSVM (key not fou...

2017-08-15 Thread ProtD
Github user ProtD commented on a diff in the pull request: https://github.com/apache/spark/pull/18872#discussion_r133150156 --- Diff: mllib/src/test/scala/org/apache/spark/ml/source/libsvm/LibSVMRelationSuite.scala --- @@ -109,14 +112,15 @@ class LibSVMRelationSuite extends

[GitHub] spark issue #18949: [SPARK-12961][CORE][FOLLOW-UP] Remove wrapper code for S...

2017-08-15 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18949 Probably, we forgot to remove this when upgrading `snappy-java` to `1.1.2.6`. cc: @srowen @JoshRosen

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18902 **[Test build #80666 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80666/testReport)** for PR 18902 at commit

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18902 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80666/ Test PASSed.

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18902 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80667/ Test PASSed.

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18902 **[Test build #80667 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80667/testReport)** for PR 18902 at commit

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18902 Merged build finished. Test PASSed.

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18902 Merged build finished. Test PASSed.

[GitHub] spark issue #18872: [SPARK-21723][ML] Fix writing LibSVM (key not found: num...

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18872 **[Test build #80676 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80676/testReport)** for PR 18872 at commit

[GitHub] spark issue #18946: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18946 **[Test build #80674 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80674/testReport)** for PR 18946 at commit

[GitHub] spark issue #16763: [SPARK-19422][ML][WIP] Cache input data in algorithms

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16763 **[Test build #80679 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80679/testReport)** for PR 16763 at commit

[GitHub] spark issue #18947: [SPARK-21721][SQL][Backport-2.1] Clear FileSystem delete...

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18947 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80665/ Test PASSed.

[GitHub] spark issue #18947: [SPARK-21721][SQL][Backport-2.1] Clear FileSystem delete...

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18947 Merged build finished. Test PASSed.

[GitHub] spark pull request #18942: [BACKPORT-2.1][SPARK-19372][SQL] Fix throwing a J...

2017-08-15 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18942#discussion_r133132317 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogUtils.scala --- @@ -123,6 +124,38 @@ object ExternalCatalogUtils

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/18902 Jenkins, retest this please

[GitHub] spark pull request #18849: [SPARK-21617][SQL] Store correct table metadata w...

2017-08-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18849#discussion_r133133656 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -288,6 +303,7 @@ private[spark] class

[GitHub] spark pull request #18640: [SPARK-21422][BUILD] Depend on Apache ORC 1.4.0

2017-08-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18640#discussion_r133134410 --- Diff: pom.xml --- @@ -1678,6 +1681,44 @@ +org.apache.orc +orc-core +

[GitHub] spark pull request #18736: [SPARK-21481][ML] Add indexOf method for ml.featu...

2017-08-15 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18736#discussion_r133127624 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala --- @@ -74,26 +74,41 @@ class HashingTF @Since("1.4.0") (@Since("1.4.0")

[GitHub] spark issue #18849: [SPARK-21617][SQL] Store correct table metadata when alt...

2017-08-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18849 My main concern is, we should not change the write path if not necessary. For backward compatibility, the read path needs to handle all possible cases, and changing write path just adds more

[GitHub] spark issue #18947: [SPARK-21721][SQL][Backport-2.1] Clear FileSystem delete...

2017-08-15 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18947 cc @yzheng616 @gatorsmile Backport #18934 to branch 2.1.

[GitHub] spark pull request #18855: [SPARK-3151] [Block Manager] DiskStore.getBytes f...

2017-08-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18855#discussion_r133136857 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala --- @@ -1415,6 +1415,79 @@ class BlockManagerSuite extends SparkFunSuite

[GitHub] spark pull request #18640: [SPARK-21422][BUILD] Depend on Apache ORC 1.4.0

2017-08-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18640#discussion_r133138546 --- Diff: sql/core/pom.xml --- @@ -87,6 +87,16 @@ + org.apache.orc + orc-core + ${orc.classifier}

[GitHub] spark pull request #18948: Add the validation of spark.cores.max under Strea...

2017-08-15 Thread SOmeONee
GitHub user SOmeONee opened a pull request: https://github.com/apache/spark/pull/18948 Add the validation of spark.cores.max under Streaming ## What changes were proposed in this pull request? When using Spark Streaming, --total-executor-cores must be greater than 1. This pull

[GitHub] spark pull request #18855: [SPARK-3151] [Block Manager] DiskStore.getBytes f...

2017-08-15 Thread eyalfa
Github user eyalfa commented on a diff in the pull request: https://github.com/apache/spark/pull/18855#discussion_r133141988 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala --- @@ -1415,6 +1415,79 @@ class BlockManagerSuite extends SparkFunSuite with

[GitHub] spark pull request #18640: [SPARK-21422][BUILD] Depend on Apache ORC 1.4.0

2017-08-15 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18640#discussion_r133143664 --- Diff: pom.xml --- @@ -1678,6 +1681,44 @@ +org.apache.orc +orc-core +

[GitHub] spark issue #18918: [SPARK-21707][SQL]Improvement a special case for non-det...

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18918 **[Test build #80670 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80670/testReport)** for PR 18918 at commit

[GitHub] spark issue #18640: [SPARK-21422][BUILD] Depend on Apache ORC 1.4.0

2017-08-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18640 Thank you again, @viirya .

[GitHub] spark issue #18948: Add the validation of spark.cores.max under Streaming

2017-08-15 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18948 That's not the problem -- I'm saying there are cases where 1 is valid. There are obviously cases where it isn't. But your change makes them all fail.

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18798 **[Test build #80671 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80671/testReport)** for PR 18798 at commit

[GitHub] spark issue #18936: [SPARK-21688][ML][MLLIB] make native BLAS the first choi...

2017-08-15 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18936 BLAS doesn't work on sparse data. All of those invocations are on dense data of some kind. Many of the remaining ones operate on dense matrices even; they're not even level 1. I think all of them

[GitHub] spark issue #18948: Add the validation of spark.cores.max under Streaming

2017-08-15 Thread SOmeONee
Github user SOmeONee commented on the issue: https://github.com/apache/spark/pull/18948 For example, NetworkWordCount cannot succeed if it is run with '--total-executor-cores 1'.

[GitHub] spark issue #18589: [SPARK-16872][ML] Add Gaussian NB

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18589 **[Test build #80672 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80672/testReport)** for PR 18589 at commit

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18902 **[Test build #80663 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80663/testReport)** for PR 18902 at commit

[GitHub] spark issue #18736: [SPARK-21481][ML] Add indexOf method for ml.feature.Hash...

2017-08-15 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18736 @facaiy I support adding ```indexOf``` to ```ml.feature.HashingTF```, but I think the way you fixed it is incorrect for PySpark. So I'd suggest migrating the ```HashingTF``` implementation from

[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-15 Thread jmchung
Github user jmchung commented on a diff in the pull request: https://github.com/apache/spark/pull/18930#discussion_r133135302 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala --- @@ -359,14 +359,14 @@ case class

[GitHub] spark pull request #18849: [SPARK-21617][SQL] Store correct table metadata w...

2017-08-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18849#discussion_r133135343 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -1175,6 +1205,27 @@ private[spark] class

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18902 **[Test build #80667 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80667/testReport)** for PR 18902 at commit

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18798 **[Test build #80669 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80669/testReport)** for PR 18798 at commit

[GitHub] spark issue #18949: [SPARK-12961][CORE][FOLLOW-UP] Remove wrapper code for S...

2017-08-15 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18949 Also, we could safely remove the code (commit: https://github.com/apache/spark/commit/5936bf9fa85ccf7f0216145356140161c2801682) to avoid memory leak in `snappy-java`? cc: @viirya This fix has been

[GitHub] spark issue #18947: [SPARK-21721][SQL][Backport-2.1] Clear FileSystem delete...

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18947 **[Test build #80665 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80665/testReport)** for PR 18947 at commit

[GitHub] spark pull request #18943: [SPARK-21731][build] Upgrade scalastyle to 0.9.

2017-08-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18943#discussion_r133157163 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/BucketedRandomProjectionLSHExample.scala --- @@ -21,9 +21,9 @@ package

[GitHub] spark issue #18731: [SPARK-20990][SQL] Read all JSON documents in files when...

2017-08-15 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/18731 @gatorsmile any feedback on this? I added support for all the corrupt record handling modes, along with the related tests. Is anything else needed?

[GitHub] spark issue #18930: [SPARK-21677][SQL] json_tuple throws NullPointException ...

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18930 **[Test build #80664 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80664/testReport)** for PR 18930 at commit

[GitHub] spark pull request #18855: [SPARK-3151] [Block Manager] DiskStore.getBytes f...

2017-08-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18855#discussion_r133166421 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala --- @@ -1415,6 +1415,79 @@ class BlockManagerSuite extends SparkFunSuite

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18798 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80669/ Test PASSed.

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18798 Merged build finished. Test PASSed.

[GitHub] spark issue #14377: [SPARK-16625][SQL] General data types to be mapped to Or...

2017-08-15 Thread wangyum
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/14377 @gatorsmile We can consider merging this PR: https://github.com/apache/spark/pull/18266.

[GitHub] spark issue #18946: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18946 **[Test build #80674 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80674/testReport)** for PR 18946 at commit

[GitHub] spark issue #18946: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18946 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80674/ Test PASSed.

[GitHub] spark issue #18946: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18946 Merged build finished. Test PASSed.

[GitHub] spark issue #18926: [SPARK-21712] [PySpark] Clarify type error for Column.su...

2017-08-15 Thread nchammas
Github user nchammas commented on the issue: https://github.com/apache/spark/pull/18926 @gatorsmile > Even if we plan to drop `long` in this PR We are not dropping `long` in this PR. It was [never

[GitHub] spark issue #18949: [SPARK-12961][CORE][FOLLOW-UP] Remove wrapper code for S...

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18949 Merged build finished. Test PASSed.

[GitHub] spark issue #18949: [SPARK-12961][CORE][FOLLOW-UP] Remove wrapper code for S...

2017-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18949 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80673/ Test PASSed.

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-08-15 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r133158750 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-08-15 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r133157997 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with...

2017-08-15 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18538#discussion_r133157575 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache
