date:20170718

[GitHub] spark issue #18620: [SPARK-21401][ML][MLLIB] add poll function for BoundedPr...

2017-07-18 Thread MLnick

Github user MLnick commented on the issue: https://github.com/apache/spark/pull/18620 I'm not understanding why `sorted` is slower than `sortBy` - `sortBy` uses `sorted` in its implementation: ```scala def sortBy[B](f: A => B)(implicit ord: Ordering[B]): Repr = sorted(ord
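The relationship MLnick describes can be checked with a small, self-contained sketch. It assumes the Scala 2.11/2.12 standard collections, where `sortBy` is defined in terms of `sorted` with an `Ordering` derived from the key function; the sample data and the object name `SortByViaSorted` are hypothetical and not taken from the pull request.

```scala
object SortByViaSorted {
  // Hypothetical sample data: (label, score) pairs.
  val pairs: Array[(String, Int)] = Array(("a", 3), ("b", 1), ("c", 2))

  // Sort by score using sortBy with a key function.
  def viaSortBy: Array[(String, Int)] = pairs.sortBy(_._2)

  // Sort by score using sorted with an explicit Ordering built from the same key.
  def viaSorted: Array[(String, Int)] =
    pairs.sorted(Ordering.by[(String, Int), Int](_._2))

  def main(args: Array[String]): Unit = {
    // Both paths should yield the same element order.
    assert(viaSortBy.toSeq == viaSorted.toSeq)
    println(viaSortBy.mkString(", "))
  }
}
```

Since both calls end up in the same sort with an equivalent ordering, a consistent timing gap between them would have to come from something other than the sort itself.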

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18655 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-18 Thread ueshin

Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18655 Jenkins, retest this please.

[GitHub] spark issue #18637: [SPARK-15526][ML][FOLLOWUP][test-maven] Make JPMML provi...

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18637 **[Test build #79701 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79701/testReport)** for PR 18637 at commit

[GitHub] spark pull request #18635: [SPARK-21415] Triage scapegoat warnings, part 1

2017-07-18 Thread srowen

Github user srowen closed the pull request at: https://github.com/apache/spark/pull/18635

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread viirya

Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127903537 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark issue #18620: [SPARK-21401][ML][MLLIB] add poll function for BoundedPr...

2017-07-18 Thread MLnick

Github user MLnick commented on the issue: https://github.com/apache/spark/pull/18620 That would make sense. There must be something else going on. Overall, I don't think it is compelling enough evidence to make the `poll` change. (Though as mentioned it's not a huge deal so if

[GitHub] spark issue #18513: [SPARK-13969][ML] Add FeatureHasher transformer

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18513 **[Test build #79699 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79699/testReport)** for PR 18513 at commit

[GitHub] spark issue #18632: [SPARK-21412][SQL] Reset BufferHolder while initialize a...

2017-07-18 Thread cloud-fan

Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18632 ok to test

[GitHub] spark issue #18513: [SPARK-13969][ML] Add FeatureHasher transformer

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18513 Merged build finished. Test PASSed.

[GitHub] spark issue #18513: [SPARK-13969][ML] Add FeatureHasher transformer

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18513 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79699/ Test PASSed. ---

[GitHub] spark pull request #18669: tfidf-new edit

2017-07-18 Thread chlyzzo

Github user chlyzzo closed the pull request at: https://github.com/apache/spark/pull/18669

[GitHub] spark pull request #18659: [SPARK-21404][PYSPARK][WIP] Simple Python Vectori...

2017-07-18 Thread kiszk

Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r127913117 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala --- @@ -132,6 +135,61 @@ private[sql] object ArrowConverters {

[GitHub] spark issue #18665: [SPARK-21446] [SQL] Fix setAutoCommit never executed

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18665 **[Test build #3844 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3844/testReport)** for PR 18665 at commit

[GitHub] spark pull request #15471: [SPARK-17919] Make timeout to RBackend configurab...

2017-07-18 Thread QCTW

Github user QCTW commented on a diff in the pull request: https://github.com/apache/spark/pull/15471#discussion_r127771891 --- Diff: R/pkg/R/backend.R --- @@ -108,13 +108,27 @@ invokeJava <- function(isStatic, objId, methodName, ...) { conn <- get(".sparkRCon", .sparkREnv)

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread viirya

Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127897413 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark issue #18656: [SPARK-21441]Incorrect Codegen in SortMergeJoinExec resu...

2017-07-18 Thread viirya

Github user viirya commented on the issue: https://github.com/apache/spark/pull/18656 No. I meant if there's a CodegenFallback expression, wholestage codegen will not be enabled.

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread xuanyuanking

Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/18654 retest this please

[GitHub] spark issue #18633: [SPARK-21411][YARN] Lazily create FS within kerberized U...

2017-07-18 Thread jiangxb1987

Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/18633 LGTM

[GitHub] spark issue #18669: tfidf-new edit

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18669 Can one of the admins verify this patch?

[GitHub] spark issue #18635: [SPARK-21415] Triage scapegoat warnings, part 1

2017-07-18 Thread srowen

Github user srowen commented on the issue: https://github.com/apache/spark/pull/18635 Merged to master

[GitHub] spark pull request #18637: [SPARK-15526][ML][FOLLOWUP][test-maven] Make JPMM...

2017-07-18 Thread srowen

Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18637#discussion_r127903284 --- Diff: mllib/pom.xml --- @@ -139,8 +133,38 @@ + target/scala-${scala.binary.version}/classes

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127918499 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark pull request #18639: [SPARK-21408][core] Better default number of RPC ...

2017-07-18 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/18639#discussion_r127898848 --- Diff: core/src/main/scala/org/apache/spark/rpc/netty/Dispatcher.scala --- @@ -33,7 +33,7 @@ import org.apache.spark.util.ThreadUtils /**

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127901508 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark pull request #18667: Fix the simpleString used in error messages

2017-07-18 Thread srowen

Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18667#discussion_r127903964 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/LongType.scala --- @@ -43,7 +43,7 @@ class LongType private() extends IntegralType {

[GitHub] spark issue #18620: [SPARK-21401][ML][MLLIB] add poll function for BoundedPr...

2017-07-18 Thread mpjlu

Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/18620 I am ok to close this. Thanks @MLnick

[GitHub] spark pull request #18632: [SPARK-21412][SQL] Reset BufferHolder while initi...

2017-07-18 Thread viirya

Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18632#discussion_r127904688 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeRowWriter.java --- @@ -51,6 +51,7 @@ public

[GitHub] spark issue #18620: [SPARK-21401][ML][MLLIB] add poll function for BoundedPr...

2017-07-18 Thread srowen

Github user srowen commented on the issue: https://github.com/apache/spark/pull/18620 My benchmarks locally said poll() is a little faster on moderately large collections, like 100 elements in the queue. I'm really neutral. If it affords a little help, that's great. It's a natural

[GitHub] spark issue #18620: [SPARK-21401][ML][MLLIB] add poll function for BoundedPr...

2017-07-18 Thread mpjlu

Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/18620 Thanks @srowen, my test also showed that pq.poll is a little faster in some cases. One possible benefit is that if we provide pq.poll, a user's first choice may be pq.poll rather than pq.toArray.sorted, which
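The two ways of reading a priority queue in sorted order that this thread compares can be sketched as follows. This is a minimal illustration on a plain `java.util.PriorityQueue`, not the code from the pull request; the object and method names are hypothetical, and the queue here is a min-heap over `Int` chosen only to keep the example short.

```scala
import java.util.{PriorityQueue => JPriorityQueue}
import scala.collection.JavaConverters._

object PollVsSorted {
  // Build a min-heap; Scala's Ordering extends java.util.Comparator,
  // so it can be passed directly to the Java constructor.
  def newQueue(xs: Seq[Int]): JPriorityQueue[Int] = {
    val pq = new JPriorityQueue[Int](16, Ordering.Int)
    xs.foreach(x => pq.add(x))
    pq
  }

  // Option 1: copy the elements out (heap order, not sorted) and sort the copy.
  // The queue itself is left untouched.
  def viaSorted(pq: JPriorityQueue[Int]): Array[Int] =
    pq.iterator.asScala.toArray.sorted

  // Option 2: repeatedly remove the head with poll().
  // Elements come out in order, but the queue is emptied.
  def viaPoll(pq: JPriorityQueue[Int]): Array[Int] = {
    val out = new Array[Int](pq.size)
    var i = 0
    while (!pq.isEmpty) {
      out(i) = pq.poll()
      i += 1
    }
    out
  }

  def main(args: Array[String]): Unit = {
    val pq = newQueue(Seq(5, 1, 4, 2, 3))
    val sortedCopy = viaSorted(pq) // run first, because viaPoll drains the queue
    val drained = viaPoll(pq)
    assert(sortedCopy.toSeq == drained.toSeq)
    println(drained.mkString(", "))
  }
}
```

Both options cost O(n log n) comparisons in total, which may be why the micro benchmarks reported in this thread show only small differences; the practical distinction is that `poll()` consumes the queue while sorting a copy does not.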

[GitHub] spark issue #18513: [SPARK-13969][ML] Add FeatureHasher transformer

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18513 **[Test build #79699 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79699/testReport)** for PR 18513 at commit

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127903005 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark pull request #18667: Fix the simpleString used in error messages

2017-07-18 Thread fxbonnet

Github user fxbonnet commented on a diff in the pull request: https://github.com/apache/spark/pull/18667#discussion_r127905854 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/LongType.scala --- @@ -43,7 +43,7 @@ class LongType private() extends IntegralType {

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127907280 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread viirya

Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127909294 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark issue #18632: [SPARK-21412][SQL] Reset BufferHolder while initialize a...

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18632 **[Test build #79702 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79702/testReport)** for PR 18632 at commit

[GitHub] spark issue #18468: [SPARK-20873][SQL] Creat CachedBatchColumnVector to abst...

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18468 **[Test build #79703 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79703/testReport)** for PR 18468 at commit

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18654 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79695/ Test FAILed. ---

[GitHub] spark issue #18656: [SPARK-21441]Incorrect Codegen in SortMergeJoinExec resu...

2017-07-18 Thread DonnyZone

Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18656 Yeah, CodegenFallback just provides a fallback mode. However, in such a case, SortMergeJoinExec passes an incomplete row as input to a hiveUDF that implements CodegenFallback.

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12646 Merged build finished. Test FAILed.

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18655 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79696/ Test FAILed. ---

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18654 **[Test build #79695 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79695/testReport)** for PR 18654 at commit

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12646 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79697/ Test FAILed. ---

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127896217 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18654 Merged build finished. Test FAILed.

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12646 **[Test build #79697 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79697/testReport)** for PR 12646 at commit

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18655 **[Test build #79696 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79696/testReport)** for PR 18655 at commit

[GitHub] spark issue #18620: [SPARK-21401][ML][MLLIB] add poll function for BoundedPr...

2017-07-18 Thread mpjlu

Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/18620 I am also very confused about this. You can change https://github.com/apache/spark/pull/18624 to sorted and test.

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread viirya

Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127897096 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18655 **[Test build #79698 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79698/testReport)** for PR 18655 at commit

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread viirya

Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127898565 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-18 Thread ueshin

Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18655 @BryanCutler I'd like to share the motivation of refactoring `ArrowConverters` and `ColumnWriter`. For `ColumnWriter`, at first I'd like to support complex types like `ArrayType` and

[GitHub] spark issue #18620: [SPARK-21401][ML][MLLIB] add poll function for BoundedPr...

2017-07-18 Thread mpjlu

Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/18620 My micro benchmark (a program that tests only pq.toArray.sorted, pq.toArray.sortBy, and pq.poll) did not find a significant performance difference. Only in the Spark job is there a big difference.

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18654 **[Test build #79700 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79700/testReport)** for PR 18654 at commit

[GitHub] spark pull request #18669: tfidf-new edit

2017-07-18 Thread chlyzzo

GitHub user chlyzzo opened a pull request: https://github.com/apache/spark/pull/18669 tfidf-new edit ## What changes were proposed in this pull request? I added a TfIdf.scala; it can compute the TF-IDF vectors of documents. I have a case that computes document similarity, so I use the spark

[GitHub] spark issue #18669: tfidf-new edit

2017-07-18 Thread srowen

Github user srowen commented on the issue: https://github.com/apache/spark/pull/18669 @chlyzzo close this

[GitHub] spark pull request #18632: [SPARK-21412][SQL] Reset BufferHolder while initi...

2017-07-18 Thread gczsjdy

Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/18632#discussion_r127907518 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeRowWriter.java --- @@ -51,6 +51,7 @@ public

[GitHub] spark pull request #18632: [SPARK-21412][SQL] Reset BufferHolder while initi...

2017-07-18 Thread gczsjdy

Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/18632#discussion_r127908258 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeRowWriter.java --- @@ -51,6 +51,7 @@ public

[GitHub] spark issue #18669: tfidf-new edit

2017-07-18 Thread chlyzzo

Github user chlyzzo commented on the issue: https://github.com/apache/spark/pull/18669 closed, - Original Message - From: Sean Owen To: apache/spark Cc: chlyzzo , Mention

[GitHub] spark issue #18624: [SPARK-21389][ML][MLLIB] Optimize ALS recommendForAll by...

2017-07-18 Thread mpjlu

Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/18624 Hi @srowen @MLnick @jkbradley @mengxr @yanboliang Is this change acceptable? If it is acceptable, I will update the ALS ML code following this method. Also update Test Suite, which are too simple,

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread viirya

Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127891910 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark issue #18656: [SPARK-21441]Incorrect Codegen in SortMergeJoinExec resu...

2017-07-18 Thread viirya

Github user viirya commented on the issue: https://github.com/apache/spark/pull/18656 Will CodegenFallback be used in wholestage codegen? I think it's not supported.

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread viirya

Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127894313 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-18 Thread xuanyuanking

Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/18654#discussion_r127888746 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala --- @@ -0,0 +1,43 @@ +/* + *

[GitHub] spark issue #18668: [SPARK-21451][SQL]get `spark.hadoop.*` properties from s...

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18668 Can one of the admins verify this patch?

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127893543 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12646 **[Test build #79697 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79697/testReport)** for PR 12646 at commit

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127892847 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark pull request #18668: [SPARK-21451][SQL]get `spark.hadoop.*` properties...

2017-07-18 Thread yaooqinn

GitHub user yaooqinn opened a pull request: https://github.com/apache/spark/pull/18668 [SPARK-21451][SQL]get `spark.hadoop.*` properties from sysProps to hiveconf ## What changes were proposed in this pull request? get `spark.hadoop.*` properties from sysProps to
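As a rough illustration of what forwarding `spark.hadoop.*` properties could look like, here is a minimal sketch in plain Scala. The pull request description is truncated above, so everything here is an assumption: the object `SparkHadoopProps`, the helper `hadoopOverrides`, and in particular the choice to strip the `spark.hadoop.` prefix before handing the keys on are hypothetical and are not taken from the patch.

```scala
object SparkHadoopProps {
  // Prefix that marks a property as intended for the Hadoop/Hive configuration.
  val Prefix = "spark.hadoop."

  // Hypothetical helper: keep only prefixed entries and strip the prefix
  // (assumption: the target configuration expects the bare Hadoop key).
  def hadoopOverrides(props: Map[String, String]): Map[String, String] =
    props.collect {
      case (key, value) if key.startsWith(Prefix) =>
        key.stripPrefix(Prefix) -> value
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical stand-in for the system properties of a Spark application.
    val sysProps = Map(
      "spark.hadoop.hive.metastore.uris" -> "thrift://example:9083",
      "spark.app.name" -> "demo"
    )
    hadoopOverrides(sysProps).foreach { case (k, v) => println(s"$k=$v") }
  }
}
```

In a real application the input map might come from `sys.props.toMap`; the sketch takes an explicit map so that the filtering step can be exercised on its own.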

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127895586 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark issue #18656: [SPARK-21441]Incorrect Codegen in SortMergeJoinExec resu...

2017-07-18 Thread DonnyZone

Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18656 Hi, @cloud-fan, @vanzin , could you help to take a look?

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127893995 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread viirya

Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127895248 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread viirya

Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127895399 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127895419 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18654 **[Test build #79695 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79695/testReport)** for PR 18654 at commit

[GitHub] spark issue #18620: [SPARK-21401][ML][MLLIB] add poll function for BoundedPr...

2017-07-18 Thread mpjlu

Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/18620 Hi @MLnick, @srowen. My test shows that pq.poll is not significantly faster than pq.toArray.sortBy, but is significantly faster than pq.toArray.sorted. Seems not each pq.toArray.sorted (such as

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18655 **[Test build #79696 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79696/testReport)** for PR 18655 at commit

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread viirya

Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127893174 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...

2017-07-18 Thread heary-cao

Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/18555 @gatorsmile Could you please review this code again?

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127894772 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark pull request #18468: [SPARK-20873][SQL] Creat CachedBatchColumnVector ...

2017-07-18 Thread kiszk

Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18468#discussion_r127962023 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/CachedBatchColumnVector.java --- @@ -0,0 +1,421 @@ +/* + * Licensed to

[GitHub] spark issue #18656: [SPARK-21441]Incorrect Codegen in SortMergeJoinExec resu...

2017-07-18 Thread viirya

Github user viirya commented on the issue: https://github.com/apache/spark/pull/18656 I think the check for `SortMergeJoinExec` in `insertInputAdapter` should be corrected to: private def insertInputAdapter(plan: SparkPlan): SparkPlan = plan match { case p if

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread viirya

Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127965749 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark pull request #18652: [WIP] Pull non-deterministic joining keys from Jo...

2017-07-18 Thread viirya

Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18652#discussion_r127965550 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1912,6 +1913,26 @@ class Analyzer(

[GitHub] spark pull request #18468: [SPARK-20873][SQL] Creat CachedBatchColumnVector ...

2017-07-18 Thread kiszk

Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18468#discussion_r127969028 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/CachedBatchColumnVector.java --- @@ -0,0 +1,421 @@ +/* + * Licensed to

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18654 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79704/ Test PASSed.

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18654 Merged build finished. Test PASSed.

[GitHub] spark pull request #18641: [SPARK-21413][SQL] Fix 64KB JVM bytecode limit pr...

2017-07-18 Thread kiszk

Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18641#discussion_r127982825 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -273,12 +274,26 @@ case class

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18654 **[Test build #79704 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79704/testReport)** for PR 18654 at commit

[GitHub] spark issue #18670: [SPARK-21455][CORE]RpcFailure should be call on RpcRespo...

2017-07-18 Thread cloud-fan

Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18670 ok to test

[GitHub] spark issue #18670: [SPARK-21455][CORE]RpcFailure should be call on RpcRespo...

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18670 **[Test build #79706 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79706/testReport)** for PR 18670 at commit

[GitHub] spark pull request #18670: [SPARK-21455][CORE]RpcFailure should be call on R...

2017-07-18 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18670#discussion_r127988207 --- Diff: core/src/test/scala/org/apache/spark/rpc/RpcEnvSuite.scala --- @@ -624,7 +624,9 @@ abstract class RpcEnvSuite extends SparkFunSuite with

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18654 Merged build finished. Test FAILed.

[GitHub] spark issue #18654: [SPARK-21435][SQL] Empty files should be skipped while w...

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18654 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79700/ Test FAILed.

[GitHub] spark pull request #18305: [SPARK-20988][ML] Logistic regression uses aggreg...

2017-07-18 Thread MLnick

Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/18305#discussion_r127934107 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -598,8 +598,23 @@ class LogisticRegression

[GitHub] spark issue #18305: [SPARK-20988][ML] Logistic regression uses aggregator hi...

2017-07-18 Thread MLnick

Github user MLnick commented on the issue: https://github.com/apache/spark/pull/18305 @sethah IMO we should back out the test-related bc var explicit destroy code as it complicates things. I hear that this _may_ help catch bugs... but frankly I'm not convinced. Because the

[GitHub] spark issue #18632: [SPARK-21412][SQL] Reset BufferHolder while initialize a...

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18632 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79702/ Test PASSed.

[GitHub] spark issue #18632: [SPARK-21412][SQL] Reset BufferHolder while initialize a...

2017-07-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18632 **[Test build #79702 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79702/testReport)** for PR 18632 at commit

[GitHub] spark issue #18632: [SPARK-21412][SQL] Reset BufferHolder while initialize a...

2017-07-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18632 Merged build finished. Test PASSed.
