[GitHub] spark issue #19243: [SPARK-21780][R] Simpler Dataset.sample API in R

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19243 **[Test build #81978 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81978/testReport)** for PR 19243 at commit

[GitHub] spark issue #19243: [SPARK-21780][R] Simpler Dataset.sample API in R

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19243 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81978/ Test PASSed. ---

[GitHub] spark issue #19274: [SPARK-22056] Add subconcurrency for KafkaRDDPartition

2017-09-20 Thread bjkonglu
Github user bjkonglu commented on the issue: https://github.com/apache/spark/pull/19274 I tried this method. It worked well. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19277: [SPARK-22058][CORE]the BufferedInputStream will not be c...

2017-09-20 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19277 @zuotingbing no, `fs.open` could succeed, but `new BufferedInputStream` could fail, leaving the underlying stream open. In practice, it can't actually fail because this constructor does no I/O or
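A minimal Java sketch of the failure mode described in this comment: the underlying stream opens successfully, the wrapper's constructor throws, and the underlying stream is leaked unless it is closed explicitly. The names `WrapStreamSketch`, `TrackedStream`, and `wrap` are hypothetical and are not taken from the PR. Because the single-argument `BufferedInputStream` constructor does not fail in practice, the sketch forces a failure with the two-argument constructor and an invalid buffer size.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class WrapStreamSketch {
    /** Hypothetical stand-in for the stream returned by fs.open; records whether close() ran. */
    static class TrackedStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackedStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Wraps raw in a buffer; if the wrapper's constructor throws, closes raw before rethrowing. */
    static InputStream wrap(InputStream raw, int bufferSize) throws IOException {
        try {
            return new BufferedInputStream(raw, bufferSize);
        } catch (RuntimeException e) {
            raw.close(); // without this, the already-opened stream would stay open
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        TrackedStream raw = new TrackedStream(new byte[] {1, 2, 3});
        try {
            // A buffer size <= 0 makes the constructor throw IllegalArgumentException.
            wrap(raw, 0);
        } catch (IllegalArgumentException e) {
            System.out.println("constructor failed; underlying closed = " + raw.closed);
        }
    }
}
```

Running `main` prints `constructor failed; underlying closed = true`. Whether such a guard is worth adding when the constructor cannot realistically fail is the question the thread is discussing.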

[GitHub] spark issue #19288: [SPARK-22075][ML][GRAPHX] GBTs/Pregel unpersist datasets...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19288 **[Test build #81975 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81975/testReport)** for PR 19288 at commit

[GitHub] spark issue #19288: [SPARK-22075][ML][GRAPHX] GBTs/Pregel unpersist datasets...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19288 Merged build finished. Test PASSed.

[GitHub] spark pull request #19282: [SPARK-22066][BUILD] Update checkstyle to 8.2, en...

2017-09-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19282#discussion_r139917224 --- Diff: pom.xml --- @@ -2012,7 +2012,7 @@ net.alchim31.maven scala-maven-plugin - 3.2.2 +

[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19290 **[Test build #81988 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81988/testReport)** for PR 19290 at commit

[GitHub] spark pull request #19291: Branch 2.1

2017-09-20 Thread zhizu2018
GitHub user zhizu2018 opened a pull request: https://github.com/apache/spark/pull/19291 Branch 2.1 ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please explain how this

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2017-09-20 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r139920393 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -233,17 +235,13 @@ private[spark] class MemoryStore( }

[GitHub] spark pull request #19292: [SPARK-22066][BUILD][HOTFIX] Revert scala-maven-p...

2017-09-20 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/19292 [SPARK-22066][BUILD][HOTFIX] Revert scala-maven-plugin to 3.2.2 to work with Maven+zinc again ## What changes were proposed in this pull request? See

[GitHub] spark issue #19291: Branch 2.1

2017-09-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19291 This looks mistakenly opened. Could you close it, please, @zhizu2018?

[GitHub] spark issue #19291: Branch 2.1

2017-09-20 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19291 @zhizu2018 close this

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19218 Build finished. Test FAILed.

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19218 **[Test build #81989 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81989/testReport)** for PR 19218 at commit

[GitHub] spark issue #19288: [SPARK-22075][ML] GBTs unpersist datasets cached by Chec...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19288 **[Test build #81991 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81991/testReport)** for PR 19288 at commit

[GitHub] spark pull request #19292: [SPARK-22066][BUILD][HOTFIX] Revert scala-maven-p...

2017-09-20 Thread srowen
Github user srowen closed the pull request at: https://github.com/apache/spark/pull/19292

[GitHub] spark pull request #15544: [SPARK-17997] [SQL] Add an aggregation function f...

2017-09-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15544#discussion_r139909208 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala --- @@ -0,0 +1,235 @@

[GitHub] spark issue #19282: [SPARK-22066][BUILD] Update checkstyle to 8.2, enable it...

2017-09-20 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19282 Merged to master

[GitHub] spark pull request #19282: [SPARK-22066][BUILD] Update checkstyle to 8.2, en...

2017-09-20 Thread srowen
Github user srowen closed the pull request at: https://github.com/apache/spark/pull/19282

[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19286 **[Test build #81976 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81976/testReport)** for PR 19286 at commit

[GitHub] spark issue #19271: [SPARK-22053][SS] Stream-stream inner join in Append Mod...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19271 **[Test build #81986 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81986/testReport)** for PR 19271 at commit

[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19286 **[Test build #81985 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81985/testReport)** for PR 19286 at commit

[GitHub] spark issue #19287: [SPARK-22074][Core] Task killed by other attempt task sh...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19287 **[Test build #81987 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81987/testReport)** for PR 19287 at commit

[GitHub] spark pull request #15544: [SPARK-17997] [SQL] Add an aggregation function f...

2017-09-20 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15544#discussion_r139917384 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala --- @@ -0,0 +1,235 @@

[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...

2017-09-20 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/19290 [WIP][SPARK-22063][R] Upgrades lintr to latest commit sha1 ID ## What changes were proposed in this pull request? Currently, we set lintr to `jimhester/lintr@a769c0b` (see

[GitHub] spark issue #19291: Branch 2.1

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19291 Can one of the admins verify this patch?

[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19290 @felixcheung and @shivaram, should we upgrade this to the latest commit? It increases the time by about 5 minutes. If there is any worry about upgrading, I am willing to leave the

[GitHub] spark pull request #19282: [SPARK-22066][BUILD] Update checkstyle to 8.2, en...

2017-09-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19282#discussion_r139921363 --- Diff: pom.xml --- @@ -2012,7 +2012,7 @@ net.alchim31.maven scala-maven-plugin - 3.2.2 +

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19218 **[Test build #81989 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81989/testReport)** for PR 19218 at commit

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19218 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81989/ Test FAILed. ---

[GitHub] spark issue #19288: [SPARK-22075][ML][GRAPHX] GBTs/Pregel unpersist datasets...

2017-09-20 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/19288 @srowen I found that the cached RDDs in `Pregel` are just the result graph, and the intermediate RDDs are already unpersisted directly out of the graphCheckpointer. So I think we don't need to

[GitHub] spark issue #19292: [SPARK-22066][BUILD][HOTFIX] Revert scala-maven-plugin t...

2017-09-20 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19292 Merging to master because the PR builder (SBT) won't be affected either way, it's a hotfix, and just reverts the same version we used before. ---

[GitHub] spark issue #19289: [SPARK-22076][SQL] Expand.projections should not be a St...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19289 **[Test build #81977 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81977/testReport)** for PR 19289 at commit

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2017-09-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r139929616 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -233,17 +235,13 @@ private[spark] class MemoryStore( }

[GitHub] spark issue #19255: [WIP][SPARK-22029][PySpark] Add lru_cache to _parse_data...

2017-09-20 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19255 @HyukjinKwon I added perf tests.

[GitHub] spark pull request #19266: [SPARK-22033][CORE] BufferHolder, other size chec...

2017-09-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19266#discussion_r139656129 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/array/LongArray.java --- @@ -39,7 +39,7 @@ private final long length;

[GitHub] spark issue #19276: [SPARK-22049][DOCS] Confusing behavior of from_utc_times...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19276 **[Test build #81983 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81983/testReport)** for PR 19276 at commit

[GitHub] spark issue #19266: [SPARK-22033][CORE] BufferHolder, other size checks shou...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19266 **[Test build #81984 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81984/testReport)** for PR 19266 at commit

[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19286 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81976/ Test FAILed. ---

[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19286 Merged build finished. Test FAILed.

[GitHub] spark issue #19271: [SPARK-22053][SS] Stream-stream inner join in Append Mod...

2017-09-20 Thread tdas
Github user tdas commented on the issue: https://github.com/apache/spark/pull/19271 jenkins retest this please.

[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-20 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19286 retest this please.

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2017-09-20 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r139917367 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -233,17 +235,13 @@ private[spark] class MemoryStore( }

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2017-09-20 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r139917831 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -252,7 +250,7 @@ private[spark] class MemoryStore( if

[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...

2017-09-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139919707 --- Diff: R/pkg/R/mllib_tree.R --- @@ -352,10 +353,10 @@ setMethod("write.ml", signature(object = "GBTClassificationModel", path = "chara #'

[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...

2017-09-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139919715 --- Diff: R/pkg/R/mllib_tree.R --- @@ -132,10 +132,10 @@ print.summary.decisionTree <- function(x) { #' Gradient Boosted Tree model, \code{predict}

[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...

2017-09-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139919722 --- Diff: R/pkg/R/mllib_tree.R --- @@ -567,10 +569,10 @@ setMethod("write.ml", signature(object = "RandomForestClassificationModel", path #'

[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...

2017-09-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139919889 --- Diff: dev/lint-r.R --- @@ -28,6 +28,7 @@ if (! library(SparkR, lib.loc = LOCAL_LIB_LOC, logical.return = TRUE)) { # NOTE: The CRAN's version

[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...

2017-09-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139919103 --- Diff: R/pkg/R/DataFrame.R --- @@ -2649,15 +2651,15 @@ setMethod("merge", #' @return list of columns #' #' @note

[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...

2017-09-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139918553 --- Diff: R/pkg/.lintr --- @@ -1,2 +1,2 @@ -linters: with_defaults(line_length_linter(100), multiple_dots_linter = NULL, camel_case_linter = NULL,

[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...

2017-09-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139919377 --- Diff: R/pkg/R/context.R --- @@ -329,7 +329,7 @@ spark.addFile <- function(path, recursive = FALSE) { #' spark.getSparkFilesRootDirectory()

[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...

2017-09-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139919190 --- Diff: R/pkg/R/column.R --- @@ -238,8 +238,8 @@ setMethod("between", signature(x = "Column"), #' @param x a Column. #' @param dataType a

[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-09-20 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/17936 rdd1.cartesian(rdd2). For each task, we need to pull all the data of rdd1 (or rdd2) from the cluster. If we have n tasks running in parallel in the same executor, that means we need to duplicate the pull for n same
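The cost described in this comment can be illustrated with a small simulation in Java. It is only a sketch under stated assumptions: `CartesianFetchSketch`, `fetchRemote`, and the per-executor map are hypothetical names, no Spark API is used, and the cache is one possible way to avoid repeated pulls rather than the approach confirmed by the PR. It counts how many remote pulls happen when n tasks in one executor each need the same partition.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class CartesianFetchSketch {
    // Counts simulated remote pulls; a real pull would move data across the cluster.
    static final AtomicInteger remoteFetches = new AtomicInteger();

    /** Hypothetical remote pull of one partition. */
    static int[] fetchRemote(int partitionId) {
        remoteFetches.incrementAndGet();
        return new int[] {partitionId};
    }

    /** Every task pulls the same partition itself, so the pull count grows with nTasks. */
    static int fetchesWithoutCache(int nTasks) {
        remoteFetches.set(0);
        for (int task = 0; task < nTasks; task++) {
            fetchRemote(0);
        }
        return remoteFetches.get();
    }

    /** Tasks share an executor-level cache (an assumption), so the partition is pulled once. */
    static int fetchesWithCache(int nTasks) {
        remoteFetches.set(0);
        Map<Integer, int[]> executorCache = new ConcurrentHashMap<>();
        for (int task = 0; task < nTasks; task++) {
            executorCache.computeIfAbsent(0, CartesianFetchSketch::fetchRemote);
        }
        return remoteFetches.get();
    }

    public static void main(String[] args) {
        System.out.println("without cache: " + fetchesWithoutCache(4)); // 4 pulls
        System.out.println("with cache:    " + fetchesWithCache(4));    // 1 pull
    }
}
```

The tasks run sequentially here to keep the counts deterministic; in a real executor they would run concurrently.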

[GitHub] spark issue #19292: [SPARK-22066][BUILD][HOTFIX] Revert scala-maven-plugin t...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19292 **[Test build #81990 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81990/testReport)** for PR 19292 at commit

[GitHub] spark issue #19289: [SPARK-22076][SQL] Expand.projections should not be a St...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19289 Merged build finished. Test PASSed.

[GitHub] spark issue #19289: [SPARK-22076][SQL] Expand.projections should not be a St...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19289 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81977/ Test PASSed. ---

[GitHub] spark issue #19289: [SPARK-22076][SQL] Expand.projections should not be a St...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19289 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81979/ Test PASSed. ---

[GitHub] spark issue #19289: [SPARK-22076][SQL] Expand.projections should not be a St...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19289 Merged build finished. Test PASSed.

[GitHub] spark issue #19289: [SPARK-22076][SQL] Expand.projections should not be a St...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19289 **[Test build #81979 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81979/testReport)** for PR 19289 at commit

[GitHub] spark issue #19281: [SPARK-21998][SQL] SortMergeJoinExec did not calculate i...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19281 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81980/ Test PASSed. ---

[GitHub] spark issue #19281: [SPARK-21998][SQL] SortMergeJoinExec did not calculate i...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19281 Merged build finished. Test PASSed.

[GitHub] spark issue #19281: [SPARK-21998][SQL] SortMergeJoinExec did not calculate i...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19281 **[Test build #81980 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81980/testReport)** for PR 19281 at commit

[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17673 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82005/ Test PASSed. ---

[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17673 Merged build finished. Test PASSed.

[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17673 **[Test build #82005 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82005/testReport)** for PR 17673 at commit

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16578 **[Test build #82008 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82008/testReport)** for PR 16578 at commit

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-20 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r140049300 --- Diff: python/pyspark/serializers.py --- @@ -199,6 +211,46 @@ def __repr__(self): return "ArrowSerializer" +class

[GitHub] spark issue #19296: Branch 2.2

2017-09-20 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19296 @rkp2916 close this

[GitHub] spark pull request #19271: [SPARK-22053][SS] Stream-stream inner join in App...

2017-09-20 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/19271#discussion_r140052754 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingJoinSuite.scala --- @@ -0,0 +1,585 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #19271: [SPARK-22053][SS] Stream-stream inner join in App...

2017-09-20 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/19271#discussion_r140053073 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingJoinSuite.scala --- @@ -0,0 +1,585 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #19271: [SPARK-22053][SS] Stream-stream inner join in App...

2017-09-20 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/19271#discussion_r140050228 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingJoinSuite.scala --- @@ -0,0 +1,585 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-09-20 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/16578 Dammit. I forgot to commit a couple of build fixes. Fixed commit coming momentarily...

[GitHub] spark issue #18887: [SPARK-20642][core] Store FsHistoryProvider listing data...

2017-09-20 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18887 Can you also include the format, i.e. how to convert the list information to key-value entries?

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18853 Merged build finished. Test PASSed.

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19218 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82001/ Test PASSed. ---

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19218 Merged build finished. Test PASSed.

[GitHub] spark pull request #19271: [SPARK-22053][SS] Stream-stream inner join in App...

2017-09-20 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/19271#discussion_r140017131 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala --- @@ -0,0 +1,403 @@ +/* + *

[GitHub] spark pull request #19271: [SPARK-22053][SS] Stream-stream inner join in App...

2017-09-20 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/19271#discussion_r139844541 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala --- @@ -0,0 +1,405 @@ +/* + *

[GitHub] spark pull request #19271: [SPARK-22053][SS] Stream-stream inner join in App...

2017-09-20 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/19271#discussion_r139850114 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala --- @@ -0,0 +1,405 @@ +/* + *

[GitHub] spark pull request #19271: [SPARK-22053][SS] Stream-stream inner join in App...

2017-09-20 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/19271#discussion_r139892683 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala --- @@ -0,0 +1,344 @@ +/* + *

[GitHub] spark pull request #19271: [SPARK-22053][SS] Stream-stream inner join in App...

2017-09-20 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/19271#discussion_r139896338 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinHelper.scala --- @@ -0,0 +1,407 @@ +/* + *

[GitHub] spark pull request #19271: [SPARK-22053][SS] Stream-stream inner join in App...

2017-09-20 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/19271#discussion_r139895913 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinHelper.scala --- @@ -0,0 +1,407 @@ +/* + *

[GitHub] spark pull request #19271: [SPARK-22053][SS] Stream-stream inner join in App...

2017-09-20 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/19271#discussion_r139895235 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala --- @@ -179,6 +167,31 @@ trait WatermarkSupport extends

[GitHub] spark issue #19271: [SPARK-22053][SS] Stream-stream inner join in Append Mod...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19271 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82004/ Test PASSed. ---

[GitHub] spark issue #19271: [SPARK-22053][SS] Stream-stream inner join in Append Mod...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19271 Merged build finished. Test PASSed.

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-09-20 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r140031900 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +462,46 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-09-20 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r140032198 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -503,17 +518,15 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19290 **[Test build #82000 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82000/testReport)** for PR 19290 at commit

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-09-20 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/16578 I have some commits to address some of your comments, @viirya. However, I'm going to push a rebase first. I've validated that all catalyst, sql and hive project tests pass locally. Hopefully

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16578 **[Test build #82008 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82008/testReport)** for PR 16578 at commit

[GitHub] spark issue #18123: [SPARK-20903] [ML] Word2Vec Skip-Gram + Negative Samplin...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18123 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82006/ Test PASSed. ---

[GitHub] spark pull request #19287: [SPARK-22074][Core] Task killed by other attempt ...

2017-09-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/19287#discussion_r140047573 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskInfo.scala --- @@ -66,6 +66,12 @@ class TaskInfo( */ var finishTime: Long =

[GitHub] spark pull request #19277: [SPARK-22058][CORE]the BufferedInputStream will n...

2017-09-20 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/19277#discussion_r140059471 --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala --- @@ -351,14 +351,14 @@ private[spark] object

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-09-20 Thread akopich
Github user akopich commented on the issue: https://github.com/apache/spark/pull/18924 @jkbradley, thank you for your comments! Please, check out the commit adding the necessary docs. Regarding tests: I believe, `OnlineLDAOptimizer alpha hyperparameter optimization` from

[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-20 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18704 @cloud-fan as you proposed before, we will first work on reading the table cache, which is frequently executed. Then we will work on optimizing columnar table cache building in other PRs. ---

[GitHub] spark issue #18123: [SPARK-20903] [ML] Word2Vec Skip-Gram + Negative Samplin...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18123 **[Test build #82006 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82006/testReport)** for PR 18123 at commit

[GitHub] spark issue #18123: [SPARK-20903] [ML] Word2Vec Skip-Gram + Negative Samplin...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18123 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #19296: Branch 2.2

2017-09-20 Thread rkp2916
GitHub user rkp2916 opened a pull request: https://github.com/apache/spark/pull/19296 Branch 2.2 ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please explain how this patch

[GitHub] spark pull request #19271: [SPARK-22053][SS] Stream-stream inner join in App...

2017-09-20 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/19271#discussion_r140050970 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala --- @@ -0,0 +1,405 @@ +/* + *
