[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 @WeichenXu123 Sure. I should point out that I ran this benchmark in spark-shell in local mode. It would be great if you could run the benchmark too, to verify the numbers.

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 Updated test code: import org.apache.spark.ml.feature._ import org.apache.spark.sql.{DataFrame, Row} import org.apache.spark.sql.types._ import spark.implicits._

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-09-18 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/18805 It looks like zstd-jni has now been updated to pull 1.3.1 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18853 **[Test build #81876 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81876/testReport)** for PR 18853 at commit

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19230 **[Test build #81877 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81877/testReport)** for PR 19230 at commit

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid calling reserveUnrollMemoryForT...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19135 **[Test build #81878 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81878/testReport)** for PR 19135 at commit

[GitHub] spark issue #19210: [SPARK-22030][CORE] GraphiteSink fails to re-connect to ...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19210 **[Test build #81875 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81875/testReport)** for PR 19210 at commit

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19265 **[Test build #81879 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81879/testReport)** for PR 19265 at commit

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18853 **[Test build #81876 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81876/testReport)** for PR 18853 at commit

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19265 **[Test build #81880 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81880/testReport)** for PR 19265 at commit

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 New numbers:

numColumns | Old Mean | Old Median | New Mean | New Median
-- | -- | -- | -- | --
1 | 0.1727832915997 | 0.1537169693 | 0.1687325048997 | 0.1521283075
10 |

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @WeichenXu123 I'm OK with that, but I think adding an interface doesn't break binary compatibility?

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 @viirya I think it is possible. A similar example is the `HasRegParam` trait: `setRegParam` is not put in the trait but moved into the concrete estimator/transformer classes, which should be for the same reason.

[GitHub] spark issue #19266: [SPARK-22033][CORE] BufferHolder, other size checks shou...

2017-09-18 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19266 CC @maropu

[GitHub] spark issue #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSu...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19264 Merged build finished. Test PASSed.

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 @viirya Yes. But if there is a better design, I will be happy to hear it.

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19265 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81880/ Test PASSed.

[GitHub] spark pull request #19238: [SPARK-22016][SQL] Add HiveDialect for JDBC conne...

2017-09-18 Thread danielfx90
Github user danielfx90 commented on a diff in the pull request: https://github.com/apache/spark/pull/19238#discussion_r139441849 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -1103,6 +1103,17 @@ class JDBCSuite extends SparkFunSuite

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19230 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81877/ Test PASSed.

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19230 Merged build finished. Test PASSed.

[GitHub] spark issue #19266: [SPARK-22033][CORE] BufferHolder, other size checks shou...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19266 **[Test build #81882 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81882/testReport)** for PR 19266 at commit

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19229 Great! That's it. Thanks!

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 I don't think shuffle reuse is the reason behind the numbers. If you look at the previous comments, you will find that I previously ran the test without `count` after `model.transform`. Namely the

[GitHub] spark issue #15544: [SPARK-17997] [SQL] Add an aggregation function for coun...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15544 **[Test build #81881 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81881/testReport)** for PR 15544 at commit

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19265 **[Test build #81879 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81879/testReport)** for PR 19265 at commit

[GitHub] spark pull request #19266: [SPARK-22033][CORE] BufferHolder, other size chec...

2017-09-18 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/19266 [SPARK-22033][CORE] BufferHolder, other size checks should account for the specific VM array size limitations ## What changes were proposed in this pull request? Try to avoid allocating an

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19229 @viirya I guess the reason is that in the old PR version, `df.withColumn(..).withColumn(..).withColumn(..)`, the long df chain prevents shuffle reuse... but now you merge them into one

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 Btw, I don't see any hint that `df.withColumn(..).withColumn(..).withColumn(..)` can prevent shuffle reuse.

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19265 Merged build finished. Test PASSed.

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 @WeichenXu123 Btw, the test basically reuses the code from https://github.com/apache/spark/pull/18902#issuecomment-321727416. Is your concern specific to this?

[GitHub] spark issue #15544: [SPARK-17997] [SQL] Add an aggregation function for coun...

2017-09-18 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/15544 retest this please

[GitHub] spark issue #19261: [SPARK-22040] Add current_date function with timezone id

2017-09-18 Thread jaceklaskowski
Github user jaceklaskowski commented on the issue: https://github.com/apache/spark/pull/19261 @gatorsmile Dunno, but the logical operator does.

[GitHub] spark pull request #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-09-18 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/19269 [SPARK-22026][SQL][WIP] data source v2 write path ## What changes were proposed in this pull request? A working prototype for data source v2 write path. TODO: doc. ##

[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18704 **[Test build #81883 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81883/testReport)** for PR 18704 at commit

[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18704 Merged build finished. Test PASSed.

[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18704 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81883/ Test PASSed.

[GitHub] spark issue #19254: [MINOR][CORE] Cleanup dead code and duplication in Mem. ...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19254 **[Test build #3925 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3925/testReport)** for PR 19254 at commit

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-09-18 Thread akopich
Github user akopich commented on the issue: https://github.com/apache/spark/pull/18924 @WeichenXu123, thank you for your prompt reply!

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18924 **[Test build #81893 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81893/testReport)** for PR 18924 at commit

[GitHub] spark issue #19232: [SPARK-22009][ML] Using treeAggregate improve some algs

2017-09-18 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/19232 Sure, we all agree there is a mechanism for avoiding overhead. However, performance tests are very tricky things, 5% is not a huge improvement, and hard-coding the aggregation depth to `2` limits

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139493581 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala --- @@ -0,0 +1,127 @@ +/* + * Licensed to

[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19250 Merged build finished. Test FAILed.

[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19250 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81888/ Test FAILed.

[GitHub] spark issue #19266: [SPARK-22033][CORE] BufferHolder, other size checks shou...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19266 **[Test build #81882 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81882/testReport)** for PR 19266 at commit

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12646 Merged build finished. Test PASSed.

[GitHub] spark issue #19211: [SPARK-18838][core] Add separate listener queues to Live...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19211 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81884/ Test PASSed.

[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19250 **[Test build #81892 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81892/testReport)** for PR 19250 at commit

[GitHub] spark issue #19268: Incorrect Metric reported in MetricsReporter.scala

2017-09-18 Thread Taaffy
Github user Taaffy commented on the issue: https://github.com/apache/spark/pull/19268 Will do. Delete this pull afterwards?

[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19250 **[Test build #81888 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81888/testReport)** for PR 19250 at commit

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12646 **[Test build #81886 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81886/testReport)** for PR 12646 at commit

[GitHub] spark issue #19211: [SPARK-18838][core] Add separate listener queues to Live...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19211 **[Test build #81884 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81884/testReport)** for PR 19211 at commit

[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-09-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12646

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-09-18 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r139514301 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +462,44 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-09-18 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r139514402 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +462,44 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #19268: Incorrect Metric reported in MetricsReporter.scala

2017-09-18 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19268 There is no way to make the change without a PR, so no, leave it. http://spark.apache.org/contributing.html

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19218 - You can get it from the table metadata `table: CatalogTable` - `InsertIntoHadoopFsRelationCommand.scala` is for data source tables. We only have the issues for Hive table writing,

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139496371 --- Diff: python/pyspark/sql/functions.py --- @@ -2142,18 +2159,26 @@ def udf(f=None, returnType=StringType()): | 8| JOHN DOE|

[GitHub] spark issue #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19269 **[Test build #81891 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81891/testReport)** for PR 19269 at commit

[GitHub] spark pull request #19266: [SPARK-22033][CORE] BufferHolder, other size chec...

2017-09-18 Thread buryat
Github user buryat commented on a diff in the pull request: https://github.com/apache/spark/pull/19266#discussion_r139492024 --- Diff: core/src/main/scala/org/apache/spark/util/collection/CompactBuffer.scala --- @@ -126,22 +126,20 @@ private[spark] class CompactBuffer[T: ClassTag]

[GitHub] spark issue #19266: [SPARK-22033][CORE] BufferHolder, other size checks shou...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19266 Merged build finished. Test PASSed.

[GitHub] spark issue #19266: [SPARK-22033][CORE] BufferHolder, other size checks shou...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19266 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81882/ Test PASSed.

[GitHub] spark issue #19211: [SPARK-18838][core] Add separate listener queues to Live...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19211 Merged build finished. Test PASSed.

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12646 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81886/ Test PASSed.

[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r139513283 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -503,69 +504,304 @@ case class

[GitHub] spark issue #19268: Incorrect Metric reported in MetricsReporter.scala

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19268 Can one of the admins verify this patch?

[GitHub] spark issue #19268: Incorrect Metric reported in MetricsReporter.scala

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19268 **[Test build #3926 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3926/testReport)** for PR 19268 at commit

[GitHub] spark issue #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19269 Merged build finished. Test FAILed.

[GitHub] spark issue #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19269 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81891/ Test FAILed.

[GitHub] spark issue #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19269 **[Test build #81891 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81891/testReport)** for PR 19269 at commit

[GitHub] spark pull request #19266: [SPARK-22033][CORE] BufferHolder, other size chec...

2017-09-18 Thread buryat
Github user buryat commented on a diff in the pull request: https://github.com/apache/spark/pull/19266#discussion_r139489491 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/array/LongArray.java --- @@ -39,7 +39,7 @@ private final long length;

[GitHub] spark pull request #19266: [SPARK-22033][CORE] BufferHolder, other size chec...

2017-09-18 Thread buryat
Github user buryat commented on a diff in the pull request: https://github.com/apache/spark/pull/19266#discussion_r139489658 --- Diff: core/src/main/java/org/apache/spark/unsafe/map/HashMapGrowthStrategy.java --- @@ -30,11 +30,15 @@ HashMapGrowthStrategy DOUBLING = new

[GitHub] spark issue #19196: [SPARK-21977] SinglePartition optimizations break certai...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19196 Merged build finished. Test PASSed.

[GitHub] spark issue #19196: [SPARK-21977] SinglePartition optimizations break certai...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19196 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81887/ Test PASSed.

[GitHub] spark issue #19196: [SPARK-21977] SinglePartition optimizations break certai...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19196 **[Test build #81887 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81887/testReport)** for PR 19196 at commit

[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r139513207 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -503,69 +504,304 @@ case class

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/12646 Thanks! Merged to master. Could you resolve the above comments in the follow-up PR? Thanks!

[GitHub] spark issue #19207: [SPARK-21809] : Change Stage Page to use datatables to s...

2017-09-18 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/19207 It looks like you have a bunch of unrelated code in this PR; this seems to be caused by how you're doing development. You've opened this PR from your master branch and it includes work on 3 other

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-18 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/19194 @tgraves I have addressed the comments and tried to cover the possible cases in the existing test for job groups and speculation. Kindly let me know if we need to add or address more use cases.

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19194 **[Test build #81894 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81894/testReport)** for PR 19194 at commit

[GitHub] spark pull request #19211: [SPARK-18838][core] Add separate listener queues ...

2017-09-18 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19211#discussion_r139529920 --- Diff: core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala --- @@ -39,20 +41,13 @@ import org.apache.spark.util.Utils * has

[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-09-18 Thread kevinyu98
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r139535041 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -503,69 +504,304 @@ case class

[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-09-18 Thread kevinyu98
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r139535018 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -503,69 +504,304 @@ case class

[GitHub] spark issue #19270: [SPARK-21809] : Change Stage Page to use datatables to s...

2017-09-18 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/19270 Thanks, I'll try to review this by EOD tomorrow

[GitHub] spark issue #19252: [SPARK-21969][SQL] CommandUtils.updateTableStats should ...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19252 **[Test build #81896 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81896/testReport)** for PR 19252 at commit

[GitHub] spark issue #19186: [SPARK-21972][ML] Add param handlePersistence

2017-09-18 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/19186 @zhengruifeng Can you please update the PR description so it describes the actual functionality being added?

[GitHub] spark issue #19268: [SPARK-22052] Incorrect Metric reported in MetricsReport...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19268 **[Test build #3926 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3926/testReport)** for PR 19268 at commit

[GitHub] spark issue #16774: [SPARK-19357][ML] Adding parallel model evaluation in ML...

2017-09-18 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16774 @WeichenXu123 Thanks for finding that bug! Can you please separate out your bugfix? It's good to get fixes in, rather than attaching them to PRs which may require discussion, so that we make

[GitHub] spark issue #18887: [SPARK-20642][core] Store FsHistoryProvider listing data...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18887 **[Test build #81890 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81890/testReport)** for PR 18887 at commit

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18924 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81893/ Test PASSed.

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18924 **[Test build #81893 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81893/testReport)** for PR 18924 at commit

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18924 Merged build finished. Test PASSed.

[GitHub] spark pull request #19207: [SPARK-21809] : Change Stage Page to use datatabl...

2017-09-18 Thread pgandhi999
Github user pgandhi999 closed the pull request at: https://github.com/apache/spark/pull/19207

[GitHub] spark pull request #19270: [SPARK-21809] : Change Stage Page to use datatabl...

2017-09-18 Thread pgandhi999
GitHub user pgandhi999 opened a pull request: https://github.com/apache/spark/pull/19270 [SPARK-21809] : Change Stage Page to use datatables to support sorting columns and searching Support column sort and search for Stage Server using jQuery DataTable and REST API. Before this

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-09-18 Thread kevinyu98
Github user kevinyu98 commented on the issue: https://github.com/apache/spark/pull/12646 Hello Sean: Thanks so much for the help on this PR. I appreciate all the help from you and all the reviewers.

[GitHub] spark issue #19106: [SPARK-21770][ML] ProbabilisticClassificationModel fix c...

2017-09-18 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/19106 These are fair arguments. I guess it makes sense to throw an exception; that's fine with me.

[GitHub] spark issue #19186: [SPARK-21972][ML] Add param handlePersistence

2017-09-18 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/19186 This has ended up being more complex than we envisioned. It would be valuable to describe the design succinctly so that people can debate it on JIRA. Could you please describe your solution on

[GitHub] spark issue #19252: [SPARK-21969][SQL] CommandUtils.updateTableStats should ...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19252 **[Test build #81897 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81897/testReport)** for PR 19252 at commit

[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19250 Merged build finished. Test PASSed.

[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19250 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81892/ Test PASSed.

[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19250 **[Test build #81892 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81892/testReport)** for PR 19250 at commit
