[GitHub] spark issue #19475: [SPARK-22257][SQL]Reserve all non-deterministic expressi...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19475 **[Test build #82659 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82659/testReport)** for PR 19475 at commit

[GitHub] spark issue #19475: [SPARK-22257][SQL]Reserve all non-deterministic expressi...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19475 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19475: [SPARK-22257][SQL]Reserve all non-deterministic expressi...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19475 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82655/ Test PASSed. ---

[GitHub] spark issue #19475: [SPARK-22257][SQL]Reserve all non-deterministic expressi...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19475 **[Test build #82655 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82655/testReport)** for PR 19475 at commit

[GitHub] spark pull request #19474: [SPARK-22252][SQL] FileFormatWriter should respec...

2017-10-11 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19474#discussion_r144198505 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/DataWritingCommand.scala --- @@ -30,6 +31,15 @@ import

[GitHub] spark issue #19464: [SPARK-22233] [core] Allow user to filter out empty spli...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19464 **[Test build #82658 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82658/testReport)** for PR 19464 at commit

[GitHub] spark pull request #19474: [SPARK-22252][SQL] FileFormatWriter should respec...

2017-10-11 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19474#discussion_r144198270 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala --- @@ -117,7 +117,7 @@ object FileFormatWriter

[GitHub] spark issue #19464: [SPARK-22233] [core] Allow user to filter out empty spli...

2017-10-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19464 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #19474: [SPARK-22252][SQL] FileFormatWriter should respec...

2017-10-11 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19474#discussion_r144197934 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/DataWritingCommand.scala --- @@ -30,6 +31,15 @@ import

[GitHub] spark issue #19464: [SPARK-22233] [core] Allow user to filter out empty spli...

2017-10-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19464 I think the optimisation by `spark.sql.files.maxPartitionBytes` sql specific conf includes this concept in `FileScanRDD` and it looks already partially doing it in combining input splits. I'd

[GitHub] spark issue #19477: [SPARK-22258][SQL] Writing empty dataset fails with ORC ...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19477 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19477: [SPARK-22258][SQL] Writing empty dataset fails with ORC ...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19477 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82654/ Test PASSed. ---

[GitHub] spark issue #19389: [SPARK-22165][SQL] Resolve type conflicts between decima...

2017-10-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19389 Thank you so much @gatorsmile. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands,

[GitHub] spark issue #19477: [SPARK-22258][SQL] Writing empty dataset fails with ORC ...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19477 **[Test build #82654 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82654/testReport)** for PR 19477 at commit

[GitHub] spark pull request #19474: [SPARK-22252][SQL] FileFormatWriter should respec...

2017-10-11 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19474#discussion_r144197368 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala --- @@ -30,4 +31,12 @@ class

[GitHub] spark pull request #19464: [SPARK-22233] [core] Allow user to filter out emp...

2017-10-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19464#discussion_r144197159 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -196,7 +196,10 @@ class HadoopRDD[K, V]( // add the credentials here

[GitHub] spark issue #19389: [SPARK-22165][SQL] Resolve type conflicts between decima...

2017-10-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19389 Will review it this weekend. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands,

[GitHub] spark pull request #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark ...

2017-10-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19459#discussion_r143906469 --- Diff: python/pyspark/sql/tests.py --- @@ -3095,16 +3095,32 @@ def setUpClass(cls): StructField("3_long_t", LongType(), True),

[GitHub] spark pull request #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark ...

2017-10-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19459#discussion_r144194374 --- Diff: python/pyspark/sql/session.py --- @@ -510,9 +511,43 @@ def createDataFrame(self, data, schema=None, samplingRatio=None, verifySchema=Tr

[GitHub] spark pull request #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark ...

2017-10-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19459#discussion_r144194084 --- Diff: python/pyspark/sql/session.py --- @@ -510,9 +511,43 @@ def createDataFrame(self, data, schema=None, samplingRatio=None, verifySchema=Tr

[GitHub] spark issue #19475: [SPARK-22257][SQL]Reserve all non-deterministic expressi...

2017-10-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19475 The idea sounds good to me. Please add unit test cases for all the ExpressionSet APIs. Thanks! --- - To unsubscribe, e-mail:

[GitHub] spark pull request #19429: [SPARK-20055] [Docs] Added documentation for load...

2017-10-11 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19429 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-11 Thread mpjlu
Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/19337 For the comments about change the name of epsilon and add setter in localLADModel, we have agreed not to change it now after some offline discussion. Because epsilon doesn't control model

[GitHub] spark issue #19429: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19429 @jomach is a new contributor to Apache Spark. It might be hard for him to address the above comments. Please submit a separate PR for addressing it. Will review it. Thanks! ---

[GitHub] spark issue #19429: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19429 Thanks! Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #19448: [SPARK-22217] [SQL] ParquetFileFormat to support ...

2017-10-11 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19448#discussion_r144195079 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala --- @@ -138,6 +138,10 @@ class

[GitHub] spark issue #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19337 **[Test build #82657 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82657/testReport)** for PR 19337 at commit

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19439 **[Test build #82656 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82656/testReport)** for PR 19439 at commit

[GitHub] spark issue #18979: [SPARK-21762][SQL] FileFormatWriter/BasicWriteTaskStatsT...

2017-10-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18979 Could you also include the [test

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144194076 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,229 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144193549 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,229 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144193350 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,229 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144193840 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,229 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144193584 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,229 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144193367 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,229 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144191899 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,229 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144191439 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/HadoopUtils.scala --- @@ -0,0 +1,122 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144191203 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,229 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144191154 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,229 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144191057 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/HadoopUtils.scala --- @@ -0,0 +1,122 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144190588 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/HadoopUtils.scala --- @@ -0,0 +1,122 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144190450 --- Diff: python/pyspark/ml/image.py --- @@ -0,0 +1,133 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +#

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144190368 --- Diff: python/pyspark/ml/image.py --- @@ -0,0 +1,133 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +#

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144190224 --- Diff: python/pyspark/ml/image.py --- @@ -0,0 +1,133 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +#

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144189855 --- Diff: python/pyspark/ml/image.py --- @@ -0,0 +1,133 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +#

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-11 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r144189582 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,229 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #19472: [WIP][SPARK-22246][SQL] Improve performance of UnsafeRow...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19472 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82651/ Test FAILed. ---

[GitHub] spark issue #19472: [WIP][SPARK-22246][SQL] Improve performance of UnsafeRow...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19472 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19472: [WIP][SPARK-22246][SQL] Improve performance of UnsafeRow...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19472 **[Test build #82651 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82651/testReport)** for PR 19472 at commit

[GitHub] spark pull request #19477: [SPARK-22258][SQL] Writing empty dataset fails wi...

2017-10-11 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19477#discussion_r144188789 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -2050,4 +2050,12 @@ class SQLQuerySuite extends

[GitHub] spark issue #18979: [SPARK-21762][SQL] FileFormatWriter/BasicWriteTaskStatsT...

2017-10-11 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18979 +1. This solves the regression on writing emtpy dataset with ORC format, too! --- - To unsubscribe, e-mail:

[GitHub] spark issue #19477: [SPARK-22258][SQL] Writing empty dataset fails with ORC ...

2017-10-11 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19477 Wow. There is a PR for that. Thank you for informing that, @viirya ! Then, it's good. --- - To unsubscribe, e-mail:

[GitHub] spark pull request #19477: [SPARK-22258][SQL] Writing empty dataset fails wi...

2017-10-11 Thread dongjoon-hyun
Github user dongjoon-hyun closed the pull request at: https://github.com/apache/spark/pull/19477 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19458: [SPARK-22227][CORE] DiskBlockManager.getAllBlocks now to...

2017-10-11 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19458 Instead of filtering out temp blocks, why not adding parsing rule for `TempLocalBlockId` and `TempShuffleBlockId`? That could also solve the problem. Since `DiskBlockManager#getAllFiles` doesn't

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-10-11 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r144187454 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -728,4 +732,254 @@ class InsertSuite extends QueryTest with

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-10-11 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r144187309 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -728,4 +732,254 @@ class InsertSuite extends QueryTest with

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-10-11 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r144187101 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -728,4 +732,254 @@ class InsertSuite extends QueryTest with

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-10-11 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r144186944 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -728,4 +732,254 @@ class InsertSuite extends QueryTest with

[GitHub] spark pull request #19477: [SPARK-22258][SQL] Writing empty dataset fails wi...

2017-10-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19477#discussion_r144186840 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -2050,4 +2050,12 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark pull request #19475: [SPARK-22257][SQL]Reserve all non-deterministic e...

2017-10-11 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/19475#discussion_r144186680 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala --- @@ -74,9 +81,13 @@ class ExpressionSet

[GitHub] spark issue #19477: [SPARK-22258][SQL] Writing empty dataset fails with ORC ...

2017-10-11 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19477 @dongjoon-hyun This is kind of duplicate to #18979, although the viewpoint of the issue is different. --- - To unsubscribe,

[GitHub] spark pull request #19419: [SPARK-22188] [CORE] Adding security headers for ...

2017-10-11 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19419#discussion_r144186220 --- Diff: conf/spark-defaults.conf.template --- @@ -25,3 +25,10 @@ # spark.serializer org.apache.spark.serializer.KryoSerializer

[GitHub] spark issue #19419: [SPARK-22188] [CORE] Adding security headers for prevent...

2017-10-11 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19419 @vanzin @tgravescs @ajbozarth what is your opinion on this PR? Is it a necessary fix for Spark? --- - To unsubscribe,

[GitHub] spark pull request #19475: [SPARK-22257][SQL]Reserve all non-deterministic e...

2017-10-11 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19475#discussion_r144185928 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala --- @@ -74,9 +81,13 @@ class ExpressionSet protected(

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-10-11 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 cc @gatorsmile @dongjoon-hyun Is it ok now? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For

[GitHub] spark issue #19475: [SPARK-22257][SQL]Reserve all non-deterministic expressi...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19475 **[Test build #82655 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82655/testReport)** for PR 19475 at commit

[GitHub] spark issue #19475: [SPARK-22257][SQL]Reserve all non-deterministic expressi...

2017-10-11 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/19475 retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #19475: [SPARK-22257][SQL]Reserve all non-deterministic e...

2017-10-11 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/19475#discussion_r144184258 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala --- @@ -74,9 +81,13 @@ class ExpressionSet

[GitHub] spark issue #19477: [SPARK-22258][SQL] Writing empty dataset fails with ORC ...

2017-10-11 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19477 Hi, @gatorsmile and @cloud-fan . This is a regression of SPARK-21669 (Internal API for collecting metrics/stats during FileFormatWriter jobs) at Spark 2.3.0. Could you review this PR?

[GitHub] spark pull request #19475: [SPARK-22257][SQL]Reserve all non-deterministic e...

2017-10-11 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/19475#discussion_r144184155 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala --- @@ -74,9 +81,13 @@ class ExpressionSet

[GitHub] spark pull request #19475: [SPARK-22257][SQL]Reserve all non-deterministic e...

2017-10-11 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19475#discussion_r144183735 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala --- @@ -74,9 +81,13 @@ class ExpressionSet protected(

[GitHub] spark issue #19477: [SPARK-22258][SQL] Writing empty dataset fails with ORC ...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19477 **[Test build #82654 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82654/testReport)** for PR 19477 at commit

[GitHub] spark pull request #19475: [SPARK-22257][SQL]Reserve all non-deterministic e...

2017-10-11 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19475#discussion_r144183653 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala --- @@ -46,14 +47,20 @@ object ExpressionSet { *

[GitHub] spark pull request #19477: [SPARK-22258][SQL] Writing empty dataset fails wi...

2017-10-11 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/19477 [SPARK-22258][SQL] Writing empty dataset fails with ORC format ## What changes were proposed in this pull request? Since

[GitHub] spark pull request #19475: [SPARK-22257][SQL]Reserve all non-deterministic e...

2017-10-11 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/19475#discussion_r144182958 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala --- @@ -46,14 +47,20 @@ object ExpressionSet {

[GitHub] spark issue #19318: [WIP][SPARK-22096][ML] use aggregateByKeyLocally in feat...

2017-10-11 Thread VinceShieh
Github user VinceShieh commented on the issue: https://github.com/apache/spark/pull/19318 thanks :) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19433 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82652/ Test FAILed. ---

[GitHub] spark pull request #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - B...

2017-10-11 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19468#discussion_r144182701 --- Diff: pom.xml --- @@ -2649,6 +2649,13 @@ + kubernetes --- End diff -- We should also change the sbt

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19433 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19433 **[Test build #82652 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82652/testReport)** for PR 19433 at commit

[GitHub] spark pull request #19475: [SPARK-22257][SQL]Reserve all non-deterministic e...

2017-10-11 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/19475#discussion_r144182675 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala --- @@ -74,9 +81,13 @@ class ExpressionSet

[GitHub] spark issue #19464: [SPARK-22233] [core] Allow user to filter out empty spli...

2017-10-11 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19464 IIUC this issue also existed in `NewHadoopRDD` and `FileScanRDD` (possibly), we'd better also fix them. --- - To unsubscribe,

[GitHub] spark pull request #19464: [SPARK-22233] [core] Allow user to filter out emp...

2017-10-11 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19464#discussion_r144181321 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -196,7 +196,10 @@ class HadoopRDD[K, V]( // add the credentials here as

[GitHub] spark issue #19474: [SPARK-22252][SQL] FileFormatWriter should respect the i...

2017-10-11 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19474 Minor comments. LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #19474: [SPARK-22252][SQL] FileFormatWriter should respec...

2017-10-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19474#discussion_r144180788 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala --- @@ -117,7 +117,7 @@ object FileFormatWriter extends

[GitHub] spark pull request #19474: [SPARK-22252][SQL] FileFormatWriter should respec...

2017-10-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19474#discussion_r144180666 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala --- @@ -117,7 +117,7 @@ object FileFormatWriter extends

[GitHub] spark issue #19476: [SPARK-22062][CORE] Spill large block to disk in BlockMa...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19476 **[Test build #82653 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82653/testReport)** for PR 19476 at commit

[GitHub] spark issue #19474: [SPARK-22252][SQL] FileFormatWriter should respect the i...

2017-10-11 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19474 I like this change because the relation between `ExecutedCommandExec` and `RunnableCommand` is a little entangled. --- - To

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-11 Thread jerryshao
GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/19476 [SPARK-22062][CORE] Spill large block to disk in BlockManager's remote fetch to avoid OOM ## What changes were proposed in this pull request? In the current BlockManager's

[GitHub] spark issue #19474: [SPARK-22252][SQL] FileFormatWriter should respect the i...

2017-10-11 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19474 > The scan node is no longer visible above the insert node, I'll fix this later. The writer bug is more important and we should fix it ASAP. Totally agreed. LGTM ---

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82650/ Test FAILed. ---

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82650 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82650/testReport)** for PR 18664 at commit

[GitHub] spark issue #18460: [SPARK-21247][SQL] Type comparison should respect case-s...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18460 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82649/ Test PASSed. ---

[GitHub] spark issue #18460: [SPARK-21247][SQL] Type comparison should respect case-s...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18460 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #18460: [SPARK-21247][SQL] Type comparison should respect case-s...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18460 **[Test build #82649 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82649/testReport)** for PR 18460 at commit

[GitHub] spark pull request #19467: [SPARK-22238] Fix plan resolution bug caused by E...

2017-10-11 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/19467#discussion_r144154295 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala --- @@ -590,10 +590,33 @@ case class

[GitHub] spark pull request #19467: [SPARK-22238] Fix plan resolution bug caused by E...

2017-10-11 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/19467#discussion_r144152923 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala --- @@ -131,17 +132,17 @@ class IncrementalExecution(

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19433 **[Test build #82652 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82652/testReport)** for PR 19433 at commit

[GitHub] spark issue #19472: [WIP][SPARK-22246][SQL] Improve performance of UnsafeRow...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19472 **[Test build #82651 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82651/testReport)** for PR 19472 at commit

  1   2   3   4   >