[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...

2017-08-30 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r136258558 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveDirCommand.scala --- @@ -0,0 +1,118 @@ +/* + * Licensed to t

[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...

2017-08-30 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r136258236 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -155,6 +156,9 @@ object HiveAnalysis extends Rule[LogicalPlan] {

[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...

2017-08-30 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r136258072 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala --- @@ -140,6 +141,9 @@ case class DataSourceAnaly

[GitHub] spark issue #17980: [SPARK-20728][SQL] Make ORCFileFormat configurable betwe...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17980 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81274/ Test PASSed. ---

[GitHub] spark issue #17980: [SPARK-20728][SQL] Make ORCFileFormat configurable betwe...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17980 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature e

[GitHub] spark issue #17980: [SPARK-20728][SQL] Make ORCFileFormat configurable betwe...

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17980 **[Test build #81274 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81274/testReport)** for PR 17980 at commit [`edead5d`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...

2017-08-30 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18975 Just left a comment: https://github.com/apache/spark/pull/18975#discussion_r136256422

[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...

2017-08-30 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r136256422 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -1509,4 +1509,84 @@ class SparkSqlAstBuilder(conf: SQLConf)

[GitHub] spark issue #19093: [SPARK-21880][web UI]In the SQL table page, modify jobs ...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19093 Can one of the admins verify this patch?

[GitHub] spark pull request #19093: [SPARK-21880][web UI]In the SQL table page, modif...

2017-08-30 Thread Geek-He
GitHub user Geek-He opened a pull request: https://github.com/apache/spark/pull/19093 [SPARK-21880][web UI]In the SQL table page, modify jobs trace information ## What changes were proposed in this pull request? As shown below, for example, when job 5 is running, it was a mis

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-08-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 Thanks, I will refine soon.

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-08-30 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19086 Test cases are missing.

[GitHub] spark pull request #19086: [SPARK-21874][SQL] Support changing database when...

2017-08-30 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19086#discussion_r136252314 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClient.scala --- @@ -95,6 +95,9 @@ private[hive] trait HiveClient { /** Upd

[GitHub] spark pull request #19086: [SPARK-21874][SQL] Support changing database when...

2017-08-30 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19086#discussion_r136252223 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -576,25 +576,25 @@ class SessionCatalog(

[GitHub] spark pull request #19086: [SPARK-21874][SQL] Support changing database when...

2017-08-30 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19086#discussion_r136252051 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/InMemoryCatalog.scala --- @@ -264,21 +264,22 @@ class InMemoryCatalog(

[GitHub] spark pull request #19086: [SPARK-21874][SQL] Support changing database when...

2017-08-30 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19086#discussion_r136251887 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala --- @@ -131,13 +131,21 @@ abstract class ExternalCatal

[GitHub] spark issue #19092: [SPARK-21878] [SQL] [TEST] Create SQLMetricsTestUtils

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19092 **[Test build #81275 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81275/testReport)** for PR 19092 at commit [`3dad127`](https://github.com/apache/spark/commit/3d

[GitHub] spark pull request #19092: [SPARK-21878] [SQL] [TEST] Create SQLMetricsTestU...

2017-08-30 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19092#discussion_r136251106 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLMetricsSuite.scala --- @@ -17,112 +17,10 @@ package org.apache.spark

[GitHub] spark pull request #19092: [SPARK-21878] [SQL] [TEST] Create SQLMetricsUtils

2017-08-30 Thread gatorsmile
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/19092 [SPARK-21878] [SQL] [TEST] Create SQLMetricsUtils ## What changes were proposed in this pull request? Creates `SQLMetricsTestUtils` for the utility functions of both Hive-specific and the ot

[GitHub] spark issue #19092: [SPARK-21878] [SQL] [TEST] Create SQLMetricsTestUtils

2017-08-30 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19092 cc @cloud-fan

[GitHub] spark pull request #19083: [SPARK-21871][SQL] Check actual bytecode size whe...

2017-08-30 Thread rednaxelafx
Github user rednaxelafx commented on a diff in the pull request: https://github.com/apache/spark/pull/19083#discussion_r136250152 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -1001,6 +1001,16 @@ abstract class C

[GitHub] spark pull request #19072: [SPARK-17139][ML][FOLLOW-UP] Add convenient metho...

2017-08-30 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/19072#discussion_r136249376 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -1471,6 +1471,17 @@ sealed trait LogisticRegressionSumm

[GitHub] spark pull request #19078: [SPARK-21862][ML] Add overflow check in PCA

2017-08-30 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/19078#discussion_r136248809 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/PCA.scala --- @@ -110,3 +115,17 @@ class PCAModel private[spark] ( } } }

[GitHub] spark pull request #19078: [SPARK-21862][ML] Add overflow check in PCA

2017-08-30 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/19078#discussion_r136249083 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/PCA.scala --- @@ -44,6 +44,11 @@ class PCA @Since("1.4.0") (@Since("1.4.0") val k: Int) {

[GitHub] spark pull request #19078: [SPARK-21862][ML] Add overflow check in PCA

2017-08-30 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/19078#discussion_r136248974 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/PCA.scala --- @@ -44,6 +44,11 @@ class PCA @Since("1.4.0") (@Since("1.4.0") val k: Int) {

[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19077 Merged build finished. Test PASSed.

[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19077 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81273/ Test PASSed. ---

[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19077 **[Test build #81273 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81273/testReport)** for PR 19077 at commit [`ba5717e`](https://github.com/apache/spark/commit/b

[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19077 Merged build finished. Test PASSed.

[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19077 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81272/ Test PASSed. ---

[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19077 **[Test build #81272 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81272/testReport)** for PR 19077 at commit [`fc8b895`](https://github.com/apache/spark/commit/f

[GitHub] spark issue #18999: [SPARK-21779][PYTHON] Simpler DataFrame.sample API in Py...

2017-08-30 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18999 LGTM.

[GitHub] spark pull request #19080: [SPARK-21865][SQL] simplify the distribution sema...

2017-08-30 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/19080#discussion_r136243745 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -30,18 +30,43 @@ import org.apache.spark.sql

[GitHub] spark pull request #16774: [SPARK-19357][ML] Adding parallel model evaluatio...

2017-08-30 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/16774#discussion_r136243309 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala --- @@ -120,6 +120,33 @@ class CrossValidatorSuite }

[GitHub] spark issue #19089: [SPARK-21728][core] Follow up: fix user config, auth in ...

2017-08-30 Thread jaceklaskowski
Github user jaceklaskowski commented on the issue: https://github.com/apache/spark/pull/19089 Logs are back with the change. 👍 Thanks (and don't mess it up again fixing STS :))

[GitHub] spark pull request #18787: [SPARK-21583][SQL] Create a ColumnarBatch from Ar...

2017-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18787

[GitHub] spark issue #18787: [SPARK-21583][SQL] Create a ColumnarBatch from ArrowColu...

2017-08-30 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18787 Thanks! Merging to master.

[GitHub] spark issue #17980: [SPARK-20728][SQL] Make ORCFileFormat configurable betwe...

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17980 **[Test build #81274 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81274/testReport)** for PR 17980 at commit [`edead5d`](https://github.com/apache/spark/commit/ed

[GitHub] spark pull request #19085: [SPARK-21534][SQL][PySpark] PickleException when ...

2017-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19085

[GitHub] spark issue #19085: [SPARK-21534][SQL][PySpark] PickleException when creatin...

2017-08-30 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19085 Merged to master.

[GitHub] spark pull request #19083: [SPARK-21871][SQL] Check actual bytecode size whe...

2017-08-30 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19083#discussion_r136240836 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -1001,6 +1001,16 @@ abstract class CodeGe

[GitHub] spark pull request #19083: [SPARK-21871][SQL] Check actual bytecode size whe...

2017-08-30 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19083#discussion_r136240918 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -1058,7 +1068,17 @@ object CodeGenerator

[GitHub] spark pull request #19083: [SPARK-21871][SQL] Check actual bytecode size whe...

2017-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19083#discussion_r136240660 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -1058,7 +1068,17 @@ object CodeGenerator

[GitHub] spark pull request #19083: [SPARK-21871][SQL] Check actual bytecode size whe...

2017-08-30 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19083#discussion_r136240500 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -1079,7 +1099,7 @@ object CodeGenerator e

[GitHub] spark pull request #19083: [SPARK-21871][SQL] Check actual bytecode size whe...

2017-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19083#discussion_r136239358 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -1079,7 +1099,7 @@ object CodeGenerator e

[GitHub] spark issue #19091: Merge pull request #1 from apache/master

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19091 Can one of the admins verify this patch?

[GitHub] spark issue #19091: Merge pull request #1 from apache/master

2017-08-30 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19091 @wangdanaadf, it looks like this was opened by mistake. Could you close it, please?

[GitHub] spark pull request #19091: Merge pull request #1 from apache/master

2017-08-30 Thread wangdanaadf
GitHub user wangdanaadf opened a pull request: https://github.com/apache/spark/pull/19091 Merge pull request #1 from apache/master spark-2.1 ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this pa

[GitHub] spark pull request #19083: [SPARK-21871][SQL] Check actual bytecode size whe...

2017-08-30 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19083#discussion_r136237952 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -1001,6 +1001,16 @@ abstract class CodeGen

[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-08-30 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18704 ping @cloud-fan

[GitHub] spark issue #16803: [SPARK-19458][BUILD]load hive jars from local repo which...

2017-08-30 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/16803 @windpiger, can you please rebase the code? It seems too old to review.

[GitHub] spark issue #19085: [SPARK-21534][SQL][PySpark] PickleException when creatin...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19085 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81269/ Test PASSed. ---

[GitHub] spark issue #19085: [SPARK-21534][SQL][PySpark] PickleException when creatin...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19085 Merged build finished. Test PASSed.

[GitHub] spark issue #19085: [SPARK-21534][SQL][PySpark] PickleException when creatin...

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19085 **[Test build #81269 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81269/testReport)** for PR 19085 at commit [`c7dda23`](https://github.com/apache/spark/commit/c

[GitHub] spark pull request #19067: [SPARK-21849][Core]Make the serializer function m...

2017-08-30 Thread djvulee
Github user djvulee closed the pull request at: https://github.com/apache/spark/pull/19067

[GitHub] spark issue #18961: [SPARK-21746][SQL]there is an java.lang.IllegalArgumentE...

2017-08-30 Thread heary-cao
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/18961 @gatorsmile, this should be a problem for code execution, semantics, and consistency. The trigger condition: _FileSourceScanExec.partitionFilters_ is not null and contains nondeterministic

[GitHub] spark pull request #18865: [SPARK-21610][SQL] Corrupt records are not handle...

2017-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18865#discussion_r136234146 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonFileFormat.scala --- @@ -114,7 +114,16 @@ class JsonFileFormat extends

[GitHub] spark issue #18787: [SPARK-21583][SQL] Create a ColumnarBatch from ArrowColu...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18787 Merged build finished. Test PASSed.

[GitHub] spark issue #18787: [SPARK-21583][SQL] Create a ColumnarBatch from ArrowColu...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18787 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81270/ Test PASSed. ---

[GitHub] spark issue #18787: [SPARK-21583][SQL] Create a ColumnarBatch from ArrowColu...

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18787 **[Test build #81270 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81270/testReport)** for PR 18787 at commit [`ffcbf75`](https://github.com/apache/spark/commit/f

[GitHub] spark issue #18029: [SPARK-20168] [DStream] Add changes to use kinesis fetch...

2017-08-30 Thread yssharma
Github user yssharma commented on the issue: https://github.com/apache/spark/pull/18029 Could I get some love here from the committers, please? @brkyvz @HyukjinKwon @srowen. Would love to work on any changes if required.

[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19077 **[Test build #81273 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81273/testReport)** for PR 19077 at commit [`ba5717e`](https://github.com/apache/spark/commit/ba

[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19077 **[Test build #81272 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81272/testReport)** for PR 19077 at commit [`fc8b895`](https://github.com/apache/spark/commit/fc

[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-08-30 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/19077 @jerryshao Thanks, I will add unit tests.

[GitHub] spark pull request #19032: [SPARK-17321][YARN] Avoid writing shuffle metadat...

2017-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19032

[GitHub] spark issue #19001: [SPARK-19256][SQL] Hive bucketing support

2017-08-30 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19001 With the simplified distribution semantic, I think it's much easier to support Hive bucketing. We only need to create a `HiveHashPartitioning`, implement it similarly to `HashPartitioning` witho

[GitHub] spark issue #19080: [SPARK-21865][SQL] simplify the distribution semantic of...

2017-08-30 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19080 Also cc @rxin: to support the "pre-shuffle" feature for data source v2, I need to create similar `Distribution` and `Partitioning` interfaces in the data source package. However, the current mode

[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18953 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81268/ Test PASSed. ---

[GitHub] spark issue #19032: [SPARK-17321][YARN] Avoid writing shuffle metadata to di...

2017-08-30 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19032 Merge to master branch.

[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18953 Merged build finished. Test PASSed.

[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18953 **[Test build #81268 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81268/testReport)** for PR 18953 at commit [`6548cf8`](https://github.com/apache/spark/commit/6

[GitHub] spark issue #19080: [SPARK-21865][SQL] simplify the distribution semantic of...

2017-08-30 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19080 So my whole point is that co-partitioning is a really tricky requirement, and it's really hard to implicitly guarantee it during shuffle planning. We should have a weaker guarantee (same number of

[GitHub] spark issue #18787: [SPARK-21583][SQL] Create a ColumnarBatch from ArrowColu...

2017-08-30 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18787 LGTM, pending Jenkins.

[GitHub] spark pull request #19074: [SPARK-21714][CORE][BACKPORT-2.2] Avoiding re-upl...

2017-08-30 Thread jerryshao
Github user jerryshao closed the pull request at: https://github.com/apache/spark/pull/19074

[GitHub] spark issue #19074: [SPARK-21714][CORE][BACKPORT-2.2] Avoiding re-uploading ...

2017-08-30 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19074 Thanks @vanzin, it should pass now 😄.

[GitHub] spark issue #19032: [SPARK-17321][YARN] Avoid writing shuffle metadata to di...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19032 Merged build finished. Test PASSed.

[GitHub] spark issue #19032: [SPARK-17321][YARN] Avoid writing shuffle metadata to di...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19032 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81271/ Test PASSed. ---

[GitHub] spark issue #19032: [SPARK-17321][YARN] Avoid writing shuffle metadata to di...

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19032 **[Test build #81271 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81271/testReport)** for PR 19032 at commit [`ebe0a24`](https://github.com/apache/spark/commit/e

[GitHub] spark pull request #19080: [SPARK-21865][SQL] simplify the distribution sema...

2017-08-30 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19080#discussion_r136226265 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -30,18 +30,43 @@ import org.apache.spark.sql

[GitHub] spark issue #19080: [SPARK-21865][SQL] simplify the distribution semantic of...

2017-08-30 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19080 > Both sides will satisfy the required distribution of the join This is not true now. After this PR, join has a stricter distribution requirement called `HashPartitionedDistribution`, so r

[GitHub] spark pull request #9518: [SPARK-11574][Core] Add metrics StatsD sink

2017-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9518

[GitHub] spark issue #19001: [SPARK-19256][SQL] Hive bucketing support

2017-08-30 Thread tejasapatil
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/19001 https://github.com/apache/spark/pull/19080 is improving the distribution semantic in planner. Will wait for that to get in.

[GitHub] spark issue #9518: [SPARK-11574][Core] Add metrics StatsD sink

2017-08-30 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/9518 Merge to master branch, thanks @xflin !

[GitHub] spark issue #19089: [SPARK-21728][core] Follow up: fix user config, auth in ...

2017-08-30 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/19089 Hmm, the change seems to have messed up the logging of the thrift server run during its test suite, which is parsed by the test code.

[GitHub] spark issue #19032: [SPARK-17321][YARN] Avoid writing shuffle metadata to di...

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19032 **[Test build #81271 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81271/testReport)** for PR 19032 at commit [`ebe0a24`](https://github.com/apache/spark/commit/eb

[GitHub] spark issue #19090: [SPARK-21877][DEPLOY, WINDOWS] Handle quotes in Windows ...

2017-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19090 Can one of the admins verify this patch?

[GitHub] spark pull request #19090: [SPARK-21877][DEPLOY, WINDOWS] Handle quotes in W...

2017-08-30 Thread minixalpha
GitHub user minixalpha opened a pull request: https://github.com/apache/spark/pull/19090 [SPARK-21877][DEPLOY, WINDOWS] Handle quotes in Windows command scripts ## What changes were proposed in this pull request? All the Windows command scripts cannot handle quotes in param

[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-08-30 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r136223648 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java --- @@ -47,23 +47,29 @@ private boolean shouldPool(long size

[GitHub] spark pull request #19032: [SPARK-17321][YARN] Avoid writing shuffle metadat...

2017-08-30 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19032#discussion_r136223332 --- Diff: common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java --- @@ -321,6 +326,7 @@ public ByteBuffer getMetaData()

[GitHub] spark issue #19088: [SPARK-21875][BUILD] Fix Java style bugs

2017-08-30 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19088 Merged to master.

[GitHub] spark pull request #19088: [SPARK-21875][BUILD] Fix Java style bugs

2017-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19088

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-08-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 @gatorsmile Thanks for taking the time to look at this. I updated the description.

[GitHub] spark issue #18787: [SPARK-21583][SQL] Create a ColumnarBatch from ArrowColu...

2017-08-30 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18787 @ueshin I updated the test to use a seq of Rows now

[GitHub] spark issue #16774: [SPARK-19357][ML] Adding parallel model evaluation in ML...

2017-08-30 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/16774 ping @MLnick , does this look ok to merge?

[GitHub] spark issue #19085: [SPARK-21534][SQL][PySpark] PickleException when creatin...

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19085 **[Test build #81269 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81269/testReport)** for PR 19085 at commit [`c7dda23`](https://github.com/apache/spark/commit/c7

[GitHub] spark issue #18787: [SPARK-21583][SQL] Create a ColumnarBatch from ArrowColu...

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18787 **[Test build #81270 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81270/testReport)** for PR 18787 at commit [`ffcbf75`](https://github.com/apache/spark/commit/ff

[GitHub] spark pull request #19085: [SPARK-21534][SQL][PySpark] PickleException when ...

2017-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19085#discussion_r136216872 --- Diff: core/src/main/scala/org/apache/spark/api/python/SerDeUtil.scala --- @@ -35,6 +35,16 @@ import org.apache.spark.rdd.RDD /** Utilities for

[GitHub] spark pull request #19085: [SPARK-21534][SQL][PySpark] PickleException when ...

2017-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19085#discussion_r136216859 --- Diff: python/pyspark/sql/tests.py --- @@ -2480,6 +2480,11 @@ def assertCollectSuccess(typecode, value): a = array.array(t)

[GitHub] spark issue #18576: [SPARK-21351][SQL] Update nullability based on children'...

2017-08-30 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18576 yea, I also think `nullability` has good effects on many places as you suggested, so we better propagate this info correctly as much as possible. But, the plan nodes in the current implementation doe

[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...

2017-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18953 **[Test build #81268 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81268/testReport)** for PR 18953 at commit [`6548cf8`](https://github.com/apache/spark/commit/65
