[GitHub] spark pull request #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-10 Thread NiharS
Github user NiharS commented on a diff in the pull request: https://github.com/apache/spark/pull/22192#discussion_r216210046 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -136,6 +136,26 @@ private[spark] class Executor( // for fetching remote

[GitHub] spark pull request #21433: [SPARK-23820][CORE] Enable use of long form of ca...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21433#discussion_r216209985 --- Diff: core/src/main/scala/org/apache/spark/storage/RDDInfo.scala --- @@ -53,10 +55,16 @@ class RDDInfo( } private[spark] object

[GitHub] spark pull request #21433: [SPARK-23820][CORE] Enable use of long form of ca...

2018-09-10 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21433#discussion_r216209988 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -72,6 +72,9 @@ package object config { private[spark] val

[GitHub] spark issue #21308: [SPARK-24253][SQL] Add DeleteSupport mix-in for DataSour...

2018-09-10 Thread tigerquoll
Github user tigerquoll commented on the issue: https://github.com/apache/spark/pull/21308 @rdblue when you say "you don't think the API proposed here needs to support a first-class partition concept", are you referring to the "DeleteSupport" Interface, or to DataSourceV2 in general?

[GitHub] spark pull request #22343: [SPARK-25391][SQL] Make behaviors consistent when...

2018-09-10 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22343#discussion_r216212552 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala --- @@ -1390,7 +1395,11 @@ class

[GitHub] spark issue #22318: [SPARK-25150][SQL] Rewrite condition when deduplicate Jo...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22318 Can you define the scope of this PR? In which case we should change the references in the join condition? --- - To

[GitHub] spark issue #22377: [SPARK-24849][SPARK-24911][SQL][FOLLOW-UP] Converting a ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22377 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22376: [SPARK-25021][K8S][BACKPORT] Add spark.executor.pyspark....

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22376 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22378: [SPARK-25389][SQL] INSERT OVERWRITE DIRECTORY STORED AS ...

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22378 **[Test build #95860 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95860/testReport)** for PR 22378 at commit

[GitHub] spark issue #22376: [SPARK-25021][K8S][BACKPORT] Add spark.executor.pyspark....

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22376 **[Test build #95856 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95856/testReport)** for PR 22376 at commit

[GitHub] spark issue #22378: [SPARK-25389][SQL] INSERT OVERWRITE DIRECTORY STORED AS ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22378 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22375: [WIP][SPARK-25388][Test][SQL] Detect incorrect nullable ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22375 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22377: [SPARK-24849][SPARK-24911][SQL][FOLLOW-UP] Converting a ...

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22377 **[Test build #95858 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95858/testReport)** for PR 22377 at commit

[GitHub] spark issue #22343: [SPARK-25391][SQL] Make behaviors consistent when conver...

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22343 **[Test build #95857 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95857/testReport)** for PR 22343 at commit

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22192 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95862/ Test FAILed. ---

[GitHub] spark issue #22376: [SPARK-25021][K8S][BACKPORT] Add spark.executor.pyspark....

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22376 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22376: [SPARK-25021][K8S][BACKPORT] Add spark.executor.pyspark....

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22376 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95855/ Test FAILed. ---

[GitHub] spark issue #22377: [SPARK-24849][SPARK-24911][SQL][FOLLOW-UP] Converting a ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22377 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95858/ Test FAILed. ---

[GitHub] spark issue #22343: [SPARK-25391][SQL] Make behaviors consistent when conver...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22343 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22347 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95859/ Test FAILed. ---

[GitHub] spark issue #22378: [SPARK-25389][SQL] INSERT OVERWRITE DIRECTORY STORED AS ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22378 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95860/ Test FAILed. ---

[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22347 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22375: [WIP][SPARK-25388][Test][SQL] Detect incorrect nullable ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22375 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95861/ Test FAILed. ---

[GitHub] spark issue #22343: [SPARK-25391][SQL] Make behaviors consistent when conver...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22343 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95857/ Test FAILed. ---

[GitHub] spark issue #22376: [SPARK-25021][K8S][BACKPORT] Add spark.executor.pyspark....

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22376 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95856/ Test FAILed. ---

[GitHub] spark issue #22378: [SPARK-25389][SQL] INSERT OVERWRITE DIRECTORY STORED AS ...

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22378 **[Test build #95865 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95865/testReport)** for PR 22378 at commit

[GitHub] spark issue #22372: [SPARK-25385][BUILD] Upgrade Hadoop 3.1 jackson version ...

2018-09-10 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22372 Jackson version below 2.9.5 has CVE issues, I would suggest to upgrade to 2.9.6 as #21596 did. --- - To unsubscribe, e-mail:

[GitHub] spark issue #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_json

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22237 **[Test build #95867 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95867/testReport)** for PR 22237 at commit

[GitHub] spark issue #22357: [SPARK-25363][SQL] Fix schema pruning in where clause by...

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22357 **[Test build #95868 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95868/testReport)** for PR 22357 at commit

[GitHub] spark issue #22357: [SPARK-25363][SQL] Fix schema pruning in where clause by...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22357 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22375: [WIP][SPARK-25388][Test][SQL] Detect incorrect nullable ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22375 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #22377: [SPARK-24849][SPARK-24911][SQL][FOLLOW-UP] Converting a ...

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22377 **[Test build #95863 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95863/testReport)** for PR 22377 at commit

[GitHub] spark pull request #21968: [SPARK-24999][SQL]Reduce unnecessary 'new' memory...

2018-09-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21968 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #22370: don't link to deprecated function

2018-09-10 Thread MichaelChirico
Github user MichaelChirico commented on a diff in the pull request: https://github.com/apache/spark/pull/22370#discussion_r216222159 --- Diff: R/pkg/R/catalog.R --- @@ -69,7 +69,6 @@ createExternalTable <- function(x, ...) { #' @param ... additional named parameters as options

[GitHub] spark pull request #22370: don't link to deprecated function

2018-09-10 Thread MichaelChirico
Github user MichaelChirico commented on a diff in the pull request: https://github.com/apache/spark/pull/22370#discussion_r216222094 --- Diff: R/pkg/R/catalog.R --- @@ -69,7 +69,6 @@ createExternalTable <- function(x, ...) { #' @param ... additional named parameters as options

[GitHub] spark pull request #20999: [SPARK-14922][SPARK-17732][SPARK-23866][SQL] Supp...

2018-09-10 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/20999#discussion_r216227938 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -293,6 +293,28 @@ class AstBuilder(conf: SQLConf)

[GitHub] spark issue #22373: [SPARK-25371][ML] VectorAssembler should not fail with e...

2018-09-10 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22373 @HyukjinKwon I am sure, since I tried removing the added check and the UT I added here passed. --- - To unsubscribe, e-mail:

[GitHub] spark pull request #22227: [SPARK-25202] [SQL] Implements split with limit s...

2018-09-10 Thread phegstrom
Github user phegstrom commented on a diff in the pull request: https://github.com/apache/spark/pull/7#discussion_r216247365 --- Diff: R/pkg/R/functions.R --- @@ -3404,19 +3404,24 @@ setMethod("collect_set", #' Equivalent to \code{split} SQL function. #' #'

[GitHub] spark issue #22353: [SPARK-25357][SQL] Abbreviated simpleString in DataSourc...

2018-09-10 Thread LantaoJin
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/22353 Thanks @dongjoon-hyun . That would be a problem. Seems setting to 200 or 500 are cause a limited regression on hover text. Hard code to 500 shows:

[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22379 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22379 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22372: [SPARK-25385][BUILD] Upgrade Hadoop 3.1 jackson version ...

2018-09-10 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22372 Do we officially support hadoop3 in branch 2.4? If branch 2.4 doesn't target to support Hadoop3 and this fix is only for Hadoop3, then I don't think it is meaningful to have this fix. ---

[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18142 > BTW, I believe there's no particular standard for backticks themselves since different DBMS uses different backtick implementations. You are right, but SQL standard does define how to

[GitHub] spark issue #22375: [WIP][SPARK-25388][Test][SQL] Detect incorrect nullable ...

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22375 **[Test build #95861 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95861/testReport)** for PR 22375 at commit

[GitHub] spark issue #22371: [SPARK-25386][CORE] Don't need to synchronize the IndexS...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22371 How much perf can we save here? I don't think shuffle writing will be bottlenecked by this lock. --- - To unsubscribe,

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r216218409 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala --- @@ -196,6 +201,9 @@ private[sql]

[GitHub] spark pull request #22343: [SPARK-25391][SQL] Make behaviors consistent when...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22343#discussion_r216218261 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala --- @@ -69,12 +69,25 @@ class ParquetOptions(

[GitHub] spark issue #22343: [SPARK-25391][SQL] Make behaviors consistent when conver...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22343 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #22343: [SPARK-25391][SQL] Make behaviors consistent when conver...

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22343 **[Test build #95864 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95864/testReport)** for PR 22343 at commit

[GitHub] spark pull request #22365: [SPARK-25381][SQL] Stratified sampling by Column ...

2018-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22365#discussion_r216233575 --- Diff: python/pyspark/sql/dataframe.py --- @@ -880,18 +880,23 @@ def sampleBy(self, col, fractions, seed=None): | 0|5|

[GitHub] spark pull request #22375: [WIP][SPARK-25388][Test][SQL] Detect incorrect nu...

2018-09-10 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/22375#discussion_r216206769 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -223,9 +223,9 @@ trait

[GitHub] spark pull request #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-10 Thread NiharS
Github user NiharS commented on a diff in the pull request: https://github.com/apache/spark/pull/22192#discussion_r216209462 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -240,6 +240,19 @@ private[spark] object Utils extends Logging { //

[GitHub] spark pull request #21433: [SPARK-23820][CORE] Enable use of long form of ca...

2018-09-10 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21433#discussion_r216209470 --- Diff: core/src/main/scala/org/apache/spark/storage/RDDInfo.scala --- @@ -53,10 +55,16 @@ class RDDInfo( } private[spark] object

[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

2018-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18142 I mean https://spark.apache.org/docs/latest/sql-programming-guide.html#supported-hive-features and

[GitHub] spark issue #22377: [SPARK-24849][SPARK-24911][SQL][FOLLOW-UP] Converting a ...

2018-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22377 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #22343: [SPARK-25391][SQL] Make behaviors consistent when...

2018-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22343#discussion_r216216422 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala --- @@ -69,12 +69,25 @@ class

[GitHub] spark pull request #22370: don't link to deprecated function

2018-09-10 Thread MichaelChirico
Github user MichaelChirico commented on a diff in the pull request: https://github.com/apache/spark/pull/22370#discussion_r216220254 --- Diff: R/pkg/R/catalog.R --- @@ -69,7 +69,6 @@ createExternalTable <- function(x, ...) { #' @param ... additional named parameters as options

[GitHub] spark issue #22378: [SPARK-25389][SQL] INSERT OVERWRITE DIRECTORY STORED AS ...

2018-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22378 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #22284: [SPARK-25278][SQL] Avoid duplicated Exec nodes when the ...

2018-09-10 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22284 kindly ping @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #22364: [SPARK-25379][SQL] Improve AttributeSet and Colum...

2018-09-10 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22364#discussion_r216224645 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/AttributeSet.scala --- @@ -39,10 +41,15 @@ object AttributeSet {

[GitHub] spark issue #22378: [SPARK-25389][SQL] INSERT OVERWRITE DIRECTORY STORED AS ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22378 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22357: [SPARK-25363][SQL] Fix schema pruning in where clause by...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22357 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request #22341: [SPARK-24889][Core] Update block info when unpers...

2018-09-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22341#discussion_r216257678 --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala --- @@ -646,7 +647,47 @@ private[spark] class AppStatusListener( }

[GitHub] spark issue #22372: [SPARK-25385][BUILD] Upgrade Hadoop 3.1 jackson version ...

2018-09-10 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22372 Btw, I don't think we can run current Spark with Hadoop 3.1 without any change. --- - To unsubscribe, e-mail:

[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

2018-09-10 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18142 We do not need to follow Hive if Hive does not follow SQL compliance. Our main goal is to follow the mainstream DBMS vendors. BTW, we can enhance our parser to recognize the other

[GitHub] spark pull request #22375: [WIP][SPARK-25388][Test][SQL] Detect incorrect nu...

2018-09-10 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22375#discussion_r216206924 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -223,9 +223,9 @@ trait

[GitHub] spark issue #22377: [SPARK-24849][SPARK-24911][SQL][FOLLOW-UP] Converting a ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22377 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #22377: [SPARK-24849][SPARK-24911][SQL][FOLLOW-UP] Converting a ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22377 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22353: [SPARK-25357][SQL] Abbreviated simpleString in DataSourc...

2018-09-10 Thread LantaoJin
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/22353 The purpose is logging meta info like file input file path to event log. So I revert the changes about simpleString and add back the metadata to SparkPlanInfo interface. This change will log

[GitHub] spark pull request #22372: [SPARK-25385][BUILD] Upgrade Hadoop 3.1 jackson v...

2018-09-10 Thread wangyum
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22372#discussion_r216208793 --- Diff: pom.xml --- @@ -2694,6 +2694,8 @@ 3.1.0 2.12.0 3.4.9 +2.7.8 + 2.7.8 --- End

[GitHub] spark issue #22318: [SPARK-25150][SQL] Rewrite condition when deduplicate Jo...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22318 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22318: [SPARK-25150][SQL] Rewrite condition when deduplicate Jo...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22318 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95854/ Test PASSed. ---

[GitHub] spark pull request #22372: [SPARK-25385][BUILD] Upgrade Hadoop 3.1 jackson v...

2018-09-10 Thread wangyum
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22372#discussion_r216208966 --- Diff: pom.xml --- @@ -2694,6 +2694,8 @@ 3.1.0 2.12.0 3.4.9 +2.7.8 + 2.7.8 --- End

[GitHub] spark issue #22184: [SPARK-25132][SQL][DOC] Add migration doc for case-insen...

2018-09-10 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22184 @cloud-fan @gatorsmile I think the old `Upgrading From Spark SQL 2.3.1 to 2.3.2 and above` is not needed since we do not backport SPARK-25132 to branch-2.3. I'm wondering if we need `Upgrading

[GitHub] spark pull request #22370: don't link to deprecated function

2018-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22370#discussion_r216229836 --- Diff: R/pkg/R/catalog.R --- @@ -69,7 +69,6 @@ createExternalTable <- function(x, ...) { #' @param ... additional named parameters as options

[GitHub] spark issue #22316: [SPARK-25048][SQL] Pivoting by multiple columns in Scala...

2018-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22316 Seems fine to me. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-10 Thread rvesse
Github user rvesse commented on the issue: https://github.com/apache/spark/pull/21669 @vanzin I think in the current implementation of this PR the Kerberos login is happening inside the driver pod which is running inside the K8S cluster. The old design from the Spark on K8S

[GitHub] spark issue #22372: [SPARK-25385][BUILD] Upgrade Hadoop 3.1 jackson version ...

2018-09-10 Thread wangyum
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/22372 I did a simple test for 2.9.6. It works well. But that pr for 3.0. It means that a simple test on branch 2.4 will fail: ```scala scala> spark.range(10).write.parquet("/tmp/spark/parquet")

[GitHub] spark issue #22375: [WIP][SPARK-25388][Test][SQL] Detect incorrect nullable ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22375 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

2018-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18142 Yea, I didn't mean it super seriously @cloud-fan - I just left a comment in case for a better documentation since I see many users go from Hive to Spark. ---

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22192 **[Test build #95862 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95862/testReport)** for PR 22192 at commit

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r216218951 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala --- @@ -156,7 +161,7 @@ private[sql]

[GitHub] spark issue #22373: [SPARK-25371][ML] VectorAssembler should not fail with e...

2018-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22373 @mgaido91, BTW are you sure SPARK-21281 introduced that behaviour change? Before: ``` scala> import org.apache.spark.sql.functions.struct import

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r216244620 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala --- @@ -155,6 +161,47 @@ class

[GitHub] spark issue #22378: [SPARK-25389][SQL] INSERT OVERWRITE DIRECTORY STORED AS ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22378 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95865/ Test FAILed. ---

[GitHub] spark issue #22357: [SPARK-25363][SQL] Fix schema pruning in where clause by...

2018-09-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22357 Thanks @dbtsai and @HyukjinKwon. Your comments are addressed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #22378: [SPARK-25389][SQL] INSERT OVERWRITE DIRECTORY STORED AS ...

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22378 **[Test build #95865 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95865/testReport)** for PR 22378 at commit

[GitHub] spark pull request #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-09-10 Thread MaxGekk
GitHub user MaxGekk opened a pull request: https://github.com/apache/spark/pull/22379 [SPARK-25393][SQL] Adding new function from_csv() ## What changes were proposed in this pull request? The PR adds new function `from_csv()` similar to `from_json()` to parse columns with

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r216255549 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala --- @@ -155,6 +161,47 @@ class

[GitHub] spark issue #22318: [SPARK-25150][SQL] Rewrite condition when deduplicate Jo...

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22318 **[Test build #95854 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95854/testReport)** for PR 22318 at commit

[GitHub] spark issue #22318: [SPARK-25150][SQL] Rewrite condition when deduplicate Jo...

2018-09-10 Thread peter-toth
Github user peter-toth commented on the issue: https://github.com/apache/spark/pull/22318 @cloud-fan this PR doesn't solve that question. There are some hacks in `Dataset.join` to handle `EqualTo` and `EqualNullSafe` with duplicated attributes and those hacks are still required

[GitHub] spark issue #22010: [SPARK-21436][CORE] Take advantage of known partitioner ...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22010 I think this works, can we post some Spark web UI screenshots to confirm the shuffle is indeed eliminated? BTW one idea to simplify the implementation: ``` def

[GitHub] spark pull request #21433: [SPARK-23820][CORE] Enable use of long form of ca...

2018-09-10 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21433#discussion_r216213289 --- Diff: core/src/main/scala/org/apache/spark/storage/RDDInfo.scala --- @@ -53,10 +55,16 @@ class RDDInfo( } private[spark] object

[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18142 > Spark SQL is designed to be compatible with the Hive Metastore, SerDes and UDFs. This is different from `Spark can run any Hive SQL`. Spark can load and use Hive UDFs, with the right

[GitHub] spark pull request #22365: [SPARK-25381][SQL] Stratified sampling by Column ...

2018-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22365#discussion_r216233066 --- Diff: python/pyspark/sql/dataframe.py --- @@ -880,18 +880,23 @@ def sampleBy(self, col, fractions, seed=None): | 0|5|

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r216256434 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala --- @@ -199,6 +209,15 @@ private[sql]

[GitHub] spark pull request #21433: [SPARK-23820][CORE] Enable use of long form of ca...

2018-09-10 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21433#discussion_r216209901 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -72,6 +72,9 @@ package object config { private[spark] val

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22192 **[Test build #95862 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95862/testReport)** for PR 22192 at commit

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22192 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22376: [SPARK-25021][K8S][BACKPORT] Add spark.executor.pyspark....

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22376 **[Test build #95855 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95855/testReport)** for PR 22376 at commit

  1   2   3   4   5   6   >