[GitHub] [spark] cloud-fan commented on issue #24696: [SPARK-27832][SQL] Don't decompress and create column batch when the task is completed

2019-05-27 Thread GitBox
cloud-fan commented on issue #24696: [SPARK-27832][SQL] Don't decompress and create column batch when the task is completed URL: https://github.com/apache/spark/pull/24696#issuecomment-496355838 > At the moment, the returned batch is also immediately closed I'm a little lost here.

[GitHub] [spark] dongjoon-hyun commented on issue #24472: [SPARK-27578][SQL] Support INTERVAL ... HOUR TO SECOND syntax

2019-05-27 Thread GitBox
dongjoon-hyun commented on issue #24472: [SPARK-27578][SQL] Support INTERVAL ... HOUR TO SECOND syntax URL: https://github.com/apache/spark/pull/24472#issuecomment-496358371 Hi, @gatorsmile and @cloud-fan . Could you give us some directional advice, please? - First, this PR wants

[GitHub] [spark] dongjoon-hyun edited a comment on issue #24711: [Minor][SS]avoid inefficient sort when getLatest in HDFSMetadataLog

2019-05-27 Thread GitBox
dongjoon-hyun edited a comment on issue #24711: [Minor][SS]avoid inefficient sort when getLatest in HDFSMetadataLog URL: https://github.com/apache/spark/pull/24711#issuecomment-496362843 Thank you for pinging me, @wenxuanguan . Please make a JIRA issue and use the ID in the PR title. This

[GitHub] [spark] dongjoon-hyun commented on issue #24711: [Minor][SS]avoid inefficient sort when getLatest in HDFSMetadataLog

2019-05-27 Thread GitBox
dongjoon-hyun commented on issue #24711: [Minor][SS]avoid inefficient sort when getLatest in HDFSMetadataLog URL: https://github.com/apache/spark/pull/24711#issuecomment-496362843 Thank you for pinging me, @wenxuanguan . Please make a JIRA issue and use the ID. This is trivial but worth

[GitHub] [spark] dongjoon-hyun closed pull request #24711: [SPARK-27859][SS] Use efficient sorting instead of `.sorted.reverse` sequence

2019-05-27 Thread GitBox
dongjoon-hyun closed pull request #24711: [SPARK-27859][SS] Use efficient sorting instead of `.sorted.reverse` sequence URL: https://github.com/apache/spark/pull/24711 This is an automated message from the Apache Git

[GitHub] [spark] SparkQA removed a comment on issue #24043: [SPARK-11412][SQL] Support merge schema for ORC

2019-05-27 Thread GitBox
SparkQA removed a comment on issue #24043: [SPARK-11412][SQL] Support merge schema for ORC URL: https://github.com/apache/spark/pull/24043#issuecomment-496340435 **[Test build #105853 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/105853/testReport)** for

[GitHub] [spark] SparkQA commented on issue #24043: [SPARK-11412][SQL] Support merge schema for ORC

2019-05-27 Thread GitBox
SparkQA commented on issue #24043: [SPARK-11412][SQL] Support merge schema for ORC URL: https://github.com/apache/spark/pull/24043#issuecomment-496369350 **[Test build #105853 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/105853/testReport)** for PR

[GitHub] [spark] AmplabJenkins removed a comment on issue #24382: [SPARK-27330][SS] support task abort in foreach writer

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24382: [SPARK-27330][SS] support task abort in foreach writer URL: https://github.com/apache/spark/pull/24382#issuecomment-483678508 Can one of the admins verify this patch?

[GitHub] [spark] HeartSaVioR commented on issue #24382: [SPARK-27330][SS] support task abort in foreach writer

2019-05-27 Thread GitBox
HeartSaVioR commented on issue #24382: [SPARK-27330][SS] support task abort in foreach writer URL: https://github.com/apache/spark/pull/24382#issuecomment-496369272 test this please

[GitHub] [spark] HyukjinKwon closed pull request #24716: [SPARK-27848][R][BUILD] AppVeyor change to latest R version (3.6.0)

2019-05-27 Thread GitBox
HyukjinKwon closed pull request #24716: [SPARK-27848][R][BUILD] AppVeyor change to latest R version (3.6.0) URL: https://github.com/apache/spark/pull/24716

[GitHub] [spark] HyukjinKwon commented on a change in pull request #24716: [SPARK-25944][R][BUILD] AppVeyor change to latest R version (3.6.0)

2019-05-27 Thread GitBox
HyukjinKwon commented on a change in pull request #24716: [SPARK-25944][R][BUILD] AppVeyor change to latest R version (3.6.0) URL: https://github.com/apache/spark/pull/24716#discussion_r287898937 ## File path: appveyor.yml ## @@ -52,6 +52,10 @@ build_script:

[GitHub] [spark] HyukjinKwon commented on a change in pull request #24700: [SPARK-27834][SQL][R][PYTHON] Make separate PySpark/SparkR vectorization configurations

2019-05-27 Thread GitBox
HyukjinKwon commented on a change in pull request #24700: [SPARK-27834][SQL][R][PYTHON] Make separate PySpark/SparkR vectorization configurations URL: https://github.com/apache/spark/pull/24700#discussion_r287899639 ## File path: docs/sql-pyspark-pandas-with-arrow.md ##

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24043: [SPARK-11412][SQL] Support merge schema for ORC

2019-05-27 Thread GitBox
dongjoon-hyun commented on a change in pull request #24043: [SPARK-11412][SQL] Support merge schema for ORC URL: https://github.com/apache/spark/pull/24043#discussion_r287910607 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcUtils.scala

[GitHub] [spark] AmplabJenkins removed a comment on issue #24043: [SPARK-11412][SQL] Support merge schema for ORC

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24043: [SPARK-11412][SQL] Support merge schema for ORC URL: https://github.com/apache/spark/pull/24043#issuecomment-496341410 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on issue #24043: [SPARK-11412][SQL] Support merge schema for ORC

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24043: [SPARK-11412][SQL] Support merge schema for ORC URL: https://github.com/apache/spark/pull/24043#issuecomment-496341405 Merged build finished. Test PASSed.

[GitHub] [spark] SparkQA commented on issue #24671: [SPARK-27811][Core][Docs]Improve docs about spark.driver.memoryOverhead and spark.executor.memoryOverhead.

2019-05-27 Thread GitBox
SparkQA commented on issue #24671: [SPARK-27811][Core][Docs]Improve docs about spark.driver.memoryOverhead and spark.executor.memoryOverhead. URL: https://github.com/apache/spark/pull/24671#issuecomment-496341756 **[Test build #105854 has

[GitHub] [spark] AmplabJenkins commented on issue #24671: [SPARK-27811][Core][Docs]Improve docs about spark.driver.memoryOverhead and spark.executor.memoryOverhead.

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24671: [SPARK-27811][Core][Docs]Improve docs about spark.driver.memoryOverhead and spark.executor.memoryOverhead. URL: https://github.com/apache/spark/pull/24671#issuecomment-496342765 Test PASSed. Refer to this link for build results (access rights to

[GitHub] [spark] AmplabJenkins removed a comment on issue #24671: [SPARK-27811][Core][Docs]Improve docs about spark.driver.memoryOverhead and spark.executor.memoryOverhead.

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24671: [SPARK-27811][Core][Docs]Improve docs about spark.driver.memoryOverhead and spark.executor.memoryOverhead. URL: https://github.com/apache/spark/pull/24671#issuecomment-496342762 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins removed a comment on issue #24671: [SPARK-27811][Core][Docs]Improve docs about spark.driver.memoryOverhead and spark.executor.memoryOverhead.

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24671: [SPARK-27811][Core][Docs]Improve docs about spark.driver.memoryOverhead and spark.executor.memoryOverhead. URL: https://github.com/apache/spark/pull/24671#issuecomment-496342765 Test PASSed. Refer to this link for build results (access

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24043: [SPARK-11412][SQL] Support merge schema for ORC

2019-05-27 Thread GitBox
dongjoon-hyun commented on a change in pull request #24043: [SPARK-11412][SQL] Support merge schema for ORC URL: https://github.com/apache/spark/pull/24043#discussion_r287911373 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcUtils.scala

[GitHub] [spark] AmplabJenkins commented on issue #24671: [SPARK-27811][Core][Docs]Improve docs about spark.driver.memoryOverhead and spark.executor.memoryOverhead.

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24671: [SPARK-27811][Core][Docs]Improve docs about spark.driver.memoryOverhead and spark.executor.memoryOverhead. URL: https://github.com/apache/spark/pull/24671#issuecomment-496342762 Merged build finished. Test PASSed.

[GitHub] [spark] SparkQA commented on issue #24717: [SPARK-27847][ML] One-Pass MultilabelMetrics & MulticlassMetrics

2019-05-27 Thread GitBox
SparkQA commented on issue #24717: [SPARK-27847][ML] One-Pass MultilabelMetrics & MulticlassMetrics URL: https://github.com/apache/spark/pull/24717#issuecomment-496346720 **[Test build #105856 has

[GitHub] [spark] dongjoon-hyun commented on issue #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types

2019-05-27 Thread GitBox
dongjoon-hyun commented on issue #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types URL: https://github.com/apache/spark/pull/24722#issuecomment-496351652 @gcmerz . What is your id in Apache JIRA? If you don't have, please create one. Then,

[GitHub] [spark] jzhuge commented on a change in pull request #24689: [SPARK-26946][SQL][FOLLOWUP] Handle lookupCatalog function not defined

2019-05-27 Thread GitBox
jzhuge commented on a change in pull request #24689: [SPARK-26946][SQL][FOLLOWUP] Handle lookupCatalog function not defined URL: https://github.com/apache/spark/pull/24689#discussion_r287922975 ## File path:

[GitHub] [spark] cloud-fan closed pull request #24569: [SPARK-23191][CORE] Warn rather than terminate when duplicate worker register happens

2019-05-27 Thread GitBox
cloud-fan closed pull request #24569: [SPARK-23191][CORE] Warn rather than terminate when duplicate worker register happens URL: https://github.com/apache/spark/pull/24569

[GitHub] [spark] AmplabJenkins removed a comment on issue #24717: [SPARK-27847][ML] One-Pass MultilabelMetrics & MulticlassMetrics

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24717: [SPARK-27847][ML] One-Pass MultilabelMetrics & MulticlassMetrics URL: https://github.com/apache/spark/pull/24717#issuecomment-496356858 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins removed a comment on issue #24717: [SPARK-27847][ML] One-Pass MultilabelMetrics & MulticlassMetrics

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24717: [SPARK-27847][ML] One-Pass MultilabelMetrics & MulticlassMetrics URL: https://github.com/apache/spark/pull/24717#issuecomment-496356864 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] dongjoon-hyun commented on issue #24711: [SPARK-27859][SS] Use efficient sorting instead of `.sorted.reverse` sequence

2019-05-27 Thread GitBox
dongjoon-hyun commented on issue #24711: [SPARK-27859][SS] Use efficient sorting instead of `.sorted.reverse` sequence URL: https://github.com/apache/spark/pull/24711#issuecomment-496364119 Merged to master.

[GitHub] [spark] dongjoon-hyun commented on issue #24711: [Minor][SS] Use efficient sorting instead of `.sorted.reverse` sequence

2019-05-27 Thread GitBox
dongjoon-hyun commented on issue #24711: [Minor][SS] Use efficient sorting instead of `.sorted.reverse` sequence URL: https://github.com/apache/spark/pull/24711#issuecomment-496363920 I'll create for you.

[GitHub] [spark] AmplabJenkins removed a comment on issue #24700: [SPARK-27834][SQL][R][PYTHON] Make separate PySpark/SparkR vectorization configurations

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24700: [SPARK-27834][SQL][R][PYTHON] Make separate PySpark/SparkR vectorization configurations URL: https://github.com/apache/spark/pull/24700#issuecomment-496328402 Test PASSed. Refer to this link for build results (access rights to CI server

[GitHub] [spark] AmplabJenkins commented on issue #24700: [SPARK-27834][SQL][R][PYTHON] Make separate PySpark/SparkR vectorization configurations

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24700: [SPARK-27834][SQL][R][PYTHON] Make separate PySpark/SparkR vectorization configurations URL: https://github.com/apache/spark/pull/24700#issuecomment-496328397 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins commented on issue #24700: [SPARK-27834][SQL][R][PYTHON] Make separate PySpark/SparkR vectorization configurations

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24700: [SPARK-27834][SQL][R][PYTHON] Make separate PySpark/SparkR vectorization configurations URL: https://github.com/apache/spark/pull/24700#issuecomment-496328402 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on issue #24700: [SPARK-27834][SQL][R][PYTHON] Make separate PySpark/SparkR vectorization configurations

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24700: [SPARK-27834][SQL][R][PYTHON] Make separate PySpark/SparkR vectorization configurations URL: https://github.com/apache/spark/pull/24700#issuecomment-496328397 Merged build finished. Test PASSed.

[GitHub] [spark] WangGuangxin closed pull request #24175: [SPARK-27232][SQL]Ignore file locality in InMemoryFileIndex if spark.locality.wait is set to zero

2019-05-27 Thread GitBox
WangGuangxin closed pull request #24175: [SPARK-27232][SQL]Ignore file locality in InMemoryFileIndex if spark.locality.wait is set to zero URL: https://github.com/apache/spark/pull/24175

[GitHub] [spark] WangGuangxin commented on issue #24175: [SPARK-27232][SQL]Ignore file locality in InMemoryFileIndex if spark.locality.wait is set to zero

2019-05-27 Thread GitBox
WangGuangxin commented on issue #24175: [SPARK-27232][SQL]Ignore file locality in InMemoryFileIndex if spark.locality.wait is set to zero URL: https://github.com/apache/spark/pull/24175#issuecomment-496339014 Close this since there is a better solution

[GitHub] [spark] gcmerz commented on issue #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types

2019-05-27 Thread GitBox
gcmerz commented on issue #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types URL: https://github.com/apache/spark/pull/24722#issuecomment-496344189 Applied the tweaks--thank you so much for the quick review!

[GitHub] [spark] AmplabJenkins commented on issue #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types URL: https://github.com/apache/spark/pull/24722#issuecomment-496344050 Test PASSed. Refer to this link for build results (access rights to CI server

[GitHub] [spark] AmplabJenkins commented on issue #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types URL: https://github.com/apache/spark/pull/24722#issuecomment-496344048 Merged build finished. Test PASSed.

[GitHub] [spark] gcmerz commented on a change in pull request #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types

2019-05-27 Thread GitBox
gcmerz commented on a change in pull request #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types URL: https://github.com/apache/spark/pull/24722#discussion_r287912367 ## File path:

[GitHub] [spark] AmplabJenkins removed a comment on issue #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types URL: https://github.com/apache/spark/pull/24722#issuecomment-496344048 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins removed a comment on issue #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types URL: https://github.com/apache/spark/pull/24722#issuecomment-496344050 Test PASSed. Refer to this link for build results (access rights to CI

[GitHub] [spark] gcmerz commented on a change in pull request #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types

2019-05-27 Thread GitBox
gcmerz commented on a change in pull request #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types URL: https://github.com/apache/spark/pull/24722#discussion_r287912404 ## File path:

[GitHub] [spark] dongjoon-hyun closed pull request #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types

2019-05-27 Thread GitBox
dongjoon-hyun closed pull request #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types URL: https://github.com/apache/spark/pull/24722

[GitHub] [spark] SparkQA removed a comment on issue #24717: [SPARK-27847][ML] One-Pass MultilabelMetrics & MulticlassMetrics

2019-05-27 Thread GitBox
SparkQA removed a comment on issue #24717: [SPARK-27847][ML] One-Pass MultilabelMetrics & MulticlassMetrics URL: https://github.com/apache/spark/pull/24717#issuecomment-496346720 **[Test build #105856 has

[GitHub] [spark] AmplabJenkins commented on issue #24717: [SPARK-27847][ML] One-Pass MultilabelMetrics & MulticlassMetrics

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24717: [SPARK-27847][ML] One-Pass MultilabelMetrics & MulticlassMetrics URL: https://github.com/apache/spark/pull/24717#issuecomment-496356864 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on issue #24717: [SPARK-27847][ML] One-Pass MultilabelMetrics & MulticlassMetrics

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24717: [SPARK-27847][ML] One-Pass MultilabelMetrics & MulticlassMetrics URL: https://github.com/apache/spark/pull/24717#issuecomment-496356858 Merged build finished. Test PASSed.

[GitHub] [spark] cloud-fan commented on issue #24569: [SPARK-23191][CORE] Warn rather than terminate when duplicate worker register happens

2019-05-27 Thread GitBox
cloud-fan commented on issue #24569: [SPARK-23191][CORE] Warn rather than terminate when duplicate worker register happens URL: https://github.com/apache/spark/pull/24569#issuecomment-496356630 thanks, merging to master!

[GitHub] [spark] jzhuge commented on issue #24689: [SPARK-26946][SQL][FOLLOWUP] Handle lookupCatalog function not defined

2019-05-27 Thread GitBox
jzhuge commented on issue #24689: [SPARK-26946][SQL][FOLLOWUP] Handle lookupCatalog function not defined URL: https://github.com/apache/spark/pull/24689#issuecomment-496329314 Well said @rdblue. I will proceed with the requirement of a lookup function. Since the interface is not widely

[GitHub] [spark] jzhuge edited a comment on issue #24689: [SPARK-26946][SQL][FOLLOWUP] Handle lookupCatalog function not defined

2019-05-27 Thread GitBox
jzhuge edited a comment on issue #24689: [SPARK-26946][SQL][FOLLOWUP] Handle lookupCatalog function not defined URL: https://github.com/apache/spark/pull/24689#issuecomment-496329314 Well said @rdblue. I will proceed with the requirement of a lookup function. Since the interface is not

[GitHub] [spark] BestOreo edited a comment on issue #24720: [SPARK-27852][Spark Core] updateBytesWritten() operaton is missed

2019-05-27 Thread GitBox
BestOreo edited a comment on issue #24720: [SPARK-27852][Spark Core] updateBytesWritten() operaton is missed URL: https://github.com/apache/spark/pull/24720#issuecomment-496335936 > Look at `recordWritten()`. `recordWritten()` is never used in `override def write(kvBytes:

[GitHub] [spark] lipzhu commented on issue #24472: [SPARK-27578][SQL] Add support for "interval '23:59:59' hour to second"

2019-05-27 Thread GitBox
lipzhu commented on issue #24472: [SPARK-27578][SQL] Add support for "interval '23:59:59' hour to second" URL: https://github.com/apache/spark/pull/24472#issuecomment-496339787 @dongjoon-hyun Thanks for your update to remove the duplicate codes.

[GitHub] [spark] AmplabJenkins commented on issue #24472: [SPARK-27578][SQL] Add support for "interval '23:59:59' hour to second"

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24472: [SPARK-27578][SQL] Add support for "interval '23:59:59' hour to second" URL: https://github.com/apache/spark/pull/24472#issuecomment-496340128 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] dongjoon-hyun commented on issue #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types

2019-05-27 Thread GitBox
dongjoon-hyun commented on issue #24722: [SPARK-27858][SQL] Fix for avro deserialization on union types with multiple non-null types URL: https://github.com/apache/spark/pull/24722#issuecomment-496348246 You're welcome. Thank you for swift update.

[GitHub] [spark] zhengruifeng commented on issue #14325: [SPARK-16692] [ML] Add multi label classification evaluator, DataFrame

2019-05-27 Thread GitBox
zhengruifeng commented on issue #14325: [SPARK-16692] [ML] Add multi label classification evaluator, DataFrame URL: https://github.com/apache/spark/pull/14325#issuecomment-496350955 What's the progress now? @liwzhi @WeichenXu123 @srowen If @liwzhi are not working on this, can I take it

[GitHub] [spark] dongjoon-hyun edited a comment on issue #24472: [SPARK-27578][SQL] Support INTERVAL ... HOUR TO SECOND syntax

2019-05-27 Thread GitBox
dongjoon-hyun edited a comment on issue #24472: [SPARK-27578][SQL] Support INTERVAL ... HOUR TO SECOND syntax URL: https://github.com/apache/spark/pull/24472#issuecomment-496358371 Hi, @gatorsmile and @cloud-fan . Could you give us some directional advice, please? - First, this

[GitHub] [spark] dongjoon-hyun commented on issue #24724: User friendly dataset, dataframe generation for csv datasources without explicit StructType definitions.

2019-05-27 Thread GitBox
dongjoon-hyun commented on issue #24724: User friendly dataset, dataframe generation for csv datasources without explicit StructType definitions. URL: https://github.com/apache/spark/pull/24724#issuecomment-496365550 Hi, @swapnilushinde . Thank you for making a PR, but do you the

[GitHub] [spark] swapnilushinde edited a comment on issue #24724: User friendly dataset, dataframe generation for csv datasources without explicit StructType definitions.

2019-05-27 Thread GitBox
swapnilushinde edited a comment on issue #24724: User friendly dataset, dataframe generation for csv datasources without explicit StructType definitions. URL: https://github.com/apache/spark/pull/24724#issuecomment-496367606 Hi, @dongjoon-hyun Thanks for reply. Yes, I use this API

[GitHub] [spark] HyukjinKwon commented on issue #24716: [SPARK-25944][R][BUILD] AppVeyor change to latest R version (3.6.0)

2019-05-27 Thread GitBox
HyukjinKwon commented on issue #24716: [SPARK-25944][R][BUILD] AppVeyor change to latest R version (3.6.0) URL: https://github.com/apache/spark/pull/24716#issuecomment-496368610 Oops, thanks

[GitHub] [spark] HyukjinKwon commented on issue #24716: [SPARK-25944][R][BUILD] AppVeyor change to latest R version (3.6.0)

2019-05-27 Thread GitBox
HyukjinKwon commented on issue #24716: [SPARK-25944][R][BUILD] AppVeyor change to latest R version (3.6.0) URL: https://github.com/apache/spark/pull/24716#issuecomment-496372569 Merged to master.

[GitHub] [spark] dongjoon-hyun commented on issue #24711: [SPARK-27859][SS] Use efficient sorting instead of `.sorted.reverse` sequence

2019-05-27 Thread GitBox
dongjoon-hyun commented on issue #24711: [SPARK-27859][SS] Use efficient sorting instead of `.sorted.reverse` sequence URL: https://github.com/apache/spark/pull/24711#issuecomment-496374107 You're welcome, @wenxuanguan .

[GitHub] [spark] wenxuanguan commented on issue #24711: [Minor][SS]avoid inefficient sort when getLatest in HDFSMetadataLog

2019-05-27 Thread GitBox
wenxuanguan commented on issue #24711: [Minor][SS]avoid inefficient sort when getLatest in HDFSMetadataLog URL: https://github.com/apache/spark/pull/24711#issuecomment-496091191 retest this please.

[GitHub] [spark] SparkQA commented on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-05-27 Thread GitBox
SparkQA commented on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion URL: https://github.com/apache/spark/pull/24068#issuecomment-496099261 **[Test build #105816 has

[GitHub] [spark] wenxuanguan commented on issue #22282: [SPARK-23539][SS] Add support for Kafka headers in Structured Streaming

2019-05-27 Thread GitBox
wenxuanguan commented on issue #22282: [SPARK-23539][SS] Add support for Kafka headers in Structured Streaming URL: https://github.com/apache/spark/pull/22282#issuecomment-496102490 since this pr is out of date with base branch, please update or close it.

[GitHub] [spark] SparkQA removed a comment on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-05-27 Thread GitBox
SparkQA removed a comment on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion URL: https://github.com/apache/spark/pull/24068#issuecomment-496099261 **[Test build #105816 has

[GitHub] [spark] AmplabJenkins removed a comment on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion URL: https://github.com/apache/spark/pull/24068#issuecomment-496103194 Test FAILed. Refer to this link for build results (access rights to CI server

[GitHub] [spark] AmplabJenkins removed a comment on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion URL: https://github.com/apache/spark/pull/24068#issuecomment-496103185 Build finished. Test FAILed.

[GitHub] [spark] SparkQA commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module

2019-05-27 Thread GitBox
SparkQA commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module URL: https://github.com/apache/spark/pull/24416#issuecomment-496105017 **[Test build #105818 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/105818/testReport)**

[GitHub] [spark] AmplabJenkins commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module URL: https://github.com/apache/spark/pull/24416#issuecomment-496106508 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins removed a comment on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module URL: https://github.com/apache/spark/pull/24416#issuecomment-496106515 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module URL: https://github.com/apache/spark/pull/24416#issuecomment-496106515 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module URL: https://github.com/apache/spark/pull/24416#issuecomment-496106508 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins removed a comment on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module URL: https://github.com/apache/spark/pull/24416#issuecomment-496106055 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module URL: https://github.com/apache/spark/pull/24416#issuecomment-496106049 Merged build finished. Test FAILed.

[GitHub] [spark] SparkQA removed a comment on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module

2019-05-27 Thread GitBox
SparkQA removed a comment on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module URL: https://github.com/apache/spark/pull/24416#issuecomment-496105017 **[Test build #105818 has

[GitHub] [spark] AmplabJenkins commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module URL: https://github.com/apache/spark/pull/24416#issuecomment-496108080 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module URL: https://github.com/apache/spark/pull/24416#issuecomment-496108088 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] SparkQA commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module

2019-05-27 Thread GitBox
SparkQA commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module URL: https://github.com/apache/spark/pull/24416#issuecomment-496108070 **[Test build #105819 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/105819/testReport)**

[GitHub] [spark] jzhuge opened a new pull request #24718: [SPARK-27322][SQL] DataSourceV2: Select from multiple catalogs

2019-05-27 Thread GitBox
jzhuge opened a new pull request #24718: [SPARK-27322][SQL] DataSourceV2: Select from multiple catalogs URL: https://github.com/apache/spark/pull/24718 ## What changes were proposed in this pull request? Support multi-catalog in the following SELECT code paths: - SELECT *

[GitHub] [spark] AmplabJenkins commented on issue #24718: [SPARK-27322][SQL] DataSourceV2: Select from multiple catalogs

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24718: [SPARK-27322][SQL] DataSourceV2: Select from multiple catalogs URL: https://github.com/apache/spark/pull/24718#issuecomment-496110779 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on issue #24718: [SPARK-27322][SQL] DataSourceV2: Select from multiple catalogs

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24718: [SPARK-27322][SQL] DataSourceV2: Select from multiple catalogs URL: https://github.com/apache/spark/pull/24718#issuecomment-496110774 Merged build finished. Test PASSed. This is an

[GitHub] [spark] cloud-fan commented on a change in pull request #24685: [SPARK-27814][SQL] The cast operation for partition key may push down uncorrect filter, which is fatal.

2019-05-27 Thread GitBox
cloud-fan commented on a change in pull request #24685: [SPARK-27814][SQL] The cast operation for partition key may push down uncorrect filter, which is fatal. URL: https://github.com/apache/spark/pull/24685#discussion_r287676514 ## File path:

[GitHub] [spark] AmplabJenkins removed a comment on issue #24718: [SPARK-27322][SQL] DataSourceV2: Select from multiple catalogs

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24718: [SPARK-27322][SQL] DataSourceV2: Select from multiple catalogs URL: https://github.com/apache/spark/pull/24718#issuecomment-496110779 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on issue #24718: [SPARK-27322][SQL] DataSourceV2: Select from multiple catalogs

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24718: [SPARK-27322][SQL] DataSourceV2: Select from multiple catalogs URL: https://github.com/apache/spark/pull/24718#issuecomment-496110774 Merged build finished. Test PASSed. This

[GitHub] [spark] AmplabJenkins commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module URL: https://github.com/apache/spark/pull/24416#issuecomment-496112876 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on issue #24719: [SPARK-27849][SQL] Redact treeString of FileTable and DataSourceV2ScanExecBase

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24719: [SPARK-27849][SQL] Redact treeString of FileTable and DataSourceV2ScanExecBase URL: https://github.com/apache/spark/pull/24719#issuecomment-496112851 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24416: [SPARK-27521][SQL] move data source v2 to catalyst module URL: https://github.com/apache/spark/pull/24416#issuecomment-496112868 Merged build finished. Test PASSed. This is an

[GitHub] [spark] AmplabJenkins commented on issue #24719: [SPARK-27849][SQL] Redact treeString of FileTable and DataSourceV2ScanExecBase

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24719: [SPARK-27849][SQL] Redact treeString of FileTable and DataSourceV2ScanExecBase URL: https://github.com/apache/spark/pull/24719#issuecomment-496112854 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] cloud-fan commented on issue #24699: [SPARK-27666][CORE] Stop PythonRunner's WriteThread immediately when task finishes

2019-05-27 Thread GitBox
cloud-fan commented on issue #24699: [SPARK-27666][CORE] Stop PythonRunner's WriteThread immediately when task finishes URL: https://github.com/apache/spark/pull/24699#issuecomment-496088487 After a second thought, it seems overkill to block the main thread and wait for the python writer

[GitHub] [spark] breakdawn commented on a change in pull request #24684: [SPARK-27657][ML]Fix the log format of ml.util.Instrumentation.logFai…

2019-05-27 Thread GitBox
breakdawn commented on a change in pull request #24684: [SPARK-27657][ML]Fix the log format of ml.util.Instrumentation.logFai… URL: https://github.com/apache/spark/pull/24684#discussion_r287660742 ## File path: mllib/src/main/scala/org/apache/spark/ml/util/Instrumentation.scala

[GitHub] [spark] cloud-fan commented on a change in pull request #24653: [SPARK-27783][SQL] Add customizable hint error handler

2019-05-27 Thread GitBox
cloud-fan commented on a change in pull request #24653: [SPARK-27783][SQL] Add customizable hint error handler URL: https://github.com/apache/spark/pull/24653#discussion_r287661852 ## File path:

[GitHub] [spark] cloud-fan commented on a change in pull request #24653: [SPARK-27783][SQL] Add customizable hint error handler

2019-05-27 Thread GitBox
cloud-fan commented on a change in pull request #24653: [SPARK-27783][SQL] Add customizable hint error handler URL: https://github.com/apache/spark/pull/24653#discussion_r287661778 ## File path:

[GitHub] [spark] ivoson commented on a change in pull request #24537: [SPARK-23887][SS] continuous query progress reporting

2019-05-27 Thread GitBox
ivoson commented on a change in pull request #24537: [SPARK-23887][SS] continuous query progress reporting URL: https://github.com/apache/spark/pull/24537#discussion_r287664306 ## File path:

[GitHub] [spark] ivoson commented on a change in pull request #24537: [SPARK-23887][SS] continuous query progress reporting

2019-05-27 Thread GitBox
ivoson commented on a change in pull request #24537: [SPARK-23887][SS] continuous query progress reporting URL: https://github.com/apache/spark/pull/24537#discussion_r287658210 ## File path:

[GitHub] [spark] ivoson commented on a change in pull request #24537: [SPARK-23887][SS] continuous query progress reporting

2019-05-27 Thread GitBox
ivoson commented on a change in pull request #24537: [SPARK-23887][SS] continuous query progress reporting URL: https://github.com/apache/spark/pull/24537#discussion_r287659997 ## File path:

[GitHub] [spark] AmplabJenkins removed a comment on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion URL: https://github.com/apache/spark/pull/24068#issuecomment-496098818 Build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins removed a comment on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-05-27 Thread GitBox
AmplabJenkins removed a comment on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion URL: https://github.com/apache/spark/pull/24068#issuecomment-496098822 Test PASSed. Refer to this link for build results (access rights to CI server

[GitHub] [spark] AmplabJenkins commented on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion URL: https://github.com/apache/spark/pull/24068#issuecomment-496098822 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-05-27 Thread GitBox
AmplabJenkins commented on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion URL: https://github.com/apache/spark/pull/24068#issuecomment-496098818 Build finished. Test PASSed.

[GitHub] [spark] wenxuanguan edited a comment on issue #22282: [SPARK-23539][SS] Add support for Kafka headers in Structured Streaming

2019-05-27 Thread GitBox
wenxuanguan edited a comment on issue #22282: [SPARK-23539][SS] Add support for Kafka headers in Structured Streaming URL: https://github.com/apache/spark/pull/22282#issuecomment-496102490 Since this PR is out of date with the base branch and has not been updated for two months, please update or close

[GitHub] [spark] SparkQA commented on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-05-27 Thread GitBox
SparkQA commented on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion URL: https://github.com/apache/spark/pull/24068#issuecomment-496103167 **[Test build #105816 has
