date:20190814

[GitHub] [spark] SparkQA commented on issue #24486: [SPARK-27592][SQL] Set the bucketed data source table SerDe correctly

2019-08-14 Thread GitBox

SparkQA commented on issue #24486: [SPARK-27592][SQL] Set the bucketed data 
source table SerDe correctly
URL: https://github.com/apache/spark/pull/24486#issuecomment-521537900
 
 
   **[Test build #109147 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109147/testReport)**
 for PR 24486 at commit 
[`842bd3e`](https://github.com/apache/spark/commit/842bd3ec57a33093a5f47ceb38016ebabf9503e1).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] 19kka commented on issue #25450: [SPARK-23793][SQL]Handle database names in spark.udf.register()

2019-08-14 Thread GitBox

19kka commented on issue #25450: [SPARK-23793][SQL]Handle database names in 
spark.udf.register()
URL: https://github.com/apache/spark/pull/25450#issuecomment-521537674
 
 
   > Thank you for your first contribution, @19kka. Could you run 
`dev/scalastyle` and fix the errors? I saw some violations like 
[this](https://github.com/apache/spark/pull/25450/files#diff-85fdb913077429ac8e211a3c68375994L24)
 here.
   
   I'm very sorry that I forgot to check the style. I have now fixed the style 
errors and added a UDFSuite test.
   
   I read the related register code again and realized that 
`spark.udf.register()` is responsible for **Create Temp Function**, so I 
modified the code: if the function name passed to `spark.udf.register()` 
includes a **database** name, it throws an AnalysisException.
   
   e.g.
   
   ```scala 
   spark.udf.register("db.fun1", (x: Long) => x + 1)
   // throws an AnalysisException
   ```
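
The behavior described above could be sketched roughly as follows. This is not the code from the PR: `QualifiedNameException` and `TempFunctionNames.validate` are hypothetical names standing in for Spark's `AnalysisException` and the registration path, and treating any `.` in the name as a database qualifier is an assumption made to keep the example free of Spark dependencies.

```scala
// Hypothetical stand-in for Spark's AnalysisException, so the sketch runs without Spark.
final class QualifiedNameException(message: String) extends Exception(message)

object TempFunctionNames {
  // Assumption: a "." in the name means it carries a database qualifier ("db.fun1").
  // Temporary functions are not tied to a database, so such names are rejected.
  def validate(name: String): String = {
    if (name.contains(".")) {
      throw new QualifiedNameException(
        s"Temporary function name '$name' must not include a database name")
    }
    name
  }
}
```

An unqualified name such as `fun1` is returned unchanged, while `db.fun1` raises the exception.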



[GitHub] [spark] AmplabJenkins removed a comment on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25460: [SPARK-25474][SQL][Followup] 
fallback to hdfs when relation table stats is not available
URL: https://github.com/apache/spark/pull/25460#issuecomment-521535225
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] SparkQA commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available

2019-08-14 Thread GitBox

SparkQA commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to 
hdfs when relation table stats is not available
URL: https://github.com/apache/spark/pull/25460#issuecomment-521535973
 
 
   **[Test build #109146 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109146/testReport)**
 for PR 25460 at commit 
[`c7edcb4`](https://github.com/apache/spark/commit/c7edcb4f89e57332f24a7f4994c7e762eecf12df).



[GitHub] [spark] AmplabJenkins removed a comment on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25460: [SPARK-25474][SQL][Followup] 
fallback to hdfs when relation table stats is not available
URL: https://github.com/apache/spark/pull/25460#issuecomment-521535442
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25460: [SPARK-25474][SQL][Followup] 
fallback to hdfs when relation table stats is not available
URL: https://github.com/apache/spark/pull/25460#issuecomment-521535447
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14215/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25460: [SPARK-25474][SQL][Followup] fallback 
to hdfs when relation table stats is not available
URL: https://github.com/apache/spark/pull/25460#issuecomment-521535442
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25460: [SPARK-25474][SQL][Followup] fallback 
to hdfs when relation table stats is not available
URL: https://github.com/apache/spark/pull/25460#issuecomment-521535447
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14215/
   Test PASSed.



[GitHub] [spark] cloud-fan closed pull request #25418: [SPARK-28695][SS] Use CaseInsensitiveMap in KafkaSourceProvider to make source param handling more robust

2019-08-14 Thread GitBox

cloud-fan closed pull request #25418: [SPARK-28695][SS] Use CaseInsensitiveMap 
in KafkaSourceProvider to make source param handling more robust
URL: https://github.com/apache/spark/pull/25418
 
 
   



[GitHub] [spark] shahidki31 commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables

2019-08-14 Thread GitBox

shahidki31 commented on a change in pull request #22502: [SPARK-25474][SQL] 
Support `spark.sql.statistics.fallBackToHdfs` in data source tables
URL: https://github.com/apache/spark/pull/22502#discussion_r314191902
 
 

 ##
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelation.scala
 ##
 @@ -71,7 +70,13 @@ case class HadoopFsRelation(
 
   override def sizeInBytes: Long = {
 val compressionFactor = sqlContext.conf.fileCompressionFactor
-(location.sizeInBytes * compressionFactor).toLong
+val defaultSize = (location.sizeInBytes * compressionFactor).toLong
+location match {
+  case cfi: CatalogFileIndex if sparkSession.sessionState.conf.fallBackToHdfsForStatsEnabled =>
 
 Review comment:
   @cloud-fan I have created a followup PR to add the extra condition. Thanks



[GitHub] [spark] AmplabJenkins commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25460: [SPARK-25474][SQL][Followup] fallback 
to hdfs when relation table stats is not available
URL: https://github.com/apache/spark/pull/25460#issuecomment-521535225
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] shahidki31 commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available

2019-08-14 Thread GitBox

shahidki31 commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to 
hdfs when relation table stats is not available
URL: https://github.com/apache/spark/pull/25460#issuecomment-521534572
 
 
   cc @cloud-fan @dongjoon-hyun 



[GitHub] [spark] dilipbiswal edited a comment on issue #25448: [SPARK-28697][SQL] Invalidate Database/Table names starting with underscore

2019-08-14 Thread GitBox

dilipbiswal edited a comment on issue #25448: [SPARK-28697][SQL] Invalidate 
Database/Table names starting with underscore
URL: https://github.com/apache/spark/pull/25448#issuecomment-521533557
 
 
   @cloud-fan @dongjoon-hyun @HyukjinKwon 
   Was just checking the DB2 definition of an identifier in 
[link](https://www.ibm.com/support/knowledgecenter/en/SSEPGG_9.7.0/com.ibm.db2.luw.sql.ref.doc/doc/r720.html)
   
   It's defined as follows:
   ```
   An ordinary identifier is an uppercase letter followed by zero or more 
characters, each of which is an uppercase letter, a digit, or the underscore 
character. Note that lower case letters can be used when specifying an ordinary 
identifier, but they are converted to uppercase when processed. An ordinary 
identifier should not be a reserved word.
   ```
   
   Hive seems to have allowed a digit as the first character as well.
   
   ```
   Identifier
   :
   (Letter | Digit) (Letter | Digit | '_')*
   | {allowQuotedId()}? QuotedIdentifier  /* though at the language level we allow all Identifiers to be QuotedIdentifiers;
                                             at the API level only columns are allowed to be of this form */
   | '`' RegexComponent+ '`'
   ;
   ```
   
   Not sure why Spark allowed "_" as a starting character to begin with. Is it 
to match some other system?
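
To make the difference between the two quoted definitions concrete, here is a minimal sketch that encodes each as a regular expression. `IdentifierRules` and its method names are hypothetical, the sketch ignores reserved words and quoted identifiers, and it assumes that `Letter` and `Digit` in the Hive grammar mean ASCII letters and digits.

```scala
object IdentifierRules {
  // DB2 ordinary identifier as quoted above: a letter, then letters, digits, or '_'.
  // Lowercase is accepted because DB2 converts it to uppercase when processing.
  private val Db2Ordinary = "[A-Za-z][A-Za-z0-9_]*".r

  // First alternative of the Hive rule quoted above:
  // (Letter | Digit) (Letter | Digit | '_')*
  // Assumption: Letter is ASCII a-z/A-Z and Digit is 0-9.
  private val HiveUnquoted = "[A-Za-z0-9][A-Za-z0-9_]*".r

  def isDb2Ordinary(name: String): Boolean =
    Db2Ordinary.pattern.matcher(name).matches()

  def isHiveUnquoted(name: String): Boolean =
    HiveUnquoted.pattern.matcher(name).matches()
}
```

Under these two patterns a name such as `1table` is accepted only by the Hive rule, and a name starting with `_` is rejected by both, which is the case the question above is about.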



[GitHub] [spark] AmplabJenkins removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' 
and 'off' as input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#issuecomment-521533643
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109139/
   Test FAILed.



[GitHub] [spark] shahidki31 opened a new pull request #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available

2019-08-14 Thread GitBox

shahidki31 opened a new pull request #25460: [SPARK-25474][SQL][Followup] 
fallback to hdfs when relation table stats is not available
URL: https://github.com/apache/spark/pull/25460
 
 
   …ts not available
   
   ## What changes were proposed in this pull request?
   When the table relation stats are not empty, do not fall back to HDFS for 
size estimation.
   
   ## How was this patch tested?
   
   Existing tests
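
One way to read the rule stated in this description is sketched below. It is only an illustration under stated assumptions, not the patch itself: `SizeEstimate`, `catalogStats`, `hdfsSize` and `defaultSize` are hypothetical names, and the real change lives in Spark's relation and statistics code.

```scala
object SizeEstimate {
  // catalogStats:   size recorded in the table's statistics, if any (hypothetical input)
  // fallBackToHdfs: value of the spark.sql.statistics.fallBackToHdfs setting
  // hdfsSize:       size computed from the file system, evaluated only when needed
  // defaultSize:    estimate used when neither source applies
  def sizeInBytes(
      catalogStats: Option[Long],
      fallBackToHdfs: Boolean,
      hdfsSize: => Long,
      defaultSize: Long): Long =
    catalogStats match {
      case Some(size)             => size        // stats present: no HDFS fallback
      case None if fallBackToHdfs => hdfsSize    // stats missing: fall back to HDFS
      case None                   => defaultSize // fallback disabled
    }
}
```

Passing `hdfsSize` by name keeps the potentially expensive file-system scan from running when the statistics are already available.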



[GitHub] [spark] dilipbiswal edited a comment on issue #25448: [SPARK-28697][SQL] Invalidate Database/Table names starting with underscore

2019-08-14 Thread GitBox

dilipbiswal edited a comment on issue #25448: [SPARK-28697][SQL] Invalidate 
Database/Table names starting with underscore
URL: https://github.com/apache/spark/pull/25448#issuecomment-521533557
 
 
   @cloud-fan @dongjoon-hyun 
   Was just checking the DB2 definition of an identifier in 
[link](https://www.ibm.com/support/knowledgecenter/en/SSEPGG_9.7.0/com.ibm.db2.luw.sql.ref.doc/doc/r720.html)
   
   It's defined as follows:
   ```
   An ordinary identifier is an uppercase letter followed by zero or more 
characters, each of which is an uppercase letter, a digit, or the underscore 
character. Note that lower case letters can be used when specifying an ordinary 
identifier, but they are converted to uppercase when processed. An ordinary 
identifier should not be a reserved word.
   ```
   
   Hive seems to have allowed a digit as the first character as well.
   
   ```
   Identifier
   :
   (Letter | Digit) (Letter | Digit | '_')*
   | {allowQuotedId()}? QuotedIdentifier  /* though at the language level we allow all Identifiers to be QuotedIdentifiers;
                                             at the API level only columns are allowed to be of this form */
   | '`' RegexComponent+ '`'
   ;
   ```
   
   Not sure why Spark allowed "_" as a starting character to begin with. Is it 
to match some other system?



[GitHub] [spark] cloud-fan commented on issue #25418: [SPARK-28695][SS] Use CaseInsensitiveMap in KafkaSourceProvider to make source param handling more robust

2019-08-14 Thread GitBox

cloud-fan commented on issue #25418: [SPARK-28695][SS] Use CaseInsensitiveMap 
in KafkaSourceProvider to make source param handling more robust
URL: https://github.com/apache/spark/pull/25418#issuecomment-521534340
 
 
   thanks, merging to master!



[GitHub] [spark] AmplabJenkins removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' 
and 'off' as input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#issuecomment-521533637
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] dilipbiswal commented on issue #25448: [SPARK-28697][SQL] Invalidate Database/Table names starting with underscore

2019-08-14 Thread GitBox

dilipbiswal commented on issue #25448: [SPARK-28697][SQL] Invalidate 
Database/Table names starting with underscore
URL: https://github.com/apache/spark/pull/25448#issuecomment-521533557
 
 
   @cloud-fan @dongjoon-hyun 
   Was just checking the DB2 definition of an identifier in 
[link](https://www.ibm.com/support/knowledgecenter/en/SSEPGG_9.7.0/com.ibm.db2.luw.sql.ref.doc/doc/r720.html)
   
   It's defined as follows:
   ```
   An ordinary identifier is an uppercase letter followed by zero or more 
characters, each of which is an uppercase letter, a digit, or the underscore 
character. Note that lower case letters can be used when specifying an ordinary 
identifier, but they are converted to uppercase when processed. An ordinary 
identifier should not be a reserved word.
   ```
   
   Hive seems to have allowed a digit as the first character as well.
   
   ```
   Identifier
   :
   (Letter | Digit) (Letter | Digit | '_')*
   | {allowQuotedId()}? QuotedIdentifier  /* though at the language level we allow all Identifiers to be QuotedIdentifiers;
                                             at the API level only columns are allowed to be of this form */
   | '`' RegexComponent+ '`'
   ;
   ```
   
   Not sure why Spark allowed "_" as a starting character to begin with. Is it 
to match some other system?



[GitHub] [spark] AmplabJenkins commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 
'off' as input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#issuecomment-521533643
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109139/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 
'off' as input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#issuecomment-521533637
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-14 Thread GitBox

SparkQA removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' and 
'off' as input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#issuecomment-521506392
 
 
   **[Test build #109139 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109139/testReport)**
 for PR 25458 at commit 
[`7d61642`](https://github.com/apache/spark/commit/7d61642860125ff8049578507b6b1143eacad88b).



[GitHub] [spark] SparkQA commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-14 Thread GitBox

SparkQA commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as 
input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#issuecomment-521533431
 
 
   **[Test build #109139 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109139/testReport)**
 for PR 25458 at commit 
[`7d61642`](https://github.com/apache/spark/commit/7d61642860125ff8049578507b6b1143eacad88b).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] MaxGekk commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates

2019-08-14 Thread GitBox

MaxGekk commented on a change in pull request #25410: [SPARK-28690][SQL] Add 
`date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r314189408
 
 

 ##
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
 ##
 @@ -1963,3 +1963,64 @@ case class Epoch(child: Expression, timeZoneId: Option[String] = None)
 defineCodeGen(ctx, ev, c => s"$dtu.getEpoch($c, $zid)")
   }
 }
+
+@ExpressionDescription(
+  usage = "_FUNC_(field, source) - Extracts a part of the date/timestamp.",
+  arguments = """
+Arguments:
+  * field - selects which part of the source should be extracted. Supported string values are:
+["MILLENNIUM", "CENTURY", "DECADE", "YEAR", "QUARTER", "MONTH",
+ "WEEK", "DAY", "DAYOFWEEK", "DOW", "ISODOW", "DOY",
+ "HOUR", "MINUTE", "SECOND"]
+  * source - a date (or timestamp) column from where `field` should be extracted
+  """,
+  examples = """
+Examples:
+  > SELECT _FUNC_('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
+   2019
+  > SELECT _FUNC_('week', timestamp'2019-08-12 01:00:00.123456');
+   33
+  > SELECT _FUNC_('doy', DATE'2019-08-12');
+   224
+  """,
+  since = "3.0.0")
+case class DatePart(field: Expression, source: Expression, child: Expression)
+  extends RuntimeReplaceable {
+
+  def this(field: Expression, source: Expression) {
+this(field, source, {
+  if (!field.foldable) {
+throw new AnalysisException("The field parameter needs to be a foldable string value.")
 
 Review comment:
   According to PostgreSQL docs 
https://www.postgresql.org/docs/11/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT:
   
   >_source_ must be a value expression ...  the _**field**_ parameter 
needs to be **a string value**
   
   Accepting _field_ as an expression is an undocumented feature. We could 
support that separately if it is needed.
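
The idea of selecting the extraction by a constant field name can be sketched in plain Scala as follows. This is an illustration under assumptions, not the `DatePart` expression from the PR: `DatePartSketch.datePart` is a hypothetical helper, it works on `java.time.LocalDateTime` rather than Spark's internal timestamps, it covers only a few of the listed field names, and it assumes `WEEK` means the ISO week number.

```scala
import java.time.LocalDateTime
import java.time.temporal.IsoFields
import java.util.Locale

object DatePartSketch {
  // The field name is matched case-insensitively; unknown names are rejected.
  def datePart(field: String, source: LocalDateTime): Int =
    field.toUpperCase(Locale.ROOT) match {
      case "YEAR"    => source.getYear
      case "QUARTER" => source.get(IsoFields.QUARTER_OF_YEAR)
      case "MONTH"   => source.getMonthValue
      case "WEEK"    => source.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR) // assumed ISO week
      case "DAY"     => source.getDayOfMonth
      case "DOY"     => source.getDayOfYear
      case "HOUR"    => source.getHour
      case "MINUTE"  => source.getMinute
      case other     =>
        throw new IllegalArgumentException(s"Unsupported field: $other")
    }
}
```

For the timestamp `2019-08-12 01:00:00.123456` used in the examples of the diff above, the sketch yields 2019 for `YEAR`, 33 for `week` and 224 for `doy`. Because the field name is a plain string here, the match is resolved once per call; requiring a foldable field in the real expression serves the same purpose at analysis time.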



[GitHub] [spark] SparkQA commented on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates

2019-08-14 Thread GitBox

SparkQA commented on issue #25410: [SPARK-28690][SQL] Add `date_part` function 
for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#issuecomment-521532339
 
 
   **[Test build #109145 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109145/testReport)**
 for PR 25410 at commit 
[`1b2c8d4`](https://github.com/apache/spark/commit/1b2c8d4d72394cfd27e8d4e6b0a9291706cd62e5).



[GitHub] [spark] AmplabJenkins removed a comment on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25410: [SPARK-28690][SQL] Add 
`date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#issuecomment-521531847
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14214/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25410: [SPARK-28690][SQL] Add 
`date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#issuecomment-521531838
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25410: [SPARK-28690][SQL] Add `date_part` 
function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#issuecomment-521531838
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25410: [SPARK-28690][SQL] Add `date_part` 
function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#issuecomment-521531847
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14214/
   Test PASSed.



[GitHub] [spark] MaxGekk commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates

2019-08-14 Thread GitBox

MaxGekk commented on a change in pull request #25410: [SPARK-28690][SQL] Add 
`date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r314188108
 
 

 ##
 File path: sql/core/src/test/resources/sql-tests/inputs/pgSQL/timestamp.sql
 ##
 @@ -187,12 +187,11 @@ SELECT '' AS date_trunc_week, date_trunc( 'week', timestamp '2004-02-29 15:44:17
 --   WHERE d1 BETWEEN timestamp '1902-01-01'
 --AND timestamp '2038-01-01';
 
--- [SPARK-28420] Date/Time Functions: date_part
--- SELECT '' AS "54", d1 as "timestamp",
---date_part( 'year', d1) AS year, date_part( 'month', d1) AS month,
---date_part( 'day', d1) AS day, date_part( 'hour', d1) AS hour,
---date_part( 'minute', d1) AS minute, date_part( 'second', d1) AS second
---FROM TIMESTAMP_TBL WHERE d1 BETWEEN '1902-01-01' AND '2038-01-01';
+SELECT '' AS `54`, d1 as `timestamp`,
+date_part( 'year', d1) AS `year`, date_part( 'month', d1) AS `month`,
+date_part( 'day', d1) AS `day`, date_part( 'hour', d1) AS `hour`,
+date_part( 'minute', d1) AS `minute`, date_part( 'second', d1) AS `second`
+FROM TIMESTAMP_TBL WHERE d1 BETWEEN '1902-01-01' AND '2038-01-01';
 
 -- SELECT '' AS "54", d1 as "timestamp",
 --date_part( 'quarter', d1) AS quarter, date_part( 'msec', d1) AS msec,
 
 Review comment:
   I uncommented those 2 queries



[GitHub] [spark] cloud-fan commented on issue #24486: [SPARK-27592][SQL] Set the bucketed data source table SerDe correctly

2019-08-14 Thread GitBox

cloud-fan commented on issue #24486: [SPARK-27592][SQL] Set the bucketed data 
source table SerDe correctly
URL: https://github.com/apache/spark/pull/24486#issuecomment-521531072
 
 
   LGTM



[GitHub] [spark] cloud-fan commented on a change in pull request #24486: [SPARK-27592][SQL] Set the bucketed data source table SerDe correctly

2019-08-14 Thread GitBox

cloud-fan commented on a change in pull request #24486: [SPARK-27592][SQL] Set 
the bucketed data source table SerDe correctly
URL: https://github.com/apache/spark/pull/24486#discussion_r314187995
 
 

 ##
 File path: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveMetastoreCatalogSuite.scala
 ##
 @@ -284,4 +284,40 @@ class DataSourceWithHiveMetastoreCatalogSuite
 }
 
   }
+
+  test("Set the bucketed data source table SerDe correctly") {
 
 Review comment:
   Let's include the JIRA ID in the test name.



[GitHub] [spark] MaxGekk commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates

2019-08-14 Thread GitBox

MaxGekk commented on a change in pull request #25410: [SPARK-28690][SQL] Add 
`date_part` function for timestamps/dates
URL: https://github.com/apache/spark/pull/25410#discussion_r314188026
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
 ##
 @@ -1963,3 +1963,64 @@ case class Epoch(child: Expression, timeZoneId: Option[String] = None)
 defineCodeGen(ctx, ev, c => s"$dtu.getEpoch($c, $zid)")
   }
 }
+
+@ExpressionDescription(
+  usage = "_FUNC_(field, source) - Extracts a part of the date/timestamp.",
+  arguments = """
+Arguments:
+  * field - selects which part of the source should be extracted. Supported string values are:
+["MILLENNIUM", "CENTURY", "DECADE", "YEAR", "QUARTER", "MONTH",
+ "WEEK", "DAY", "DAYOFWEEK", "DOW", "ISODOW", "DOY",
+ "HOUR", "MINUTE", "SECOND"]
 
 Review comment:
   I documented all values for consistency.
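
To illustrate what a few of the documented field names select, here is a minimal sketch in plain Scala on top of `java.time`. It is not the implementation from this PR: `DatePartSketch` and `datePart` are hypothetical names, only a subset of the listed fields is covered, and the integer results are an assumption of the sketch.

```scala
import java.time.LocalDateTime
import java.time.temporal.IsoFields
import java.util.Locale

// Hypothetical helper, not Spark's implementation: maps a subset of the
// documented field names onto java.time accessors.
object DatePartSketch {
  def datePart(field: String, ts: LocalDateTime): Int =
    field.toUpperCase(Locale.ROOT) match {
      case "YEAR"    => ts.getYear
      case "QUARTER" => ts.get(IsoFields.QUARTER_OF_YEAR)
      case "MONTH"   => ts.getMonthValue
      case "DAY"     => ts.getDayOfMonth
      case "DOY"     => ts.getDayOfYear
      case "ISODOW"  => ts.getDayOfWeek.getValue // ISO-8601: Monday = 1 ... Sunday = 7
      case "HOUR"    => ts.getHour
      case "MINUTE"  => ts.getMinute
      case "SECOND"  => ts.getSecond
      case other     => throw new IllegalArgumentException(s"Unsupported field: $other")
    }
}
```

For example, `DatePartSketch.datePart("doy", LocalDateTime.of(2004, 2, 29, 15, 44, 17))` evaluates the day of year for the timestamp used in the `timestamp.sql` test file.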



[GitHub] [spark] wangyum commented on issue #24486: [SPARK-27592][SQL] Set the bucketed data source table SerDe correctly

2019-08-14 Thread GitBox

wangyum commented on issue #24486: [SPARK-27592][SQL] Set the bucketed data 
source table SerDe correctly
URL: https://github.com/apache/spark/pull/24486#issuecomment-521528867
 
 
   ping @cloud-fan



[GitHub] [spark] cloud-fan commented on issue #24715: [SPARK-25474][SQL] Data source tables support fallback to HDFS for size estimation

2019-08-14 Thread GitBox

cloud-fan commented on issue #24715: [SPARK-25474][SQL] Data source tables 
support fallback to HDFS for size estimation
URL: https://github.com/apache/spark/pull/24715#issuecomment-521528661
 
 
   The idea LGTM, can you rebase this PR?



[GitHub] [spark] cloud-fan commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables

2019-08-14 Thread GitBox

cloud-fan commented on a change in pull request #22502: [SPARK-25474][SQL] 
Support `spark.sql.statistics.fallBackToHdfs` in data source tables
URL: https://github.com/apache/spark/pull/22502#discussion_r314185683
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelation.scala
 ##
 @@ -71,7 +70,13 @@ case class HadoopFsRelation(
 
   override def sizeInBytes: Long = {
 val compressionFactor = sqlContext.conf.fileCompressionFactor
-(location.sizeInBytes * compressionFactor).toLong
+val defaultSize = (location.sizeInBytes * compressionFactor).toLong
+location match {
+  case cfi: CatalogFileIndex if sparkSession.sessionState.conf.fallBackToHdfsForStatsEnabled =>
 
 Review comment:
   IIUC, the issue in this PR is that we always fall back to HDFS stats even if 
table stats are available.
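
One reading of the concern above is that catalog statistics should take precedence, with the HDFS fallback used only when they are missing. The sketch below models that ordering in plain Scala. It is a simplified illustration, not Spark code: `TableInfo`, `estimateSize` and all parameter names are hypothetical, and the ordering itself is an assumption about the intended behavior.

```scala
// Hypothetical, simplified model; none of these names are Spark APIs.
object SizeEstimateSketch {
  final case class TableInfo(
      catalogSizeInBytes: Option[Long], // size recorded in the catalog, if any
      filesSizeInBytes: Long)           // size obtained by listing files on the file system

  def estimateSize(
      table: TableInfo,
      fallBackToHdfs: Boolean,
      compressionFactor: Double,
      defaultSizeInBytes: Long): Long =
    table.catalogSizeInBytes match {
      // Catalog statistics win whenever they are present.
      case Some(size) => size
      // No catalog statistics: use the file-system size only if the fallback is enabled.
      case None if fallBackToHdfs => (table.filesSizeInBytes * compressionFactor).toLong
      // Otherwise fall back to the configured default size.
      case None => defaultSizeInBytes
    }
}
```

With this ordering, enabling the fallback never overrides statistics that already exist in the catalog.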



[GitHub] [spark] cloud-fan commented on issue #25448: [SPARK-28697][SQL] Invalidate Database/Table names starting with underscore

2019-08-14 Thread GitBox

cloud-fan commented on issue #25448: [SPARK-28697][SQL] Invalidate 
Database/Table names starting with underscore
URL: https://github.com/apache/spark/pull/25448#issuecomment-521527717
 
 
   Wait, do table names starting with `_` work in Spark currently? From 
SPARK-19059 it seems to be supported, but from SPARK-28697 it seems not.



[GitHub] [spark] SparkQA commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs

2019-08-14 Thread GitBox

SparkQA commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to 
track registered v2 catalogs
URL: https://github.com/apache/spark/pull/25368#issuecomment-521525967
 
 
   **[Test build #109144 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109144/testReport)**
 for PR 25368 at commit 
[`45cbbd0`](https://github.com/apache/spark/commit/45cbbd04408251e14a9157d1a5b93ae6a8e91401).



[GitHub] [spark] AmplabJenkins removed a comment on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25368: [SPARK-28635][SQL] create 
CatalogManager to track registered v2 catalogs
URL: https://github.com/apache/spark/pull/25368#issuecomment-521525502
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14213/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25368: [SPARK-28635][SQL] create 
CatalogManager to track registered v2 catalogs
URL: https://github.com/apache/spark/pull/25368#issuecomment-521525496
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25368: [SPARK-28635][SQL] create 
CatalogManager to track registered v2 catalogs
URL: https://github.com/apache/spark/pull/25368#issuecomment-521525502
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14213/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25368: [SPARK-28635][SQL] create 
CatalogManager to track registered v2 catalogs
URL: https://github.com/apache/spark/pull/25368#issuecomment-521525496
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] shahidki31 commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables

2019-08-14 Thread GitBox

shahidki31 commented on a change in pull request #22502: [SPARK-25474][SQL] 
Support `spark.sql.statistics.fallBackToHdfs` in data source tables
URL: https://github.com/apache/spark/pull/22502#discussion_r314183178
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelation.scala
 ##
 @@ -71,7 +70,13 @@ case class HadoopFsRelation(
 
   override def sizeInBytes: Long = {
 val compressionFactor = sqlContext.conf.fileCompressionFactor
-(location.sizeInBytes * compressionFactor).toLong
+val defaultSize = (location.sizeInBytes * compressionFactor).toLong
+location match {
+  case cfi: CatalogFileIndex if sparkSession.sessionState.conf.fallBackToHdfsForStatsEnabled =>
 
 Review comment:
   I am not sure there is any issue in this PR. As per this code, execution 
reaches the `sizeInBytes` method only if the table doesn't have any statistics. 
Maybe we can add the extra check mentioned above.
   
https://github.com/apache/spark/blob/0526529b31737e5bf4829f8259f3a020f2cc51f1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/LogicalRelation.scala#L42-L46



[GitHub] [spark] wangyum commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables

2019-08-14 Thread GitBox

wangyum commented on a change in pull request #22502: [SPARK-25474][SQL] 
Support `spark.sql.statistics.fallBackToHdfs` in data source tables
URL: https://github.com/apache/spark/pull/22502#discussion_r314180567
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelation.scala
 ##
 @@ -71,7 +70,13 @@ case class HadoopFsRelation(
 
   override def sizeInBytes: Long = {
 val compressionFactor = sqlContext.conf.fileCompressionFactor
-(location.sizeInBytes * compressionFactor).toLong
+val defaultSize = (location.sizeInBytes * compressionFactor).toLong
+location match {
+  case cfi: CatalogFileIndex if sparkSession.sessionState.conf.fallBackToHdfsForStatsEnabled =>
 
 Review comment:
   https://github.com/apache/spark/pull/24715



[GitHub] [spark] cloud-fan commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables

2019-08-14 Thread GitBox

cloud-fan commented on a change in pull request #22502: [SPARK-25474][SQL] 
Support `spark.sql.statistics.fallBackToHdfs` in data source tables
URL: https://github.com/apache/spark/pull/22502#discussion_r314180114
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelation.scala
 ##
 @@ -71,7 +70,13 @@ case class HadoopFsRelation(
 
   override def sizeInBytes: Long = {
 val compressionFactor = sqlContext.conf.fileCompressionFactor
-(location.sizeInBytes * compressionFactor).toLong
+val defaultSize = (location.sizeInBytes * compressionFactor).toLong
+location match {
+  case cfi: CatalogFileIndex if sparkSession.sessionState.conf.fallBackToHdfsForStatsEnabled =>
 
 Review comment:
   @wangyum can you send a PR for your proposal? It's unclear to me what you 
are proposing here.



[GitHub] [spark] younggyuchun commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-14 Thread GitBox

younggyuchun commented on a change in pull request #25458: [SPARK-27931][SQL] 
Accept 'on' and 'off' as input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#discussion_r314179583
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
 ##
 @@ -65,12 +65,15 @@ object StringUtils extends Logging {
 "(?s)" + out.result() // (?s) enables dotall mode, causing "." to match 
new lines
   }
 
-  private[this] val trueStrings = Set("t", "true", "y", "yes", "1").map(UTF8String.fromString)
-  private[this] val falseStrings = Set("f", "false", "n", "no", "0").map(UTF8String.fromString)
+  private[this] val trueStrings =
+Set("t", "true", "y", "yes", "1", "on").map(UTF8String.fromString)
+
+  private[this] val falseStrings =
+Set("f", "false", "n", "no", "0", "off").map(UTF8String.fromString)
 
 Review comment:
   Ah okay. Let me add that too. Thank you
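
As a rough illustration of the behavior named in the PR title (accepting 'on' and 'off' and trimming the input), here is a minimal sketch in plain Scala. It reuses the two string sets from the diff above but works on `String` rather than `UTF8String`. `BooleanStringSketch` and `parseBoolean` are hypothetical names, and the lower-casing step is an assumption of this sketch rather than something the diff shows.

```scala
import java.util.Locale

// Hypothetical helper, not the code from the PR.
object BooleanStringSketch {
  private val trueStrings = Set("t", "true", "y", "yes", "1", "on")
  private val falseStrings = Set("f", "false", "n", "no", "0", "off")

  // Returns Some(true) or Some(false) for recognized input and None otherwise.
  // The input is trimmed and lower-cased (an assumption of this sketch) before lookup.
  def parseBoolean(input: String): Option[Boolean] = {
    val normalized = input.trim.toLowerCase(Locale.ROOT)
    if (trueStrings.contains(normalized)) Some(true)
    else if (falseStrings.contains(normalized)) Some(false)
    else None
  }
}
```

Returning `Option[Boolean]` keeps unrecognized input distinct from `false`, so the caller decides how to treat it.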



[GitHub] [spark] wangyum commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables

2019-08-14 Thread GitBox

wangyum commented on a change in pull request #22502: [SPARK-25474][SQL] 
Support `spark.sql.statistics.fallBackToHdfs` in data source tables
URL: https://github.com/apache/spark/pull/22502#discussion_r314179379
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelation.scala
 ##
 @@ -71,7 +70,13 @@ case class HadoopFsRelation(
 
   override def sizeInBytes: Long = {
 val compressionFactor = sqlContext.conf.fileCompressionFactor
-(location.sizeInBytes * compressionFactor).toLong
+val defaultSize = (location.sizeInBytes * compressionFactor).toLong
+location match {
+  case cfi: CatalogFileIndex if sparkSession.sessionState.conf.fallBackToHdfsForStatsEnabled =>
 
 Review comment:
   Yes. I have prepared some tests to illustrate this issue. These tests pass 
before this commit:
   ```scala
   test("Non-partitioned data source table") {
     withTempDir { dir =>
       withTable("spark_25474") {
         sql(s"CREATE TABLE spark_25474 (c1 BIGINT) USING PARQUET LOCATION '${dir.toURI}'")
         spark.range(5).write.mode(SaveMode.Overwrite).parquet(dir.getCanonicalPath)

         assert(getCatalogTable("spark_25474").stats.isEmpty)
         val relation = spark.table("spark_25474").queryExecution.analyzed.children.head
         assert(relation.stats.sizeInBytes === 935)
       }
     }
   }

   test("Partitioned data source table default") {
     withTempDir { dir =>
       withTable("spark_25474") {
         spark.sql("CREATE TABLE spark_25474(a int, b int) USING parquet " +
           s"PARTITIONED BY(a) LOCATION '${dir.toURI}'")
         spark.sql("INSERT INTO TABLE spark_25474 PARTITION(a=1) SELECT 2")

         assert(getCatalogTable("spark_25474").stats.isEmpty)
         val relation = spark.table("spark_25474").queryExecution.analyzed.children.head
         // scalastyle:off line.size.limit
         // It's 8.0EB in this case. This 8.0EB from:
         // https://github.com/apache/spark/blob/c30b5297bc607ae33cc2fcf624b127942154e559/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L383-L387
         // scalastyle:on line.size.limit
         assert(relation.stats.sizeInBytes === conf.defaultSizeInBytes)
       }
     }
   }

   test("Partitioned data source table and disable HIVE_MANAGE_FILESOURCE_PARTITIONS") {
     withSQLConf(SQLConf.HIVE_MANAGE_FILESOURCE_PARTITIONS.key -> "false") {
       withTempDir { dir =>
         withTable("spark_25474") {
           spark.sql("CREATE TABLE spark_25474(a int, b int) USING parquet " +
             s"PARTITIONED BY(a) LOCATION '${dir.toURI}'")
           spark.sql("INSERT INTO TABLE spark_25474 PARTITION(a=1) SELECT 2")

           assert(getCatalogTable("spark_25474").stats.isEmpty)
           val relation = spark.table("spark_25474").queryExecution.analyzed.children.head
           assert(relation.stats.sizeInBytes === 418)
         }
       }
     }
   }
   ```
   
   
https://github.com/apache/spark/compare/master...wangyum:SPARK-25474-DEV?expand=1
   



[GitHub] [spark] SparkQA commented on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view …

2019-08-14 Thread GitBox

SparkQA commented on issue #24440: [SPARK-27545] [SQL] Uncache table needs to 
delete the temporary view …
URL: https://github.com/apache/spark/pull/24440#issuecomment-521519898
 
 
   **[Test build #109143 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109143/testReport)**
 for PR 24440 at commit 
[`770ee42`](https://github.com/apache/spark/commit/770ee4261335635fafe79afebb1ce7302db96d92).



[GitHub] [spark] AmplabJenkins removed a comment on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view …

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #24440: [SPARK-27545] [SQL] Uncache 
table needs to delete the temporary view …
URL: https://github.com/apache/spark/pull/24440#issuecomment-521519493
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view …

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #24440: [SPARK-27545] [SQL] Uncache 
table needs to delete the temporary view …
URL: https://github.com/apache/spark/pull/24440#issuecomment-521519498
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14212/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view …

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #24440: [SPARK-27545] [SQL] Uncache table 
needs to delete the temporary view …
URL: https://github.com/apache/spark/pull/24440#issuecomment-521519498
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14212/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view …

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #24440: [SPARK-27545] [SQL] Uncache table 
needs to delete the temporary view …
URL: https://github.com/apache/spark/pull/24440#issuecomment-521519493
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] httfighter commented on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view …

2019-08-14 Thread GitBox

httfighter commented on issue #24440: [SPARK-27545] [SQL] Uncache table needs 
to delete the temporary view …
URL: https://github.com/apache/spark/pull/24440#issuecomment-521518729
 
 
   @dongjoon-hyun Thank you for the reminder. I have added a test case.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25456: [SPARK-28739][SQL] Add a 
simple cost check for Adaptive Query Execution
URL: https://github.com/apache/spark/pull/25456#issuecomment-521517932
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14211/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25456: [SPARK-28739][SQL] Add a simple cost 
check for Adaptive Query Execution
URL: https://github.com/apache/spark/pull/25456#issuecomment-521517930
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25456: [SPARK-28739][SQL] Add a 
simple cost check for Adaptive Query Execution
URL: https://github.com/apache/spark/pull/25456#issuecomment-521517930
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25456: [SPARK-28739][SQL] Add a simple cost 
check for Adaptive Query Execution
URL: https://github.com/apache/spark/pull/25456#issuecomment-521517932
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14211/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of 
content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521517572
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109141/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of 
content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521517570
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial 
table of content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521517572
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109141/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

SparkQA commented on issue #25459: [SPARK-28734[DOC] Initial table of content 
in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521517517
 
 
   **[Test build #109141 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109141/testReport)**
 for PR 25459 at commit 
[`791ee67`](https://github.com/apache/spark/commit/791ee67a26230d44b6839e4d414980d9889cea74).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial 
table of content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521517570
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

SparkQA removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of 
content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521515799
 
 
   **[Test build #109141 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109141/testReport)**
 for PR 25459 at commit 
[`791ee67`](https://github.com/apache/spark/commit/791ee67a26230d44b6839e4d414980d9889cea74).



[GitHub] [spark] SparkQA commented on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution

2019-08-14 Thread GitBox

SparkQA commented on issue #25456: [SPARK-28739][SQL] Add a simple cost check 
for Adaptive Query Execution
URL: https://github.com/apache/spark/pull/25456#issuecomment-521516999
 
 
   **[Test build #109142 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109142/testReport)**
 for PR 25456 at commit 
[`74dd386`](https://github.com/apache/spark/commit/74dd3865e0fe3287d73a7b6aa954cc63bf17e9fd).



[GitHub] [spark] AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial 
table of content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521516649
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial 
table of content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521516655
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14210/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of 
content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521516649
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of 
content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521516655
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14210/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial 
table of content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521515466
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial 
table of content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521515467
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14209/
   Test PASSed.



[GitHub] [spark] wangyum commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-14 Thread GitBox

wangyum commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 
'on' and 'off' as input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#discussion_r314176109
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
 ##
 @@ -65,12 +65,15 @@ object StringUtils extends Logging {
 "(?s)" + out.result() // (?s) enables dotall mode, causing "." to match new lines
   }
 
-  private[this] val trueStrings = Set("t", "true", "y", "yes", "1").map(UTF8String.fromString)
-  private[this] val falseStrings = Set("f", "false", "n", "no", "0").map(UTF8String.fromString)
+  private[this] val trueStrings =
+Set("t", "true", "y", "yes", "1", "on").map(UTF8String.fromString)
+
+  private[this] val falseStrings =
+Set("f", "false", "n", "no", "0", "off").map(UTF8String.fromString)
 
 Review comment:
   But PostgreSQL also accepts `of`, `tru`, `fals`, ...:
   ```sql
   postgres=# select cast('of' as boolean), cast('tru' as boolean), cast('fals' as boolean);
    bool | bool | bool
   ------+------+------
    f    | t    | f
   (1 row)
   ```
   
   
https://github.com/postgres/postgres/commit/9729c9360886bee7feddc6a1124b0742de4b9f3d
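
One way to read the PostgreSQL behavior quoted above is as unique-prefix matching against a small set of keywords. The following is a minimal sketch under that assumption. `BoolPrefixSketch` and `parseBool` are hypothetical names used only for illustration, not part of Spark's `StringUtils`, and the keyword lists are assumed rather than copied from the PostgreSQL source.

```scala
import java.util.Locale

object BoolPrefixSketch {
  // Assumed keyword lists; the digits "1" and "0" are handled separately below.
  private val trueWords = Seq("true", "yes", "on")
  private val falseWords = Seq("false", "no", "off")

  /** Returns Some(value) when the trimmed, lower-cased input is "1", "0",
    * or an unambiguous prefix of one keyword family; None otherwise. */
  def parseBool(input: String): Option[Boolean] = {
    val s = input.trim.toLowerCase(Locale.ROOT)
    if (s.isEmpty) None
    else if (s == "1") Some(true)
    else if (s == "0") Some(false)
    else {
      val matchesTrue = trueWords.exists(_.startsWith(s))
      val matchesFalse = falseWords.exists(_.startsWith(s))
      // "o" is a prefix of both "on" and "off", so it is rejected as ambiguous.
      if (matchesTrue && !matchesFalse) Some(true)
      else if (matchesFalse && !matchesTrue) Some(false)
      else None
    }
  }
}
```

With this sketch, `parseBool("of")`, `parseBool("tru")` and `parseBool("fals")` resolve to false, true and false respectively, which matches the psql output above, while `parseBool("o")` yields `None`. An exact-match `Set`, as in the diff, would accept none of these prefixes.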



[GitHub] [spark] SparkQA commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

SparkQA commented on issue #25459: [SPARK-28734[DOC] Initial table of content 
in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521515799
 
 
   **[Test build #109141 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109141/testReport)**
 for PR 25459 at commit 
[`791ee67`](https://github.com/apache/spark/commit/791ee67a26230d44b6839e4d414980d9889cea74).



[GitHub] [spark] AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of 
content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521515466
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of 
content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521515467
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14209/
   Test PASSed.



[GitHub] [spark] dilipbiswal opened a new pull request #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

dilipbiswal opened a new pull request #25459: [SPARK-28734[DOC] Initial table 
of content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459
 
 
   ## What changes were proposed in this pull request?
   This is an initial PR that creates the table of contents for the SQL reference 
guide. The left sidebar will display additional menu items corresponding to 
supported SQL constructs. Once this PR is merged, we will fill in the content 
incrementally. Additionally, this PR contains a minor change to make the left 
sidebar scrollable. Currently it is not possible to scroll in the left-hand 
side window.
   
   ## How was this patch tested?
   Used jekyll build and serve to verify.



[GitHub] [spark] dilipbiswal commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc

2019-08-14 Thread GitBox

dilipbiswal commented on issue #25459: [SPARK-28734[DOC] Initial table of 
content in the left hand side bar for SQL doc
URL: https://github.com/apache/spark/pull/25459#issuecomment-521514797
 
 
   cc @gatorsmile 



[GitHub] [spark] younggyuchun commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-14 Thread GitBox

younggyuchun commented on a change in pull request #25458: [SPARK-27931][SQL] 
Accept 'on' and 'off' as input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#discussion_r314174879
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
 ##
 @@ -65,12 +65,15 @@ object StringUtils extends Logging {
 "(?s)" + out.result() // (?s) enables dotall mode, causing "." to match new lines
   }
 
-  private[this] val trueStrings = Set("t", "true", "y", "yes", "1").map(UTF8String.fromString)
-  private[this] val falseStrings = Set("f", "false", "n", "no", "0").map(UTF8String.fromString)
+  private[this] val trueStrings =
+Set("t", "true", "y", "yes", "1", "on").map(UTF8String.fromString)
+
+  private[this] val falseStrings =
+Set("f", "false", "n", "no", "0", "off").map(UTF8String.fromString)
 
 Review comment:
   Yes, I guess so. Do you know of other common string representations used in 
other databases?



[GitHub] [spark] wangyum commented on a change in pull request #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins

2019-08-14 Thread GitBox

wangyum commented on a change in pull request #25443: 
[WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 
2.3.6 on jenkins
URL: https://github.com/apache/spark/pull/25443#discussion_r314174619
 
 

 ##
 File path: pom.xml
 ##
 @@ -115,7 +115,7 @@
   
 UTF-8
 UTF-8
-11
+1.8
 
 Review comment:
   Let's wait for the fix of PySpark and SparkR?



[GitHub] [spark] BestOreo commented on a change in pull request #25342: [SPARK-28571][CORE][SHUFFLE] Use the shuffle writer plugin for the SortShuffleWriter

2019-08-14 Thread GitBox

BestOreo commented on a change in pull request #25342: 
[SPARK-28571][CORE][SHUFFLE] Use the shuffle writer plugin for the 
SortShuffleWriter
URL: https://github.com/apache/spark/pull/25342#discussion_r314174130
 
 

 ##
 File path: 
core/src/main/scala/org/apache/spark/util/collection/ShufflePartitionPairsWriter.scala
 ##
 @@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.util.collection
+
+import java.io.{Closeable, FilterOutputStream, OutputStream}
+
+import org.apache.spark.serializer.{SerializationStream, SerializerInstance, SerializerManager}
+import org.apache.spark.shuffle.ShuffleWriteMetricsReporter
+import org.apache.spark.shuffle.api.ShufflePartitionWriter
+import org.apache.spark.storage.BlockId
+
+/**
+ * A key-value writer inspired by {@link DiskBlockObjectWriter} that pushes the bytes to an
+ * arbitrary partition writer instead of writing to local disk through the block manager.
+ */
+private[spark] class ShufflePartitionPairsWriter(
+partitionWriter: ShufflePartitionWriter,
+serializerManager: SerializerManager,
+serializerInstance: SerializerInstance,
+blockId: BlockId,
+writeMetrics: ShuffleWriteMetricsReporter)
+  extends PairsWriter with Closeable {
+
+  private var isOpen = false
+  private var partitionStream: OutputStream = _
+  private var wrappedStream: OutputStream = _
+  private var objOut: SerializationStream = _
+  private var numRecordsWritten = 0
+  private var curNumBytesWritten = 0L
+
+  override def write(key: Any, value: Any): Unit = {
+if (!isOpen) {
+  open()
+  isOpen = true
+}
+objOut.writeKey(key)
+objOut.writeValue(value)
+writeMetrics.incRecordsWritten(1)
+  }
+
+  private def open(): Unit = {
+partitionStream = partitionWriter.openStream
+wrappedStream = serializerManager.wrapStream(blockId, partitionStream)
+objOut = serializerInstance.serializeStream(wrappedStream)
+  }
+
+  override def close(): Unit = {
+if (isOpen) {
 
 Review comment:
   The worry is unnecessary because `wrappedStream` and `objOut` must have been 
initialized successfully if `partitionStream` was opened as an OutputStream 
without an exception.
   I also think the `isOpen` flag makes the code easier to understand.
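
The lazy-open pattern under discussion can be sketched in isolation. This is a simplified stand-in, not the code from the PR: `LazyPairsWriter` and its `openStream` parameter are hypothetical, and a plain `DataOutputStream` stands in for the serializer stack so that the example is self-contained.

```scala
import java.io.{ByteArrayOutputStream, Closeable, DataOutputStream, OutputStream}

// Hypothetical writer that opens its stream on the first write only.
final class LazyPairsWriter(openStream: () => OutputStream) extends Closeable {
  private var isOpen = false
  private var out: DataOutputStream = null

  def write(key: Int, value: Int): Unit = {
    if (!isOpen) {
      // The flag flips only after the stream has been constructed, so a
      // failure in openStream() leaves the writer in the "not open" state.
      out = new DataOutputStream(openStream())
      isOpen = true
    }
    out.writeInt(key)
    out.writeInt(value)
  }

  override def close(): Unit = {
    // Guarded by the flag: nothing to close if no record was ever written.
    if (isOpen) {
      out.close()
      isOpen = false
    }
  }
}
```

In this sketch, calling `close()` on a writer that never received a record does nothing, and a second `close()` is a no-op, which is the readability benefit the comment attributes to the flag.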



[GitHub] [spark] BestOreo commented on a change in pull request #25342: [SPARK-28571][CORE][SHUFFLE] Use the shuffle writer plugin for the SortShuffleWriter

2019-08-14 Thread GitBox

BestOreo commented on a change in pull request #25342: 
[SPARK-28571][CORE][SHUFFLE] Use the shuffle writer plugin for the 
SortShuffleWriter
URL: https://github.com/apache/spark/pull/25342#discussion_r314174130
 
 

 ##
 File path: 
core/src/main/scala/org/apache/spark/util/collection/ShufflePartitionPairsWriter.scala
 ##
 @@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.util.collection
+
+import java.io.{Closeable, FilterOutputStream, OutputStream}
+
+import org.apache.spark.serializer.{SerializationStream, SerializerInstance, SerializerManager}
+import org.apache.spark.shuffle.ShuffleWriteMetricsReporter
+import org.apache.spark.shuffle.api.ShufflePartitionWriter
+import org.apache.spark.storage.BlockId
+
+/**
+ * A key-value writer inspired by {@link DiskBlockObjectWriter} that pushes the bytes to an
+ * arbitrary partition writer instead of writing to local disk through the block manager.
+ */
+private[spark] class ShufflePartitionPairsWriter(
+partitionWriter: ShufflePartitionWriter,
+serializerManager: SerializerManager,
+serializerInstance: SerializerInstance,
+blockId: BlockId,
+writeMetrics: ShuffleWriteMetricsReporter)
+  extends PairsWriter with Closeable {
+
+  private var isOpen = false
+  private var partitionStream: OutputStream = _
+  private var wrappedStream: OutputStream = _
+  private var objOut: SerializationStream = _
+  private var numRecordsWritten = 0
+  private var curNumBytesWritten = 0L
+
+  override def write(key: Any, value: Any): Unit = {
+if (!isOpen) {
+  open()
+  isOpen = true
+}
+objOut.writeKey(key)
+objOut.writeValue(value)
+writeMetrics.incRecordsWritten(1)
+  }
+
+  private def open(): Unit = {
+partitionStream = partitionWriter.openStream
+wrappedStream = serializerManager.wrapStream(blockId, partitionStream)
+objOut = serializerInstance.serializeStream(wrappedStream)
+  }
+
+  override def close(): Unit = {
+if (isOpen) {
 
 Review comment:
   The worry is unnecessary because wrappedStream and objOut must have been 
initialized successfully if partitionStream was opened as an OutputStream 
without an exception.



[GitHub] [spark] BestOreo commented on a change in pull request #25342: [SPARK-28571][CORE][SHUFFLE] Use the shuffle writer plugin for the SortShuffleWriter

2019-08-14 Thread GitBox

BestOreo commented on a change in pull request #25342: 
[SPARK-28571][CORE][SHUFFLE] Use the shuffle writer plugin for the 
SortShuffleWriter
URL: https://github.com/apache/spark/pull/25342#discussion_r314174130
 
 

 ##
 File path: 
core/src/main/scala/org/apache/spark/util/collection/ShufflePartitionPairsWriter.scala
 ##
 @@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.util.collection
+
+import java.io.{Closeable, FilterOutputStream, OutputStream}
+
+import org.apache.spark.serializer.{SerializationStream, SerializerInstance, SerializerManager}
+import org.apache.spark.shuffle.ShuffleWriteMetricsReporter
+import org.apache.spark.shuffle.api.ShufflePartitionWriter
+import org.apache.spark.storage.BlockId
+
+/**
+ * A key-value writer inspired by {@link DiskBlockObjectWriter} that pushes the bytes to an
+ * arbitrary partition writer instead of writing to local disk through the block manager.
+ */
+private[spark] class ShufflePartitionPairsWriter(
+partitionWriter: ShufflePartitionWriter,
+serializerManager: SerializerManager,
+serializerInstance: SerializerInstance,
+blockId: BlockId,
+writeMetrics: ShuffleWriteMetricsReporter)
+  extends PairsWriter with Closeable {
+
+  private var isOpen = false
+  private var partitionStream: OutputStream = _
+  private var wrappedStream: OutputStream = _
+  private var objOut: SerializationStream = _
+  private var numRecordsWritten = 0
+  private var curNumBytesWritten = 0L
+
+  override def write(key: Any, value: Any): Unit = {
+if (!isOpen) {
+  open()
+  isOpen = true
+}
+objOut.writeKey(key)
+objOut.writeValue(value)
+writeMetrics.incRecordsWritten(1)
+  }
+
+  private def open(): Unit = {
+    partitionStream = partitionWriter.openStream
+    wrappedStream = serializerManager.wrapStream(blockId, partitionStream)
+    objOut = serializerInstance.serializeStream(wrappedStream)
+  }
+
+  override def close(): Unit = {
+    if (isOpen) {
 
 Review comment:
   This concern seems unnecessary: once `partitionStream` has been opened as an 
`OutputStream` without throwing, `wrappedStream` and `objOut` should also be 
initialized successfully.
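
The invariant behind this comment can be sketched in plain Scala. This is a rough illustration, not the PR's code: `SketchPairsWriter`, `openUnderlying`, and `CloseSketch` are hypothetical names, and a `BufferedOutputStream` stands in for the streams produced by `serializerManager.wrapStream` and `serializerInstance.serializeStream`. Because `isOpen` is set only after `open()` returns, `close()` should never see a half-initialized writer.

```scala
import java.io.{BufferedOutputStream, ByteArrayOutputStream, Closeable, OutputStream}

// Hypothetical stand-in for the writer in the diff; not Spark's actual class.
final class SketchPairsWriter(openUnderlying: () => OutputStream) extends Closeable {
  private var isOpen = false
  private var partitionStream: OutputStream = _
  private var wrappedStream: OutputStream = _

  def write(bytes: Array[Byte]): Unit = {
    if (!isOpen) {
      open()
      isOpen = true // reached only when open() completed without throwing
    }
    wrappedStream.write(bytes)
  }

  private def open(): Unit = {
    partitionStream = openUnderlying()
    // Assumption: a buffered wrapper plays the role of the wrapped/serialization streams.
    wrappedStream = new BufferedOutputStream(partitionStream)
  }

  override def close(): Unit = {
    if (isOpen) {
      // Both fields are non-null here, since isOpen is set after open() returns.
      wrappedStream.close() // flushes, then closes the underlying stream
      isOpen = false
    }
  }
}

object CloseSketch {
  // Writes three bytes through the writer and returns what reached the sink.
  def run(): Array[Byte] = {
    val sink = new ByteArrayOutputStream()
    val writer = new SketchPairsWriter(() => sink)
    writer.write(Array[Byte](1, 2, 3))
    writer.close()
    sink.toByteArray
  }
}
```

Note that in this sketch, if `open()` threw after `openUnderlying()` had succeeded, `isOpen` would stay `false` and `close()` would not touch the already opened stream, which is the case the original worry is about.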


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] dongjoon-hyun commented on issue #25446: [SPARK-28724] [SQL] Throw error message when cast out range decimal to long

2019-08-14 Thread GitBox

dongjoon-hyun commented on issue #25446: [SPARK-28724] [SQL] Throw error 
message when cast out range decimal to long
URL: https://github.com/apache/spark/pull/25446#issuecomment-521513201
 
 
   Thank you for your understanding, @LiShuMing .
   Thank you, @maropu .



[GitHub] [spark] cloud-fan commented on a change in pull request #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths

2019-08-14 Thread GitBox

cloud-fan commented on a change in pull request #25348: [SPARK-28554][SQL] Adds 
a v1 fallback writer implementation for v2 data source codepaths
URL: https://github.com/apache/spark/pull/25348#discussion_r314173218
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
 ##
 @@ -200,24 +202,37 @@ object DataSourceV2Strategy extends Strategy with PredicateHelper {
 catalog,
 ident,
 parts,
+query,
 planLater(query),
 props,
 writeOptions,
 orCreate = orCreate) :: Nil
   }
 
 case AppendData(r: DataSourceV2Relation, query, _) =>
-  AppendDataExec(r.table.asWritable, r.options, planLater(query)) :: Nil
 
 Review comment:
   If end users look at the SQL tab and see `AppendDataExecV1`, they would 
expect to see a v1 version of the CTAS physical plan as well, and may report a 
bug if they don't see it.
   
   BTW, I think there are other ways to implement this feature (letting users 
know from the SQL tab whether v1 fallback is triggered). For example, we can use 
SQLMetrics to report it, which can be updated at runtime and supports CTAS as well.
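
The metrics-based alternative could look roughly like the following plain-Scala model. Everything here is hypothetical (`CounterMetric`, `WriteNode`, `MetricSketch`, and the `v1Fallback` key); it does not use Spark's real `SQLMetrics` API, and only illustrates that the node name stays the same while a runtime counter records whether the fallback ran.

```scala
import java.util.concurrent.atomic.LongAdder

// Hypothetical stand-in for a metric that can be updated at runtime.
final class CounterMetric(val name: String) {
  private val count = new LongAdder
  def add(n: Long): Unit = count.add(n)
  def value: Long = count.sum()
}

// Hypothetical plan node: its name does not change when v1 fallback is used.
final class WriteNode(val nodeName: String, useV1Fallback: Boolean) {
  val metrics: Map[String, CounterMetric] =
    Map("v1Fallback" -> new CounterMetric("number of v1 fallback writes"))

  def execute(): Unit = {
    // The fallback is reported through the metric instead of a separate node type.
    if (useV1Fallback) metrics("v1Fallback").add(1L)
  }
}

object MetricSketch {
  // Runs one append with fallback and one CTAS without, and returns the counters.
  def run(): Map[String, Long] = {
    val append = new WriteNode("AppendData", useV1Fallback = true)
    val ctas = new WriteNode("CreateTableAsSelect", useV1Fallback = false)
    append.execute()
    ctas.execute()
    Map(
      append.nodeName -> append.metrics("v1Fallback").value,
      ctas.nodeName -> ctas.metrics("v1Fallback").value)
  }
}
```

With this shape, append and CTAS nodes expose the same metric, so neither needs a dedicated `V1` node name in the SQL tab.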



[GitHub] [spark] AmplabJenkins removed a comment on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25457: 
[SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current 
epoch in EpochTracker (to support Python UDFs)
URL: https://github.com/apache/spark/pull/25457#issuecomment-521511185
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25457: 
[SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current 
epoch in EpochTracker (to support Python UDFs)
URL: https://github.com/apache/spark/pull/25457#issuecomment-521511188
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109135/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] 
Use InheritableThreadLocal for current epoch in EpochTracker (to support Python 
UDFs)
URL: https://github.com/apache/spark/pull/25457#issuecomment-521511188
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109135/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] 
Use InheritableThreadLocal for current epoch in EpochTracker (to support Python 
UDFs)
URL: https://github.com/apache/spark/pull/25457#issuecomment-521511185
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)

2019-08-14 Thread GitBox

SparkQA removed a comment on issue #25457: 
[SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current 
epoch in EpochTracker (to support Python UDFs)
URL: https://github.com/apache/spark/pull/25457#issuecomment-521473909
 
 
   **[Test build #109135 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109135/testReport)**
 for PR 25457 at commit 
[`4c5fdd6`](https://github.com/apache/spark/commit/4c5fdd668be1f31561849e5fe485e814e318a3ef).



[GitHub] [spark] SparkQA commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)

2019-08-14 Thread GitBox

SparkQA commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use 
InheritableThreadLocal for current epoch in EpochTracker (to support Python 
UDFs)
URL: https://github.com/apache/spark/pull/25457#issuecomment-521511008
 
 
   **[Test build #109135 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109135/testReport)**
 for PR 25457 at commit 
[`4c5fdd6`](https://github.com/apache/spark/commit/4c5fdd668be1f31561849e5fe485e814e318a3ef).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins

2019-08-14 Thread GitBox

dongjoon-hyun commented on a change in pull request #25443: 
[WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 
2.3.6 on jenkins
URL: https://github.com/apache/spark/pull/25443#discussion_r314172342
 
 

 ##
 File path: pom.xml
 ##
 @@ -115,7 +115,7 @@
   
 UTF-8
 UTF-8
-11
+1.8
 
 Review comment:
   Thanks!



[GitHub] [spark] dongjoon-hyun commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)

2019-08-14 Thread GitBox

dongjoon-hyun commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] 
Use InheritableThreadLocal for current epoch in EpochTracker (to support Python 
UDFs)
URL: https://github.com/apache/spark/pull/25457#issuecomment-521510518
 
 
   Merged to `branch-2.4`.



[GitHub] [spark] dongjoon-hyun closed pull request #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)

2019-08-14 Thread GitBox

dongjoon-hyun closed pull request #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] 
Use InheritableThreadLocal for current epoch in EpochTracker (to support Python 
UDFs)
URL: https://github.com/apache/spark/pull/25457
 
 
   



[GitHub] [spark] wangyum commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-14 Thread GitBox

wangyum commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 
'on' and 'off' as input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#discussion_r314171694
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
 ##
 @@ -65,12 +65,15 @@ object StringUtils extends Logging {
 "(?s)" + out.result() // (?s) enables dotall mode, causing "." to match new lines
   }
 
-  private[this] val trueStrings = Set("t", "true", "y", "yes", "1").map(UTF8String.fromString)
-  private[this] val falseStrings = Set("f", "false", "n", "no", "0").map(UTF8String.fromString)
+  private[this] val trueStrings =
+    Set("t", "true", "y", "yes", "1", "on").map(UTF8String.fromString)
+
+  private[this] val falseStrings =
+    Set("f", "false", "n", "no", "0", "off").map(UTF8String.fromString)
 
 Review comment:
   It seems only PostgreSQL accepts `on` and `off`?
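
For reference, the behavior proposed in the diff can be sketched with plain strings. `BooleanStringSketch` and `parse` are hypothetical names, and the trim plus lower-casing step is an assumption based on the PR title, not code taken from `StringUtils`.

```scala
import java.util.Locale

object BooleanStringSketch {
  // Same literals as the sets in the diff, as plain String instead of UTF8String.
  private val trueStrings = Set("t", "true", "y", "yes", "1", "on")
  private val falseStrings = Set("f", "false", "n", "no", "0", "off")

  // Returns Some(boolean) for a recognized literal, None otherwise.
  def parse(input: String): Option[Boolean] = {
    // Assumption: input is trimmed and matched case-insensitively.
    val normalized = input.trim.toLowerCase(Locale.ROOT)
    if (trueStrings.contains(normalized)) Some(true)
    else if (falseStrings.contains(normalized)) Some(false)
    else None
  }
}
```

Under these assumptions, `BooleanStringSketch.parse(" on ")` yields `Some(true)`, `parse("OFF")` yields `Some(false)`, and an unrecognized string such as `"maybe"` yields `None`.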



[GitHub] [spark] cloud-fan closed pull request #25402: [SPARK-28666] Support saveAsTable for V2 tables through Session Catalog

2019-08-14 Thread GitBox

cloud-fan closed pull request #25402: [SPARK-28666] Support saveAsTable for V2 
tables through Session Catalog
URL: https://github.com/apache/spark/pull/25402
 
 
   



[GitHub] [spark] AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25443: 
[WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 
2.3.6 on jenkins
URL: https://github.com/apache/spark/pull/25443#issuecomment-521509487
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25455: [WIP][SPARK-28737][CORE] Update Jersey to 2.29

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25455: [WIP][SPARK-28737][CORE] Update Jersey 
to 2.29
URL: https://github.com/apache/spark/pull/25455#issuecomment-521509388
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109137/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25443: 
[WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 
2.3.6 on jenkins
URL: https://github.com/apache/spark/pull/25443#issuecomment-521509492
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14208/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins

2019-08-14 Thread GitBox

AmplabJenkins commented on issue #25443: 
[WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 
2.3.6 on jenkins
URL: https://github.com/apache/spark/pull/25443#issuecomment-521509492
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14208/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25455: [WIP][SPARK-28737][CORE] Update Jersey to 2.29

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25455: [WIP][SPARK-28737][CORE] 
Update Jersey to 2.29
URL: https://github.com/apache/spark/pull/25455#issuecomment-521509386
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25455: [WIP][SPARK-28737][CORE] Update Jersey to 2.29

2019-08-14 Thread GitBox

AmplabJenkins removed a comment on issue #25455: [WIP][SPARK-28737][CORE] 
Update Jersey to 2.29
URL: https://github.com/apache/spark/pull/25455#issuecomment-521509388
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109137/
   Test FAILed.


