[GitHub] [spark] SparkQA commented on issue #26533: [WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11

2019-11-14 Thread GitBox
SparkQA commented on issue #26533: [WIP][test-hadoop3.2][test-java11] Test 
Hadoop 2.7 with JDK 11
URL: https://github.com/apache/spark/pull/26533#issuecomment-554254541
 
 
   **[Test build #113859 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113859/testReport)**
 for PR 26533 at commit 
[`a95e8bf`](https://github.com/apache/spark/commit/a95e8bf68981fd182dc442f323093d845604f659).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #26102: [SPARK-29448][SQL] Support the `INTERVAL` type by Parquet datasource

2019-11-14 Thread GitBox
HyukjinKwon commented on a change in pull request #26102: [SPARK-29448][SQL] 
Support the `INTERVAL` type by Parquet datasource
URL: https://github.com/apache/spark/pull/26102#discussion_r346698272
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetWriteSupport.scala
 ##
 @@ -207,7 +208,16 @@ class ParquetWriteSupport extends 
WriteSupport[InternalRow] with Logging {
 
   case t: UserDefinedType[_] => makeWriter(t.sqlType)
 
-  // TODO Adds IntervalType support
+  case CalendarIntervalType =>
+    (row: SpecializedGetters, ordinal: Int) =>
+      val interval = row.getInterval(ordinal)
+      val buf = ByteBuffer.wrap(reusableBuffer)
+      buf.order(ByteOrder.LITTLE_ENDIAN)
+        .putInt((interval.milliseconds()).toInt)
 
 Review comment:
   @MaxGekk the doc says:
   
   > three little-endian unsigned integers
   
   what happens if some parts of the interval are negative and those negative 
values are written here?
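The concern can be demonstrated with a short sketch (Python here purely for illustration; `encode_interval` is a hypothetical helper mirroring Parquet's 12-byte INTERVAL layout, not Spark code). Packing a negative component and reading the bytes back as unsigned yields a huge value rather than the intended negative one:

```python
import struct

# Parquet's INTERVAL logical type stores 12 bytes: three little-endian
# *unsigned* 32-bit integers (months, days, milliseconds).
def encode_interval(months: int, days: int, millis: int) -> bytes:
    # "<iii" packs signed 32-bit little-endian ints; a Parquet reader
    # will reinterpret the same bytes as unsigned.
    return struct.pack("<iii", months, days, millis)

raw = encode_interval(0, 0, -1)
months, days, millis = struct.unpack("<III", raw)  # the unsigned view
print(millis)  # 4294967295, not -1
```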





[GitHub] [spark] BryanCutler commented on issue #26533: [WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11

2019-11-14 Thread GitBox
BryanCutler commented on issue #26533: [WIP][test-hadoop3.2][test-java11] Test 
Hadoop 2.7 with JDK 11
URL: https://github.com/apache/spark/pull/26533#issuecomment-554254028
 
 
   Oh, I saw you just added it as a property in the pom. Hopefully that should 
do the trick.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26416: [SPARK-29779][CORE] Compact old event log files and cleanup

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26416: [SPARK-29779][CORE] Compact 
old event log files and cleanup
URL: https://github.com/apache/spark/pull/26416#issuecomment-554253786
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113845/
   Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #26416: [SPARK-29779][CORE] Compact old event log files and cleanup

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26416: [SPARK-29779][CORE] Compact old event 
log files and cleanup
URL: https://github.com/apache/spark/pull/26416#issuecomment-554253786
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113845/
   Test FAILed.





[GitHub] [spark] BryanCutler commented on issue #26533: [WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11

2019-11-14 Thread GitBox
BryanCutler commented on issue #26533: [WIP][test-hadoop3.2][test-java11] Test 
Hadoop 2.7 with JDK 11
URL: https://github.com/apache/spark/pull/26533#issuecomment-554253576
 
 
   right, I think for JDK >= 9 "-Dio.netty.tryReflectionSetAccessible=true" 
needs to be set. This is due to a Netty API that Arrow started using. I can 
look into it and try to figure out what we should do for Spark. 
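For reference, one possible way to forward that flag (a sketch only; the thread had not yet settled where Spark should set it, and `your-app.jar` is a placeholder):

```shell
# Sketch: pass the Netty flag to driver and executor JVMs on JDK 9+
# via Spark's standard extra-JVM-options settings.
spark-submit \
  --conf "spark.driver.extraJavaOptions=-Dio.netty.tryReflectionSetAccessible=true" \
  --conf "spark.executor.extraJavaOptions=-Dio.netty.tryReflectionSetAccessible=true" \
  your-app.jar
```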





[GitHub] [spark] AmplabJenkins commented on issue #26416: [SPARK-29779][CORE] Compact old event log files and cleanup

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26416: [SPARK-29779][CORE] Compact old event 
log files and cleanup
URL: https://github.com/apache/spark/pull/26416#issuecomment-554253778
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26416: [SPARK-29779][CORE] Compact old event log files and cleanup

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26416: [SPARK-29779][CORE] Compact 
old event log files and cleanup
URL: https://github.com/apache/spark/pull/26416#issuecomment-554253778
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] SparkQA commented on issue #26416: [SPARK-29779][CORE] Compact old event log files and cleanup

2019-11-14 Thread GitBox
SparkQA commented on issue #26416: [SPARK-29779][CORE] Compact old event log 
files and cleanup
URL: https://github.com/apache/spark/pull/26416#issuecomment-554253366
 
 
   **[Test build #113845 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113845/testReport)**
 for PR 26416 at commit 
[`a3a0c4a`](https://github.com/apache/spark/commit/a3a0c4acfdd602ad0a3566cc49ac38b707603fe7).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA removed a comment on issue #26416: [SPARK-29779][CORE] Compact old event log files and cleanup

2019-11-14 Thread GitBox
SparkQA removed a comment on issue #26416: [SPARK-29779][CORE] Compact old 
event log files and cleanup
URL: https://github.com/apache/spark/pull/26416#issuecomment-554213947
 
 
   **[Test build #113845 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113845/testReport)**
 for PR 26416 at commit 
[`a3a0c4a`](https://github.com/apache/spark/commit/a3a0c4acfdd602ad0a3566cc49ac38b707603fe7).





[GitHub] [spark] AmplabJenkins commented on issue #26533: [WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26533: [WIP][test-hadoop3.2][test-java11] 
Test Hadoop 2.7 with JDK 11
URL: https://github.com/apache/spark/pull/26533#issuecomment-554252945
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26533: [WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26533: 
[WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11
URL: https://github.com/apache/spark/pull/26533#issuecomment-554252950
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18728/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26533: [WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26533: [WIP][test-hadoop3.2][test-java11] 
Test Hadoop 2.7 with JDK 11
URL: https://github.com/apache/spark/pull/26533#issuecomment-554252950
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18728/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26533: [WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26533: 
[WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11
URL: https://github.com/apache/spark/pull/26533#issuecomment-554252945
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.

2019-11-14 Thread GitBox
beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] 
Support ANSI SQL filter predicate for aggregate expression.
URL: https://github.com/apache/spark/pull/26420#discussion_r346696563
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggregationIterator.scala
 ##
 @@ -157,31 +159,61 @@ abstract class AggregationIterator(
   inputAttributes: Seq[Attribute]): (InternalRow, InternalRow) => Unit = {
 val joinedRow = new JoinedRow
 if (expressions.nonEmpty) {
-  val mergeExpressions = functions.zip(expressions).flatMap {
-case (ae: DeclarativeAggregate, expression) =>
-  expression.mode match {
+  val filterExpressions = expressions.map(_.filter)
+  val notExistsFilter = !filterExpressions.exists(_ != None)
+  var isFinalOrMerge = false
+  val mergeExpressions = functions.zipWithIndex.collect {
+case (ae: DeclarativeAggregate, i) =>
+  expressions(i).mode match {
 case Partial | Complete => ae.updateExpressions
-case PartialMerge | Final => ae.mergeExpressions
+case PartialMerge | Final =>
+  isFinalOrMerge = true
+  ae.mergeExpressions
   }
 case (agg: AggregateFunction, _) => 
Seq.fill(agg.aggBufferAttributes.length)(NoOp)
   }
   val updateFunctions = functions.zipWithIndex.collect {
 case (ae: ImperativeAggregate, i) =>
   expressions(i).mode match {
-case Partial | Complete =>
+case Partial | Complete if filterExpressions(i).isDefined =>
+  (buffer: InternalRow, row: InternalRow) =>
+if (predicates(i).eval(row)) { ae.update(buffer, row) }
+case Partial | Complete if filterExpressions(i).isEmpty =>
   (buffer: InternalRow, row: InternalRow) => ae.update(buffer, row)
 case PartialMerge | Final =>
   (buffer: InternalRow, row: InternalRow) => ae.merge(buffer, row)
   }
   }.toArray
   // This projection is used to merge buffer values for all 
expression-based aggregates.
   val aggregationBufferSchema = functions.flatMap(_.aggBufferAttributes)
-  val updateProjection =
-newMutableProjection(mergeExpressions, aggregationBufferSchema ++ 
inputAttributes)
+  val updateProjection = newMutableProjection(
+mergeExpressions.flatMap(_.seq), aggregationBufferSchema ++ 
inputAttributes)
 
   (currentBuffer: InternalRow, row: InternalRow) => {
 // Process all expression-based aggregate functions.
-updateProjection.target(currentBuffer)(joinedRow(currentBuffer, row))
+if (notExistsFilter || isFinalOrMerge) {
 
 Review comment:
   Good idea!
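The behavior under discussion (the FILTER predicate gates only the per-row update path, while merge paths stay unconditional) can be sketched outside Spark. `filtered_sum` below is a hypothetical stand-in for a filtered aggregate, not the PR's code:

```python
# Minimal model of an aggregate with an ANSI FILTER (WHERE ...) clause:
# rows failing the predicate simply never reach the update step.
def filtered_sum(rows, value, predicate=None):
    buf = 0  # the aggregation buffer
    for row in rows:
        if predicate is None or predicate(row):  # the filter gate
            buf += value(row)
    return buf

rows = [{"x": 1}, {"x": -2}, {"x": 3}]
print(filtered_sum(rows, lambda r: r["x"]))                        # 2
print(filtered_sum(rows, lambda r: r["x"], lambda r: r["x"] > 0))  # 4
```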





[GitHub] [spark] cloud-fan commented on issue #26409: [SPARK-29655][SQL] Read bucketed tables obeys spark.sql.shuffle.partitions

2019-11-14 Thread GitBox
cloud-fan commented on issue #26409: [SPARK-29655][SQL] Read bucketed tables 
obeys spark.sql.shuffle.partitions
URL: https://github.com/apache/spark/pull/26409#issuecomment-554252095
 
 
   thanks, merging to master!





[GitHub] [spark] HyukjinKwon commented on a change in pull request #26102: [SPARK-29448][SQL] Support the `INTERVAL` type by Parquet datasource

2019-11-14 Thread GitBox
HyukjinKwon commented on a change in pull request #26102: [SPARK-29448][SQL] 
Support the `INTERVAL` type by Parquet datasource
URL: https://github.com/apache/spark/pull/26102#discussion_r346695819
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
 ##
 @@ -498,10 +498,8 @@ case class DataSource(
   outputColumnNames: Seq[String],
   physicalPlan: SparkPlan): BaseRelation = {
 val outputColumns = DataWritingCommand.logicalPlanOutputWithNames(data, 
outputColumnNames)
-if 
(outputColumns.map(_.dataType).exists(_.isInstanceOf[CalendarIntervalType])) {
 
 Review comment:
   and Python and R need a proper conversion for both read and write as well.





[GitHub] [spark] cloud-fan closed pull request #26409: [SPARK-29655][SQL] Read bucketed tables obeys spark.sql.shuffle.partitions

2019-11-14 Thread GitBox
cloud-fan closed pull request #26409: [SPARK-29655][SQL] Read bucketed tables 
obeys spark.sql.shuffle.partitions
URL: https://github.com/apache/spark/pull/26409
 
 
   





[GitHub] [spark] cloud-fan commented on a change in pull request #26102: [SPARK-29448][SQL] Support the `INTERVAL` type by Parquet datasource

2019-11-14 Thread GitBox
cloud-fan commented on a change in pull request #26102: [SPARK-29448][SQL] 
Support the `INTERVAL` type by Parquet datasource
URL: https://github.com/apache/spark/pull/26102#discussion_r344585132
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
 ##
 @@ -498,10 +498,8 @@ case class DataSource(
   outputColumnNames: Seq[String],
   physicalPlan: SparkPlan): BaseRelation = {
 val outputColumns = DataWritingCommand.logicalPlanOutputWithNames(data, 
outputColumnNames)
-if 
(outputColumns.map(_.dataType).exists(_.isInstanceOf[CalendarIntervalType])) {
 
 Review comment:
   the interval type is kind of an internal type for now. It's a big decision 
whether we can read/write it from/to data sources.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26534: [SPARK-29343][SQL][FOLLOW-UP] Remove AVG from order-insensitive aggregates in EliminateSorts

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26534: [SPARK-29343][SQL][FOLLOW-UP] 
Remove AVG from order-insensitive aggregates in EliminateSorts
URL: https://github.com/apache/spark/pull/26534#issuecomment-554251312
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18727/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26534: [SPARK-29343][SQL][FOLLOW-UP] Remove AVG from order-insensitive aggregates in EliminateSorts

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26534: [SPARK-29343][SQL][FOLLOW-UP] 
Remove AVG from order-insensitive aggregates in EliminateSorts
URL: https://github.com/apache/spark/pull/26534#issuecomment-554251307
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26534: [SPARK-29343][SQL][FOLLOW-UP] Remove AVG from order-insensitive aggregates in EliminateSorts

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26534: [SPARK-29343][SQL][FOLLOW-UP] Remove 
AVG from order-insensitive aggregates in EliminateSorts
URL: https://github.com/apache/spark/pull/26534#issuecomment-554251307
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26534: [SPARK-29343][SQL][FOLLOW-UP] Remove AVG from order-insensitive aggregates in EliminateSorts

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26534: [SPARK-29343][SQL][FOLLOW-UP] Remove 
AVG from order-insensitive aggregates in EliminateSorts
URL: https://github.com/apache/spark/pull/26534#issuecomment-554251312
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18727/
   Test PASSed.





[GitHub] [spark] yaooqinn commented on a change in pull request #26491: [SPARK-29870][SQL] Unify the logic of multi-units interval string to CalendarInterval

2019-11-14 Thread GitBox
yaooqinn commented on a change in pull request #26491: [SPARK-29870][SQL] Unify 
the logic of multi-units interval string to CalendarInterval
URL: https://github.com/apache/spark/pull/26491#discussion_r346694871
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
 ##
 @@ -547,14 +478,16 @@ object IntervalUtils {
 case ' ' =>
   fraction /= NANOS_PER_MICROS.toInt
   state = TRIM_BEFORE_UNIT
-case _ => return null
+case _ if '0' <= b && b <= '9' =>
+  throwIAE(s"invalid value fractional part '$fraction$nextWord' 
out of range")
 
 Review comment:
   I'd improve this.
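A rough model of the check the diff introduces (Python for illustration; `parse_fraction` is hypothetical and only captures the idea of raising on an over-long fractional part instead of returning null):

```python
# A fractional-seconds field longer than 9 digits exceeds nanosecond
# precision, so raise instead of silently returning null.
def parse_fraction(digits: str) -> int:
    if len(digits) > 9:
        raise ValueError(
            f"invalid value fractional part '{digits}' out of range")
    nanos = int(digits.ljust(9, "0"))  # right-pad to nanoseconds
    return nanos // 1000               # convert to microseconds

print(parse_fraction("5"))  # 500000 microseconds
```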





[GitHub] [spark] SparkQA commented on issue #26534: [SPARK-29343][SQL][FOLLOW-UP] Remove AVG from order-insensitive aggregates in EliminateSorts

2019-11-14 Thread GitBox
SparkQA commented on issue #26534: [SPARK-29343][SQL][FOLLOW-UP] Remove AVG 
from order-insensitive aggregates in EliminateSorts
URL: https://github.com/apache/spark/pull/26534#issuecomment-554250511
 
 
   **[Test build #113858 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113858/testReport)**
 for PR 26534 at commit 
[`a44f218`](https://github.com/apache/spark/commit/a44f21890f6f52e80a80f31a0437e0ac99fe6b98).





[GitHub] [spark] HyukjinKwon commented on a change in pull request #26279: [SPARK-28382][SQL] Add array function - unnest

2019-11-14 Thread GitBox
HyukjinKwon commented on a change in pull request #26279: [SPARK-28382][SQL] 
Add array function - unnest
URL: https://github.com/apache/spark/pull/26279#discussion_r346694150
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala
 ##
 @@ -343,6 +344,87 @@ abstract class ExplodeBase extends UnaryExpression with 
CollectionGenerator with
   }
 }
 
+@ExpressionDescription(
+  usage = "_FUNC_(expr) - Separates the elements of array `expr` into multiple 
rows recursively.",
+  examples = """
+Examples:
+  > SELECT _FUNC_(array(10, 20));
+   10
+   20
+  > SELECT _FUNC_(a) FROM VALUES (array(1,2)), (array(3,4)) AS v1(a);
+   1
+   2
+   3
+   4
+  > SELECT _FUNC_(a) FROM VALUES (array(array(1,2), array(3,4))) AS v1(a);
+   1
+   2
+   3
+   4
+  """,
+  since = "3.0.0")
+case class UnNest(child: Expression) extends UnaryExpression with Generator 
with CodegenFallback {
+
+  override def prettyName: String = "unnest"
+
+  override def elementSchema: StructType = {
+new StructType().add(prettyName, getLeafDataType(child.dataType), true)
+  }
+
+  // TODO: multidimensional arrays must have array expressions with matching 
dimensions
+  override def checkInputDataTypes(): TypeCheckResult = child.dataType match {
+case ArrayType(_: NullType, _) => TypeCheckResult.TypeCheckSuccess
+case _: ArrayType =>
+  var currentType = child.dataType
+  while(currentType.isInstanceOf[ArrayType]) {
+val arrayType = currentType.asInstanceOf[ArrayType]
+if (arrayType.containsNull && 
!arrayType.elementType.isInstanceOf[AtomicType]) {
+  return TypeCheckResult.TypeCheckFailure("multidimensional arrays 
must not contain nulls")
+}
+currentType = getElementTypeOrItself(currentType)
+  }
+  TypeUtils.checkForSameTypeInputExpr(child.children.map(_.dataType), 
s"function $prettyName")
+
+case _ =>
+  TypeCheckResult.TypeCheckFailure(
+s"input to function unnest should be array type not 
${child.dataType.catalogString}")
+  }
+
+  @scala.annotation.tailrec
+  private def getLeafDataType(typ: DataType): DataType = typ match {
+case ArrayType(et, _) => getLeafDataType(et)
+case at => at
+  }
+
+  private def getElementTypeOrItself(typ: DataType): DataType = typ match {
+case ArrayType(et, _) => et
+case _ => typ
+  }
+
+  override def eval(input: InternalRow): TraversableOnce[InternalRow] = {
+var inputArray = child.eval(input).asInstanceOf[ArrayData]
+if (inputArray == null) {
+  Nil
+} else {
+  val rows = ArrayBuffer[InternalRow]()
+  var currentType = child.dataType
+  while (currentType.isInstanceOf[ArrayType]) {
+val currentElementType = getElementTypeOrItself(currentType)
+if (!currentElementType.isInstanceOf[ArrayType]) {
+  inputArray.foreach(currentElementType, (_, e) => rows += 
InternalRow(e))
 
 Review comment:
   Can we avoid Scala's foreach in such a critical path (see 
https://github.com/databricks/scala-style-guide#traversal-and-zipwithindex)?  
We can at least switch to a regular for loop instead.
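The recursive flattening that `UnNest` implements can be sketched in a few lines (Python for illustration; this mirrors the semantics shown in the `@ExpressionDescription` examples, not the Catalyst implementation):

```python
# Recursively flatten a nested array into one row per leaf element,
# matching the unnest examples in the diff above.
def unnest(arr):
    rows = []
    for e in arr:
        if isinstance(e, list):
            rows.extend(unnest(e))  # descend into nested arrays
        else:
            rows.append(e)          # leaf element becomes a row
    return rows

print(unnest([10, 20]))          # [10, 20]
print(unnest([[1, 2], [3, 4]]))  # [1, 2, 3, 4]
```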





[GitHub] [spark] cloud-fan commented on issue #26534: [SPARK-29343][SQL][FOLLOW-UP] Remove AVG from order-insensitive aggregates in EliminateSorts

2019-11-14 Thread GitBox
cloud-fan commented on issue #26534: [SPARK-29343][SQL][FOLLOW-UP] Remove AVG 
from order-insensitive aggregates in EliminateSorts
URL: https://github.com/apache/spark/pull/26534#issuecomment-554249765
 
 
   retest this please





[GitHub] [spark] HyukjinKwon commented on a change in pull request #26279: [SPARK-28382][SQL] Add array function - unnest

2019-11-14 Thread GitBox
HyukjinKwon commented on a change in pull request #26279: [SPARK-28382][SQL] 
Add array function - unnest
URL: https://github.com/apache/spark/pull/26279#discussion_r346693625
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala
 ##
 @@ -343,6 +344,87 @@ abstract class ExplodeBase extends UnaryExpression with 
CollectionGenerator with
   }
 }
 
+@ExpressionDescription(
+  usage = "_FUNC_(expr) - Separates the elements of array `expr` into multiple 
rows recursively.",
+  examples = """
+Examples:
+  > SELECT _FUNC_(array(10, 20));
+   10
+   20
+  > SELECT _FUNC_(a) FROM VALUES (array(1,2)), (array(3,4)) AS v1(a);
+   1
+   2
+   3
+   4
+  > SELECT _FUNC_(a) FROM VALUES (array(array(1,2), array(3,4))) AS v1(a);
+   1
+   2
+   3
+   4
+  """,
+  since = "3.0.0")
+case class UnNest(child: Expression) extends UnaryExpression with Generator 
with CodegenFallback {
 
 Review comment:
   Can we implement codegen as well?





[GitHub] [spark] beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.

2019-11-14 Thread GitBox
beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] 
Support ANSI SQL filter predicate for aggregate expression.
URL: https://github.com/apache/spark/pull/26420#discussion_r346693286
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggregationIterator.scala
 ##
 @@ -157,31 +159,61 @@ abstract class AggregationIterator(
   inputAttributes: Seq[Attribute]): (InternalRow, InternalRow) => Unit = {
 val joinedRow = new JoinedRow
 if (expressions.nonEmpty) {
-  val mergeExpressions = functions.zip(expressions).flatMap {
-case (ae: DeclarativeAggregate, expression) =>
-  expression.mode match {
+  val filterExpressions = expressions.map(_.filter)
+  val notExistsFilter = !filterExpressions.exists(_ != None)
+  var isFinalOrMerge = false
 
 Review comment:
   `isFinalOrMerge` is related to `expressions`.





[GitHub] [spark] HyukjinKwon commented on a change in pull request #26279: [SPARK-28382][SQL] Add array function - unnest

2019-11-14 Thread GitBox
HyukjinKwon commented on a change in pull request #26279: [SPARK-28382][SQL] 
Add array function - unnest
URL: https://github.com/apache/spark/pull/26279#discussion_r346693230
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala
 ##
 @@ -343,6 +344,87 @@ abstract class ExplodeBase extends UnaryExpression with 
CollectionGenerator with
   }
 }
 
+@ExpressionDescription(
+  usage = "_FUNC_(expr) - Separates the elements of array `expr` into multiple 
rows recursively.",
+  examples = """
+Examples:
+  > SELECT _FUNC_(array(10, 20));
+   10
+   20
+  > SELECT _FUNC_(a) FROM VALUES (array(1,2)), (array(3,4)) AS v1(a);
+   1
+   2
+   3
+   4
+  > SELECT _FUNC_(a) FROM VALUES (array(array(1,2), array(3,4))) AS v1(a);
+   1
+   2
+   3
+   4
+  """,
+  since = "3.0.0")
+case class UnNest(child: Expression) extends UnaryExpression with Generator 
with CodegenFallback {
+
+  override def prettyName: String = "unnest"
+
+  override def elementSchema: StructType = {
+new StructType().add(prettyName, getLeafDataType(child.dataType), true)
+  }
+
+  // TODO: multidimensional arrays must have array expressions with matching 
dimensions
 
 Review comment:
   what does this mean?





[GitHub] [spark] cloud-fan commented on a change in pull request #26509: [SPARK-29427][SQL] Add API to convert RelationalGroupedDataset to KeyValueGroupedDataset

2019-11-14 Thread GitBox
cloud-fan commented on a change in pull request #26509: [SPARK-29427][SQL] Add 
API to convert RelationalGroupedDataset to KeyValueGroupedDataset
URL: https://github.com/apache/spark/pull/26509#discussion_r346692869
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala
 ##
 @@ -129,6 +132,32 @@ class RelationalGroupedDataset protected[sql](
 (inputExpr: Expression) => exprToFunc(inputExpr)
   }
 
+  /**
+   * Returns a `KeyValueGroupedDataset` where the data is grouped by the 
grouping expressions
+   * of current `RelationalGroupedDataset`.
+   *
+   * @since 3.0.0
+   */
+  def keyAs[U : Encoder]: KeyValueGroupedDataset[U, T] = {
+val keyEncoder = encoderFor[U]
+val aliasedGrps = groupingExprs.map { g =>
+ g.transformDown {
+   case u: UnresolvedAttribute => df.resolve(u.name)
+ }
+}.map(alias)
+val additionalCols = aliasedGrps.filter(g => 
!df.logicalPlan.outputSet.contains(g.toAttribute))
+val qe = Dataset.ofRows(
+  df.sparkSession,
+  Project(df.logicalPlan.output ++ additionalCols, 
df.logicalPlan)).queryExecution
 
 Review comment:
   `groupingAttributes` is private in `KeyValueGroupedDataset`





[GitHub] [spark] HyukjinKwon commented on a change in pull request #26279: [SPARK-28382][SQL] Add array function - unnest

2019-11-14 Thread GitBox
HyukjinKwon commented on a change in pull request #26279: [SPARK-28382][SQL] 
Add array function - unnest
URL: https://github.com/apache/spark/pull/26279#discussion_r346692879
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala
 ##
 @@ -343,6 +344,87 @@ abstract class ExplodeBase extends UnaryExpression with 
CollectionGenerator with
   }
 }
 
+@ExpressionDescription(
+  usage = "_FUNC_(expr) - Separates the elements of array `expr` into multiple 
rows recursively.",
+  examples = """
+Examples:
+  > SELECT _FUNC_(array(10, 20));
+   10
+   20
+  > SELECT _FUNC_(a) FROM VALUES (array(1,2)), (array(3,4)) AS v1(a);
+   1
+   2
+   3
+   4
+  > SELECT _FUNC_(a) FROM VALUES (array(array(1,2), array(3,4))) AS v1(a);
+   1
+   2
+   3
+   4
+  """,
+  since = "3.0.0")
+case class UnNest(child: Expression) extends UnaryExpression with Generator 
with CodegenFallback {
+
+  override def prettyName: String = "unnest"
 
 Review comment:
   I think we don't need to override this since it matches the class name
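   For reference, a minimal sketch of the default-naming behavior this comment relies on. The `NamedNode` trait below is hypothetical, not Catalyst's actual `Expression` code: the pretty name falls back to the lowercased class name, so a class named `UnNest` already surfaces as `unnest` without an explicit override.

```scala
import java.util.Locale

// Hypothetical illustration only -- not the real Catalyst trait.
// The default pretty name is derived from the class name, lowercased,
// which is why an override that just returns "unnest" is redundant.
trait NamedNode {
  def nodeName: String = getClass.getSimpleName
  def prettyName: String = nodeName.toLowerCase(Locale.ROOT)
}

case class UnNest(child: String) extends NamedNode
```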





[GitHub] [spark] cloud-fan commented on a change in pull request #25990: [SPARK-29248][SQL][WIP] Pass in number of partitions to WriteBuilder

2019-11-14 Thread GitBox
cloud-fan commented on a change in pull request #25990: [SPARK-29248][SQL][WIP] 
Pass in number of partitions to WriteBuilder
URL: https://github.com/apache/spark/pull/25990#discussion_r346692667
 
 

 ##
 File path: 
sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/streaming/StreamingWrite.java
 ##
 @@ -48,8 +49,10 @@
*
* If this method fails (by throwing an exception), the action will fail and 
no Spark job will be
* submitted.
+   *
+   * @param writeInfo Information about the RDD that will be written to this 
data writer
*/
-  StreamingDataWriterFactory createStreamingWriterFactory();
+  StreamingDataWriterFactory createStreamingWriterFactory(PhysicalWriteInfo 
writeInfo);
 
 Review comment:
   for simplicity, we can just call the parameter "info"





[GitHub] [spark] HyukjinKwon commented on a change in pull request #26279: [SPARK-28382][SQL] Add array function - unnest

2019-11-14 Thread GitBox
HyukjinKwon commented on a change in pull request #26279: [SPARK-28382][SQL] 
Add array function - unnest
URL: https://github.com/apache/spark/pull/26279#discussion_r346692072
 
 

 ##
 File path: 
sql/core/src/test/resources/sql-tests/inputs/postgreSQL/aggregates_part4.sql
 ##
 @@ -404,12 +403,8 @@
 
 -- [SPARK-28382] Array Functions: unnest
 -- check collation-sensitive matching between grouping expressions
--- select v||'a', case v||'a' when 'aa' then 1 else 0 end, count(*)
---   from unnest(array['a','b']) u(v)
---  group by v||'a' order by 1;
--- select v||'a', case when v||'a' = 'aa' then 1 else 0 end, count(*)
---   from unnest(array['a','b']) u(v)
---  group by v||'a' order by 1;
+select v||'a', case v||'a' when 'aa' then 1 else 0 end, count(*) from (select 
unnest(array('a','b')) as v) u group by v||'a' order by 1;
+select v||'a', case when v||'a' = 'aa' then 1 else 0 end, count(*) from 
(select unnest(array('a','b')) as v) u group by v||'a' order by 1;
 
 Review comment:
   Why should we inline them? can't we just uncomment the tests?





[GitHub] [spark] cloud-fan commented on a change in pull request #25990: [SPARK-29248][SQL][WIP] Pass in number of partitions to WriteBuilder

2019-11-14 Thread GitBox
cloud-fan commented on a change in pull request #25990: [SPARK-29248][SQL][WIP] 
Pass in number of partitions to WriteBuilder
URL: https://github.com/apache/spark/pull/25990#discussion_r346691872
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/write/WriteInfoImpl.scala
 ##
 @@ -0,0 +1,24 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.write
+
+import org.apache.spark.sql.types.StructType
+
+private[sql] case class WriteInfoImpl(
+queryId: String,
+schema: StructType) extends WriteInfo
 
 Review comment:
   how about `LogicalWriteInfo`





[GitHub] [spark] HyukjinKwon commented on a change in pull request #26279: [SPARK-28382][SQL] Add array function - unnest

2019-11-14 Thread GitBox
HyukjinKwon commented on a change in pull request #26279: [SPARK-28382][SQL] 
Add array function - unnest
URL: https://github.com/apache/spark/pull/26279#discussion_r346691887
 
 

 ##
 File path: sql/core/src/test/resources/sql-tests/inputs/postgreSQL/join.sql
 ##
 @@ -1089,10 +1089,9 @@ order by fault;
 -- left join unnest(v1ys) as u1(u1y) on u1y = v2y;
 
 -- [SPARK-28382] Array Functions: unnest
--- select * from
--- (values (1, array(10,20)), (2, array(20,30))) as v1(v1x,v1ys)
--- left join (values (1, 10), (2, 20)) as v2(v2x,v2y) on v2x = v1x
--- left join unnest(v1ys) as u1(u1y) on u1y = v2y;
+create or replace temporary view v1 as select * from values (1, array(10,20)), 
(2, array(20,30)) as v1(v1x,v1ys);
+create or replace temporary view v2 as select * from values (1, 10), (2, 20) 
as v2(v2x,v2y);
 
 Review comment:
   Hm, why is it different from the commented version? Does it match PostgreSQL's?





[GitHub] [spark] beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.

2019-11-14 Thread GitBox
beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] 
Support ANSI SQL filter predicate for aggregate expression.
URL: https://github.com/apache/spark/pull/26420#discussion_r346691763
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggregationIterator.scala
 ##
 @@ -40,6 +41,7 @@ abstract class AggregationIterator(
 aggregateAttributes: Seq[Attribute],
 initialInputBufferOffset: Int,
 resultExpressions: Seq[NamedExpression],
+predicates: HashMap[Int, GenPredicate],
 
 Review comment:
   OK.





[GitHub] [spark] HyukjinKwon commented on a change in pull request #26279: [SPARK-28382][SQL] Add array function - unnest

2019-11-14 Thread GitBox
HyukjinKwon commented on a change in pull request #26279: [SPARK-28382][SQL] 
Add array function - unnest
URL: https://github.com/apache/spark/pull/26279#discussion_r346691561
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/GeneratorFunctionSuite.scala
 ##
 @@ -344,6 +344,7 @@ class GeneratorFunctionSuite extends QueryTest with 
SharedSparkSession {
   }
 }
 
+
 
 Review comment:
   I would remove unnecessary changes.





[GitHub] [spark] cloud-fan commented on a change in pull request #25990: [SPARK-29248][SQL][WIP] Pass in number of partitions to WriteBuilder

2019-11-14 Thread GitBox
cloud-fan commented on a change in pull request #25990: [SPARK-29248][SQL][WIP] 
Pass in number of partitions to WriteBuilder
URL: https://github.com/apache/spark/pull/25990#discussion_r346691710
 
 

 ##
 File path: 
sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/PhysicalWriteInfo.java
 ##
 @@ -0,0 +1,17 @@
+package org.apache.spark.sql.connector.write;
+
+import org.apache.spark.annotation.Experimental;
+import 
org.apache.spark.sql.connector.write.streaming.StreamingDataWriterFactory;
+
+/**
+ * :: Experimental ::
+ * This interface contains physical (i.e. RDD) write information that data 
sources can use when generating a
+ * {@link DataWriterFactory} or a {@link StreamingDataWriterFactory}.
+ */
+@Experimental
 
 Review comment:
   we shall use `@Evolving`





[GitHub] [spark] beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.

2019-11-14 Thread GitBox
beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] 
Support ANSI SQL filter predicate for aggregate expression.
URL: https://github.com/apache/spark/pull/26420#discussion_r346691731
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggregationIterator.scala
 ##
 @@ -157,31 +159,61 @@ abstract class AggregationIterator(
   inputAttributes: Seq[Attribute]): (InternalRow, InternalRow) => Unit = {
 val joinedRow = new JoinedRow
 if (expressions.nonEmpty) {
-  val mergeExpressions = functions.zip(expressions).flatMap {
-case (ae: DeclarativeAggregate, expression) =>
-  expression.mode match {
+  val filterExpressions = expressions.map(_.filter)
+  val notExistsFilter = !filterExpressions.exists(_ != None)
 
 Review comment:
   Good suggestion!





[GitHub] [spark] HyukjinKwon commented on issue #26458: [SPARK-29821] Allow calling non-aggregate SQL functions with column name

2019-11-14 Thread GitBox
HyukjinKwon commented on issue #26458: [SPARK-29821] Allow calling 
non-aggregate SQL functions with column name
URL: https://github.com/apache/spark/pull/26458#issuecomment-554247661
 
 
   haha actually we left a note
   
   
https://github.com/apache/spark/blob/6f687691ef6212c125f4023fcd79839ee9244049/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L58-L60





[GitHub] [spark] wangyum commented on issue #26533: [WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11

2019-11-14 Thread GitBox
wangyum commented on issue #26533: [WIP][test-hadoop3.2][test-java11] Test 
Hadoop 2.7 with JDK 11
URL: https://github.com/apache/spark/pull/26533#issuecomment-554247101
 
 
   @HyukjinKwon @BryanCutler It seems the failed tests are related to https://issues.apache.org/jira/browse/ARROW-5412?





[GitHub] [spark] cloud-fan commented on a change in pull request #26491: [SPARK-29870][SQL] Unify the logic of multi-units interval string to CalendarInterval

2019-11-14 Thread GitBox
cloud-fan commented on a change in pull request #26491: [SPARK-29870][SQL] 
Unify the logic of multi-units interval string to CalendarInterval
URL: https://github.com/apache/spark/pull/26491#discussion_r346690817
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
 ##
 @@ -598,26 +531,26 @@ object IntervalUtils {
 } else if (s.matchAt(microsStr, i)) {
   microseconds = Math.addExact(microseconds, currentValue)
   i += microsStr.numBytes()
-} else return null
-  case _ => return null
+} else throwIAE(s"invalid unit '$nextWord'")
+  case _ => throwIAE(s"invalid unit '$nextWord'")
 }
   } catch {
-case _: ArithmeticException => return null
+case e: ArithmeticException => throwIAE(e.getMessage, e)
   }
   state = UNIT_SUFFIX
 case UNIT_SUFFIX =>
   b match {
 case 's' => state = UNIT_END
 case ' ' => state = TRIM_BEFORE_SIGN
-case _ => return null
+case _ => throwIAE(s"invalid unit suffix '$nextWord'")
 
 Review comment:
   for error reporting, backtracking is usually necessary. For example, it's better to tell users that `123a` is not valid, instead of just saying `a` is not valid.
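   The backtracking idea can be sketched outside Spark's actual `IntervalUtils`. The `parseUnit` helper below is hypothetical, for illustration only: it keeps the whole offending token so the error message shows `123a`, rather than just the character that tripped the state machine.

```scala
// Illustration of the review point only -- not Spark's IntervalUtils.
object IntervalErrorDemo {
  // Returns a duration in microseconds for tokens like "10ms" or "5s".
  def parseUnit(token: String): Long = {
    // Split the token into its leading digits and the trailing unit text.
    val (digits, suffix) = token.span(_.isDigit)
    suffix match {
      case "ms" => digits.toLong * 1000L    // milliseconds -> microseconds
      case "s"  => digits.toLong * 1000000L // seconds -> microseconds
      case _ =>
        // Report the full token, not the single bad character, so the
        // user sees "123a" instead of "invalid unit suffix 'a'".
        throw new IllegalArgumentException(s"invalid value '$token' in interval string")
    }
  }
}
```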





[GitHub] [spark] viirya commented on a change in pull request #26509: [SPARK-29427][SQL] Add API to convert RelationalGroupedDataset to KeyValueGroupedDataset

2019-11-14 Thread GitBox
viirya commented on a change in pull request #26509: [SPARK-29427][SQL] Add API 
to convert RelationalGroupedDataset to KeyValueGroupedDataset
URL: https://github.com/apache/spark/pull/26509#discussion_r346690836
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala
 ##
 @@ -129,6 +132,32 @@ class RelationalGroupedDataset protected[sql](
 (inputExpr: Expression) => exprToFunc(inputExpr)
   }
 
+  /**
+   * Returns a `KeyValueGroupedDataset` where the data is grouped by the 
grouping expressions
+   * of current `RelationalGroupedDataset`.
+   *
+   * @since 3.0.0
+   */
+  def keyAs[U : Encoder]: KeyValueGroupedDataset[U, T] = {
+val keyEncoder = encoderFor[U]
+val aliasedGrps = groupingExprs.map { g =>
+ g.transformDown {
+   case u: UnresolvedAttribute => df.resolve(u.name)
+ }
+}.map(alias)
+val additionalCols = aliasedGrps.filter(g => 
!df.logicalPlan.outputSet.contains(g.toAttribute))
+val qe = Dataset.ofRows(
+  df.sparkSession,
+  Project(df.logicalPlan.output ++ additionalCols, 
df.logicalPlan)).queryExecution
 
 Review comment:
   Will it also break source compatibility?





[GitHub] [spark] AmplabJenkins removed a comment on issue #26519: [SPARK-29894][SQL][WEBUI] Add Codegen Stage Id to Spark plan graphs in Web UI SQL Tab

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26519: [SPARK-29894][SQL][WEBUI] Add 
Codegen Stage Id to Spark plan graphs in Web UI SQL Tab
URL: https://github.com/apache/spark/pull/26519#issuecomment-554246861
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113838/
   Test FAILed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26519: [SPARK-29894][SQL][WEBUI] Add Codegen Stage Id to Spark plan graphs in Web UI SQL Tab

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26519: [SPARK-29894][SQL][WEBUI] Add 
Codegen Stage Id to Spark plan graphs in Web UI SQL Tab
URL: https://github.com/apache/spark/pull/26519#issuecomment-554246855
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #26519: [SPARK-29894][SQL][WEBUI] Add Codegen Stage Id to Spark plan graphs in Web UI SQL Tab

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26519: [SPARK-29894][SQL][WEBUI] Add Codegen 
Stage Id to Spark plan graphs in Web UI SQL Tab
URL: https://github.com/apache/spark/pull/26519#issuecomment-554246861
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113838/
   Test FAILed.





[GitHub] [spark] SparkQA commented on issue #26519: [SPARK-29894][SQL][WEBUI] Add Codegen Stage Id to Spark plan graphs in Web UI SQL Tab

2019-11-14 Thread GitBox
SparkQA commented on issue #26519: [SPARK-29894][SQL][WEBUI] Add Codegen Stage 
Id to Spark plan graphs in Web UI SQL Tab
URL: https://github.com/apache/spark/pull/26519#issuecomment-554246674
 
 
   **[Test build #113838 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113838/testReport)**
 for PR 26519 at commit 
[`85f8d69`](https://github.com/apache/spark/commit/85f8d693de63a28642f1f05d2b3930c93ebcbf5c).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] yaooqinn commented on a change in pull request #26491: [SPARK-29870][SQL] Unify the logic of multi-units interval string to CalendarInterval

2019-11-14 Thread GitBox
yaooqinn commented on a change in pull request #26491: [SPARK-29870][SQL] Unify 
the logic of multi-units interval string to CalendarInterval
URL: https://github.com/apache/spark/pull/26491#discussion_r346690471
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
 ##
 @@ -598,26 +531,26 @@ object IntervalUtils {
 } else if (s.matchAt(microsStr, i)) {
   microseconds = Math.addExact(microseconds, currentValue)
   i += microsStr.numBytes()
-} else return null
-  case _ => return null
+} else throwIAE(s"invalid unit '$nextWord'")
+  case _ => throwIAE(s"invalid unit '$nextWord'")
 }
   } catch {
-case _: ArithmeticException => return null
+case e: ArithmeticException => throwIAE(e.getMessage, e)
   }
   state = UNIT_SUFFIX
 case UNIT_SUFFIX =>
   b match {
 case 's' => state = UNIT_END
 case ' ' => state = TRIM_BEFORE_SIGN
-case _ => return null
+case _ => throwIAE(s"invalid unit suffix '$nextWord'")
 
 Review comment:
   or, if we change the logic of unit parsing, we can extract the whole word first, e.g. `case _ if b >= 'A' && b <= 'z' => unit = s.substring(i, s.numBytes()).subStringIndex(UTF8String.blankString(1), 1)`, then do `unit` case matching and error capture.





[GitHub] [spark] SparkQA removed a comment on issue #26519: [SPARK-29894][SQL][WEBUI] Add Codegen Stage Id to Spark plan graphs in Web UI SQL Tab

2019-11-14 Thread GitBox
SparkQA removed a comment on issue #26519: [SPARK-29894][SQL][WEBUI] Add 
Codegen Stage Id to Spark plan graphs in Web UI SQL Tab
URL: https://github.com/apache/spark/pull/26519#issuecomment-554204749
 
 
   **[Test build #113838 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113838/testReport)**
 for PR 26519 at commit 
[`85f8d69`](https://github.com/apache/spark/commit/85f8d693de63a28642f1f05d2b3930c93ebcbf5c).





[GitHub] [spark] AmplabJenkins commented on issue #26519: [SPARK-29894][SQL][WEBUI] Add Codegen Stage Id to Spark plan graphs in Web UI SQL Tab

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26519: [SPARK-29894][SQL][WEBUI] Add Codegen 
Stage Id to Spark plan graphs in Web UI SQL Tab
URL: https://github.com/apache/spark/pull/26519#issuecomment-554246855
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] edrevo commented on a change in pull request #25945: [SPARK-29248][SQL] Pass in number of partitions to WriteBuilder

2019-11-14 Thread GitBox
edrevo commented on a change in pull request #25945: [SPARK-29248][SQL] Pass in 
number of partitions to WriteBuilder
URL: https://github.com/apache/spark/pull/25945#discussion_r346689861
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/MicroBatchWrite.scala
 ##
 @@ -36,8 +36,8 @@ class MicroBatchWrite(eppchId: Long, val writeSupport: StreamingWrite) extends B
     writeSupport.abort(eppchId, messages)
   }
 
-  override def createBatchWriterFactory(): DataWriterFactory = {
-    new MicroBatchWriterFactory(eppchId, writeSupport.createStreamingWriterFactory())
+  override def createBatchWriterFactory(numPartitions: Int): DataWriterFactory = {
 
 Review comment:
   No problem! Since we still want to move forward with the interface-based approach, I've decided to evolve https://github.com/apache/spark/pull/25990/ to include both `PhysicalWriteInfo` and `WriteInfo`.
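
   A minimal sketch of the interface split being discussed, for illustration only: a logical `WriteInfo` known at planning time, and a `PhysicalWriteInfo` carrying execution-time details such as the number of partitions. The method shapes here are hypothetical, not Spark's actual API.

   ```java
   // Logical write info, available when the write is planned.
   interface WriteInfo {
       String queryId();
   }

   // Physical (i.e. RDD-level) write info, available only at execution time.
   interface PhysicalWriteInfo {
       int numPartitions();
   }

   // Trivial carrier implementation of the physical side.
   class PhysicalWriteInfoImpl implements PhysicalWriteInfo {
       private final int numPartitions;

       PhysicalWriteInfoImpl(int numPartitions) {
           this.numPartitions = numPartitions;
       }

       @Override
       public int numPartitions() {
           return numPartitions;
       }
   }

   public class WriteInfoSketch {
       public static void main(String[] args) {
           PhysicalWriteInfo info = new PhysicalWriteInfoImpl(8);
           // A DataWriterFactory could size per-partition state from this value.
           System.out.println(info.numPartitions());
       }
   }
   ```

   A data source would receive the `PhysicalWriteInfo` when its writer factory is created, rather than having `numPartitions` passed as a bare argument.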





[GitHub] [spark] beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.

2019-11-14 Thread GitBox
beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] 
Support ANSI SQL filter predicate for aggregate expression.
URL: https://github.com/apache/spark/pull/26420#discussion_r346689827
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/interfaces.scala
 ##
 @@ -130,7 +135,8 @@ case class AggregateExpression(
   @transient
   override lazy val references: AttributeSet = {
     mode match {
-      case Partial | Complete => aggregateFunction.references
+      case Partial | Complete if filter == None => aggregateFunction.references
 
 Review comment:
   Yeah, you're right.





[GitHub] [spark] edrevo commented on a change in pull request #25990: [SPARK-29248][SQL][WIP] Pass in number of partitions to WriteBuilder

2019-11-14 Thread GitBox
edrevo commented on a change in pull request #25990: [SPARK-29248][SQL][WIP] 
Pass in number of partitions to WriteBuilder
URL: https://github.com/apache/spark/pull/25990#discussion_r346689656
 
 

 ##
 File path: 
sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/PhysicalWriteInfo.java
 ##
 @@ -0,0 +1,17 @@
+package org.apache.spark.sql.connector.write;
+
+import org.apache.spark.annotation.Experimental;
+import org.apache.spark.sql.connector.write.streaming.StreamingDataWriterFactory;
+
+/**
+ * :: Experimental ::
+ * This interface contains physical (i.e. RDD) write information that data sources can use
+ * when generating a {@link DataWriterFactory} or a {@link StreamingDataWriterFactory}.
+ */
+@Experimental
 
 Review comment:
   Is this the correct annotation for the new interfaces I'm adding?





[GitHub] [spark] AmplabJenkins removed a comment on issue #26507: [SPARK-29904][SQL][2.4] Parse timestamps in microsecond precision by JSON/CSV datasources

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26507: [SPARK-29904][SQL][2.4] Parse 
timestamps in microsecond precision by JSON/CSV datasources
URL: https://github.com/apache/spark/pull/26507#issuecomment-554245237
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18726/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26507: [SPARK-29904][SQL][2.4] Parse timestamps in microsecond precision by JSON/CSV datasources

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26507: [SPARK-29904][SQL][2.4] Parse 
timestamps in microsecond precision by JSON/CSV datasources
URL: https://github.com/apache/spark/pull/26507#issuecomment-554245229
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] cloud-fan commented on a change in pull request #26537: [SPARK-29587][SQL] Support SQL Standard type real as float(4) numeric as decimal

2019-11-14 Thread GitBox
cloud-fan commented on a change in pull request #26537: [SPARK-29587][SQL] 
Support SQL Standard type real as float(4) numeric as decimal
URL: https://github.com/apache/spark/pull/26537#discussion_r346688874
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ##
 @@ -2154,17 +2154,17 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   case ("smallint" | "short", Nil) => ShortType
   case ("int" | "integer", Nil) => IntegerType
   case ("bigint" | "long", Nil) => LongType
-  case ("float", Nil) => FloatType
+  case ("float" | "real", Nil) => FloatType
 
 Review comment:
   I took a look at the SQL standard; FLOAT and REAL are different:
   ```
   FLOAT specifies the data type approximate numeric, with binary precision equal to
   or greater than the value of the specified <precision>. The maximum value of
   <precision> is implementation-defined. <precision> shall not be greater than this
   value.

   REAL specifies the data type approximate numeric, with implementation-defined precision.
   ```
   
   We can say that Spark now supports the REAL type, and that its precision is the same as
   Java's float. But this needs to be proposed explicitly, e.g. by sending an email to the
   dev list.
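
   The parser change under review can be sketched as a plain keyword-to-type mapping; this is illustrative only (the names below are hypothetical, not Spark's actual `AstBuilder` classes), showing how "real" becomes an alias of "float":

   ```java
   import java.util.Locale;

   public class TypeKeywordSketch {
       // Map a SQL type keyword to an internal type name; "real" and "float"
       // resolve to the same type, mirroring the diff above.
       static String resolve(String keyword) {
           switch (keyword.toLowerCase(Locale.ROOT)) {
               case "float":
               case "real":
                   return "FloatType";
               case "smallint":
               case "short":
                   return "ShortType";
               default:
                   throw new IllegalArgumentException("unknown type: " + keyword);
           }
       }

       public static void main(String[] args) {
           System.out.println(resolve("REAL")); // prints FloatType
       }
   }
   ```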





[GitHub] [spark] AmplabJenkins commented on issue #26507: [SPARK-29904][SQL][2.4] Parse timestamps in microsecond precision by JSON/CSV datasources

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26507: [SPARK-29904][SQL][2.4] Parse 
timestamps in microsecond precision by JSON/CSV datasources
URL: https://github.com/apache/spark/pull/26507#issuecomment-554245237
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18726/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26507: [SPARK-29904][SQL][2.4] Parse timestamps in microsecond precision by JSON/CSV datasources

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26507: [SPARK-29904][SQL][2.4] Parse 
timestamps in microsecond precision by JSON/CSV datasources
URL: https://github.com/apache/spark/pull/26507#issuecomment-554245229
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26509: [SPARK-29427][SQL] Add API to convert RelationalGroupedDataset to KeyValueGroupedDataset

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26509: [SPARK-29427][SQL] Add API to 
convert RelationalGroupedDataset to KeyValueGroupedDataset
URL: https://github.com/apache/spark/pull/26509#issuecomment-554244851
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18725/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26509: [SPARK-29427][SQL] Add API to convert RelationalGroupedDataset to KeyValueGroupedDataset

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26509: [SPARK-29427][SQL] Add API to 
convert RelationalGroupedDataset to KeyValueGroupedDataset
URL: https://github.com/apache/spark/pull/26509#issuecomment-554244844
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26509: [SPARK-29427][SQL] Add API to convert RelationalGroupedDataset to KeyValueGroupedDataset

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26509: [SPARK-29427][SQL] Add API to convert 
RelationalGroupedDataset to KeyValueGroupedDataset
URL: https://github.com/apache/spark/pull/26509#issuecomment-554244844
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26509: [SPARK-29427][SQL] Add API to convert RelationalGroupedDataset to KeyValueGroupedDataset

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26509: [SPARK-29427][SQL] Add API to convert 
RelationalGroupedDataset to KeyValueGroupedDataset
URL: https://github.com/apache/spark/pull/26509#issuecomment-554244851
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18725/
   Test PASSed.





[GitHub] [spark] SparkQA commented on issue #26507: [SPARK-29904][SQL][2.4] Parse timestamps in microsecond precision by JSON/CSV datasources

2019-11-14 Thread GitBox
SparkQA commented on issue #26507: [SPARK-29904][SQL][2.4] Parse timestamps in 
microsecond precision by JSON/CSV datasources
URL: https://github.com/apache/spark/pull/26507#issuecomment-554244491
 
 
   **[Test build #113857 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113857/testReport)**
 for PR 26507 at commit 
[`1852511`](https://github.com/apache/spark/commit/18525117051143a4306bdcfcbfdd878f40983799).





[GitHub] [spark] SparkQA commented on issue #26543: [SPARK-29911][SQL] Fix cache table leak when session closed

2019-11-14 Thread GitBox
SparkQA commented on issue #26543: [SPARK-29911][SQL] Fix cache table leak when 
session closed
URL: https://github.com/apache/spark/pull/26543#issuecomment-554244513
 
 
   **[Test build #113855 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113855/testReport)**
 for PR 26543 at commit 
[`e6bb796`](https://github.com/apache/spark/commit/e6bb79623f097b03d3edfef7a6cfebcfb06e33fd).





[GitHub] [spark] AmplabJenkins removed a comment on issue #26543: [SPARK-29911][SQL] Fix cache table leak when session closed

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26543: [SPARK-29911][SQL] Fix cache 
table leak when session closed
URL: https://github.com/apache/spark/pull/26543#issuecomment-554242752
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26543: [SPARK-29911][SQL] Fix cache table leak when session closed

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26543: [SPARK-29911][SQL] Fix cache 
table leak when session closed
URL: https://github.com/apache/spark/pull/26543#issuecomment-554242761
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18724/
   Test PASSed.





[GitHub] [spark] yaooqinn commented on a change in pull request #26491: [SPARK-29870][SQL] Unify the logic of multi-units interval string to CalendarInterval

2019-11-14 Thread GitBox
yaooqinn commented on a change in pull request #26491: [SPARK-29870][SQL] Unify 
the logic of multi-units interval string to CalendarInterval
URL: https://github.com/apache/spark/pull/26491#discussion_r346688075
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
 ##
 @@ -598,26 +531,26 @@ object IntervalUtils {
             } else if (s.matchAt(microsStr, i)) {
               microseconds = Math.addExact(microseconds, currentValue)
               i += microsStr.numBytes()
-            } else return null
-          case _ => return null
+            } else throwIAE(s"invalid unit '$nextWord'")
+          case _ => throwIAE(s"invalid unit '$nextWord'")
           }
         } catch {
-          case _: ArithmeticException => return null
+          case e: ArithmeticException => throwIAE(e.getMessage, e)
         }
         state = UNIT_SUFFIX
       case UNIT_SUFFIX =>
         b match {
           case 's' => state = UNIT_END
           case ' ' => state = TRIM_BEFORE_SIGN
-          case _ => return null
+          case _ => throwIAE(s"invalid unit suffix '$nextWord'")
 
 Review comment:
   `nextWord` indicates the character at which the error occurs. The only special case is
   nanoseconds out of range, where the fraction plus `nextWord` shows exactly the
   out-of-range number.
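
   The error-reporting idea being discussed can be sketched in a few lines: instead of returning null on a bad unit, raise an exception that names the offending word. This is illustrative only, not Spark's actual `IntervalUtils`; the unit table here is hypothetical and deliberately tiny.

   ```java
   public class IntervalUnitSketch {
       // Convert a unit word to microseconds, or fail loudly with the word
       // that caused the error, mirroring throwIAE(s"invalid unit '$nextWord'").
       static long unitToMicros(String unit) {
           switch (unit) {
               case "second":
               case "seconds":
                   return 1_000_000L;
               case "minute":
               case "minutes":
                   return 60_000_000L;
               default:
                   throw new IllegalArgumentException("invalid unit '" + unit + "'");
           }
       }

       public static void main(String[] args) {
           System.out.println(unitToMicros("minutes")); // prints 60000000
           try {
               unitToMicros("parsecs");
           } catch (IllegalArgumentException e) {
               System.out.println(e.getMessage()); // prints: invalid unit 'parsecs'
           }
       }
   }
   ```

   Surfacing the offending token in the message is what makes the new behavior more debuggable than silently returning null.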





[GitHub] [spark] SparkQA commented on issue #26509: [SPARK-29427][SQL] Add API to convert RelationalGroupedDataset to KeyValueGroupedDataset

2019-11-14 Thread GitBox
SparkQA commented on issue #26509: [SPARK-29427][SQL] Add API to convert 
RelationalGroupedDataset to KeyValueGroupedDataset
URL: https://github.com/apache/spark/pull/26509#issuecomment-554244415
 
 
   **[Test build #113856 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113856/testReport)**
 for PR 26509 at commit 
[`576558d`](https://github.com/apache/spark/commit/576558d02c8994ee408be48615f777018c809957).





[GitHub] [spark] beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.

2019-11-14 Thread GitBox
beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] 
Support ANSI SQL filter predicate for aggregate expression.
URL: https://github.com/apache/spark/pull/26420#discussion_r346687485
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ##
 @@ -1084,6 +1084,11 @@ class Analyzer(
           result
         case UnresolvedExtractValue(child, fieldExpr) if child.resolved =>
           ExtractValue(child, fieldExpr, resolver)
+        case f @ UnresolvedFunction(_, children, _, filter) if filter.isDefined =>
+          val newChildren = children.map(resolveExpressionTopDown(_, q))
+          val newFilter = filter.map { expr => expr.mapChildren(resolveExpressionTopDown(_, q)) }
 
 Review comment:
   Due to the above modifications, this problem does not exist.





[GitHub] [spark] SparkQA commented on issue #26529: [SPARK-29902][Doc][Minor]Add listener event queue capacity configuration to documentation

2019-11-14 Thread GitBox
SparkQA commented on issue #26529: [SPARK-29902][Doc][Minor]Add listener event 
queue capacity configuration to documentation
URL: https://github.com/apache/spark/pull/26529#issuecomment-554243400
 
 
   **[Test build #113853 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113853/testReport)**
 for PR 26529 at commit 
[`76180fa`](https://github.com/apache/spark/commit/76180fac61fb8f4c4fd4e97dfa5382df2df60dad).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins commented on issue #26529: [SPARK-29902][Doc][Minor]Add listener event queue capacity configuration to documentation

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26529: [SPARK-29902][Doc][Minor]Add listener 
event queue capacity configuration to documentation
URL: https://github.com/apache/spark/pull/26529#issuecomment-554243485
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113853/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26529: [SPARK-29902][Doc][Minor]Add listener event queue capacity configuration to documentation

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26529: [SPARK-29902][Doc][Minor]Add 
listener event queue capacity configuration to documentation
URL: https://github.com/apache/spark/pull/26529#issuecomment-554243485
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113853/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26533: [WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26533: 
[WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11
URL: https://github.com/apache/spark/pull/26533#issuecomment-554243174
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113847/
   Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #26529: [SPARK-29902][Doc][Minor]Add listener event queue capacity configuration to documentation

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26529: [SPARK-29902][Doc][Minor]Add listener 
event queue capacity configuration to documentation
URL: https://github.com/apache/spark/pull/26529#issuecomment-554243481
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #26529: [SPARK-29902][Doc][Minor]Add listener event queue capacity configuration to documentation

2019-11-14 Thread GitBox
SparkQA removed a comment on issue #26529: [SPARK-29902][Doc][Minor]Add 
listener event queue capacity configuration to documentation
URL: https://github.com/apache/spark/pull/26529#issuecomment-554240295
 
 
   **[Test build #113853 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113853/testReport)**
 for PR 26529 at commit 
[`76180fa`](https://github.com/apache/spark/commit/76180fac61fb8f4c4fd4e97dfa5382df2df60dad).





[GitHub] [spark] AmplabJenkins removed a comment on issue #26529: [SPARK-29902][Doc][Minor]Add listener event queue capacity configuration to documentation

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26529: [SPARK-29902][Doc][Minor]Add 
listener event queue capacity configuration to documentation
URL: https://github.com/apache/spark/pull/26529#issuecomment-554243481
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26533: [WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26533: [WIP][test-hadoop3.2][test-java11] 
Test Hadoop 2.7 with JDK 11
URL: https://github.com/apache/spark/pull/26533#issuecomment-554243167
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #26533: [WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26533: [WIP][test-hadoop3.2][test-java11] 
Test Hadoop 2.7 with JDK 11
URL: https://github.com/apache/spark/pull/26533#issuecomment-554243174
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113847/
   Test FAILed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26533: [WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26533: 
[WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11
URL: https://github.com/apache/spark/pull/26533#issuecomment-554243167
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] SparkQA commented on issue #26533: [WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11

2019-11-14 Thread GitBox
SparkQA commented on issue #26533: [WIP][test-hadoop3.2][test-java11] Test 
Hadoop 2.7 with JDK 11
URL: https://github.com/apache/spark/pull/26533#issuecomment-554242942
 
 
   **[Test build #113847 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113847/testReport)**
 for PR 26533 at commit 
[`275390f`](https://github.com/apache/spark/commit/275390f6c6d57346cd359630fd4d4e022f0a3fa0).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA removed a comment on issue #26533: [WIP][test-hadoop3.2][test-java11] Test Hadoop 2.7 with JDK 11

2019-11-14 Thread GitBox
SparkQA removed a comment on issue #26533: [WIP][test-hadoop3.2][test-java11] 
Test Hadoop 2.7 with JDK 11
URL: https://github.com/apache/spark/pull/26533#issuecomment-554216800
 
 
   **[Test build #113847 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113847/testReport)**
 for PR 26533 at commit 
[`275390f`](https://github.com/apache/spark/commit/275390f6c6d57346cd359630fd4d4e022f0a3fa0).





[GitHub] [spark] yaooqinn commented on a change in pull request #26491: [SPARK-29870][SQL] Unify the logic of multi-units interval string to CalendarInterval

2019-11-14 Thread GitBox
yaooqinn commented on a change in pull request #26491: [SPARK-29870][SQL] Unify 
the logic of multi-units interval string to CalendarInterval
URL: https://github.com/apache/spark/pull/26491#discussion_r346686779
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
 ##
 @@ -547,14 +478,16 @@ object IntervalUtils {
 case ' ' =>
   fraction /= NANOS_PER_MICROS.toInt
   state = TRIM_BEFORE_UNIT
-case _ => return null
+case _ if '0' <= b && b <= '9' =>
+  throwIAE(s"invalid value fractional part '$fraction$nextWord' 
out of range")
 
 Review comment:
   
https://github.com/apache/spark/pull/26491/files#diff-105a430951a95bd0c9c899d2a24e3badR107-R115
 you can check some test case here.





[GitHub] [spark] AmplabJenkins commented on issue #26543: [SPARK-29911][SQL] Fix cache table leak when session closed

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26543: [SPARK-29911][SQL] Fix cache table 
leak when session closed
URL: https://github.com/apache/spark/pull/26543#issuecomment-554242752
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26543: [SPARK-29911][SQL] Fix cache table leak when session closed

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26543: [SPARK-29911][SQL] Fix cache table 
leak when session closed
URL: https://github.com/apache/spark/pull/26543#issuecomment-554242761
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18724/
   Test PASSed.





[GitHub] [spark] LantaoJin opened a new pull request #26543: [SPARK-29911][SQL] Fix cache table leak when session closed

2019-11-14 Thread GitBox
LantaoJin opened a new pull request #26543: [SPARK-29911][SQL] Fix cache table 
leak when session closed
URL: https://github.com/apache/spark/pull/26543
 
 
   ### What changes were proposed in this pull request?
   How to reproduce:
   1. create a local temporary view v1
   2. cache it in memory
   3. close the session without dropping v1.
   The application will hold the memory forever; in a long-running Thrift 
server scenario it's even worse.
   ```shell
   0: jdbc:hive2://localhost:1> CACHE TABLE testCacheTable AS SELECT 1;
   CACHE TABLE testCacheTable AS SELECT 1;
   +-+--+
   | Result  |
   +-+--+
   +-+--+
   No rows selected (1.498 seconds)
   0: jdbc:hive2://localhost:1> !close
   !close
   Closing: 0: jdbc:hive2://localhost:1
   0: jdbc:hive2://localhost:1 (closed)> !connect 
'jdbc:hive2://localhost:1'
   !connect 'jdbc:hive2://localhost:1'
   Connecting to jdbc:hive2://localhost:1
   Enter username for jdbc:hive2://localhost:1:
   lajin
   Enter password for jdbc:hive2://localhost:1:
   ***
   Connected to: Spark SQL (version 3.0.0-SNAPSHOT)
   Driver: Hive JDBC (version 1.2.1.spark2)
   Transaction isolation: TRANSACTION_REPEATABLE_READ
   1: jdbc:hive2://localhost:1> select * from testCacheTable;
   select * from testCacheTable;
   Error: Error running query: org.apache.spark.sql.AnalysisException: Table or 
view not found: testCacheTable; line 1 pos 14;
   'Project [*]
   +- 'UnresolvedRelation [testCacheTable] (state=,code=0)
   ```
   https://user-images.githubusercontent.com/1853780/68923527-7ca8c180-07b9-11ea-9cc7-74f276c46840.png
   
   
   
   ### Why are the changes needed?
   Resolve memory leak for thrift server
   
   
   ### Does this PR introduce any user-facing change?
   No
   
   
   ### How was this patch tested?
   Manual test in UI storage tab
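
   As an aside for readers skimming the archive, the leak mechanism described in this PR can be sketched in a few lines of Python. This is purely illustrative: the class and dict names below are stand-ins, not Spark's actual `SessionCatalog`/`CacheManager` API; the point is only that temp views are session-scoped while cached plans live in shared, application-wide state.

   ```python
   # Toy model: per-session temp-view catalogs vs. an application-wide cache.
   class Session:
       def __init__(self):
           self.temp_views = {}      # view name -> logical plan (stand-in)

   shared_cache = {}                 # logical plan -> cached relation (stand-in)

   s1 = Session()
   s1.temp_views["testCacheTable"] = "SELECT 1"
   shared_cache["SELECT 1"] = "InMemoryRelation"

   # Closing the session drops its catalog, but nothing evicts the shared
   # cache entry, so the cached relation lives for the application's lifetime.
   s1.temp_views.clear()
   ```

   In this toy model, `shared_cache` still holds the entry after `s1.temp_views.clear()`, which is exactly the state the PR's screenshot shows in the UI storage tab.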
   





[GitHub] [spark] yaooqinn commented on a change in pull request #26491: [SPARK-29870][SQL] Unify the logic of multi-units interval string to CalendarInterval

2019-11-14 Thread GitBox
yaooqinn commented on a change in pull request #26491: [SPARK-29870][SQL] Unify 
the logic of multi-units interval string to CalendarInterval
URL: https://github.com/apache/spark/pull/26491#discussion_r346686470
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
 ##
 @@ -547,14 +478,16 @@ object IntervalUtils {
 case ' ' =>
   fraction /= NANOS_PER_MICROS.toInt
   state = TRIM_BEFORE_UNIT
-case _ => return null
+case _ if '0' <= b && b <= '9' =>
+  throwIAE(s"invalid value fractional part '$fraction$nextWord' 
out of range")
 
 Review comment:
   > how about `interval can only support nanosecond precision, '$nextWord' is 
out of range`
   
   That is not suitable: for `0.99`, `nextWord` would be `9` only, whereas 
'$fraction$nextWord' is the exact `99`
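
   For readers following this thread, the fractional-part rule being debated can be sketched in Python. This is a hedged sketch of the idea, not Spark's actual `IntervalUtils` code: the fractional seconds of an interval are accumulated digit by digit into nanoseconds, and a tenth fractional digit exceeds nanosecond precision, so it is rejected rather than silently truncated.

   ```python
   NANOS_PER_SECOND = 1_000_000_000

   def parse_fraction(frac_digits: str) -> int:
       """Parse the fractional-seconds digits of an interval into nanoseconds.

       A 10th fractional digit would exceed nanosecond precision, so it is
       rejected instead of being silently dropped.
       """
       fraction = 0
       scale = NANOS_PER_SECOND // 10  # weight of the first fractional digit
       for c in frac_digits:
           if not c.isdigit():
               raise ValueError(f"invalid value '{c}' in fractional part")
           if scale == 0:
               raise ValueError(
                   f"interval can only support nanosecond precision, "
                   f"'{frac_digits}' is out of range")
           fraction += (ord(c) - ord('0')) * scale
           scale //= 10
       return fraction
   ```

   Under this sketch, `parse_fraction("999999999")` yields the maximal 999999999 nanoseconds, while a tenth digit raises the out-of-range error discussed above.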





[GitHub] [spark] AmplabJenkins commented on issue #26516: [SPARK-29893] improve the local shuffle reader performance by changing the reading task number from 1 to multi.

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26516: [SPARK-29893] improve the local 
shuffle reader performance by changing the reading task number from 1 to multi.
URL: https://github.com/apache/spark/pull/26516#issuecomment-554242211
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26516: [SPARK-29893] improve the local shuffle reader performance by changing the reading task number from 1 to multi.

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26516: [SPARK-29893] improve the 
local shuffle reader performance by changing the reading task number from 1 to 
multi.
URL: https://github.com/apache/spark/pull/26516#issuecomment-554242215
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113842/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26516: [SPARK-29893] improve the local shuffle reader performance by changing the reading task number from 1 to multi.

2019-11-14 Thread GitBox
AmplabJenkins commented on issue #26516: [SPARK-29893] improve the local 
shuffle reader performance by changing the reading task number from 1 to multi.
URL: https://github.com/apache/spark/pull/26516#issuecomment-554242215
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113842/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26516: [SPARK-29893] improve the local shuffle reader performance by changing the reading task number from 1 to multi.

2019-11-14 Thread GitBox
AmplabJenkins removed a comment on issue #26516: [SPARK-29893] improve the 
local shuffle reader performance by changing the reading task number from 1 to 
multi.
URL: https://github.com/apache/spark/pull/26516#issuecomment-554242211
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #26516: [SPARK-29893] improve the local shuffle reader performance by changing the reading task number from 1 to multi.

2019-11-14 Thread GitBox
SparkQA removed a comment on issue #26516: [SPARK-29893] improve the local 
shuffle reader performance by changing the reading task number from 1 to multi.
URL: https://github.com/apache/spark/pull/26516#issuecomment-554207880
 
 
   **[Test build #113842 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113842/testReport)**
 for PR 26516 at commit 
[`beca952`](https://github.com/apache/spark/commit/beca952a3a977814203936de99e059ecbef2bafa).





[GitHub] [spark] yaooqinn commented on a change in pull request #26491: [SPARK-29870][SQL] Unify the logic of multi-units interval string to CalendarInterval

2019-11-14 Thread GitBox
yaooqinn commented on a change in pull request #26491: [SPARK-29870][SQL] Unify 
the logic of multi-units interval string to CalendarInterval
URL: https://github.com/apache/spark/pull/26491#discussion_r346685776
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
 ##
 @@ -518,10 +449,10 @@ object IntervalUtils {
   isNegative = false
 case '.' =>
   isNegative = false
-  fractionScale = (NANOS_PER_SECOND / 10).toInt
   i += 1
+  fractionScale = (NANOS_PER_SECOND / 10).toInt
   state = VALUE_FRACTIONAL_PART
-case _ => return null
+case _ => throwIAE( s"unrecognized sign '$nextWord'")
 
 Review comment:
   number seems ok to me
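
   To make the sign-state discussion concrete, here is a minimal Python sketch of the rule (illustrative only, not Spark's actual state machine): at the start of a `<sign><value> <unit>` group, only `-`, `+`, a digit, or a leading `.` are accepted; anything else, such as `~`, triggers the "unrecognized sign" error.

   ```python
   def classify_sign(ch: str) -> str:
       """Classify the first character of a '<sign><value> <unit>' group.

       '-' and '+' set the sign explicitly; a digit or a leading '.'
       (fractional value) implies positive; anything else is rejected.
       """
       if ch == '-':
           return 'negative'
       if ch == '+' or ch.isdigit():
           return 'positive'
       if ch == '.':
           return 'positive-fractional'
       raise ValueError(f"unrecognized sign '{ch}'")
   ```

   With this sketch, `classify_sign('~')` raises the same "unrecognized sign" message that the `checkFromInvalidString("~1 hour", ...)` test case above expects.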





[GitHub] [spark] SparkQA commented on issue #26516: [SPARK-29893] improve the local shuffle reader performance by changing the reading task number from 1 to multi.

2019-11-14 Thread GitBox
SparkQA commented on issue #26516: [SPARK-29893] improve the local shuffle 
reader performance by changing the reading task number from 1 to multi.
URL: https://github.com/apache/spark/pull/26516#issuecomment-554241724
 
 
   **[Test build #113842 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113842/testReport)**
 for PR 26516 at commit 
[`beca952`](https://github.com/apache/spark/commit/beca952a3a977814203936de99e059ecbef2bafa).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] yaooqinn commented on a change in pull request #26491: [SPARK-29870][SQL] Unify the logic of multi-units interval string to CalendarInterval

2019-11-14 Thread GitBox
yaooqinn commented on a change in pull request #26491: [SPARK-29870][SQL] Unify 
the logic of multi-units interval string to CalendarInterval
URL: https://github.com/apache/spark/pull/26491#discussion_r346685433
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
 ##
 @@ -518,10 +449,10 @@ object IntervalUtils {
   isNegative = false
 case '.' =>
   isNegative = false
-  fractionScale = (NANOS_PER_SECOND / 10).toInt
   i += 1
+  fractionScale = (NANOS_PER_SECOND / 10).toInt
   state = VALUE_FRACTIONAL_PART
-case _ => return null
+case _ => throwIAE( s"unrecognized sign '$nextWord'")
 
 Review comment:
   checkFromInvalidString("~1 hour", "unrecognized sign")





[GitHub] [spark] beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.

2019-11-14 Thread GitBox
beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] 
Support ANSI SQL filter predicate for aggregate expression.
URL: https://github.com/apache/spark/pull/26420#discussion_r346685495
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ##
 @@ -1084,6 +1084,11 @@ class Analyzer(
   result
 case UnresolvedExtractValue(child, fieldExpr) if child.resolved =>
   ExtractValue(child, fieldExpr, resolver)
+case f @ UnresolvedFunction(_, children, _, filter) if 
filter.isDefined =>
 
 Review comment:
   OK. I will try like this.





[GitHub] [spark] cloud-fan commented on a change in pull request #26491: [SPARK-29870][SQL] Unify the logic of multi-units interval string to CalendarInterval

2019-11-14 Thread GitBox
cloud-fan commented on a change in pull request #26491: [SPARK-29870][SQL] 
Unify the logic of multi-units interval string to CalendarInterval
URL: https://github.com/apache/spark/pull/26491#discussion_r346685302
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
 ##
 @@ -598,26 +531,26 @@ object IntervalUtils {
 } else if (s.matchAt(microsStr, i)) {
   microseconds = Math.addExact(microseconds, currentValue)
   i += microsStr.numBytes()
-} else return null
-  case _ => return null
+} else throwIAE(s"invalid unit '$nextWord'")
+  case _ => throwIAE(s"invalid unit '$nextWord'")
 }
   } catch {
-case _: ArithmeticException => return null
+case e: ArithmeticException => throwIAE(e.getMessage, e)
   }
   state = UNIT_SUFFIX
 case UNIT_SUFFIX =>
   b match {
 case 's' => state = UNIT_END
 case ' ' => state = TRIM_BEFORE_SIGN
-case _ => return null
+case _ => throwIAE(s"invalid unit suffix '$nextWord'")
 
 Review comment:
   `invalid unit '$nextWord'` is better if we can implement `nextWord` 
correctly. Otherwise we should introduce `currentWord`.





[GitHub] [spark] cloud-fan commented on a change in pull request #26491: [SPARK-29870][SQL] Unify the logic of multi-units interval string to CalendarInterval

2019-11-14 Thread GitBox
cloud-fan commented on a change in pull request #26491: [SPARK-29870][SQL] 
Unify the logic of multi-units interval string to CalendarInterval
URL: https://github.com/apache/spark/pull/26491#discussion_r346685009
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
 ##
 @@ -547,14 +478,16 @@ object IntervalUtils {
 case ' ' =>
   fraction /= NANOS_PER_MICROS.toInt
   state = TRIM_BEFORE_UNIT
-case _ => return null
+case _ if '0' <= b && b <= '9' =>
+  throwIAE(s"invalid value fractional part '$fraction$nextWord' 
out of range")
+case _ => throwIAE(s"invalid value '$nextWord' in fractional part")
   }
   i += 1
 case TRIM_BEFORE_UNIT => trimToNextState(b, UNIT_BEGIN)
 case UNIT_BEGIN =>
   // Checks that only seconds can have the fractional part
   if (b != 's' && fractionScale >= 0) {
-return null
+throwIAE(s"'$nextWord' with fractional part is unsupported")
 
 Review comment:
   nit: `'$nextWord' cannot have fractional part`
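
   The constraint under review — only the seconds unit may carry a fractional value — can be sketched in Python as follows. This is an illustrative stand-in for the `b != 's' && fractionScale >= 0` check in the diff, with hypothetical names, not Spark's actual code.

   ```python
   def check_fraction_allowed(unit: str, has_fraction: bool) -> None:
       """Reject a fractional value attached to any unit other than seconds."""
       if has_fraction and unit not in ("second", "seconds"):
           raise ValueError(f"'{unit}' cannot have fractional part")
   ```

   So `check_fraction_allowed("seconds", True)` passes, while `check_fraction_allowed("minutes", True)` raises the error whose wording is being bikeshedded above.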




