[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22617 **[Test build #97438 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97438/testReport)** for PR 22617 at commit [`1c30755`](https://github.com/apache/spark/commit/1c307553d5aa64c2365b14084bb79569f6c14be1). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22617: [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsa...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22617 Retest this please.
[GitHub] spark issue #22677: [SPARK-25683][Core] Updated the log for the firstTime ev...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22677 **[Test build #97437 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97437/testReport)** for PR 22677 at commit [`7105cce`](https://github.com/apache/spark/commit/7105cce2657e81ed118c4713b517c89abaa7a9f2).
[GitHub] spark pull request #22597: [SPARK-25579][SQL] Use quoted attribute names if ...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22597#discussion_r225404390

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcTest.scala ---
@@ -106,4 +106,14 @@ abstract class OrcTest extends QueryTest with SQLTestUtils with BeforeAndAfterAl
       df: DataFrame, path: File): Unit = {
     df.write.mode(SaveMode.Overwrite).orc(path.getCanonicalPath)
   }
+
+  protected def checkPredicatePushDown(df: DataFrame, numRows: Int, predicate: String): Unit = {
--- End diff --

@HyukjinKwon . I refactored this since it's repeated three times now. And this function should be here because the two existing instances are in `OrcQueryTest` and the new one is in `OrcQuerySuite`.
[GitHub] spark issue #22677: [SPARK-25683][Core] Updated the log for the firstTime ev...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22677 **[Test build #97435 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97435/testReport)** for PR 22677 at commit [`4b36af3`](https://github.com/apache/spark/commit/4b36af375e28154e0f978416bd4ffe9abeda2bda).
[GitHub] spark issue #22597: [SPARK-25579][SQL] Use quoted attribute names if needed ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22597 Merged build finished. Test PASSed.
[GitHub] spark issue #22597: [SPARK-25579][SQL] Use quoted attribute names if needed ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22597 **[Test build #97436 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97436/testReport)** for PR 22597 at commit [`7686179`](https://github.com/apache/spark/commit/7686179678b86369e3b62fb2ae8ae9a4384e5c14).
[GitHub] spark issue #22597: [SPARK-25579][SQL] Use quoted attribute names if needed ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22597 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4023/ Test PASSed.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Merged build finished. Test FAILed.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22724

**[Test build #97433 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97433/testReport)** for PR 22724 at commit [`1c042f3`](https://github.com/apache/spark/commit/1c042f3018e5fd5c1f14324b3d20b741d202e643).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97433/ Test FAILed.
[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22728 Merged build finished. Test PASSed.
[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22728 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97420/ Test PASSed.
[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22728

**[Test build #97420 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97420/testReport)** for PR 22728 at commit [`e3aaa90`](https://github.com/apache/spark/commit/e3aaa90bd3b5f12a892f11be67ac26326c3b18ce).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22724 I'm looking into the cause of the test failures...
[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22608 Merged build finished. Test PASSed.
[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22608 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97418/ Test PASSed.
[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22608

**[Test build #97418 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97418/testReport)** for PR 22608 at commit [`cccf027`](https://github.com/apache/spark/commit/cccf0275cc58b464aba544742d7300ba4939f5a6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22724

**[Test build #97432 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97432/testReport)** for PR 22724 at commit [`4df5abd`](https://github.com/apache/spark/commit/4df5abd23203e4dc3499001e8dd5c589fa03c535).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97432/ Test FAILed.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Merged build finished. Test FAILed.
[GitHub] spark issue #22466: [SPARK-25464][SQL] Create Database to the location,only ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22466 **[Test build #97434 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97434/testReport)** for PR 22466 at commit [`d290998`](https://github.com/apache/spark/commit/d2909983c26586f078933ef499388ff4189f037c).
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Merged build finished. Test PASSed.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4022/ Test PASSed.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22724 LGTM pending jenkins
[GitHub] spark pull request #22597: [SPARK-25579][SQL] Use quoted attribute names if ...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22597#discussion_r225398146

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala ---
@@ -383,4 +385,17 @@ class OrcFilterSuite extends OrcTest with SharedSQLContext {
       )).get.toString
     }
   }
+
+  test("SPARK-25579 ORC PPD should support column names with dot") {
+    import testImplicits._
+
+    withSQLConf(SQLConf.ORC_FILTER_PUSHDOWN_ENABLED.key -> "true") {
+      withTempDir { dir =>
+        val path = new File(dir, "orc").getCanonicalPath
+        Seq((1, 2), (3, 4)).toDF("col.dot.1", "col.dot.2").write.orc(path)
--- End diff --

Sure. Thank you for confirmation, @cloud-fan and @HyukjinKwon .
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22724 **[Test build #97433 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97433/testReport)** for PR 22724 at commit [`1c042f3`](https://github.com/apache/spark/commit/1c042f3018e5fd5c1f14324b3d20b741d202e643).
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22724

**[Test build #97431 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97431/testReport)** for PR 22724 at commit [`56b4778`](https://github.com/apache/spark/commit/56b4778d3ff3467c53ac1a9e4370dfffcb12a4de).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97431/ Test FAILed.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Merged build finished. Test FAILed.
[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225397940

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ---
@@ -196,6 +197,48 @@ object Literal {
     case other => throw new RuntimeException(s"no default for type $dataType")
   }
+
+  private[expressions] def validateLiteralValue(value: Any, dataType: DataType): Unit = {
+    def doValidate(v: Any, dataType: DataType): Boolean = dataType match {
+      case BooleanType => v.isInstanceOf[Boolean]
+      case ByteType => v.isInstanceOf[Byte]
+      case ShortType => v.isInstanceOf[Short]
+      case IntegerType | DateType => v.isInstanceOf[Int]
+      case LongType | TimestampType => v.isInstanceOf[Long]
+      case FloatType => v.isInstanceOf[Float]
+      case DoubleType => v.isInstanceOf[Double]
+      case _: DecimalType => v.isInstanceOf[Decimal]
+      case CalendarIntervalType => v.isInstanceOf[CalendarInterval]
+      case BinaryType => v.isInstanceOf[Array[Byte]]
+      case StringType => v.isInstanceOf[UTF8String]
+      case st: StructType =>
+        v.isInstanceOf[InternalRow] && {
+          val row = v.asInstanceOf[InternalRow]
+          st.fields.map(_.dataType).zipWithIndex.forall {
+            case (dt, i) => doValidate(row.get(i, dt), dt)
+          }
+        }
+      case at: ArrayType =>
+        v.isInstanceOf[GenericArrayData] && {
+          val ar = v.asInstanceOf[GenericArrayData].array
+          ar.isEmpty || doValidate(ar.head, at.elementType)
+        }
+      case mt: MapType =>
+        v.isInstanceOf[ArrayBasedMapData] && {
+          val map = v.asInstanceOf[ArrayBasedMapData]
+          map.numElements() == 0 || {
--- End diff --

updated.
[GitHub] spark pull request #22597: [SPARK-25579][SQL] Use quoted attribute names if ...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22597#discussion_r225397862

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala ---
@@ -383,4 +385,17 @@ class OrcFilterSuite extends OrcTest with SharedSQLContext {
       )).get.toString
     }
   }
+
+  test("SPARK-25579 ORC PPD should support column names with dot") {
+    import testImplicits._
+
+    withSQLConf(SQLConf.ORC_FILTER_PUSHDOWN_ENABLED.key -> "true") {
+      withTempDir { dir =>
+        val path = new File(dir, "orc").getCanonicalPath
+        Seq((1, 2), (3, 4)).toDF("col.dot.1", "col.dot.2").write.orc(path)
+        val df = spark.read.orc(path).where("`col.dot.1` = 1 and `col.dot.2` = 2")
+        checkAnswer(stripSparkFilter(df), Row(1, 2))
--- End diff --

Thank you for the review, @dbtsai ! I left out PPD for nested columns here because Spark hasn't pushed those down in 2.4 or earlier. With your PR (#22573), Spark 3.0 will support that, and we can update this to handle those cases, too. @cloud-fan . Actually, ORC 1.5.0 started to support PPD with nested columns ([ORC-323](https://issues.apache.org/jira/browse/ORC-323)), so @dbtsai and I discussed supporting that before. We are going to support ORC PPD with nested columns in Spark 3.0 without regression.
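[Editorial note] The fix under discussion (SPARK-25579) comes down to quoting an attribute name before embedding it in a predicate string, so a dotted column name like `col.dot.1` is read as one top-level column rather than a nested field path. A minimal, self-contained sketch of that idea — `AttrQuoting` and its methods are hypothetical illustrations, not Spark's actual implementation:

```scala
// Hypothetical sketch of the quoting idea behind SPARK-25579: wrap an
// attribute name in backticks when it contains a dot, so a predicate built
// from it refers to the top-level column "col.dot.1" rather than to the
// nested field dot.1 of a column named col.
object AttrQuoting {
  def quoteIfNeeded(name: String): String =
    if (name.contains(".")) s"`$name`" else name

  // Build a simple equality predicate string using the quoted name.
  def equalsPredicate(name: String, value: Int): String =
    s"${quoteIfNeeded(name)} = $value"
}
```

For example, `AttrQuoting.equalsPredicate("col.dot.1", 1)` produces `` `col.dot.1` = 1 ``, matching the backticked predicate used in the test above, while a name without dots is left unquoted.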
[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225397788

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ---
@@ -196,6 +197,48 @@ object Literal {
     case other => throw new RuntimeException(s"no default for type $dataType")
   }
+
+  private[expressions] def validateLiteralValue(value: Any, dataType: DataType): Unit = {
+    def doValidate(v: Any, dataType: DataType): Boolean = dataType match {
+      case BooleanType => v.isInstanceOf[Boolean]
+      case ByteType => v.isInstanceOf[Byte]
+      case ShortType => v.isInstanceOf[Short]
+      case IntegerType | DateType => v.isInstanceOf[Int]
+      case LongType | TimestampType => v.isInstanceOf[Long]
+      case FloatType => v.isInstanceOf[Float]
+      case DoubleType => v.isInstanceOf[Double]
+      case _: DecimalType => v.isInstanceOf[Decimal]
+      case CalendarIntervalType => v.isInstanceOf[CalendarInterval]
+      case BinaryType => v.isInstanceOf[Array[Byte]]
+      case StringType => v.isInstanceOf[UTF8String]
+      case st: StructType =>
+        v.isInstanceOf[InternalRow] && {
+          val row = v.asInstanceOf[InternalRow]
+          st.fields.map(_.dataType).zipWithIndex.forall {
+            case (dt, i) => doValidate(row.get(i, dt), dt)
+          }
+        }
+      case at: ArrayType =>
+        v.isInstanceOf[GenericArrayData] && {
+          val ar = v.asInstanceOf[GenericArrayData].array
+          ar.isEmpty || doValidate(ar.head, at.elementType)
+        }
+      case mt: MapType =>
+        v.isInstanceOf[ArrayBasedMapData] && {
+          val map = v.asInstanceOf[ArrayBasedMapData]
+          map.numElements() == 0 || {
--- End diff --

ok
[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...
Github user sandeep-katta commented on a diff in the pull request: https://github.com/apache/spark/pull/22466#discussion_r225396925

--- Diff: python/pyspark/sql/tests.py ---
@@ -2993,6 +2990,7 @@ def test_current_database(self):
             AnalysisException,
             "does_not_exist",
             lambda: spark.catalog.setCurrentDatabase("does_not_exist"))
+        spark.sql("DROP DATABASE some_db")
--- End diff --

Create and drop should be part of the test case; if there is any exception, the test case will fail. So there is no need to put it in the finally block.
[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225396907

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ---
@@ -196,6 +197,48 @@ object Literal {
     case other => throw new RuntimeException(s"no default for type $dataType")
   }
+
+  private[expressions] def validateLiteralValue(value: Any, dataType: DataType): Unit = {
+    def doValidate(v: Any, dataType: DataType): Boolean = dataType match {
+      case BooleanType => v.isInstanceOf[Boolean]
+      case ByteType => v.isInstanceOf[Byte]
+      case ShortType => v.isInstanceOf[Short]
+      case IntegerType | DateType => v.isInstanceOf[Int]
+      case LongType | TimestampType => v.isInstanceOf[Long]
+      case FloatType => v.isInstanceOf[Float]
+      case DoubleType => v.isInstanceOf[Double]
+      case _: DecimalType => v.isInstanceOf[Decimal]
+      case CalendarIntervalType => v.isInstanceOf[CalendarInterval]
+      case BinaryType => v.isInstanceOf[Array[Byte]]
+      case StringType => v.isInstanceOf[UTF8String]
+      case st: StructType =>
+        v.isInstanceOf[InternalRow] && {
+          val row = v.asInstanceOf[InternalRow]
+          st.fields.map(_.dataType).zipWithIndex.forall {
+            case (dt, i) => doValidate(row.get(i, dt), dt)
+          }
+        }
+      case at: ArrayType =>
+        v.isInstanceOf[GenericArrayData] && {
+          val ar = v.asInstanceOf[GenericArrayData].array
+          ar.isEmpty || doValidate(ar.head, at.elementType)
+        }
+      case mt: MapType =>
+        v.isInstanceOf[ArrayBasedMapData] && {
+          val map = v.asInstanceOf[ArrayBasedMapData]
+          map.numElements() == 0 || {
--- End diff --

yup
[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22379 LGTM
[GitHub] spark issue #22727: [SPARK-25735][CORE][MINOR]Improve start-thriftserver.sh:...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22727 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97430/ Test PASSed.
[GitHub] spark issue #22727: [SPARK-25735][CORE][MINOR]Improve start-thriftserver.sh:...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22727 Merged build finished. Test PASSed.
[GitHub] spark issue #22727: [SPARK-25735][CORE][MINOR]Improve start-thriftserver.sh:...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22727

**[Test build #97430 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97430/testReport)** for PR 22727 at commit [`8b2500f`](https://github.com/apache/spark/commit/8b2500fa97ab258785160df25d3a0f6afdd74a10).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #21929: [SPARK-24970][Kinesis] Create WriteAheadLogBackedBlockRD...
Github user brucezhao11 commented on the issue: https://github.com/apache/spark/pull/21929 Hi, @brkyvz , @srowen , could you please review this PR?
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22724 **[Test build #97432 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97432/testReport)** for PR 22724 at commit [`4df5abd`](https://github.com/apache/spark/commit/4df5abd23203e4dc3499001e8dd5c589fa03c535).
[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225395710

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ---
@@ -196,6 +197,48 @@ object Literal {
     case other => throw new RuntimeException(s"no default for type $dataType")
   }
+
+  private[expressions] def validateLiteralValue(value: Any, dataType: DataType): Unit = {
+    def doValidate(v: Any, dataType: DataType): Boolean = dataType match {
+      case BooleanType => v.isInstanceOf[Boolean]
+      case ByteType => v.isInstanceOf[Byte]
+      case ShortType => v.isInstanceOf[Short]
+      case IntegerType | DateType => v.isInstanceOf[Int]
+      case LongType | TimestampType => v.isInstanceOf[Long]
+      case FloatType => v.isInstanceOf[Float]
+      case DoubleType => v.isInstanceOf[Double]
+      case _: DecimalType => v.isInstanceOf[Decimal]
+      case CalendarIntervalType => v.isInstanceOf[CalendarInterval]
+      case BinaryType => v.isInstanceOf[Array[Byte]]
+      case StringType => v.isInstanceOf[UTF8String]
+      case st: StructType =>
+        v.isInstanceOf[InternalRow] && {
+          val row = v.asInstanceOf[InternalRow]
+          st.fields.map(_.dataType).zipWithIndex.forall {
+            case (dt, i) => doValidate(row.get(i, dt), dt)
+          }
+        }
+      case at: ArrayType =>
+        v.isInstanceOf[GenericArrayData] && {
+          val ar = v.asInstanceOf[GenericArrayData].array
+          ar.isEmpty || doValidate(ar.head, at.elementType)
+        }
+      case mt: MapType =>
+        v.isInstanceOf[ArrayBasedMapData] && {
+          val map = v.asInstanceOf[ArrayBasedMapData]
+          map.numElements() == 0 || {
--- End diff --

Are you suggesting something like this?
```
case mt: MapType =>
  v.isInstanceOf[MapData] && {
    val map = v.asInstanceOf[MapData]
    doValidate(map.keyArray(), ArrayType(mt.keyType)) &&
      doValidate(map.valueArray(), ArrayType(mt.valueType))
  }
```
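[Editorial note] The recursive shape of `doValidate` in the diff above can be illustrated with a small, self-contained sketch. This is not Spark's actual code: the `DType` ADT and the `validate` function below are hypothetical stand-ins for Spark's `DataType` hierarchy, with plain Scala collections in place of `InternalRow`, `GenericArrayData`, and `MapData`.

```scala
// Hypothetical mini version of the recursive literal-value validation
// discussed above. BoolT, IntT, ArrT, and MapT stand in for Spark's
// DataType hierarchy; Seq and Map stand in for Spark's internal data types.
sealed trait DType
case object BoolT extends DType
case object IntT extends DType
final case class ArrT(elem: DType) extends DType
final case class MapT(key: DType, value: DType) extends DType

object LiteralCheck {
  // Returns true when the runtime value matches the declared type,
  // recursing into element and key/value types like doValidate does.
  def validate(v: Any, t: DType): Boolean = t match {
    case BoolT => v.isInstanceOf[Boolean]
    case IntT  => v.isInstanceOf[Int]
    case ArrT(e) =>
      v.isInstanceOf[Seq[_]] &&
        v.asInstanceOf[Seq[Any]].forall(validate(_, e))
    case MapT(k, value) =>
      v.isInstanceOf[Map[_, _]] && {
        val m = v.asInstanceOf[Map[Any, Any]]
        m.keys.forall(validate(_, k)) && m.values.forall(validate(_, value))
      }
  }
}
```

One difference worth noting: this sketch checks every element of an array or map, whereas the diff above inspects only the head element of an array, relying on array literals being homogeneous.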
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4021/ Test PASSed.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Merged build finished. Test PASSed.
[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225395382

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ---
@@ -196,6 +197,48 @@ object Literal {
     case other => throw new RuntimeException(s"no default for type $dataType")
   }
+
+  private[expressions] def validateLiteralValue(value: Any, dataType: DataType): Unit = {
+    def doValidate(v: Any, dataType: DataType): Boolean = dataType match {
+      case BooleanType => v.isInstanceOf[Boolean]
+      case ByteType => v.isInstanceOf[Byte]
+      case ShortType => v.isInstanceOf[Short]
+      case IntegerType | DateType => v.isInstanceOf[Int]
+      case LongType | TimestampType => v.isInstanceOf[Long]
+      case FloatType => v.isInstanceOf[Float]
+      case DoubleType => v.isInstanceOf[Double]
+      case _: DecimalType => v.isInstanceOf[Decimal]
+      case CalendarIntervalType => v.isInstanceOf[CalendarInterval]
+      case BinaryType => v.isInstanceOf[Array[Byte]]
+      case StringType => v.isInstanceOf[UTF8String]
+      case st: StructType =>
+        v.isInstanceOf[InternalRow] && {
--- End diff --

done
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22724 **[Test build #97431 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97431/testReport)** for PR 22724 at commit [`56b4778`](https://github.com/apache/spark/commit/56b4778d3ff3467c53ac1a9e4370dfffcb12a4de).
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Merged build finished. Test PASSed.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4020/ Test PASSed.
[GitHub] spark pull request #22636: [SPARK-25629][TEST] Reduce ParquetFilterSuite: fi...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22636
[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225393919

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
@@ -39,29 +40,29 @@ import org.apache.spark.sql.types.DataType
  * @param nullable True if the UDF can return null value.
  * @param udfDeterministic True if the UDF is deterministic. Deterministic UDF returns same result
  *                         each time it is invoked with a particular input.
- * @param nullableTypes which of the inputTypes are nullable (i.e. not primitive)
  */
 case class ScalaUDF(
     function: AnyRef,
     dataType: DataType,
     children: Seq[Expression],
+    handleNullForInputs: Seq[Boolean],
--- End diff --

It could be merged. I guess the preference is to minimize the change required, so we just have a new field. That said, UDFs now really require nullability information to work correctly in Scala 2.12. This is the reason the new field is required and not optional. It already caused a new test to fail, so I'm more persuaded that it's OK to make a less backwards-compatible change to make it clear to any future callers that new info is needed. It's an internal class, so it's reasonable to do so.
[GitHub] spark issue #22636: [SPARK-25629][TEST] Reduce ParquetFilterSuite: filter pu...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22636 Merged to master.
[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225393373

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
@@ -39,29 +40,29 @@ import org.apache.spark.sql.types.DataType
  * @param nullable True if the UDF can return null value.
  * @param udfDeterministic True if the UDF is deterministic. Deterministic UDF returns same result
  *                         each time it is invoked with a particular input.
- * @param nullableTypes which of the inputTypes are nullable (i.e. not primitive)
  */
 case class ScalaUDF(
     function: AnyRef,
     dataType: DataType,
     children: Seq[Expression],
+    handleNullForInputs: Seq[Boolean],
--- End diff --

Adding `handleNullForInputs` doesn't look like it reduces confusion much to me. Since this PR mainly targets refactoring to reduce confusion and improve ease of use, this concern should be addressed.
[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225393220

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
@@ -39,29 +40,29 @@ import org.apache.spark.sql.types.DataType
  * @param nullable True if the UDF can return null value.
  * @param udfDeterministic True if the UDF is deterministic. Deterministic UDF returns same result
  *                         each time it is invoked with a particular input.
- * @param nullableTypes which of the inputTypes are nullable (i.e. not primitive)
  */
 case class ScalaUDF(
     function: AnyRef,
     dataType: DataType,
     children: Seq[Expression],
+    handleNullForInputs: Seq[Boolean],
--- End diff --

Maybe I missed something, but:
1. Why don't we just merge `handleNullForInputs` and `inputTypes`?
2. Why is `handleNullForInputs` required, whereas `inputTypes` defaults to `Nil`?
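For illustration, the merge suggested in point 1 can be sketched in plain, Spark-independent Scala. Everything here (`SimpleType`, `InputInfo`, the field names) is a hypothetical stand-in, not Spark's actual `ScalaUDF` API; the point is only that the two parallel sequences carry the same information as one sequence of pairs.

```scala
// Hedged sketch of the two constructor shapes under discussion, using
// stand-in types rather than Spark's. Primitive-typed inputs are the ones
// that need null handling, which is why the flags track the types.
sealed trait SimpleType
case object IntT extends SimpleType    // primitive on the JVM: cannot carry null
case object StringT extends SimpleType // reference type: null is representable

// Shape 1 (the PR as written): two parallel sequences.
case class UdfA(inputTypes: Seq[SimpleType], handleNullForInputs: Seq[Boolean])

// Shape 2 (the merge suggested above): one sequence of paired values.
case class InputInfo(tpe: SimpleType, handleNull: Boolean)
case class UdfB(inputs: Seq[InputInfo])

// Converting shape 1 to shape 2 is a zip, so no information is lost.
def merged(a: UdfA): UdfB =
  UdfB(a.inputTypes.zip(a.handleNullForInputs).map { case (t, n) => InputInfo(t, n) })
```

The merged shape makes it impossible for the two sequences to get out of sync in length, which is one argument for it.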
[GitHub] spark issue #22727: [SPARK-25735][CORE][MINOR]Improve start-thriftserver.sh:...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22727 Merged build finished. Test PASSed.
[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225392951

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ---
@@ -196,6 +197,48 @@ object Literal {
     case other =>
       throw new RuntimeException(s"no default for type $dataType")
   }
+
+  private[expressions] def validateLiteralValue(value: Any, dataType: DataType): Unit = {
+    def doValidate(v: Any, dataType: DataType): Boolean = dataType match {
+      case BooleanType => v.isInstanceOf[Boolean]
+      case ByteType => v.isInstanceOf[Byte]
+      case ShortType => v.isInstanceOf[Short]
+      case IntegerType | DateType => v.isInstanceOf[Int]
+      case LongType | TimestampType => v.isInstanceOf[Long]
+      case FloatType => v.isInstanceOf[Float]
+      case DoubleType => v.isInstanceOf[Double]
+      case _: DecimalType => v.isInstanceOf[Decimal]
+      case CalendarIntervalType => v.isInstanceOf[CalendarInterval]
+      case BinaryType => v.isInstanceOf[Array[Byte]]
+      case StringType => v.isInstanceOf[UTF8String]
+      case st: StructType =>
+        v.isInstanceOf[InternalRow] && {
+          val row = v.asInstanceOf[InternalRow]
+          st.fields.map(_.dataType).zipWithIndex.forall {
+            case (dt, i) => doValidate(row.get(i, dt), dt)
+          }
+        }
+      case at: ArrayType =>
+        v.isInstanceOf[GenericArrayData] && {
+          val ar = v.asInstanceOf[GenericArrayData].array
+          ar.isEmpty || doValidate(ar.head, at.elementType)
+        }
+      case mt: MapType =>
+        v.isInstanceOf[ArrayBasedMapData] && {
+          val map = v.asInstanceOf[ArrayBasedMapData]
+          map.numElements() == 0 || {
--- End diff --

We don't need this. The array validation already considers numElements.
[GitHub] spark issue #22727: [SPARK-25735][CORE][MINOR]Improve start-thriftserver.sh:...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22727 **[Test build #97430 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97430/testReport)** for PR 22727 at commit [`8b2500f`](https://github.com/apache/spark/commit/8b2500fa97ab258785160df25d3a0f6afdd74a10).
[GitHub] spark issue #22727: [SPARK-25735][CORE][MINOR]Improve start-thriftserver.sh:...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22727 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4019/ Test PASSed.
[GitHub] spark pull request #22727: [SPARK-25735][CORE][MINOR]Improve start-thriftser...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22727#discussion_r225392908

--- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala ---
@@ -71,6 +71,12 @@ object HiveThriftServer2 extends Logging {
   }

   def main(args: Array[String]) {
+    // If the arguments contains "-h" or "--help", print out the usage and exit.
+    if (args.contains("-h") || args.contains("--help")) {
+      HiveServer2.main(args)
--- End diff --

I see. I was following `HiveServer2` in calling `System.exit`. Now I've changed it to `return`.
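The control flow agreed on here can be sketched as a self-contained function. This is not Spark's code: `printHelp` is a hypothetical stand-in for the delegation to `HiveServer2.main`, injected so the behavior is observable without a server.

```scala
// Returning from main instead of calling System.exit keeps the JVM shutdown
// path normal and makes the function testable. The string result is only for
// illustration; the real main returns Unit.
def runThriftServer(args: Array[String], printHelp: Array[String] => Unit): String = {
  if (args.contains("-h") || args.contains("--help")) {
    printHelp(args)
    return "help-shown" // early return, not System.exit
  }
  // ... normal server startup would follow here ...
  "started"
}
```

The design point is that `System.exit` in a library entry point can kill a host JVM (e.g. a test runner); an early `return` only ends this method.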
[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225392843

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ---
@@ -196,6 +197,48 @@ object Literal {
     case other =>
       throw new RuntimeException(s"no default for type $dataType")
   }
+
+  private[expressions] def validateLiteralValue(value: Any, dataType: DataType): Unit = {
+    def doValidate(v: Any, dataType: DataType): Boolean = dataType match {
+      case BooleanType => v.isInstanceOf[Boolean]
+      case ByteType => v.isInstanceOf[Byte]
+      case ShortType => v.isInstanceOf[Short]
+      case IntegerType | DateType => v.isInstanceOf[Int]
+      case LongType | TimestampType => v.isInstanceOf[Long]
+      case FloatType => v.isInstanceOf[Float]
+      case DoubleType => v.isInstanceOf[Double]
+      case _: DecimalType => v.isInstanceOf[Decimal]
+      case CalendarIntervalType => v.isInstanceOf[CalendarInterval]
+      case BinaryType => v.isInstanceOf[Array[Byte]]
+      case StringType => v.isInstanceOf[UTF8String]
+      case st: StructType =>
+        v.isInstanceOf[InternalRow] && {
--- End diff --

Can we do the same for array and map?
[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225392872

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ---
@@ -196,6 +197,48 @@ object Literal {
     case other =>
       throw new RuntimeException(s"no default for type $dataType")
   }
+
+  private[expressions] def validateLiteralValue(value: Any, dataType: DataType): Unit = {
+    def doValidate(v: Any, dataType: DataType): Boolean = dataType match {
+      case BooleanType => v.isInstanceOf[Boolean]
+      case ByteType => v.isInstanceOf[Byte]
+      case ShortType => v.isInstanceOf[Short]
+      case IntegerType | DateType => v.isInstanceOf[Int]
+      case LongType | TimestampType => v.isInstanceOf[Long]
+      case FloatType => v.isInstanceOf[Float]
+      case DoubleType => v.isInstanceOf[Double]
+      case _: DecimalType => v.isInstanceOf[Decimal]
+      case CalendarIntervalType => v.isInstanceOf[CalendarInterval]
+      case BinaryType => v.isInstanceOf[Array[Byte]]
+      case StringType => v.isInstanceOf[UTF8String]
+      case st: StructType =>
+        v.isInstanceOf[InternalRow] && {
+          val row = v.asInstanceOf[InternalRow]
+          st.fields.map(_.dataType).zipWithIndex.forall {
+            case (dt, i) => doValidate(row.get(i, dt), dt)
+          }
+        }
+      case at: ArrayType =>
+        v.isInstanceOf[GenericArrayData] && {
+          val ar = v.asInstanceOf[GenericArrayData].array
+          ar.isEmpty || doValidate(ar.head, at.elementType)
--- End diff --

I think we need to validate all the elements.
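The gap being pointed out can be reproduced with plain Scala collections standing in for Spark's `GenericArrayData`; the function names below are illustrative only, not the PR's code.

```scala
// Head-only validation, mirroring the shape of the PR's array case, versus
// validating every element as the review suggests. A plain Seq stands in
// for Spark's GenericArrayData.
def validateHeadOnly(arr: Seq[Any], ok: Any => Boolean): Boolean =
  arr.isEmpty || ok(arr.head)

def validateAll(arr: Seq[Any], ok: Any => Boolean): Boolean =
  arr.forall(ok) // forall is also true for the empty array, so no special case

val isInt = (v: Any) => v.isInstanceOf[Int]
val mixed: Seq[Any] = Seq(1, "not-an-int", 3)
// The head-only check accepts this heterogeneous array; the full check rejects it.
```

The trade-off raised later in the thread is that the full check costs O(n) per literal, which is why the head-only version was considered acceptable.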
[GitHub] spark pull request #22708: [SPARK-21402] Fix java array/map of structs deser...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22708#discussion_r225391643

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -282,6 +283,27 @@ case class StaticInvoke(
   }
 }
+
+/**
+ * When constructing [[Invoke]], the data type must be given, which may not be possible to define
+ * before analysis. This class acts like a placeholder for [[Invoke]], and will be replaced by
+ * [[Invoke]] during analysis after the input data is resolved. The data type passed to [[Invoke]]
+ * will be defined by applying [[dataTypeFunction]] to the data type of the input data.
+ */
+case class UnresolvedInvoke(
--- End diff --

Should we move this to `unresolved.scala`? cc @cloud-fan
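The placeholder pattern described in that scaladoc can be modeled in a few lines of plain Scala. These are illustrative types, not Spark's `Expression` hierarchy: the unresolved node carries a function from the child's data type to the final data type, applied once the child is resolved.

```scala
// Minimal model of "placeholder replaced during analysis". Data types are
// plain strings here purely for illustration.
sealed trait Expr { def resolved: Boolean }
case class ResolvedExpr(dataType: String) extends Expr { def resolved = true }
case class UnresolvedInvokeLike(child: Expr, dataTypeFunction: String => String) extends Expr {
  def resolved = false
}

// One analysis step: replace the placeholder once its child is resolved,
// computing the final type by applying dataTypeFunction to the child's type.
def resolveStep(e: Expr): Expr = e match {
  case UnresolvedInvokeLike(c: ResolvedExpr, f) => ResolvedExpr(f(c.dataType))
  case other => other
}
```

This mirrors how analyzer rules generally work: a rule fires only when its inputs are resolved, and leaves the node untouched otherwise.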
[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22379 **[Test build #97429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97429/testReport)** for PR 22379 at commit [`cb23bd7`](https://github.com/apache/spark/commit/cb23bd7c8c1d91661ff7c8241768df51891f6b7d).
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97427/ Test FAILed.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Merged build finished. Test FAILed.
[GitHub] spark issue #22729: [SPARK-25737][CORE] Remove JavaSparkContextVarargsWorkar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22729 **[Test build #97428 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97428/testReport)** for PR 22729 at commit [`0860d27`](https://github.com/apache/spark/commit/0860d27a205d3dd3d94e6bbe2c9db49b7e432ef4).
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22724

**[Test build #97427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97427/testReport)** for PR 22724 at commit [`a30e9ce`](https://github.com/apache/spark/commit/a30e9ce7b3a5b0965801fe69e8aabb31de50bc13).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #22729: [SPARK-25737][CORE] Remove JavaSparkContextVarargsWorkar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22729 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4018/ Test PASSed.
[GitHub] spark issue #22729: [SPARK-25737][CORE] Remove JavaSparkContextVarargsWorkar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22729 Merged build finished. Test PASSed.
[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22728 LGTM
[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225389273

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ---
@@ -196,6 +197,48 @@ object Literal {
     case other =>
       throw new RuntimeException(s"no default for type $dataType")
   }
+
+  private[expressions] def validateLiteralValue(value: Any, dataType: DataType): Unit = {
+    def doValidate(v: Any, dataType: DataType): Boolean = dataType match {
+      case BooleanType => v.isInstanceOf[Boolean]
+      case ByteType => v.isInstanceOf[Byte]
+      case ShortType => v.isInstanceOf[Short]
+      case IntegerType | DateType => v.isInstanceOf[Int]
+      case LongType | TimestampType => v.isInstanceOf[Long]
+      case FloatType => v.isInstanceOf[Float]
+      case DoubleType => v.isInstanceOf[Double]
+      case _: DecimalType => v.isInstanceOf[Decimal]
+      case CalendarIntervalType => v.isInstanceOf[CalendarInterval]
+      case BinaryType => v.isInstanceOf[Array[Byte]]
+      case StringType => v.isInstanceOf[UTF8String]
+      case st: StructType =>
+        v.isInstanceOf[InternalRow] && {
+          val row = v.asInstanceOf[InternalRow]
+          st.fields.map(_.dataType).zipWithIndex.forall {
+            case (dt, i) => doValidate(row.get(i, dt), dt)
+          }
+        }
+      case at: ArrayType =>
+        v.isInstanceOf[GenericArrayData] && {
+          val ar = v.asInstanceOf[GenericArrayData].array
+          ar.isEmpty || doValidate(ar.head, at.elementType)
+        }
+      case mt: MapType =>
+        v.isInstanceOf[ArrayBasedMapData] && {
+          val map = v.asInstanceOf[ArrayBasedMapData]
+          map.numElements() == 0 || {
+            doValidate(map.keyArray.array.head, mt.keyType) &&
+              doValidate(map.valueArray.array.head, mt.valueType)
+          }
+        }
+      case ObjectType(cls) => cls.isInstance(v)
+      case udt: UserDefinedType[_] => doValidate(v, udt.sqlType)
+      case _ => false
--- End diff --

Do we need to add a `NullType` case?
[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22263 @dongjoon-hyun can you do a final check and merge this?
[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225388391

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ---
@@ -196,6 +197,48 @@ object Literal {
     case other =>
       throw new RuntimeException(s"no default for type $dataType")
   }
+
+  private[expressions] def validateLiteralValue(value: Any, dataType: DataType): Unit = {
+    def doValidate(v: Any, dataType: DataType): Boolean = dataType match {
+      case BooleanType => v.isInstanceOf[Boolean]
+      case ByteType => v.isInstanceOf[Byte]
+      case ShortType => v.isInstanceOf[Short]
+      case IntegerType | DateType => v.isInstanceOf[Int]
+      case LongType | TimestampType => v.isInstanceOf[Long]
+      case FloatType => v.isInstanceOf[Float]
+      case DoubleType => v.isInstanceOf[Double]
+      case _: DecimalType => v.isInstanceOf[Decimal]
+      case CalendarIntervalType => v.isInstanceOf[CalendarInterval]
+      case BinaryType => v.isInstanceOf[Array[Byte]]
+      case StringType => v.isInstanceOf[UTF8String]
+      case st: StructType =>
+        v.isInstanceOf[InternalRow] && {
+          val row = v.asInstanceOf[InternalRow]
+          st.fields.map(_.dataType).zipWithIndex.forall {
+            case (dt, i) => doValidate(row.get(i, dt), dt)
+          }
+        }
+      case at: ArrayType =>
+        v.isInstanceOf[GenericArrayData] && {
+          val ar = v.asInstanceOf[GenericArrayData].array
+          ar.isEmpty || doValidate(ar.head, at.elementType)
+        }
+      case mt: MapType =>
+        v.isInstanceOf[ArrayBasedMapData] && {
+          val map = v.asInstanceOf[ArrayBasedMapData]
+          map.numElements() == 0 || {
+            doValidate(map.keyArray.array.head, mt.keyType) &&
+              doValidate(map.valueArray.array.head, mt.valueType)
+          }
+        }
--- End diff --

Since checking all the elements seems expensive, the current one is OK to me.
[GitHub] spark issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hado...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21588 ping @wangyum, if you're willing to make progress on this, please provide some input here and/or in the JIRA.
[GitHub] spark issue #22742: [SPARK-25588][WIP] SchemaParseException: Can't redefine:...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22742

**[Test build #97426 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97426/testReport)** for PR 22742 at commit [`fd238e0`](https://github.com/apache/spark/commit/fd238e0149966c5b12b38443f7dba13f5d6878b2).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class Inner(`
   * `case class Middle(`
   * `case class Outer(`
   * `class Spark25588Suite extends QueryTest with SharedSQLContext `
[GitHub] spark issue #22742: [SPARK-25588][WIP] SchemaParseException: Can't redefine:...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22742 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97426/ Test FAILed.
[GitHub] spark issue #22742: [SPARK-25588][WIP] SchemaParseException: Can't redefine:...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22742 Merged build finished. Test FAILed.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Merged build finished. Test PASSed.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22724 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4017/ Test PASSed.
[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225387436

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ---
@@ -196,6 +197,48 @@ object Literal {
     case other =>
       throw new RuntimeException(s"no default for type $dataType")
   }
+
+  private[expressions] def validateLiteralValue(value: Any, dataType: DataType): Unit = {
+    def doValidate(v: Any, dataType: DataType): Boolean = dataType match {
+      case BooleanType => v.isInstanceOf[Boolean]
+      case ByteType => v.isInstanceOf[Byte]
+      case ShortType => v.isInstanceOf[Short]
+      case IntegerType | DateType => v.isInstanceOf[Int]
+      case LongType | TimestampType => v.isInstanceOf[Long]
+      case FloatType => v.isInstanceOf[Float]
+      case DoubleType => v.isInstanceOf[Double]
+      case _: DecimalType => v.isInstanceOf[Decimal]
+      case CalendarIntervalType => v.isInstanceOf[CalendarInterval]
+      case BinaryType => v.isInstanceOf[Array[Byte]]
+      case StringType => v.isInstanceOf[UTF8String]
+      case st: StructType =>
+        v.isInstanceOf[InternalRow] && {
+          val row = v.asInstanceOf[InternalRow]
+          st.fields.map(_.dataType).zipWithIndex.forall {
+            case (dt, i) => doValidate(row.get(i, dt), dt)
+          }
+        }
+      case at: ArrayType =>
+        v.isInstanceOf[GenericArrayData] && {
+          val ar = v.asInstanceOf[GenericArrayData].array
+          ar.isEmpty || doValidate(ar.head, at.elementType)
+        }
+      case mt: MapType =>
+        v.isInstanceOf[ArrayBasedMapData] && {
+          val map = v.asInstanceOf[ArrayBasedMapData]
+          map.numElements() == 0 || {
+            doValidate(map.keyArray.array.head, mt.keyType) &&
+              doValidate(map.valueArray.array.head, mt.valueType)
+          }
+        }
--- End diff --

I'm wondering whether we should check all the elements for `ArrayType` and `MapType`.
[GitHub] spark issue #22742: [SPARK-25588][WIP] SchemaParseException: Can't redefine:...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22742 **[Test build #97426 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97426/testReport)** for PR 22742 at commit [`fd238e0`](https://github.com/apache/spark/commit/fd238e0149966c5b12b38443f7dba13f5d6878b2).
[GitHub] spark issue #22742: [SPARK-25588][WIP] SchemaParseException: Can't redefine:...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22742 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4016/ Test PASSed.
[GitHub] spark issue #22742: [SPARK-25588][WIP] SchemaParseException: Can't redefine:...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22742 Merged build finished. Test PASSed.
[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22724 **[Test build #97427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97427/testReport)** for PR 22724 at commit [`a30e9ce`](https://github.com/apache/spark/commit/a30e9ce7b3a5b0965801fe69e8aabb31de50bc13).
[GitHub] spark pull request #22742: [SPARK-25588][WIP] SchemaParseException: Can't re...
GitHub user heuermh opened a pull request: https://github.com/apache/spark/pull/22742

[SPARK-25588][WIP] SchemaParseException: Can't redefine: list when reading from Parquet

## What changes were proposed in this pull request?

Added a failing unit test that demonstrates the issue described in SPARK-25588.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/heuermh/spark SPARK-25588

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22742.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22742

commit fd238e0149966c5b12b38443f7dba13f5d6878b2
Author: Michael L Heuer
Date: 2018-10-16T03:31:59Z

    [SPARK-25588][WIP] SchemaParseException: Can't redefine: list when reading from Parquet
[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22381 Merged build finished. Test PASSed.
[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22381 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97415/ Test PASSed.
[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22381

**[Test build #97415 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97415/testReport)** for PR 22381 at commit [`a08836c`](https://github.com/apache/spark/commit/a08836cc9ca91164cf5dd95a63929d0613779cf2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22504 Merged build finished. Test PASSed.
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22504 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97414/ Test PASSed.
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22504

**[Test build #97414 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97414/testReport)** for PR 22504 at commit [`a46734c`](https://github.com/apache/spark/commit/a46734c46355b31863178ff9f0169118ca15a695).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #21929: [SPARK-24970][Kinesis] Create WriteAheadLogBackedBlockRD...
Github user brucezhao11 commented on the issue: https://github.com/apache/spark/pull/21929 Thank you, @shuyang-truex. Really glad to hear that it works in your production environment.
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22204 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4015/ Test PASSed.
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22204 **[Test build #97425 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97425/testReport)** for PR 22204 at commit [`638b650`](https://github.com/apache/spark/commit/638b650b323b8856302e88705337c814482871c6).
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22204 Merged build finished. Test PASSed.
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22204 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4014/ Test PASSed.
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22204 **[Test build #97424 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97424/testReport)** for PR 22204 at commit [`8d6f599`](https://github.com/apache/spark/commit/8d6f599c26fbfea75194d06bcb779b966872f35f).
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22204 Build finished. Test PASSed.
[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225384590

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ---
@@ -196,6 +197,48 @@ object Literal {
     case other => throw new RuntimeException(s"no default for type $dataType")
   }
+
+  private[expressions] def validateLiteralValue(value: Any, dataType: DataType): Unit = {
+    def doValidate(v: Any, dataType: DataType): Boolean = dataType match {
+      case BooleanType => v.isInstanceOf[Boolean]
+      case ByteType => v.isInstanceOf[Byte]
+      case ShortType => v.isInstanceOf[Short]
+      case IntegerType | DateType => v.isInstanceOf[Int]
+      case LongType | TimestampType => v.isInstanceOf[Long]
+      case FloatType => v.isInstanceOf[Float]
+      case DoubleType => v.isInstanceOf[Double]
+      case _: DecimalType => v.isInstanceOf[Decimal]
+      case CalendarIntervalType => v.isInstanceOf[CalendarInterval]
+      case BinaryType => v.isInstanceOf[Array[Byte]]
+      case StringType => v.isInstanceOf[UTF8String]
+      case st: StructType =>
+        v.isInstanceOf[GenericInternalRow] && {
--- End diff --

ah, yes. ok, I'll fix.
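The validation pattern in the diff above (checking that a literal's runtime value matches the type it claims to carry, recursing into struct fields) can be sketched in a self-contained way. The type names below are illustrative stand-ins, not Spark's actual Catalyst classes:

```scala
// Minimal sketch of value-vs-declared-type validation, in the style of the
// doValidate helper from the diff. IntType, StringType, and StructType here
// are hypothetical stand-ins; Spark's real implementation matches on
// Catalyst DataTypes and internal row representations.
sealed trait DType
case object IntType extends DType
case object LongType extends DType
case object StringType extends DType
case class StructT(fields: Seq[DType]) extends DType

def doValidate(v: Any, dataType: DType): Boolean = dataType match {
  case IntType    => v.isInstanceOf[Int]
  case LongType   => v.isInstanceOf[Long]
  case StringType => v.isInstanceOf[String]
  case StructT(fields) =>
    // A struct value must be a row of the right arity whose fields
    // each validate against the corresponding declared field type.
    v.isInstanceOf[Seq[_]] && {
      val row = v.asInstanceOf[Seq[Any]]
      row.length == fields.length &&
        row.zip(fields).forall { case (f, t) => doValidate(f, t) }
    }
}

assert(doValidate(1, IntType))
assert(!doValidate("1", IntType))
assert(doValidate(Seq(1, "a"), StructT(Seq(IntType, StringType))))
assert(!doValidate(Seq(1), StructT(Seq(IntType, StringType))))
```

Note the `isInstanceOf` checks only see erased runtime classes, which is why the struct case must validate each field recursively rather than relying on the collection's element type.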
[GitHub] spark issue #22722: [SPARK-24432][k8s] Add support for dynamic resource allo...
Github user huafengw commented on the issue: https://github.com/apache/spark/pull/22722 Hi @liyinan926 @mccheah, thanks for the reminder. It looks like this is quite a new design that needs a lot of work, so what is its current status? And is it planned for version 3.0?
[GitHub] spark issue #22705: [SPARK-25704][CORE] Allocate a bit less than Int.MaxValu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22705 **[Test build #97423 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97423/testReport)** for PR 22705 at commit [`c811476`](https://github.com/apache/spark/commit/c811476410b4b6e5c29265ce60ec1c2a327209f4).