[GitHub] spark pull request #22688: [SPARK-25700][SQL] Creates ReadSupport in only Ap...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22688 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22688: [SPARK-25700][SQL] Creates ReadSupport in only Ap...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22688#discussion_r224341077 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala --- @@ -116,7 +116,6 @@ class SimpleWritableDataSource extends DataSourceV2 schema: StructType, mode: SaveMode, options: DataSourceOptions): Optional[BatchWriteSupport] = { -assert(DataType.equalsStructurally(schema.asNullable, this.schema.asNullable)) --- End diff -- Yea .. but it's in test code and just sanity check.. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22688: [SPARK-25700][SQL] Creates ReadSupport in only Ap...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22688#discussion_r224337926 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala --- @@ -116,7 +116,6 @@ class SimpleWritableDataSource extends DataSourceV2 schema: StructType, mode: SaveMode, options: DataSourceOptions): Optional[BatchWriteSupport] = { -assert(DataType.equalsStructurally(schema.asNullable, this.schema.asNullable)) --- End diff -- For modes other than Append, I think we still need this assert, don't we? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22688: [SPARK-25700][SQL] Creates ReadSupport in only Ap...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22688#discussion_r224326923 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala --- @@ -351,6 +351,21 @@ class DataSourceV2Suite extends QueryTest with SharedSQLContext { } } } + + test("SPARK-25700: do not read schema when writing in other modes except append mode") { +withTempPath { file => + val cls = classOf[SimpleWriteOnlyDataSource] + val path = file.getCanonicalPath + val df = spark.range(5).select('id as 'i, -'id as 'j) + try { +df.write.format(cls.getName).option("path", path).mode("error").save() +df.write.format(cls.getName).option("path", path).mode("overwrite").save() +df.write.format(cls.getName).option("path", path).mode("ignore").save() + } catch { +case e: SchemaReadAttemptException => fail("Schema read was attempted.", e) + } --- End diff -- Yup --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22688: [SPARK-25700][SQL] Creates ReadSupport in only Ap...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22688#discussion_r224326576 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala --- @@ -351,6 +351,21 @@ class DataSourceV2Suite extends QueryTest with SharedSQLContext { } } } + + test("SPARK-25700: do not read schema when writing in other modes except append mode") { +withTempPath { file => + val cls = classOf[SimpleWriteOnlyDataSource] + val path = file.getCanonicalPath + val df = spark.range(5).select('id as 'i, -'id as 'j) + try { +df.write.format(cls.getName).option("path", path).mode("error").save() +df.write.format(cls.getName).option("path", path).mode("overwrite").save() +df.write.format(cls.getName).option("path", path).mode("ignore").save() + } catch { +case e: SchemaReadAttemptException => fail("Schema read was attempted.", e) + } --- End diff -- To validate new code path [line 250](https://github.com/apache/spark/pull/22688/files#diff-94fbd986b04087223f53697d4b6cab24R250), could you add `intercept[SchemaReadAttemptException]` and do `append`, too? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22688: [SPARK-25700][SQL] Creates ReadSupport in only Ap...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22688#discussion_r224316297 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala --- @@ -351,6 +351,21 @@ class DataSourceV2Suite extends QueryTest with SharedSQLContext { } } } + + test("SPARK-25700: do not read schema when writing in other modes except append mode") { +withTempPath { file => + val cls = classOf[SimpleWriteOnlyDataSource] + val path = file.getCanonicalPath + val df = spark.range(5).select('id as 'i, -'id as 'j) --- End diff -- The write path looks requiring two columns: https://github.com/apache/spark/blob/e06da95cd9423f55cdb154a2778b0bddf7be984c/sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala#L214 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22688: [SPARK-25700][SQL] Creates ReadSupport in only Ap...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22688#discussion_r224316130 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala --- @@ -351,6 +351,21 @@ class DataSourceV2Suite extends QueryTest with SharedSQLContext { } } } + + test("SPARK-25700: do not read schema when writing in other modes except append mode") { +withTempPath { file => + val cls = classOf[SimpleWriteOnlyDataSource] + val path = file.getCanonicalPath + val df = spark.range(5).select($"id", $"id") --- End diff -- The write path looks requiring two columns: https://github.com/apache/spark/blob/e06da95cd9423f55cdb154a2778b0bddf7be984c/sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala#L214 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22688: [SPARK-25700][SQL] Creates ReadSupport in only Ap...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22688#discussion_r224153758 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala --- @@ -190,12 +190,13 @@ class DataSourceV2Suite extends QueryTest with SharedSQLContext { test("simple writable data source") { // TODO: java implementation. +val writeOnlySource = classOf[SimpleWriteOnlyDataSource] --- End diff -- can we create a new test case? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22688: [SPARK-25700][SQL] Creates ReadSupport in only Ap...
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/22688 [SPARK-25700][SQL] Creates ReadSupport in only Append Mode in Data Source V2 write path ## What changes were proposed in this pull request? Alternative for https://github.com/apache/spark/pull/22686: In other save modes, we don't need to make a readsupport and read schema. ## How was this patch tested? Unit test and manual tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/HyukjinKwon/spark append-revert-2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22688.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22688 commit a445e163675eb330eade07f1cbdd88b99caab117 Author: hyukjinkwon Date: 2018-10-10T10:38:55Z Creates ReadSupport in only Append Mode in Data Source V2 write path --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org