Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/23208
@cloud-fan, what are you suggesting to use as a design? If you think this
shouldn't mirror the read side, then let's be clear on what it should look
like. Maybe that's a design doc,
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239889152
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,52 +17,49 @@
package
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239888975
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SupportsBatchWrite.java
---
@@ -25,14 +25,14 @@
import
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239888795
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java ---
@@ -25,7 +25,10 @@
* The base interface for v2 data sources
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239613722
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SupportsBatchWrite.java
---
@@ -25,14 +25,14 @@
import
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239613088
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,52 +17,49 @@
package
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/23208
@cloud-fan, I see that this adds `Table` and uses `TableProvider`, but I
was expecting this to also update the write side to mirror the read side, like
PR #22190 for [SPARK-25188](https
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239598346
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -241,32 +241,28 @@ final class DataFrameWriter[T] private[sql](ds
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239596456
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -241,32 +241,28 @@ final class DataFrameWriter[T] private[sql](ds
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239581374
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java ---
@@ -25,7 +25,10 @@
* The base interface for v2 data sources
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239578059
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java ---
@@ -25,7 +25,10 @@
* The base interface for v2 data sources
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239559037
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java ---
@@ -25,7 +25,10 @@
* The base interface for v2 data sources
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/23055
@HyukjinKwon, for the future, I should note that I'm not a committer so my
+1 for a PR is not binding. I'm fairly sure @vanzin would +1 this commit as
well, but it's best not to me
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/23208
Thanks for posting this PR @cloud-fan! I'll have a look in the next day or
so.
---
-
To unsubscribe, e-mail: reviews-uns
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/23055
+1 with the latest changes. Thanks for taking care of this, @HyukjinKwon!
Functionality is in two parts: changing the resource requests (which
doesn't change) and limiting memory u
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r238046730
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/PartitionTransforms.java
---
@@ -0,0 +1,229 @@
+/*
+ * Licensed to the Apache
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r237995405
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r237984050
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/TableCatalog.java ---
@@ -0,0 +1,137 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21306
@stczwd, I agree with @mccheah. Tables are basically named data sets.
Whether they support batch, micro-batch streaming, or continuous streaming is
determined by checking whether they implement
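The capability-by-interface idea described above can be sketched as follows. This is an illustrative sketch of the pattern (a table is a named data set, and each execution mode it supports is signaled by implementing a marker interface), not Spark's actual API; the class and function names here are made up for the example, though `SupportsBatchRead` echoes the interface name discussed in these reviews.

```python
# Sketch of the capability-interface pattern: support for each execution
# mode is declared by implementing a marker interface, not by a separate
# "table type" field. Illustrative only; not Spark's actual API.

class Table:
    """A named data set."""
    def __init__(self, name):
        self.name = name

class SupportsBatchRead:
    """Marker: the table can be scanned as a batch source."""

class SupportsContinuousRead:
    """Marker: the table can be scanned as a continuous stream."""

class KafkaTable(Table, SupportsBatchRead, SupportsContinuousRead):
    pass

class ParquetTable(Table, SupportsBatchRead):
    pass

def supported_modes(table):
    # A planner checks capabilities the same way: by interface.
    modes = []
    if isinstance(table, SupportsBatchRead):
        modes.append("batch")
    if isinstance(table, SupportsContinuousRead):
        modes.append("continuous")
    return modes
```

For example, `supported_modes(ParquetTable("logs"))` reports only batch support, while the Kafka table reports both batch and continuous.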
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r237975013
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/CatalogProvider.java
---
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r237974718
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/PartitionTransforms.java
---
@@ -0,0 +1,229 @@
+/*
+ * Licensed to the Apache
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r237974410
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/Table.java ---
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r237973548
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/catalog/v2/V1MetadataTable.scala
---
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r237972742
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/catalog/v2/V1MetadataTable.scala
---
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r237972182
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/TableCatalog.java ---
@@ -0,0 +1,137 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r237971241
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/Table.java ---
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r237971288
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/Table.java ---
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r237971092
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/PartitionTransforms.java
---
@@ -0,0 +1,229 @@
+/*
+ * Licensed to the Apache
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/23086
@cloud-fan, thanks for getting this done! I'll wait for the equivalent
write-side PR.

Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/23086
> I still do not think we should mix the catalog support with the data
source APIs
We are trying to keep these separate. `Table` is the only overlap between
the two. If you prefer m
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r237966188
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/Scan.java ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/23055
+1 once the docs are updated to note that resource requests still include
python memory, even in Windows.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r237963488
--- Diff: docs/configuration.md ---
@@ -190,6 +190,8 @@ of the most common options to set are:
and it is up to the application to avoid exceeding
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21978
Rebased on master.
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/23086
+1
There are only minor suggestions left from me. I'd like to see the default
implementation of `Table.name` removed, but I don't think that should block
commi
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r237670228
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala
---
@@ -22,86 +22,56 @@ import
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r237670099
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/Scan.java ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r237668483
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java
---
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21978#discussion_r237660050
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/identifiers.scala ---
@@ -18,48 +18,106 @@
package org.apache.spark.sql.catalyst
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21978#discussion_r237585203
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/identifiers.scala ---
@@ -18,48 +18,106 @@
package org.apache.spark.sql.catalyst
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21306
@stczwd, thanks for taking a look at this. What are the differences between
batch and stream DDL that you think will come up
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r237179854
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala
---
@@ -54,27 +53,17 @@ case class
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r237178976
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala
---
@@ -23,29 +23,28 @@ import
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r237176552
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -38,7 +38,7 @@ import org.apache.spark.sql.execution.datasources.jdbc
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r237176100
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java ---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r237172065
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r237169532
--- Diff: python/pyspark/worker.py ---
@@ -22,7 +22,12 @@
import os
import sys
import time
-import resource
+# 'resource'
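The diff above guards the `resource` import because that module is Unix-only and unavailable on Windows. A minimal sketch of the pattern under discussion (guard the import, apply the limit only where supported) might look like this; it is an illustration of the idea, not the actual pyspark worker code, and `set_memory_limit` is a name invented for the example:

```python
import sys

# 'resource' is a Unix-only stdlib module; guard the import so the worker
# can still start on Windows, where the limit simply is not applied.
has_resource = False
if sys.platform != "win32":
    try:
        import resource
        has_resource = True
    except ImportError:
        pass

def set_memory_limit(limit_mb):
    """Best-effort cap on this process's address space; no-op where unsupported."""
    if not has_resource:
        return False
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    limit = limit_mb * 1024 * 1024
    try:
        # Lower only the soft limit; keep the existing hard limit.
        resource.setrlimit(resource.RLIMIT_AS, (limit, hard))
        return True
    except ValueError:
        # Some platforms refuse to adjust RLIMIT_AS this way.
        return False
```

Note that even when the limit cannot be applied, the review comments point out that the container resource request should still account for the configured Python memory.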
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/23086
@cloud-fan, sorry to spread review comments over two days, but I've
finished the first pass. Overall, it looks great.
I think we can simplify a couple of areas, like all of the args p
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236859358
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala
---
@@ -396,87 +392,66 @@ object SimpleReaderFactory extends
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236858793
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
---
@@ -116,16 +116,20 @@ object
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236858449
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala
---
@@ -54,27 +53,17 @@ case class
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236858107
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala
---
@@ -23,29 +23,28 @@ import
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236857220
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala
---
@@ -23,29 +23,28 @@ import
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236856960
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala
---
@@ -23,29 +23,28 @@ import
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236852153
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -170,15 +157,24 @@ object
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236850263
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236849290
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -40,8 +40,8 @@ import
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236844174
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -38,7 +38,7 @@ import org.apache.spark.sql.execution.datasources.jdbc
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236823417
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/Scan.java ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236820896
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/Batch.java ---
@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236820065
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java ---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236819758
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java ---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236818511
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java ---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236816739
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236796331
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236491385
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23086#discussion_r236487464
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r236480711
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r236345625
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r235082191
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r234691652
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22547
I agree that there is consensus for the proposal in the design doc and I
don't think there are any blockers. If there's something I can do to help,
please let me know. Otherwise ping me
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r234286173
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r234084002
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/23055
Thanks for fixing this so quickly, @HyukjinKwon! I'd like a couple of
changes, but overall it is going in the right direction.
We should also plan on porting this to the 2.4 branch wh
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r234080578
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r234080290
--- Diff: python/pyspark/worker.py ---
@@ -268,9 +272,11 @@ def main(infile, outfile):
# set up memory limits
memory_limit_mb
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r231707076
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/TableChange.java ---
@@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r231706583
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/Table.java ---
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22547#discussion_r230528510
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousInputStream.scala
---
@@ -46,17 +45,22 @@ import
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21306
@felixcheung, we're waiting on more reviews and a community decision about
how to pass partition transforms.
For passing transforms, I think the most reasonable compromise is to go
w
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22547
@jose-torres, I don't mean that the primary purpose of the v2 API is for
catalog integration, I mean that the primary use of v2 is with tables that are
stored in some catalog. So we should make
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22547#discussion_r226798538
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/Format.java ---
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22547#discussion_r226798213
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/Format.java ---
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22547
After looking at the changes, I want to reiterate that request for a design
doc. I think that code is a great way to prototype a design, but that we need
to step back and make sure that the design
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22547#discussion_r226796934
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -173,12 +185,17 @@ object
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22547#discussion_r226790252
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/streaming/InputStream.java
---
@@ -17,14 +17,18 @@
package
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22547#discussion_r226789748
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java
---
@@ -15,37 +15,43 @@
* limitations under the License
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22547#discussion_r226789610
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SupportsBatchRead.java
---
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22547#discussion_r226785695
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/BatchScan.java ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22547#discussion_r226784919
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SupportsBatchRead.java
---
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22547#discussion_r226783272
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala
---
@@ -106,85 +107,96 @@ private[kafka010] class
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22547
@cloud-fan, is there a design doc that outlines these changes and the new
API structure?
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22547#discussion_r226782371
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousInputStream.scala
---
@@ -46,17 +45,22 @@ import
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r226780862
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -169,15 +174,16 @@ object
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22501#discussion_r226765772
--- Diff: sql/core/benchmarks/WideSchemaBenchmark-results.txt ---
@@ -1,117 +1,145 @@
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14 on Mac OS X 10.11.6
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22547
@cloud-fan, sorry to look at this so late, I was out on vacation for a
little while. Is this about ready for review?
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22573
@dongjoon-hyun, Iceberg schema evolution is based on the field IDs, not on
names. The current table schema's names are the runtime names for columns in
that table, and all reads happen by
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22573
The approach we've taken in Iceberg is to allow `.` in names by using an
index in the top-level schema. The full path of every leaf in the schema is
produced and added to a map from the full
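The indexing scheme described here can be sketched as follows: every leaf field carries a stable ID, the full dotted path of each leaf is precomputed into a map, and a column reference is resolved by exact lookup in that map rather than by splitting the reference string on `.`. This is a simplified illustration of the idea (field layout and function names are invented for the example), not Iceberg's actual implementation:

```python
# Sketch: resolve column references by full-path lookup instead of
# splitting on '.'. Field IDs are stable, so the table's current names are
# only the runtime names and renames do not break readers.
# Illustrative only; not Iceberg's actual code.

def index_leaves(fields, prefix=""):
    """Map the full dotted path of every leaf field to its stable field ID."""
    index = {}
    for f in fields:
        path = f["name"] if not prefix else prefix + "." + f["name"]
        if "fields" in f:  # struct: recurse into children
            index.update(index_leaves(f["fields"], path))
        else:
            index[path] = f["id"]
    return index

schema = [
    {"name": "id", "id": 1},
    {"name": "location", "id": 2, "fields": [
        {"name": "lat", "id": 3},
        {"name": "long", "id": 4},
    ]},
    {"name": "a.b", "id": 5},  # a leaf whose own name contains a dot
]

index = index_leaves(schema)
```

With this index, `"location.lat"` resolves to field 3 and `"a.b"` resolves directly to field 5; because resolution is an exact map lookup, a dot inside a field name never has to be disambiguated from a struct path separator.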
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22413
Thanks @MaxGekk, sorry for the original omission!