Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22358#discussion_r218916987
--- Diff: docs/sql-programming-guide.md ---
@@ -965,6 +965,8 @@ Configuration of Parquet can be done using the
`setConf` method on `SparkSession
Github user rdblue closed the pull request at:
https://github.com/apache/spark/pull/13206
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22413
+1, thanks for fixing this, @dongjoon-hyun!
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22388
Thanks for doing this, @cloud-fan! Sorry I'm late to reply, I was at Strata
all last week.
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21308
@tigerquoll, I'm talking about the DataSourceV2 API in general. I'm not
sure there is value in exposing partitions, but I'd be happy to hear
why you think they are valuable and think
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21308#discussion_r216382704
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/DeleteSupport.java ---
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21308
@tigerquoll, I'm not debating whether we should or shouldn't expose
partitions here. In general, I'm undecided. I don't think that the API proposed
here needs to support a first-class partition
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21308
@tigerquoll, there is currently no support to expose partitions through the
v2 API. That would be a different operation. If you wanted to implement
partition operations through this API, then you
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21306
@tigerquoll, the proposal isn't to make partitions part of table
configuration. It is to make the partitioning scheme part of the table
configuration. How sources choose to handle individual
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21308
@tigerquoll, what we come up with needs to work across a variety of data
sources, including those like JDBC that can delete at a lower granularity than
partition.
For Hive tables
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21306
> Can we support column range partition predicates please?
This has an "apply" transform for passing other functions directly through,
so that may help if you have addition
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r214978286
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/TableChange.java ---
@@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r214977998
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/TableChange.java ---
@@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22255
@npoberezkin, Parquet already supports custom key-value metadata in the
file footer. The Spark version would go
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22298
Looks fine to me, but I'm not familiar enough with the K8S code to have
much of an opinion.
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22281
Thanks for working on this, @HyukjinKwon. I think it's great that this is
getting the conversation started. I agree with @cloud-fan that we should think
through how we want v2 to work
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22255
I don't think this fits the intent of the model name. The model name is
intended to encode which data model was written to Parquet. I can
write Avro records to a Parquet file
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21977
@vanzin, thanks for merging! And thanks to everyone for the reviews!
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r213777162
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -62,14 +63,20 @@ private[spark] object PythonEvalType
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21977
The last couple of commits have failed a test case, but there have been no
code changes since a passing test. I think master is just a bit flaky right now
and that this PR is fine
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r213407352
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -62,14 +63,20 @@ private[spark] object PythonEvalType
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22193
@HyukjinKwon, those changes probably don't need to be in this PR, but this
is just a demonstration that we can remove `SaveMode` without changing test
cases. The larger issue is that this doesn't
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/2
@xuanyuanking, while this does remove the hack, it doesn't address the
underlying problem. The problem is that there is a single RDD, which may
contain InternalRow or may contain ColumnarBatch
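The hack being described, a single RDD whose elements may be rows or columnar batches, can be sketched as follows. The class names are illustrative stand-ins, not Spark's actual internals:

```python
# Toy model of the underlying problem: one collection whose elements may
# be either individual rows or columnar batches, so every consumer must
# branch on the element type instead of working with one known shape.

class InternalRow:
    def __init__(self, values):
        self.values = values

class ColumnarBatch:
    def __init__(self, rows):
        self.rows = rows  # a batch decodes to many row-shaped values

def iterate_rows(mixed_partition):
    """Flatten a partition that may mix rows and batches into plain rows."""
    for element in mixed_partition:
        if isinstance(element, ColumnarBatch):
            yield from element.rows          # columnar path
        elif isinstance(element, InternalRow):
            yield element.values             # row path
        else:
            raise TypeError(f"unexpected element: {type(element).__name__}")

partition = [InternalRow([1]), ColumnarBatch([[2], [3]])]
print(list(iterate_rows(partition)))  # [[1], [2], [3]]
```

The `isinstance` branch is exactly the kind of per-element dispatch the comment argues should be resolved in the plan, not hidden inside one untyped RDD.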
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r213122284
--- Diff: docs/configuration.md ---
@@ -179,6 +179,15 @@ of the most common options to set are:
(e.g. 2g, 8g
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r213121178
--- Diff: docs/configuration.md ---
@@ -179,6 +179,15 @@ of the most common options to set are:
(e.g. 2g, 8g
Github user rdblue closed the pull request at:
https://github.com/apache/spark/pull/22206
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22206
@HyukjinKwon and @viirya, thank you for looking at this commit, but I like
@cloud-fan's approach to fixing this in #22244 better than this work-around.
I'm going to close this in favor
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22244
@cloud-fan, I like this solution better than adding a special case in the
v2 conversion to physical plan. This explains why the Python exec nodes weren't
already in the tree! I'd much rather commit
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22206#discussion_r213098202
--- Diff: python/pyspark/sql/tests.py ---
@@ -6394,6 +6394,17 @@ def test_invalid_args(self):
df.withColumn('mean_v', mean_udf(df['v
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r213035238
--- Diff:
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
---
@@ -91,6 +91,13 @@ private[spark] class Client
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r212782657
--- Diff:
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/BaseYarnClusterSuite.scala
---
@@ -161,6 +162,11 @@ abstract class
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r212714476
--- Diff:
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -114,6 +114,10 @@ package object config {
.checkValue(_ >
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22206#discussion_r212676265
--- Diff: python/pyspark/sql/tests.py ---
@@ -6394,6 +6394,17 @@ def test_invalid_args(self):
df.withColumn('mean_v', mean_udf(df['v
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22206#discussion_r212489210
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
---
@@ -130,10 +133,22 @@ object
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22190
Retest this please
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22206#discussion_r212488266
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
---
@@ -130,10 +133,22 @@ object
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21977
Looks like tests are passing now.
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22190
@rxin, @cloud-fan, @jose-torres: this is the update to add `WriteConfig`.
There's one failed test that I think is unrelated, so this is ready for you to
have a look. This will probably need
GitHub user rdblue opened a pull request:
https://github.com/apache/spark/pull/22206
SPARK-25213: Add project to v2 scans before python filters.
## What changes were proposed in this pull request?
The v2 API always adds a projection when converting to physical plan
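The projection-before-filter ordering can be sketched on plain dictionaries; the column names are invented for illustration:

```python
# Sketch of the fix being described: insert a projection so the Python
# filter only sees the columns it needs, rather than the full scan output.

def project(rows, columns):
    """Keep only the requested columns of each row."""
    return [{c: r[c] for c in columns} for r in rows]

def python_filter(rows, predicate):
    """Stand-in for a filter evaluated by a Python UDF."""
    return [r for r in rows if predicate(r)]

rows = [{"id": 1, "v": 10, "unused": "x"}, {"id": 2, "v": 3, "unused": "y"}]
# Project first, then apply the Python predicate on the pruned rows.
pruned = project(rows, ["id", "v"])
print(python_filter(pruned, lambda r: r["v"] > 5))  # [{'id': 1, 'v': 10}]
```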
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22190
Retest this please.
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21977
This is close. The Java and Scala tests were passing and I think I fixed
the remaining issue for the Python tests. Unfortunately, Scala tests are
failing again and I was trying to run tests a couple
GitHub user rdblue opened a pull request:
https://github.com/apache/spark/pull/22193
[SPARK-25186][SQL] Remove v2 save mode.
## What changes were proposed in this pull request?
This removes `SaveMode` from the v2 write API. Overwrite is temporarily
implemented by deleting
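The delete-then-append idea can be sketched with a toy table; method names like `overwrite_where` are hypothetical, not the proposed API:

```python
# Illustrative model of replacing SaveMode with explicit operations:
# "overwrite" is not a mode flag, it is a delete by filter followed
# by an append.

class FilterableTable:
    def __init__(self, rows):
        self.rows = rows

    def delete_where(self, predicate):
        self.rows = [r for r in self.rows if not predicate(r)]

    def append(self, new_rows):
        self.rows.extend(new_rows)

    def overwrite_where(self, predicate, new_rows):
        # Overwrite composed from the two primitive operations.
        self.delete_where(predicate)
        self.append(new_rows)

t = FilterableTable([{"day": 1, "v": "a"}, {"day": 2, "v": "b"}])
t.overwrite_where(lambda r: r["day"] == 2, [{"day": 2, "v": "c"}])
print(t.rows)  # [{'day': 1, 'v': 'a'}, {'day': 2, 'v': 'c'}]
```

Expressing overwrite this way makes the write's semantics explicit instead of leaving them to each source's interpretation of a mode enum.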
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22190
This is related to #21308, which adds `DeleteSupport`. Both
`BatchOverwriteSupport` and `DeleteSupport` use the same input to remove data
(`Filter[]`) and can reject deletes that don't align
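A minimal sketch of a source validating its `Filter[]` input and rejecting deletes it cannot express exactly; the classes here model the discussion and are not the real interfaces:

```python
# Toy filter and a partitioned source that can only delete whole
# partitions, so it rejects filters on non-partition columns.

class EqualTo:
    def __init__(self, column, value):
        self.column, self.value = column, value

class PartitionedDeleteSupport:
    def __init__(self, partition_columns):
        self.partition_columns = set(partition_columns)

    def delete_where(self, filters):
        # Reject deletes that don't align with partition boundaries.
        for f in filters:
            if f.column not in self.partition_columns:
                raise ValueError(
                    f"cannot delete by non-partition column: {f.column}")
        return [(f.column, f.value) for f in filters]  # partitions to drop

source = PartitionedDeleteSupport(["day"])
print(source.delete_where([EqualTo("day", 3)]))  # [('day', 3)]
```

A finer-grained source (e.g. JDBC) would accept the same filters but translate them to row-level deletes instead of raising.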
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22190#discussion_r212121224
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/MicroBatchWriteSupport.scala
---
@@ -18,27 +18,38 @@
package
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22190#discussion_r212120878
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/BatchPartitionOverwriteSupport.java
---
@@ -0,0 +1,44 @@
+/*
+ * Licensed
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22190#discussion_r212120411
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/BatchPartitionOverwriteSupport.java
---
@@ -0,0 +1,44 @@
+/*
+ * Licensed
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22190#discussion_r212119716
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/BatchOverwriteSupport.java
---
@@ -0,0 +1,61 @@
+/*
+ * Licensed
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22190#discussion_r212118021
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala
---
@@ -279,10 +277,7 @@ private[kafka010] class
GitHub user rdblue opened a pull request:
https://github.com/apache/spark/pull/22190
SPARK-25188: Add WriteConfig to v2 write API.
## What changes were proposed in this pull request?
This updates the v2 write path to a similar structure as the v2 read path.
Individual
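The read/write symmetry being proposed can be illustrated with two toy builders (all names here are hypothetical, not the actual interfaces):

```python
# Conceptual sketch: the read path builds an immutable scan config
# before executing it; the write path would likewise build a write
# config before committing.

class ScanConfigBuilder:
    def __init__(self):
        self._columns = None

    def project(self, columns):
        self._columns = list(columns)
        return self

    def build(self):
        return {"kind": "scan", "columns": self._columns}

class WriteConfigBuilder:
    def __init__(self):
        self._options = {}

    def option(self, key, value):
        self._options[key] = value
        return self

    def build(self):
        return {"kind": "write", "options": dict(self._options)}

scan = ScanConfigBuilder().project(["id"]).build()
write = WriteConfigBuilder().option("format", "parquet").build()
print(scan, write)
```

The point of the symmetry is that both sides end up with a frozen, inspectable config object before any data moves.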
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22185
+1 when tests pass. Convenient that there weren't any internal
implementations using this.
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21977
Retest this please.
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21977
Retest this please.
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22009
@cloud-fan, I think that the scan config builder needs to accept the
options and that SaveMode needs to be removed before we should merge this PR.
I'm fine with following up with the WriteConfig
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r211763717
--- Diff:
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/BaseYarnClusterSuite.scala
---
@@ -161,6 +162,11 @@ abstract class
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r211763465
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -60,14 +61,20 @@ private[spark] object PythonEvalType
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21977
@BryanCutler, thanks for taking a look at this.
Despite the problems you hit when the limit was set too low, I think we do
want to use that limit. It was the most reliable one from our
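For context, the limit under discussion is an OS resource limit applied to the Python worker process. A minimal sketch using the standard `resource` module, assuming an address-space limit similar to what PySpark's worker ended up applying (the function name and config mapping are illustrative):

```python
# Hedged sketch: cap a worker's memory with a soft address-space rlimit,
# never raising it above the existing hard limit.
import resource

def set_worker_memory_limit(limit_mib):
    """Set a soft RLIMIT_AS in MiB; returns the limit actually applied."""
    limit_bytes = limit_mib * 1024 * 1024
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY:
        # An unprivileged process may not exceed its hard limit.
        limit_bytes = min(limit_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard))
    return limit_bytes

# e.g. set_worker_memory_limit(1024) would cap the worker near 1 GiB;
# allocations beyond the limit then fail with MemoryError in Python.
```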
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r211692277
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/BatchReadSupportProvider.java
---
@@ -18,48 +18,44 @@
package
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21977
Looks like the test problems were caused by accessing the SparkConf through
either SparkContext or SparkSession on the executor side. The Scala tests are
passing and I've fixed a couple more
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21306#discussion_r211057651
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/catalog/v2/V1MetadataTable.scala
---
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21306
Retest this please.
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21306
@rxin, I've updated this to use a new interface, `PartitionTransform`,
instead of `Expression`. This is used to pass well-known transformations when
creating tables, like `Filter` is used to pass
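A sketch of passing well-known transforms by name rather than as arbitrary expressions; the constructors below follow the discussion but are not the actual `PartitionTransform` interface:

```python
# Toy model: a transform is a name plus column references and parameters,
# so sources match on known names instead of interpreting expressions.

class PartitionTransform:
    def __init__(self, name, *refs, **params):
        self.name, self.refs, self.params = name, refs, dict(params)

    def __repr__(self):
        args = list(self.refs) + [f"{k}={v}" for k, v in self.params.items()]
        return f"{self.name}({', '.join(map(str, args))})"

def identity(col):
    return PartitionTransform("identity", col)

def bucket(col, n):
    return PartitionTransform("bucket", col, n=n)

def apply(fn_name, *cols):
    # "apply" passes other functions directly through, for source-specific
    # transforms not covered by the well-known set.
    return PartitionTransform(fn_name, *cols)

print([identity("ts"), bucket("id", 16), apply("zorder", "x", "y")])
```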
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21308
@rxin, I've updated this API to use `Filter` instead of `Expression`. I'd
ideally like to get it in soon if you guys have a chance to review it. It's
pretty small.
cc @cloud-fan
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21308#discussion_r210382412
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/DeleteSupport.java ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21123
> We should do things incrementally and always prepare for the worst case.
This is why I'm pushing for Append support and adding interfaces to finish
the logical plans. Releasing logi
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21123
> this mapping is not mentioned in the logical plan standardization design
doc and I doubt if it's doable
I agree! This is why I propose we add an entirely new API for v2 with cl
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r210317086
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/BatchEvalPythonExec.scala
---
@@ -69,7 +67,7 @@ case class BatchEvalPythonExec(udfs
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21977
@holdenk, what could be the cause?
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21306
@cloud-fan, on the dev thread about 2.4 you talked about getting this PR
in. What do we need to do next?
I can call a vote on the SPIP if you think that's ready. I just bumped the
thread
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21123
@gatorsmile, I agree. The new logical plans should clean up cases where
behavior is ambiguous, examples of which are pointed out in the background of
that SPIP.
The problem I'm referring
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21977
You're right about the line number. Maybe it was because I hadn't done a full
rebuild of this branch locally before running the test. I'll look into the
other error if it's consistent in the Jenkins
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21977
@squito, the last Jenkins test had a different error message and two of the
tests didn't show the stdout/stderr output. I updated the other tests to show
that output, and hopefully we get a run
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r210024440
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/streaming/StreamingReadSupport.java
---
@@ -0,0 +1,49 @@
+/*
+ * Licensed
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r210021990
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/streaming/StreamingReadSupport.java
---
@@ -0,0 +1,49 @@
+/*
+ * Licensed
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21123
@HyukjinKwon, I don't think there has been a discussion about how v1 and v2
compatibility will work. Discussion on #22009 brought up one aspect of it:
whether v2 sources should be passed a `SaveMode
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209998505
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateControlMicroBatchReadSupport.scala
---
@@ -0,0 +1,31
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209997335
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/ReadSupport.java
---
@@ -45,9 +45,6 @@
* Note that, this may not be a full
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209996219
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
---
@@ -76,41 +76,43 @@ object
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r209992878
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -60,14 +61,20 @@ private[spark] object PythonEvalType
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r209771361
--- Diff:
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/BaseYarnClusterSuite.scala
---
@@ -179,17 +185,23 @@ abstract class
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21977
@squito, I updated `YarnClusterSuite` in d58ad7a to capture the output of
the child processes to find out what is causing the test failures. I think it
is related to the commit you reviewed
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21978
@cloud-fan, on the dev thread about 2.4 you talked about getting this PR
in. What do we need to do next?
I can call a vote on the SPIP if you think that's ready. I just bumped the
thread
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209726516
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
---
@@ -76,41 +76,43 @@ object
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209712363
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/BatchWriteSupportProvider.java
---
@@ -21,33 +21,39 @@
import
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r209707560
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala
---
@@ -137,13 +135,12 @@ case class
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r209707290
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -60,14 +61,20 @@ private[spark] object PythonEvalType
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r209703555
--- Diff: python/pyspark/worker.py ---
@@ -259,6 +260,26 @@ def main(infile, outfile):
"PYSPARK_DRIVER_PYTHON are corr
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r209703452
--- Diff:
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
---
@@ -91,6 +91,13 @@ private[spark] class Client
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21977
@gatorsmile, we tried both RLIMIT_HEAP and RLIMIT_RSS but those limits
didn't consistently work.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21977#discussion_r209702097
--- Diff: python/pyspark/worker.py ---
@@ -259,6 +260,26 @@ def main(infile, outfile):
"PYSPARK_DRIVER_PYTHON are corr
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209699841
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SessionConfigSupport.java
---
@@ -27,10 +27,10 @@
@InterfaceStability.Evolving
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209698964
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/BatchWriteSupportProvider.java
---
@@ -21,33 +21,39 @@
import
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209697665
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/streaming/StreamingReadSupport.java
---
@@ -0,0 +1,49 @@
+/*
+ * Licensed
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22043
Thanks for reviewing, @cloud-fan!
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209094259
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/streaming/StreamingReadSupport.java
---
@@ -0,0 +1,49 @@
+/*
+ * Licensed
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209093885
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/streaming/MicroBatchReadSupport.java
---
@@ -0,0 +1,49 @@
+/*
+ * Licensed
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209044995
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/streaming/ContinuousPartitionReaderFactory.java
---
@@ -0,0 +1,71
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209042787
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala
---
@@ -39,52 +36,43 @@ case class
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209042604
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceRDD.scala
---
@@ -51,18 +58,19 @@ class DataSourceRDD[T: ClassTag
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209042348
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceRDD.scala
---
@@ -51,18 +58,19 @@ class DataSourceRDD[T: ClassTag
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209042148
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/streaming/StreamingReadSupport.java
---
@@ -0,0 +1,49 @@
+/*
+ * Licensed