Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15710#discussion_r86057467
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOutputWriter.scala
---
@@ -17,125 +17,13 @@
package
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15723
LGTM
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15710#discussion_r86056877
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOutputWriter.scala
---
@@ -17,125 +17,13 @@
package
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15710#discussion_r86057220
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSinkSuite.scala
---
@@ -17,106 +17,16 @@
package
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15710#discussion_r86056587
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ManifestFileCommitProtocol.scala
---
@@ -0,0 +1,114 @@
+/*
+ * Licensed
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15724
LGTM
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15702#discussion_r86053925
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/EventTimeWatermark.scala
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15354
Thanks, I'm going to merge this to master.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15702
Not a dumb question! You can certainly use processing time if those are
the semantics you require. I do think there is a little bit of work we need to
do to ensure determinism for these
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15702
@ekl - flaky test... Should we turn it off for now?
retest this please
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15699
LGTM
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15696#discussion_r85847969
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/sources/SimpleTextRelation.scala
---
@@ -141,15 +139,14 @@ class SimpleTextOutputWriter
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15702#discussion_r85845880
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/EventTimeWatermarkExec.scala
---
@@ -0,0 +1,93 @@
+/*
+ * Licensed to
GitHub user marmbrus opened a pull request:
https://github.com/apache/spark/pull/15702
[SPARK-18124] Observed-delay based Even Time Watermarks
This PR adds a new method `withWatermark` to the `Dataset` API, which can
be used specify an _event time watermark_. An event time
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15626
Thanks for working on this! Could you include examples of the various
logs, since we are committing to this specific JSON.
---
If your project is set up for it, you can reply to this email and
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15354#discussion_r85583061
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
---
@@ -494,3 +495,46 @@ case class JsonToStruct
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15354#discussion_r85583042
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2936,6 +2936,51 @@ object functions {
def from_json(e: Column, schema
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15453
Thanks, merging to master!
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15354#discussion_r85248697
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala
---
@@ -123,8 +122,9 @@ private[sql] class
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15354#discussion_r85248522
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala
---
@@ -29,4 +31,28 @@ object JacksonUtils {
case x
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/10162
I would not bake this logic into the logical operator, I think the general
approach of doing it in Dataset is better. I just think that we need to do it
in a way that does not change existing
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15629
I don't think we have to update deprecated methods. This LGTM.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15634
This is okay, but note that "Provider" is equally overloaded in the Data
Source API.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitH
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15483
I will try to take a look soon. Can we close this PR until we have an
updated implementation?
---
If your project is set up for it, you can reply to this email and have your
reply appear on
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15354
It would be really nice to fail in analysis rather than execution. What if
it only fails after hours of computation? As a user I'd be upset. I'm also
concerned they will think it
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15483
It would be good to post on the design / interfaces before you get too far.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15483
Thanks for working on this! However, I'm not sure that this is something
that we should merge into the core repository (Though I think its an awesome
example of how to use the `ForeachW
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15469
LGTM, merging to master.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/9766
Thanks, merging to master.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/12337
+1 to this feature! I think this might be the first step in a better story
for people trying to use `nullable = false` as an enforcement mechanism (I'd
bring this idea up on
[SPARK-17939](
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/10162
Thanks for working on this! Sorry for letting this PR go stale. While I
think this could be a good feature, I'm worried that as its implemented it
would be a breaking change (since w
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/13780
Sorry for the delay. I'm going to merge this to master. I'll update the
since versions while merging. Thanks for working on this!
---
If your project is set up for it, you can rep
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/14553#discussion_r82297835
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/socket.scala
---
@@ -92,21 +105,64 @@ class TextSocketSource(host: String, port
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15284
LGTM
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15307
This LGTM as a first cut. Thanks for working on it.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15354#discussion_r83110773
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
---
@@ -343,4 +343,23 @@ class
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15453
ok to test
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83096825
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetrics.scala
---
@@ -0,0 +1,240 @@
+/*
+ * Licensed to the
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83074367
--- Diff:
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala
---
@@ -264,6 +266,44 @@ class KafkaSourceSuite
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83082691
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -516,12 +568,127 @@ class StreamExecution
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83079220
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -351,25 +403,26 @@ class StreamExecution
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83085871
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryStatus.scala
---
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83074741
--- Diff:
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala
---
@@ -264,6 +266,44 @@ class KafkaSourceSuite
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83074489
--- Diff:
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala
---
@@ -264,6 +266,44 @@ class KafkaSourceSuite
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83084266
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQueryListenerSuite.scala
---
@@ -259,15 +260,37 @@ class
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83061090
--- Diff:
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala
---
@@ -264,6 +266,44 @@ class KafkaSourceSuite
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83077567
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetrics.scala
---
@@ -0,0 +1,240 @@
+/*
+ * Licensed to the
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83082403
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetrics.scala
---
@@ -0,0 +1,240 @@
+/*
+ * Licensed to the
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83082993
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -516,12 +568,127 @@ class StreamExecution
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83083859
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala
---
@@ -17,22 +17,78 @@
package
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83079156
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -278,8 +315,14 @@ class StreamExecution
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r83060899
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulAggregate.scala
---
@@ -86,7 +93,13 @@ case class StateStoreSaveExec
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/9766#discussion_r83057793
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -414,6 +418,84 @@ class UDFRegistration private[sql] (functionRegistry
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/9766#discussion_r83056944
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -414,6 +418,84 @@ class UDFRegistration private[sql] (functionRegistry
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/9766#discussion_r83056792
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/test/JavaStringLength.java ---
@@ -0,0 +1,30 @@
+/*
+ * Licensed to the Apache Software
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/9766#discussion_r83056607
--- Diff: python/pyspark/sql/context.py ---
@@ -202,6 +202,32 @@ def registerFunction(self, name, f,
returnType=StringType
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/9766#discussion_r83057132
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -414,6 +418,84 @@ class UDFRegistration private[sql] (functionRegistry
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/9766#discussion_r82876529
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -412,6 +419,63 @@ class UDFRegistration private[sql] (functionRegistry
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/9766#discussion_r82876442
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -412,6 +419,63 @@ class UDFRegistration private[sql] (functionRegistry
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/9766#discussion_r82876419
--- Diff: python/pyspark/sql/context.py ---
@@ -202,6 +202,26 @@ def registerFunction(self, name, f,
returnType=StringType
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/9766#discussion_r82876509
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -17,9 +17,15 @@
package org.apache.spark.sql
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15422
I agree that the data that was already read is probably good. I also think
that this is a pretty big behavior change where there are legitimate cases
(i.e. tons of data and it is fine to miss
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15392
LGTM, merging to master
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/9766
+1 to this functionality, but also to the request to add more tests and
documentation. It would also to be good to comment on the idea of using SQL as
a more general way to implement this
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/9766#discussion_r82450939
--- Diff: python/pyspark/sql/context.py ---
@@ -202,6 +202,10 @@ def registerFunction(self, name, f,
returnType=StringType
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15354#discussion_r82443073
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
---
@@ -343,4 +343,23 @@ class
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/14087
Thanks, merging to master.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15367
No, if we backport this I would plan to continue to backport changes (that
are safe) until the next release. Either way this should not affect what goes
into master.
---
If your project is set
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/14087#discussion_r82293680
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala
---
@@ -311,6 +311,37 @@ final class DataStreamReader
private[sql
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/14087#discussion_r82293331
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
---
@@ -378,6 +378,24 @@ class FileStreamSourceSuite extends
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15367
We should definitly vet this PR carefully to make sure its safe. One thing
that is missing from that guide, that I do believe is accepted practice, is
more leeway when the feature is marked
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15380
Merged, can you close this?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15352
LGTM, I'm going to merge this.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
en
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15352#discussion_r82269155
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -207,13 +207,18 @@ class StreamExecution
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15380
LGTM
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15362
LGTM
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81889941
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -525,8 +645,62 @@ class StreamExecution
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81873789
--- Diff: project/MimaExcludes.scala ---
@@ -53,7 +53,14 @@ object MimaExcludes {
ProblemFilters.exclude[ReversedMissingMethodProblem
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81873207
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetrics.scala
---
@@ -0,0 +1,244 @@
+/*
+ * Licensed to the
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81882888
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryInfo.scala
---
@@ -30,8 +30,15 @@ import
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81872358
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala
---
@@ -100,28 +110,138 @@ class StreamingQuerySuite extends
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81872395
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala
---
@@ -100,28 +110,138 @@ class StreamingQuerySuite extends
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81873307
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -511,12 +572,71 @@ class StreamExecution
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81874159
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -136,16 +145,30 @@ class StreamExecution
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81872770
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryInfo.scala
---
@@ -30,8 +30,15 @@ import
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81875491
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -525,8 +645,62 @@ class StreamExecution
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81873537
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -525,8 +645,62 @@ class StreamExecution
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81872893
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/SourceStatus.scala ---
@@ -26,9 +26,13 @@ import
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81882933
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetrics.scala
---
@@ -0,0 +1,244 @@
+/*
+ * Licensed to the
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81874295
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -105,11 +111,14 @@ class StreamExecution
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81873108
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetrics.scala
---
@@ -0,0 +1,244 @@
+/*
+ * Licensed to the
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15102#discussion_r81844299
--- Diff: docs/structured-streaming-kafka-integration.md ---
@@ -0,0 +1,231 @@
+---
+layout: global
+title: Structured Streaming + Kafka
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15102#discussion_r81840127
--- Diff: docs/structured-streaming-kafka-integration.md ---
@@ -0,0 +1,231 @@
+---
+layout: global
+title: Structured Streaming + Kafka
Github user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15333
This seems like a reasonable simplification to me. A little bit of history
(though this has diverged significantly, so don't take this authoritative): I
think this complexity all stems
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15102#discussion_r81834388
--- Diff: docs/structured-streaming-kafka-integration.md ---
@@ -0,0 +1,231 @@
+---
+layout: global
+title: Structured Streaming + Kafka
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15102#discussion_r81836478
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -530,3 +530,8 @@ object StreamExecution
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15102#discussion_r81836317
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamTest.scala ---
@@ -469,29 +469,49 @@ trait StreamTest extends QueryTest with
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15102#discussion_r81833814
--- Diff: docs/structured-streaming-kafka-integration.md ---
@@ -0,0 +1,231 @@
+---
+layout: global
+title: Structured Streaming + Kafka
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/15102#discussion_r81835883
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala
---
@@ -0,0 +1,282 @@
+/*
+ * Licensed to
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/14087#discussion_r81610590
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala
---
@@ -21,13 +21,13 @@ import scala.collection.JavaConverters
301 - 400 of 6860 matches
Mail list logo