Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/19984
retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/19984
The result says it fails Spark unit tests, but clicking through shows a
count of 0.
---
-
To unsubscribe, e-mail: reviews
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19984#discussion_r158159270
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousExecution.scala
---
@@ -0,0 +1,343
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19984#discussion_r158158855
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousDataSourceRDDIter.scala
---
@@ -0,0 +1,205
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19984#discussion_r158156114
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousExecution.scala
---
@@ -0,0 +1,343
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/20012
[SPARK-22824] Restore old offset for binary compatibility
## What changes were proposed in this pull request?
Some users depend on source compatibility with the
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19984#discussion_r157278184
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1035,6 +1035,22 @@ object SQLConf {
.booleanConf
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/19984
[SPARK-22789] Map-only continuous processing execution
## What changes were proposed in this pull request?
Basic continuous execution, supporting map/flatMap/filter, with commits and
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19926#discussion_r157047178
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala
---
@@ -0,0 +1,407 @@
+/*
+ * Licensed
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19925#discussion_r156741131
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousRateStreamSource.scala
---
@@ -0,0 +1,155
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19925#discussion_r156740666
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memoryV2.scala
---
@@ -0,0 +1,185 @@
+/*
+ * Licensed to the
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19925#discussion_r156542366
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousRateStreamSource.scala
---
@@ -0,0 +1,155
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19925#discussion_r156540689
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousRateStreamSource.scala
---
@@ -0,0 +1,155
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19925#discussion_r156539696
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousRateStreamSource.scala
---
@@ -0,0 +1,155
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19926#discussion_r156532817
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -447,296 +384,6 @@ class StreamExecution
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19926#discussion_r156532669
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryManager.scala
---
@@ -237,7 +237,7 @@ class StreamingQueryManager
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/19925
retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/19926
retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19926#discussion_r156471697
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryManager.scala
---
@@ -237,7 +237,7 @@ class StreamingQueryManager
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19926#discussion_r156471355
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -783,29 +430,29 @@ class StreamExecution
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19926#discussion_r156470973
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -447,296 +384,6 @@ class StreamExecution
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19926#discussion_r156468624
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -71,27 +68,29 @@ class StreamExecution
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/19925
retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/19926
/cc @brkyvz @zsxwing
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/19925
/cc @marmbrus @cloud-fan @rxin @brkyvz @zsxwing
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19925#discussion_r155867163
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/MicroBatchWriteSupport.java
---
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/19926
[SPARK-22733] Split StreamExecution into MicroBatchExecution and
StreamExecution.
## What changes were proposed in this pull request?
StreamExecution is now an abstract base class
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/19925
[SPARK-22732] Add Structured Streaming APIs to DataSourceV2
## What changes were proposed in this pull request?
This PR provides DataSourceV2 API support for structured streaming
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/19611
One issue I want to explicitly bring up: this new unit test takes very
long, almost 2 minutes on my computer. Creating 10k files isn't going to be
super fast no matter what we do, b
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/19611
[SPARK-22305] Write HDFSBackedStateStoreProvider.loadMap non-recursively
## What changes were proposed in this pull request?
Write HDFSBackedStateStoreProvider.loadMap non-recursively
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/19581
[SPARK-22366] Support ignoring missing files
## What changes were proposed in this pull request?
Add a flag "spark.sql.files.ignoreMissingFiles" to parallel the exis
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19452#discussion_r144620005
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingSymmetricHashJoinHelperSuite.scala
---
@@ -0,0 +1,118
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19452#discussion_r10844
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala
---
@@ -161,6 +164,10 @@ case class
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19452#discussion_r10608
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala
---
@@ -206,10 +213,19 @@ case class
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19452#discussion_r144396781
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinHelper.scala
---
@@ -66,6 +67,60 @@ object
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19452#discussion_r144148695
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinHelper.scala
---
@@ -66,6 +67,60 @@ object
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/19465
LGTM
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/19461
[SPARK-22230] Swap per-row order in state store restore.
## What changes were proposed in this pull request?
In state store restore, for each row, put the saved state before the row in
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/19452
[SPARK-22136][SS] Evaluate one-sided conditions early in stream-stream
joins.
## What changes were proposed in this pull request?
Evaluate one-sided conditions early in stream
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19327#discussion_r142442657
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingJoinSuite.scala
---
@@ -425,6 +426,10 @@ class StreamingJoinSuite extends
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19327#discussion_r141984167
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationsSuite.scala
---
@@ -413,36 +414,103 @@ class
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19327#discussion_r140968754
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala
---
@@ -87,70 +87,157 @@ class
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19327#discussion_r140936872
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala
---
@@ -87,70 +87,157 @@ class
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19327#discussion_r140935586
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala
---
@@ -216,22 +232,70 @@ case class
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19327#discussion_r140933095
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala
---
@@ -157,11 +164,20 @@ case class
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/19327
I believe I've addressed all comments. Some refactorings made some comments
obsolete, though.
I've also fixed 1 bug and 1 test issue causing the 2 unit test failures.
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/19327
[WIP] Implement stream-stream outer joins.
## What changes were proposed in this pull request?
Allow one-sided outer joins between two streams when a watermark is defined
GitHub user joseph-torres reopened a pull request:
https://github.com/apache/spark/pull/19239
[SPARK-22017] Take minimum of all watermark execs in StreamExecution.
## What changes were proposed in this pull request?
Take the minimum of all watermark exec nodes as the "
Github user joseph-torres closed the pull request at:
https://github.com/apache/spark/pull/19239
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/19239
addressed comments
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19239#discussion_r139029401
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -560,13 +567,24 @@ class StreamExecution
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19239#discussion_r139029187
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -130,6 +130,13 @@ class StreamExecution
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19239#discussion_r139028613
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/EventTimeWatermarkSuite.scala
---
@@ -300,6 +300,67 @@ class
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/19239
[SPARK-22017] Take minimum of all watermark execs in StreamExecution.
## What changes were proposed in this pull request?
Take the minimum of all watermark exec nodes as the "
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/19212
@zsxwing for review
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/19212
[SPARK-21988] Add default stats to StreamingExecutionRelation.
## What changes were proposed in this pull request?
Add default stats to StreamingExecutionRelation.
## How
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/18973#discussion_r138114144
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
---
@@ -443,7 +444,8 @@ case class
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19056#discussion_r137143373
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/socket.scala
---
@@ -128,8 +128,9 @@ class TextSocketSource(host: String
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19056#discussion_r137048000
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/socket.scala
---
@@ -130,16 +130,7 @@ class TextSocketSource(host: String
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19056#discussion_r135851647
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/socket.scala
---
@@ -126,16 +128,17 @@ class TextSocketSource(host
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19056#discussion_r135851433
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/socket.scala
---
@@ -126,16 +128,17 @@ class TextSocketSource(host
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/19056#discussion_r135851225
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -39,6 +39,16 @@ abstract class Optimizer
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/19056
Addressed all comments.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/19056
[SPARK-21765] Check that optimization doesn't affect isStreaming bit.
## What changes were proposed in this pull request?
Add an assert in logical plan optimization tha
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/18925
The underlying JIRA ticket is won'tfixed because this model doesn't seem
better.
---
If your project is set up for it, you can reply to this email and have your
reply appear on
Github user joseph-torres closed the pull request at:
https://github.com/apache/spark/pull/18925
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/18973#discussion_r134367553
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala
---
@@ -118,8 +122,15 @@ case class MemoryStream[A : Encoder
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/18973#discussion_r134367543
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -420,8 +420,10 @@ class SQLContext private[sql](val sparkSession
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/18973#discussion_r134367548
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala ---
@@ -728,7 +729,16 @@ class FakeDefaultSource extends
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/18973
Addressed comments from @tdas
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/18973#discussion_r133844686
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LocalRelation.scala
---
@@ -43,7 +43,9 @@ object LocalRelation
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/18973#discussion_r133844671
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
---
@@ -564,10 +564,14 @@ class SparkSession private
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/18973#discussion_r133844681
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/plans/LogicalPlanSuite.scala
---
@@ -75,7 +75,7 @@ class LogicalPlanSuite
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/18973
[SPARK-21765] Set isStreaming on leaf nodes for streaming plans.
## What changes were proposed in this pull request?
All logically streaming plans will now have is. This involved adding
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/18925#discussion_r132795536
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
---
@@ -779,10 +780,16 @@ case object
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/18925
@marmbrus
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/18925
@tdas - please review
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/18925
[SPARK-21713][SC] Replace streaming bit with OutputMode
## What changes were proposed in this pull request?
* Replace LogicalPlan.isStreaming with output mode.
* Replace
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/18790
done
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/18790#discussion_r132238309
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -872,6 +886,25 @@ object PushDownPredicate
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/18790
Agreed. I've restricted this PR to just filter, since the original story
was about enabling partition pruning for filters above the watermark.
---
If your project is set up for it, yo
GitHub user joseph-torres reopened a pull request:
https://github.com/apache/spark/pull/18790
[SPARK-21587][SS] Added pushdown through watermarks.
## What changes were proposed in this pull request?
* Filter predicates can be pushed through EventTimeWatermark if they
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/18790
I'm told I can reopen this instead of making a new PR for the same branch.
Reopening with fixed commit history.
---
If your project is set up for it, you can reply to this email and
Github user joseph-torres closed the pull request at:
https://github.com/apache/spark/pull/18889
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the
Github user joseph-torres commented on the issue:
https://github.com/apache/spark/pull/18790
Created https://github.com/apache/spark/pull/18889 with everything
cherrypicked into the right place.
---
If your project is set up for it, you can reply to this email and have your
reply
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/18889
[SPARK-21587][SS] Added pushdown through watermarks.
## What changes were proposed in this pull request?
Filter predicates can be pushed through EventTimeWatermark if they
Github user joseph-torres closed the pull request at:
https://github.com/apache/spark/pull/18790
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/18790#discussion_r131761258
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -872,6 +886,25 @@ object PushDownPredicate
Github user joseph-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/18840#discussion_r131707863
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/EventTimeWatermarkSuite.scala
---
@@ -391,6 +391,30 @@ class
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/18840
[SPARK-21565] Propagate metadata in attribute replacement.
## What changes were proposed in this pull request?
Propagate metadata in attribute replacement during streaming execution
GitHub user joseph-torres opened a pull request:
https://github.com/apache/spark/pull/18790
Added pushdown through watermarks.
## What changes were proposed in this pull request?
Deterministic filter predicates can now be pushed through an
EventTimeWatermark
91 matches
Mail list logo