[GitHub] spark pull request #18840: [SPARK-21565] Propagate metadata in attribute rep...

2017-08-03 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/18840 [SPARK-21565] Propagate metadata in attribute replacement. ## What changes were proposed in this pull request? Propagate metadata in attribute replacement during streaming execution

[GitHub] spark issue #18925: [SPARK-21713][SC] Replace streaming bit with OutputMode

2017-08-11 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/18925 @marmbrus --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark pull request #18925: [SPARK-21713][SC] Replace streaming bit with Outp...

2017-08-11 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/18925 [SPARK-21713][SC] Replace streaming bit with OutputMode ## What changes were proposed in this pull request? * Replace LogicalPlan.isStreaming with output mode. * Replace

[GitHub] spark issue #18925: [SPARK-21713][SC] Replace streaming bit with OutputMode

2017-08-11 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/18925 @tdas - please review --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled

[GitHub] spark pull request #18925: [SPARK-21713][SC] Replace streaming bit with Outp...

2017-08-11 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/18925#discussion_r132795536 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala --- @@ -779,10 +780,16 @@ case object

[GitHub] spark pull request #18790: [SPARK-21587][SS] Added pushdown through watermar...

2017-08-08 Thread joseph-torres
GitHub user joseph-torres reopened a pull request: https://github.com/apache/spark/pull/18790 [SPARK-21587][SS] Added pushdown through watermarks. ## What changes were proposed in this pull request? * Filter predicates can be pushed through EventTimeWatermark if they're

[GitHub] spark pull request #18790: [SPARK-21587][SS] Added pushdown through watermar...

2017-08-08 Thread joseph-torres
Github user joseph-torres closed the pull request at: https://github.com/apache/spark/pull/18790 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark issue #18790: [SPARK-21587][SS] Added pushdown through watermarks.

2017-08-08 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/18790 I'm told I can reopen this instead of making a new PR for the same branch. Reopening with fixed commit history. --- If your project is set up for it, you can reply to this email and have

[GitHub] spark pull request #18889: [SPARK-21587][SS] Added pushdown through watermar...

2017-08-08 Thread joseph-torres
Github user joseph-torres closed the pull request at: https://github.com/apache/spark/pull/18889 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark pull request #18889: [SPARK-21587][SS] Added pushdown through watermar...

2017-08-08 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/18889 [SPARK-21587][SS] Added pushdown through watermarks. ## What changes were proposed in this pull request? Filter predicates can be pushed through EventTimeWatermark if they're

[GitHub] spark issue #18790: [SPARK-21587][SS] Added pushdown through watermarks.

2017-08-08 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/18790 Created https://github.com/apache/spark/pull/18889 with everything cherrypicked into the right place. --- If your project is set up for it, you can reply to this email and have your reply

[GitHub] spark pull request #18790: Added pushdown through watermarks.

2017-07-31 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/18790 Added pushdown through watermarks. ## What changes were proposed in this pull request? Deterministic filter predicates can now be pushed through an EventTimeWatermark

[GitHub] spark issue #18790: [SPARK-21587][SS] Added filter pushdown through watermar...

2017-08-09 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/18790 done --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark issue #18790: [SPARK-21587][SS] Added pushdown through watermarks.

2017-08-09 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/18790 Agreed. I've restricted this PR to just filter, since the original story was about enabling partition pruning for filters above the watermark. --- If your project is set up for it, you can

[GitHub] spark pull request #18790: [SPARK-21587][SS] Added pushdown through watermar...

2017-08-09 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/18790#discussion_r132238309 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -872,6 +886,25 @@ object PushDownPredicate

[GitHub] spark pull request #18790: [SPARK-21587][SS] Added pushdown through watermar...

2017-08-07 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/18790#discussion_r131761258 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -872,6 +886,25 @@ object PushDownPredicate

[GitHub] spark pull request #18840: [SPARK-21565] Propagate metadata in attribute rep...

2017-08-07 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/18840#discussion_r131707863 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/EventTimeWatermarkSuite.scala --- @@ -391,6 +391,30 @@ class

[GitHub] spark pull request #18973: [SPARK-21765] Set isStreaming on leaf nodes for s...

2017-08-17 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/18973#discussion_r133844671 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala --- @@ -564,10 +564,14 @@ class SparkSession private

[GitHub] spark pull request #18973: [SPARK-21765] Set isStreaming on leaf nodes for s...

2017-08-17 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/18973#discussion_r133844681 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/plans/LogicalPlanSuite.scala --- @@ -75,7 +75,7 @@ class LogicalPlanSuite

[GitHub] spark pull request #18973: [SPARK-21765] Set isStreaming on leaf nodes for s...

2017-08-17 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/18973#discussion_r133844686 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LocalRelation.scala --- @@ -43,7 +43,9 @@ object LocalRelation

[GitHub] spark pull request #18925: [SPARK-21713][SC] Replace streaming bit with Outp...

2017-08-22 Thread joseph-torres
Github user joseph-torres closed the pull request at: https://github.com/apache/spark/pull/18925 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark issue #18925: [SPARK-21713][SC] Replace streaming bit with OutputMode

2017-08-22 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/18925 The underlying JIRA ticket is won'tfixed because this model doesn't seem better. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] spark pull request #19239: [SPARK-22017] Take minimum of all watermark execs...

2017-09-15 Thread joseph-torres
GitHub user joseph-torres reopened a pull request: https://github.com/apache/spark/pull/19239 [SPARK-22017] Take minimum of all watermark execs in StreamExecution. ## What changes were proposed in this pull request? Take the minimum of all watermark exec nodes as the "

[GitHub] spark pull request #19239: [SPARK-22017] Take minimum of all watermark execs...

2017-09-15 Thread joseph-torres
Github user joseph-torres closed the pull request at: https://github.com/apache/spark/pull/19239 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #19327: [WIP] Implement stream-stream outer joins.

2017-09-22 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/19327 [WIP] Implement stream-stream outer joins. ## What changes were proposed in this pull request? Allow one-sided outer joins between two streams when a watermark is defined

[GitHub] spark issue #19212: [SPARK-21988] Add default stats to StreamingExecutionRel...

2017-09-13 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/19212 @zsxwing for review --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #19239: [SPARK-22017] Take minimum of all watermark execs in Str...

2017-09-14 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/19239 addressed comments --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #19239: [SPARK-22017] Take minimum of all watermark execs...

2017-09-14 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19239#discussion_r139028613 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/EventTimeWatermarkSuite.scala --- @@ -300,6 +300,67 @@ class

[GitHub] spark pull request #19239: [SPARK-22017] Take minimum of all watermark execs...

2017-09-14 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19239#discussion_r139029401 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -560,13 +567,24 @@ class StreamExecution

[GitHub] spark pull request #19239: [SPARK-22017] Take minimum of all watermark execs...

2017-09-14 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19239#discussion_r139029187 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -130,6 +130,13 @@ class StreamExecution

[GitHub] spark pull request #19239: [SPARK-22017] Take minimum of all watermark execs...

2017-09-14 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/19239 [SPARK-22017] Take minimum of all watermark execs in StreamExecution. ## What changes were proposed in this pull request? Take the minimum of all watermark exec nodes as the "

[GitHub] spark pull request #19452: [SPARK-22136][SS] Evaluate one-sided conditions e...

2017-10-06 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/19452 [SPARK-22136][SS] Evaluate one-sided conditions early in stream-stream joins. ## What changes were proposed in this pull request? Evaluate one-sided conditions early in stream

[GitHub] spark pull request #19452: [SPARK-22136][SS] Evaluate one-sided conditions e...

2017-10-12 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19452#discussion_r144396781 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinHelper.scala --- @@ -66,6 +67,60 @@ object

[GitHub] spark pull request #19452: [SPARK-22136][SS] Evaluate one-sided conditions e...

2017-10-12 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19452#discussion_r10608 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala --- @@ -206,10 +213,19 @@ case class

[GitHub] spark pull request #19452: [SPARK-22136][SS] Evaluate one-sided conditions e...

2017-10-12 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19452#discussion_r10844 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala --- @@ -161,6 +164,10 @@ case class

[GitHub] spark pull request #19452: [SPARK-22136][SS] Evaluate one-sided conditions e...

2017-10-13 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19452#discussion_r144620005 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingSymmetricHashJoinHelperSuite.scala --- @@ -0,0 +1,118

[GitHub] spark pull request #19056: [SPARK-21765] Check that optimization doesn't aff...

2017-08-29 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19056#discussion_r135851433 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/socket.scala --- @@ -126,16 +128,17 @@ class TextSocketSource(host

[GitHub] spark pull request #19056: [SPARK-21765] Check that optimization doesn't aff...

2017-08-29 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19056#discussion_r135851647 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/socket.scala --- @@ -126,16 +128,17 @@ class TextSocketSource(host

[GitHub] spark pull request #19056: [SPARK-21765] Check that optimization doesn't aff...

2017-08-29 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19056#discussion_r135851225 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -39,6 +39,16 @@ abstract class Optimizer

[GitHub] spark pull request #18973: [SPARK-21765] Set isStreaming on leaf nodes for s...

2017-09-11 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/18973#discussion_r138114144 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala --- @@ -443,7 +444,8 @@ case class

[GitHub] spark pull request #19056: [SPARK-21765] Check that optimization doesn't aff...

2017-09-05 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19056#discussion_r137143373 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/socket.scala --- @@ -128,8 +128,9 @@ class TextSocketSource(host: String

[GitHub] spark pull request #19056: [SPARK-21765] Check that optimization doesn't aff...

2017-09-05 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19056#discussion_r137048000 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/socket.scala --- @@ -130,16 +130,7 @@ class TextSocketSource(host: String

[GitHub] spark pull request #19212: [SPARK-21988] Add default stats to StreamingExecu...

2017-09-12 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/19212 [SPARK-21988] Add default stats to StreamingExecutionRelation. ## What changes were proposed in this pull request? Add default stats to StreamingExecutionRelation. ## How

[GitHub] spark pull request #19327: [WIP] Implement stream-stream outer joins.

2017-09-25 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19327#discussion_r140933095 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala --- @@ -157,11 +164,20 @@ case class

[GitHub] spark pull request #19327: [WIP] Implement stream-stream outer joins.

2017-09-25 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19327#discussion_r140935586 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala --- @@ -216,22 +232,70 @@ case class

[GitHub] spark pull request #19327: [WIP] Implement stream-stream outer joins.

2017-09-25 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19327#discussion_r140936872 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala --- @@ -87,70 +87,157 @@ class

[GitHub] spark pull request #19327: [SPARK-22136][SS] Implement stream-stream outer j...

2017-09-29 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19327#discussion_r141984167 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationsSuite.scala --- @@ -413,36 +414,103 @@ class

[GitHub] spark pull request #19327: [SPARK-22136][SS] Implement stream-stream outer j...

2017-10-03 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19327#discussion_r142442657 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingJoinSuite.scala --- @@ -425,6 +426,10 @@ class StreamingJoinSuite extends

[GitHub] spark issue #19327: [WIP] Implement stream-stream outer joins.

2017-09-25 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/19327 I believe I've addressed all comments. Some refactorings made some comments obsolete, though. I've also fixed 1 bug and 1 test issue causing the 2 unit test failures. There's

[GitHub] spark pull request #19327: [WIP] Implement stream-stream outer joins.

2017-09-26 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19327#discussion_r140968754 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala --- @@ -87,70 +87,157 @@ class

[GitHub] spark issue #18973: [SPARK-21765] Set isStreaming on leaf nodes for streamin...

2017-08-21 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/18973 Addressed comments from @tdas --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request #18973: [SPARK-21765] Set isStreaming on leaf nodes for s...

2017-08-21 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/18973#discussion_r134367543 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala --- @@ -420,8 +420,10 @@ class SQLContext private[sql](val sparkSession

[GitHub] spark pull request #18973: [SPARK-21765] Set isStreaming on leaf nodes for s...

2017-08-21 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/18973#discussion_r134367548 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala --- @@ -728,7 +729,16 @@ class FakeDefaultSource extends

[GitHub] spark pull request #18973: [SPARK-21765] Set isStreaming on leaf nodes for s...

2017-08-21 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/18973#discussion_r134367553 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala --- @@ -118,8 +122,15 @@ case class MemoryStream[A : Encoder

[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/19056 Addressed all comments. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled

[GitHub] spark pull request #19056: [SPARK-21765] Check that optimization doesn't aff...

2017-08-25 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/19056 [SPARK-21765] Check that optimization doesn't affect isStreaming bit. ## What changes were proposed in this pull request? Add an assert in logical plan optimization

[GitHub] spark pull request #18973: [SPARK-21765] Set isStreaming on leaf nodes for s...

2017-08-17 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/18973 [SPARK-21765] Set isStreaming on leaf nodes for streaming plans. ## What changes were proposed in this pull request? All logically streaming plans will now have is. This involved adding

[GitHub] spark pull request #19461: [SPARK-22230] Swap per-row order in state store r...

2017-10-09 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/19461 [SPARK-22230] Swap per-row order in state store restore. ## What changes were proposed in this pull request? In state store restore, for each row, put the saved state before the row

[GitHub] spark pull request #19452: [SPARK-22136][SS] Evaluate one-sided conditions e...

2017-10-11 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19452#discussion_r144148695 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinHelper.scala --- @@ -66,6 +67,60 @@ object

[GitHub] spark issue #19465: [SPARK-21988][SS]Implement StreamingRelation.computeStat...

2017-10-11 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/19465 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #19926: [SPARK-22733] Split StreamExecution into MicroBat...

2017-12-12 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19926#discussion_r156471355 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -783,29 +430,29 @@ class StreamExecution

[GitHub] spark issue #19926: [SPARK-22733] Split StreamExecution into MicroBatchExecu...

2017-12-12 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/19926 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #19925: [SPARK-22732] Add Structured Streaming APIs to DataSourc...

2017-12-12 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/19925 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #19926: [SPARK-22733] Split StreamExecution into MicroBat...

2017-12-12 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19926#discussion_r156532669 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryManager.scala --- @@ -237,7 +237,7 @@ class StreamingQueryManager

[GitHub] spark pull request #19926: [SPARK-22733] Split StreamExecution into MicroBat...

2017-12-12 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19926#discussion_r156532817 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -447,296 +384,6 @@ class StreamExecution

[GitHub] spark pull request #19925: [SPARK-22732] Add Structured Streaming APIs to Da...

2017-12-12 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19925#discussion_r156539696 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousRateStreamSource.scala --- @@ -0,0 +1,155

[GitHub] spark pull request #19925: [SPARK-22732] Add Structured Streaming APIs to Da...

2017-12-12 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19925#discussion_r156540689 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousRateStreamSource.scala --- @@ -0,0 +1,155

[GitHub] spark pull request #19925: [SPARK-22732] Add Structured Streaming APIs to Da...

2017-12-12 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19925#discussion_r156542366 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousRateStreamSource.scala --- @@ -0,0 +1,155

[GitHub] spark pull request #19926: [SPARK-22733] Split StreamExecution into MicroBat...

2017-12-14 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19926#discussion_r157047178 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala --- @@ -0,0 +1,407 @@ +/* + * Licensed

[GitHub] spark pull request #20012: [SPARK-22824] Restore old offset for binary compa...

2017-12-18 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/20012 [SPARK-22824] Restore old offset for binary compatibility ## What changes were proposed in this pull request? Some users depend on source compatibility

[GitHub] spark pull request #19984: [SPARK-22789] Map-only continuous processing exec...

2017-12-14 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/19984 [SPARK-22789] Map-only continuous processing execution ## What changes were proposed in this pull request? Basic continuous execution, supporting map/flatMap/filter, with commits

[GitHub] spark pull request #19984: [SPARK-22789] Map-only continuous processing exec...

2017-12-15 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19984#discussion_r157278184 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1035,6 +1035,22 @@ object SQLConf { .booleanConf

[GitHub] spark pull request #19925: [SPARK-22732] Add Structured Streaming APIs to Da...

2017-12-13 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19925#discussion_r156741131 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousRateStreamSource.scala --- @@ -0,0 +1,155

[GitHub] spark pull request #19925: [SPARK-22732] Add Structured Streaming APIs to Da...

2017-12-13 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19925#discussion_r156740666 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memoryV2.scala --- @@ -0,0 +1,185 @@ +/* + * Licensed

[GitHub] spark pull request #19926: [SPARK-22733] Split StreamExecution into MicroBat...

2017-12-12 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19926#discussion_r156471697 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryManager.scala --- @@ -237,7 +237,7 @@ class StreamingQueryManager

[GitHub] spark pull request #19926: [SPARK-22733] Split StreamExecution into MicroBat...

2017-12-12 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19926#discussion_r156470973 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -447,296 +384,6 @@ class StreamExecution

[GitHub] spark pull request #19926: [SPARK-22733] Split StreamExecution into MicroBat...

2017-12-12 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19926#discussion_r156468624 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -71,27 +68,29 @@ class StreamExecution

[GitHub] spark pull request #19925: [SPARK-22732] Add Structured Streaming APIs to Da...

2017-12-07 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/19925 [SPARK-22732] Add Structured Streaming APIs to DataSourceV2 ## What changes were proposed in this pull request? This PR provides DataSourceV2 API support for structured streaming

[GitHub] spark pull request #19926: [SPARK-22733] Split StreamExecution into MicroBat...

2017-12-07 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/19926 [SPARK-22733] Split StreamExecution into MicroBatchExecution and StreamExecution. ## What changes were proposed in this pull request? StreamExecution is now an abstract base class

[GitHub] spark issue #19925: [SPARK-22732] Add Structured Streaming APIs to DataSourc...

2017-12-11 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/19925 /cc @marmbrus @cloud-fan @rxin @brkyvz @zsxwing --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #19926: [SPARK-22733] Split StreamExecution into MicroBatchExecu...

2017-12-11 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/19926 /cc @brkyvz @zsxwing --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #19925: [SPARK-22732] Add Structured Streaming APIs to DataSourc...

2017-12-11 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/19925 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #19925: [SPARK-22732] Add Structured Streaming APIs to Da...

2017-12-08 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19925#discussion_r155867163 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/MicroBatchWriteSupport.java --- @@ -0,0 +1,58 @@ +/* + * Licensed

[GitHub] spark pull request #19611: [SPARK-22305] Write HDFSBackedStateStoreProvider....

2017-10-30 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/19611 [SPARK-22305] Write HDFSBackedStateStoreProvider.loadMap non-recursively ## What changes were proposed in this pull request? Write HDFSBackedStateStoreProvider.loadMap non-recursively

[GitHub] spark issue #19611: [SPARK-22305] Write HDFSBackedStateStoreProvider.loadMap...

2017-10-30 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/19611 One issue I want to explicitly bring up: this new unit test takes very long, almost 2 minutes on my computer. Creating 10k files isn't going to be super fast no matter what we do

[GitHub] spark pull request #19581: [SPARK-22366] Support ignoring missing files

2017-10-26 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/19581 [SPARK-22366] Support ignoring missing files ## What changes were proposed in this pull request? Add a flag "spark.sql.files.ignoreMissingFiles" to parallel the exis

[GitHub] spark pull request #19984: [SPARK-22789] Map-only continuous processing exec...

2017-12-20 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19984#discussion_r158156114 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousExecution.scala --- @@ -0,0 +1,343

[GitHub] spark pull request #19984: [SPARK-22789] Map-only continuous processing exec...

2017-12-20 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19984#discussion_r158159270 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousExecution.scala --- @@ -0,0 +1,343

[GitHub] spark pull request #19984: [SPARK-22789] Map-only continuous processing exec...

2017-12-20 Thread joseph-torres
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/19984#discussion_r158158855 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousDataSourceRDDIter.scala --- @@ -0,0 +1,205

[GitHub] spark issue #19984: [SPARK-22789] Map-only continuous processing execution

2017-12-20 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/19984 The result says it fails Spark unit tests, but clicking through shows a count of 0. --- - To unsubscribe, e-mail: reviews

[GitHub] spark issue #19984: [SPARK-22789] Map-only continuous processing execution

2017-12-20 Thread joseph-torres
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/19984 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail