GitHub user tdas opened a pull request:
https://github.com/apache/spark/pull/20445
[SPARK-23092][SQL] Migrate MemoryStream to DataSourceV2 APIs
## What changes were proposed in this pull request?
This PR migrates the MemoryStream to DataSourceV2 APIs. It fixes a few
things along the way.
1. Fixed a bug in DataSourceV2ScanExec that prevented it from being
canonicalized. The fix is required for some tests to pass
(StreamingDeduplicateSuite).
2. Changed the reported keys in StreamingQueryProgress.durationMs.
- "getOffset" and "getBatch" are replaced with "setOffsetRange" and
"getEndOffset", as tracking those makes more sense. Unit tests changed
accordingly.
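For anyone reading durationMs downstream, the key change in item 2 could be
handled along these lines. This is only a minimal sketch, assuming the
progress entry is available as a plain mapping from key name to milliseconds.
The helper names and the sample values are hypothetical and not part of
Spark; only the four key names come from this PR.

```python
# Hypothetical helpers (not part of Spark) for inspecting a durationMs mapping.
# Only the four key names below are taken from the PR description.
OLD_KEYS = {"getOffset", "getBatch"}
NEW_KEYS = {"setOffsetRange", "getEndOffset"}


def source_timing_entries(duration_ms):
    """Return only the source-timing entries found in a durationMs mapping."""
    wanted = OLD_KEYS | NEW_KEYS
    return {key: value for key, value in duration_ms.items() if key in wanted}


def reports_new_keys(duration_ms):
    """True if the mapping uses the key names introduced by this PR."""
    return any(key in duration_ms for key in NEW_KEYS)


# Made-up sample values, for illustration only.
before = {"getOffset": 3, "getBatch": 5, "triggerExecution": 40}
after = {"setOffsetRange": 2, "getEndOffset": 1, "triggerExecution": 38}

print(reports_new_keys(before), source_timing_entries(before))
print(reports_new_keys(after), source_timing_entries(after))
```

The sketch deliberately does not map an old key onto a specific new key,
since the description above does not state a one-to-one correspondence.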
## How was this patch tested?
Existing unit tests, plus a few updated unit tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tdas/spark SPARK-23092
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20445.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20445
----
commit 7c09b376eef6a4e6c118c78ad9459cb55e59e67f
Author: Burak Yavuz <brkyvz@...>
Date: 2018-01-11T16:44:19Z
save for so far
commit 78c50f860aa13f569669f4ad77f4325d80085c8b
Author: Burak Yavuz <brkyvz@...>
Date: 2018-01-12T18:27:49Z
Save so far
commit 2777b5b38596a1fb68bcf8ee928aec1a58dc372c
Author: Burak Yavuz <brkyvz@...>
Date: 2018-01-13T01:43:03Z
save so far
commit 50a541b5890f328a655a7ef1fca4f8480b9a35f0
Author: Burak Yavuz <brkyvz@...>
Date: 2018-01-16T19:14:08Z
Compiles and I think also runs correctly
commit fd61724c6afcab5831fe8c602ad134d0c473184b
Author: Burak Yavuz <brkyvz@...>
Date: 2018-01-16T19:25:39Z
save
commit 7a0b564bd0c74525ebcea55b31f9658b1c2f0e12
Author: Burak Yavuz <brkyvz@...>
Date: 2018-01-16T19:28:31Z
fix merge conflicts
commit a81c2ecdafd54a2c5bfb07c6f1f53546eaa96c7c
Author: Burak Yavuz <brkyvz@...>
Date: 2018-01-16T22:26:28Z
fix hive
commit 1a4f4108118d976857778916b18499b4e0bf140c
Author: Tathagata Das <tathagata.das1565@...>
Date: 2018-01-27T01:11:01Z
Undo changes to HiveSessionStateBuilder.scala
commit 083e93c26fd2d1e8c4c738b251a27724115a0001
Author: Tathagata Das <tathagata.das1565@...>
Date: 2018-01-27T01:11:06Z
Merge remote-tracking branch 'apache-github/master' into HEAD
commit a817c8d40e4ecaf5e4e0c46f43313c5cceeec54e
Author: Tathagata Das <tathagata.das1565@...>
Date: 2018-01-29T22:27:22Z
Fixed the setOffsetRange bug
commit 35b8854ae466e0313ff926cc1efb8c423d3eefea
Author: Tathagata Das <tathagata.das1565@...>
Date: 2018-01-30T20:42:56Z
Fixed DataSourceV2ScanExec canonicalization bug
commit e66d809fe501b19b923a88d1b4cb9df69b4ae329
Author: Tathagata Das <tathagata.das1565@...>
Date: 2018-01-31T00:57:59Z
Fixed metrics reported by MicroBatchExecution
----