ash. This PR fixed this issue.
## How was this patch tested?
The new unit test.
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18822 from zsxwing/SPARK-21546.
(cherry picked from commit 0d26b3aa55f9cc75096b0e2b309f64fe3270b9a5)
Signed-off-by: Shixiong Zhu <shixi...@databricks.co
Repository: spark
Updated Branches:
refs/heads/master 77cc0d67d -> 4cc704b12
[CORE][MINOR] Improve the error message of checkpoint RDD verification
### What changes were proposed in this pull request?
The original error message is pretty confusing. It is unable to tell which
number is
Repository: spark
Updated Branches:
refs/heads/master 24367f23f -> e16e8c7ad
[SPARK-21146][CORE] Master/Worker should handle and shutdown when any thread
gets UncaughtException
## What changes were proposed in this pull request?
Adding the default UncaughtExceptionHandler to the Worker.
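The pattern this entry describes can be sketched in plain Scala (illustrative names only; Spark's actual `SparkUncaughtExceptionHandler` also decides whether to halt the JVM):

```scala
// Sketch: a default handler that records and reports any uncaught exception,
// modeled on the behavior described above. Names here are hypothetical.
class LoggingUncaughtExceptionHandler extends Thread.UncaughtExceptionHandler {
  @volatile var lastError: Option[Throwable] = None
  override def uncaughtException(t: Thread, e: Throwable): Unit = {
    lastError = Some(e)
    System.err.println(s"Uncaught exception in thread ${t.getName}: ${e.getMessage}")
    // A real Master/Worker handler would initiate an orderly shutdown here.
  }
}

// Run a body on a thread with the handler installed, for demonstration.
def runWithHandler(handler: Thread.UncaughtExceptionHandler)(body: => Unit): Unit = {
  val t = new Thread(() => body)
  t.setUncaughtExceptionHandler(handler)
  t.start()
  t.join()
}
```

`Thread.setDefaultUncaughtExceptionHandler(handler)` would install it process-wide, which is the effect the patch aims for.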
##
Repository: spark
Updated Branches:
refs/heads/master 9760c15ac -> d0bfc6733
[SPARK-21069][SS][DOCS] Add rate source to programming guide.
## What changes were proposed in this pull request?
SPARK-20979 added a new structured streaming source: Rate source. This patch
adds the corresponding
Repository: spark
Updated Branches:
refs/heads/branch-2.2 576fd4c3a -> ab12848d6
[SPARK-21069][SS][DOCS] Add rate source to programming guide.
## What changes were proposed in this pull request?
SPARK-20979 added a new structured streaming source: Rate source. This patch
adds the
ing it.
## How was this patch tested?
The new unit test.
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18638 from zsxwing/SPARK-21421.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2d968a07
Tree: http:
Repository: spark
Updated Branches:
refs/heads/master 53465075c -> 9d8c83179
[SPARK-21409][SS] Expose state store memory usage in SQL metrics and progress
updates
## What changes were proposed in this pull request?
Currently, there is no tracking of memory usage of state stores. This JIRA
Zhu <shixi...@databricks.com>
Closes #18478 from zsxwing/SPARK-21253-2.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cfc696f4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cfc696f4
Diff: http://git-wip-us.a
ong Zhu <shixi...@databricks.com>
Closes #18478 from zsxwing/SPARK-21253-2.
(cherry picked from commit cfc696f4a4289acf132cb26baf7c02c5b6305277)
Signed-off-by: Shixiong Zhu <shixi...@databricks.com>
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.a
Repository: spark
Updated Branches:
refs/heads/master 18066f2e6 -> f9151bebc
[SPARK-21188][CORE] releaseAllLocksForTask should synchronize the whole method
## What changes were proposed in this pull request?
Since the objects `readLocksByTask`, `writeLocksByTask` and `info`s are coupled
and
des in
StreamExecution is interrupted. It also removes an optimization in
`runUninterruptibly` to make sure this method never throw
`InterruptedException`.
## How was this patch tested?
Jenkins
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18461 from zsxwing/SPARK-21248.
Project: http:
Repository: spark
Updated Branches:
refs/heads/master 838effb98 -> e68aed70f
[SPARK-21216][SS] Hive strategies missed in Structured Streaming
IncrementalExecution
## What changes were proposed in this pull request?
If someone creates a HiveSession, the planner in `IncrementalExecution`
Repository: spark
Updated Branches:
refs/heads/master 40c7add3a -> e5bb26174
[SPARK-21329][SS] Make EventTimeWatermarkExec explicitly UnaryExecNode
## What changes were proposed in this pull request?
Making EventTimeWatermarkExec explicitly UnaryExecNode
/cc tdas zsxwing
##
Repository: spark
Updated Branches:
refs/heads/branch-2.2 4e53a4edd -> 576fd4c3a
[SPARK-21267][SS][DOCS] Update Structured Streaming Documentation
## What changes were proposed in this pull request?
A few changes to the Structured Streaming documentation
- Clarify that the entire stream input
ess` to enable/disable it. Credit goes to aramesh117
Closes #17024
## How was this patch tested?
The new unit test.
Author: Shixiong Zhu <shixi...@databricks.com>
Author: Aaditya Ramesh <aram...@conviva.com>
Closes #17789 from zsxwing/pr17024.
Project: http://git-wip-us.apache.org/repos
ess` to enable/disable it. Credit goes to aramesh117
Closes #17024
## How was this patch tested?
The new unit test.
Author: Shixiong Zhu <shixi...@databricks.com>
Author: Aaditya Ramesh <aram...@conviva.com>
Closes #17789 from zsxwing/pr17024.
(cherry pic
Repository: spark
Updated Branches:
refs/heads/branch-2.2 7446be332 -> f6d56d2f1
[SPARK-21596][SS] Ensure places calling HDFSMetadataLog.get check the return
value
Same PR as #18799 but for branch 2.2. Main discussion is in the other PR.
When I was investigating a flaky test, I realized
ted?
The new unit tests
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18803 from zsxwing/avg.
(cherry picked from commit 7f63e85b47a93434030482160e88fe63bf9cff4e)
Signed-off-by: Shixiong Zhu <shixi...@databricks.com>
Project: http://git-wip-us.apache.org/repos/asf/spark/re
ted?
The new unit tests
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18803 from zsxwing/avg.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7f63e85b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7f63
Author: Andrey Taptunov <taptu...@amazon.com>
Closes #18848 from zsxwing/review-pr18623.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/43f9c84b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/43f9c84b
Diff:
Repository: spark
Updated Branches:
refs/heads/branch-2.2 43f9c84b6 -> fa92a7be7
[SPARK-21565][SS] Propagate metadata in attribute replacement.
## What changes were proposed in this pull request?
Propagate metadata in attribute replacement during streaming execution. This is
necessary for
Repository: spark
Updated Branches:
refs/heads/master 4f7ec3a31 -> cce25b360
[SPARK-21565][SS] Propagate metadata in attribute replacement.
## What changes were proposed in this pull request?
Propagate metadata in attribute replacement during streaming execution. This is
necessary for
Repository: spark
Updated Branches:
refs/heads/master 300807c6e -> 16612638f
[SPARK-21517][CORE] Avoid copying memory when transfer chunks remotely
## What changes were proposed in this pull request?
In our production cluster, an OOM happens when NettyBlockRpcServer receives
OpenBlocks
Repository: spark
Updated Branches:
refs/heads/branch-2.2 f0e80aa2d -> 36d807906
[SPARK-19965][SS] DataFrame batch reader may fail to infer partitions when
reading FileStreamSink's output
## The Problem
Right now DataFrame batch reader may fail to infer partitions when reading
Repository: spark
Updated Branches:
refs/heads/master 527fc5d0c -> 6b9e49d12
[SPARK-19965][SS] DataFrame batch reader may fail to infer partitions when
reading FileStreamSink's output
## The Problem
Right now DataFrame batch reader may fail to infer partitions when reading
FileStreamSink's
Repository: spark
Updated Branches:
refs/heads/branch-2.2 dd9e3b2c9 -> 5844151bc
[SPARK-20600][SS] KafkaRelation should be pretty printed in web UI
## What changes were proposed in this pull request?
User-friendly name of `KafkaRelation` in web UI (under Details for Query).
### Before
Repository: spark
Updated Branches:
refs/heads/master 3aa4e464a -> 7144b5180
[SPARK-20600][SS] KafkaRelation should be pretty printed in web UI
## What changes were proposed in this pull request?
User-friendly name of `KafkaRelation` in web UI (under Details for Query).
### Before
Repository: spark
Updated Branches:
refs/heads/branch-2.2 d191b962d -> 7600a7ab6
[SPARK-20373][SQL][SS] Batch queries with `Dataset/DataFrame.withWatermark()`
do not execute
## What changes were proposed in this pull request?
Any Dataset/DataFrame batch query with the operation
Repository: spark
Updated Branches:
refs/heads/master f79aa285c -> c0189abc7
[SPARK-20373][SQL][SS] Batch queries with `Dataset/DataFrame.withWatermark()`
do not execute
## What changes were proposed in this pull request?
Any Dataset/DataFrame batch query with the operation
Repository: spark
Updated Branches:
refs/heads/master d2416925c -> 499ba2cb4
[SPARK-20717][SS] Minor tweaks to the MapGroupsWithState behavior
## What changes were proposed in this pull request?
Timeout and state data are two independent entities and should be settable
independently.
Repository: spark
Updated Branches:
refs/heads/branch-2.2 82ae1f0ac -> a79a120a8
[SPARK-20717][SS] Minor tweaks to the MapGroupsWithState behavior
## What changes were proposed in this pull request?
Timeout and state data are two independent entities and should be settable
independently.
Repository: spark
Updated Branches:
refs/heads/branch-2.2 0bd918f67 -> 82ae1f0ac
[SPARK-20716][SS] StateStore.abort() should not throw exceptions
## What changes were proposed in this pull request?
StateStore.abort() should do a best effort attempt to clean up temporary
resources. It should
Repository: spark
Updated Branches:
refs/heads/master e1aaab1e2 -> 271175e2b
[SPARK-20716][SS] StateStore.abort() should not throw exceptions
## What changes were proposed in this pull request?
StateStore.abort() should do a best effort attempt to clean up temporary
resources. It should not
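The best-effort contract described above can be sketched as cleanup that records failures instead of rethrowing them (hypothetical types, not the actual StateStore code):

```scala
// Best-effort cleanup: per the contract above, abort() must never throw.
// TempResource is an illustrative stand-in for temporary files/handles.
trait TempResource { def delete(): Unit }

def bestEffortAbort(resources: Seq[TempResource]): Seq[Throwable] = {
  val failures = scala.collection.mutable.Buffer.empty[Throwable]
  resources.foreach { r =>
    try r.delete()
    catch {
      // Record and continue with the next resource; never rethrow to the caller.
      case scala.util.control.NonFatal(e) => failures += e
    }
  }
  failures.toSeq
}
```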
ask is finishing but being killed at the same time.
The fix is pretty easy, just flip the "finished" flag when a task is successful.
## How was this patch tested?
Jenkins
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18021 from zsxwing/SPARK-20788.
Project: http://git-wip-
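The flag-flip fix can be sketched with an atomic state so that success and kill race safely (hypothetical names, not the actual TaskContext code):

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Sketch of the fix: a successful task flips the "finished" flag atomically,
// so a concurrent kill observes it and backs off instead of killing a
// task that has already completed.
class TaskState {
  private val finished = new AtomicBoolean(false)
  // Returns true if this call won the race and marked the task successful.
  def markSuccessful(): Boolean = finished.compareAndSet(false, true)
  // A kill only succeeds if the task has not already finished.
  def tryKill(): Boolean = finished.compareAndSet(false, true)
  def isFinished: Boolean = finished.get()
}
```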
ask is finishing but being killed at the same time.
The fix is pretty easy, just flip the "finished" flag when a task is successful.
## How was this patch tested?
Jenkins
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18021 from zsxwing/SPARK-20788.
(cherry
low
`Await.ready`.
## How was this patch tested?
Jenkins
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #17763 from zsxwing/awaitready.
(cherry picked from commit 324a904d8e80089d8865e4c7edaedb92ab2ec1b2)
Signed-off-by: Shixiong Zhu <shixi...@databricks.com>
Project: http://git-wi
low
`Await.ready`.
## How was this patch tested?
Jenkins
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #17763 from zsxwing/awaitready.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/324a904d
Tree: http://git-wip-us.a
How was this patch tested?
The new added unit test.
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #17821 from zsxwing/SPARK-20529.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9150bca4
Tree: http://git-wip-us.apache.
How was this patch tested?
The new added unit test.
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #17821 from zsxwing/SPARK-20529.
(cherry picked from commit 9150bca47e4b8782e20441386d3d225eb5f2f404)
Signed-off-by: Shixiong Zhu <shixi...@databricks.com>
Project: http://git-wi
ted` to
propagate the original error.
It also fixes an issue that `TaskCompletionListenerException.getMessage`
doesn't include `previousError`.
## How was this patch tested?
New unit tests.
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #17942 from zsxwing/SPARK-20702.
Project: h
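The message fix can be sketched as an exception that folds the original task error into `getMessage` (an illustrative class, not Spark's exact `TaskCompletionListenerException`):

```scala
// Compose listener errors and the original task error into one message, so
// the earlier failure (previousError) is not silently dropped.
class ListenerException(
    errorMessages: Seq[String],
    previousError: Option[Throwable] = None
) extends RuntimeException {
  override def getMessage: String = {
    val base = errorMessages.mkString("\n")
    previousError match {
      case Some(e) => base + "\n\nPrevious exception in task: " + e.getMessage
      case None    => base
    }
  }
}
```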
ter to `TaskContextImpl.markTaskCompleted` to
propagate the original error.
It also fixes an issue that `TaskCompletionListenerException.getMessage`
doesn't include `previousError`.
## How was this patch tested?
New unit tests.
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #17942 from zsxwing/SPARK-20702.
(cher
Repository: spark
Updated Branches:
refs/heads/branch-2.2 7123ec8e1 -> f14246959
[SPARK-20714][SS] Fix match error when watermark is set with timeout = no
timeout / processing timeout
## What changes were proposed in this pull request?
When watermark is set, and timeout conf is NoTimeout or
Repository: spark
Updated Branches:
refs/heads/master 7d6ff3910 -> 0d3a63193
[SPARK-20714][SS] Fix match error when watermark is set with timeout = no
timeout / processing timeout
## What changes were proposed in this pull request?
When watermark is set, and timeout conf is NoTimeout or
e added tests.
Author: Shixiong Zhu <shixi...@databricks.com>
Author: Michael Armbrust <mich...@databricks.com>
Closes #18199 from zsxwing/rate.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/74a432d3
Tree: http://git-wip
ted the
structured streaming programming guide.
zsxwing This is the PR to fix version 2.1 as discussed in PR #18342
Author: assafmendelson <assaf.mendel...@gmail.com>
Closes #18363 from assafmendelson/spark-21123-for-spark2.1.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Com
Repository: spark
Updated Branches:
refs/heads/master ad459cfb1 -> 7a00c658d
[SPARK-21147][SS] Throws an analysis exception when a user-specified schema is
given in socket/rate sources
## What changes were proposed in this pull request?
This PR proposes to throw an exception if a schema is
ers.
## How was this patch tested?
The added unit test.
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18381 from zsxwing/SPARK-21167.
(cherry picked from commit d66b143eec7f604595089f72d8786edbdcd74282)
Signed-off-by: Shixiong Zhu <shixi...@databricks.com>
Project:
How was this patch tested?
The added unit test.
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18381 from zsxwing/SPARK-21167.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d66b143e
Tree: http://git-wip-us.a
ers.
## How was this patch tested?
The added unit test.
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18381 from zsxwing/SPARK-21167.
(cherry picked from commit d66b143eec7f604595089f72d8786edbdcd74282)
Signed-off-by: Shixiong Zhu <shixi...@databricks.com>
Project:
Repository: spark
Updated Branches:
refs/heads/master 19331b8e4 -> e55a105ae
[SPARK-20599][SS] ConsoleSink should work with (batch)
## What changes were proposed in this pull request?
Currently, if we read a batch and want to display it on the console sink, it
will lead to a runtime exception.
Repository: spark
Updated Branches:
refs/heads/master 1ebe7ffe0 -> 2ebd0838d
[SPARK-21192][SS] Preserve State Store provider class configuration across
StreamingQuery restarts
## What changes were proposed in this pull request?
If the SQL conf for StateStore provider class is changed
Repository: spark
Updated Branches:
refs/heads/master 6b3d02285 -> 5282bae04
[SPARK-21153] Use project instead of expand in tumbling windows
## What changes were proposed in this pull request?
Time windowing in Spark currently performs an Expand + Filter, because there is
no way to
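For tumbling windows each row falls in exactly one window, so the bounds are a pure function of the timestamp and can be emitted by a projection. A sketch of the arithmetic (timestamps and durations in the same unit; not the actual optimizer rule):

```scala
// Tumbling-window assignment: one (start, end) window per row, computed
// directly from the timestamp. The double modulo keeps the start correct
// for timestamps before the offset.
def tumblingWindow(ts: Long, windowDuration: Long, startOffset: Long = 0L): (Long, Long) = {
  val start = ts - (((ts - startOffset) % windowDuration) + windowDuration) % windowDuration
  (start, start + windowDuration)
}
```

Because the mapping is one-to-one, a `Project` suffices; the Expand + Filter plan is only needed for overlapping (sliding) windows, where one row can belong to several windows.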
Repository: spark
Updated Branches:
refs/heads/branch-2.2 f7fcdec6c -> 7b50736c4
[SPARK-21123][DOCS][STRUCTURED STREAMING] Options for file stream source are in
a wrong table
## What changes were proposed in this pull request?
The description for several options of File Source for
Repository: spark
Updated Branches:
refs/heads/master e92ffe6f1 -> 66a792cd8
[SPARK-21123][DOCS][STRUCTURED STREAMING] Options for file stream source are in
a wrong table
## What changes were proposed in this pull request?
The description for several options of File Source for structured
Repository: spark
Updated Branches:
refs/heads/master 0fd84b05d -> d935e0a9d
[SPARK-20844] Remove experimental from Structured Streaming APIs
Now that Structured Streaming has been out for several Spark releases and has
large production use cases, the `Experimental` label is no longer
Repository: spark
Updated Branches:
refs/heads/master d935e0a9d -> 473d7552a
[SPARK-20014] Optimize mergeSpillsWithFileStream method
## What changes were proposed in this pull request?
When the individual partition size in a spill is small,
mergeSpillsWithTransferTo method does many small
Repository: spark
Updated Branches:
refs/heads/branch-2.2 92837aeb4 -> 2b59ed4f1
[SPARK-20844] Remove experimental from Structured Streaming APIs
Now that Structured Streaming has been out for several Spark releases and has
large production use cases, the `Experimental` label is no longer
Repository: spark
Updated Branches:
refs/heads/branch-2.2 f99456b5f -> 92837aeb4
[SPARK-19372][SQL] Fix throwing a Java exception at df.filter() due to 64KB
bytecode size limit
## What changes were proposed in this pull request?
When an expression for `df.filter()` has many nodes (e.g.
How was this patch tested?
Jenkins
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18126 from zsxwing/SPARK-20843.
(cherry picked from commit 6c1dbd6fc8d49acf7c1c902d2ebf89ed5e788a4e)
Signed-off-by: Shixiong Zhu <shixi...@databricks.com>
Project: http://git-wip-us.apache.org/
How was this patch tested?
Jenkins
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18126 from zsxwing/SPARK-20843.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6c1dbd6f
Tree: http://git-wip-us.apache.org/repos/
Repository: spark
Updated Branches:
refs/heads/branch-2.2 dc51be1e7 -> 26640a269
[SPARK-20907][TEST] Use testQuietly for test suites that generate long log
output
## What changes were proposed in this pull request?
Suppress console output by using `testQuietly` in test suites
## How was
Repository: spark
Updated Branches:
refs/heads/master ef9fd920c -> c9749068e
[SPARK-20907][TEST] Use testQuietly for test suites that generate long log
output
## What changes were proposed in this pull request?
Suppress console output by using `testQuietly` in test suites
## How was this
Repository: spark
Updated Branches:
refs/heads/master 9150bca47 -> 6f62e9d9b
[SPARK-19372][SQL] Fix throwing a Java exception at df.filter() due to 64KB
bytecode size limit
## What changes were proposed in this pull request?
When an expression for `df.filter()` has many nodes (e.g. 400),
[SPARK-20883][SPARK-20376][SS] Refactored StateStore APIs and added conf to
choose implementation
## What changes were proposed in this pull request?
A bunch of changes to the StateStore APIs and implementation.
Current state store API has a bunch of problems that cause too many transient
Repository: spark
Updated Branches:
refs/heads/master 4bb6a53eb -> fa757ee1d
http://git-wip-us.apache.org/repos/asf/spark/blob/fa757ee1/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala
Repository: spark
Updated Branches:
refs/heads/master 1c7db00c7 -> 96a4d1d08
[SPARK-19968][SS] Use a cached instance of `KafkaProducer` instead of creating
one every batch.
## What changes were proposed in this pull request?
In summary, cost of recreating a KafkaProducer for writing every
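The caching idea can be sketched with a config-keyed cache (the `Producer` class here is a hypothetical stand-in; the real cached Kafka producer also handles expiry and closing):

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for an expensive-to-create producer.
class Producer(val config: Map[String, String])

object ProducerCache {
  private val cache = new ConcurrentHashMap[Map[String, String], Producer]()
  // Return the cached producer for this config, creating it at most once
  // per distinct configuration instead of once per batch.
  def getOrCreate(config: Map[String, String]): Producer =
    cache.computeIfAbsent(config, cfg => new Producer(cfg))
}
```

Two batches that write with the same configuration then reuse one instance rather than paying the construction cost each time.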
Repository: spark
Updated Branches:
refs/heads/branch-2.2 3b79e4cda -> f6730a70c
[SPARK-19968][SS] Use a cached instance of `KafkaProducer` instead of creating
one every batch.
## What changes were proposed in this pull request?
In summary, cost of recreating a KafkaProducer for writing
org/apache/spark/sql/execution/datasources/DataSource.scala#L402),
it doesn't make things worse.
## How was this patch tested?
The new added test.
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18149 from zsxwing/SPARK-20894.
Project: http://git-wip-us.apache.org/repos/asf/s
8168 from zsxwing/SPARK-20940.
(cherry picked from commit 24db35826a81960f08e3eb68556b0f51781144e1)
Signed-off-by: Shixiong Zhu <shixi...@databricks.com>
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a607a26b
Tree:
8168 from zsxwing/SPARK-20940.
(cherry picked from commit 24db35826a81960f08e3eb68556b0f51781144e1)
Signed-off-by: Shixiong Zhu <shixi...@databricks.com>
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cd870c0c
Tree:
8168 from zsxwing/SPARK-20940.
(cherry picked from commit 24db35826a81960f08e3eb68556b0f51781144e1)
Signed-off-by: Shixiong Zhu <shixi...@databricks.com>
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/dade85f7
Tree:
org/jira/browse/SPARK-20666) is an example
of killing SparkContext due to `IllegalAccessError`). I think the correct type
of exception in AccumulatorV2 should be `IllegalStateException`.
## How was this patch tested?
Jenkins
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18168 fro
e added tests.
Author: Shixiong Zhu <shixi...@databricks.com>
Author: Michael Armbrust <mich...@databricks.com>
Closes #18199 from zsxwing/rate.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/220943d8
Tree: http://git-wip
Repository: spark
Updated Branches:
refs/heads/branch-2.2 38edb9256 -> 6f0d29672
[SPARK-20464][SS] Add a job group and description for streaming queries and fix
cancellation of running jobs using the job group
## What changes were proposed in this pull request?
Job group: adding a job group
Repository: spark
Updated Branches:
refs/heads/master ab30590f4 -> 6fc6cf88d
[SPARK-20464][SS] Add a job group and description for streaming queries and fix
cancellation of running jobs using the job group
## What changes were proposed in this pull request?
Job group: adding a job group is
PR changes `offsets.topic.num.partitions` from the default value 50 to 1
to make creating `__consumer_offsets` (50 partitions -> 1 partition) much
faster.
## How was this patch tested?
Jenkins
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #17863 from zsxwing/fix-kafka-flaky-te
ges `offsets.topic.num.partitions` from the default value 50 to 1
to make creating `__consumer_offsets` (50 partitions -> 1 partition) much
faster.
## How was this patch tested?
Jenkins
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #17863 from zsxwing/fix-kafka-flaky-test.
P
so that people
can run `bin/run-example StructuredKafkaWordCount ...`.
## How was this patch tested?
manually tested it.
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18101 from zsxwing/add-missing-example-dep.
(cherry picked from commit 98c3852986a2cb5f2d249d6c8ef602be283bd90e)
S
ple
can run `bin/run-example StructuredKafkaWordCount ...`.
## How was this patch tested?
manually tested it.
Author: Shixiong Zhu <shixi...@databricks.com>
Closes #18101 from zsxwing/add-missing-example-dep.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http:
Repository: spark
Updated Branches:
refs/heads/master bbd8d7def -> 9d6661c82
[SPARK-20792][SS] Support same timeout operations in mapGroupsWithState
function in batch queries as in streaming queries
## What changes were proposed in this pull request?
Currently, in the batch queries, timeout
Repository: spark
Updated Branches:
refs/heads/branch-2.2 3aad5982a -> cfd1bf0be
[SPARK-20792][SS] Support same timeout operations in mapGroupsWithState
function in batch queries as in streaming queries
## What changes were proposed in this pull request?
Currently, in the batch queries,
Repository: spark
Updated Branches:
refs/heads/master bc537e40a -> 88a23d3de
[SPARK-20991][SQL] BROADCAST_TIMEOUT conf should be a TimeoutConf
## What changes were proposed in this pull request?
The construction of BROADCAST_TIMEOUT conf should take the TimeUnit argument as
a TimeoutConf.
Repository: spark
Updated Branches:
refs/heads/master 7c7266208 -> 1e978b17d
[SPARK-21113][CORE] Read ahead input stream to amortize disk IO cost …
Profiling some of our big jobs, we see that around 30% of the time is being
spent in reading the spill files from disk. In order to amortize
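The read-ahead idea, in its simplest synchronous form, amortizes many one-byte reads into a few large underlying reads (a sketch only; Spark's `ReadAheadInputStream` additionally fills its buffer asynchronously):

```scala
import java.io.InputStream

// Sketch: serve single-byte reads from an in-memory buffer that is refilled
// with one large underlying read, amortizing the per-read disk cost.
class ReadAhead(in: InputStream, bufferSize: Int) extends InputStream {
  private val buf = new Array[Byte](bufferSize)
  private var pos = 0
  private var limit = 0
  override def read(): Int = {
    if (pos >= limit) {
      limit = in.read(buf, 0, bufferSize) // one large underlying read
      pos = 0
      if (limit <= 0) return -1 // end of stream
    }
    val b = buf(pos) & 0xff
    pos += 1
    b
  }
}
```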
Repository: spark
Updated Branches:
refs/heads/master ddd7f5e11 -> 054ddb2f5
[SPARK-21988] Add default stats to StreamingExecutionRelation.
## What changes were proposed in this pull request?
Add default stats to StreamingExecutionRelation.
## How was this patch tested?
existing unit tests
Zhu <zsxw...@gmail.com>
Closes #19314 from zsxwing/SPARK-22094.
(cherry picked from commit fedf6961be4e99139eb7ab08d5e6e29187ea5ccf)
Signed-off-by: Shixiong Zhu <zsxw...@gmail.com>
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/a
uld return.
## How was this patch tested?
The new unit test.
Author: Shixiong Zhu <zsxw...@gmail.com>
Closes #19314 from zsxwing/SPARK-22094.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fedf6961
Tree: http:
Author: Shixiong Zhu <zsxw...@gmail.com>
Closes #19432 from zsxwing/SPARK-22203.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c8affec2
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c8af
Repository: spark
Updated Branches:
refs/heads/master 08b204fd2 -> debcbec74
[SPARK-21947][SS] Check and report error when monotonically_increasing_id is
used in streaming query
## What changes were proposed in this pull request?
`monotonically_increasing_id` doesn't work in Structured
Repository: spark
Updated Branches:
refs/heads/master 3823dc88d -> 1bb8b7604
[MINOR][SS] "keyWithIndexToNumValues" -> "keyWithIndexToValue"
## What changes were proposed in this pull request?
This PR changes `keyWithIndexToNumValues` to `keyWithIndexToValue`.
There will be directories on
Repository: spark
Updated Branches:
refs/heads/master 6a2325448 -> 445f1790a
[SPARK-9104][CORE] Expose Netty memory metrics in Spark
## What changes were proposed in this pull request?
This PR exposes Netty memory usage for Spark's `TransportClientFactory` and
`TransportServer`, including
Repository: spark
Updated Branches:
refs/heads/master acdf45fb5 -> fa0092bdd
[SPARK-21901][SS] Define toString for StateOperatorProgress
## What changes were proposed in this pull request?
Just `StateOperatorProgress.toString` + a few formatting fixes
## How was this patch tested?
Local
Repository: spark
Updated Branches:
refs/heads/branch-2.2 9afab9a52 -> 342cc2a4c
[SPARK-21901][SS] Define toString for StateOperatorProgress
## What changes were proposed in this pull request?
Just `StateOperatorProgress.toString` + a few formatting fixes
## How was this patch tested?
Local
Repository: spark
Updated Branches:
refs/heads/master d3abb3699 -> 763b83ee8
[SPARK-21701][CORE] Enable RPC client to use `SO_RCVBUF` and `SO_SNDBUF` in
SparkConf.
## What changes were proposed in this pull request?
TCP parameters like SO_RCVBUF and SO_SNDBUF can be set in SparkConf, and
Repository: spark
Updated Branches:
refs/heads/master 0bdbefe9d -> 12f0d2422
[SPARK-21880][WEB UI] In the SQL table page, modify jobs trace information
## What changes were proposed in this pull request?
As shown below, for example, When the job 5 is running, It was a mistake to
think that
Repository: spark
Updated Branches:
refs/heads/master 155ab6347 -> 71c2b81aa
[SPARK-22230] Swap per-row order in state store restore.
## What changes were proposed in this pull request?
In state store restore, for each row, put the saved state before the row in the
iterator instead of after.
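The per-row ordering swap can be sketched over plain iterators (a hypothetical helper, not the actual `StateStoreRestoreExec`):

```scala
// For each input row, emit the saved state row (if any) BEFORE the row
// itself, rather than after it — the ordering swap described above.
def restoreWithState[A](rows: Iterator[A], savedState: A => Option[A]): Iterator[A] =
  rows.flatMap(row => savedState(row).iterator ++ Iterator(row))
```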
ted?
- unit tests: `StreamingRelation.computeStats` and
`StreamingExecutionRelation.computeStats`.
- regression tests: `explain join with a normal source` and `explain join with
MemoryStream`.
Author: Shixiong Zhu <zsxw...@gmail.com>
Closes #19465 from zsxwing/SPARK-21988.
Project: http:
non-streaming events, streaming query listeners don't need to wait for
other Spark listeners and can catch up.
## How was this patch tested?
Jenkins
Author: Shixiong Zhu <zsxw...@gmail.com>
Closes #19838 from zsxwing/SPARK-22638.
Project: http://git-wip-us.apache.org/repos/asf/spark/re