HeartSaVioR commented on a change in pull request #34333:
URL: https://github.com/apache/spark/pull/34333#discussion_r739946197
##########
File path: docs/structured-streaming-programming-guide.md
##########
@@ -517,6 +517,8 @@ There are a few built-in sources.
- **Rate source (for testing)** - Generates data at the specified number of
rows per second, each output row contains a `timestamp` and `value`. Where
`timestamp` is a `Timestamp` type containing the time of message dispatch, and
`value` is of `Long` type containing the message count, starting from 0 as the
first row. This source is intended for testing and benchmarking.
Review comment:
I'd love to address it, but honestly I can't think of anything better than the
wording below:
* rate: Generates data at the specified number of rows per second
* rate per micro-batch: Generates data at the specified number of rows per
micro-batch
"The specified number of rows per XXX" means that many rows are produced per
XXX, so the main thing readers need to check is the unit (per XXX).
IMHO "per second" vs. "per micro-batch" is unlikely to cause confusion.
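
To make the unit distinction concrete, here is a toy model in plain Python. It is only a sketch under simplifying assumptions: it does not call any Spark API, the function names are hypothetical, and it ignores details such as ramp-up or scheduling delays. It only shows that the same specified number yields a total that depends on elapsed time in one case and on the number of micro-batches in the other.

```python
# Toy model only: plain Python, no Spark API involved.
# All names below are hypothetical and exist just for illustration.


def total_rows_per_second(rows: int, elapsed_seconds: int) -> int:
    """Unit is wall-clock time: total depends only on elapsed seconds."""
    return rows * elapsed_seconds


def total_rows_per_micro_batch(rows: int, batches_run: int) -> int:
    """Unit is the micro-batch: total depends only on how many batches ran."""
    return rows * batches_run


rows = 10            # the "specified number of rows"
elapsed_seconds = 6  # same wall-clock window for every case

for trigger_seconds in (1, 2, 3):
    # Assumption: one micro-batch fires per trigger interval, with no delays.
    batches_run = elapsed_seconds // trigger_seconds
    print(
        f"trigger={trigger_seconds}s",
        f"per-second total={total_rows_per_second(rows, elapsed_seconds)}",
        f"per-micro-batch total={total_rows_per_micro_batch(rows, batches_run)}",
    )
```

In this model the per-second total stays the same for every trigger interval, while the per-micro-batch total shrinks as batches become less frequent, which is why the unit is the thing to check.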
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]