Weston Pace created ARROW-15820:
-----------------------------------
Summary: [C++][Doc] Add table_source to streaming_execution.rst &
clarify parameter name
Key: ARROW-15820
URL: https://issues.apache.org/jira/browse/ARROW-15820
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Weston Pace
Currently the table_source node does not appear in our documentation.
Also, in {{TableSourceNodeOptions}} we have:
{noformat}
// Size of batches to emit from this node
// If the table is larger the node will emit multiple batches from the
// the table to be processed in parallel.
int64_t batch_size;
{noformat}
However, when looking into a performance issue today, I realized this
description is incomplete. The value is only an upper bound on the size of
emitted batches, so we should probably rename this parameter to
{{max_batch_size}}.
Furthermore, we should make it clear that if the table's batches are already
smaller than this size, the node emits them as-is (this is a good thing in my
case) and does not concatenate small batches together into a larger batch.