[
https://issues.apache.org/jira/browse/SPARK-54118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Huanli Wang updated SPARK-54118:
--------------------------------
Description:
In SS TWS, when we do the {{put(array)}} operation in ListState, we put the
first element and then merge the remaining elements one by one. So to put an
array with 100 elements, we need to do 1 put + 99 merges. This can perform
worse than a single put operation for the entire array.
Similarly, we have the same issue in {{merge(array)}}.
> Improve the put/merge operation in ListState when there are multiple values
> ---------------------------------------------------------------------------
>
> Key: SPARK-54118
> URL: https://issues.apache.org/jira/browse/SPARK-54118
> Project: Spark
> Issue Type: Improvement
> Components: Structured Streaming
> Affects Versions: 4.1.0
> Reporter: Huanli Wang
> Priority: Major
>
> In SS TWS, when we do the {{put(array)}} operation in ListState, we put the
> first element and then merge the remaining elements one by one. So to put an
> array with 100 elements, we need to do 1 put + 99 merges. This can perform
> worse than a single put operation for the entire array.
>
> Similarly, we have the same issue in {{merge(array)}}.
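
To make the operation counts in the description concrete, here is a minimal sketch of the two write patterns. It is a toy model, not Spark code: CountingStore, put_elementwise and put_batched are hypothetical names introduced only for illustration, and the store is a plain in-memory dict that counts calls rather than the actual state store used by ListState.

```python
from typing import Dict, List


class CountingStore:
    """Hypothetical in-memory stand-in for a state store; it only counts write calls."""

    def __init__(self) -> None:
        self.data: Dict[str, List[bytes]] = {}
        self.puts = 0
        self.merges = 0

    def put(self, key: str, values: List[bytes]) -> None:
        # Overwrite whatever is stored under the key in one write.
        self.data[key] = list(values)
        self.puts += 1

    def merge(self, key: str, values: List[bytes]) -> None:
        # Append to whatever is stored under the key in one write.
        self.data.setdefault(key, []).extend(values)
        self.merges += 1


def put_elementwise(store: CountingStore, key: str, values: List[bytes]) -> None:
    # Pattern described in the issue: 1 put, then one merge per remaining element.
    store.put(key, values[:1])
    for v in values[1:]:
        store.merge(key, [v])


def put_batched(store: CountingStore, key: str, values: List[bytes]) -> None:
    # Direction suggested by the issue: write the whole array with a single put.
    store.put(key, values)


values = [str(i).encode() for i in range(100)]

slow = CountingStore()
put_elementwise(slow, "k", values)

fast = CountingStore()
put_batched(fast, "k", values)

print(slow.puts, slow.merges)  # 1 99
print(fast.puts, fast.merges)  # 1 0
```

Both patterns leave the same list under the key; only the number of store writes differs. How the real implementation would encode the whole array into a single write is not covered by this sketch.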