xiaochen-zhou opened a new pull request, #8768:
URL: https://github.com/apache/seatunnel/pull/8768

   ### Purpose of this pull request
   
   When QUERY_TABLET_SIZE is set explicitly instead of being left at its default of Integer.MAX_VALUE, the List<QueryPartition> returned by findPartitions() can contain several partitions with the same beAddress, and each of them becomes its own split:
   
   ```java
   List<StarRocksSourceSplit> getStarRocksSourceSplit() {
     List<StarRocksSourceSplit> sourceSplits = new ArrayList<>();
     List<QueryPartition> partitions = starRocksQueryPlanReadClient.findPartitions();
     for (int i = 0; i < partitions.size(); i++) {
       sourceSplits.add(
         new StarRocksSourceSplit(
           partitions.get(i), String.valueOf(partitions.get(i).hashCode())));
     }
     return sourceSplits;
   }
   ```
   
   To avoid opening a new connection to the BE for every split during the read, the StarRocksBeReadClient is cached by the beAddress of the split's partition. However, StarRocksBeReadClient#openScanner() does not reset some of the client's variables, so a cached client can carry stale values from the previous partition into the next one. For example, if eos (end of stream) is still true from a previous partition read, a new partition on the same BE can be treated as already finished, which causes data loss.
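
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the connector's code: `BeReader`, `Split`, `countRows` and the `resetOnOpen` switch are hypothetical stand-ins, and the address `be-1:9060` is made up. It only assumes what is described above, namely that one client is cached per beAddress and that `eos` survives between partitions unless `openScanner()` clears it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class CachedReaderSketch {

    /** Hypothetical stand-in for a per-BE read client (not the real StarRocksBeReadClient). */
    static class BeReader {
        private final boolean resetOnOpen;
        private Iterator<String> rows;
        private boolean eos; // end-of-stream flag; lives as long as the cached reader

        BeReader(boolean resetOnOpen) {
            this.resetOnOpen = resetOnOpen;
        }

        void openScanner(List<String> partitionRows) {
            rows = partitionRows.iterator();
            if (resetOnOpen) {
                eos = false; // clear per-partition state before reading a new partition
            }
        }

        boolean hasNext() {
            if (eos) {
                return false; // a stale true value ends the read immediately
            }
            if (!rows.hasNext()) {
                eos = true;
                return false;
            }
            return true;
        }

        String next() {
            return rows.next();
        }
    }

    /** Hypothetical split: a BE address plus the rows of one partition. */
    static class Split {
        final String beAddress;
        final List<String> rows;

        Split(String beAddress, List<String> rows) {
            this.beAddress = beAddress;
            this.rows = rows;
        }
    }

    /** Reads every split through a reader cached by beAddress and counts the rows seen. */
    static int countRows(List<Split> splits, boolean resetOnOpen) {
        Map<String, BeReader> cache = new HashMap<>();
        int count = 0;
        for (Split split : splits) {
            BeReader reader =
                cache.computeIfAbsent(split.beAddress, addr -> new BeReader(resetOnOpen));
            reader.openScanner(split.rows);
            while (reader.hasNext()) {
                reader.next();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Split> splits = Arrays.asList(
            new Split("be-1:9060", Arrays.asList("a", "b")),
            new Split("be-1:9060", Arrays.asList("c", "d")));
        System.out.println(countRows(splits, false)); // prints 2: second partition is skipped
        System.out.println(countRows(splits, true));  // prints 4: state is reset on open
    }
}
```

Without the reset, the second split on the same address reuses a reader whose `eos` is already true, so its rows are never returned. Resetting the flag in `openScanner()` keeps the cached connection while treating each partition as a fresh read.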
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   no
   
   
   ### How was this patch tested?
   
   New tests were added.
   
   
   ### Check list
   
   * [ ] If your PR adds any new Jar binary package, please add a License Notice according to the [New License Guide](https://github.com/apache/seatunnel/blob/dev/docs/en/contribution/new-license.md)
   * [ ] If necessary, please update the documentation to describe the new feature. https://github.com/apache/seatunnel/tree/dev/docs
   * [ ] If you are contributing connector code, please check that the following files are updated:
     1. Update [plugin-mapping.properties](https://github.com/apache/seatunnel/blob/dev/plugin-mapping.properties) and add the new connector information to it
     2. Update the pom file of [seatunnel-dist](https://github.com/apache/seatunnel/blob/dev/seatunnel-dist/pom.xml)
     3. Add a CI label in [label-scope-conf](https://github.com/apache/seatunnel/blob/dev/.github/workflows/labeler/label-scope-conf.yml)
     4. Add an e2e test case in [seatunnel-e2e](https://github.com/apache/seatunnel/tree/dev/seatunnel-e2e/seatunnel-connector-v2-e2e/)
     5. Update the connector [plugin_config](https://github.com/apache/seatunnel/blob/dev/config/plugin_config)
   * [ ] Update the [`release-note`](https://github.com/apache/seatunnel/blob/dev/release-note.md).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
