[
https://issues.apache.org/jira/browse/BEAM-13137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17476093#comment-17476093
]
Etienne Chauchot commented on BEAM-13137:
-----------------------------------------
[~egalpin] Thanks for taking a look! There used to be stats-specific parameters
in the tests that were removed when the tests were moved to Testcontainers:
- _settings.put("index.store.stats_refresh_interval", 0)_ in the index settings
of the test index.
- _request.addParameters(Collections.singletonMap("refresh",
"wait_for"));_
Removing them may be the cause of the increased flakiness in the stats-related
features (testSplit and testSizes).
They were used only in the tests, so they never affected users' production clusters.
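
For reference, here is a minimal sketch of how those two test-only parameters fit together. It uses only the JDK so it runs without an Elasticsearch client on the classpath. The class name, the helper methods and the "/my-index/_bulk" endpoint are hypothetical and not taken from the Beam test code; only the setting key, its value and the "refresh" parameter come from the lines above.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, for illustration only.
public class StatsTestSettings {

  // Index settings for the test index. The key and value are the ones
  // quoted above; the intent is that store stats are not served from a stale cache.
  static Map<String, Object> testIndexSettings() {
    Map<String, Object> settings = new LinkedHashMap<>();
    settings.put("index.store.stats_refresh_interval", 0);
    return settings;
  }

  // Query parameters for write requests: "refresh=wait_for" makes the write
  // return only once the written documents are visible to search.
  static Map<String, String> writeRequestParameters() {
    return Collections.singletonMap("refresh", "wait_for");
  }

  // Appends the parameters to an endpoint as a query string.
  // No URL encoding is done; the values used here do not need it.
  static String withParameters(String endpoint, Map<String, String> params) {
    if (params.isEmpty()) {
      return endpoint;
    }
    String query =
        params.entrySet().stream()
            .map(e -> e.getKey() + "=" + e.getValue())
            .collect(Collectors.joining("&"));
    return endpoint + "?" + query;
  }

  public static void main(String[] args) {
    System.out.println(testIndexSettings());
    // "/my-index/_bulk" is an assumed endpoint for the example.
    System.out.println(withParameters("/my-index/_bulk", writeRequestParameters()));
  }
}
```

In the actual tests the second part was done through _request.addParameters(...)_ on the client request, as quoted above; the query-string helper here only stands in for that call.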
> make ElasticsearchIO$BoundedElasticsearchSource#getEstimatedSizeBytes
> deterministic
> -----------------------------------------------------------------------------------
>
> Key: BEAM-13137
> URL: https://issues.apache.org/jira/browse/BEAM-13137
> Project: Beam
> Issue Type: Improvement
> Components: io-java-elasticsearch
> Reporter: Etienne Chauchot
> Assignee: Evan Galpin
> Priority: P2
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Index size estimation in ES is statistical and varies between calls. But it is
> the basis for splitting, so it needs to be more deterministic: the variation
> causes flakiness in the unit tests _testSplit_ and _testSizes_, and may lead to
> sub-optimal splitting in production in some cases.