S3tuit opened a new issue, #5972: URL: https://github.com/apache/hop/issues/5972
### Apache Hop version?

SNAPSHOT-20251107

### Java version?

17

### Operating system

Linux

### What happened?

In the `ByteArrayHashIndex` data structure, the `size` attribute is initialized to the length of the array but is then used to track the number of entries stored in the array. This makes the code confusing and triggers a resize on the first insertion, because `size` is already greater than the threshold.

Moreover, after removing some unused variables and redundant checks and running this [micro benchmark](https://gist.github.com/S3tuit/d0da517c56e0b7cc89792273a4544dd7), it is possible to improve performance a bit, most visibly for PUT(fill from empty):

```
PUT(update) = The index is already fully populated with all keys.
For every put_update, the key used is one that already exists in the table.
PUT(fill from empty) = New empty index.

====================================================
Benchmark with 100000 entries
====================================================
GET old: 913.16 ns/op
GET new: 888.05 ns/op
PUT(update) old: 842.21 ns/op
PUT(update) new: 885.90 ns/op
PUT(fill from empty) old: 1009.46 ns/op
PUT(fill from empty) new: 891.25 ns/op

====================================================
Benchmark with 500000 entries
====================================================
GET old: 1044.61 ns/op
GET new: 1081.67 ns/op
PUT(update) old: 1062.27 ns/op
PUT(update) new: 1188.03 ns/op
PUT(fill from empty) old: 1550.29 ns/op
PUT(fill from empty) new: 975.93 ns/op
```

### Issue Priority

Priority: 3

### Issue Component

Component: Transforms
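
To illustrate the `size` problem described under "What happened?", here is a minimal sketch of an open-addressing index. It is not Hop's actual `ByteArrayHashIndex` code: the class `TinyIndex`, the 0.75 load factor, the linear probing, and the `buggyInit` flag are all assumptions made up for this example. It only shows why starting `size` at the array length forces a resize on the very first `put`, while starting it at 0 does not.

```java
import java.util.Arrays;

/** Hypothetical demo, NOT Hop's ByteArrayHashIndex: shows size-vs-capacity confusion. */
public class SizeVsCapacityDemo {

  static final class TinyIndex {
    private static final double LOAD_FACTOR = 0.75; // assumed value for this sketch
    private byte[][] keys;
    private byte[][] values;
    private int size; // intended meaning: number of stored entries
    private int resizeThreshold;
    int resizeCount = 0; // instrumentation for the demo only

    TinyIndex(int requestedCapacity, boolean buggyInit) {
      int cap = 1;
      while (cap < requestedCapacity) {
        cap <<= 1; // round up to a power of two so (hash & mask) works
      }
      keys = new byte[cap][];
      values = new byte[cap][];
      resizeThreshold = (int) (cap * LOAD_FACTOR);
      // Buggy variant: size starts at the array length instead of 0.
      size = buggyInit ? cap : 0;
    }

    void put(byte[] key, byte[] value) {
      int mask = keys.length - 1;
      int i = Arrays.hashCode(key) & mask;
      while (keys[i] != null) {
        if (Arrays.equals(keys[i], key)) {
          values[i] = value; // update in place, entry count unchanged
          return;
        }
        i = (i + 1) & mask; // linear probing
      }
      keys[i] = key;
      values[i] = value;
      size++;
      if (size > resizeThreshold) {
        resize();
      }
    }

    byte[] get(byte[] key) {
      int mask = keys.length - 1;
      int i = Arrays.hashCode(key) & mask;
      while (keys[i] != null) {
        if (Arrays.equals(keys[i], key)) {
          return values[i];
        }
        i = (i + 1) & mask;
      }
      return null;
    }

    private void resize() {
      byte[][] oldKeys = keys;
      byte[][] oldValues = values;
      int newCap = oldKeys.length * 2;
      keys = new byte[newCap][];
      values = new byte[newCap][];
      resizeThreshold = (int) (newCap * LOAD_FACTOR);
      int mask = newCap - 1;
      for (int j = 0; j < oldKeys.length; j++) {
        if (oldKeys[j] == null) {
          continue;
        }
        int i = Arrays.hashCode(oldKeys[j]) & mask;
        while (keys[i] != null) {
          i = (i + 1) & mask;
        }
        keys[i] = oldKeys[j];
        values[i] = oldValues[j];
      }
      resizeCount++;
    }
  }

  /** Returns how many resizes a single put triggers on a fresh 16-slot index. */
  static int resizesAfterFirstPut(boolean buggyInit) {
    TinyIndex index = new TinyIndex(16, buggyInit);
    index.put(new byte[] {1}, new byte[] {42});
    return index.resizeCount;
  }

  public static void main(String[] args) {
    System.out.println("size = array length: " + resizesAfterFirstPut(true) + " resize(s)");
    System.out.println("size = 0:            " + resizesAfterFirstPut(false) + " resize(s)");
  }
}
```

With 16 slots and a threshold of 12, the buggy variant bumps `size` from 16 to 17 on the first `put` and resizes immediately, whereas the variant starting at 0 does not resize until the thirteenth distinct key. This would be consistent with the benchmark improvement being concentrated in PUT(fill from empty), though the sketch does not reproduce those numbers.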
