[
https://issues.apache.org/jira/browse/ARROW-6172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Liya Fan updated ARROW-6172:
----------------------------
Summary: [Java] Provide benchmarks to set IntVector with different methods
(was: [Java] Avoid creating value holders repeatedly when reading data from
JDBC)
> [Java] Provide benchmarks to set IntVector with different methods
> -----------------------------------------------------------------
>
> Key: ARROW-6172
> URL: https://issues.apache.org/jira/browse/ARROW-6172
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Java
> Reporter: Liya Fan
> Assignee: Liya Fan
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> When converting JDBC data to Arrow data, a value holder is created for each
> value. The following code snippet gives an example:
> NullableSmallIntHolder holder = new NullableSmallIntHolder();
> holder.isSet = isNonNull ? 1 : 0;
> if (isNonNull) {
>   holder.value = (short) value;
> }
> smallIntVector.setSafe(rowCount, holder);
> smallIntVector.setValueCount(rowCount + 1);
>
> This is inefficient in terms of both memory usage and computation.
> For most types, we can improve performance by setting the value directly.
> For example, benchmarks on IntVector show that setting the int value
> directly reduces the average time per operation by about 20%
> (15.397 vs. 19.198 us/op):
>
> Benchmark                         Mode  Cnt   Score   Error  Units
> IntBenchmarks.setIntDirectly      avgt    5  15.397 ± 0.018  us/op
> IntBenchmarks.setWithValueHolder  avgt    5  19.198 ± 0.789  us/op
>
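The two write paths compared in the description can be sketched in plain Java as follows. This is only an illustration under stated assumptions: ToySmallIntVector, its nested Holder, and DirectSetSketch are hypothetical stand-ins written for this example, not Arrow classes, and the sketch says nothing about Arrow's internal buffer layout. In Arrow Java the direct path would presumably use the vector's own primitive setSafe and setNull overloads instead.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical stand-in for a nullable 16-bit vector. NOT an Arrow class. */
final class ToySmallIntVector {
    /** Mirrors the shape of a nullable value holder: isSet is 1 or 0. */
    static final class Holder {
        int isSet;
        short value;
    }

    private final BitSet validity = new BitSet();
    private short[] values = new short[16];
    private int valueCount;

    /** Grow the backing array so that the given index is writable. */
    private void ensureCapacity(int index) {
        if (index >= values.length) {
            values = Arrays.copyOf(values, Math.max(index + 1, values.length * 2));
        }
    }

    /** Holder-based write: the caller must allocate and fill a holder first. */
    void setSafe(int index, Holder holder) {
        if (holder.isSet != 0) {
            setSafe(index, holder.value);
        } else {
            setNull(index);
        }
    }

    /** Direct write of a non-null value. */
    void setSafe(int index, short value) {
        ensureCapacity(index);
        validity.set(index);
        values[index] = value;
    }

    /** Direct write of a null. */
    void setNull(int index) {
        ensureCapacity(index);
        validity.clear(index);
    }

    void setValueCount(int count) {
        valueCount = count;
    }

    int getValueCount() {
        return valueCount;
    }

    boolean isNull(int index) {
        return !validity.get(index);
    }

    short get(int index) {
        if (isNull(index)) {
            throw new IllegalStateException("value at index " + index + " is null");
        }
        return values[index];
    }
}

final class DirectSetSketch {
    /** Path from the issue description: one holder allocated per value. */
    static void writeWithHolder(ToySmallIntVector vector, int row, boolean isNonNull, int value) {
        ToySmallIntVector.Holder holder = new ToySmallIntVector.Holder();
        holder.isSet = isNonNull ? 1 : 0;
        if (isNonNull) {
            holder.value = (short) value;
        }
        vector.setSafe(row, holder);
        vector.setValueCount(row + 1);
    }

    /** Alternative path: no holder, the value or the null is written directly. */
    static void writeDirectly(ToySmallIntVector vector, int row, boolean isNonNull, int value) {
        if (isNonNull) {
            vector.setSafe(row, (short) value);
        } else {
            vector.setNull(row);
        }
        vector.setValueCount(row + 1);
    }
}
```

Both methods leave the vector in the same state. The direct path only avoids allocating and filling an intermediate holder for every value, which is the overhead the benchmark above measures.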
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)