[
https://issues.apache.org/jira/browse/ARROW-3496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16647653#comment-16647653
]
Animesh Trivedi commented on ARROW-3496:
----------------------------------------
To give you an idea of what I have so far
(https://github.com/animeshtrivedi/benchmarking-arrow; its README is
outdated): a standalone Java program with:
i) a basic data generation template that generates data for integer, long, and
binary column types (we can extend it to cover arbitrary types and schemas)
ii) in-memory data buffers that hold the generated data in memory (in either
on-heap or off-heap buffers)
iii) readers that consume the generated data using various APIs (calling get*(),
the holder API variant, or just writing your own readers on top of the direct
byte buffers).
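
As a rough illustration of the three steps above (this is not the code from the
linked repository), the sketch below uses only plain java.nio buffers instead
of Arrow vectors. The class name BufferBenchSketch, the method names, the fixed
seed, and the value count are all assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Random;

public class BufferBenchSketch {

    // Steps i + ii: generate `count` pseudo-random ints and hold them in an
    // in-memory buffer, either on-heap or off-heap (direct).
    public static ByteBuffer generateInts(int count, boolean offHeap, long seed) {
        int bytes = Math.multiplyExact(count, Integer.BYTES);
        ByteBuffer buf = offHeap ? ByteBuffer.allocateDirect(bytes)
                                 : ByteBuffer.allocate(bytes);
        buf.order(ByteOrder.LITTLE_ENDIAN);
        Random rnd = new Random(seed);
        for (int i = 0; i < count; i++) {
            buf.putInt(rnd.nextInt());
        }
        buf.flip(); // switch from writing to reading: position = 0, limit = bytes
        return buf;
    }

    // Step iii: a hand-written reader that consumes the values straight from
    // the byte buffer using absolute reads.
    public static long sumInts(ByteBuffer buf) {
        long sum = 0;
        for (int pos = 0; pos + Integer.BYTES <= buf.limit(); pos += Integer.BYTES) {
            sum += buf.getInt(pos);
        }
        return sum;
    }

    public static void main(String[] args) {
        int count = 1_000_000; // hypothetical size, chosen only for the example
        for (boolean offHeap : new boolean[] {false, true}) {
            ByteBuffer buf = generateInts(count, offHeap, 42L);
            long start = System.nanoTime();
            long sum = sumInts(buf); // only the read step is timed
            long elapsed = System.nanoTime() - start;
            System.out.printf("%s: sum=%d, %.2f ns/value%n",
                    offHeap ? "off-heap" : "on-heap", sum, (double) elapsed / count);
        }
    }
}
```

A single timed pass like this includes JIT warm-up, so the numbers it prints
are only indicative; a harness such as JMH would be a more reliable way to
measure the read step.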
The whole benchmark is multi-threaded, and all three steps can run in parallel.
It is usually the last step that is benchmarked. The current code base
obviously has a lot more code for my own testing and understanding, but we can
clean it up gradually.
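
One way the multi-threaded structure could look, again as a sketch with plain
java.nio buffers rather than the repository's actual code: each worker fills
its own off-heap buffer and then consumes it, so no buffer is shared between
threads. ParallelReadSketch, fill, consume, and runParallel are hypothetical
names.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelReadSketch {

    // Generate the longs 0..count-1 into a fresh off-heap buffer.
    public static ByteBuffer fill(int count) {
        ByteBuffer buf = ByteBuffer.allocateDirect(Math.multiplyExact(count, Long.BYTES));
        for (long v = 0; v < count; v++) {
            buf.putLong(v);
        }
        buf.flip();
        return buf;
    }

    // Consume every long in the buffer with relative reads.
    public static long consume(ByteBuffer buf) {
        long sum = 0;
        while (buf.remaining() >= Long.BYTES) {
            sum += buf.getLong();
        }
        return sum;
    }

    // Run generate + consume on `threads` workers and add up the per-worker sums.
    public static long runParallel(int threads, int valuesPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Long> task = () -> consume(fill(valuesPerThread));
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("total = " + runParallel(4, 1_000_000));
    }
}
```

To benchmark only the last step, the timing calls would wrap consume inside
each task rather than the whole runParallel call.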
Where do we want to put this code, and how should a user run it? Maybe as part
of the default build process, where the benchmark is compiled as a separate jar
(arrow-java-benchmarks-0.12.jar, something like this).
> [Java] Add microbenchmark code to Java
> --------------------------------------
>
> Key: ARROW-3496
> URL: https://issues.apache.org/jira/browse/ARROW-3496
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Java
> Affects Versions: 0.11.0
> Reporter: Li Jin
> Priority: Major
>
> [~atrivedi] has done some microbenchmarking with the Java API. Let's consider
> adding them to the codebase.