[ https://issues.apache.org/jira/browse/ARROW-11901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448907#comment-17448907 ]
Samuel Audet commented on ARROW-11901:
--------------------------------------
{quote}I was not referring to binding to the C++ implementation here but
directly to the LZ4 library. It looks like JavaCPP makes this efficient from a
developer perspective. But the
[API|https://github.com/bytedeco/javacpp-presets/pull/1094/files#diff-3d9af736e997982d68098d986670f05ff40ae0cc62773a1dd0eb418e55990317R38]
isn't quite what I imagined, it looks like it goes through ByteBuffer, when
all we really need is something like [ZSTD
API|https://github.com/luben/zstd-jni/blob/master/src/main/java/com/github/luben/zstd/Zstd.java#L454].
For such a minimal API I'm ambivalent on taking on a new dependency here.
{quote}
Could you expand on this point? Why do you consider zstd-jni to be minimal, but
not code generated with JavaCPP? To me it looks like zstd-jni is a lot larger
in size than the JavaCPP Presets for LZ4, even when considering only the builds
in common:
[https://repo1.maven.org/maven2/com/github/luben/zstd-jni/1.5.0-4/]
[https://repo1.maven.org/maven2/org/bytedeco/lz4/1.9.3-1.5.6/]
As for the non-ByteBuffer API, what you are looking for are the overloads
taking Pointer, which is just a fancy wrapper around a long value:
[https://github.com/bytedeco/javacpp-presets/blob/master/lz4/src/gen/java/org/bytedeco/lz4/global/lz4.java#L188]
Those overloads work exactly like the zstd-jni API!
> [Java] Investigate potential performance improvement of compression codec
> -------------------------------------------------------------------------
>
> Key: ARROW-11901
> URL: https://issues.apache.org/jira/browse/ARROW-11901
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Java
> Reporter: Liya Fan
> Priority: Major
>
> In response to the discussion in
> https://github.com/apache/arrow/pull/8949/files#r588046787
> There are some performance penalties in the implementation of the compression
> codecs (e.g. data copying between heap/off-heap data). We need to revise the
> code to improve the performance.
> We should also provide some benchmarks to validate that the performance
> actually improves.
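
The heap/off-heap copying mentioned in the description can be illustrated with a JDK-only sketch. It is not the Arrow codec, LZ4, zstd-jni, or JavaCPP: it substitutes java.util.zip (zlib) because its ByteBuffer overloads (Java 11 and later) accept direct buffers, so no intermediate byte[] is staged on the Java side. The class name, method names, and the output sizing are assumptions made for this illustration only.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: compress and decompress between direct (off-heap)
// buffers without staging the data in a heap byte[] first.
public class DirectBufferCodecSketch {

    // Compresses the remaining bytes of src into a newly allocated direct buffer.
    public static ByteBuffer compress(ByteBuffer src) {
        // Generous output size for a sketch; not a tight bound.
        ByteBuffer dst = ByteBuffer.allocateDirect(src.remaining() + src.remaining() / 16 + 64);
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(src);
            deflater.finish();
            while (!deflater.finished()) {
                int n = deflater.deflate(dst);
                if (n == 0 && !deflater.finished()) {
                    throw new IllegalStateException("output buffer too small");
                }
            }
        } finally {
            deflater.end(); // release the native zlib state
        }
        dst.flip();
        return dst;
    }

    // Decompresses into a direct buffer of the known original length.
    public static ByteBuffer decompress(ByteBuffer compressed, int originalLength) {
        ByteBuffer dst = ByteBuffer.allocateDirect(originalLength);
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            while (!inflater.finished()) {
                int n = inflater.inflate(dst);
                if (n == 0 && !inflater.finished()) {
                    throw new IllegalStateException("output buffer too small or input truncated");
                }
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        dst.flip();
        return dst;
    }

    // Convenience helper: the only heap copies are at the edges, for the demo.
    public static byte[] roundTrip(byte[] payload) {
        ByteBuffer src = ByteBuffer.allocateDirect(payload.length);
        src.put(payload);
        src.flip();
        ByteBuffer restored = decompress(compress(src), payload.length);
        byte[] out = new byte[restored.remaining()];
        restored.get(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] payload = "off-heap data ".repeat(200).getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(payload, roundTrip(payload)));
    }
}
```

An address-based API such as the one discussed in the comment would go one step further and take raw long addresses instead of buffer objects. Whether either form is measurably faster for the Arrow codecs is what the benchmarks requested in the description would need to show.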