[ 
https://issues.apache.org/jira/browse/ARROW-11901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434765#comment-17434765
 ] 

Benjamin Wilhelm commented on ARROW-11901:
------------------------------------------

We at KNIME are currently using the official Java Arrow library for our 
upcoming table backend 
([https://www.knime.com/blog/improved-performance-with-new-table-backend]). It 
works for us, and we will keep using it. As Samuel pointed out, it might be a 
valid idea to base the Java API on JavaCPP, but this issue is not the right 
place for that discussion (perhaps a thread on the mailing list?).

However, a significant problem with the Java API was, and still is, the lack 
of fast LZ4 compression. The JavaCPP project was the easiest and fastest way 
to get a very fast LZ4 API for Java (supporting frame compression, as needed). 
I have already implemented {{CompressionCodec}} using these bindings, and we 
(at KNIME) will use it with the next release.
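
For illustration only, here is a minimal sketch of the idea behind avoiding the 
heap/off-heap copies mentioned in the issue description: compress straight from 
one direct buffer into another, with no intermediate byte[]. This is not the 
KNIME codec and it does not use JavaCPP or LZ4. As a stand-in it assumes the 
JDK's own java.util.zip Deflater/Inflater (ByteBuffer overloads, Java 11+), and 
the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical sketch: buffer-to-buffer compression without a byte[] detour. */
public final class DirectBufferCodecSketch {

    /** Compresses the remaining bytes of src into a new direct buffer. */
    static ByteBuffer compress(ByteBuffer src) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            // duplicate() keeps the caller's position and limit untouched
            deflater.setInput(src.duplicate());
            deflater.finish();
            int len = src.remaining();
            // Generous upper bound for zlib output (assumption: len/16 + 64 slack)
            ByteBuffer dst = ByteBuffer.allocateDirect(len + len / 16 + 64);
            while (!deflater.finished()) {
                if (deflater.deflate(dst) == 0 && !dst.hasRemaining()) {
                    throw new IllegalStateException("output buffer too small");
                }
            }
            dst.flip();
            return dst;
        } finally {
            deflater.end(); // release native zlib state
        }
    }

    /** Decompresses src into a new direct buffer of the known original size. */
    static ByteBuffer decompress(ByteBuffer src, int uncompressedLength) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(src.duplicate());
            ByteBuffer dst = ByteBuffer.allocateDirect(uncompressedLength);
            while (dst.hasRemaining() && !inflater.finished()) {
                int n = inflater.inflate(dst);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported input");
                }
            }
            dst.flip();
            return dst;
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt input", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] raw = "arrow arrow arrow arrow arrow".getBytes(StandardCharsets.UTF_8);
        ByteBuffer src = ByteBuffer.allocateDirect(raw.length);
        src.put(raw).flip();

        ByteBuffer packed = compress(src);
        ByteBuffer restored = decompress(packed, raw.length);
        System.out.println("raw=" + raw.length + " bytes, compressed=" + packed.remaining()
                + " bytes, round trip ok=" + restored.equals(ByteBuffer.wrap(raw)));
    }
}
```

An LZ4 frame codec would follow the same shape, with the native compress and 
decompress calls operating directly on the off-heap memory.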

Seeing where JavaCPP is used, I think it is a viable project. I could 
contribute my {{CompressionCodec}} implementation to Arrow if this is desired. 
Creating JNI bindings for LZ4 in the Arrow repository would take more time, and 
I won't be able to do this soon.

> [Java] Investigate potential performance improvement of compression codec
> -------------------------------------------------------------------------
>
>                 Key: ARROW-11901
>                 URL: https://issues.apache.org/jira/browse/ARROW-11901
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Java
>            Reporter: Liya Fan
>            Assignee: Benjamin Wilhelm
>            Priority: Major
>
> In response to the discussion in 
> https://github.com/apache/arrow/pull/8949/files#r588046787
> There are some performance penalties in the implementation of the compression 
> codecs (e.g. data copying between heap/off-heap data). We need to revise the 
> code to improve the performance. 
> We should also provide some benchmarks to validate that the performance 
> actually improves. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
