[ 
https://issues.apache.org/jira/browse/NIFI-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847219#comment-15847219
 ] 

ASF GitHub Bot commented on NIFI-3420:
--------------------------------------

Github user ilganeli commented on the issue:

    https://github.com/apache/nifi/pull/1457
  
    I've added unit tests, but I am closing this issue for now. There is a 
substantial blocker to this approach: it relies on Hadoop classes that 
themselves depend on natively compiled and loaded C code. Unless NiFi 
explicitly adds the C code for the Lz4 codec and manually builds and loads that 
library, we won't be able to use Hadoop's Lz4 codec.
    
    I've also evaluated using the lz4-java library instead, but it does not 
generate data in a Hadoop-readable format. 
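    
    To illustrate the framing mismatch, here is a minimal Python sketch. It 
assumes (this is not verified against the Hadoop sources in this thread) that 
Hadoop's block compressor stream prefixes each block with a 4-byte big-endian 
uncompressed length followed by a 4-byte big-endian compressed length, whereas 
the Lz4 CLI writes the LZ4 frame format, which starts with the magic number 
0x184D2204. The names `hadoop_style_frame` and `looks_like_lz4_frame` are 
hypothetical, the default block size is an assumption, and `compress_block` is 
a stand-in for a real LZ4 block compressor.

```python
import struct

# LZ4 frame magic number 0x184D2204, stored little-endian on disk.
LZ4_FRAME_MAGIC = b"\x04\x22\x4d\x18"


def looks_like_lz4_frame(data: bytes) -> bool:
    """Return True if data starts with the LZ4 frame magic number."""
    return data[:4] == LZ4_FRAME_MAGIC


def hadoop_style_frame(data: bytes, compress_block, block_size: int = 256 * 1024) -> bytes:
    """Wrap data in the ASSUMED Hadoop-style block framing.

    For each block: 4-byte big-endian raw length, 4-byte big-endian
    compressed length, then the compressed payload. The block_size
    default is an assumption, not a value taken from this thread.
    """
    out = bytearray()
    for start in range(0, len(data), block_size):
        raw = data[start:start + block_size]
        packed = compress_block(raw)          # stand-in for an LZ4 block compressor
        out += struct.pack(">I", len(raw))    # uncompressed length
        out += struct.pack(">I", len(packed)) # compressed length
        out += packed
    return bytes(out)


if __name__ == "__main__":
    # Identity "compressor" used only to make the framing visible.
    framed = hadoop_style_frame(b"hello", lambda block: block)
    print(framed.hex())
    print(looks_like_lz4_frame(framed))
```

    With the identity stand-in, the output carries two length prefixes and no 
frame magic, so a reader expecting the LZ4 frame format would reject it, and 
a reader expecting length-prefixed blocks would reject CLI output.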


> NIFI Should support generating Hadoop-readable Lz4 outside of HDFS Write
> ------------------------------------------------------------------------
>
>                 Key: NIFI-3420
>                 URL: https://issues.apache.org/jira/browse/NIFI-3420
>             Project: Apache NiFi
>          Issue Type: New Feature
>            Reporter: Ilya Ganelin
>
> Per https://issues.apache.org/jira/browse/HADOOP-12990, data stored in Lz4 
> format on Hadoop is in a different format from the data generated by the Lz4 
> CLI. The Lz4 CLI also cannot be used to generate the Hadoop-compatible 
> format. 
> At the moment, NiFi does not support compression to Lz4 for streaming data.
> Although PutHDFS in the Hadoop processors supports writing Lz4 to HDFS 
> (assuming the appropriate codec exists), if data is instead being saved to 
> something like S3 or simply streamed, there's no way to generate 
> Lz4-compressed data.
> If the Lz4 command-line tool is used within a custom processor to perform Lz4 
> conversion, the resulting data will not be readable on Hadoop if it's 
> subsequently loaded into HDFS.
> A processor can be added that converts streaming data into the Lz4 format 
> that IS readable on Hadoop by using the Hadoop Lz4 Codec to do the 
> compression. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
