[ 
https://issues.apache.org/jira/browse/NIFI-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991318#comment-15991318
 ] 

ASF GitHub Bot commented on NIFI-3724:
--------------------------------------

Github user alopresto commented on the issue:

    https://github.com/apache/nifi/pull/1712
  
    I ran `contrib-check` and the full test suite, and both passed. I have minor 
comments on the code above, but nothing serious. 
    
    I loaded a template provided by Bryan which generated flowfiles, merged 
them, and wrote them to Parquet format (on local disk using the `core-site.xml` 
referenced above), then fetched those files and wrote them out as CSV. 
    
    ```
    hw12203:/Users/alopresto/Workspace/scratch/NIFI-3724 (master) alopresto
    🔓 1s @ 14:59:53 $ ll
    total 24
    drwxr-xr-x    6 alopresto  staff   204B May  1 14:59 ./
    drwxr-xr-x  105 alopresto  staff   3.5K May  1 14:45 ../
    -rw-r--r--@   1 alopresto  staff   6.0K May  1 14:55 .DS_Store
    -rw-r--r--    1 alopresto  staff   129B May  1 14:41 core-site.xml
    drwxr-xr-x    2 alopresto  staff    68B May  1 14:59 csv/
    drwxr-xr-x    2 alopresto  staff    68B May  1 14:59 parquet/
    hw12203:/Users/alopresto/Workspace/scratch/NIFI-3724 (master) alopresto
    🔓 9s @ 15:00:03 $ tl
    .
    ├── [6.0K]  .DS_Store
    ├── [ 129]  core-site.xml
    ├── [ 238]  csv/
    │   ├── [  54]  257951968574779
    │   ├── [3.5M]  257962982705055
    │   ├── [  54]  257981981063720
    │   ├── [3.7M]  257986105785832
    │   └── [  54]  258011981257869
    └── [ 476]  parquet/
        ├── [  16]  .257951968574779.crc
        ├── [6.5K]  .257962982705055.crc
        ├── [  16]  .257981981063720.crc
        ├── [6.6K]  .257986105785832.crc
        ├── [  16]  .258011981257869.crc
        ├── [6.5K]  .258013234789061.crc
        ├── [ 758]  257951968574779
        ├── [829K]  257962982705055
        ├── [ 758]  257981981063720
        ├── [842K]  257986105785832
        ├── [ 758]  258011981257869
        └── [833K]  258013234789061
    
    2 directories, 19 files
    hw12203:/Users/alopresto/Workspace/scratch/NIFI-3724 (master) alopresto
    🔓 86s @ 15:01:30 $ more csv/258011981257869
    name,favorite_number,favorite_color
    Bryan,693421,blue
    ```
    
    If the `displayName` comments are fixed, I am +1 and ready to merge. Thanks 
Bryan.  
    
    One minor issue:
    * On template import, the processors that referenced a controller service 
were invalid (they showed "Incompatible Controller Service Configured"). 
Re-selecting the same option from the list on each processor fixed the issue. 
This doesn't seem to be an issue introduced by any code in this PR, however.  
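
For anyone repeating the CSV check above, here is a minimal sketch in Python rather than anything from the PR itself. It parses the two lines printed by `more csv/258011981257869`; the `SAMPLE` string and the `read_records` helper are illustrative names, and the line endings (Unix newlines with a trailing newline) are an assumption.

```python
import csv
import io

# The two lines shown by `more csv/258011981257869` in the transcript.
# Assumption: Unix newlines and a trailing newline after the last record.
SAMPLE = "name,favorite_number,favorite_color\nBryan,693421,blue\n"


def read_records(text):
    """Parse CSV text with a header row into a list of dicts (hypothetical helper)."""
    return list(csv.DictReader(io.StringIO(text)))


records = read_records(SAMPLE)

# Size of the sample in bytes, for comparison with the directory listing.
sample_size = len(SAMPLE.encode("utf-8"))
```

Under that newline assumption the sample is 54 bytes, the same size as the small files in the `csv/` listing, which suggests (but does not prove) that each of those files holds a header plus a single record.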


> Add Put/Fetch Parquet Processors
> --------------------------------
>
>                 Key: NIFI-3724
>                 URL: https://issues.apache.org/jira/browse/NIFI-3724
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Bryan Bende
>            Assignee: Bryan Bende
>            Priority: Minor
>             Fix For: 1.2.0
>
>
> Now that we have the record reader/writer services currently in master, it 
> would be nice to have readers and writers for Parquet. Since Parquet's API is 
> based on the Hadoop Path object, and not InputStreams/OutputStreams, we can't 
> really implement direct conversions to and from Parquet in the middle of a 
> flow, but we can perform the conversion by taking any record format 
> and writing to a Path as Parquet, or reading Parquet from a Path and writing 
> it out as another record format.
> We should add a PutParquet that uses a record reader and writes records to a 
> Path as Parquet, and a FetchParquet that reads Parquet from a path and writes 
> out records to a flow file using a record writer.



