[ 
https://issues.apache.org/jira/browse/NIFI-615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15247588#comment-15247588
 ] 

ASF GitHub Bot commented on NIFI-615:
-------------------------------------

Github user jskora commented on the pull request:

    https://github.com/apache/nifi/pull/252#issuecomment-211868734
  
    @joewitt On [NIFI-1717|https://issues.apache.org/jira/browse/NIFI-1717] and 
[NIFI-1718|https://issues.apache.org/jira/browse/NIFI-1718] Dmitry Goldenberg 
and I discussed using Tika to extract content (OCR) from documents and images.  
@markap14 also suggested removing the filters.
    
    I don't know where the OCR changes stand; those tickets have been quiet for 
a couple of weeks.  I think that's a tougher capability to test, and as pointed 
out on [NIFI-1717|https://issues.apache.org/jira/browse/NIFI-1717] and 
[NIFI-1718|https://issues.apache.org/jira/browse/NIFI-1718] it is an expensive 
process that may need special consideration.
    
    As for the filters, I like having them in the processor, especially since 
this one includes filename and mimetype filters.  If consensus is to remove 
them, I can update the PR for that, but I think they are effective for this 
purpose as it currently is.
    
    I don't think we should hold this for the OCR, but if you want the filters 
removed let me know.  It'd be nice to get the metadata functionality in.


> Create a processor to extract WAV file characteristics
> ------------------------------------------------------
>
>                 Key: NIFI-615
>                 URL: https://issues.apache.org/jira/browse/NIFI-615
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Brandon DeVries
>            Assignee: Joe Skora
>            Priority: Minor
>
> Create a processor to extract information from a WAV file, including 
> encoding, bit rate, metadata, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
