Re: CRC ContentHandler

2017-02-22 Thread Wshrdryr Corp
DigestAlgorithm.MD5)) > > > > If you need modifications, please open a ticket. > > > > *From:* Wshrdryr Corp [mailto:wshrd...@gmail.com] > *Sent:* Wednesday, February 15, 2017 7:49 PM > > *To:* user@tika.apache.org > *Subject:* Re: CRC ContentHandler > >

RE: CRC ContentHandler

2017-02-16 Thread Allison, Timothy B.
: user@tika.apache.org Subject: Re: CRC ContentHandler Hello Markus, Thanks again for taking the time to reply. I guess I should be more specific: I am extending a Nifi component to add this CRC calculation, specifically here: https://github.com/apache/nifi/blob/0.x/nifi-nar-bundles/nifi-media

Re: CRC ContentHandler

2017-02-15 Thread Wshrdryr Corp
- > > From:Wshrdryr Corp > > Sent: Thursday 16th February 2017 0:43 > > To: user@tika.apache.org > > Subject: Re: CRC ContentHandler > > > > Hello Markus, > > > > Thanks for replying. > > > > I was hoping not to have to buffer entire medi

RE: CRC ContentHandler

2017-02-15 Thread Markus Jelsma
file at some point before sending it to Apache Tika, hashing the data is, in this case, not a problem. Markus -Original message- > From:Wshrdryr Corp > Sent: Thursday 16th February 2017 0:43 > To: user@tika.apache.org > Subject: Re: CRC ContentHandler > > Hello Ma

Re: CRC ContentHandler

2017-02-15 Thread Wshrdryr Corp
Hello Markus, Thanks for replying. I was hoping not to have to buffer entire media files due to size. Is there a way to get the content segment as a stream? The internal buffering of a stream might be more efficient and less prone to spikes. Java is not my native tongue. I've been able to hack t
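
For reference, the streaming idea asked about above (checksumming without buffering the whole file) can be sketched with the JDK alone. This is only a minimal sketch, not anything from Tika or Nifi: the class name `StreamingCrc` and the method `crcOf` are hypothetical, and it assumes the content is available as a plain `InputStream`. `CheckedInputStream` updates the CRC32 as bytes are read, so only the small read buffer is held in memory.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

// Hypothetical helper (not part of Tika or Nifi): computes a CRC-32 over a
// stream without holding the whole content in memory.
class StreamingCrc {
    static long crcOf(InputStream in) throws IOException {
        // CheckedInputStream updates the checksum for every byte read through it.
        try (CheckedInputStream checked = new CheckedInputStream(in, new CRC32())) {
            byte[] buf = new byte[8192]; // fixed-size buffer, independent of file size
            while (checked.read(buf) != -1) {
                // Nothing to do here: reading is what drives the checksum.
            }
            return checked.getChecksum().getValue();
        }
    }
}
```

If another consumer (a parser, for example) needs the same bytes, the `CheckedInputStream` could be handed to that consumer instead of being drained in a loop, and the checksum read once the consumer is done. Note that the value then only covers the bytes that were actually read.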

RE: CRC ContentHandler

2017-02-15 Thread Markus Jelsma
Hello - I don't know if media files even produce SAX events, but if they do you can catch them in your startElement, characters, and endElement methods. I would start collecting element names (qName and/or attribute values) and stuff in the characters method, and append those to a StringBuilder.
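
A rough sketch of the handler described above, using only the JDK's SAX classes. The class name `CrcContentHandler` and its `getValue` method are hypothetical. As a variation on the StringBuilder suggestion, this version feeds each event straight into a `java.util.zip.CRC32` so nothing accumulates in memory. It checksums the SAX events (element names, attributes, text), not the raw bytes of the file, and whether media files produce useful events at all is still an open question.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: updates a CRC-32 incrementally from SAX events
// instead of collecting them in a StringBuilder first.
class CrcContentHandler extends DefaultHandler {
    private final CRC32 crc = new CRC32();

    // Encoding the text as UTF-8 is an assumption; any fixed charset works
    // as long as it is used consistently.
    private void update(String s) {
        crc.update(s.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        update(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            update(atts.getQName(i));
            update(atts.getValue(i));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        update(new String(ch, start, length));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        update(qName);
    }

    // Checksum over all events seen so far.
    long getValue() {
        return crc.getValue();
    }
}
```

Because a CRC is computed incrementally, the result is the same as checksumming the concatenation of all the pieces, so the StringBuilder is not needed unless the text itself is wanted afterwards.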