Mark,

Thanks for this clarification. Getting a new stream the second time is very convenient.

Yes, this answers my question, although it isn't what I observe; I'm probably mistaken in my observation. I'm actually giving the stream to Tika for MIME-type identification if the type isn't already known, then asking Tika to parse it (based on the known type).
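
For reference, the detect-then-parse pattern on a single stream can be sketched in plain Java with mark()/reset(). This is only a sketch under stated assumptions: PeekThenParseSketch, sniffType() and sniffThenRead() are hypothetical names, the "%PDF" check is a toy stand-in rather than Tika's detector, and a byte array stands in for the flowfile content.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; sniffType() is a toy check, not Tika's detector.
class PeekThenParseSketch {

    /** Peeks at up to 4 leading bytes, then rewinds so the caller still sees the whole stream. */
    static String sniffType(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(4);                                  // remember the current position
        byte[] head = new byte[4];
        int n = in.read(head, 0, head.length);       // may return fewer than 4 bytes
        in.reset();                                  // rewind to the marked position
        String prefix = n > 0 ? new String(head, 0, n, StandardCharsets.US_ASCII) : "";
        return prefix.startsWith("%PDF") ? "application/pdf" : "application/octet-stream";
    }

    /** Sniffs the type, then reads the same stream to the end; returns {type, text}. */
    static String[] sniffThenRead(byte[] content) {
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(content))) {
            String type = sniffType(in);
            StringBuilder text = new StringBuilder();
            int b;
            while ((b = in.read()) != -1) {
                text.append((char) b);               // fine for the ASCII sample used here
            }
            return new String[] { type, text.toString() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);       // keep the demo free of checked exceptions
        }
    }

    public static void main(String[] args) {
        String[] result = sniffThenRead("%PDF-1.4 sample".getBytes(StandardCharsets.US_ASCII));
        System.out.println(result[0] + " | " + result[1]);
    }
}
```

If the sniffing step forgets to reset(), or the stream doesn't support marks, the later read starts partway into the content, which is one way the "content's gone" effect can show up.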

I'll stop back by if I need to, but your confirmation tells me what I'm seeing is likely some other effect.

Thanks.


On 09/28/2016 10:42 AM, Mark Payne wrote:
Russ,

Each time that you call session.read(), you're going to get a new InputStream that starts at the beginning of the FlowFile.
So you can just call session.read() twice. For example:

final AtomicBoolean processContents = new AtomicBoolean(false);
session.read(flowFile, new InputStreamCallback() {
     @Override
     public void process(InputStream in) throws IOException {
         // read contents, then record whether a second pass is needed
         processContents.set( someValue );
     }
});

if (processContents.get()) {
     session.read(flowFile, new InputStreamCallback() {
         public void process(InputStream in) {
             // we now have a new InputStream that starts at the beginning of the FlowFile.
         }
     });
}
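
To try the same idea outside NiFi, here is a plain-Java stand-in. It is only a sketch: FreshStreamSketch, ContentSource, StreamCallback and the two helper methods are hypothetical names, not NiFi API, and a byte array stands in for the FlowFile content. What it illustrates is that every read opens its own stream at offset 0, so bytes consumed in the first pass are still there for the second.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for illustration only; none of these names are NiFi API.
class FreshStreamSketch {

    /** Callback shape loosely modeled on the one used in this thread. */
    interface StreamCallback {
        void process(InputStream in) throws IOException;
    }

    /** Stand-in for stored content: every read() opens a new stream at offset 0. */
    static final class ContentSource {
        private final byte[] content;

        ContentSource(byte[] content) {
            this.content = content.clone();
        }

        void read(StreamCallback callback) {
            try (InputStream in = new ByteArrayInputStream(content)) {
                callback.process(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** First pass: consume the leading byte to decide whether to go on. */
    static boolean startsWithAngleBracket(ContentSource source) {
        final AtomicBoolean result = new AtomicBoolean(false);
        source.read(in -> result.set(in.read() == '<'));
        return result.get();
    }

    /** Second pass: read everything from a fresh stream. */
    static String readAll(ContentSource source) {
        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        source.read(in -> {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        });
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ContentSource source =
                new ContentSource("<doc>hello</doc>".getBytes(StandardCharsets.UTF_8));
        if (startsWithAngleBracket(source)) {
            System.out.println(readAll(source));
        }
    }
}
```

The AtomicBoolean is there for the same reason as in the snippet above: the callback can't assign to a local variable, so the result of the first pass is carried out through a mutable holder.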


Does this answer your question sufficiently?

Thanks
-Mark


On Sep 28, 2016, at 12:30 PM, Russell Bateman 
<[email protected]> wrote:

This is more a Java question, I'm guessing. I have experimented unconvincingly 
using Apache Commons I/O TeeInputStream, but let me back up...

I just need to, in some cases, consume the input stream:

   // under some condition, look into the flowfile contents to see if
   // something's there...
   session.read( flowfile, new InputStreamCallback()
   {
      @Override
      public void process( InputStream in ) throws IOException
      {
        // read from in...
      }
   } );


then, later (and always) consume it (so, sometimes a second time):

   session.read( flowfile, new InputStreamCallback()
   {
      @Override
      public void process( InputStream in ) throws IOException
      {
        // read from in...
      }
   } );


Obviously, the content's gone at that point if I've already consumed it.

What should I do here instead? I don't have control over the close(), do I?

Thanks for any comment,

Russ
