Hi Karl,

I'm using the 2.1 release, with only the Solr output connector. If you check the input stream size (document.getBinaryLength()) after the Tika connector, it is zero.

Thanks,
Chalitha

On Fri, Jul 17, 2015 at 6:08 PM, Karl Wright <[email protected]> wrote:

> The document stream contains what tika extracts. If it can't extract
> anything then you will have an empty stream.
>
> It is also possible that if the stream is split, you are tripping over a
> bug that was fixed some time ago. What mcf version is this, and do you
> have more than one output?
>
> Karl
>
> Sent from my Windows Phone
> ------------------------------
> From: chalitha udara Perera
> Sent: 7/17/2015 7:25 AM
> To: [email protected]
> Subject: Repository document stream empty after Tika Transformation
>
> Hi All,
>
> I'm writing a transformation connector to extract low-level features from
> images. First I used that connector without the Tika extractor and it
> worked fine. But when I used it with the Tika connector (after Tika), it
> fails to extract features. After debugging I found out that the stream is
> empty after the Tika transformation.
> Actually, inside the Tika connector, it creates a new in-memory or file
> stream output, but the original input stream is never copied to it. The
> connector should reset the binary stream after using it to get metadata,
> so that the original input stream is passed on from connector to connector.
>
> Here I have attached a simple solution of stream copy and reset that
> worked for me.
>
> Thanks,
> Chalitha
>
> --
> J.M Chalitha Udara Perera
>
> *Department of Computer Science and Engineering,*
> *University of Moratuwa,*
> *Sri Lanka*
>

--
J.M Chalitha Udara Perera

*Department of Computer Science and Engineering,*
*University of Moratuwa,*
*Sri Lanka*
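
For readers of the thread, the "stream copy and reset" idea from the original message could look roughly like the sketch below. This is not the attached patch and it does not use any ManifoldCF or Tika API; the class ReplayableContent and its methods are hypothetical names made up for illustration. It only shows the general technique: buffer the incoming bytes once, then hand a fresh stream to each consumer so that one reader (for example a metadata extractor) does not leave the next one with an empty stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of ManifoldCF): buffers a stream's bytes
// so the same content can be read more than once.
final class ReplayableContent {
    private final byte[] bytes;

    private ReplayableContent(byte[] bytes) {
        this.bytes = bytes;
    }

    // Reads the source stream to the end and keeps a copy in memory.
    static ReplayableContent from(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new ReplayableContent(out.toByteArray());
    }

    // Each call returns an independent stream positioned at the start.
    InputStream newStream() {
        return new ByteArrayInputStream(bytes);
    }

    long length() {
        return bytes.length;
    }

    public static void main(String[] args) throws IOException {
        InputStream original =
            new ByteArrayInputStream("image bytes".getBytes(StandardCharsets.UTF_8));
        ReplayableContent content = ReplayableContent.from(original);

        // First consumer (stands in for metadata extraction) drains its stream.
        InputStream first = content.newStream();
        while (first.read() != -1) {
            // consume everything
        }

        // Second consumer still sees the full content from the beginning.
        InputStream second = content.newStream();
        System.out.println("length for next stage: " + content.length());
        System.out.println("first byte for next stage: " + (char) second.read());
    }
}
```

Buffering in memory is only reasonable for small documents; for large binaries the same pattern would presumably need a temporary file in place of the byte array, in line with the "in memory or file stream" wording in the original message.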
