Hi Karl,

I have attached the result from the File System -> Tika Transform -> Null Output job. Please find the attachment.
Thank you,
Chalitha

On Fri, Jul 17, 2015 at 6:41 PM, Karl Wright <[email protected]> wrote:

> I don't see this here.
>
> I set up the following:
> - file system repository connection
> - null output connection
> - tika extractor
> - a job using all three
>
> Running the job and looking at the simple history, I see null output
> connection ingestion records that have proper document sizes.
>
> Can you repeat the same setup there, and tell me what you get?
>
> Thanks,
> Karl
>
> Sent from my Windows Phone
> ------------------------------
> From: chalitha udara Perera
> Sent: 7/17/2015 8:46 AM
> To: Karl Wright
> Cc: [email protected]
> Subject: Re: Repository document stream empty after Tika Transformation
>
> Hi Karl,
>
> I'm using the 2.1 release and only the Solr output connector. If you
> look at the input stream size (document.getBinaryLength()) after the
> Tika connector, it is zero.
>
> Thanks,
> Chalitha
>
> On Fri, Jul 17, 2015 at 6:08 PM, Karl Wright <[email protected]> wrote:
>
>> The document stream contains what tika extracts. If it can't extract
>> anything then you will have an empty stream.
>>
>> It is also possible that if the stream is split, you are tripping over a
>> bug that was fixed some time ago. What mcf version is this, and do you
>> have more than one output?
>>
>> Karl
>>
>> Sent from my Windows Phone
>> ------------------------------
>> From: chalitha udara Perera
>> Sent: 7/17/2015 7:25 AM
>> To: [email protected]
>> Subject: Repository document stream empty after Tika Transformation
>>
>> Hi All,
>>
>> I'm writing a transformation connector to extract low-level features from
>> images. First I used that connector without the tika extractor and it
>> worked fine. But when I used it with the Tika connector (after tika) it
>> fails to extract features. After debugging I found out that the stream is
>> empty after the tika transformation. Inside the tika connector, a new
>> in-memory or file stream output is created, but the original input stream
>> is never copied to it. The connector should reset the binary stream after
>> using it to get metadata, so that the original input stream is passed on
>> from connector to connector.
>>
>> Here I have attached a simple solution of stream copy and reset that
>> worked for me.
>>
>> Thanks,
>> Chalitha
>>
>> --
>> J.M Chalitha Udara Perera
>>
>> *Department of Computer Science and Engineering,*
>> *University of Moratuwa,*
>> *Sri Lanka*
>>
>
>
>
> --
> J.M Chalitha Udara Perera
>
> *Department of Computer Science and Engineering,*
> *University of Moratuwa,*
> *Sri Lanka*
>

--
J.M Chalitha Udara Perera

*Department of Computer Science and Engineering,*
*University of Moratuwa,*
*Sri Lanka*
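
For readers following the thread, the "stream copy and reset" idea from the original message can be sketched roughly as below. This is not the attached patch and it does not use any ManifoldCF or Tika API; the class name ReplayableStream and its methods are hypothetical, and it assumes the document is small enough to buffer in memory (a larger document would need to be spooled to a temporary file instead). The point is only that once the bytes are copied, one consumer can read the stream to the end for metadata and the next consumer can still get a fresh stream of the same length.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper for illustration; not part of ManifoldCF or Tika. */
class ReplayableStream {
    private final byte[] data;

    /** Copies the whole source stream into memory so it can be replayed. */
    ReplayableStream(InputStream source) {
        this.data = readFully(source);
    }

    private static byte[] readFully(InputStream source) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = source.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns a fresh stream positioned at the start of the copied bytes. */
    ByteArrayInputStream open() {
        return new ByteArrayInputStream(data);
    }

    /** Length of the copied content, the value a downstream stage would see. */
    long length() {
        return data.length;
    }

    public static void main(String[] args) {
        byte[] original = {1, 2, 3, 4};
        ReplayableStream doc = new ReplayableStream(new ByteArrayInputStream(original));

        // First consumer (for example metadata extraction) drains its stream.
        ByteArrayInputStream first = doc.open();
        while (first.read() != -1) {
            // discard
        }

        // Second consumer still sees the full content.
        ByteArrayInputStream second = doc.open();
        System.out.println(doc.length() + " bytes buffered, "
                + second.available() + " bytes available downstream");
    }
}
```

Whether the Tika connector should forward the original bytes or only the extracted text is a design question for the connector itself; the sketch only shows the buffering mechanics.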
