The document stream contains what Tika extracts.  If Tika can't extract
anything, then you will have an empty stream.

It is also possible that, if the stream is split, you are tripping over a
bug that was fixed some time ago.  Which MCF version is this, and do you
have more than one output?

Karl

Sent from my Windows Phone
------------------------------
From: chalitha udara Perera
Sent: 7/17/2015 7:25 AM
To: [email protected]
Subject: Repository document stream empty after Tika Transformation

Hi All,

I'm writing a transformation connector to extract low-level features from
images. First I used that connector without the Tika extractor and it worked
fine. But when I used it with the Tika connector (after Tika), it fails to
extract features. After debugging I found out that the stream is empty
after the Tika transformation.
Inside the Tika connector, a new in-memory or file stream is created for the
output, but the original input stream is never copied to it. The connector
should reset the binary stream after using it to get metadata, so that the
original input stream stays available from connector to connector.

Here I have attached a simple solution of stream copy and reset that worked
for me.
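
The idea of copying and resetting the stream can be sketched roughly as
follows. This is only an illustration in plain java.io, not the attached
patch and not ManifoldCF or Tika code: the class StreamCopySketch and the
method bufferFully are hypothetical names, and it assumes the document is
small enough to hold in memory (a large document would need a temporary
file instead).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of ManifoldCF or Tika.
class StreamCopySketch {

    // Read the whole input stream into a byte array so it can be replayed.
    static byte[] bufferFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the repository document's binary stream.
        InputStream original = new ByteArrayInputStream(
            "raw image bytes".getBytes(StandardCharsets.UTF_8));

        // Copy once, then hand out a stream that supports reset().
        ByteArrayInputStream replayable =
            new ByteArrayInputStream(bufferFully(original));

        // First consumer (e.g. metadata extraction) reads to the end.
        byte[] first = bufferFully(replayable);

        // Rewind so the next connector in the chain sees the same bytes.
        replayable.reset();
        byte[] second = bufferFully(replayable);

        System.out.println(first.length == second.length);
    }
}
```

The reset() call works here because a ByteArrayInputStream with no mark set
rewinds to its start; a general InputStream may not support reset at all,
which is why the copy is made first.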

Thanks,
Chalitha

-- 
J.M Chalitha Udara Perera

*Department of Computer Science and Engineering,*
*University of Moratuwa,*
*Sri Lanka*
