Miklos Pocsaji wrote:
2. I started writing a TextFilter which knows how to extract text from
PDF (I implemented the TextFilter interface). It is simple, I only
have to return a java.io.Reader from which Jackrabbit extracts text.
The obvious but ugly method would be to extract the text to a String and
then return a StringReader, but this would require a lot of memory. I
decided to use PipedReader/PipedWriter: a separate thread writes the
text to a PipedWriter and I return the PipedReader instance from the
doFilter() method. It seems that Jackrabbit doesn't read through the
returned stream immediately.

Indexing of nodes is buffered in Jackrabbit. This may mean that nodes are not added to the index until a query is issued, which would explain why the Reader you return is not read right away.

As far as I can see, you have to make sure that the PipedWriter is not closed until the PipedReader is closed.
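
To illustrate the piped setup discussed above, here is a minimal sketch using only java.io. It is not Jackrabbit code and does not model the indexing buffer: PipedTextSketch, openPipedReader() and readAll() are hypothetical names, and the String chunks stand in for whatever the PDF extraction would produce. In plain java.io the producer thread blocks once the pipe buffer is full, so it stays alive until the consumer gets around to reading; the sketch closes the PipedWriter only after the last chunk has been written, never earlier, and a producer thread that exits without closing its end would cause an IOException on the reading side.

```java
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.Reader;

public class PipedTextSketch {

    // Hypothetical helper: returns a Reader that is fed by a background thread.
    // 'chunks' stands in for text produced incrementally by a PDF extractor.
    static Reader openPipedReader(final String[] chunks) throws IOException {
        final PipedWriter writer = new PipedWriter();
        final PipedReader reader = new PipedReader(writer);

        Thread producer = new Thread(() -> {
            try {
                for (String chunk : chunks) {
                    // Blocks while the pipe buffer is full, i.e. until the
                    // consumer has read some of the text.
                    writer.write(chunk);
                }
            } catch (IOException e) {
                // The reader was closed before everything was written;
                // stop producing.
            } finally {
                try {
                    // Closed only after the last write (or after a failure).
                    writer.close();
                } catch (IOException ignored) {
                    // Nothing sensible left to do here.
                }
            }
        });
        producer.setDaemon(true);
        producer.start();
        return reader;
    }

    // Drains the reader to a String and closes it. Only used to demonstrate
    // the pipe; a real consumer would stream instead of buffering.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[256];
        try {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } finally {
            reader.close();
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        Reader r = openPipedReader(new String[] {"page one ", "page two"});
        Thread.sleep(200); // simulate a consumer that reads later
        System.out.println(readAll(r));
    }
}
```

The design point is that only a buffer's worth of text is held in memory at a time, at the cost of one extra thread per open Reader for as long as the consumer delays reading.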

regards
 marcel
