hello,

i am trying to use tika together with lucene to parse and index very
large xml files, so far without success because of memory limitations.
tika's BodyContentHandler seems to copy the whole content into memory,
which doesn't work because the files are several gigabytes in size.

is there a way around this problem? is there another handler i can use
that works on streams instead of buffering everything?
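
to make the question concrete, here is a minimal sketch of the kind of
streaming handler i have in mind. it uses only the jdk's own sax parser,
not tika, and the class name StreamingTextHandler, the stream() helper
and the chunk consumer are hypothetical names made up for illustration.
the idea is that each chunk of text is passed on as soon as the parser
reports it, so the whole document is never held in memory at once.

```java
import java.io.InputStream;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: forwards character data chunk by chunk
// instead of collecting it in one large buffer.
public class StreamingTextHandler extends DefaultHandler {
    private final Consumer<String> sink;
    private long charsSeen = 0;

    public StreamingTextHandler(Consumer<String> sink) {
        this.sink = sink;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only the current chunk is copied; nothing is accumulated here.
        charsSeen += length;
        sink.accept(new String(ch, start, length));
    }

    public long charsSeen() {
        return charsSeen;
    }

    // Parses the stream with the JDK SAX parser and returns the number
    // of text characters that were forwarded to the sink.
    public static long stream(InputStream in, Consumer<String> sink) throws Exception {
        StreamingTextHandler handler = new StreamingTextHandler(sink);
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return handler.charsSeen();
    }
}
```

in my case the sink would hand each chunk on to the indexing step; how
that would plug into tika and lucene is exactly the part i am unsure
about.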

thanks in advance
zenpunk
