very large xml-file parsing

Mugat Gurkowsky Sat, 13 Sep 2014 06:06:13 -0700

Hello,

I am trying to use Tika in combination with Lucene to parse and index
very large XML files. So far I have had no success because of memory
limitations: Tika's BodyContentHandler seems to copy the whole content
into memory, which doesn't work because the files are several gigabytes
in size.


Is there a way of getting around this problem? Can I use another handler
that can deal with streams?
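
To make the question concrete, here is a minimal sketch of the kind of
stream-oriented handler I have in mind. It uses only the JDK's SAX parser,
not Tika itself; the class and method names (StreamingXmlSketch,
ChunkForwardingHandler, stream) are made up for illustration. As far as I
understand, Tika's Parser.parse() accepts any org.xml.sax.ContentHandler,
so a handler of this shape could in principle be passed to it, but I have
not verified that this avoids the memory problem with Tika.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative sketch only: names here are hypothetical, not part of Tika or Lucene.
class StreamingXmlSketch {

    // Forwards each text chunk to a sink as it arrives instead of
    // accumulating the whole document body in memory.
    static class ChunkForwardingHandler extends DefaultHandler {
        private final Consumer<String> sink;
        private long charsSeen = 0;

        ChunkForwardingHandler(Consumer<String> sink) {
            this.sink = sink;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            charsSeen += length;
            // The parser may deliver one text node in several chunks.
            sink.accept(new String(ch, start, length));
        }

        long charsSeen() {
            return charsSeen;
        }
    }

    // Parses the stream with the JDK SAX parser and returns the number of
    // text characters that were handed to the sink.
    static long stream(InputStream in, Consumer<String> sink) throws Exception {
        ChunkForwardingHandler handler = new ChunkForwardingHandler(sink);
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return handler.charsSeen();
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<docs><doc>first</doc><doc>second</doc></docs>"
                .getBytes(StandardCharsets.UTF_8);
        long n = stream(new ByteArrayInputStream(xml),
                chunk -> System.out.println("chunk: " + chunk));
        System.out.println("characters seen: " + n);
    }
}
```

In a real setup the sink would hand each chunk (or each record) on to the
indexer rather than print it, and the input would be a FileInputStream on
the large file instead of an in-memory byte array.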

thanks in advance
zenpunk
