Hello, I am trying to use Tika in combination with Lucene to parse and index very large XML files. So far without success, because of memory limitations. Tika's BodyContentHandler seems to copy the whole content into memory, which doesn't work because the files are several gigabytes in size.
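To make clearer what I mean by handling the content as a stream, here is a rough sketch of the behaviour I am after. It uses only the JDK's SAX API, not Tika or Lucene. The class name ChunkingHandler, the chunkSize parameter and the chunksOf helper are made up for illustration. The idea is that text is passed on in small pieces as the SAX events arrive, instead of being collected in one big buffer.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler (not part of Tika): forwards character data to a sink
// in pieces of at most chunkSize characters, so memory use stays bounded.
class ChunkingHandler extends DefaultHandler {
    private final int chunkSize;
    private final Consumer<String> sink;
    private final StringBuilder buffer = new StringBuilder();

    ChunkingHandler(int chunkSize, Consumer<String> sink) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
        this.sink = sink;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length);
        // Hand over full chunks right away; keep only the remainder.
        while (buffer.length() >= chunkSize) {
            sink.accept(buffer.substring(0, chunkSize));
            buffer.delete(0, chunkSize);
        }
    }

    @Override
    public void endDocument() {
        // Flush whatever is left once the document ends.
        if (buffer.length() > 0) {
            sink.accept(buffer.toString());
            buffer.setLength(0);
        }
    }

    // Demo helper only: collects the chunks in a list, which is fine for a
    // small test input but would defeat the purpose for a multi-gigabyte file.
    static List<String> chunksOf(InputStream xml, int chunkSize) throws Exception {
        List<String> chunks = new ArrayList<>();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(xml, new ChunkingHandler(chunkSize, chunks::add));
        return chunks;
    }
}
```

In a real setup the sink would hand each piece on to the indexing side rather than to a list. Whether Tika and Lucene can be wired together like this is exactly what I don't know.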
Is there a way around this problem? Can I use another handler that can deal with streams?
Thanks in advance,
zenpunk
