Hi,

On 10/22/07, Keith R. Bennett <[EMAIL PROTECTED]> wrote:
> I was thinking that as a ContentHandler, the user could choose to place all
> the data in memory, and there would be a single copy of the full text.
>
> As the ParserPostProcessor, if I understand correctly, the user is bound to
> consume the extra memory if using the AutoDetectParser, and we are probably
> consuming twice as much memory to do so, since we would be saving the full
> text in two different string writers.

I don't quite follow you. AutoDetectParser never reads the full
content into memory (unless, of course, an underlying parser does so).

> So I was thinking of moving the existing logic from the ParserPostProcessor
> to a ContentHandler implementation.

Sure, why not.

If I understand you correctly, you'd prefer something like this:

    Parser parser = ...;
    Metadata metadata = new Metadata();
    parser.parse(..., new FullTextContentHandler(metadata), metadata);

over:

    Parser parser = new ParserPostProcessor(...);
    Metadata metadata = new Metadata();
    parser.parse(..., new DefaultHandler(), metadata);
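
For illustration, here is a rough sketch of what such a handler could look like. It is only an assumption on my part, not existing code: the class name FullTextContentHandler, the "fulltext" key, and the use of a plain Map as a stand-in for Metadata (to keep the example self-contained) are all hypothetical. It relies only on the standard SAX DefaultHandler.

```java
import java.util.Map;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: collects all character data from the SAX event
// stream into a single in-memory buffer.
class FullTextContentHandler extends DefaultHandler {

    // Hypothetical key under which the collected text is stored.
    static final String FULL_TEXT_KEY = "fulltext";

    // The only copy of the full text held by this handler.
    private final StringBuilder text = new StringBuilder();

    // Stand-in for a Metadata object (assumption for this sketch).
    private final Map<String, String> metadata;

    FullTextContentHandler(Map<String, String> metadata) {
        this.metadata = metadata;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Append each chunk of text as the parser reports it.
        text.append(ch, start, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        // Keep whitespace so that words from adjacent elements stay separated.
        text.append(ch, start, length);
    }

    @Override
    public void endDocument() {
        // Publish the collected text once parsing has finished.
        metadata.put(FULL_TEXT_KEY, text.toString());
    }

    String getText() {
        return text.toString();
    }
}
```

With something like this, buffering the full text becomes a choice the caller makes by picking the handler; a caller who passes a streaming handler instead never pays for the buffer.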

BR,

Jukka Zitting
