On Tue, Aug 30, 2011 at 4:30 PM, Jukka Zitting <[email protected]> wrote:

>> I think Tika.parseToString (static sugar method) closes the
>> InputStream for you, while the Parser.parse method does not?
>> Kinda confusing!
>
> It is, but there's a design behind this. :-) The basic idea behind
> resource handling in Tika is that whoever opens a stream or another
> resource is also responsible for properly closing or releasing it.
> This (I think) is also pretty well documented in Tika's key
> interfaces.

I think that's a good approach in general (if you open it, you close
it).  It does look like it's well documented, which is great.

> The exception to this rule is the Tika.parseToString(InputStream,
> Metadata) method. The reason for making an exception here is that the
> Tika facade class was designed primarily for convenience and to
> minimize the amount of code that the consumer of the API needs to
> write. The proper resource management pattern in this case would have
> been:
>
>    InputStream stream = ...;
>    try {
>        return tika.parseToString(stream, ...);
>    } finally {
>        stream.close();
>    }
>
> However, since in this specific case the client application hardly
> ever needs to use the stream for anything else and since in pretty
> much all cases the stream in question is constructed right when the
> parseToString call is made, it makes more sense for the
> parseToString() method to take care of closing the stream. The result
> is that the above code can be reduced to:
>
>    return tika.parseToString(..., ...);

I can appreciate the motivation here, i.e. "convenience trumps
consistency".

Still, this inconsistency can lead to confusion and to bugs, but since
only one API is "the exception", I think it's OK?

But maybe we can beef up its javadocs a bit, saying "NOTE: unlike all
other APIs parsing from an InputStream, this API closes the incoming
InputStream for you for convenience" or something?
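
To make the difference concrete, here is a minimal sketch of the two
ownership styles. It does not use Tika at all: readLeavingOpen,
readAndClose, TrackingStream and closedAfter are hypothetical stand-ins
invented for illustration, and the javadoc on readAndClose is only one
possible wording of the kind of note suggested above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamOwnershipSketch {

    // Hypothetical helper: an in-memory stream that records whether close() was called.
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(String text) {
            super(text.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Stand-in for the general rule: reads everything, leaves closing to the caller.
    static String readLeavingOpen(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /**
     * Stand-in for the convenience exception.
     * NOTE: unlike readLeavingOpen, this method closes the incoming
     * InputStream for you, even if reading fails.
     */
    static String readAndClose(InputStream in) throws IOException {
        try {
            return readLeavingOpen(in);
        } finally {
            in.close();
        }
    }

    // Reports whether the stream ended up closed after the chosen call.
    static boolean closedAfter(boolean calleeCloses) {
        TrackingStream stream = new TrackingStream("hello");
        try {
            if (calleeCloses) {
                readAndClose(stream);
            } else {
                readLeavingOpen(stream);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return stream.closed;
    }

    public static void main(String[] args) {
        System.out.println("caller-closes style, closed afterwards: " + closedAfter(false));
        System.out.println("callee-closes style, closed afterwards: " + closedAfter(true));
    }
}
```

A caller who assumes every method behaves like readLeavingOpen may try
to reuse the stream after readAndClose, which is the sort of bug a
prominent javadoc note would help avoid.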

Mike McCandless

http://blog.mikemccandless.com
