Hi Tim,
Absolutely, see TIKA-2244. This PR primarily helps by detecting when a
CloseShieldInputStream supports mark. The previous mechanism checked whether
the class of the InputStream was one of a couple known to support it, but
trusting markSupported() seems to work quite well and avoids the false
negatives that were causing unnecessary BufferedInputStream allocations. While
I was looking, I also found similar issues in PackageParser and
CompressorParser, so I went ahead and used the same logic in those places.
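
To illustrate the idea, here is a minimal sketch. It is not the actual patch: the
class name MarkSupportSketch and the helper ensureMarkSupported are hypothetical,
and an anonymous java.io.FilterInputStream stands in for CloseShieldInputStream
so the example needs nothing beyond the JDK (the assumption being that both
simply delegate markSupported() to the stream they wrap).

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

public class MarkSupportSketch {

    // Hypothetical helper: ask the stream itself whether it supports mark,
    // and only allocate a BufferedInputStream when it says no.
    static InputStream ensureMarkSupported(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    public static void main(String[] args) {
        // ByteArrayInputStream supports mark/reset on its own.
        InputStream base = new ByteArrayInputStream(new byte[] {1, 2, 3});

        // Stand-in for a delegating wrapper such as CloseShieldInputStream:
        // FilterInputStream forwards markSupported() to the wrapped stream.
        // A check based on the wrapper's class would miss this case.
        InputStream shielded = new FilterInputStream(base) { };

        InputStream result = ensureMarkSupported(shielded);
        System.out.println("wrapped again: " + (result != shielded));
        System.out.println("mark supported: " + result.markSupported());
    }
}
```

With nested package files, each level that skips the extra wrapper also skips
one more buffer allocation, which is where the savings add up.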
Thanks,
Josh
> On Jan 18, 2017, at 1:51 PM, Allison, Timothy B. <[email protected]> wrote:
> 
> Josh,
>  Thank you for this PR.  Would you be able to open an issue on our JIRA as 
> well?  Can you explain in a bit more detail how this patch helps?
>  Thank you, again.
> 
>            Best,
> 
>                  Tim
> 
> -----Original Message-----
> From: joshbooks [mailto:[email protected]] 
> Sent: Wednesday, January 18, 2017 4:15 PM
> To: [email protected]
> Subject: [GitHub] tika pull request #148: be more parsimonious wrapping 
> streams
> 
> GitHub user joshbooks opened a pull request:
> 
>    https://github.com/apache/tika/pull/148
> 
>    be more parsimonious wrapping streams
> 
>    it looks like a bunch of streams were getting wrapped in 
> BufferedInputStreams just to make extra double sure that mark was supported, 
> but this is not as harmless as it might otherwise seem when you run into big 
> nested package files
> 
> You can merge this pull request into a Git repository by running:
> 
>    $ git pull https://github.com/joshbooks/tika master
> 
> Alternatively you can review and apply these changes as the patch at:
> 
>    https://github.com/apache/tika/pull/148.patch
> 
> To close this pull request, make a commit to your master/trunk branch with 
> (at least) the following in the commit message:
> 
>    This closes #148
> 
> ----
> commit 896c46a0c652de436da0e4f25bfa53a7d83ae02f
> Author: Joshua Hight <[email protected]>
> Date:   2017-01-18T21:10:03Z
> 
>    be more parsimonious wrapping streams
> 
>    it looks like a bunch of streams were getting wrapped in
>    BufferedInputStream just to make extra double sure that mark was
>    supported, but this is not as harmless as it might otherwise seem when
>    you run into big nested package files
> 
> commit 9477d03e10149a8ec6b5d6889e2fd2317d2ed5f5
> Author: Joshua Hight <[email protected]>
> Date:   2017-01-18T21:13:05Z
> 
>    Merge remote-tracking branch 'apache/master'
> 
> ----
> 
> 
> ---
> If your project is set up for it, you can reply to this email and have your 
> reply appear on GitHub as well. If your project does not have this feature 
> enabled and wishes so, or if the feature is enabled but not working, please 
> contact infrastructure at [email protected] or file a JIRA ticket 
> with INFRA.
> ---