[
https://issues.apache.org/jira/browse/NUTCH-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229100#comment-17229100
]
ASF GitHub Bot commented on NUTCH-2582:
---------------------------------------
sebastian-nagel opened a new pull request #554:
URL: https://github.com/apache/nutch/pull/554
- add method in MimeUtil to set MimeTypesReader pool size
- actually adjust pool size to half the number of Fetcher threads
  (the minimum pool size is 10, which applies when there are fewer than 20 Fetcher threads)
- double pool size (10 -> 20) of Tika XMLReaderUtils in tika-config.xml
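
The sizing rule in the second item can be sketched as follows. This is only an illustration of the arithmetic, not the code from the pull request: the class name `PoolSizing` and the method `poolSizeFor` are hypothetical, and integer division is assumed for the "threads / 2" step.

```java
/**
 * Hypothetical helper illustrating the pool sizing rule described above.
 * Names are illustrative and do not come from Nutch or Tika.
 */
public final class PoolSizing {

    /** Minimum pool size named in the pull request description. */
    static final int MIN_POOL_SIZE = 10;

    private PoolSizing() {}

    /**
     * Half the number of Fetcher threads, but never below the minimum.
     * Integer division is an assumption; the description only says "threads / 2".
     */
    static int poolSizeFor(int fetcherThreads) {
        return Math.max(MIN_POOL_SIZE, fetcherThreads / 2);
    }

    public static void main(String[] args) {
        // Print the resulting pool size for a few thread counts.
        for (int threads : new int[] {1, 10, 20, 50, 200}) {
            System.out.println(threads + " threads -> pool size " + poolSizeFor(threads));
        }
    }
}
```

With this rule, any thread count below 20 yields the minimum of 10, and the pool only grows once the Fetcher runs 22 or more threads.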
----------------------------------------------------------------
> Set pool size of XML SAX parsers used for MIME detection in Tika 1.19
> ---------------------------------------------------------------------
>
> Key: NUTCH-2582
> URL: https://issues.apache.org/jira/browse/NUTCH-2582
> Project: Nutch
> Issue Type: Improvement
> Components: protocol
> Affects Versions: 1.15
> Reporter: Sebastian Nagel
> Priority: Major
> Fix For: 1.18
>
>
> See
> [NUTCH-2578|https://issues.apache.org/jira/browse/NUTCH-2578?focusedCommentId=16482879&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16482879].
> Tika 1.19 will use a pool of SAX parsers to avoid the bottleneck of
> creating a new one (see NUTCH-2578/TIKA-2645). Fetcher should adjust the size
> of the pool to the number of Fetcher threads (or a fraction of it, because
> most threads are likely to be busy fetching content).