[jira] Commented: (TIKA-416) Out-of-process text extraction

Jukka Zitting (JIRA) Thu, 27 May 2010 14:52:05 -0700

    [ 
https://issues.apache.org/jira/browse/TIKA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872395#action_12872395
 ]


Jukka Zitting commented on TIKA-416:
------------------------------------

See http://jukkaz.wordpress.com/2010/05/27/forking-a-jvm/ for a summary of my 
current approach to achieving this.

> Out-of-process text extraction
> ------------------------------
>
>                 Key: TIKA-416
>                 URL: https://issues.apache.org/jira/browse/TIKA-416
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Priority: Minor
>
> There's currently no easy way to guard against JVM crashes or excessive 
> memory or CPU use caused by parsing very large, broken or intentionally 
> malicious input documents. To better protect against such cases and to 
> generally improve the manageability of resource consumption by Tika it would 
> be great if we had a way to run Tika parsers in separate JVM processes. This 
> could be handled either as a separate "Tika parser daemon" or as an 
> explicitly managed pool of forked JVMs.
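
As a rough illustration of the forked-JVM option described above, the following
is a minimal sketch and not the approach from the linked post or any Tika API.
It starts a child JVM with a capped heap and kills it if it exceeds a time
limit. The class name ForkedJvm, the child main class passed in, and the limits
are all hypothetical; only standard java.lang.ProcessBuilder and Process calls
are used.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Tika: forks a child JVM with resource limits.
final class ForkedJvm {

    /** Builds the command line for a child JVM whose heap is capped at maxHeapMb. */
    static List<String> command(int maxHeapMb, String mainClass, String... args) {
        List<String> cmd = new ArrayList<>();
        // Reuse the java binary and classpath of the current (parent) JVM.
        cmd.add(System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java");
        cmd.add("-Xmx" + maxHeapMb + "m");
        cmd.add("-cp");
        cmd.add(System.getProperty("java.class.path"));
        cmd.add(mainClass);
        for (String arg : args) {
            cmd.add(arg);
        }
        return cmd;
    }

    /**
     * Runs the command and returns the child's exit code. If the child is still
     * running after timeoutSeconds, it is killed and a TimeoutException is thrown.
     */
    static int run(List<String> cmd, long timeoutSeconds)
            throws IOException, InterruptedException, TimeoutException {
        Process child = new ProcessBuilder(cmd).inheritIO().start();
        if (!child.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            child.destroyForcibly().waitFor();
            throw new TimeoutException("child JVM exceeded " + timeoutSeconds + "s");
        }
        return child.exitValue();
    }
}
```

A caller might use ForkedJvm.run(ForkedJvm.command(64, "org.example.ParserMain",
"input.pdf"), 30), where org.example.ParserMain is a placeholder for whatever
class performs the parsing in the child. A crash or OutOfMemoryError in the
child then surfaces in the parent as a non-zero exit code instead of taking the
parent down. A pooled or daemon variant would additionally need a protocol for
passing documents and results between the processes, which this sketch omits.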

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
