Adam Rauch
Wed, 27 Jan 2010 08:00:46 -0800
Issue created: http://issues.apache.org/jira/browse/TIKA-374

-----Original Message-----
From: Jukka Zitting [mailto:jukka.zitt...@gmail.com]
Sent: Tuesday, January 26, 2010 10:24 AM
To: tika-user@lucene.apache.org
Subject: Re: AutoDetectParser not thread-safe?

Hi,

On Tue, Jan 26, 2010 at 6:50 PM, Adam Rauch <a...@labkey.com> wrote:
> We are using Tika 0.5 to parse files that are added to a Lucene index. If
> we assign multiple threads to the parsing task we find that the
> AutoDetectParser.parse() method will occasionally fail to return. In our
> case, it appears that a HashMap inside Xerces gets corrupted, causing an
> infinite loop inside HashMap.get(). This seems to be a concurrency problem;
> we have not seen the issue when running single threaded.

Hmm, that's indeed quite troublesome.

> I can open a JIRA issue if you'd prefer.

That would be great. Thanks to your in-depth analysis of the problem
it should be easy to come up with a fix.

BR,

Jukka Zitting