Hi Karl,
Okay.

Regarding the out-of-memory error: today is the last day I can spend looking for where the error comes from. After that, I have to go into production to meet my deadlines. I hope to find time in the future to fix this problem on this server; otherwise I will not be able to index it.

Unfortunately, it is very difficult to find the documents that cause this error. I did not find any trace in the database, and even in debug mode it is hard to identify the problematic document. Limiting the crawl to 1 thread might make it easier to find, but I am afraid the crawl would then take a very long time.

Do you have an idea of the best method for finding this document or these documents?

Maxence

From: Karl Wright [mailto:[email protected]]
Sent: Friday, July 27, 2018 12:47
To: dev <[email protected]>; [email protected]
Subject: Tika/POI bugs

Hi all,

I've easily spent 40 hours over the last two weeks chasing down bugs in Apache Tika and POI. The two kinds I see are "ClassNotFound" (due to usage of the wrong ClassLoader) and "OutOfMemoryError" (not yet clear what it is due to).

I don't have enough time to create tickets directly in Tika for all possible documents where these failures occur, so I urge our users to create tickets DIRECTLY in the Tika project in Jira. I guess you can let the Tika people create the POI tickets, if need be.

For OutOfMemory problems, please attach the file that causes the problem to the ticket, and also state the amount of memory you gave the agents process. For ClassNotFound problems, also include the stack trace.

Thanks in advance,
Karl
