Benoit Tellier created JAMES-3719:
-------------------------------------

             Summary: Reactify Tika calls
                 Key: JAMES-3719
                 URL: https://issues.apache.org/jira/browse/JAMES-3719
             Project: James Server
          Issue Type: Improvement
          Components: elasticsearch
    Affects Versions: 3.7.0
            Reporter: Benoit Tellier
             Fix For: 3.8.0

We rely on blocking HTTP calls to extract textual content with Tika. This means:

 - Threads hang around while we do the requests...
 - We are blocking on a parallel Reactor thread (cassandra-app), which is dramatic performance-wise.

We can improve this by using reactor-netty to query Tika.

Caching layers need to be adapted too: Guava is blocking. The Caffeine library can be a good candidate as a reactive caching library.

Also, we need to decouple MIME parsing and content extraction: both are currently tightly coupled. I suggest extracting a POJO representation of the mail first, then extracting content if need be, rather than doing both at the same time.
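
The shape of the proposal can be sketched as follows. This is only an illustration under stated assumptions, not the James implementation: it uses the JDK's own `java.net.http.HttpClient` and a `ConcurrentHashMap` of futures as stand-ins for reactor-netty and Caffeine, so that it runs without extra dependencies. The class name `AsyncTextExtractor`, the helper `tikaOverHttp`, and the choice of the Tika server `/tika` endpoint with `Accept: text/plain` are all hypothetical choices made for the example.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: non-blocking text extraction with a non-blocking cache.
final class AsyncTextExtractor {
    // The backend returns a future immediately instead of blocking the caller.
    private final Function<byte[], CompletableFuture<String>> backend;
    // Caching the future (not the value) lets concurrent callers share one request.
    private final ConcurrentHashMap<String, CompletableFuture<String>> cache =
        new ConcurrentHashMap<>();

    AsyncTextExtractor(Function<byte[], CompletableFuture<String>> backend) {
        this.backend = backend;
    }

    // Assumed backend: PUT the bytes to <tikaBase>/tika and read plain text back.
    static Function<byte[], CompletableFuture<String>> tikaOverHttp(URI tikaBase) {
        HttpClient client = HttpClient.newHttpClient();
        return content -> {
            HttpRequest request = HttpRequest.newBuilder(tikaBase.resolve("/tika"))
                .header("Accept", "text/plain")
                .PUT(HttpRequest.BodyPublishers.ofByteArray(content))
                .build();
            // sendAsync does not park the calling thread while waiting for Tika.
            return client
                .sendAsync(request, HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8))
                .thenApply(HttpResponse::body);
        };
    }

    // Takes already-parsed content, so MIME parsing stays a separate, earlier step.
    CompletableFuture<String> extract(byte[] content) {
        String key = sha256(content);
        CompletableFuture<String> result =
            cache.computeIfAbsent(key, k -> backend.apply(content));
        // Do not keep failed extractions around: a later call should retry.
        result.whenComplete((text, error) -> {
            if (error != null) {
                cache.remove(key, result);
            }
        });
        return result;
    }

    private static String sha256(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every JVM", e);
        }
    }
}
```

A caller would build it with something like `new AsyncTextExtractor(AsyncTextExtractor.tikaOverHttp(URI.create("http://localhost:9998")))` (the URL is an assumption) and invoke `extract` only for the parts of an already-parsed mail that actually need text extraction. In a Reactor code base the returned future could be bridged with `Mono.fromFuture`.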