Benoit Tellier created JAMES-3719:
-------------------------------------

             Summary: Reactify Tika calls
                 Key: JAMES-3719
                 URL: https://issues.apache.org/jira/browse/JAMES-3719
             Project: James Server
          Issue Type: Improvement
          Components: elasticsearch
    Affects Versions: 3.7.0
            Reporter: Benoit Tellier
             Fix For: 3.8.0


We rely on blocking HTTP calls to extract textual content with Tika.

This means:
 - Threads hang around while we do the requests...
 - We are blocking in a parallel reactor thread (cassandra-app), which is 
dramatic performance-wise.

We can improve this by using reactor-netty to query Tika. 
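
As a rough illustration of what a non-blocking Tika call could look like, here
is a minimal sketch. It is not the James implementation: it swaps reactor-netty
for the JDK's own java.net.http.HttpClient so that it stays dependency-free, and
it assumes a Tika server exposing its plain-text extraction resource at "/tika".
The class name AsyncTikaClient and its methods are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

// Hypothetical client: the names here are illustrative, not James code.
final class AsyncTikaClient {
    private final HttpClient httpClient = HttpClient.newHttpClient();
    private final URI tikaEndpoint;

    AsyncTikaClient(URI tikaBaseUri) {
        // Assumption: the Tika server serves text extraction under "/tika".
        this.tikaEndpoint = tikaBaseUri.resolve("/tika");
    }

    // Builds the PUT request carrying the document to extract.
    HttpRequest buildRequest(byte[] document, String contentType) {
        return HttpRequest.newBuilder(tikaEndpoint)
            .header("Content-Type", contentType)
            .header("Accept", "text/plain")
            .PUT(HttpRequest.BodyPublishers.ofByteArray(document))
            .build();
    }

    // Returns immediately; the calling thread is not parked while Tika works.
    CompletableFuture<String> extractText(byte[] document, String contentType) {
        return httpClient
            .sendAsync(buildRequest(document, contentType), HttpResponse.BodyHandlers.ofString())
            .thenApply(HttpResponse::body);
    }
}
```

With reactor-netty the same idea would be expressed as a Mono rather than a
CompletableFuture; the point is only that no thread waits on the response.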

Caching layers need to be adapted too: Guava is blocking. The Caffeine library 
could be a good candidate, as it offers an asynchronous cache.
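
To make the caching concern concrete, here is a minimal sketch of a
non-blocking cache. It deliberately uses a plain ConcurrentHashMap of
CompletableFuture values as a stand-in for a real caching library (no eviction,
no size bound), and the class name AsyncExtractionCache and the "contentHash"
key are assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for an asynchronous cache; not a Caffeine or James API.
final class AsyncExtractionCache {
    private final ConcurrentHashMap<String, CompletableFuture<String>> entries =
        new ConcurrentHashMap<>();

    // The loader must only start the asynchronous work and return its future,
    // so no caller ever blocks while the value is being computed.
    CompletableFuture<String> get(String contentHash,
                                  Function<String, CompletableFuture<String>> loader) {
        CompletableFuture<String> future = entries.computeIfAbsent(contentHash, loader);
        // Drop failed extractions so that a later call can retry.
        future.whenComplete((value, error) -> {
            if (error != null) {
                entries.remove(contentHash, future);
            }
        });
        return future;
    }
}
```

Concurrent callers asking for the same key share a single in-flight future, so
the extraction runs once and nobody waits on a lock.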

Also, we need to decouple MIME parsing and content extraction: both are 
currently tightly coupled. I suggest extracting a POJO representation of the 
mail first, then extracting content if need be, rather than doing both at the 
same time.
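
One possible shape for that separation is sketched below. All type names
(ParsedAttachment, ParsedMessage, ContentExtraction) are hypothetical and do not
exist in James; the extractor function stands for whatever asynchronous Tika
call ends up being used.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

// Step 1 output: a plain representation of the mail, produced by MIME parsing
// alone. No text extraction has happened yet.
final class ParsedAttachment {
    final String contentType;
    final byte[] content;

    ParsedAttachment(String contentType, byte[] content) {
        this.contentType = contentType;
        this.content = content;
    }
}

final class ParsedMessage {
    final String subject;
    final List<ParsedAttachment> attachments;

    ParsedMessage(String subject, List<ParsedAttachment> attachments) {
        this.subject = subject;
        this.attachments = attachments;
    }
}

// Step 2: extraction is a separate, optional, asynchronous operation.
final class ContentExtraction {
    static CompletableFuture<List<String>> extractAttachmentTexts(
            ParsedMessage message,
            Function<ParsedAttachment, CompletableFuture<String>> extractor) {
        // Start one extraction per attachment without waiting on any of them.
        List<CompletableFuture<String>> futures = message.attachments.stream()
            .map(extractor)
            .collect(Collectors.toList());
        // Complete once all extractions are done; join() no longer blocks then.
        return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
            .thenApply(done -> futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList()));
    }
}
```

Callers that only need headers or the subject can stop after parsing and never
pay for a Tika round trip.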



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org
For additional commands, e-mail: server-dev-h...@james.apache.org
