Benoit Tellier created JAMES-3719:
-------------------------------------
Summary: Reactify Tika calls
Key: JAMES-3719
URL: https://issues.apache.org/jira/browse/JAMES-3719
Project: James Server
Issue Type: Improvement
Components: elasticsearch
Affects Versions: 3.7.0
Reporter: Benoit Tellier
Fix For: 3.8.0
We rely on blocking HTTP calls to extract textual content with Tika.
This means:
- Threads hang around while we do the requests.
- We are blocking in a parallel Reactor thread (cassandra-app), which
badly hurts performance.
We can improve this by using reactor-netty to query Tika.
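
To illustrate the non-blocking call shape, here is a minimal sketch. It uses the
JDK's java.net.http.HttpClient as a dependency-free stand-in for reactor-netty,
so it is not the proposed implementation. The class name NonBlockingTikaClient,
the PUT /tika endpoint and the plain-text response are assumptions made for the
example, not details taken from the James code base.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper, not part of James: text extraction without blocking the caller.
final class NonBlockingTikaClient {
    private final HttpClient http;
    private final URI baseUri;

    NonBlockingTikaClient(HttpClient http, URI baseUri) {
        this.http = http;
        this.baseUri = baseUri;
    }

    // Assumption: the Tika server accepts PUT /tika and answers with plain text.
    HttpRequest buildRequest(byte[] content, String contentType) {
        return HttpRequest.newBuilder(baseUri.resolve("/tika"))
            .header("Content-Type", contentType)
            .header("Accept", "text/plain")
            .PUT(HttpRequest.BodyPublishers.ofByteArray(content))
            .build();
    }

    // Returns immediately; the calling thread is not parked while Tika works.
    CompletableFuture<String> extractText(byte[] content, String contentType) {
        return http
            .sendAsync(buildRequest(content, contentType), HttpResponse.BodyHandlers.ofString())
            .thenApply(HttpResponse::body);
    }
}
```

The caller composes on the returned future instead of waiting for it, which is
the same shape a Mono-returning reactor-netty client would give.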
Caching layers need to be adapted too: Guava is blocking. The Caffeine library,
which offers an asynchronous cache, can be a good candidate.
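
The idea behind a non-blocking cache can be sketched with the JDK alone: store
the in-flight future rather than the computed value, so concurrent lookups share
one extraction and nobody waits on a thread. This is only an illustration of the
principle, not Caffeine's API. AsyncExtractionCache is a hypothetical name, and
the sketch has no eviction or size bound, which a real cache library provides.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical sketch: caches the future itself, so lookups never block.
final class AsyncExtractionCache<K, V> {
    private final ConcurrentMap<K, CompletableFuture<V>> entries = new ConcurrentHashMap<>();

    CompletableFuture<V> get(K key, Function<K, CompletableFuture<V>> loader) {
        CompletableFuture<V> created = new CompletableFuture<>();
        CompletableFuture<V> existing = entries.putIfAbsent(key, created);
        if (existing != null) {
            return existing; // cache hit, or a load that is already in flight
        }
        // Only the caller that won the race starts the load.
        try {
            loader.apply(key).whenComplete((value, error) -> {
                if (error != null) {
                    entries.remove(key, created); // do not cache failures
                    created.completeExceptionally(error);
                } else {
                    created.complete(value);
                }
            });
        } catch (RuntimeException e) {
            // The loader failed before returning a future.
            entries.remove(key, created);
            created.completeExceptionally(e);
        }
        return created;
    }
}
```

Two callers asking for the same key while the first extraction is still running
receive the same future, so the loader runs once.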
Also, we need to decouple MIME parsing from content extraction, which are
currently tightly coupled. I suggest extracting a POJO representation of the
mail first, then extracting content if need be, rather than doing both at the
same time.
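
One possible shape for that split is sketched below. Parsing yields plain
objects, and text extraction is a separate asynchronous step that runs only when
requested. All names here (ParsedMessage, ParsedAttachment, TextExtractor,
AttachmentTextIndexer) are hypothetical and chosen for the example. They are not
existing James types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical POJOs: what MIME parsing alone produces, with no Tika involved.
final class ParsedAttachment {
    final String fileName;
    final String contentType;
    final byte[] content;

    ParsedAttachment(String fileName, String contentType, byte[] content) {
        this.fileName = fileName;
        this.contentType = contentType;
        this.content = content;
    }
}

final class ParsedMessage {
    final String subject;
    final String textBody;
    final List<ParsedAttachment> attachments;

    ParsedMessage(String subject, String textBody, List<ParsedAttachment> attachments) {
        this.subject = subject;
        this.textBody = textBody;
        this.attachments = Collections.unmodifiableList(new ArrayList<>(attachments));
    }
}

// Hypothetical port: a Tika-backed implementation would live behind it.
interface TextExtractor {
    CompletableFuture<String> extract(ParsedAttachment attachment);
}

final class AttachmentTextIndexer {
    // Optional second step, applied to an already parsed message.
    static CompletableFuture<List<String>> extractAttachmentTexts(ParsedMessage message,
                                                                  TextExtractor extractor) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (ParsedAttachment attachment : message.attachments) {
            futures.add(extractor.extract(attachment));
        }
        return CompletableFuture.allOf(futures.toArray(new CompletableFuture<?>[0]))
            .thenApply(ignored -> {
                List<String> texts = new ArrayList<>();
                for (CompletableFuture<String> future : futures) {
                    texts.add(future.join()); // already complete, so this does not block
                }
                return texts;
            });
    }
}
```

With this split, code that only needs headers or the text body can work on the
ParsedMessage and never trigger an extraction call.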