[ 
https://issues.apache.org/jira/browse/JAMES-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713620#comment-17713620
 ] 

Benoit Tellier commented on JAMES-3777:
---------------------------------------

https://github.com/apache/james-project/pull/1530 is a Proof of Concept 
regarding 
https://issues.apache.org/jira/browse/JAMES-3777?focusedCommentId=17713154&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17713154

I also succeeded in including rule updates.

Now the "wiring work" remains: plug this into the event system, serialize this 
new event, handle it in a subscriber, document the upgrade, and add a 
system property to disable this behaviour (just to ensure rolling upgrades)...
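As a rough illustration of that wiring, here is a minimal sketch of what the incremental events and the rolling-upgrade escape hatch could look like. All names below are hypothetical, not the actual PR #1530 code:

```java
import java.util.List;

public class FilterEvents {
    // Hypothetical event hierarchy; the real James classes differ.
    interface FilteringEvent {}

    // Existing behaviour: every change rewrites the full rule list.
    record RuleSetReset(List<String> allRules) implements FilteringEvent {}

    // New behaviour: store only the delta.
    record RuleAdded(String rule) implements FilteringEvent {}
    record RuleRemoved(String rule) implements FilteringEvent {}

    static boolean incrementalEventsEnabled() {
        // Rolling-upgrade escape hatch: older nodes cannot deserialize the
        // new event types, so operators can disable emitting them.
        // The property name is illustrative.
        return !Boolean.getBoolean("james.filters.disable.incremental.events");
    }

    public static void main(String[] args) {
        FilteringEvent event = incrementalEventsEnabled()
            ? new RuleAdded("rule-C")
            : new RuleSetReset(List.of("rule-A", "rule-B", "rule-C"));
        System.out.println(event);
    }
}
```

Nodes that have not yet been upgraded would run with the property set until the whole cluster speaks the new event format.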

> Event sourcing - O[n²] storage for filters
> ------------------------------------------
>
>                 Key: JAMES-3777
>                 URL: https://issues.apache.org/jira/browse/JAMES-3777
>             Project: James Server
>          Issue Type: Improvement
>    Affects Versions: 3.7.0
>            Reporter: Benoit Tellier
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> h2. Symptoms
> ```  
> Largest Partitions:     
> [FilteringRule/x...@linagora.com] 44952069 (45.0 MB)
> ```
> Every time this guy sends an email we load 45 MB of JSON, which can have a 
> big performance impact.
> h2. What?
> We implemented event sourcing with reset: given rules A, B, if we want to 
> persist rule C then we store a "reset to A, B, C" event.
> So, if we want to store N filters, the resulting structure will have a size 
> in O[n²], which proves to be barely sustainable.
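> As a concrete illustration: adding N = 1,000 rules one at a time stores reset 
> events of sizes 1 + 2 + ... + 1000 = N(N+1)/2 = 500,500 rule entries in total, 
> even though the live rule set only ever holds 1,000 rules.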
> h2. How to fix
> Coming back to O[n] would likely help:
> implement filter addition / removal at both the storage and JMAP layers.
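> A pseudocode sketch of how the aggregate could replay a mixed history under 
> this fix (event names are illustrative):
> ```
> rules = []
> for event in history:
>     Reset(list)   -> rules = list
>     Added(rule)   -> rules = rules + [rule]
>     Removed(rule) -> rules = rules - [rule]
> ```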
> h2.  Alternatives
> h3. The read projection
> Currently we are loading the full history, building the aggregate each time 
> we process emails, and performing SERIAL lightweight transactions. This is 
> very frequent, and impactful.
> It would be possible to introduce a read projection, maintained by a 
> subscriber to the event source, that would allow efficiently reading the 
> current filters for a given user.
> This means the history would be loaded only upon writes, which are rare.
> Impact: yet another table. Also the solution is local to this usage and does 
> not help other event sourcing usages.
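> As a pseudocode sketch, such a projection subscriber could look like (table 
> and handler names are illustrative):
> ```
> onEvent(userId, event):
>     current = projectionTable.get(userId)
>     projectionTable.put(userId, apply(event, current))
>
> readCurrentFilters(userId):
>     return projectionTable.get(userId)   // no history replay on the mail path
> ```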
> h3. Event sourcing snapshots
> Augment James event sourcing implementation with a Snapshot mechanism.
> Upon reading the history, we would start from the latest available snapshot, 
> then read the history from that snapshot onward.
> The event store would be responsible for taking snapshots. Even one snapshot 
> out of every 10 changes would do the job here.
> This implies being able to serialize the state, and an additional table 
> for storing event sourcing snapshots.
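> In pseudocode, reading an aggregate with snapshots could look like (names 
> illustrative):
> ```
> loadAggregate(aggregateId):
>     snapshot = snapshotStore.latest(aggregateId)   // state + last event id
>     events   = eventStore.eventsAfter(aggregateId, snapshot.eventId)
>     return fold(snapshot.state, events, apply)
> ```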
> My take on it: going `O[n²]` -> `O[n]` will likely be a good enough mitigation 
> that we don't need to grow the complexity of the event sourcing code.
> On the other hand, this would harden the event sourcing code and likely lift 
> most of the limitations for adoption on the mailboxes write path (to enforce 
> the mailbox name uniqueness constraint).
> Note that both solutions are not exclusive.
> h3. The dirty fix
> For filters, the history prior to the reset event can be dropped; this can be 
> used to solve the immediate problem, even if it is not very clean.
> h1. Proposal
>  - Implement a read projection
>  - Implement addition / removal patches to filtering event sourcing aggregate
>  - Don't implement event sourcing snapshots now
> And also... Remove the obligation to configure the JMAP filtering mailet inside 
> JMAP servers: after all, this extension is not standard...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org
For additional commands, e-mail: server-dev-h...@james.apache.org
