[
https://issues.apache.org/jira/browse/JAMES-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18050371#comment-18050371
]
Benoit Tellier edited comment on JAMES-4159 at 1/7/26 1:26 PM:
---------------------------------------------------------------
Hello [~matthieu]
> If we know some listeners have different behaviors, why not making delivery
> system knows?
> For example, we could have a way to group listeners explicitly. We could
> group fast listeners but not slow ones.
I do believe the inner workings of the EventBus are rather complex. I fear that
by adding yet another dimension to the way listeners behave we would add more
complexity than is really needed. I am thus very reluctant to further
rework the "EventBus" internals.
> Another question: the ticket is about EventBus and then it only focuses on
> RabbitMQ. Is it a RabbitMQ related problem?
Very true. I bet I followed the path of least implementation complexity and
missed the big picture.
---
Thanks a lot for those questions. If we end up introducing complex constructs,
it is very likely that we (I) are trying to solve the wrong problems.
Another possible way to puzzle things together is to actually "make our
listener fast".
We could:
- Decide quickly if our current email is worth further analysis
- Then put that task onto another EventBus (which is configured with a higher
timeout and dedicated to that sole task)
This could easily be implemented in the Twake Mail AI extension without
impacting the project as a whole. We would also avoid the double invocation of
the third-party system when it is slow, which crippled my first proposal.
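The two steps above could be sketched roughly as follows. This is only an
illustration in plain Java: `MailEvent`, `TriageListener` and the `slowBus`
consumer are hypothetical names, not James or Twake Mail APIs. The consumer
merely stands in for a dispatch onto the dedicated EventBus.

```java
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical stand-in for a mailbox event; not the James Event type.
final class MailEvent {
    final String messageId;
    final String subject;

    MailEvent(String messageId, String subject) {
        this.messageId = messageId;
        this.subject = subject;
    }
}

// Fast first stage: decides quickly, then hands the event off
// instead of doing the slow analysis itself.
final class TriageListener {
    private final Predicate<MailEvent> worthAnalysing;
    // Assumption: stands in for dispatching onto the second, dedicated EventBus.
    private final Consumer<MailEvent> slowBus;

    TriageListener(Predicate<MailEvent> worthAnalysing, Consumer<MailEvent> slowBus) {
        this.worthAnalysing = worthAnalysing;
        this.slowBus = slowBus;
    }

    void onEvent(MailEvent event) {
        // Cheap check only; anything expensive happens behind slowBus.
        if (worthAnalysing.test(event)) {
            slowBus.accept(event);
        }
    }
}
```

With this shape the listener registered on the main EventBus stays fast, and
only the second stage needs the higher timeout.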
We could still keep the concept of "QOS" and "execution time" for the event bus
as it is likely to be beneficial.
> EventBus: improve slow listener handling
> ----------------------------------------
>
> Key: JAMES-4159
> URL: https://issues.apache.org/jira/browse/JAMES-4159
> Project: James Server
> Issue Type: Improvement
> Components: eventbus
> Reporter: Benoit Tellier
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> h3. Context
> At LINAGORA we developed some custom listeners responsible for AI-related
> tasks, either querying or feeding AI models. Mailbox listeners are a
> good fit because they do not block or delay mail reception.
> The EventBus works the following way: it publishes a message onto RabbitMQ,
> which is initially consumed once by a James node that executes all
> listeners. Using QOS we are consuming up to 10 messages simultaneously per
> node. Failed messages are then re-published to individual per-listener
> retry queues.
> This "bundling" limits the overhead of deserialization + wiring of the
> event bus, operations that each listener would otherwise need if executed
> separately. The amount of CPU time dedicated to those operations was
> significant ( > 40%).
> However this bundling means we are not efficient at handling "slow" listeners,
> which at scale can significantly disturb and delay event processing.
> h3. Proposal
> Be able to configure a timeout on the initial eventbus run, for each listener.
> That way, slow listeners would be individually retried.
> In `listeners.xml`:
> {code:xml}
> <listeners>
>   <executeGroupListeners>true</executeGroupListeners>
>   <!-- New block -->
>   <qos>20</qos>
>   <initialExecutionTimeout>30s</initialExecutionTimeout>
>   <executionTimeout>120s</executionTimeout>
>   <!-- -->
>   <listener>
>     <class>org.apache.james.mailbox.cassandra.MailboxOperationLoggingListener</class>
>   </listener>
> </listeners>
> {code}
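For illustration, the effect of an initial execution timeout could be sketched
as below. This is plain Java, not EventBus code: `InitialRunSketch` and
`runWithTimeout` are hypothetical names, and the returned list only stands in
for re-publishing to the per-listener retry queues. The sketch also assumes
that all listeners of one event share a single deadline.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class InitialRunSketch {
    // Runs all listeners for one event and returns the names of those that
    // failed or did not finish in time (candidates for individual retry).
    static List<String> runWithTimeout(Map<String, Runnable> listeners,
                                       Duration initialExecutionTimeout) {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            Map<String, Future<?>> running = new LinkedHashMap<>();
            listeners.forEach((name, listener) -> running.put(name, pool.submit(listener)));

            long deadline = System.nanoTime() + initialExecutionTimeout.toNanos();
            List<String> toRetry = new ArrayList<>();
            for (Map.Entry<String, Future<?>> entry : running.entrySet()) {
                Future<?> future = entry.getValue();
                long remaining = Math.max(0, deadline - System.nanoTime());
                try {
                    future.get(remaining, TimeUnit.NANOSECONDS);
                } catch (TimeoutException | ExecutionException e) {
                    // Too slow or failed: stop waiting and mark for retry.
                    future.cancel(true);
                    toRetry.add(entry.getKey());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    future.cancel(true);
                    toRetry.add(entry.getKey());
                }
            }
            return toRetry;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Note that a listener that timed out here is invoked again on retry, so a slow
third-party system would be called twice for the same event.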