2017-05-31 21:16 GMT+02:00 Vincent Massol <vinc...@massol.net>:

> Hi,
>
> > On 31 May 2017, at 20:50, Guillaume Delhumeau <
> guillaume.delhum...@xwiki.com> wrote:
> >
> > 2017-05-31 15:59 GMT+02:00 Vincent Massol <vinc...@massol.net>:
> >
> >> Hi Guillaume,
> >>
> >>> On 31 May 2017, at 12:15, Guillaume Delhumeau <
> >> guillaume.delhum...@xwiki.com> wrote:
> >>>
> >>> Help me to decide!
> >>>
> >>> TL;DR:
> >>>
> >>> * I need to know if performing a query on the database for each user who
> >>> wants to receive an email with all the notifications is a scalability
> >>> issue (in a job context).
> >>
> >> Yes, whenever we do a lot of queries to the DB it’s a scalability issue. If
> >> we have 100K users then it’s 100K queries, which is definitely a
> >> scalability issue.
> >>
> >
> > Well, in that case, I don't know if sending 100K emails is scalable either.
>
> The new mail system is made for that. There’s a single mail thread
> (actually 2 but that’s a detail) and it can send an infinite number of
> mails without slowing down XWiki. Ofc the only thing not guaranteed is how
> long it takes to do so. But that can be fixed outside of XWiki by having a
> proxy mail server which would accept immediately all mails sent by XWiki
> before forwarding them to some cluster of mail servers. It may not be
> enough though and maybe sending 100K mails to the proxy mail server would
> already take too long. Would be interesting to have some measure of how
> long it takes to send a single mail. I think I did some computation at some
> point but I don’t remember the results.
>
> Do you mean that the notification center would execute the DB queries one
> by one?


Yes this is what I mean.
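
To make "one by one" concrete, here is a minimal sketch of what I have in mind, assuming an iterator that only runs the per-user query when the consumer asks for the next message. All names (DigestMail, LazyDigestIterator, the query function) are hypothetical and only for illustration; this is not the mail module API, and a real version would plug into a MimeMessageFactory.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Function;

/** Hypothetical stand-in for one mail to send; not an XWiki class. */
final class DigestMail {
    final String userId;
    final List<String> events;

    DigestMail(String userId, List<String> events) {
        this.userId = userId;
        this.events = events;
    }
}

/**
 * Lazily builds one digest per user. The per-user query runs only when the
 * consumer (for example a mail sending thread) pulls the next message.
 */
final class LazyDigestIterator implements Iterator<DigestMail> {
    private final Iterator<String> userIds;
    // Hypothetical per-user query: user id -> events matching that user's pre-filters.
    private final Function<String, List<String>> queryEventsForUser;
    private DigestMail next;

    LazyDigestIterator(Iterator<String> userIds,
                       Function<String, List<String>> queryEventsForUser) {
        this.userIds = userIds;
        this.queryEventsForUser = queryEventsForUser;
    }

    @Override
    public boolean hasNext() {
        // Skip users with nothing to report, so no empty mail is produced.
        while (next == null && userIds.hasNext()) {
            String userId = userIds.next();
            List<String> events = queryEventsForUser.apply(userId);
            if (!events.isEmpty()) {
                next = new DigestMail(userId, events);
            }
        }
        return next != null;
    }

    @Override
    public DigestMail next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        DigestMail result = next;
        next = null;
        return result;
    }
}

final class LazyDigestDemo {
    public static void main(String[] args) {
        // In-memory map standing in for the database (assumption for the demo).
        Map<String, List<String>> fakeDb = Map.of(
            "alice", List.of("page updated"),
            "bob", List.of());
        Iterator<DigestMail> mails = new LazyDigestIterator(
            List.of("alice", "bob").iterator(),
            user -> fakeDb.getOrDefault(user, List.of()));
        while (mails.hasNext()) {
            DigestMail mail = mails.next();
            System.out.println(mail.userId + ": " + mail.events);
        }
    }
}
```

The total number of queries is still one per user, but they are spread over the duration of the sending and never held in memory all at once.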


> In this case it could work indeed and it should be left to the mail module
> to handle that by implementing a custom MimeMessageFactory with an
> iterator. It’s important to delegate this to the mail sender API IMO. See
> UsersAndGroupsMimeMessageFactory for an example. AFAIR Edy refactored the
> watchlist to use a MimeMessageFactory.
>
> Thanks
> -Vincent
>
> >> We need to find a way to do a single query (or a small fixed number of
> >> queries independent of the # of users).
> >>
> >> If not possible then we may need to either:
> >> A) Add some new table in our DB to help do that
> >> B) Use some tool other than the DB, e.g. SOLR, etc
> >>
> >> Thanks
> >> -Vincent
> >>
> >>> * If it's not an issue, I can implement the "naïve" solution, which
> >>> requires less development.
> >>>
> >>> Full message:
> >>>
> >>> Status:
> >>> * notifications are displayed on the top menu when you browse the wiki.
> >>> * notifications are displayed differently for each individual user
> >>> according to their preferences (filters on event type, on locations,
> >>> etc.).
> >>> * similar notifications are grouped together into "composite
> >>> notifications".
> >>> * there are only a few notifications displayed (5 by default).
> >>>
> >>> Objective:
> >>> * send an email periodically (every hour, every day, every week, according
> >>> to the user preferences) with ALL events that happened during the last
> >>> period of time, still filtered according to the user preferences.
> >>>
> >>> Inspiration:
> >>> * the watchlist gets ALL events that happened during the last period of
> >>> time
> >>> * then, for each user, it removes the events the user is not interested in
> >>> * Benefit: only one query to get the events from the database for all
> >>> users
> >>> Problems:
> >>> * in the notifications, I have introduced a NotificationFilter role that
> >>> makes it possible to inject some SQL into the query, to get the events
> >>> according to the user preferences. I call this "pre-filters".
> >>> ** it means we generate a unique request for each individual user, so if
> >>> we send a mail to 1000 users, we will have 1000 requests to the database.
> >>>
> >>> I wonder if it's a non-problem or a big scalability issue. Because even if
> >>> the whole job that sends emails takes ~10 minutes, it does not matter.
> >>> It's not a realtime thing.
> >>>
> >>> For the record, NotificationFilters have "post-filters" too, which perform
> >>> checks on the event itself (for example checking the permissions, etc.).
> >>>
> >>> Alternatives:
> >>> * just like the watchlist, perform a very generic query on the database
> >>> to get all the events that happened during the last period of time
> >>> * then for each user, use only the "post-filters" to remove events the
> >>> user doesn't care about
> >>>
> >>> Problem:
> >>> * it means the pre-filters that make sense in the notification use-case
> >>> cannot be used for emails. Developers must be aware of this.
> >>> * it requires some refactoring of the code that groups similar
> >>> notifications.
> >>>
> >>> Question:
> >>> Should I go with the "naive" solution, i.e. for each user get all
> >>> notifications and send a mail, or should I go with the "only 1 query to
> >>> the database for all users" version?
> >>>
> >>> Thanks,
> >>>
> >>> --
> >>> Guillaume Delhumeau (guillaume.delhum...@xwiki.com)
> >>> Research & Development Engineer at XWiki SAS
> >>> Committer on the XWiki.org project
> >>
> >>
> >
> >
> > --
> > Guillaume Delhumeau (guillaume.delhum...@xwiki.com)
> > Research & Development Engineer at XWiki SAS
> > Committer on the XWiki.org project
>
>
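
For comparison, the "only 1 query for all users" alternative from my first message could be sketched roughly as below: the events are loaded once, then only the post-filters are applied per user, in memory. This is a simplified illustration under my own assumptions; WikiEvent, SingleQueryDigest and the BiPredicate post-filter are hypothetical names, not existing notification classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

/** Hypothetical minimal event; the real events carry much more data. */
final class WikiEvent {
    final String type;
    final String location;

    WikiEvent(String type, String location) {
        this.type = type;
        this.location = location;
    }
}

final class SingleQueryDigest {
    /**
     * @param userIds    users who asked for an email
     * @param allEvents  result of the single generic query for the period
     * @param postFilter hypothetical post-filter: true if the user should see the event
     * @return events to send, per user; users with no event are left out
     */
    static Map<String, List<WikiEvent>> digestsFor(List<String> userIds,
                                                   List<WikiEvent> allEvents,
                                                   BiPredicate<String, WikiEvent> postFilter) {
        Map<String, List<WikiEvent>> result = new LinkedHashMap<>();
        for (String userId : userIds) {
            List<WikiEvent> kept = new ArrayList<>();
            for (WikiEvent event : allEvents) {
                // No database access here: filtering happens on the loaded events.
                if (postFilter.test(userId, event)) {
                    kept.add(event);
                }
            }
            if (!kept.isEmpty()) {
                result.put(userId, kept);
            }
        }
        return result;
    }
}
```

The cost moves from N queries to N in-memory passes over the same event list, which is why the pre-filters (SQL side) cannot be used in this version.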


-- 
Guillaume Delhumeau (guillaume.delhum...@xwiki.com)
Research & Development Engineer at XWiki SAS
Committer on the XWiki.org project
