Looking pretty good.  I thought `needs_aggregation` might be doing a lot of 
work just to return a bool, so I turned up timermiddleware to DEBUG level to 
see what kind of queries were going on.  I noticed a number of queries 
containing `'node_id': {'$in': []}`.  An empty `$in` can never match, so we 
should short-circuit those, either at the storage layer or higher.  Full 
capture from one `create_timelines` task: 
https://sourceforge.net/p/allura/pastebin/52cb20e8c4d1042c11956f2f
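
A rough sketch of what the short-circuit could look like, assuming the check 
lives in a thin wrapper around whatever callable actually runs the query. The 
names `has_empty_in` and `find_short_circuit` are made up for illustration and 
are not part of the existing storage API:

```python
from typing import Any, Callable, Dict, Iterable, List

Query = Dict[str, Any]


def has_empty_in(query: Query) -> bool:
    """Return True if any top-level field is constrained by {'$in': []}.

    Top-level fields are ANDed together, so one empty $in means the whole
    query cannot match anything. Nested operators ($or etc.) are not inspected.
    """
    for cond in query.values():
        if isinstance(cond, dict) and "$in" in cond and len(cond["$in"]) == 0:
            return True
    return False


def find_short_circuit(find: Callable[[Query], Iterable[dict]],
                       query: Query) -> List[dict]:
    """Run `find(query)` unless the query is known to match nothing.

    `find` is a hypothetical stand-in for the real storage call.
    """
    if has_empty_in(query):
        return []  # skip the round trip to mongo entirely
    return list(find(query))
```

Doing this at the storage layer would cover every caller at once; doing it 
higher up would also avoid building the query in the first place.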

Back on `needs_aggregation`: the `since` param should keep the query 
relatively small, but you might as well add `limit=1` too, so it can stop as 
soon as the first match is found.
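
To illustrate the idea, here is a minimal in-memory sketch of an existence 
check that stops at the first match. `find_since` is a hypothetical stand-in 
for the real storage query, and the `published` field name is an assumption; 
only `since` and `limit=1` come from the discussion above:

```python
from datetime import datetime
from itertools import islice
from typing import Iterable, Iterator, Optional


def find_since(activities: Iterable[dict], since: datetime,
               limit: Optional[int] = None) -> Iterator[dict]:
    """Lazily yield activities newer than `since`, at most `limit` of them.

    Hypothetical stand-in for the storage-layer query; limit=None means
    no cap.
    """
    matches = (a for a in activities if a["published"] > since)
    return islice(matches, limit)


def needs_aggregation_sketch(activities: Iterable[dict],
                             since: datetime) -> bool:
    """Return True if at least one activity is newer than `since`.

    With limit=1 the scan ends at the first hit instead of collecting
    every matching activity just to test for emptiness.
    """
    return next(find_since(activities, since, limit=1), None) is not None
```

The bool result is the same with or without the limit; the limit only reduces 
how much work is done to get it.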


---

**[tickets:#4397] Run timeline aggregations in the background with taskd**

**Status:** in-progress
**Labels:** v2 activitystreams 
**Created:** Fri Jun 15, 2012 05:59 PM UTC by Tim Van Steenburgh
**Last Updated:** Mon Jan 06, 2014 05:49 PM UTC
**Owner:** Tim Van Steenburgh

Once the volume ramps up, we probably won't want to be doing timeline 
aggregations on demand if we can help it. Find good spots to fire off 
aggregations in the background using taskd, so that when an activitystream page 
is requested, the cached timeline can just be pulled from mongo w/o doing an 
aggregation.

For users, a good spot to do this might be on login. For projects, not 
sure...needs more thought.

Do we need to worry about two aggregations for the same node running at the 
same time?



