Max Pfingsthorn wrote:
> Hello!
> 
> I am not sure what has been done about this so far, but we are still 
> experiencing some not-so-great behavior from the CachingPipeline:
> We have some pipelines which take a long time to complete, like
> generating a homepage from many different sources. If a second request
> for exactly the same pipeline comes in before the first one is done, it
> takes just as long again, because the first pipeline has not put its
> result into the cache yet and everything is recomputed. This is quite
> obvious, but it hurts our performance a lot.
> 
> I was thinking that we might implement some thread synchronization by sharing 
> some objects and calling wait() and notify() on them.
> 
> First I thought of putting an empty CachedResponse into the Cache when
> we start generating the data. Any pipeline trying to access that
> CachedResponse will notice that it is empty and call wait() on it. As
> soon as the first pipeline is done saving its content to the cache, it
> calls notify(). Then I remembered that the Cache puts these on disk, and
> I am not sure what happens to the locks when an object is serialized...
> So, what about doing the same with a store instead? We push an empty
> response into a store, preferably a transient, in-memory one without
> size limits, and do the same as I outlined above.
> 
> WDYT? Would that solve it? Is there a better way?
> 
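The wait()/notify() idea with an in-memory store could look roughly like the sketch below. All names here (BlockingResponseStore, Entry) are hypothetical, not the actual Cocoon Store or CachedResponse API; it only illustrates the synchronization pattern.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory store: concurrent requests for the same
 *  pipeline key block until the first request has produced the result.
 *  Sketch only, not the real Cocoon Store interface. */
class BlockingResponseStore {
    /** Placeholder the producer fills in; other threads wait() on it. */
    private static final class Entry {
        byte[] content;   // null until the producer stores the result
        boolean failed;   // set if the producer gave up
    }

    private final Map<String, Entry> entries = new HashMap<>();

    /** Returns the content for key, computing it at most once. */
    byte[] get(String key, java.util.function.Supplier<byte[]> producer)
            throws InterruptedException {
        Entry entry;
        boolean iAmProducer = false;
        synchronized (entries) {
            entry = entries.get(key);
            if (entry == null) {          // first request: install placeholder
                entry = new Entry();
                entries.put(key, entry);
                iAmProducer = true;
            }
        }
        if (iAmProducer) {
            byte[] result = null;
            try {
                result = producer.get();  // the long-running pipeline
            } finally {
                synchronized (entry) {
                    entry.content = result;
                    entry.failed = (result == null);
                    entry.notifyAll();    // wake every waiting request
                }
            }
            return result;
        }
        synchronized (entry) {
            while (entry.content == null && !entry.failed) {
                entry.wait();             // sleep until the producer is done
            }
            return entry.content;
        }
    }
}
```

Note that the producer calls notifyAll() rather than notify(), since any number of requests may be waiting, and the finally block plus the failed flag make sure waiters are released even if the pipeline throws; a real implementation would also want a timeout in the wait loop.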
I think this topic has been discussed several times, so you might want
to search through the archives to see what others have said about it.

We usually solve this problem by pre-caching: before the first real
user can invoke the pipeline, we have already invoked it ourselves
(using cron etc.), so the content is always cached.
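As a concrete (made-up) example, such a pre-caching job can be as small as a crontab entry that requests the page periodically; the URL and schedule below are assumptions, not a real setup.

```shell
# Hypothetical crontab entry: warm the homepage cache every 15 minutes
# by fetching it and discarding the output. Adjust URL/interval to taste.
*/15 * * * * wget -q -O /dev/null http://localhost:8080/cocoon/homepage
```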

Carsten

-- 
Carsten Ziegeler - Open Source Group, S&N AG
http://www.s-und-n.de
http://www.osoco.org/weblogs/rael/
