> Perhaps this is more a question about application design than directly about
> Memcached.
>
> I have a very large and somewhat slow data fetch from a back-end service.
> The client doesn't need all the data at the same time, so the plan is to split
> it up into parts and save them into Memcached to be fetched as needed by the
> client. It's not known ahead of time what parts the client will need or when,
> although it's expected any related requests would likely be over a period of
> minutes -- i.e. I would not be expecting this data to be in the cache for a
> very long time. The goal is 1) to reduce the amount of data the client must
> deal with at any given time, and 2) to prevent repeated fetches of the same
> data from the back-end.
>
> So, fetching this data results in multiple memcached writes from a single
> data source. And if there's a cache miss, then repopulating all the parts
> with many cache writes would occur.
>
> My concern is that the client may make multiple requests for additional parts
> at the same time, triggering multiple (duplicate) re-fetches and saves.
> Anyone have a similar situation? Would you recommend using an atomic "add"
> with a short timeout as a lock?

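It's discussed on the wiki a bit. There's the "ghetto lock" built on add, which is what you're describing: the one client whose add succeeds does the re-fetch, and the short timeout keeps the lock from sticking around if that client dies. There are also things like Gearman, which can do request coalescing.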
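
For what it's worth, here is a rough sketch of the add-based lock in Python. It is only an illustration under some assumptions, not the wiki's code: `FakeCache` is a made-up in-memory stand-in so the example runs without a server, and `get_part`, `fetch_all`, the key layout and the TTL values are all hypothetical names and numbers. A real client such as pymemcache exposes a similar `add(key, value, expire=...)` that reports whether the value was stored, but check your client's documentation for the exact return behaviour.

```python
import threading
import time


class FakeCache:
    """Hypothetical in-memory stand-in for a memcached client (illustration only)."""

    def __init__(self):
        self._data = {}
        self._mutex = threading.RLock()  # makes add() atomic, like the real server

    def get(self, key):
        with self._mutex:
            item = self._data.get(key)
            if item is None:
                return None
            value, expires_at = item
            if expires_at is not None and time.monotonic() >= expires_at:
                del self._data[key]  # expired entries behave like misses
                return None
            return value

    def set(self, key, value, expire=0):
        with self._mutex:
            expires_at = time.monotonic() + expire if expire else None
            self._data[key] = (value, expires_at)
            return True

    def add(self, key, value, expire=0):
        # Store only if the key is absent; report whether we stored it.
        with self._mutex:
            if self.get(key) is not None:
                return False
            return self.set(key, value, expire)

    def delete(self, key):
        with self._mutex:
            return self._data.pop(key, None) is not None


def get_part(cache, fetch_all, dataset_id, part,
             lock_ttl=30, data_ttl=600, wait=0.05, retries=200):
    """Return one part, letting only the lock holder hit the back end.

    fetch_all(dataset_id) is assumed to return a dict of part name -> data.
    """
    key = f"{dataset_id}:{part}"
    lock_key = f"{dataset_id}:lock"
    for _ in range(retries):
        value = cache.get(key)
        if value is not None:
            return value
        # add() succeeds for exactly one caller; everyone else waits and retries.
        if cache.add(lock_key, "1", expire=lock_ttl):
            try:
                # Re-check: another caller may have repopulated the cache
                # between our miss and our acquiring the lock.
                value = cache.get(key)
                if value is not None:
                    return value
                parts = fetch_all(dataset_id)
                for name, data in parts.items():
                    cache.set(f"{dataset_id}:{name}", data, expire=data_ttl)
                return parts[part]
            finally:
                cache.delete(lock_key)
        time.sleep(wait)
    raise TimeoutError(f"gave up waiting for {key}")
```

The lock timeout should comfortably exceed the time the back-end fetch takes. If the fetch outlives the lock, a second client can acquire it and re-fetch, and the first client's `delete` would then remove a lock it no longer owns, so treat this as best-effort de-duplication rather than a strict mutex.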
