I do something similar, but it's keyed on when the job is run, not when it
was queued. So that I only process a job for a given item once every five
minutes, I store the item's ID in memcache with a five-minute expiration
time once processing is done. Then, when a worker picks up a job, I check
whether that item's ID is still in memcache and skip the job if it is.
Adding a delay to the beanstalk job would make it more likely that only the
last change gets indexed.
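
Roughly, the worker-side check looks like the sketch below. This is only an
illustration, not my actual code: TTLCache is an in-memory stand-in for a
memcache client so the example runs on its own, and handle_job, the
"processed:" key prefix and the process callback are made-up names.

```python
import time


class TTLCache:
    """In-memory stand-in for a memcache client (get/set with an expiry)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expires_at)

    def set(self, key, value, expire):
        # Store the value along with the time at which it stops being valid.
        self._items[key] = (value, self._clock() + expire)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired, behave as if it was never set
            return None
        return value


COOLDOWN_SECONDS = 5 * 60  # five minutes, as described above


def handle_job(cache, item_id, process):
    """Run process(item_id) unless the item was processed within the cooldown.

    Returns True if the item was processed, False if the job was skipped.
    """
    key = "processed:%s" % item_id  # hypothetical key naming
    if cache.get(key) is not None:
        return False  # processed recently, so drop this duplicate job
    process(item_id)
    # Mark the item only once processing is done, as in the description.
    cache.set(key, 1, COOLDOWN_SECONDS)
    return True
```

In a real worker, item_id would come from the body of the reserved job, and
the job would be deleted from the queue whether or not it was skipped. The
delay I mentioned is the delay field of beanstalkd's put command, which is
separate from this check.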


On Mon, Oct 11, 2010 at 3:06 PM, Ron Mayer <[email protected]> wrote:

> I'm using beanstalkd to queue up jobs to re-index documents whenever they
> get updated; so my jobs in this case are all simple paths/urls to
> documents.
>
> For some documents that change faster than the queue is drained, I end up
> getting the same job in the queue dozens of times, and then doing extra
> work re-processing them unnecessarily.
>
> Is there a good way to say "put this in the queue if it's not already in
> there"?
>
>
> If not, does anyone have a good way of handling this outside of beanstalkd?
>
> I'm considering adding something to memcached saying
> "document file://whatever is in the queue = true"
> whenever I enqueue one; check for that flag before adding it again;
> and remove the flag when I process it;
> but was wondering if there's an easier/better/more conventional way.
>
> --
> You received this message because you are subscribed to the Google Groups
> "beanstalk-talk" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/beanstalk-talk?hl=en.
>
>

