On Thu, Dec 2, 2010 at 10:10 PM, la...@garfieldtech.com
wrote:
> Hi folks.
>
> I have a project coming up where I will need to process a bazillion (OK, a
> few million) records, possibly with multiple steps. (In this case I'm
> reading data from one data archive into an Apache Solr server.) This is a
> natural use case for a queue server, I believe, and while the requirements
> of the project do not dictate a language it makes sense to me to use PHP for
> the processing code since 1) Other parts of the project will be using it for
> web-facing logic and 2) It's the language I know best.
>
> I'm trying to select a queue server to use. The two I'm investigating in
> particular are Beanstalkd (http://kr.github.com/beanstalkd/) and Gearman
> (http://gearman.org/). In this case I do need a reliable queue, even if
> that means a record gets processed multiple times by accident (which in this
> use case is fine).
>
> Has anyone worked with either of these systems? Any war stories to share,
> good or bad? Any guidelines on the number of resources we need?

I gave a presentation at the PHP London meetup in May. My slides, which
cover some of Beanstalk's strong points, are at:
http://abulman.co.uk/2010/05/queues-and-beanstalkd/

Gearman does have the advantage of being pretty much drop-in: you call
remote PHP functions (or functions in other languages that are set up
as workers) and it just works. With beanstalkd, there is no solid
framework in place yet (though there are a couple of things around:
I've seen one for CakePHP, and something for Drupal, I think). That
said, it's not hard to write one, whether or not you use a development
framework. Both have solid C-based daemons backing them up (and
Gearman's is so much easier to run than its original Perl daemon).

I see Beanstalk as more flexible, though: the priorities, TTR
(time-to-run) limits, tubes and optional delays make for a powerful set
of tools within the queue itself.

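
To make those four features concrete, here is a rough sketch of how
they interact. It is a toy in-memory model written in Python, not a
beanstalkd client and not how the daemon is implemented. The names
(ToyQueue, the "solr-import" tube, the default priority of 1024) are
made up for illustration, and time is passed in by hand as `now` so
the behaviour is easy to follow. As in beanstalkd, a lower priority
number means more urgent.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    id: int
    tube: str
    body: str
    pri: int                          # lower number = more urgent
    ready_at: float                   # job is invisible before this time (delay)
    ttr: float                        # seconds a worker may hold the job
    deadline: Optional[float] = None  # set while the job is reserved


class ToyQueue:
    """Hypothetical in-memory model of tubes, priorities, delays and TTR."""

    def __init__(self):
        self._jobs = {}
        self._ids = itertools.count(1)

    def put(self, tube, body, pri=1024, delay=0, ttr=60, now=0.0):
        job = Job(next(self._ids), tube, body, pri, now + delay, ttr)
        self._jobs[job.id] = job
        return job.id

    def reserve(self, tube, now=0.0):
        # A job whose worker overran its TTR goes back to the ready state.
        for job in self._jobs.values():
            if job.deadline is not None and job.deadline <= now:
                job.deadline = None
        ready = [j for j in self._jobs.values()
                 if j.tube == tube and j.deadline is None and j.ready_at <= now]
        if not ready:
            return None
        # Most urgent first; ties go to the oldest job.
        job = min(ready, key=lambda j: (j.pri, j.id))
        job.deadline = now + job.ttr
        return job

    def delete(self, job_id):
        # Workers delete a job only once it has been fully processed.
        self._jobs.pop(job_id, None)


def demo():
    q = ToyQueue()
    q.put("solr-import", "record-1", pri=100, ttr=30)
    q.put("solr-import", "record-2", pri=10, ttr=30)    # more urgent
    q.put("solr-import", "record-3", pri=1, delay=120)  # urgent, but delayed
    q.put("emails", "welcome")                          # separate tube

    seen = []
    job = q.reserve("solr-import", now=0)    # record-2 wins on priority
    seen.append(job.body)
    q.delete(job.id)

    job = q.reserve("solr-import", now=1)    # record-1; worker then "crashes"
    seen.append(job.body)

    job = q.reserve("solr-import", now=40)   # TTR expired, record-1 comes back
    seen.append(job.body)
    q.delete(job.id)

    job = q.reserve("solr-import", now=130)  # delay is over, record-3 is ready
    seen.append(job.body)
    return seen


if __name__ == "__main__":
    print(demo())
```

Note that record-1 is handed out twice: the TTR is what makes the queue
reliable, at the cost of the occasional duplicate run that the original
question said was acceptable.
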
Alister
--
PHP General Mailing List (http://www.php.net/)