Hi folks.

I have a project coming up where I will need to process a bazillion (OK, a few million) records, possibly in multiple steps. (In this case I'm loading data from a data archive into an Apache Solr server.) This is a natural use case for a queue server, I believe, and while the requirements of the project do not dictate a language, it makes sense to me to use PHP for the processing code since 1) other parts of the project will be using it for web-facing logic and 2) it's the language I know best.


I'm trying to select a queue server to use. The two I'm investigating in particular are Beanstalkd (http://kr.github.com/beanstalkd/) and Gearman (http://gearman.org/). In this case I do need a reliable queue, one that does not lose jobs, even if that means a record occasionally gets processed more than once (which in this use case is fine).

Has anyone worked with either of these systems? Any war stories to share, good or bad? Any guidelines on how much in the way of server resources we would need?

For Beanstalkd, I've found two user-space PHP libraries, one of which is apparently dead. The other is:

https://github.com/pda/pheanstalk/

For Gearman, there appear to be both a PECL module and a PEAR module.

http://pear.php.net/package/Net_Gearman/
http://pecl.php.net/package/gearman

(Naturally they do not appear to be mirrors of each other, just to make life difficult.)

I do have access to install PECL modules on the server(s) in question if appropriate.

Any experience/advice/horror stories that would help us settle on a queue and API library?

--Larry Garfield

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
