Re: [PHP] Queuing servers

2010-12-02 Thread Alister Bulman
On Thu, Dec 2, 2010 at 10:10 PM, la...@garfieldtech.com wrote:
> Hi folks.
>
> I have a project coming up where I will need to process a bazillion (OK, a
> few million) records, possibly with multiple steps.  (In this case I'm
> reading data from one data archive into an Apache Solr server.)  This is a
> natural use case for a queue server, I believe, and while the requirements
> of the project do not dictate a language it makes sense to me to use PHP for
> the processing code since 1) Other parts of the project will be using it for
> web-facing logic and 2) It's the language I know best.
>
> I'm trying to select a queue server to use.  The two I'm investigating in
> particular are Beanstalkd (http://kr.github.com/beanstalkd/) and Gearman
> (http://gearman.org/).  In this case I do need a reliable queue, even if
> that means a record gets processed multiple times by accident (which in this
> use case is fine).
>
> Has anyone worked with either of these systems?  Any war stories to share,
> good or bad?  Any guidelines on the number of resources we need?

I did a presentation at the PHP London meetup in May - my slides with
some good Beanstalk points are at:
http://abulman.co.uk/2010/05/queues-and-beanstalkd/

Gearman does have the advantage of being pretty much drop-in-and-go:
you call remote PHP functions (or functions in other languages that
are set up as workers). With beanstalkd, there is no solid framework
in place yet (though there are a couple of things around; I've seen
one for CakePHP, and something for Drupal, I think).  That said, it's
not hard to write one, whatever your choice of development framework,
or none at all.  Both have solid C-based daemons backing them up
(which are so much easier to run than the original Perl daemon).
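
To give a rough idea of what "calling remote PHP functions" looks like,
here is a minimal sketch assuming the PECL gearman extension and a
gearmand on 127.0.0.1:4730. The function name normalise_record, the
record fields and the helper names are made up for illustration; only
the GearmanWorker, GearmanClient and GearmanJob calls come from the
extension.

```php
<?php
// Plain PHP job logic (hypothetical): tidy one JSON-encoded record.
function normalise_record(string $workload): string
{
    $record = json_decode($workload, true);
    $record['title'] = trim($record['title'] ?? '');
    return json_encode($record);
}

// Worker side: register the function with gearmand and process jobs.
// Needs the PECL gearman extension; host and port are assumptions.
function run_worker(): void
{
    $worker = new GearmanWorker();
    $worker->addServer('127.0.0.1', 4730);
    $worker->addFunction('normalise_record', function (GearmanJob $job) {
        return normalise_record($job->workload());
    });
    while ($worker->work()) {
        // Each call to work() waits for and handles one job.
    }
}

// Client side: hand a record to the queue and return immediately.
// doBackground() returns a job handle, not the worker's result.
function queue_record(array $record): string
{
    $client = new GearmanClient();
    $client->addServer('127.0.0.1', 4730);
    return $client->doBackground('normalise_record', json_encode($record));
}
```

Keeping the job logic in an ordinary function means it can be tested
without a queue server, and the same function could later sit behind
a beanstalkd worker instead.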

I see Beanstalk as more flexible, though: the priorities, TTR
(time-to-run) limits, tubes and optional delays make for a powerful
set of tools within the queue itself.
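
Those four knobs map directly onto the beanstalkd wire protocol, so a
small sketch may help. It only builds the raw "use" and "put" command
strings and sends nothing. The helper names, the tube name
solr-import, the payload and the default values are assumptions for
illustration; a client library such as pheanstalk wraps the same
commands for you.

```php
<?php
// Build a beanstalkd "put" command:
//   put <pri> <delay> <ttr> <bytes>\r\n<data>\r\n
// Lower <pri> values are reserved first. <delay> is seconds before the
// job becomes ready. <ttr> is seconds a worker gets after reserving the
// job before the server releases it back to the ready queue.
function beanstalk_put_command(string $data, int $pri = 1024, int $delay = 0, int $ttr = 60): string
{
    return sprintf("put %d %d %d %d\r\n%s\r\n", $pri, $delay, $ttr, strlen($data), $data);
}

// Select the tube that subsequent "put" commands go into.
function beanstalk_use_command(string $tube): string
{
    return "use {$tube}\r\n";
}

// Hypothetical job: most urgent priority, held back for 30 seconds,
// with 120 seconds allowed for the worker to finish it.
$wire = beanstalk_use_command('solr-import')
      . beanstalk_put_command('{"record_id":42}', 0, 30, 120);

// Print with the line endings made visible.
echo addcslashes($wire, "\r\n"), PHP_EOL;
```

A worker that overruns its TTR has the job handed to someone else,
which is where the "processed more than once" case comes from, so the
processing step should be safe to repeat.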

Alister

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP] Queuing servers

2010-12-02 Thread la...@garfieldtech.com

Hi folks.

I have a project coming up where I will need to process a bazillion (OK, 
a few million) records, possibly with multiple steps.  (In this case I'm 
reading data from one data archive into an Apache Solr server.)  This is 
a natural use case for a queue server, I believe, and while the 
requirements of the project do not dictate a language it makes sense to 
me to use PHP for the processing code since 1) Other parts of the 
project will be using it for web-facing logic and 2) It's the language I 
know best.


I'm trying to select a queue server to use.  The two I'm investigating 
in particular are Beanstalkd (http://kr.github.com/beanstalkd/) and 
Gearman (http://gearman.org/).  In this case I do need a reliable queue, 
even if that means a record gets processed multiple times by accident 
(which in this use case is fine).


Has anyone worked with either of these systems?  Any war stories to 
share, good or bad?  Any guidelines on the number of resources we need?


For Beanstalk, I've found two user-space PHP libraries, one of which is 
apparently dead.  The other is:


https://github.com/pda/pheanstalk/

For Gearman, there appears to be both a PECL module and a PEAR module.

http://pear.php.net/package/Net_Gearman/
http://pecl.php.net/package/gearman

(Naturally they do not appear to be mirrors of each other, just to make 
life difficult.)


I do have access to install PECL modules on the server(s) in question if 
appropriate.


Any experience/advice/horror stories that would help us settle on a 
queue and API library?


--Larry Garfield
