At 02:09 PM 11/15/2002 -0500, Rocco Caputo wrote:
>On Fri, Nov 15, 2002 at 11:45:33AM -0500, Stephen Adkins wrote:
....
>> QUESTIONS:
>> 
>>  * What queue mechanism would you use, assuming all of the
>>    writers and readers are on the same system?
>>    (IPC::Msg? MsgQ?)
>
>If speed is a major factor, I would use a FIFO (named pipe).  This is
>a very lightweight and fast way to pass data between processes on the
>same machine.

Are FIFOs (named pipes) on Unix guaranteed to maintain the integrity
of the messages in the case of multiple writers?
I think you could guarantee this if you imposed restrictions on the
data travelling through the pipe: i.e. single text line, must be
written in a single (unbuffered) write() system call.
Otherwise, doesn't a FIFO break down as a message queue when you have
multiple writers with arbitrarily long message data?
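
For reference, POSIX guarantees that a write() of at most PIPE_BUF
bytes to a pipe or FIFO is atomic, so such writes from different
processes are never interleaved. Longer writes carry no such guarantee.
Below is a minimal sketch of the restriction described above, in Python
only for compactness (Perl's syswrite() makes the same system call).
The function name send_message is hypothetical.

```python
import os
import select


def send_message(fd, text):
    """Write one newline-terminated message with a single write() call.

    Hypothetical helper: fd is the write end of a pipe or FIFO opened
    in blocking mode.
    """
    if "\n" in text:
        raise ValueError("message must be a single line")
    data = text.encode("utf-8") + b"\n"
    # POSIX only promises atomicity for writes of at most PIPE_BUF bytes.
    if len(data) > select.PIPE_BUF:
        raise ValueError("message exceeds PIPE_BUF and could be interleaved")
    # os.write() is unbuffered: it maps directly onto one write() call.
    return os.write(fd, data)
```

Rejecting oversized messages outright is one possible design choice.
Arbitrarily long messages would need some other framing or locking
scheme, which this sketch does not attempt.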

>>  * How about if the queue writers were distributed, but the
>>    queue readers were all on one machine? (RPC to insert into
>>    the above-mentioned local queues?)
>>  * How about if the queue writers and queue readers were all
>>    distributed around the network? (Spread::Queue::FIFO?
>>    Parallel::PVM? Parallel::MPI? MQSeries::Queue?)
>
>Your requirement #2 seems to indicate that the queue is held in a
>database table.  In that case the queue is inherently distributable.
>Each machine makes its own connections to the database and processes
>tasks in the queue using whatever locking is necessary.

Yes. In this case, you are right.
The only thing that's missing is the "wakeup" to the servers
so that they do not need to poll.

>This requires queue workers to poll the database for new jobs, which
>you later state is something you're trying to avoid.
>
....
>> MY HUNCHES
>> 
>> I think I'll use IPC::Msg as the queue because the queue readers
>> will all be on one machine.  I'll also have to implement a simple RPC
>> server (using Net::Server) to perform remote insertions into the 
>> local queue.  If this seems too rough, I'll probably install the
>> Spread Toolkit and use Spread::Queue.
>> 
>> I currently think I'll keep working with Net::Server to see if I
>> can use it to process a queue rather than listen on a network port,
>> but I'm not sure that this is the right use of the module.
>> I may end up ditching this effort and just have a set of parallel
>> servers all waiting on the queue.  The queue mechanism itself will
>> work out who gets to work on which request.
>> 
>> Any input?
>
>Depending on how critical your transactions are, it may be more
>reliable to use the database as the queue.  Jobs passed through it are
>saved to persistent storage, making them more likely to survive a
>crash.  Do you need to roll forward unprocessed tasks if you must
>restart the server?

Crash resistance is an important consideration for queues and queue
workers in general.  In this case, because it is primarily a read-only
decision support system, if we had a system crash, the loss of requests
in the queue would be the least of our worries.

>If you use the database as the queue, the message passing between
>clients and servers amounts to little more than a wake-up call: Hey,
>you've got a task!

You are right.

In fact, all my queue needs to do is say "Hey, you've got a task".
That eliminates polling when there is no work to do and wakes up the
server immediately when there is work to do.
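
A rough sketch of that arrangement, under loud assumptions: sqlite3
stands in for the real database, a pipe stands in for whatever channel
carries the wake-up, and the table and function names are made up for
illustration. The jobs live in the table. The channel carries nothing
but a one-byte nudge.

```python
import os
import sqlite3


def make_queue():
    # Assumption: an in-memory SQLite table stands in for the real database.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT)")
    return db


def enqueue(db, wake_fd, payload):
    # Store the job first, then send the wake-up.
    db.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    db.commit()
    os.write(wake_fd, b"!")  # content is irrelevant; it only means "look"


def drain(db):
    # Take every waiting job, oldest first, and remove it from the table.
    rows = db.execute("SELECT id, payload FROM jobs ORDER BY id").fetchall()
    db.executemany("DELETE FROM jobs WHERE id = ?", [(r[0],) for r in rows])
    db.commit()
    return [payload for _, payload in rows]


def wait_and_work(db, wake_fd):
    # Blocks here instead of polling; several wake-ups may arrive at once.
    os.read(wake_fd, 4096)
    return drain(db)
```

This version assumes a single worker. With several workers on the same
table, each one would need row locking or a "claimed by" column so that
two servers do not pick up the same job.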

I might almost use a signal.  I would just need to "IGNORE" the signal
while the server is running and reset the signal handler when the server
is about to go back to sleep.
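
One caveat with ignoring the signal: a wake-up that arrives while the
server is busy is discarded. A variation, sketched below, blocks the
signal instead of ignoring it, so that it stays pending until the server
is ready to wait again. This is a swapped-in technique, not the scheme
described above. The choice of SIGUSR1 and the function names are
assumptions.

```python
import os
import signal

WAKEUP = signal.SIGUSR1  # assumed choice of wake-up signal


def setup_worker():
    # Block (not ignore) the signal: one sent while we are busy stays pending.
    signal.pthread_sigmask(signal.SIG_BLOCK, {WAKEUP})


def sleep_until_woken():
    # Returns at once if a wake-up is already pending, otherwise sleeps.
    return signal.sigwait({WAKEUP})


def wake(pid):
    # Called by whoever inserts a job into the queue.
    os.kill(pid, WAKEUP)
```

Several wake-ups sent while the server is busy collapse into a single
pending signal, so the server should process everything in the queue
each time it wakes rather than one job per signal.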

However, I have been thinking about asynchronous execution, queues,
and queue-working, and I wanted to get a handle on how best to
solve the problem in a general way.

Stephen

