Adrian Crum wrote:
> --- On Thu, 4/1/10, Adam Heath <[email protected]> wrote:
>> Adrian Crum wrote:
>>> --- On Thu, 4/1/10, Adam Heath <[email protected]> wrote:
>>>> Adrian Crum wrote:
>>>>> --- On Thu, 4/1/10, Adam Heath <[email protected]> wrote:
>>>>>> Adrian Crum wrote:
>>>>>>> --- On Thu, 4/1/10, Adam Heath <[email protected]> wrote:
>>>>>>>> Adrian Crum wrote:
>>>>>>>>> I multi-threaded the data load by having one thread parse the XML
>>>>>>>>> files and put the results in a queue. Another thread services the
>>>>>>>>> queue and loads the data. I also multi-threaded the EECAs - but
>>>>>>>>> that has an issue I need to solve.
>>>>>>>> We need to be careful with that. EntitySaxReader supports reading
>>>>>>>> extremely large data files; it doesn't read the entire thing into
>>>>>>>> memory. So, any such event dispatch system needs to keep the
>>>>>>>> parsing from getting too far ahead.
>>>>>>> http://java.sun.com/javase/6/docs/api/java/util/concurrent/BlockingQueue.html
>>>>>> Not really. That will block the calling thread when no data is
>>>>>> available.
>>>>> Yeah, really.
>>>>>
>>>>> 1. Construct a FIFO queue, fire up n consumers to service the queue.
>>>>> 2. Consumers block, waiting for queue elements.
>>>>> 3. Producer adds elements to queue. Consumers unblock.
>>>>> 4. Queue reaches capacity, producer blocks, waiting for room.
>>>>> 5. Consumers empty the queue.
>>>>> 6. Goto step 2.
>>>> And that's a blocking algo, which is bad.
>>> Huh? You just asked for a blocking algorithm: "So, any such event
>>> dispatch system needs to keep the parsing from getting too far ahead."
>> No, I didn't ask for a blocking algorithm. When the outgoing queue is
>> full, the producer needs to pause itself, so that its thread can be
>> used for other things.
> I guess you could make the producer consume a queue element, then try
> adding the new one again. So:
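A rough sketch of that retry-with-fallback idea, using only
java.util.concurrent.BlockingQueue; the HelpingProducer class and its
consume() method are made-up names for illustration, not code from the
actual patch:

import java.util.concurrent.BlockingQueue;

class HelpingProducer<E> {
    // Non-blocking add: while the queue is full, drain one element and
    // consume it right here, then retry the add.
    void put(BlockingQueue<E> queue, E element) {
        while (!queue.offer(element)) {
            E taken = queue.poll();
            if (taken != null) {
                consume(taken);
            }
        }
    }

    void consume(E element) {
        // stand-in for the loader-side work
    }
}

offer() and poll() are the non-blocking variants, so the producer never
parks; it just pitches in on the backlog until the add succeeds.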
Nope, not good enough. The producer thread could get stuck producing and
consuming for a long time. If there are several workflows like this in the
thread pool, those threads become unavailable for other work, and CPU is a
limited resource. In the SEDA model, a worker must be short in execution
time and return to the pool when it is done. It's perfectly acceptable,
however, for the worker to add another item to the pool's queue to continue
processing:

1: producer runs, creates a work unit
2: if the end has been reached, submit the work unit directly
3: otherwise, wrap the unit, so that when the wrapped unit gets run, the
   producer will be resubmitted
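A minimal sketch of those three steps, assuming a plain ExecutorService;
ChunkedProducer, the chunk counter, and makeUnit() are stand-in names for
illustration, nothing from the OFBiz tree:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ChunkedProducer implements Runnable {
    private final ExecutorService pool;
    private int remaining; // stand-in for "XML left to parse"

    public ChunkedProducer(ExecutorService pool, int chunks) {
        this.pool = pool;
        this.remaining = chunks;
    }

    public void run() {
        // 1: produce exactly one work unit; this step stays short-lived
        final Runnable unit = makeUnit(remaining--);
        if (remaining == 0) {
            // 2: end reached; submit the final unit directly and wind down
            pool.execute(unit);
            pool.shutdown();
        } else {
            // 3: wrap the unit so the producer is resubmitted only when a
            // worker actually runs it
            pool.execute(new Runnable() {
                public void run() {
                    unit.run();
                    pool.execute(ChunkedProducer.this);
                }
            });
        }
    }

    private Runnable makeUnit(final int n) {
        return new Runnable() {
            public void run() {
                System.out.println("loading chunk " + n);
            }
        };
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(new ChunkedProducer(pool, 5));
    }
}

Because the producer is rescheduled from inside the wrapped unit, the next
chunk is only parsed after the previous one has actually run, so the parser
can never get far ahead of the loaders, and no thread ever sits blocked.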
