--- On Sat, 4/3/10, Adam Heath <[email protected]> wrote:
> Adrian Crum wrote:
> > --- On Sat, 4/3/10, Adam Heath <[email protected]> wrote:
> >> Adrian Crum wrote:
> >>> I multi-threaded the data load by having one thread parse the XML files
> >>> and put the results in a queue. Another thread services the queue and
> >>> loads the data. I also multi-threaded the EECAs - but that has an issue
> >>> I need to solve.
> >> Well, there could be some EECAs that have dependencies on each other,
> >> when defined in a single definition file.  Or, they have implicit
> >> dependencies with other earlier defined ecas.  Like, maybe an order
> >> eca assuming that a product eca has run, just because ofbiz has always
> >> loaded the product component before the order component.
> > 
> > I used a FIFO queue serviced by a single thread for the EECAs - to
> > preserve the sequence. The main idea was to offload the EECA execution
> > from the thread that triggered the EECA. The data load was also in a
> > FIFO queue serviced by a single thread so the files were being loaded
> > in order.
> > 
> > To summarize:
> > 
> > 1. Table creation is handled by a thread pool with an adjustable size.
> >    A thread task is to create a table and its primary keys. Thread
> >    tasks run in parallel. Main thread blocks until all tables and
> >    primary keys are created.
> > 2. Main thread creates foreign keys.
> > 3. Main thread parses XML files, puts results in data load queue.
> > 4. A data load thread services the data load queue and stores the data.
> >    If an ECA is triggered it puts the ECA info in an ECA queue.
> > 5. An ECA thread services the ECA queue and runs the ECA.
> > 6. Main thread blocks until all queues are empty.
> 
> Except if an eca fires, but the main data load thread keeps going,
> then the main data load thread might insert/update something that
> hasn't yet been manipulated by the eca(s).

Good point. Maybe that's the problem I'm having and need to track down.
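
To make the ordering issue concrete, here is a minimal sketch of steps 3 through 6 of the summary above. It is not the actual patch: the class name, the string records, and the DONE marker are all made up for illustration, and "storing" a record or "running" an ECA is reduced to a log entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

public class InstallPipelineSketch {
    // Hypothetical end-of-input marker; real records must never equal it.
    private static final String DONE = "__DONE__";

    // One thread per queue, so each queue is drained in FIFO order.
    private static Thread worker(BlockingQueue<String> in,
                                 Consumer<String> action, Runnable onDone) {
        return new Thread(() -> {
            try {
                for (String rec = in.take(); !rec.equals(DONE); rec = in.take()) {
                    action.accept(rec);
                }
                onDone.run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    /** Pushes records through the load and ECA queues; returns the event log. */
    public static List<String> run(List<String> records) {
        BlockingQueue<String> loadQueue = new LinkedBlockingQueue<>();
        BlockingQueue<String> ecaQueue = new LinkedBlockingQueue<>();
        List<String> log = Collections.synchronizedList(new ArrayList<>());

        // Data load thread: "stores" the record, then queues the ECA for it.
        Thread loader = worker(loadQueue, rec -> {
            log.add("store " + rec);
            ecaQueue.add(rec);
        }, () -> ecaQueue.add(DONE));
        // ECA thread: "runs" the ECA, decoupled from the data load thread.
        Thread ecaRunner = worker(ecaQueue, rec -> log.add("eca " + rec), () -> { });

        loader.start();
        ecaRunner.start();
        for (String rec : records) {
            loadQueue.add(rec); // main thread stands in for the XML parser
        }
        loadQueue.add(DONE);
        try {
            loader.join(); // main thread blocks until both queues are drained
            ecaRunner.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("Product", "Order")));
    }
}
```

Each queue keeps its own FIFO order, but nothing stops "store Order" from being logged before "eca Product". That interleaving is exactly the case Adam describes.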

> Additionally, an eca can run a service, which can do anything,
> including adding/updating/removing other values, which cause other
> ecas to fire.  Which then interact with the queue-based eca.
> 
> Were your changes only active at startup, during the initial install,
> or were they always available?  When data is later manipulated, during
> a test run, certain guarantees still have to be met (which I'm sure you
> know).

It was just for run-install.

> >> This is a difficult problem to solve; probably not worth it.  During
> >> production, different high-level threads, modifying different
> >> entities, will run faster, they are already running in multiple threads.
> >>
> >> Most ecas (entity, and probably service) generally run relatively fast.
> >> Trying to break that up and dispatch into a thread pool might make
> >> things slower, as you have cpu cache coherency effects to contend with.
> >>
> >> What would be better is to break up the higher levels into more
> >> threads during an install.  That could be made semi-smart, if we add
> >> file dependencies to the data xml files.  Such explicit dependencies
> >> will have to be done by hand.  Then, a parallel execution framework
> >> that ran each xml file in parallel, once all of its dependencies were
> >> met, would give us a speedup.
> > 
> > The minor changes I made cut the data load time in half. That's not
> > fast enough? ;-)
> >
> > It didn't take a lot of threads or a lot of thought to speed things up.
> > The bottom line is, you want to keep parts of the process going while
> > waiting for DB I/O.
> 
> As for run-install, it starts up catalina.  It'd be nice if that were
> multi-threaded as well.  But catalina appears to be serial internally.
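
For what it's worth, the dependency-driven load suggested earlier in the thread could look roughly like the sketch below. Everything in it is an assumption for illustration: the class and method names are invented, the hand-written dependencies are passed in as a plain map, the dependency graph is assumed to have no cycles, and "loading" a file just records its name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DependencyLoadSketch {

    /** Loads every file once all of its prerequisites are loaded; returns the load order. */
    public static List<String> loadAll(Map<String, List<String>> deps, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        Map<String, CompletableFuture<Void>> futures = new HashMap<>();
        try {
            for (String file : deps.keySet()) {
                schedule(file, deps, futures, pool, order);
            }
            // Block until every file has been loaded.
            CompletableFuture.allOf(
                    futures.values().toArray(new CompletableFuture<?>[0])).join();
        } finally {
            pool.shutdown();
        }
        return order;
    }

    // Called only from the thread running loadAll, so a plain HashMap is enough.
    // Assumes the dependency graph is acyclic; a cycle would recurse forever.
    private static CompletableFuture<Void> schedule(
            String file, Map<String, List<String>> deps,
            Map<String, CompletableFuture<Void>> futures,
            ExecutorService pool, List<String> order) {
        CompletableFuture<Void> existing = futures.get(file);
        if (existing != null) {
            return existing;
        }
        List<String> prereqs = deps.getOrDefault(file, Collections.emptyList());
        CompletableFuture<?>[] waits = new CompletableFuture<?>[prereqs.size()];
        for (int i = 0; i < waits.length; i++) {
            waits[i] = schedule(prereqs.get(i), deps, futures, pool, order);
        }
        // The "load" (here just a log entry) starts once all prerequisites finish.
        CompletableFuture<Void> task = CompletableFuture.allOf(waits)
                .thenRunAsync(() -> order.add(file), pool);
        futures.put(file, task);
        return task;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("product", Collections.emptyList());
        deps.put("order", Arrays.asList("product"));
        deps.put("accounting", Arrays.asList("order", "product"));
        System.out.println(loadAll(deps, 4));
    }
}
```

Files with no dependency between them may load in any order relative to each other. Only the declared edges are guaranteed, which is why the implicit dependencies would have to be written down by hand first.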

Getting back to SEDA...

We could implement a SEDA-like architecture in a separate control servlet and 
try it out on different applications by changing their web.xml files. If we had 
access to the author's test code we could see if it made a difference in 
overload situations. Where I work we have a classroom filled with computers 
that could be used as clients to test a SEDA server.
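
As a starting point, a single SEDA-style stage might be sketched like this: a bounded event queue in front of a small thread pool, with events rejected instead of piling up when the queue is full. This is only my reading of the idea, not the author's code. The class name, the offer method, and the handler callback are all invented here, and a real version would sit behind the control servlet.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class SedaStageSketch<E> {
    private final ThreadPoolExecutor pool;
    private final Consumer<E> handler;

    public SedaStageSketch(int threads, int queueCapacity, Consumer<E> handler) {
        this.handler = handler;
        // Fixed-size pool fed by a bounded queue; AbortPolicy makes execute()
        // throw when the queue is full and all threads are busy.
        this.pool = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    /** Returns false when the stage is overloaded, so the caller can shed the event. */
    public boolean offer(E event) {
        try {
            pool.execute(() -> handler.accept(event));
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    /** Stops accepting events and waits briefly for queued ones to finish. */
    public void shutdownAndWait() {
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        SedaStageSketch<String> stage =
                new SedaStageSketch<>(2, 10, req -> System.out.println("handled " + req));
        for (int i = 0; i < 5; i++) {
            stage.offer("request-" + i);
        }
        stage.shutdownAndWait();
    }
}
```

Chaining stages would mean one stage's handler calling offer on the next. A false return is the hook where an overloaded server could send back a "busy" response instead of queueing without bound.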



