RE: Alt-Design status: XML handling

Rhett Aultman Mon, 25 Nov 2002 13:33:29 -0800

Completely generalized and probably worthless response below. ;)

-----Original Message-----
From: Oleg Tkachenko [mailto:[EMAIL PROTECTED]]
Sent: Monday, November 25, 2002 4:01 PM
To: [EMAIL PROTECTED]
Subject: Re: Alt-Design status: XML handling

>>
Peter B. West wrote:

> I don't believe it is only a matter of style.  I think the detrimental 
> effects of push for general programming are glaringly obvious.
It's just event-driven processing, how it could be detrimental?
<<

I cannot speak for FOP, but I can speak in generalities about this.  The difference 
between event-based and pull-style is roughly the difference between using a garden 
hose and using a garden hose with one of those spray-gun nozzles on it.  In the former 
case, the water keeps coming out of the hose, pretty much whether you want it to or 
not.  In the latter case, the water comes out only when you want it, but it requires 
effort on your behalf.  The question is when to use each.

Generally, event-driven processing is a pretty good thing.  The critical issue with 
it, though, is the ratio of event production to event processing.  If that number is 
anything greater than 1, then more events are being produced in a stretch of time than 
can be effectively processed in that stretch of time.  Events start to queue up, 
taking up memory.  If it happens enough, the heap starts to get a little too full, the 
gc runs a little too much, and that causes processing time to suffer even further.  
Under most circumstances, event-based processing is like using a garden hose to water 
a bed of flowers.  It works just fine.  Under more intense cases, though, it can be 
more like using a garden hose to fill a small container of water, then leaving the 
hose lying around (spilling water all over the lawn) while the container gets carried 
off somewhere.
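
A minimal sketch of the production/processing ratio described above, written as a 
deterministic simulation rather than real threads.  The class and method names 
(PushBacklogDemo, backlogAfter) and the per-tick rates are hypothetical and are not 
taken from FOP or SAX; the only point is that the queue grows without bound once the 
producer outpaces the handler.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PushBacklogDemo {
    /**
     * Simulates a push source feeding an unbounded event queue.
     * Each tick the source pushes producedPerTick events and the handler
     * manages to process at most processedPerTick of them.
     * Returns the number of events still queued after the last tick.
     */
    static int backlogAfter(int ticks, int producedPerTick, int processedPerTick) {
        Deque<Integer> queue = new ArrayDeque<>();
        int nextEvent = 0;
        for (int t = 0; t < ticks; t++) {
            // The source pushes whether or not the handler is ready.
            for (int i = 0; i < producedPerTick; i++) {
                queue.addLast(nextEvent++);
            }
            // The handler drains what it can in the same tick.
            for (int i = 0; i < processedPerTick && !queue.isEmpty(); i++) {
                queue.removeFirst();
            }
        }
        return queue.size();
    }

    public static void main(String[] args) {
        // Ratio 3:2 (greater than 1): one extra event is left over per tick.
        System.out.println(backlogAfter(1000, 3, 2)); // prints 1000
        // Ratio 2:3 (less than 1): the handler keeps up and the queue stays empty.
        System.out.println(backlogAfter(1000, 2, 3)); // prints 0
    }
}
```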

By comparison, if a program decides when to pull in more data to process, then there's 
an opportunity to control the amount that comes in at any given point.  This means that 
there's less (or no) need to worry about the rate at which data comes in, since it's 
turned on and off rather easily.  The amount of memory wasted is minimized (yes, I 
consider a wait queue to be a waste of memory, since it cannot be used for anything 
more productive).  The downside, of course, is that keeping the data streaming in for 
long periods of time requires continuous effort to tell the pulling system to pull in 
another chunk, much like how it takes effort to keep the valve open on a hose's spray 
gun.
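
The pull side can be sketched the same way.  This is an illustration under assumed 
names (PullDemo, CountingSource, consume), not anything from FOP: the source is a 
plain java.util.Iterator that produces nothing until asked, so the consumer decides 
how much data exists at any moment and no wait queue is needed.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class PullDemo {
    /** A lazy source: an item only comes into existence when next() is called. */
    static final class CountingSource implements Iterator<Integer> {
        private final int limit;
        private int produced = 0;

        CountingSource(int limit) {
            this.limit = limit;
        }

        @Override
        public boolean hasNext() {
            return produced < limit;
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return produced++;
        }

        /** How many items the source has been asked for so far. */
        int producedSoFar() {
            return produced;
        }
    }

    /** Pulls at most `wanted` items, one request per item, and sums them. */
    static int consume(CountingSource source, int wanted) {
        int sum = 0;
        for (int i = 0; i < wanted && source.hasNext(); i++) {
            sum += source.next(); // the "squeeze the nozzle" step
        }
        return sum;
    }

    public static void main(String[] args) {
        CountingSource source = new CountingSource(1_000_000);
        System.out.println(consume(source, 5));     // prints 10 (0+1+2+3+4)
        System.out.println(source.producedSoFar()); // prints 5, not 1000000
    }
}
```

The cost mentioned above is visible in consume: every item needs its own explicit 
request.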

There has been a time or two in my (admittedly, somewhat short) career as a developer 
where I've had cause to stop thinking in terms of an event system and instead work 
with a "pull" concept, and it was for the reason I gave: when an event source was 
allowed to generate events at its own pace, and the event handler took too long to 
process them, the events piled up and performance suffered.  I'd expect a very similar 
situation in FOP.  SAX processing tends to fire a lot of events, and if FOP does a 
significant amount of processing work relative to the work needed to fire another 
event, then those events are piling up in memory and wasting space.  I can definitely 
see an argument for a pull-based system.  Also, I think that a push model probably 
isn't going to scale as effectively to larger documents, whereas a pull system should 
have more constant performance regardless of document size.

Of course, take that with a mine of salt.

>>
It's in "Comparing XmlReader to SAX Reader" page[1]: "The push model can 
be built on top of the pull model. The reverse is not true." Too 
categorical statement, I think.
<<

And, I believe, it might be wrong, though I must read the full source text.  The push 
model can be seen as a special case of a pull model in the sense of "Pull everything 
ASAP, now and until the data is exhausted."  But a pull model can also be grafted onto 
a push model by implementing what amounts to a specialized buffer of the pushed data 
that accepts pull queries...no?
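
A rough sketch of that buffer idea, under stated assumptions: PushToPullAdapter, 
pull, and pushAllTo are hypothetical names, and the push source is modelled as any 
function that is handed a callback and fires String events at it.  This is not the 
SAX or XmlReader API.  A callback-style source runs its handler on its own calling 
thread, so the sketch runs the source on a second thread and lets a bounded queue 
block it whenever the puller falls behind.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

class PushToPullAdapter {
    // Private end-of-stream marker, compared by identity so it can never
    // collide with a real event that happens to have the same text.
    private static final String END = new String("<end-of-stream>");

    private final BlockingQueue<String> buffer;
    private boolean done = false;

    /** pushSource is handed a callback and pushes events at it at its own pace. */
    PushToPullAdapter(Consumer<Consumer<String>> pushSource, int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                pushSource.accept(this::enqueue);
            } finally {
                enqueue(END);
            }
        });
        producer.setDaemon(true);
        producer.start();
    }

    // Blocks the pushing thread while the buffer is full.
    private void enqueue(String event) {
        try {
            buffer.put(event);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Pull on top of push: returns the next event, or null when exhausted. */
    String pull() {
        if (done) {
            return null;
        }
        try {
            String event = buffer.take();
            if (event == END) {
                done = true;
                return null;
            }
            return event;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Push on top of pull: "pull everything ASAP until the data is exhausted". */
    void pushAllTo(Consumer<String> handler) {
        for (String event = pull(); event != null; event = pull()) {
            handler.accept(event);
        }
    }

    public static void main(String[] args) {
        PushToPullAdapter adapter = new PushToPullAdapter(handler -> {
            handler.accept("startElement");
            handler.accept("characters");
            handler.accept("endElement");
        }, 1);
        System.out.println(adapter.pull());     // consumer asks for one event
        adapter.pushAllTo(System.out::println); // then drains the rest push-style
    }
}
```

With a capacity of 1 the buffer never holds more than one pending event, at the 
price of an extra thread and a hand-off per event.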

>>
If so, we need more opinions from others.
<<

My major interests lie in things happening above this layer, so I don't really have 
too much concern, but I definitely can see a good case for a pull model.

