Re: Initializing Sleepycat::DbXml (Berkeley, Oracle) objects in startup.pl

Michael Ludwig Mon, 19 Jan 2009 03:06:43 -0800

macke...@animalhead.com wrote:

Yes it is now Abend again.  Timewise this is like relationships I
have had with Philips (now NXP) European colleagues...


And a new morning again, and it goes round and round ...

On Jan 18, 2009, at 10:56 AM, Michael Ludwig wrote:

macke...@animalhead.com wrote:

Thanks for replying again. What you've outlined above is what I had
implemented following Mark's suggestion in his first reply to my
initial post. Unfortunately, this quite reliably produces SEGVs, so I
think this usage is not aligned with the Berkeley interface design. I
don't know in which way this is wrong - I guess I'll have to ask the
Berkeley folks in order to maybe clarify this.

I was using BerkeleyDB 4.?, but without the fancy threading stuff you
mention below.


The BerkeleyDB module is based on the C interface while Sleepycat::Db is
based on the C++ extension to the C interface, as far as I can tell.

Since you can't share a tied variable (sharing uses the tying
mechanism), while I was futzing with threading, I tried to share the
underlying filehandle and use the function interface.  But that gave
lots of SEGVs so I quit sharing anything about the DBs between
threads, which got rid of the SEGVs.


Good. So you got the BerkeleyDB module working on a threading worker MPM
by avoiding any attempt at inter-thread sharing.

I'm still getting SEGVs trying to do the same with Sleepycat::DbXml.

What this means is that each thread must open the DBs for itself.
The amount of data stored for each open DB connection, times
THREADS_PER_CHILD times the number of Apache children at any
given point, makes for some memory.  But

1) the separate connections help the DB package be thread-safe,


So if coding the handler as above means that each thread, having its
own global variables, opens its own handle to the Berkeley
environment, I shouldn't need the DB_THREAD flag (which, according to
the Berkeley documentation, "[causes] the DbEnv handle returned by
DbEnv::open to be free-threaded; that is, concurrently usable by
multiple threads in the address space"). This flag is needed if any
handle is used by more than one thread (or process) concurrently. So
it shouldn't be needed if the handler is coded as above and there
isn't any concurrent access elsewhere.

I have not worked with this flag but from your words it sounds right.
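
The one-handle-per-thread arrangement discussed above can be sketched in a
language-neutral way. What follows is a minimal Python illustration only,
not mod_perl or Sleepycat::DbXml code: FakeEnv and open_env are hypothetical
stand-ins for whatever actually opens the Berkeley environment. Each thread
lazily opens its own handle and never hands it to another thread, which is
the situation in which a free-threaded handle should not be required.

```python
import threading


class FakeEnv:
    """Hypothetical stand-in for a database environment handle."""

    def __init__(self, path):
        self.path = path
        # Remember which thread opened this handle.
        self.owner = threading.get_ident()


# Per-thread storage: attributes set here are invisible to other threads.
_local = threading.local()


def open_env(path="/hypothetical/env/dir"):
    """Return the calling thread's handle, opening it on first use."""
    env = getattr(_local, "env", None)
    if env is None:
        env = FakeEnv(path)
        _local.env = env
    return env


def handler(results, index):
    """Simulated request handler that only touches its own thread's handle."""
    first = open_env()
    second = open_env()  # same thread, so the existing handle is reused
    results[index] = (first is second, first.owner == threading.get_ident())


def run(n_threads=4):
    """Run the handler once in each of n_threads threads and collect results."""
    results = [None] * n_threads
    threads = [
        threading.Thread(target=handler, args=(results, i))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling run() returns one pair per thread; both members are True when a
thread reused its own handle and never saw one opened by another thread.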

2) the first-used threads keep getting re-used in preference to
   threads not yet used.


I noticed this is indeed what seems to happen - whether by chance or
as a feature, I don't know.


You can find it described proudly as a memory-minimization feature in
some of the Apache docs about worker and/or event.


Does anyone know where this is? Haven't found it here:

http://httpd.apache.org/docs/2.2/mod/worker.html

3) if you consider each thread as more or less equivalent to a
   child process in prefork, your total memory requirement is less.


From perldoc perlthrtut: "In this model each thread runs in its own
Perl interpreter, and any data sharing between threads must be
explicit." This does not sound to me as if there is a significant
advantage over spawning child processes, at least not on UNIX.


Start a prefork Apache2 (if you want, let it run a while to give the
children a chance to grow).  Note the (largest or average) size of
child processes.

Start a worker or event Apache, let it run similarly (or not at all if
you're impatient).  Divide the (largest or average) size of child
processes by THREADS_PER_CHILD.  My point was that this number will be
much smaller than the process size with prefork.
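
The arithmetic behind this comparison, and behind the earlier estimate of
per-connection data times THREADS_PER_CHILD times the number of children,
can be written out as a small sketch. All sizes below are invented for
illustration and are not measurements; the function names are hypothetical.

```python
def per_thread_memory(child_size_mb, threads_per_child):
    """Size of a worker/event child process divided by its thread count."""
    return child_size_mb / threads_per_child


def total_connection_memory(per_connection_mb, threads_per_child, children):
    """Memory for open DB connections when every thread opens its own."""
    return per_connection_mb * threads_per_child * children


# Invented example figures (assumptions, not measured values).
prefork_child_mb = 40.0    # one prefork child serves one request at a time
worker_child_mb = 120.0    # one worker child serves THREADS_PER_CHILD requests
threads_per_child = 25

per_thread_mb = per_thread_memory(worker_child_mb, threads_per_child)
print("prefork, per concurrent request (MB):", prefork_child_mb)
print("worker, per concurrent request (MB):", per_thread_mb)
print("DB connection memory (MB):",
      total_connection_memory(0.5, threads_per_child, 4))
```

With real numbers read from ps or top in place of the invented ones, the
second figure is the one to compare against the prefork child size.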


Okay, I see. More concurrency bang for the memory buck, despite certain
restrictions of the Perl threading implementation, if my understanding
is correct.

Michael Ludwig
