On May 25, 2007, at 4:32 PM, [EMAIL PROTECTED] wrote:
Hello Ian, Robert, and Henrik
I'll try to comment based on the responses received from the three
of you in this single thread so as to minimize the posts. Before
proceeding, let me just clarify that I am only interested in using
the BDB backend.
I would have to disagree about the documentation for Elephant not
being
abundant---Ian has written a 118 page manual.
Nonetheless, you are correct that the use of Elephant in a multi-
threaded webserver environment is not heavily documented. Ian and I
have discussed the need for a killer "example app" and eagerly await
someone contributing one.
First of all, I want to apologize if my comment came across the
wrong way. I know that Ian (and whoever else has been contributing)
has done a superb job at enhancing Elephant's documentation. It
definitely has come a long way. I first had a bit of difficulty
finding the latest documentation since I couldn't find it online. I
then learned that it came in the doc directory and that you had to
"make" it. Anyway, it is great!
Documentation has always been available online, so only developers
updating the web site or editing the documentation will need to
'make' it. The new site has documentation at:
http://common-lisp.net/project/elephant/documentation.html. This page
can be reached by clicking the "documentation" link in the leftmost
column of the home page. You can jump directly to the latest online
texinfo-style HTML by clicking 'Online Docs' in the upper right hand
corner of the home page. Do you know what caused you to miss these
links to the documentation? Was there anything confusing about our
site that we could fix?
As far as multi-threaded webserver environment, I know there was a
section about it in the doc (section 6.5) but, as you said, it's
not very elaborate.
Read 4.10, 4.11 and 4.13. Section 6.5 needs more work to serve as a
proper example; section 6 mostly has placeholders at present. I'll
see about expanding on 4.10 and 6.5 as time allows.
However, I don't have such a strong background in using ODBs and
mainly come from the SQL world. So, just for curiosity's sake, I read
the tutorial for AllegroCache which tries to show the "proper" way to
use AllegroCache in real-world systems
(http://www.franz.com/products/allegrocache/docs/acachetutorial.pdf).
I'd like to clarify my comment above. Because I read several
AllegroCache documents, I misreferenced the document I really
wanted to reference.
The document in question is titled "AllegroCache with Web Servers"
and can be found here:
http://www.franz.com/products/allegrocache/docs/acweb.pdf
As you comment below, reading the acache document created a great
deal of confusion! Please ignore it. While the object and class
interfaces are similar, the system implications and usage models can
be very different so as you comment, you are comparing apples and
oranges.
In the first place, one has to go back to basics a little bit.
Whenever you have concurrency, you have to have concurrency control.
Personally, I tend to think of this at the object level, but I know
it is now common to think of it at the "database level". You are
generally correct that if you are using SQL (or BDB, for that matter)
as a database and you keep ALL of your state in the database, then
you can generally rely on the concurrency control of those systems to
serialize everything and not allow two threads to interfere with each
other. However, almost ANY application will have to think about
concurrency; if you are SQL-oriented, you will do this by defining
"transactions". If you define your transactions wrong, you will have
concurrency errors, even though the database serializes transactions
perfectly.
For example, since the Web forces us into "page based" or "form
based" interaction (which javascript, CSS and Ajax promptly
complicate), one can generally think of a web application as "one
page turn == one transaction". But even that might not be true---you
could take 10 page turns to submit an order, and the order must be
atomic---that is, you either order the thing with 10 pages of
specifications, or you don't order it. A half-order is a corrupt
order.
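To make the atomicity point concrete, here is a minimal sketch in
Elephant terms, assuming a hypothetical persistent order class and a
hypothetical add-specification helper; either all ten pages of
specifications are committed or none are.

(defun submit-order (customer specifications)
  (with-transaction ()
    (let ((order (make-instance 'order :customer customer)))
      ;; If anything in the body signals an error, the whole
      ;; transaction aborts and no partial order is written.
      (dolist (spec specifications)
        (add-specification order spec))
      order)))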
I agree with you that, in general, when dealing with web
applications involving multiple clients and servers, you have to
have concurrency control. How much you have to provide in your own
application versus how much the "database framework" offers is, in
my opinion, a good question.
Making reference to the Allegro document, it says "In AllegroCache
a program opens a connection to a database and that connection can
only be used by one thread at a time." Then, as you read the
document and focus on their client-server model, they present
sample code that uses "thread-safe" connection pools, with a macro
named with-client-connection. "This macro
retrieves a connection from the pool. If no connection is
available it will create a new connection but it
will create no more than *connection count* connections. If all
connections are created and a new connection is needed the
requesting thread will be put to sleep until a connection is
returned to the pool."
The macro is not the problem, since I could "think of" this macro
as something like Elephant's with-transaction. The problem, and the
overhead I was referring to in my original post, is that to perform
a basic operation such as updating a hash-table value, they write
the function like this:
(defun set-password-for-pool (user new-password)
  (with-client-connection poolobj
    (with-transaction-restart nil
      (setf (map-value (or (poolobj-map poolobj)
                           (setf (poolobj-map poolobj)
                                 (retrieve-from-index 'ac-map
                                                      'ac-map-name
                                                      "password")))
                       user)
            new-password)
      (commit))))
As you can see, there is some possibly unnecessary overhead in the
fact that you are getting a connection from the pool and then
obtaining a handle to the "password" hash table before anything can
be set. The reason they do this, as I understand it, is that each
connection handle works independently in each thread, so each
connection has to maintain a separate handle to each persistent
object class; their solution involves storing in the poolobj
structure a handle to the connection and a handle to the hash table.
So, if this were a more complex application, involving n persistent
classes with m persistent attributes per class, the overhead of
writing all this is significant. Assuming we follow the Elephant
recommendation in section 2.9.3, where actions should be reduced to
minimal operations and nested within with-transaction/
ensure-transaction, I would have to write, potentially, 2*n*m defuns
(getter/setter) for all the attributes, with all the code to fetch
and cache the handles to the connection and to the respective n
persistent classes.
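For contrast, a minimal sketch of what the Elephant-style equivalent
might look like, assuming an open *store-controller*; the user class
and its accessors are hypothetical, and no per-connection handles are
needed.

(defpclass user ()
  ((name :accessor user-name :initarg :name :index t)
   (password :accessor user-password :initarg :password)))

(defun set-password (user new-password)
  ;; ensure-transaction joins an enclosing transaction if one is
  ;; active, otherwise it starts (and commits) its own.
  (ensure-transaction ()
    (setf (user-password user) new-password)))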
Elephant has the "with-transaction" macro. This is really the best
way to use Elephant in most circumstances --- but even then, if you
are keeping ANYTHING in memory, whether to cache it for speed or
because you don't want to put it in the database (session-based
information would be a typical example), you may still have to
"serialize" access to that state. That unavoidably means that you
have to understand some sort of mutual exclusion lock; unfortunately,
these are not 100% standard across different lisps. However, most
lisps do provide a basic "with-mutex" facility. I use this in the DCM
code (in the contrib directory) to serialize access to a "director",
which is very similar to a version of Ian's persistent classes, but
based on a "keep it all in memory and push writes to the DB" model
(that is, Prevalence).
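A minimal sketch of that kind of serialization, assuming
bordeaux-threads for a portable lock; the in-memory session cache and
its accessors are hypothetical.

(defvar *session-cache* (make-hash-table :test 'equal))
(defvar *session-lock* (bt:make-lock "session-cache"))

(defun cached-session (id)
  ;; Serialize all access to the shared in-memory table.
  (bt:with-lock-held (*session-lock*)
    (gethash id *session-cache*)))

(defun (setf cached-session) (session id)
  (bt:with-lock-held (*session-lock*)
    (setf (gethash id *session-cache*) session)))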
The idea I have is to rely on the persistent data instead of
in-memory data. Once I get this going, I may decide to improve
performance with in-memory caches, or anything else. I just want to
get the concept going in a stable and scalable form.
If you will forgive me over-simplifying things, if:
1) You can legitimately think of every page turn as a transaction, and
2) You keep all of the meaningful state in the Elephant DB, and
3) You wrap your basic "process-a-page" function in "with-transaction",
then you won't have a concurrency control problem.
That is a completely appropriate style for certain relatively simple
web-applications; for other kinds of web-applications it would be
very
constraining and slow --- but one should never optimize for speed
before
it is necessary.
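A minimal sketch of rule (3) above, assuming a hypothetical
render-page function and an open store; each page turn runs as a
single transaction.

(defun process-a-page (request)
  ;; All reads and writes done while rendering this page commit
  ;; atomically; on conflict, with-transaction retries the body.
  (with-transaction ()
    (render-page request)))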
I don't mind the over-simplification as long as I understand it :),
and I do. However, thinking back to the AllegroCache document, from
what I understood, they basically take a handle to the connection,
perform the operation, and then release the connection. If this were
a multi-page web operation, it seems that their recommendation would
not be the most appropriate, IMHO, but then again, I don't know.
Connections and handles are completely different in Elephant; the
acache docs are not helpful here.
In your recommendation, if I had an order entry system with multiple
pages to be completed before committing the order, I could
understand wrapping the whole thing with with-transaction. However,
wouldn't that present a possible problem, locking resources and
leaving it up to the human user to complete the process before
committing or rolling back the transaction?
There are lots of ways to think about this.
One is that you keep track of the ongoing session using in-memory
objects unique to the session. When you need to manipulate a
database (to submit an order, a blog entry, etc) the handler for the
'submit' action uses with-transaction to take the data from the in-
memory session object and commit it to the database (an entry in a
per-user btree, adding a new instance to a class, etc).
If you need session history or want to maintain ongoing state, make
this session object a persistent object instead. Then each POST or
GET action in the session is logged so you can recover if the user
goes away for a while, or if there is a server error. You will
eventually fill up your disk with sessions (in the absence of GC), so
you need to either drop the session objects when you are done with
them or use a separate store for session objects and periodically
delete and recreate it. We still need a clean model for online GC of
persistent objects to avoid explicit reclamation.
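A minimal sketch of such a persistent session object, assuming
Elephant's defpclass and an open *store-controller*; the class and
slot names are hypothetical.

(defpclass web-session ()
  ((session-id :accessor session-id :initarg :session-id :index t)
   (last-action :accessor last-action :initform nil)
   (form-data :accessor form-data :initform nil)))

(defun record-action (session action data)
  ;; Log each POST/GET against the session so it can be recovered
  ;; after the user goes away or the server restarts.
  (ensure-transaction ()
    (setf (last-action session) action
          (form-data session) data)))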
As for contention, with-transaction will retry the transaction code,
so if you have a POST handler you can do something like:
(defun handle-post-1 ()
  (with-post-data
    (send-response-page
     (with-transaction ()
       <copy post data to persistent objects>
       <return response persistent object>))))
This way the update can robustly handle contention while the user
only sees the final page that results from the update to the
persistent object for that user/session. If there is a real problem
and the process fails, you can wrap (send-response-page ...) with a
'handler-case' form that sends a server error page with a link to
restart the transaction (perhaps with the session object so the form
entries are properly initialized on the retry) if the transaction
cannot be committed.
Failing transactions signal 'transaction-retry-count-exceeded.
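A minimal sketch of that handler-case wrapping, reusing the
hypothetical web-session class sketched earlier; send-error-page is
also hypothetical, while send-response-page comes from the example
above.

(defun handle-post-2 (session post-data)
  (handler-case
      (send-response-page
       (with-transaction ()
         ;; Copy the post data into the persistent session object and
         ;; return it so the response reflects the committed state.
         (setf (form-data session) post-data)
         session))
    (transaction-retry-count-exceeded ()
      (send-error-page "The update could not be committed; please try again."))))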
If you are using BDB, make sure that db_deadlock is running, either
via the :deadlock-detect keyword option or by running it as an
external process (if using multiple lisp processes).
As a user of Elephant, you really shouldn't have to worry too much
about threading so long as you follow the simple rules laid out in
the manual under "multi-threading". I think you are trying to
understand how we make this possible since it seems harder from your
read of the acache interface.
You may be right. However, thinking more about this whole thing, and
from my understanding of Elephant and what I understood of
AllegroCache, I may be trying to compare apples and oranges. They
may be similar systems, but I don't know if it does justice to
compare Elephant with AllegroCache's client-server model. If I
understand it correctly (now), the current implementation of
Elephant is more similar to AllegroCache's standalone
(non-client-server) model. So each web process that accesses
Elephant can do so seamlessly with the standard *store-controller*
(assuming a single store controller) and not have to deal with
managing connection pools and all that. Keeping this in mind, I
would also assume that in Elephant I don't have to keep a handle to
each persistent class for each connection. Maybe this is what
confused me, and maybe I shouldn't be reading AllegroCache's
documentation :)
Correct and correct. We implement the physical storage persistent
classes _very_ differently than acache and trying to compare the
system implications of using each is likely to be more confusing than
helpful. Don't think of them as the same kind of system, they are
two different systems optimizing different aspects of the common
problem of implementing persistent classes.
A simple conceptual model is that each thread has its own
transaction. If these transactions are executing concurrently, the
BDB library or SQL server logs the read dependencies and the
side-effecting writes until a commit or abort. On abort, you throw
away the log. On a commit, one transaction at a time writes its
side effects and cancels any transaction that has read or written one
of the entries written by the committing transaction.
Thread isolation is guaranteed by a policy for using the BDB library
or SQL server. Calls to these libraries are re-entrant. The current
transaction ID (only used by one thread) determines where the reads
and writes are logged (this is a little more complex for SQL, but
handled by the data store and transparent to the user).
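A toy sketch of that "one transaction per thread" model, assuming
bordeaux-threads and an open store; the root key "counter" is
hypothetical.

(defun increment-counter ()
  ;; Each call runs in its own transaction; conflicting updates are
  ;; detected at commit time and the losing transaction is retried.
  (with-transaction ()
    (let ((n (or (get-from-root "counter") 0)))
      (add-to-root "counter" (1+ n)))))

(defun bump-counter-concurrently ()
  (let ((threads (list (bt:make-thread #'increment-counter)
                       (bt:make-thread #'increment-counter))))
    (mapc #'bt:join-thread threads)
    (get-from-root "counter")))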
I guess this goes back to what I just commented on: the fact that
each web thread/request will use the connection already in place in
the Lisp VM and not have to deal with establishing a new connection
(I could have checks to make sure that if the store controller is
not opened, I open it, but once it's opened, I "shouldn't" have to
worry too much about it). Right?
Correct. Elephant maintains a connection to a BDB session which
maintains an open file of the underlying logs and database files.
This is shared among threads because BDB is re-entrant and
transaction ids are used to provide isolation in the presence of
concurrency.
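A minimal sketch of that open-once pattern, assuming the BDB data
store; the path is hypothetical.

(defvar *my-store* nil)

(defun ensure-store-open ()
  ;; Open the store once at server startup; request threads then share
  ;; the default *store-controller* that open-store sets up.
  (unless *my-store*
    (setf *my-store* (open-store '(:BDB "/var/data/elephant/")))))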
The only other thing we need to do is make sure that the parts of
elephant shared across threads are themselves protected by Lisp
locks. Most of this is the store-controller and some data structures
used to initialize a new store controller.
As an end-user application developer, do I need to worry about this
or should I expect the Elephant framework to handle it?
Elephant handles it, sorry if I confused the issue but I thought you
were trying to understand how elephant implements thread safety.
If you stick to the user contract in the documentation, you shouldn't
have to worry further about interactions of multiple threads (other
than the usual considerations if you have shared lisp variables
instead of shared persistent objects).
I would assume you are referring to my own application shared
variables and not Elephant-related variables, right?
Yes
I think that SQL databases are a safer bet than Berkeley DB
for having several processes on different machines talking to the
same
store, so I will have one instance of postgresql running on a server
with scsi raid 10 and lots of ram.
Henrik, would you mind elaborating more on this? Why would SQL
databases be safer than the BDB stores? I know they are handled by
separate processes, potentially, on separate machines, so in
essence, they are independent of your application. However, isn't
BDB designed just to tackle that using an application library
instead of a separate process?
The problem he is trying to solve is scaling computational power by
using multiple CPUs and multiple servers. This is doable with
Elephant so long as each independent lisp image is using the same
data store spec. However, if you have two machines, Berkeley DB in
its normal mode will not work correctly, as its locking facilities
require shared memory between all processes sharing a given disk
database. So the multiple-CPU problem is solved by using N lisp
processes for N CPUs with shared memory. However, the multiple-server
problem requires a common server that all web servers can talk to.
This is easier to set up with SQL than to write your own server on
top of BDB.
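For the multiple-server case, a minimal sketch of pointing several
lisp images at one central Postgres instance; the host, database
name, and credentials are hypothetical, and the exact spec format
should be checked against the CL-SQL data store documentation.

(defvar *shared-store* nil)

(defun connect-to-shared-store ()
  ;; Every web server (on any machine) opens the same spec, so they
  ;; all share the central PostgreSQL database.
  (setf *shared-store*
        (open-store '(:CLSQL (:POSTGRESQL "db.example.com" "appdb"
                              "appuser" "secret")))))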
Overall, this being my first experience with Lisp and ODBs, I really
like Elephant. After reading some of AllegroCache's documentation, I
would still prefer using Elephant. Maybe I'm trying to see deeper
than I need to. Maybe I just need to see more samples of real-world
applications. I would love to contribute sample applications to the
project so as to make it clearer and easier for others to learn, but
I guess I have to learn it myself first. Code-wise, I think I have
grasped the whole thing. However, since I currently have no ability
to test anything at a larger scale, I'm trying to understand what it
would take for an application that uses Elephant to work in a
large-scale system (both hardware and software).
The biggest issue in scaling is when you think your application needs
to be larger than a single server. Elephant is great for single
server applications. When you scale to multiple servers it is
because you are talking about high hundreds to thousands of
concurrent sessions instead of dozens. That kind of traffic likely
requires a highly reliable substrate and I'm not sure Elephant is
sufficiently hardened that I could recommend it for that kind of
use. Unless, of course, you want to pave new ground with it in which
case I think Elephant can get there.
Thanks again for all of your comments. They did in fact help me, and
I am sure your follow-up comments will help me even more. Now, while
you guys digest this and reply to my post, I will go back and read
the updated Elephant manual :)
Thanks,
Daniel
Good luck, when you figure all this out a detailed summary of the
primary things that confused you would be helpful in improving the
documentation.
_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel