>Let me try to explain locking once more (since there's confusion on that and
>the db pool issue).
Thank you! That explains it. I also like how that works; it's similar to how
I've implemented data pools/queues before in Java.
<snip>
>Everything with this approach is very scalable (locks work great, multiple
>threads can work on multiple messages simultaneously, use multiple db
>connections, etc...) except for how TownSpoolRepository implements accept().
>Right now what it does is list() of all the keys, and then walks through the
>list to try to find an unlocked message. This is unscalable because as the
>list of messages grows, this gets slower and slower...
*nods*
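Just to check that I'm picturing the problem right, the unscalable part
sounds roughly like this (a sketch of my mental model only, not the actual
TownSpoolRepository code; the class and field names are made up):

import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// How I picture the current accept() behaving -- a guess, not real code.
public class LinearScanSpool {
    private final List keys = new ArrayList();    // every key in the spool (what list() returns)
    private final Set lockedKeys = new HashSet(); // keys some other thread already locked

    public synchronized String accept() throws InterruptedException {
        while (true) {
            // Walk the full key list looking for anything unlocked, so the
            // scan gets slower and slower as the spool grows.
            for (Iterator it = keys.iterator(); it.hasNext(); ) {
                String key = (String) it.next();
                if (!lockedKeys.contains(key)) {
                    lockedKeys.add(key);
                    return key;
                }
            }
            wait(); // nothing free right now; rescan when something changes
        }
    }
}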
What happens to messages that have been sent? How does James know not to send
them again? I understand the thread must call remove(), but what does this
actually do?
Also, when do the messages get loaded into memory from the DB? Is that
done every time list() is called?
This may not make sense, because I may not understand *all* of this
completely yet, but couldn't the messages that accept() returns a key for
be put in another queue? Then, when the thread that holds the lock calls
retrieve(), the message could be moved to yet another queue. So you would
end up with three pools/queues/whatever you want to call them. The first
would be all messages that have been retrieved from the DB but haven't
been accept()ed yet. The second would be accept()ed messages. The third
would be retrieve()d messages. I'm assuming remove() removes the message
and that the thread calls this when it successfully sends the message;
remove() could just act on the third queue.
Now you don't even have to search the first queue for a message; you can
just pull the first one off the array/queue/whatever and give it to a
thread calling accept(). You do have to search the second queue to find
the message associated with a key, but theoretically this is a much
smaller number of messages than the first queue (if the piece responsible
for keeping the spool filled is doing its job and a lot of messages are
being processed). Messages *should* only be in the second queue for a
very short time, too. The third queue represents all the messages
currently being sent. remove() will have to search this queue, but again,
it should be a smaller queue than the first.
I'm not sure how messages are searched by accept() to see if they have a
lock, but that wouldn't be required if the virgin messages are kept
separate from the ones that are being worked on.
Does this make sense, or doesn't it fit at all to try to do this with how
it currently works?
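To make it a little more concrete, here's a very rough sketch of the
three-pool idea (the ThreeQueueSpool name and the LinkedList/HashMap
choices are just mine, not anything that exists in James today):

import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

// Sketch only -- not real James code.
// Pool 1: messages loaded from the DB that haven't been accept()ed yet.
// Pool 2: messages a thread has accept()ed but not yet retrieve()d.
// Pool 3: messages that have been retrieve()d and are being sent.
public class ThreeQueueSpool {
    private final LinkedList unaccepted = new LinkedList(); // of Entry
    private final Map accepted = new HashMap(); // key -> message
    private final Map sending = new HashMap();  // key -> message

    // No searching needed: just hand out the head of the first pool.
    public synchronized Object accept() throws InterruptedException {
        while (unaccepted.isEmpty()) {
            wait(); // woken by whatever keeps the spool filled from the DB
        }
        Entry e = (Entry) unaccepted.removeFirst();
        accepted.put(e.key, e.message);
        return e.key;
    }

    // Only the (small) second pool has to be consulted by key.
    public synchronized Object retrieve(Object key) {
        Object message = accepted.remove(key);
        sending.put(key, message);
        return message;
    }

    // Delivery succeeded: drop it from the third pool (and the DB row
    // would get deleted here too).
    public synchronized void remove(Object key) {
        sending.remove(key);
    }

    // Called by the piece responsible for keeping the spool filled.
    public synchronized void addFromDatabase(Object key, Object message) {
        unaccepted.add(new Entry(key, message));
        notifyAll();
    }

    private static class Entry {
        final Object key;
        final Object message;
        Entry(Object key, Object message) {
            this.key = key;
            this.message = message;
        }
    }
}

The point is just that accept() becomes a constant-time "take the head",
and only the two small pools ever get searched by key.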
>What's even worse is how it implements accept(long delay). If a message is
>put back in the queue for a later retry, a thread calls accept(long delay),
>which means the same thing as accept but wait up to "delay" milliseconds
>before retrying a failed message. Not only does it lock a message, but it
>then retrieves the message to check the time. This makes it geometrically
>slower as the number of messages needing a retry increases.
I'm still not exactly sure how this works. Does that mean the delay is
determined by each thread rather than by the message object itself? I
would expect the message object to contain a timestamp and a duration
that it has to wait after that timestamp before it can be sent again.
The accept() method
could either check the message before giving it to the caller or *another*
queue could contain all retries and a different thread could periodically
scan that queue. Of course, you could create high-priority threads for
each message that should be retried and have the message itself go into a
sleep() and then insert itself back into the queue when the sleep()
expires, but that would eat up a lot of threads. Or, if you wanted to be
more efficient and didn't care about *exact* retry durations, you could
keep several queues, where the first represents (for example) 60-second
retries, the second 120-second retries, the third 240-second retries, and
so on. Each time a message fails, it is put into the queue after the one
it was in before. Then you could have a thread for each queue that wakes
up every 60, 120, 240, etc. seconds and scans its queue, dumping all the
messages back into the main spool. Obviously, you could get by with one
scanner thread if the durations *were* all multiples of 60.
Make sense?
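Something like the following is what I had in mind for that last
variation (again, just a sketch with made-up names; a real version would
need shutdown handling and would update the DB rather than in-memory
lists):

import java.util.LinkedList;

// Sketch only -- a failed message goes into the bucket after the one it
// was in before (60s, 120s, 240s retries), and a single scanner thread
// wakes every 60 seconds and dumps any bucket whose period has elapsed
// back onto the main spool.
public class RetryBuckets implements Runnable {
    private final LinkedList[] buckets =
        { new LinkedList(), new LinkedList(), new LinkedList() };
    private final LinkedList mainSpool; // stand-in for the real spool queue

    public RetryBuckets(LinkedList mainSpool) {
        this.mainSpool = mainSpool;
    }

    // Called when a delivery attempt fails; 'attempt' starts at 0.
    public synchronized void scheduleRetry(Object key, int attempt) {
        int i = Math.min(attempt, buckets.length - 1);
        buckets[i].add(key);
    }

    public void run() {
        int tick = 0;
        while (true) {
            try {
                // One thread is enough because every delay is a multiple of 60s.
                Thread.sleep(60000L);
            } catch (InterruptedException e) {
                return;
            }
            tick++;
            synchronized (this) {
                for (int i = 0; i < buckets.length; i++) {
                    // Bucket i fires every 60 * 2^i seconds (60, 120, 240).
                    if (tick % (1 << i) == 0) {
                        synchronized (mainSpool) {
                            mainSpool.addAll(buckets[i]); // dump the whole bucket
                        }
                        buckets[i].clear();
                    }
                }
            }
        }
    }
}

With one scanner a message can wait a little longer than its nominal
delay, but as I said, that only matters if you care about *exact* retry
durations.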
>Anyway, pure JDBC access will help me build much better implementations of
>this (not that it's impossible without it). I have to look over the db pool
>code from excalibur, but with that, hopefully I can get it working this
>weekend.
Do the above ideas get around the need for pure JDBC access?
>Serge Knystautas
>Loki Technologies
>http://www.lokitech.com/
>----- Original Message -----
>From: "David Doucette" <[EMAIL PROTECTED]>
>To: <[EMAIL PROTECTED]>
>Sent: Saturday, June 16, 2001 2:14 PM
>Subject: Re: RDBMS support
>
>
> > First off, I'm very new to James and since I haven't been able to get the
> > last stable release out of CVS yet, I'm forced to sit on the sidelines and
> > watch james-dev and james-user. However, I have learned some things and
> > have thought about how I would have to make changes if I wanted to
> > increase
> > performance.
> >
> > >1. James's database structure is very simple. The core of James has the
> > >potential to use 2 unrelated tables. One for a user repository and one
> > >for
> > >a message spool repository. The structure of these tables is very
> > >simple.
> > This is what struck me originally when I heard of all the abstraction
> > going
> > on. It doesn't seem like a very complex system is needed, since James
> > doesn't really have complex database requirements!
> >
> > >2. Users are already shielded from the database code by the
> > >UserRepository
> > >and SpoolRepository API. This ease of use would really only help the one
> > >or
> > >two developers who write the DatabaseUserRepository and
> > >DatabaseSpoolRepository class.
> > Very good point.
> >
> > >3. We need a lot of control over how data is returned for performance
> > >reasons. Two big limitations we are experiencing with Town (that I
> > >believe
> > >we would have with Turbine and other abstraction layers) is the inability
> > >to
> > >return parts of a ResultSet and to get streamed access to binary data. I
> > >believe both of these are critical to increasing scalability and
> > >performance.
> > I'm still not 100% clear on how the locking of records works (it almost
> > sounds like it uses Java's thread locking mechanisms rather than locking
> > by
> > setting a flag in the DB). However, if the engine has to periodically
> > re-read all the messages in the spool just to send some of them out, then
> > I'm all for anything that will turn that around. It just won't scale
> > without doing something about this. If someone could explain again how
> > this works, I'd appreciate it.
> >
> > >Then the only thing
> > >we need is a JDBC connection pool code, and I happen to have one I can
> > >add
> > >pretty easily.
> > Again, I'm just not clear on how database connections work now. It seems
> > to me that there can only be one connection retrieving spool messages,
> > because otherwise locking in memory wouldn't work and each thread that has
> > a connection would retrieve the same messages and send them again.
> >
> > >Any thoughts? I can throw the JDBC spool repository implementation
> > >together
> > >pretty quickly using the old table structure.
> > Go for it!
> >
> > I know that I would really like not having to redesign this part of James
> > so that it scales and I'm sure others feel the same way.
> >
> > David