I'm sure the Store API will need to change somewhat as we figure this out. I was planning on (at least) three implementations: a memory store, a jdbc store, and a journal store. The first two I have checked in and the third is on my laptop waiting to be checked in. The memory store will be very fast but non-durable and non-reliable. The jdbc store will be dog slow ;-) but very reliable since it persists each instance in its own transaction without any boxcaring, batching, or SQL reordering. We could implement a JPA-based store, which would give us batching and SQL reordering, but then we lose some reliability since writes are not necessarily flushed (i.e. forced to disk) before success is reported to the client. At that point, the overhead of using a database seems superfluous given we would be guaranteed "ACI" but not "D".
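
To make the jdbc store behavior concrete, here is a rough sketch of persisting each instance in its own transaction. The class, table, and column names are placeholders I'm making up for illustration, not the code I checked in:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

public class SimpleJdbcStore {

    private final DataSource dataSource;

    public SimpleJdbcStore(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // Writes one serialized instance and commits immediately, so the caller
    // only sees success after the database has durably recorded the row.
    public void write(String id, byte[] serializedInstance) throws SQLException {
        try (Connection connection = dataSource.getConnection()) {
            connection.setAutoCommit(false);
            try (PreparedStatement statement = connection.prepareStatement(
                    "INSERT INTO instance_store (id, data) VALUES (?, ?)")) {
                statement.setString(1, id);
                statement.setBytes(2, serializedInstance);
                statement.executeUpdate();
                connection.commit(); // one transaction per write: slow but reliable
            } catch (SQLException e) {
                connection.rollback();
                throw e;
            }
        }
    }
}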

Would an RDB DAS-based store make sense? If disconnected SDOs (static or dynamic) could provide any benefit, then you might consider this.

Hi Kevin,

I thought about a disconnected solution but I'm not sure it is what we need. I would characterize our main persistence requirement as maximizing throughput in the following scenarios:

- No durability (and hence no reliability) for high-volume scenarios that involve non-mission-critical operations.
- Durability with absolute reliability and full recovery (given the right hardware), i.e. no chance of data loss.
- Durability with no reliability. For this to be more useful than #1, there should be a very limited window where data loss is possible. I'm not sold on whether this is really useful since generally applications either require reliability or they don't, and it is typically not the case that "kind of reliable" is acceptable.
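
As a rough illustration only (these names are invented here, not part of the Store API), the three modes above could be captured as a store-level policy:

public enum PersistenceMode {
    NON_DURABLE,        // in-memory only; fastest, everything is lost on a crash
    DURABLE_RELIABLE,   // forced write or DB commit completes before the client is acknowledged
    DURABLE_UNRELIABLE  // written asynchronously; a small window exists where data can be lost
}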

For situations where data loss cannot be tolerated, we need to either commit within a database transaction or do our own forced disk write that bypasses any OS caches (e.g. through HOWL), since otherwise data could be lost during a hardware failure. The jdbc store uses a database and performs each write within its own transaction, which is really slow. The journal store writes to a binary log and only does a forced write for the final record block (all other blocks are written asynchronously), providing high throughput.

I did some preliminary performance tests on a dual-core laptop using 1,000 threads running inside my IDE. For 42-byte records consisting of raw binary data, the store performed 1,000 forced writes in 337ms. I then ran a similar test that included serializing a 163-byte POJO with header information and got 1,000 writes in 469ms. I would expect this could be tuned for much better performance, particularly on real hardware. I also thought about adding a capability to the jdbc store to boxcar writes within a single transaction, but at that point I think the best solution is to just use a journal and avoid the overhead of a database entirely.
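
For what it's worth, here is a minimal sketch of the forced-write idea (this is not the journal store or HOWL code, and the class name is made up); the point is that only the final block of a write needs the force:

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class ForcedLogWriter {

    private final FileChannel channel;

    public ForcedLogWriter(File logFile) throws IOException {
        RandomAccessFile file = new RandomAccessFile(logFile, "rw");
        file.seek(file.length());              // append to the end of the log
        this.channel = file.getChannel();
    }

    public void writeBlock(byte[] block) throws IOException {
        channel.write(ByteBuffer.wrap(block)); // intermediate block: left to the OS cache
    }

    public void writeFinalBlock(byte[] block) throws IOException {
        channel.write(ByteBuffer.wrap(block));
        channel.force(false);                  // force file contents to the device; the data-loss window closes here
    }
}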

In other scenarios, potential data loss may be tolerable, e.g. the client writes to the persistence provider but the provider doesn't flush until after the write request returns (as with JPA or Hibernate). This is often done so the persistence provider can batch the updates and do some SQL reordering to optimize throughput. The trade-off, though, is the possibility that the runtime or machine crashes before the disk write occurs, in which case the data will be lost. Thinking about this more, I'm not sure such a JPA-based store would provide much of a throughput boost unless we allowed it to batch operations from multiple clients. In user applications, I've seen Hibernate and JPA boost performance by batching and reordering SQL from one client performing many writes. There, the vulnerability to data loss is limited since the client (application) will receive an error when it commits the transaction (if the machine crashes before the commit or flush, data would still be lost). With the store, however, a large number of clients each typically performs a very limited number of writes, so reordering and batching per client probably would not yield much. If we allowed batching across clients, I think the risk of data loss may be too great. With a completely disconnected model, I think the risk of data loss would likely be greater still.
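
To show where that loss window sits, here is plain JPA usage (the entity and persistence-unit names are made up; the only provider-specific piece is Hibernate's standard hibernate.jdbc.batch_size property):

import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Id;
import javax.persistence.Persistence;

public class JpaStoreSketch {

    public static void main(String[] args) {
        // persistence.xml could set hibernate.jdbc.batch_size to enable JDBC batching
        EntityManagerFactory factory = Persistence.createEntityManagerFactory("store-unit");
        EntityManager em = factory.createEntityManager();

        em.getTransaction().begin();
        for (int i = 0; i < 1000; i++) {
            em.persist(new InstanceRecord("id-" + i, new byte[163]));
            // persist() only queues the insert; nothing is guaranteed to hit the database yet
        }
        // The provider may batch and reorder the SQL here; the writes only become
        // durable at commit. A crash before this point loses the queued work.
        em.getTransaction().commit();

        em.close();
        factory.close();
    }
}

@Entity
class InstanceRecord {

    @Id
    private String id;
    private byte[] data;

    protected InstanceRecord() { } // required by JPA

    InstanceRecord(String id, byte[] data) {
        this.id = id;
        this.data = data;
    }
}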

Jim




