Re: GData Server - Lucene storage

Simon Willnauer Fri, 02 Jun 2006 04:14:13 -0700

On 6/2/06, Ian Holsman <[EMAIL PROTECTED]> wrote:


On 02/06/2006, at 9:37 AM, Simon Willnauer wrote:

> The biggest problem with the Lucene storage is achieving a
> transactional state. Imagine the following scenario:
> An update request comes in -> the entry to update is handed to the
> Lucene writer, which writes the update. But another delete request
> has locked the index, so an IOException is thrown. The update
> request then queues the entry and retries to obtain the lock. No
> problem so far. But if the index writer cannot open the index due to
> some other error (e.g. the index could not be found), the exception
> will also be an IOException. Is there any way to figure out whether
> the IOException is caused by a lock, which would be all right, or by
> some other serious problem?

Hi Simon.
Here are my 2c. I am in no way, shape or form a Lucene expert, but I
have seen a server/service design once or twice.


am I reading this a bit incorrectly?

You did.

are you saying you will have a set of threads which are going to
handle the interaction with the client, which will then queue up that
request to another set of threads which will actually write to the
Lucene backend?

I'm not sure that this is a good way to go. in most designs I've seen,
this queue is the cause of a lot of design/operational issues, but
I'll leave it to the Lucene experts to comment on this.... personally
I would think just having the client thread do the write to Lucene is
easier (and if you need to queue it, do it outside of the app via JMS
or something)

I also think you're focusing on something here which is too low level
at this stage.

I guess you are right with your assumption; I was already two steps
ahead. I should first go for the simplest way to use Lucene as a
storage.
Using the client thread as the indexing thread might cause some
performance drawback, but that's acceptable at this stage of
development. I will provide a second implementation anyway, and a
public interface for customizing the storage.

right now I suggest you log an error, and return an error back to the
client (and make it their problem).
as long as you can guarantee that you either will:
* write the whole thing properly on success
* fail and leave the server in the same state as it was before the
update (i.e. leave the request in the queue so it will retry it
later, or if you choose a simpler route just return it straight to
the client)
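
The all-or-nothing guarantee in the two bullets above could look
roughly like the following. This is only a minimal sketch, assuming a
plain in-memory map as a stand-in for the index; AtomicEntryStore,
applyAll and get are hypothetical names, not part of Lucene or the
GData server code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the storage; not the Lucene API.
class AtomicEntryStore {
    private Map<String, String> entries = new HashMap<>();

    // Either applies every change in the batch or none of them.
    synchronized void applyAll(Map<String, String> changes) {
        // Work on a private copy so a failure cannot leave partial state behind.
        Map<String, String> working = new HashMap<>(entries);
        for (Map.Entry<String, String> change : changes.entrySet()) {
            if (change.getValue() == null) {
                // Simulated write failure: abort before anything is published.
                throw new IllegalArgumentException("no content for entry " + change.getKey());
            }
            working.put(change.getKey(), change.getValue());
        }
        // Only reached when every change succeeded: publish the new state.
        entries = working;
    }

    synchronized String get(String id) {
        return entries.get(id);
    }
}
```

If applyAll throws, the caller still sees the state from before the
update and can hand the error straight back to the client.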

you can worry about queuing and retrying later on if you like.

Using the client threads gives me the possibility to send an HTTP 500
back to the client if something goes wrong during the store process.
But access to the index has to be synchronized to prevent concurrent
requests from altering the index at the same time. That way no
IOException can be caused by unreleased locks.
If I went for the structure I showed in the UML, it would be quite
tough to achieve any kind of transactional state.
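
Roughly what I have in mind for the client-thread variant, as a
minimal sketch only. EntryStorage, SynchronizedStorageFrontend and
handleUpdate are hypothetical names for illustration, not the actual
storage interface, and the lock only serializes writers inside a
single JVM.

```java
import java.io.IOException;

// Hypothetical storage interface; a Lucene-backed class would implement it.
interface EntryStorage {
    void store(String entryId, String content) throws IOException;
}

// Runs the write on the calling (client) thread and serializes all writers.
class SynchronizedStorageFrontend {
    private final EntryStorage storage;
    private final Object indexLock = new Object();

    SynchronizedStorageFrontend(EntryStorage storage) {
        this.storage = storage;
    }

    // Returns the HTTP status code that would be sent back to the client.
    int handleUpdate(String entryId, String content) {
        synchronized (indexLock) { // one writer at a time within this JVM
            try {
                storage.store(entryId, content);
                return 200;
            } catch (IOException e) {
                // Writers are serialized, so this is treated as a real failure.
                return 500;
            }
        }
    }
}
```

Because only one request can be inside the synchronized block, an
IOException coming out of store() should no longer be a lock
conflict between two requests of the same server.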

I mean, a performance drawback would be all right for this storage.
Users could still switch to Berkeley DB or customize the storage
component.

I will keep the interface and an abstract implementation of the
storage component protected for a while to let the interface mature.


regards
Ian.



Simon

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
