Yeah I am aware of the gap, we did not implement a two phase commit.

Can we introduce those read/write locks into TransactionStateDiff? I
thought the object was already responsible for producing reader and writer
wrappers, which should give us a clean way to do it?
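
A minimal sketch of what I mean by a locking reader wrapper. Everything
here is hypothetical (LockingReader is not a GeoTools class, and a plain
Iterable stands in for the real feature reader); it only shows the read lock
being taken when the reader is opened and released when it is closed.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical wrapper: holds a read lock from construction until close(). */
final class LockingReader<T> implements AutoCloseable {
    private final Iterator<T> delegate;
    private final Lock readLock;
    private boolean closed;

    LockingReader(Iterable<T> source, ReentrantReadWriteLock lock) {
        this.readLock = lock.readLock();
        this.readLock.lock(); // blocks while a commit holds the write lock
        try {
            this.delegate = source.iterator();
        } catch (RuntimeException e) {
            this.readLock.unlock(); // do not leak the lock if opening fails
            throw e;
        }
    }

    boolean hasNext() {
        return !closed && delegate.hasNext();
    }

    T next() {
        return delegate.next();
    }

    @Override
    public void close() {
        if (!closed) {
            closed = true;
            // Must run on the thread that opened the reader, otherwise
            // ReentrantReadWriteLock throws IllegalMonitorStateException.
            readLock.unlock();
        }
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        try (LockingReader<String> reader = new LockingReader<>(List.of("a", "b"), lock)) {
            while (reader.hasNext()) {
                System.out.println(reader.next());
            }
            System.out.println("open readers: " + lock.getReadLockCount());
        }
        System.out.println("open readers: " + lock.getReadLockCount());
    }
}
```

One caveat with this approach: the read lock is owned by the opening thread,
so a reader that gets handed to another thread and closed there would need a
different primitive (a semaphore or StampedLock, for example).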

For clarity we “only” need to obtain exclusive use during the final commit.
Readers operating on the same transaction need to wind up their operation
before the commit can proceed.

We may be able to narrow this down to *only* readers being used as a source
during the commit of another transaction? Probably safer to wait for all
of them to wind up...

So we would need a distinct workflow during a commit of transaction state
diff.

1. Close the current reader being used to traverse content. This would free
up its read lock.
2. Obtain a write lock (this would wait for all readers and writers to
close out, so there are no open readers on the source).
3. Traverse the content using a new source reader, writing out a new file
based on the source data and the changes marked in transaction state diff.
4. Close the source reader, close the new file writer.
5. Rename dance: Rename the source out of the way, and rename the modified
file into position. Delete the renamed source file.
6. Release the write lock.
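
The commit steps above could be sketched roughly as follows. This is an
illustration under assumptions, not the GeoTools implementation: DiffCommit,
the "id=value" line format, and the ".new"/".bak" file names are all made up,
and the diff only covers updates and deletes (no inserts).

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class DiffCommit {
    /**
     * Hypothetical commit. 'diff' maps a line id (the text before the first '=')
     * to its replacement line, or to null when the line is deleted.
     */
    static void commit(Path source, Map<String, String> diff, ReentrantReadWriteLock lock)
            throws IOException {
        // Step 1 is the caller's job: close its own reader first, because
        // ReentrantReadWriteLock cannot upgrade a read lock to a write lock.
        lock.writeLock().lock(); // step 2: waits for open readers and writers
        try {
            Path modified = source.resolveSibling(source.getFileName() + ".new");
            Path backup = source.resolveSibling(source.getFileName() + ".bak");
            // Steps 3 and 4: copy the source, applying the diff; both streams
            // are closed when the try-with-resources block exits.
            try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.UTF_8);
                 BufferedWriter out = Files.newBufferedWriter(modified, StandardCharsets.UTF_8)) {
                String line;
                while ((line = in.readLine()) != null) {
                    String id = line.split("=", 2)[0];
                    String result = diff.containsKey(id) ? diff.get(id) : line;
                    if (result != null) {
                        out.write(result);
                        out.newLine();
                    }
                }
            }
            // Step 5: rename dance, then delete the old source.
            Files.move(source, backup, StandardCopyOption.REPLACE_EXISTING);
            Files.move(modified, source);
            Files.delete(backup);
        } finally {
            lock.writeLock().unlock(); // step 6
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("features", ".properties");
        Files.write(file, List.of("1=a", "2=b", "3=c"));
        Map<String, String> diff = new HashMap<>();
        diff.put("2", "2=B");  // update
        diff.put("3", null);   // delete
        commit(file, diff, new ReentrantReadWriteLock());
        System.out.println(Files.readAllLines(file));
        Files.delete(file);
    }
}
```

Between the two renames there is a short window where no file exists at the
source path; that is only safe because every reader has to go through the
same lock.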

On Fri, Dec 7, 2018 at 5:09 AM Andrea Aime <andrea.a...@geo-solutions.it>
wrote:

> Hi,
> I am looking at removing a read/write lock found in the image mosaic, which
> guards all read/write operations
> against GeoTools data stores, so that no two writes can occur in parallel,
> and no writes can occur in parallel
> with reads.
>
> My first reaction was "hey, datastores can already handle this, why limit
> stability here?"
> Turns out, because datastores cannot really handle it, with the exception
> of JDBC ones and possibly
> other stores that are based on an external server managing concurrency,
> because the built-in transaction
> mechanism is anything but isolated.
>
> Let's take shapefiles, property or CSV stores as example. In all those
> cases, the store is not able
> to handle transactions natively, and ends up using
> DiffTransactionState/DiffFeatureWriter, which are
> storing whatever changed in memory (thus isolating the different
> transactions), and then grabbing
> a physical writer on commit, to write the changes... and here is where
> things go bad: two transactions
> on commit are allowed to write to a temp file and then replace the original *at
> the same time*.
> Long story short, only the last one doing the replace wins, if they don't
> end up outright stepping
> on each other's toes when doing the "back copy".
>
> This could be solved with a synchronization on the store instance in
> DiffTransactionState, but... it does not
> end here.
>
> While a DiffTransactionState is writing, the feature readers are free to
> do whatever they please, meaning they
> are reading files that are being overwritten during the "copy back"
> operation.... that is also not good.
>
> Basically what seems to be needed is a read-write locking mechanism,
> disallowing transaction "write backs"
> while others are running, or while reads are running.... and there lies an
> issue, as we'd have to
> track all readers and release the read locks only when they are closed,
> which means, I guess,
> we'd have to wrap whatever the underlying store is returning.
>
> As an alternative, I might have to try and detect the store type in image
> mosaic, and apply rwLock protection
> by checking a whitelist of types that are known to actually work (this is
> a decision to be made also based on
> the time it takes to resolve the above issue, cause I don't have infinite
> amounts of it).
>
> Thoughts?
>
> Cheers
> Andrea
>
> Ing. Andrea Aime (@geowolf), Technical Lead, GeoSolutions S.A.S.
> http://www.geo-solutions.it
> _______________________________________________
> GeoTools-Devel mailing list
> GeoTools-Devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/geotools-devel
>
-- 
--
Jody Garnett
