Hi Maarten,

Glad to hear that you found my presentation useful, and that you would like
to contribute to Geode!

In Geode, transactions are not allowed to cross a JVM boundary (they are
replicated, not distributed). This means that you will have to co-locate
your data
<http://geode.docs.pivotal.io/docs/developing/partitioned_regions/colocating_partitioned_region_data.html>
so that you will be able to run transactions on it. Given this
restriction, clock skew will never be an issue. Having said that, we want
to support fully distributed transactions in Geode
<https://issues.apache.org/jira/browse/GEODE-16>. To make queries work well
with transactions, I was hoping to implement MVCC as part of this effort.
Supporting bi-temporal data in Geode is one more argument for implementing
MVCC.
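
To make the MVCC idea a bit more concrete, here is a rough sketch in plain
Java. It is not Geode code and does not reflect any existing design: the
class MvccSketch and its methods are made up for illustration, and it only
covers the single-JVM case. Commit order comes from a local counter rather
than from the wall clock, and a reader pins the last committed version as
its snapshot.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: a single-JVM multi-version store; not a Geode API. */
class MvccSketch {
    // Commit versions come from a local counter, never from the wall clock,
    // so the commit order does not depend on clock synchronisation.
    private final AtomicLong lastCommitted = new AtomicLong(0);

    // key -> (commit version -> value)
    private final Map<String, ConcurrentSkipListMap<Long, String>> versions =
            new ConcurrentHashMap<>();

    /** Commits one write and returns the version assigned to it. */
    synchronized long commit(String key, String value) {
        long version = lastCommitted.get() + 1;
        versions.computeIfAbsent(key, k -> new ConcurrentSkipListMap<>())
                .put(version, value);
        lastCommitted.set(version); // publish only after the value is stored
        return version;
    }

    /** A reader pins this version and sees a stable snapshot from then on. */
    long snapshot() {
        return lastCommitted.get();
    }

    /** Returns the newest value committed at or before the snapshot, or null. */
    String read(String key, long snapshotVersion) {
        ConcurrentSkipListMap<Long, String> history = versions.get(key);
        if (history == null) {
            return null;
        }
        Map.Entry<Long, String> entry = history.floorEntry(snapshotVersion);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        MvccSketch store = new MvccSketch();
        store.commit("x", "written by A");
        long r = store.snapshot();          // reader R pins its snapshot
        store.commit("x", "written by B");  // B commits after R started
        System.out.println(store.read("x", r));                // written by A
        System.out.println(store.read("x", store.snapshot())); // written by B
    }
}
```

With something along these lines, the read state of R in your A, R, B
example stays reconstructible from the version R pinned, independent of any
system clock. A fully distributed variant would need a different way of
agreeing on versions, which this sketch does not attempt.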

Looking forward to having more discussions.

-Swapnil.


On Wed, May 18, 2016 at 10:42 AM, Michael Stolz <[email protected]> wrote:

> This is largely handled for us by the Geode Partitioned Region mechanics
> already.
>
> On a Partitioned Region, there is always a notion of a primary server for
> any given key, and some number of secondaries. All writes are always sent
> to the primary node, which locally locks the entry, then stores the data
> into the entry, and then distributes the update to all of the secondary
> servers.
>
> I expect that some of the implementation of temporal data access will be
> delegated to a CacheWriter or the GemFire Function Execution Service,
> either of which will execute only on the primary server for a given entry.
> They can install the current local server-side timestamp into the entry
> even before it is made visible in the cache, and then let the normal Geode
> caching mechanics do all the rest.
>
>
>
> --
> Mike Stolz
> Principal Engineer, GemFire Product Manager
> Mobile: 631-835-4771
>
> On Tue, May 17, 2016 at 3:28 PM, Maarten Niederer <[email protected]>
> wrote:
>
> > Hi Mike,
> >
> >
> >
> > All right, I’ll start doing my ‘homework’ in order to get myself
> > familiarized with the Geode codebase.
> >
> >
> >
> > At present I do not wish to learn your ideas, as this may cloud my own
> > creative thinking process. So we’ll discuss that later on.
> >
> >
> >
> > Do you have any concrete timelines for this project right now?
> >
> >
> >
> > Also I’d like to learn your views on my last question, which I will
> > elaborate a bit more.
> >
> >
> >
> > Suppose two transactions (A, B) are executed on two different nodes whose
> > system times deviate from each other. There is also a third process (R)
> > which reads data that is affected by both transactions (and which will
> > also ‘commit’ its reads in order to be sure to read a consistent state).
> >
> >
> >
> > In real time the execution order is A → R → B, but due to the system time
> > deviation the order of the transactions is logged the other way around.
> > As a consequence one can’t reconstruct the read state of R. The only way
> > around it that I can think of, given my limited knowledge of distributed
> > transactions, is to take a global lock (which is not an option in my
> > opinion).
> >
> >
> >
> > Can anyone think of a better solution than keeping the system times as
> > close together as possible and accepting that conditions may occur in
> > which you can’t reconstruct the read state of R?
> >
> >
> >
> > Maarten,
> >
> >
> >
> > I would be interested in spending some cycles thinking through how to
> > best implement bi-temporality in Geode. I have implemented
> > bi-temporality using several traditional databases, and I have some
> > ideas how to implement both time-series and bi-temporal data in Geode,
> > but it would be interesting to have someone who really knows the subject
> > matter well to work with.
> >
> >
> >
> > --
> >
> > Mike Stolz
> >
> > Principal Engineer, GemFire Product Manager
> >
> > Mobile: 631-835-4771
> >
> >
> >
> > On Sun, May 15, 2016 at 6:18 PM, Maarten Niederer <[email protected]>
> > wrote:
> >
> >
> >
> > Hi guys,
> >
> >
> >
> > I'm considering contributing to the Geode project. My attention was
> > drawn by the desire to create a bi-temporal framework in a distributed
> > database system. This should provide me with nice engineering
> > challenges.
> >
> >
> >
> > At my present employer I have worked with and developed an application
> > that implements a bi-temporal framework within the application layer,
> > built on top of an old-school RDBMS. I have done a lot of development
> > with temporal logic (value time). For example, I've implemented several
> > SQL operations (e.g. outer joins, group/aggregate) in a temporal way
> > (e.g. find me the intervals in time when more developers than managers
> > worked at company X, given the temporal data on when people
> > started/stopped working and their professions). I've also been busy
> > implementing the procedure that writes the change log for these temporal
> > tables, which makes it possible to inspect these tables at any
> > transaction time. During this work I found several shortcomings in
> > implementing these features within the application layer, which could
> > only be resolved with proper support in the RDBMS itself.
> >
> >
> >
> > So I know a lot about bi-temporal data, but only a bit about distributed
> > transactions (I learned some at university, and the clubhouse
> > presentation by Swapnil was excellent for knowledge transfer). Finally,
> > I don't know anything about the Geode code base. So my first action
> > would be to tackle starter JIRAs for half a year or so in order to get
> > some familiarity with the codebase.
> >
> >
> >
> > However, before I start to spend a big chunk of my personal time, I'd
> > like to know whether there are several people who would like to support
> > implementing the bi-temporal framework, as it requires thorough review
> > of both design and code (and preferably even test coverage). Also, for
> > the work on the transaction time dimension, the current work being done
> > on transactions should be completed first. We will also probably need a
> > fundamental discussion on 'transaction time' and the corresponding 'read
> > consistency' in a distributed context. (Is there always some snapshot
> > which matches exactly what a client observed at some point in time?)
> >
> >
> >
> > Also good for you to know: most of the time I reside in the CET timezone.
> >
> >
> >
> > Best regards,
> >
> >
> >
> > Maarten Niederer
> >
> >
> >
> >
>
