Hi Alparslan,

On Sat, May 3, 2014 at 1:02 AM, <[email protected]> wrote:

>
> In auto-generated persistent classes, we create an array field called
> _ALL_FIELDS as you know.


Yes, this code was not originally planned for inclusion in GORA-94 but was
instead implemented later on... as a kind of workaround.


> But this array also contains __g__dirty field,
> which is not a stored field at all. Maybe we should remove __g__dirty field
> from the array, since the array is used for getting all fields in the
> stored table. We can also remove it from Field enum, so the users do not
> know about the __g__dirty field.
>
>
This field should only be visible on our (the Gora) side... you are
absolutely right!
The larger issue here relates to the writer schema and class (in the current
model, a Persistent class extends
org.apache.gora.persistency.impl.PersistentBase and implements
org.apache.avro.specific.SpecificRecord and
org.apache.gora.persistency.Persistent) and the reader schema/class, which
would not necessarily need to be the same as the writer schema/class.
Right now in Gora we DO NOT have a standardized approach to supporting
schema evolution other than taking the chance that a schema change
_hopefully_doesn't_ break things.
We have no method for ensuring backwards compatibility with data which is
written and then read using a different schema...
IMHO this is the larger issue we need to consider. Removing the __g__dirty
bytes field is another workaround.
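
To make the reader/writer distinction concrete, here is a toy sketch in plain
Java. It does not use the Avro API, and the class, method and field names
(SchemaResolutionSketch, resolve, url, title) are made up for illustration.
It only mimics the basic resolution rule: fields are matched by name, fields
the reader does not know about are skipped, and fields missing from the
written data fall back to the reader's default. It assumes every reader field
has a default, which real schemas need not have.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SchemaResolutionSketch {
    // Projects a record written with one schema onto a reader schema.
    // readerDefaults maps each reader field name to its default value
    // (assumption: every reader field declares a default).
    public static Map<String, Object> resolve(Map<String, Object> written,
                                              Map<String, Object> readerDefaults) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : readerDefaults.entrySet()) {
            String name = field.getKey();
            // A value present in the written data wins; otherwise use the default.
            result.put(name, written.containsKey(name) ? written.get(name) : field.getValue());
        }
        // Written fields unknown to the reader (here __g__dirty) are never copied.
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical record as written by an older generated class.
        Map<String, Object> written = new LinkedHashMap<>();
        written.put("__g__dirty", new byte[] {0});
        written.put("url", "http://example.org");

        // Hypothetical reader schema: drops __g__dirty, adds title.
        Map<String, Object> readerDefaults = new LinkedHashMap<>();
        readerDefaults.put("url", "");
        readerDefaults.put("title", "untitled");

        // prints {url=http://example.org, title=untitled}
        System.out.println(resolve(written, readerDefaults));
    }
}
```

With something along these lines in place, an internal field such as
__g__dirty would simply never appear in the reader's view of the data.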
I've spoken over on user@avro with Martin Kleppmann about this and his
suggestions are very sensible.
http://s.apache.org/7QY
http://s.apache.org/biI
I've also raised this topic on this list.
We need to support dynamic schema evolution or else data can become redundant
very quickly depending on the use case.
Right now I am struggling to envisage how we approach this... should we be
working on Avro code? Should we work on a Gora specific implementation, if
for example we wish to have a pluggable serialization layer in Gora?
Right now, whenever anyone uses Persistent classes generated by the
GoraCompiler packaged in the 0.4 release, they will ALWAYS be exposed to the
__g__dirty bytes field. This is not an ideal situation, and AFAIK the only
workaround is to remove this field on the client side prior to doing
operations such as Query. This is far from perfect but it DOES work.
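
A minimal sketch of that client-side workaround, assuming the field names are
plain strings as in _ALL_FIELDS. The helper class DirtyFieldFilter, its method
storedFields and the sample field names are hypothetical and not part of Gora:

```java
import java.util.Arrays;

public class DirtyFieldFilter {
    // Name of the internal bookkeeping field discussed in this thread.
    public static final String DIRTY_FIELD = "__g__dirty";

    // Returns a copy of allFields without the internal dirty-bytes field.
    public static String[] storedFields(String[] allFields) {
        return Arrays.stream(allFields)
                .filter(name -> !DIRTY_FIELD.equals(name))
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // Stand-in for a generated class's _ALL_FIELDS array (names made up).
        String[] allFields = {"__g__dirty", "url", "content"};
        // prints [url, content]
        System.out.println(Arrays.toString(storedFields(allFields)));
    }
}
```

The filtered array is what you would then hand to the query as the list of
fields to fetch, instead of passing _ALL_FIELDS directly.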
You may also be interested in AVRO-1124.
There is still work to be done with the Persistency API in Gora for sure.
