On 8 April 2014 02:47, Steve <[email protected]> wrote:

>
> On 08/04/14 10:11, Luca Garulli wrote:
>
> I like very much the idea about Class versioning to avoid massive update
> of database like RDBMSs do.
>
>
> How do you currently handle schema changes?  Do you do a full update by
> default?  There are a few cases here.
>
> 1/ Expansive - i.e. a new field is added to the schema without constraints;
> existing records remain valid under the new schema.
>
> 2/ Contractive - i.e. a new field is added with a constraint (like NOT
> NULL) or an existing field has a constraint added.
>
> 3/ Not sure how to classify this one - an existing field has a default
> value added to it or changed.
>
> Case 1 we don't need to touch existing records.
>
> Case 2 we need to check every record and update.  Or in the NOT NULL case
> (typically you would also add a default value at the same time) this
> *could* be done lazily but with a caveat.  If you change the default value
> a second time and a record hasn't been updated since the default value was
> first added then the record will end up with the second default value.  The
> user probably reasonably expects it will have the first default value.
>
> Case 3 see case 2.
>
>
Good analysis; we should classify the operations that need a full update and
the operations that can work in mode (1).
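Steve's classification could be sketched roughly as follows; the class name and the two boolean flags are hypothetical, not OrientDB API:

```java
// Hypothetical sketch: classify a schema change by whether existing
// records stay valid (expansive) or must be checked/updated (contractive).
public class SchemaChangeKind {

    enum Kind { EXPANSIVE, CONTRACTIVE }

    // addsConstraint: e.g. NOT NULL was added (case 2).
    // changesDefault: a default value was added or changed (case 3).
    static Kind classify(boolean addsConstraint, boolean changesDefault) {
        return (addsConstraint || changesDefault) ? Kind.CONTRACTIVE
                                                  : Kind.EXPANSIVE;
    }

    public static void main(String[] args) {
        // Case 1: plain new field, no constraint -> no record update needed.
        if (classify(false, false) != Kind.EXPANSIVE) throw new AssertionError();
        // Cases 2 and 3: need a full (or lazy) update of existing records.
        if (classify(true, false) != Kind.CONTRACTIVE) throw new AssertionError();
        if (classify(false, true) != Kind.CONTRACTIVE) throw new AssertionError();
        System.out.println("ok");
    }
}
```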


>  *Persisting additional class metadata*
>>
>> There is a fundamental mismatch between the way that OrientDB persists
>> classes and this scheme.  Namely that each OClassVersion (the current
>> equivalent of OClassImpl) is a member of an OClassSet.  Each OClassSet
>> shares a table of nameId -> name mappings between all of its child
>> OClassVersions.  The logical way to persist this would be:
>>
>> OClassSet {
>>     int classId;
>>     Map<Integer, String> nameIdMap;
>>     List<OClassVersion> versions;
>> }
>>
>
>  What's the content of nameIdMap? What does nameId stand for?
>
>
> Actually it's a List<String>.  The nameId is the index in the list of the
> string.  But conceptually it's used like a map i.e. read nameId from record
> header then look up the string field name using nameId as the index.
>

So it's the map for field names. Why a Map and not just an array?
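For reference, the scheme Steve describes (a List<String> used like a nameId -> name map, with the nameId being the list index) might look like this minimal sketch; the type name is invented:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the nameId scheme: the OClassSet-level List<String> acts as a
// map from nameId (the list index) to field name. Illustrative names only.
public class NameIdTable {

    private final List<String> names = new ArrayList<>();

    // Intern a field name, returning its nameId (the existing index if present).
    int idFor(String name) {
        int i = names.indexOf(name);
        if (i >= 0) return i;
        names.add(name);
        return names.size() - 1;
    }

    // Record headers store only the nameId; resolving it is an index lookup.
    String nameFor(int nameId) {
        return names.get(nameId);
    }

    public static void main(String[] args) {
        NameIdTable t = new NameIdTable();
        int id = t.idFor("surname");
        if (!t.nameFor(id).equals("surname")) throw new AssertionError();
        if (t.idFor("surname") != id) throw new AssertionError(); // id is stable
        System.out.println("ok");
    }
}
```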


>  Piggybacking OClassSet on top of OClassImpl doesn't seem the right way
>> to do this.
>>
>> Additionally there will need to be persisted a database global map of
>> classId -> OClassSet.
>>
>> I'm open to suggestions as to how to achieve this.  These special
>> documents probably cannot be persisted themselves in the binary format
>> (without some ugly hacking) as the OBinarySerializer is dependent on
>> looking up the OClassSet and nameIds.
>>
>
>  We've a Schema that can manage this. Schema record is marshalled like
> others, so we can add what we want.
>
>
> I noticed OClass has a backing document.  I guess the issue is that
> OClassSet doesn't have properties so making it inherit from OClassImpl
> doesn't really make sense.  What we need to do is embed OClassImpls in an
> OClassSet.
>

Ok.


>  *Removing bytes after deserialization*
>>
>> Lazy serialization/deserialization is quite feasible by overriding the
>> various ODocument.field() methods.  i.e. when we read a record we only
>> parse the header (in fact we only need to parse the first section of the
>> header initially).  Then if a field is requested that hasn't been retrieved
>> yet we scan the header entry and deserialize.  The question is then raised,
>> under what circumstances is it too expensive to hold on to the backing byte
>> array rather than just deserializing the remaining fields and releasing
>> it.  It would be useful if there was some mechanism to determine if the
>> record is part of a large query.  Or if the OBinDocument itself provides a
>> method to initiate this so that OrientDB can manage it at a lower level.
>>
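A rough sketch of the lazy approach described above, assuming the header has already been parsed into name -> offset entries (all names and the fixed 4-byte int layout are invented for illustration):

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Sketch of lazy deserialization via a parsed-header index: only the
// header is parsed on load; a field is deserialized the first time it is
// requested, from the retained backing byte[]. Hypothetical types/layout.
public class LazyRecord {

    private final byte[] bytes;                                   // backing buffer
    private final Map<String, Integer> offsets = new HashMap<>(); // header index
    private final Map<String, Object> cache = new HashMap<>();    // parsed fields

    LazyRecord(byte[] bytes, Map<String, Integer> headerOffsets) {
        this.bytes = bytes;
        this.offsets.putAll(headerOffsets); // header parsed up front
    }

    // Deserialize on first access only; later calls hit the cache.
    Object intField(String name) {
        return cache.computeIfAbsent(name,
            n -> ByteBuffer.wrap(bytes, offsets.get(n), 4).getInt());
    }

    public static void main(String[] args) {
        byte[] rec = ByteBuffer.allocate(8).putInt(0, 7).putInt(4, 9).array();
        Map<String, Integer> hdr = new HashMap<>();
        hdr.put("a", 0);
        hdr.put("b", 4);
        LazyRecord r = new LazyRecord(rec, hdr);
        if (!r.intField("b").equals(9)) throw new AssertionError();
        System.out.println("ok");
    }
}
```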
>
>  I'd like to explore the road of completely avoiding the use of the
> Map<String,Object> in ODocument's _fieldValues. In fact, with efficient
> marshalling/unmarshalling we could do it on the fly.
>
>  PROS:
> - Less RAM used and less objects in Garbage Collector (have you ever seen
> tons of Map.Entry?)
> - Less copies of buffers: the byte[] could be the same read from the
> OStorage layer
> - No need for the Level2 cache anymore: DiskCache keeps pages, so storing the
> unmarshalled Document no longer makes sense
>
>  CONS:
> - Slower access when reading the same field multiple times, but in this case
> developers could call field() once and store the content in a local variable
>
>  WDYT?
>
>
> I had thought of adding that to the existing implementation by overriding
> field().  But there's one major gotcha: every call to the same field will
> return a different object.  So o1 != o2, and o1.equals(o2) depends on
> whether o1 has implemented equals().  This could be messy for the user.
>
> With regard to excess Map.Entry I think the Trove library could help
> here.  It doesn't create Entry objects internally, it's faster than HashMap
> and more efficient.
>
> Potentially we could optimise the internal representation using arrays,
> though.  For schema-declared fields we'd only need one map per class to map
> field name -> array index.  For schemaless fields we'd still need a map,
> though I'll ponder this; there may be another way.
>
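The identity gotcha Steve mentions can be shown with plain Java, no OrientDB involved: if field() deserializes afresh on every call, repeated reads return distinct objects that compare equal only by value:

```java
import java.nio.charset.StandardCharsets;

// Demonstrates the lazy-deserialization identity gotcha: each field()
// call deserializes afresh from the backing bytes, so callers get
// distinct objects that are equal only if the type implements equals().
public class LazyFieldIdentity {

    static final byte[] BACKING = "hello".getBytes(StandardCharsets.UTF_8);

    // Each call builds a new object from the backing byte[].
    static String field() {
        return new String(BACKING, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String o1 = field();
        String o2 = field();
        if (o1 == o2) throw new AssertionError("expected distinct objects");
        if (!o1.equals(o2)) throw new AssertionError("but equal by value");
        System.out.println("ok");
    }
}
```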

I like the array solution more than using Trove.
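A minimal sketch of the array idea (hypothetical names, with the field -> slot map hard-coded here where OrientDB would build it once from the class schema):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the array layout: one shared per-class map from field name to
// array slot, plus a plain Object[] per document. This avoids allocating
// a Map.Entry per field per record. Names and schema are invented.
public class ArrayBackedDocument {

    // Shared per class: field name -> slot index (built once from the schema).
    static final Map<String, Integer> SLOTS = new HashMap<>();
    static {
        SLOTS.put("name", 0);
        SLOTS.put("age", 1);
    }

    private final Object[] values = new Object[SLOTS.size()];

    void field(String name, Object value) { values[SLOTS.get(name)] = value; }

    Object field(String name) { return values[SLOTS.get(name)]; }

    public static void main(String[] args) {
        ArrayBackedDocument d = new ArrayBackedDocument();
        d.field("age", 42);
        if (!d.field("age").equals(42)) throw new AssertionError();
        System.out.println("ok");
    }
}
```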


> We could also use a hybrid approach or a different implementation of
> ODocument to let the developer decide what to use.
>
>
> Good idea.  If they are using predominantly schemaless classes then it
> might be advantageous.
>

We could have OSchemaFullDocument and OSchemaLessDocument impls. I don't
know if it makes sense.


> *Partial serialization*
>
>  I'd like also to explore the partial serialization case.
>
>  I mean the case when a user executes a query, browses the result set,
> changes a document field, and sends it back to the database to be saved.
>
>  We currently keep track of changes in the ODocument (also used by indexes
> to stay aligned), so we could marshal and overwrite only the changed
> fields in the byte[].
>
>  This feature must go together with abandoning the use of the Map to store
> field values and using only the byte[].
>
>
> My original thoughts on this were much the same.  However, since we first
> explored the issue I've been building custom disk persistence mechanisms
> for bitcoin.  One was basically a disk-backed arraylist.  When I switched
> that implementation to grouping entries into 4k blocks (to match the
> underlying disk subsystem) I noticed it didn't affect performance at all.
> So I question whether, for most use cases, there is any benefit to partial
> updates rather than rewriting the whole record.  Partial updates add a lot
> of complexity (i.e. potential bugs) as you have to handle data holes,
> possibly shifting fields around, etc.  Even in a large record that spans
> multiple disk blocks the advantage is not so great if the blocks are
> contiguous on disk, as the bottleneck is seek time, not write time.  You
> would presumably read the whole byte array for a record into memory when
> reading, as you don't know where in the record the wanted field is until
> you've parsed the record header; if you try to be clever and parse the
> header first, then retrieve the data, you potentially incur another disk
> seek, which largely nullifies the benefit.  So we don't have an advantage
> in disk access time, we don't have an advantage in byte array allocation;
> the only real advantage is the cost of deserializing the record in memory.
>
> For very large records, however, this does change the dynamics quite a bit
> and probably has a valid use case.  Perhaps we need multiple internal
> implementations?
>

My goal was only to avoid marshalling/unmarshalling the entire record when
we just set an integer field. With fixed-length fields we could update
fields at very low cost; for variable-size fields we would have to shift the
following content and update the pointers. Or, in this case, we could
marshal the entire record for now.
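For the fixed-length case, the in-place update could be sketched like this, with the field offset assumed to be already known from the record header (the layout is invented for illustration):

```java
import java.nio.ByteBuffer;

// Sketch of a partial update for a fixed-length field: overwrite its
// bytes in place at a known offset instead of re-marshalling the whole
// record. The record layout and offset here are hypothetical.
public class InPlaceIntUpdate {

    // Overwrite a 4-byte int field at a known offset inside the record.
    static void setIntField(byte[] record, int offset, int value) {
        ByteBuffer.wrap(record, offset, 4).putInt(value);
    }

    public static void main(String[] args) {
        byte[] record = new byte[16];  // pretend this is a marshalled record
        setIntField(record, 8, 1234);  // field known to live at offset 8
        int read = ByteBuffer.wrap(record, 8, 4).getInt();
        if (read != 1234) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Variable-size fields are where the complexity Steve warns about appears: growing a value means shifting the tail of the record and fixing up any offsets recorded in the header.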

Lvc@

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"OrientDB" group.
