Re: how should store get an object based on format id and collation id?

Mike Matrigali Fri, 13 Apr 2007 13:23:27 -0700


Mamta Satoor wrote:

Hi Mike,

I didn't quite understand the following

*****************Mike wrote**************
2) using existing dvd's class to get a new "empty" dvd that matches it
   (which is why it does not call clone).
   dvd = dvd.getClass().newInstance()

   o less sure about this one.  Seems like we need a new dvd interface
     that does the equivalent thing.  I believe the original code got
     here because the original store code did not deal with DVD's it
     just got objects, so could not make dvd calls.  There is a
     getNewNull() interface, anyone know if there is any runtime work
     that would be saved over this by creating a
     getNewEmpty() interface?

    dvd = dvd.getNewEmpty();
******************************************
What I think you are saying is

"There is someplace in the Derby code, where we do
dvd.getClass().newInstance(). And the reason it is done this way is that
the calling code does not know that it is dealing with a dvd object.

The code now knows that it is dealing with a dvd object; the original
implementation probably did not, which is why it currently uses
getClass() rather than a call on the dvd.  I did some searching
through the code and it looks like some sort code would also benefit
from a getNewEmpty() interface if it is faster than the current
getClone interface.  It looks like code today is calling getClone there
to create new empty objects which will be read from disk, so it is paying
unnecessary overhead - especially for datatypes that may be doing
additional object allocations to maintain internal state.

Maybe that code can now check if it is dealing with a StringDataValue
and if so, then have the code call dvd.getNewEmpty which will be defined
only on StringDataValue.

I don't think code outside of the datafactory should do this kind of
stuff. It seems cleaner if dvd's provide a single interface for all
datatypes, and callers should not be checking type before making a
subsequent call.  At least in this case it is possible for store to
make the call; in the other case store will only have a format id so
there is no real way to ask the type anything.  And I would really like
to avoid the case where we might have to do 2 object allocations to get a
correct collation type (ie. one allocation on the format id, and then check
something and then ask for another object based on the current object).
The goal for 1, 2 and 3 should be a single object allocation given the
information provided, returning a correct object with correct
collation info, with little or no overhead for datatypes that
don't really care about collation info.
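
To make the "single interface, single allocation" point concrete, here is a
rough sketch. It assumes a getNewEmpty() method exists on every datatype;
the Dvd, IntValue, CollatedString and TemplateRows names are made up for
illustration and are not Derby's actual classes or signatures.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical, simplified stand-in for a data value descriptor.
interface Dvd {
    // One call for every datatype: returns an empty value of the same type,
    // ready to have its state read in from disk.
    Dvd getNewEmpty();
}

// A type that does not care about collation: nothing extra to carry over.
final class IntValue implements Dvd {
    int value;

    @Override
    public Dvd getNewEmpty() {
        return new IntValue();
    }
}

// A string type that hands its collator to the new empty instance.
final class CollatedString implements Dvd {
    final Collator collator; // null stands for default comparison in this sketch
    String value;

    CollatedString(Collator collator) {
        this.collator = collator;
    }

    @Override
    public Dvd getNewEmpty() {
        return new CollatedString(collator);
    }
}

final class TemplateRows {
    // The caller allocates exactly one object per column and never checks
    // the column's type before making the call.
    static Dvd[] newRow(Dvd[] template) {
        Dvd[] row = new Dvd[template.length];
        for (int i = 0; i < template.length; i++) {
            row[i] = template[i].getNewEmpty();
        }
        return row;
    }
}
```

With this shape the collation handling stays inside the string type, and
types like IntValue pay nothing beyond the allocation itself.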


The getNewEmpty() method will copy the RuleBasedCollator info to a new
instance of StringDataValue and return that." Did I understand you right?
Also, I am just curious where is this code dvd.getClass().newInstance()
right now?

The code is in
java/engine/org/apache/derby/impl/store/access/conglomerate/TemplateRow.java!newRow()

The comment says that it is more efficient to allocate new objects based
on an existing template object than calling the monitor. This was the
observation from measurements a long time ago - no idea if they are
still valid in the newest JVMs.

thanks,

Mamta

On 4/12/07, Mike Matrigali wrote:




    Mamta Satoor wrote:
     > Mike, the following code will be part of DataValueFactory and hence it
     > will be part of the interface. Please let me know if I am not very clear
     > with what I am proposing or if you foresee problems with this logic.
     > if (dvd instanceof StringDataValue)
     >               dvd = dvd.getValue(dvf.getCharacterCollator(type));

    My comment isn't really about the logic; I think we are just not talking
    about the same area.  I think the code above belongs hidden behind the new
    interfaces in the implementation logic of the data factory and data
    types, not an example of what callers of the datatype should be doing.
     >
     > Also, in the following line below
     > "I'll look at building/using DataFactory interface.  It will be some"
     > you mean DataValueFactory interface, right?
     >
     > Mamta

    Yes I meant DataValueFactory interface.  Let's work together on getting
    the DataValueFactory interface right.

    So far I have uncovered three basic ways store creates "empty" objects.
    Note that store really only needs "empty" objects, ie. it is going
    to initialize the state of these objects from disk by calling each
    object's readExternal() method.  But we have decided to not store
    the collation info as state in the object so somehow we need to get
    that info into the empty objects.

    The ways store currently creates these objects:

    1) using Monitor to get dvd directly:
       dvd = Monitor.newInstanceFromIdentifier (format id)

       o I think this use is best implemented as Mamta suggests, just
         providing a non-static interface on the DataValueFactory.
         something like:

         DataValueFactory dvf = somehow cache and pass this around store;
         dvd = dvf.newInstance(format id, collation id);

         at this point dvd can be used to correctly compare against other
         dvd's in possible collate specific ways.

    2) using existing dvd's class to get a new "empty" dvd that matches it
       (which is why it does not call clone).
       dvd = dvd.getClass().newInstance()

       o less sure about this one.  Seems like we need a new dvd interface
         that does the equivalent thing.  I believe the original code got
         here because the original store code did not deal with DVD's it
         just got objects, so could not make dvd calls.  There is a
         getNewNull() interface, anyone know if there is any runtime work
         that would be saved over this by creating a
         getNewEmpty() interface?

         dvd = dvd.getNewEmpty();

         at this point dvd can be used to correctly compare against other
         dvd's in possible collate specific ways.

    3) optimized allocation, caching some of the work.  This is used
       where one query may generate a large number of rows - for instance
       hash table scan and sorter calls.  Here the idea is to do some
       part of the work once leaving an InstanceGetter which then can
       repeatedly give back new objects in the most optimized way:

       called once:
       InstanceGetter = Monitor.classFromIdentifier(format id)

       called many times:
       dvd = InstanceGetter.getNewInstance()

       o something like the following would be the direct conversion.  Note
         that implementation of the Instance getter is probably more complex
         now.  It can't just remember a single class and call new instance
         on it.  It has to cache some info on what class to create and what
         collation to set in it.

       called once
       DataValueFactory dvf = somehow cache and pass this around store;
       InstanceGetter =
             dvf.instanceGetterFromIdentifiers(format id, collation id)

       called many times:
       dvd = InstanceGetter.getNewInstance()

       again at this point dvd can be used to correctly compare against other
       dvd's in possible collate specific ways.



    All 3 of these uses have to be replaced to allow store to create
    "correct" types which can be used in possible string comparisons.
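
A rough sketch of how cases 1 and 3 above might look behind a non-static
factory. Everything here is an assumption for illustration: the
ValueFactory, Getter, EmptyValue, EmptyInt and EmptyString names, the id
constants, and the use of a null collator for default comparison are all
made up and are not Derby's actual API.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical stand-ins for the datatypes; not Derby classes.
interface EmptyValue { }

final class EmptyInt implements EmptyValue { }

final class EmptyString implements EmptyValue {
    final Collator collator; // null stands for default comparison in this sketch

    EmptyString(Collator collator) {
        this.collator = collator;
    }
}

// Hypothetical counterpart of InstanceGetter: built once, called per row.
interface Getter {
    EmptyValue getNewInstance();
}

final class ValueFactory {
    // Made-up ids, for illustration only.
    static final int FORMAT_INT = 1;
    static final int FORMAT_STRING = 2;
    static final int COLLATION_DEFAULT = 0;
    static final int COLLATION_TERRITORY = 1;

    private final Collator territoryCollator;

    ValueFactory(Locale territory) {
        this.territoryCollator = Collator.getInstance(territory);
    }

    private Collator collatorFor(int collationId) {
        return collationId == COLLATION_TERRITORY ? territoryCollator : null;
    }

    // Case 1: one allocation, driven only by the two ids.
    EmptyValue newInstance(int formatId, int collationId) {
        switch (formatId) {
            case FORMAT_INT:
                return new EmptyInt();
            case FORMAT_STRING:
                return new EmptyString(collatorFor(collationId));
            default:
                throw new IllegalArgumentException("unknown format id " + formatId);
        }
    }

    // Case 3: resolve the class and the collation once, then hand back a
    // getter that only allocates on each call.
    Getter getterFromIdentifiers(int formatId, int collationId) {
        switch (formatId) {
            case FORMAT_INT:
                return EmptyInt::new;
            case FORMAT_STRING:
                final Collator cached = collatorFor(collationId);
                return () -> new EmptyString(cached);
            default:
                throw new IllegalArgumentException("unknown format id " + formatId);
        }
    }
}
```

The getter remembers both what to create and which collation to set, so
the per-row path does no lookups, and types that ignore collation never
touch the collator at all.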
