***********description on InstanceGetter for DVD********
I think the code dealing with getting an InstanceGetter for a DVD from
a format id is currently isolated in
BaseMonitor.classFromIdentifier(int fmtId). BaseMonitor has a class
level field called rc2, which is an array of the same length as
RegisteredFormatIds.TwoByte. The elements in rc2 are InstanceGetters.
Every time BaseMonitor.classFromIdentifier(int fmtId) is called, the
method first checks if there is already an InstanceGetter in the rc2
array for the passed format id. If yes, then it simply returns that
cached InstanceGetter from rc2.

But if this is the first time the method is being called for the
passed format id, then we first get the name of the InstanceGetter
from RegisteredFormatIds using the format id passed to the method.
(For DVDs, the name of that InstanceGetter would be
org.apache.derby.iapi.types.DTSClassInfo.) Using that name from
RegisteredFormatIds, we create a Class object (for DVDs, that Class
object would be DTSClassInfo) and check if that Class is of type
FormatableInstanceGetter. If yes, then we create an instance of that
Class object (for DVDs, this will return an object of type
DTSClassInfo) and set the format id on it. As a last step, we cache
this FormatableInstanceGetter in the rc2 array for future use. So, in
future, if BaseMonitor.classFromIdentifier(int fmtId) gets called for
the same fmtId, we can simply return the cached InstanceGetter from
rc2.
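
The lookup-then-cache flow described above can be sketched roughly as
follows. This is only an illustration, not the actual BaseMonitor code:
the types below are simplified stand-ins, DemoClassInfo and GetterCache
are hypothetical names, and a Map stands in for RegisteredFormatIds.
What happens when the class is not a FormatableInstanceGetter is not
described above, so the sketch simply throws in that case.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for Derby's InstanceGetter / FormatableInstanceGetter.
interface InstanceGetter {
    Object getNewInstance();
}

abstract class FormatableInstanceGetter implements InstanceGetter {
    protected int fmtId;
    void setFormatId(int fmtId) { this.fmtId = fmtId; }
}

// Hypothetical stand-in for DTSClassInfo: one getter class, many format ids.
class DemoClassInfo extends FormatableInstanceGetter {
    @Override
    public Object getNewInstance() {
        return "empty value for format id " + fmtId;
    }
}

class GetterCache {
    // Plays the role of rc2: one slot per format id, filled lazily.
    private final InstanceGetter[] cache;
    // Plays the role of RegisteredFormatIds: format id -> getter class name.
    private final Map<Integer, String> registeredNames = new HashMap<>();

    GetterCache(int size) { cache = new InstanceGetter[size]; }

    void register(int fmtId, String className) {
        registeredNames.put(fmtId, className);
    }

    InstanceGetter classFromIdentifier(int fmtId) {
        // Fast path: a getter was already created for this format id.
        InstanceGetter cached = cache[fmtId];
        if (cached != null) {
            return cached;
        }
        // Slow path: name -> Class -> type check -> instance -> set id -> cache.
        String name = registeredNames.get(fmtId);
        try {
            Class<?> clazz = Class.forName(name);
            if (!FormatableInstanceGetter.class.isAssignableFrom(clazz)) {
                // Assumption: the text above does not say what happens here.
                throw new IllegalStateException(name + " is not a FormatableInstanceGetter");
            }
            FormatableInstanceGetter getter =
                (FormatableInstanceGetter) clazz.getDeclaredConstructor().newInstance();
            getter.setFormatId(fmtId);
            cache[fmtId] = getter;
            return getter;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot create getter for format id " + fmtId, e);
        }
    }
}
```

Calling classFromIdentifier twice with the same format id returns the
same getter object, which is the point of the rc2 cache.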
************************************************************
This current code will work fine for non-character type DVDs in Derby
10.3, but it won't work for character type DVDs. For example, for the
format id corresponding to SQL type CHAR, we want to return a DVD of
type either SQLChar or CollatorSQLChar, depending on the value of the
collation type. But the existing code will always return SQLChar. What
we want is for one format id to represent 2 DVDs, with the collation
type as the deciding factor. In order to support this, I am proposing
the following changes to the logic above so that we can have the
InstanceGetter return the correct DVD, even for character types.
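
To make the difference between the two kinds of DVD concrete, here is a
small standalone sketch (plain JDK, no Derby classes) of how the two
collations can order the same strings differently. CollationDemo is a
hypothetical name, and String.compareTo is used only as an
approximation of UCS_BASIC ordering.

```java
import java.text.Collator;
import java.util.Locale;

class CollationDemo {
    // UCS_BASIC-style ordering, approximated by String.compareTo,
    // which compares UTF-16 code unit values.
    static int compareBasic(String a, String b) {
        return a.compareTo(b);
    }

    // Territory-based ordering: delegate to the Collator for that locale.
    static int compareTerritory(String a, String b, Locale territory) {
        return Collator.getInstance(territory).compare(a, b);
    }
}
```

With these helpers, compareBasic("a", "B") is positive because 'a' has
a higher code value than 'B', while compareTerritory("a", "B",
Locale.US) is negative because the collator sorts alphabetically first.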
***********changes proposed to InstanceGetter***********
For collation sensitive format ids (those corresponding to character
types), I am proposing to create a new InstanceGetter class called
CollationSensitiveDTSClassInfo, which will extend DTSClassInfo. We
will change RegisteredFormatIds.TwoByte for such format ids to use
org.apache.derby.iapi.types.CollationSensitiveDTSClassInfo. We will
also need to remove the code for collation sensitive format ids from
DTSClassInfo, since they will be handled in the new InstanceGetter,
CollationSensitiveDTSClassInfo. This new InstanceGetter class will
have two additional fields called collatorForDVD and collationType,
and 2 setter methods, namely setRuleBasedCollator and
setCollationType. The public Object getNewInstance() method on this
InstanceGetter will have code like the following. (Note that I will
need to add a new constructor on the CollatorSQL.. classes that takes
just the RuleBasedCollator.)
    switch (fmtId) {
        /* Wrappers */
        case StoredFormatIds.SQL_CHAR_ID:
            if (collationType == StringDataValue.UCS_BASIC)
                return new SQLChar();
            else
                return new CollatorSQLChar(collatorForDVD);
        case StoredFormatIds.SQL_VARCHAR_ID:
            if (collationType == StringDataValue.UCS_BASIC)
                return new SQLVarchar();
            else
                return new CollatorSQLVarchar(collatorForDVD);
        case StoredFormatIds.SQL_LONGVARCHAR_ID:
            if (collationType == StringDataValue.UCS_BASIC)
                return new SQLLongvarchar();
            else
                return new CollatorSQLLongvarchar(collatorForDVD);
        case StoredFormatIds.SQL_CLOB_ID:
            if (collationType == StringDataValue.UCS_BASIC)
                return new SQLClob();
            else
                return new CollatorSQLClob(collatorForDVD);
        default:
            return null;
    }
The collatorForDVD will need to be set on this new InstanceGetter only
once, when the InstanceGetter is first created. If the user has
requested territory based collation, then collatorForDVD will be set
to the Collator that is derived from the database's territory. If the
user wants UCS_BASIC collation, then collatorForDVD will be set to the
JVM's default Collator. The collationType is subject to change,
depending on whether store is looking for character types belonging to
system tables (such types will always have a collation type of
UCS_BASIC) or for character types belonging to non-system tables (such
types will have a collation type of UCS_BASIC or TERRITORY_BASED,
depending on what the user has requested for the database). Based on
this, the logic for DVF.instanceGetterFromIdentifiers(fmtId,
collationType) will look as follows:
DVF will have a class level field called instanceGettersForFormatIds,
which will be an array of the same length as
RegisteredFormatIds.TwoByte. The elements in
instanceGettersForFormatIds will be InstanceGetters. Every time
DVF.instanceGetterFromIdentifiers(int fmtId, int collationType) is
called, the method will first check if there is already an
InstanceGetter in the instanceGettersForFormatIds array for the passed
format id. If yes, then it will check if that InstanceGetter is of
type CollationSensitiveDTSClassInfo. If it is, the method will set the
collationType on that InstanceGetter to the collationType passed to
instanceGetterFromIdentifiers and return that InstanceGetter. If the
InstanceGetter is not a CollationSensitiveDTSClassInfo, then the
method will simply return the InstanceGetter obtained from the
instanceGettersForFormatIds array.
In case DVF.instanceGetterFromIdentifiers(int fmtId, int
collationType) does not find an InstanceGetter cached for the passed
format id in the instanceGettersForFormatIds array, it will first get
the name of the InstanceGetter from RegisteredFormatIds using the
format id passed to the method. (For non-character DVDs, the name of
that InstanceGetter would be org.apache.derby.iapi.types.DTSClassInfo.
For character DVDs, the name of that InstanceGetter would be
org.apache.derby.iapi.types.CollationSensitiveDTSClassInfo.) Using
that name from RegisteredFormatIds, we will create a Class object (for
DVDs, that Class object would be
DTSClassInfo/CollationSensitiveDTSClassInfo) and will check if that
Class is of type FormatableInstanceGetter. If yes, then we create an
instance of that Class object (for non-character DVDs, this will
return an object of type DTSClassInfo; for character DVDs, this will
return an object of type CollationSensitiveDTSClassInfo) and set the
format id on it. For non-character DVDs, as a last step, we will cache
this FormatableInstanceGetter in the instanceGettersForFormatIds array
for future use. But for character DVDs, we will set the collationType
and RuleBasedCollator on the InstanceGetter AND then save it in
instanceGettersForFormatIds.
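
The proposed lookup can be sketched roughly as below. This is a
simplified illustration under several assumptions, not the intended
implementation: all class names (DvdGetter, PlainDvdGetter,
CollationAwareDvdGetter, SketchDvf), the format ids, and the constant
values are hypothetical, strings stand in for the real DVD objects, and
the reflective creation via RegisteredFormatIds is replaced by a direct
"new".

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical stand-in for Derby's InstanceGetter.
interface DvdGetter {
    Object getNewInstance();
}

// Stand-in for DTSClassInfo: collation plays no role.
class PlainDvdGetter implements DvdGetter {
    private final int fmtId;
    PlainDvdGetter(int fmtId) { this.fmtId = fmtId; }
    @Override public Object getNewInstance() { return "DVD#" + fmtId; }
}

// Stand-in for the proposed CollationSensitiveDTSClassInfo.
class CollationAwareDvdGetter implements DvdGetter {
    // Set once at creation; would be handed to the CollatorSQL.. constructor.
    private final Collator collatorForDVD;
    // Reset on every lookup.
    private int collationType;

    CollationAwareDvdGetter(Collator collatorForDVD) {
        this.collatorForDVD = collatorForDVD;
    }

    void setCollationType(int collationType) { this.collationType = collationType; }

    @Override public Object getNewInstance() {
        // Strings stand in for new SQLChar() / new CollatorSQLChar(collatorForDVD).
        return collationType == SketchDvf.UCS_BASIC ? "SQLChar" : "CollatorSQLChar";
    }
}

class SketchDvf {
    static final int UCS_BASIC = 0;        // hypothetical constant values
    static final int TERRITORY_BASED = 1;
    static final int CHAR_ID = 1;          // hypothetical format ids
    static final int INT_ID = 2;

    private final DvdGetter[] instanceGettersForFormatIds = new DvdGetter[4];
    private final Collator collatorForDVD;

    SketchDvf(Locale territory) {
        this.collatorForDVD = Collator.getInstance(territory);
    }

    DvdGetter instanceGetterFromIdentifiers(int fmtId, int collationType) {
        DvdGetter getter = instanceGettersForFormatIds[fmtId];
        if (getter == null) {
            // First request for this format id: create the getter and cache it.
            getter = (fmtId == CHAR_ID)
                ? new CollationAwareDvdGetter(collatorForDVD)
                : new PlainDvdGetter(fmtId);
            instanceGettersForFormatIds[fmtId] = getter;
        }
        // On every call, collation sensitive getters pick up the requested type.
        if (getter instanceof CollationAwareDvdGetter) {
            ((CollationAwareDvdGetter) getter).setCollationType(collationType);
        }
        return getter;
    }
}
```

One thing the sketch makes visible: since there is a single cached
getter per format id, a caller that keeps a getter obtained for
TERRITORY_BASED will see it produce UCS_BASIC objects after a later
lookup with UCS_BASIC. Whether that matters depends on how long store
holds on to a getter between lookups.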
As usual, I might have provided a lot of information, but hopefully it
will help in understanding the logic clearly. I will start looking at
implementing this, but if anyone has any feedback on the logic, I will
appreciate it.
thanks,
Mamta
On 4/12/07, Mike Matrigali wrote:
Mamta Satoor wrote:
> Mike, the following code will be part of DataValueFactory and hence it
> will be part of the interface. Please let me know if I am not very clear
> with what I am proposing or if you foresee problems with this logic.
>     if (dvd instanceof StringDataValue)
>         dvd = dvd.getValue(dvf.getCharacterCollator(type));

My comment isn't really about the logic; I think we are just not
talking about the same area. I think the code above belongs hidden
behind the new interfaces, in the implementation logic of the data
factory and data types. It is not an example of what callers of the
datatype should be doing.

> Also, in the following line below
> "I'll look at building/using DataFactory interface. It will be some"
> you mean DataValueFactory interface, right?
>
> Mamta

Yes, I meant the DataValueFactory interface. Let's work together on
getting the DataValueFactory interface right.
So far I have uncovered three basic ways store creates "empty" objects.
Note that store really only needs "empty" objects, i.e. it is going
to initialize the state of these objects from disk by calling each
object's readExternal() method. But we have decided not to store
the collation info as state in the object, so somehow we need to get
that info into the empty objects.
The ways store currently creates these objects:

1) using Monitor to get a dvd directly:
       dvd = Monitor.newInstanceFromIdentifier(format id)
   o I think this use is best implemented as Mamta suggests, just
     providing a non-static interface on the DataValueFactory,
     something like:
       DataValueFactory dvf = somehow cache and pass this around store;
       dvd = dvf.newInstance(format id, collation id);
     At this point dvd can be used to correctly compare against other
     dvds in possibly collation-specific ways.
2) using an existing dvd's class to get a new "empty" dvd that matches
   it (which is why it does not call clone):
       dvd = dvd.getClass().newInstance()
   o Less sure about this one. Seems like we need a new dvd interface
     that does the equivalent thing. I believe the original code got
     here because the original store code did not deal with DVDs; it
     just got objects, so it could not make dvd calls. There is a
     getNewNull() interface. Does anyone know if any runtime work
     would be saved over this by creating a getNewEmpty() interface?
       dvd = dvd.getNewEmpty();
     At this point dvd can be used to correctly compare against other
     dvds in possibly collation-specific ways.
3) optimized allocation, caching some of the work. This is used
   where one query may generate a large number of rows, for instance
   hash table scan and sorter calls. Here the idea is to do some
   part of the work once, leaving an InstanceGetter which can then
   repeatedly give back new objects in the most optimized way:
     called once:
       InstanceGetter = Monitor.classFromIdentifier(format id)
     called many times:
       dvd = InstanceGetter.getNewInstance()
   o Something like the following would be the direct conversion.
     Note that the implementation of the InstanceGetter is probably
     more complex now. It can't just remember a single class and call
     newInstance on it. It has to cache some info on what class to
     create and what collation to set in it.
     called once:
       DataValueFactory dvf = somehow cache and pass this around store;
       InstanceGetter =
           dvf.instanceGetterFromIdentifiers(format id, collation id)
     called many times:
       dvd = InstanceGetter.getNewInstance()
     Again, at this point dvd can be used to correctly compare against
     other dvds in possibly collation-specific ways.
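
For case 2 in the list above, here is a minimal sketch of what a
getNewEmpty() style call could look like. It assumes the collator is
held by the collation-aware object itself; SketchChar and
SketchCollatorChar are hypothetical stand-ins, not Derby's SQLChar and
CollatorSQLChar. The point is that Class.newInstance() can only use a
no-argument constructor, so it has no way to carry the collator over,
while an instance method can pass its own collator to the new object.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical stand-in for a UCS_BASIC character DVD.
class SketchChar {
    String value;

    // Returns a new "empty" object of the same kind as this one.
    SketchChar getNewEmpty() {
        return new SketchChar();
    }

    int compare(SketchChar other) {
        return value.compareTo(other.value);
    }
}

// Hypothetical stand-in for a territory-based character DVD.
class SketchCollatorChar extends SketchChar {
    private final Collator collator;

    SketchCollatorChar(Collator collator) {
        this.collator = collator;
    }

    // The empty copy inherits the collator, so it compares the same way.
    @Override
    SketchChar getNewEmpty() {
        return new SketchCollatorChar(collator);
    }

    @Override
    int compare(SketchChar other) {
        return collator.compare(value, other.value);
    }
}
```

An empty object obtained from a SketchCollatorChar this way is itself a
SketchCollatorChar and, once its value is filled in, compares using the
same collator as the object it came from.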
All 3 of these uses have to be replaced to allow store to create
"correct" types which can be used in possible string comparisons.