Hi Andy,

I find this discussion very useful and appreciate your insights as well.

On Jul 17, 2005, at 1:23 AM, Andy Jefferson wrote:

Hi Craig,

thanks for your reply and your insights.


Example 1 : Collection of BigDecimal
1. Basic collection
<field name="myfield">
    <collection element-type="java.math.BigDecimal"/>
    <join/>
</field>
This creates 2 tables: 1 for the class owning "myfield", and 1 join table to
contain the elements. If <join> is omitted then an error should be thrown
(though I'm not sure if JPOX currently flags this up).


The join element has no defaults, so this is not sufficient to
describe the mapping. You need at least a column attribute naming the
join column. And you need to name the column in the join table to map
the BigDecimal values to. So,

<field name="myfield" column="VALUES" table="MYFIELD_TABLE">
     <collection element-type="java.math.BigDecimal"/>
     <join column="JOIN_COLUMN"/>
</field>


I don't necessarily agree here. We have to qualify the statement as follows:
New schema : The JDO impl is perfectly capable of providing default namings 
for columns and perfectly capable of choosing the join columns ... since it 
has the (PK) columns in the main table. It provides default namings for 
columns in other situations. 
Existing schema : The user should specify the columns and table as you stated.

We discussed this in the expert group in the past, and came to the conclusion that default mappings are too much work, considering the range of implementations and data stores out there. We could spend lots of time coming up with rules for defaults, but felt that it was not worth our time.

That said, we don't require an implementation to throw an exception if it encounters an incomplete mapping. But we do require that an implementation support a completely specified mapping, and completely specified mappings are what the TCK will use.


embedded-element has no effect with this example because the element
(BigDecimal) is already embedded (in the join table), and has no
way of not being embedded.


I'd say that serialized implies embedded-element (and vice-versa,
which is why I'm now questioning the value of embedded-element as an
attribute).


OK. That wasn't my interpretation. I see 2 levels of embedded. We have 
"embedded" at field level, and "embedded-element" (or -key, -value) at 
collection/map level. I see embedded-element/key/value as saying that we want 
to embed the elements/keys/values in a join table (like in the example 9 in 
the spec), and embedded (at field level) saying that we want to embed the 
whole collection/map into the main table.

Well, we don't need the "embedded" term to describe whether we have a join table or not. That is accomplished simply by the existence of the join element. To restate, if there is a join element in the field metadata, then the field's data is contained in a join table. This is independent of the issue of whether the data is embedded or not.

It seems that the "embedded" term is almost universally misunderstood, and I need to do a better job of explaining it. 

Embedded as applied to PC types means that the persistent fields in the PC are stored as columns in the row individually. Which row depends on whether there is a join in the field metadata. This is where Example 9 is instructive, and I believe the example is correct. If serialized is specified as true, then the entire PC instance is serialized and stored in one column.
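As a rough sketch of the distinction for a single PC-typed field (the field name "address", its fields "street" and "city", and all column names are made up for illustration, and the nested <embedded> element is assumed to be the way the individual columns are named):

```xml
<!-- Embedded: each persistent field of the PC gets its own column in the owner's row -->
<field name="address" embedded="true">
    <embedded>
        <field name="street" column="ADDR_STREET"/>
        <field name="city" column="ADDR_CITY"/>
    </embedded>
</field>

<!-- Serialized: the entire PC instance is stored in one column -->
<field name="address" serialized="true" column="ADDR_BLOB"/>
```

The two <field> entries are alternatives for the same field, not something to combine in one class descriptor.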

Embedded as applied to Collection/Array/Map types is always true for relational data stores. If serialized is specified as true, then the entire instance is serialized and stored in one column. A join table is commonly used to store each element (or entry) in one row of the join table. Then the embedded-element, embedded-key, and embedded-value can be used to determine how the elements or entries themselves are stored. 

Somewhat of a digression: I don't know why we need embedded-element, embedded-key, and embedded-value, since we have the <embedded> element that is nested inside the <element>, <key>, and <value> elements.
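For illustration, a collection of PC elements embedded in the join table might be described with the nested <embedded> element roughly as follows. This is only a sketch: the field name, table name, element fields, and column names are all hypothetical.

```xml
<field name="addresses" table="OWNER_ADDRESSES">
    <collection element-type="Address"/>
    <!-- The join element puts the collection's data in a join table -->
    <join column="OWNER_ID"/>
    <!-- One join table row per element; the element's fields become columns of that row -->
    <element>
        <embedded>
            <field name="street" column="STREET"/>
            <field name="city" column="CITY"/>
        </embedded>
    </element>
</field>
```

With the nested <embedded> present, an embedded-element attribute on <collection> would carry no additional information, which is the redundancy noted above.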

I'll add some examples of this to the spec, probably in Chapter 15.


3. Embedded element
<field name="myfield">
    <collection element-type="MyElement" embedded-element="true"/>
    <join/>
</field>
This creates 2 tables: 1 for the class owning "myfield", and 1 join table
containing the elements (columns aligned with the fields of the PC element).


Since it's embedded-element, I think there is only one table that contains
all the fields in the class, including the Collection of MyElement.

You can't map the columns of an embedded Collection of PC elements
because you would need one column for each field in each PC, which is
a variable number of columns. And tables have a fixed number of
columns. So the mapping has to either serialize the Collection and
store it into a BLOB column or use another table. For embedded
collection,


As per my comment above, I didn't interpret embedded-element/key/value like 
this. I'll try to justify this, with a map this time :-) ...
1. We have a map with embedded-key=false, embedded-value=false. This ends up 
with a main table, and a join table, and optionally (if the keys/values are 
PC), tables for key and value. No disagreement there.

Almost. Just to emphasize, there is one row in the join table for each Map.Entry consisting of a key and a value and a reference back to the primary table for the class. If the key is an Integer, Long, Short, etc., it is actually an embedded-key in the join table row. And this is the default for Integer. 

2. We have a map with embedded-key=true, embedded-value=false. How would you 
store these ?

Would you have a BLOB column for the map keys, and have the map 
values stored off in their own table (since they aren't embedded) ?

Depends on the type of the key. If it is a PC type, embedded-key true means that the key is mapped to possibly multiple columns in the join table row. If it is an Integer, then embedded-key is the default, and means the "normal" mapping of using a column in the join table to store the Integer value.

Embedded-value false means that the values are stored elsewhere (like in the table containing the extent of the PC instances). And the only thing in the join table row is a foreign key to the primary table of the PC type.
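A sketch of that combination, assuming a PC key type and a PC value type (the names MyKey, MyValue, their fields, and all table and column names are invented for the example):

```xml
<field name="myMap" table="OWNER_MYMAP">
    <map key-type="MyKey" value-type="MyValue"
         embedded-key="true" embedded-value="false"/>
    <!-- Reference from each join table row back to the owning class's primary table -->
    <join column="OWNER_ID"/>
    <!-- Embedded key: the key's fields map to columns of the join table row -->
    <key>
        <embedded>
            <field name="code" column="KEY_CODE"/>
            <field name="region" column="KEY_REGION"/>
        </embedded>
    </key>
    <!-- Non-embedded value: only a foreign key to the value's own table -->
    <value column="VALUE_ID"/>
</field>
```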

This 
would make managing the map a bit tricky for the JDO impl (to say the 
least!). I would store the keys as embedded into the join table (as per 
example 9 in the spec - multiple columns in the join table lining up with the 
fields in the key), and have the values in their own table (if PC) with a FK 
from the join table. This makes it simple for the JDO impl to manage the map 
since the keys and values are stored in the join table.



3. We have a map with embedded-key=true, embedded-value=true. How would you 
store this ?

If the map were a Map<Integer, Address>, embedded-key true means to store the key in the join table row, and embedded-value true means to store the value in multiple columns of the same join table row.
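Roughly, and with hypothetical names throughout (the field name, the Address fields "street" and "city", and the table and column names are assumptions for the sketch):

```xml
<field name="addressesByNumber" table="OWNER_ADDRESSES">
    <map key-type="java.lang.Integer" value-type="Address"
         embedded-key="true" embedded-value="true"/>
    <join column="OWNER_ID"/>
    <!-- Integer key: a single column in the join table row -->
    <key column="ADDR_KEY"/>
    <!-- Embedded PC value: multiple columns in the same join table row -->
    <value>
        <embedded>
            <field name="street" column="STREET"/>
            <field name="city" column="CITY"/>
        </embedded>
    </value>
</field>
```

Each Map.Entry then occupies one join table row holding OWNER_ID, ADDR_KEY, STREET, and CITY.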

Would you store them as a single BLOB column in the main table? 
I would store the key AND value in the join table (as per example 9 in the 
spec - so we gain columns for key, and columns for value in the join table).

Use serialized true for the field itself to get this behavior, and don't use embedded-key or embedded-value at all.
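As a minimal sketch of the single-BLOB case (field and column names are hypothetical), there is no join element and no embedded-key or embedded-value, only serialized on the field:

```xml
<!-- The whole map is serialized into one column of the owning class's table -->
<field name="addressesByNumber" serialized="true" column="ADDRESSES_BLOB">
    <map key-type="java.lang.Integer" value-type="Address"/>
</field>
```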

As for your point above about variable number of columns, well example 9 in 
the spec is just this case. It embeds the element of a collection into a join 
table. This is "embedded" because the elements are not stored as FCO's - they 
are embedded into the join table (which is effectively a secondary table 
owned by the main table - and which represents the collection).

This is actually embedded-element true as well as embedded true. The join element indicates that the values are stored outside the primary table.

Now if the user specifies <field embedded="true"> then I would expect to have 
to store the whole collection/map as a single BLOB column (like serialized, 
so why do we have a serialized attribute too ?)

I'd welcome any clarification from the people in the know who spent a lot of 
time designing these levels of specification, and from the JDO vendors that 
have supported such embedding for some time on what we interpret these 
attributes as.


It's been on my to-do list for the specification for a while to add
mapping for arrays, lists, sets, and maps to Chapter 15. This might
be the time to actually do it.


That would be great. The spec covers many many situations and the metadata is 
largely intuitive as to what people specify, but I feel we're missing 
clarification on this one part.

Agree.

Craig


Thanks!
-- 
Andy
Java Persistent Objects - JPOX


Craig Russell

Architect, Sun Java Enterprise System http://java.sun.com/products/jdo

408 276-5638 mailto:[EMAIL PROTECTED]

P.S. A good JDO? O, Gasp!

