[ http://issues.apache.org/jira/browse/JCR-325?page=all ]

Jukka Zitting updated JCR-325:
------------------------------

    Attachment: xml-refactoring.patch

I've gone through the importer code and come up with a plan on how to implement 
the interpretation described above. The current importer design makes it quite 
hard to postpone the value parsing decisions until the applicable property 
definitions are known, which effectively prevents form implementing the 
proposed heuristics. To make this easier and to simplify overall code I'd like 
to encapsulate the value parsing rules to the PropInfo instances so that the 
SessionImporter or WorkspaceImporter classes wouldn't need to know anything 
about where the values came from (sys view or doc view) and how they should be 
parsed. I could then create separate SysViewPropInfo and DocViewPropInfo 
subclasses to encapsulate the different value parsing rules of the system view 
and document view XML encodings. Doing this however requires quite extensive 
refactoring of the current code, so I'd like to get a confirmation on this plan 
before I proceed.

The attached patch represents the first steps of this refactoring. Unless I've 
made a mistake, the patch doesn't yet change any of the existing behaviour 
(except for the location where some exceptions and log messages are generated), 
it just refactors the code structure so that it is easier to implement the 
SysViewPropInfo and DocViewPropInfo classes. The reason for publishing the 
patch at this point is to separate the structural changes from the behavioural 
changes and thus make the changes easier to review. I will continue with the 
behavioural changes if this structural change is approved.

Below is a breakdown of the refactoring steps included in this patch. I did 
almost all of these changes using the automatic refactoring tools in Eclipse to 
minimize the chance of accidentally introducing errors in the code. The patched 
source also passes all unit tests and seems to import both system view and 
document view files just as before, which makes me quite confident in the 
quality of the refactoring.

1) Move the following internal classes and interfaces to new files to make the 
class structure easier to manage:

   * Importer.TextValue -> TextValue
   * Importer.PropInfo -> PropInfo
   * Importer.NodeInfo -> NodeInfo
   * TargetImportHandler.AppendableValue -> AppendableValue
   * TargetImportHandler.BufferedStringValue -> BufferedStringValue
   * TargetImportHandler.StringValue -> StringValue

2) Remove the NodeInfo and PropInfo setters and make the member fields "private 
final" to enforce their immutability

3) Refactor the TargetImportHandler.disposePropertyValues(PropInfo) method into 
PropInfo.dispose() to improve encapsulation

4) Move the AppendableValue.dispose() method up to TextValue and implement it 
as a null method in StringValue to avoid type casts in PropInfo.dispose()

5) Remove the AppendableValue interface to make the class structure more simple 
as the SysViewImportHandler can just as well use the BufferedStringValue class 
directly

6) Add a TextValue.getNamespaceContext() method and corresponding "private 
final NamespaceResolver nsContext;" fields in StringValue and 
BufferedStringValue to associate the value instances with the namespace context 
in which they should be parsed. Remove the "nsContext" argument from 
Importer.startNode() and use the TextValue.getNamespaceContext() method in 
SessionImporter and WorkspaceImporter to access the correct namespace context.

7) Refactor the contents of the property iterator loop in 
WorkspaceImporter.startNode() into PropInfo.apply(NodeState, 
BatchedItemOperations, NodeTypeRegistry, ReferenceChangeTracker) and the 
contents of the similar loop in SessionImporter.startNode() into 
PropInfo.apply(NodeImpl, NamespaceResolver, ReferenceChangeTracker) to simplify 
the huge startNode() methods and to encapsulate the value parsing logic into 
PropInfo. This also allows the methods to easily share code through extracted 
getTargetType() and getApplicablePropertyDef() methods and enables other 
structural simplifications.

8) Remove the now unneeded PropInfo getters.

9) Refactor the contents of the value iterator loops in the PropInfo.apply() 
methods to TextValue.getValue(int type, NamespaceResolver resolver) and 
TextValue.getInternalValue(int type) methods and copy the implementation to 
StringValue and BufferedStringValue to increase encapsulation. In both cases 
the implementation can be simplified thanks to local access to the value 
details.

10) Remove the length(), retrieve(), reader(), and getNamespaceContext() 
methods from the TextValue interface to ensure proper encapsulation. Those 
methods can also be fully removed from StringValue, but in BufferedStringValue 
they still have a purpose due to the more complex storage model. Refactor the 
JCR_MIXINTYPES branch in DocViewImportContentHandler to use 
TextValue.getValue() instead of StringValue.retrieve() to get the mixin type 
names.

Overall these structural changes improve a number of quality metrics. The 
encapsulation level is higher, the method and class bodies are shorter, and 
even the total amount of code is slightly lower.

> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Improvement
>   Components: xml
>     Versions: 0.9
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra
>     Assignee: Jukka Zitting
>  Attachments: xml-refactoring.patch
>
> when exporting a multivalue property with docview, the property values are 
> serialized to a space delimited list in the xml attributes:
> for example:
> <?xml version="1.0" encoding="UTF-8"?>
> .
> .
> <testNode 
>     jcr:primaryType="refTest" 
>     refs="b5c12524-5446-4c1a-b024-77f623680271 
> 7b4d4e6f-9515-47d8-a77c-b4beeaf469bc"
> />
> the refTest nodetype was:
> [refTest] 
> - refs (reference) multiple 
> importing this docview fails with: javax.jcr.ValueFormatException: not a 
> valid UUID format
> this is due to the fact, that the space delimited list is not exploded 
> anymore. actually this code is commented:
> org.apache.jackrabbit.core.xml.DocViewImportHandler, lines 191 - 200:
> /*
>                 // @todo should attribute value be interpreted as LIST type 
> (i.e. multi-valued property)?
>                 String[] strings = Text.explode(attrValue, ' ', true);
>                 propValues = new Value[strings.length];
>                 for (int j = 0; j < strings.length; j++) {
>                     // decode encoded blanks in value
>                     strings[j] = Text.replace(strings[j], "_x0020_", " ");
>                     propValues[j] = InternalValue.create(strings[j]);
>                 }
> */
> i haven't tested, but i assume this also fails for all other non-string 
> property types.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

Reply via email to