[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

2006-11-30 Thread Stefan Guggisberg (JIRA)
[ 
http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12454691 ] 

Stefan Guggisberg commented on JCR-569:
---

 Jukka Zitting commented on JCR-569:
 ---
 
 Committed some small-scale improvements in revision 477142.
 
 Looking at the class more closely, it seems like the key to simplifying the 
 code flows would be to get rid of the aborted and succeeded marker flags, 
 relying more on code structure to indicate the current state of the 
 computation. See the patch attached to JCR-546 for one possible approach to 
 removing (or reducing the visibility of) the succeeded flag. The aborted 
 flag could possibly be replaced by using the State pattern like in the 
 JCR-RMI Value implementation.

+1

cheers
stefan

 WorkspaceImporter Refactoring
 -

 Key: JCR-569
 URL: http://issues.apache.org/jira/browse/JCR-569
 Project: Jackrabbit
  Issue Type: Improvement
Reporter: Nicolas Toper
 Assigned To: Jukka Zitting
 Attachments: GenericImporter.patch, SysViewImporter.patch, 
 WorkspaceImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch


 Hi,
 As you know, I have run into an issue with the backup tool using the 
 WorkspaceImporter. I ended up copy/pasting large body of code since the 
 current WorkspaceImporter was not flexible enough for my use (in my class 
 called SysViewImporter). This solution was perfectly valid for Google SoC (a 
 lot of time constraints) but unacceptable in the long run for any project (we 
 hate large body of duplicate code :p).
 Also, some issues have been raised with this class (i.e. jcr:root 
 importation, allowsSameNameSiblings issue). 
 Overall I feel this class  is circumvoluted and really hard to understand. 
 For instance, the current code contains at most 5 imbricated if and uses a 
 lot of different ways to pass information (stacks, objects set on null). 
 I tried to refactor my SysViewImporter and the WorkspaceImporter but it 
 implies a new code for the WorkspaceImporter and the SysViewImporter. Here is 
 its skeleton! I first wanted to gather the community feedback before stepping 
 in. I tried moving all work code away from the startNode method and 
 reorganise it for readilibility.
 Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

2006-10-13 Thread Nicolas Toper (JIRA)
[ 
http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12442223 ] 

Nicolas Toper commented on JCR-569:
---


   [[ Old comment, sent by email on Wed, 13 Sep 2006 18:00:23 +0200 ]]

Hi Stefan,

About your first point, you are right: this is why I sent yesterday a more
in depth analysis. This JIRA issue will be used as Jukka suggested it, for
refactoring issue of this class only.

I would like to create two classes: one generic Importer (name to be
defined) and one dedicated to import a workspace. I would use the generic
Importer for the backup and others could use it when they need to batch load
some data (of course, it is more complex to use and set up). The
WorkspaceImporter would be used for the same use case as now. The
WorkspaceImporter constructor would stay the same to avoid backward
compatibility issue.

Here is a quick descriptive of each methods of the generic Importer. They
are private since no other class need them (this class is used with the
ContentHandler).

 private boolean checkNode(NodeInfo nodeInfo)
Check if the node can be imported to the repository. For instance, it can
try to delete a protected node, the same node might already exist (it can
depend on the ImportUUIDBehavior used) or the node can be incorrect
according to the node type in the repository.

Some checks are escaped if raw mode is chosen.

If the node is not correct, it puts the skip mode on true to exclude this
subtree. When the subtree is over, the skip mode is put to off. If the error
is serious (constraint issue for instance), an exception will be thrown.


 private NodeState createNode(NodeState parent, NodeInfo nodeInfo)
Create the NodeState depending according to ImportUUIDBehavior flag

 private void createProperties(NodeState myNode, List propInfos)
Create the properties depending according to ImportUUIDBehavior flag

 private boolean isSkipped(NodeInfo nodeInfo)
if we are in skip mode return true; else false.


private void postProcess(NodeState node)
Initialize if raw == false,  this method is not called. It will be empty for
the generic Importer (we don't need to initialize any thing) but
WorkspaceImporter will overload it.

protected NodeState resolveUUIDConflict
Resolve UUID conflict :p

 private boolean isAddabble(NodeState parent, NodeInfo nodeInfo) throws
ConstraintViolationException, AccessDeniedException, VersionException,
LockException, ItemNotFoundException, ItemExistsException,
RepositoryException
This method is not needed.

 private boolean isCorrect(NodeState parent, NodeInfo nodeInfo, List
propInfos)
Not needed.

If you are OK with this approach, I will work on this and propose you soon a
patch implementing this refactoring.

BR,
Nico
my blog! http://www.deviant-abstraction.net !!


 WorkspaceImporter Refactoring
 -

 Key: JCR-569
 URL: http://issues.apache.org/jira/browse/JCR-569
 Project: Jackrabbit
  Issue Type: Improvement
Reporter: Nicolas Toper
 Assigned To: Jukka Zitting
 Attachments: GenericImporter.patch, SysViewImporter.patch, 
 WorkspaceImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch


 Hi,
 As you know, I have run into an issue with the backup tool using the 
 WorkspaceImporter. I ended up copy/pasting large body of code since the 
 current WorkspaceImporter was not flexible enough for my use (in my class 
 called SysViewImporter). This solution was perfectly valid for Google SoC (a 
 lot of time constraints) but unacceptable in the long run for any project (we 
 hate large body of duplicate code :p).
 Also, some issues have been raised with this class (i.e. jcr:root 
 importation, allowsSameNameSiblings issue). 
 Overall I feel this class  is circumvoluted and really hard to understand. 
 For instance, the current code contains at most 5 imbricated if and uses a 
 lot of different ways to pass information (stacks, objects set on null). 
 I tried to refactor my SysViewImporter and the WorkspaceImporter but it 
 implies a new code for the WorkspaceImporter and the SysViewImporter. Here is 
 its skeleton! I first wanted to gather the community feedback before stepping 
 in. I tried moving all work code away from the startNode method and 
 reorganise it for readilibility.
 Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

2006-09-17 Thread Jukka Zitting (JIRA)
[ 
http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12435341 ] 

Jukka Zitting commented on JCR-569:
---

XML import functionality is tested as a part of the JCR test suite that is 
automatically invoked when you run maven jar to build Jackrabbit. Currently 
the patch causes 14 unit test errors due to NullPointerExceptions.

 WorkspaceImporter Refactoring
 -

 Key: JCR-569
 URL: http://issues.apache.org/jira/browse/JCR-569
 Project: Jackrabbit
  Issue Type: Improvement
Reporter: Nicolas Toper
 Attachments: GenericImporter.patch, SysViewImporter.patch, 
 WorkspaceImporter.patch


 Hi,
 As you know, I have run into an issue with the backup tool using the 
 WorkspaceImporter. I ended up copy/pasting large body of code since the 
 current WorkspaceImporter was not flexible enough for my use (in my class 
 called SysViewImporter). This solution was perfectly valid for Google SoC (a 
 lot of time constraints) but unacceptable in the long run for any project (we 
 hate large body of duplicate code :p).
 Also, some issues have been raised with this class (i.e. jcr:root 
 importation, allowsSameNameSiblings issue). 
 Overall I feel this class  is circumvoluted and really hard to understand. 
 For instance, the current code contains at most 5 imbricated if and uses a 
 lot of different ways to pass information (stacks, objects set on null). 
 I tried to refactor my SysViewImporter and the WorkspaceImporter but it 
 implies a new code for the WorkspaceImporter and the SysViewImporter. Here is 
 its skeleton! I first wanted to gather the community feedback before stepping 
 in. I tried moving all work code away from the startNode method and 
 reorganise it for readilibility.
 Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

2006-09-17 Thread Jukka Zitting (JIRA)
[ 
http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12435342 ] 

Jukka Zitting commented on JCR-569:
---

Two other issues about the patch:

1) The modified class doesn't implement the Importer interface
2) There's a reference to GenericImporter that causes a compile error

 WorkspaceImporter Refactoring
 -

 Key: JCR-569
 URL: http://issues.apache.org/jira/browse/JCR-569
 Project: Jackrabbit
  Issue Type: Improvement
Reporter: Nicolas Toper
 Attachments: GenericImporter.patch, SysViewImporter.patch, 
 WorkspaceImporter.patch


 Hi,
 As you know, I have run into an issue with the backup tool using the 
 WorkspaceImporter. I ended up copy/pasting large body of code since the 
 current WorkspaceImporter was not flexible enough for my use (in my class 
 called SysViewImporter). This solution was perfectly valid for Google SoC (a 
 lot of time constraints) but unacceptable in the long run for any project (we 
 hate large body of duplicate code :p).
 Also, some issues have been raised with this class (i.e. jcr:root 
 importation, allowsSameNameSiblings issue). 
 Overall I feel this class  is circumvoluted and really hard to understand. 
 For instance, the current code contains at most 5 imbricated if and uses a 
 lot of different ways to pass information (stacks, objects set on null). 
 I tried to refactor my SysViewImporter and the WorkspaceImporter but it 
 implies a new code for the WorkspaceImporter and the SysViewImporter. Here is 
 its skeleton! I first wanted to gather the community feedback before stepping 
 in. I tried moving all work code away from the startNode method and 
 reorganise it for readilibility.
 Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

2006-09-17 Thread Nicolas Toper (JIRA)
[ 
http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12435345 ] 

Nicolas Toper commented on JCR-569:
---

Just to be sure everything is all right: I have checked with my machine. Only 2 
tests are run when calling o.a.j.test.core.xml. Are they the only ones run?

Is the JCR test suite what is in /jackrabbit/src/test?

Nico

 WorkspaceImporter Refactoring
 -

 Key: JCR-569
 URL: http://issues.apache.org/jira/browse/JCR-569
 Project: Jackrabbit
  Issue Type: Improvement
Reporter: Nicolas Toper
 Attachments: GenericImporter.patch, SysViewImporter.patch, 
 WorkspaceImporter.patch, WorkspaceImporter.patch


 Hi,
 As you know, I have run into an issue with the backup tool using the 
 WorkspaceImporter. I ended up copy/pasting large body of code since the 
 current WorkspaceImporter was not flexible enough for my use (in my class 
 called SysViewImporter). This solution was perfectly valid for Google SoC (a 
 lot of time constraints) but unacceptable in the long run for any project (we 
 hate large body of duplicate code :p).
 Also, some issues have been raised with this class (i.e. jcr:root 
 importation, allowsSameNameSiblings issue). 
 Overall I feel this class  is circumvoluted and really hard to understand. 
 For instance, the current code contains at most 5 imbricated if and uses a 
 lot of different ways to pass information (stacks, objects set on null). 
 I tried to refactor my SysViewImporter and the WorkspaceImporter but it 
 implies a new code for the WorkspaceImporter and the SysViewImporter. Here is 
 its skeleton! I first wanted to gather the community feedback before stepping 
 in. I tried moving all work code away from the startNode method and 
 reorganise it for readilibility.
 Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

2006-09-17 Thread Jukka Zitting (JIRA)
[ 
http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12435349 ] 

Jukka Zitting commented on JCR-569:
---

The patch still fails. The failing tests are in org.apache.jackrabbit.test.api. 
I'm getting:

[junit] Tests run: 584, Failures: 0, Errors: 14, Time elapsed: 17,734 sec
[junit] [ERROR] TEST org.apache.jackrabbit.test.api.TestAll FAILED

You can run the unit tests individually, but I'd really recommend you to use 
maven jar (or even maven clean jar) at least as the final verification step 
before submitting patches. Detailed test reports get saved in 
target/test-reports.

 WorkspaceImporter Refactoring
 -

 Key: JCR-569
 URL: http://issues.apache.org/jira/browse/JCR-569
 Project: Jackrabbit
  Issue Type: Improvement
Reporter: Nicolas Toper
 Attachments: GenericImporter.patch, SysViewImporter.patch, 
 WorkspaceImporter.patch, WorkspaceImporter.patch


 Hi,
 As you know, I have run into an issue with the backup tool using the 
 WorkspaceImporter. I ended up copy/pasting large body of code since the 
 current WorkspaceImporter was not flexible enough for my use (in my class 
 called SysViewImporter). This solution was perfectly valid for Google SoC (a 
 lot of time constraints) but unacceptable in the long run for any project (we 
 hate large body of duplicate code :p).
 Also, some issues have been raised with this class (i.e. jcr:root 
 importation, allowsSameNameSiblings issue). 
 Overall I feel this class  is circumvoluted and really hard to understand. 
 For instance, the current code contains at most 5 imbricated if and uses a 
 lot of different ways to pass information (stacks, objects set on null). 
 I tried to refactor my SysViewImporter and the WorkspaceImporter but it 
 implies a new code for the WorkspaceImporter and the SysViewImporter. Here is 
 its skeleton! I first wanted to gather the community feedback before stepping 
 in. I tried moving all work code away from the startNode method and 
 reorganise it for readilibility.
 Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

2006-09-13 Thread Stefan Guggisberg (JIRA)
[ 
http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434386 ] 

Stefan Guggisberg commented on JCR-569:
---

 Overall I feel this class  is circumvoluted and really hard to understand. 
 For instance, the current code contains at most 5 imbricated if and uses a 
 lot of different ways to pass information (stacks, objects set on null).
 

sorry, i can't follow you here. what does 'circumvoluted' mean? what does 
'imbricated' mean? 
please note that if something is 'hard to understand' does not necessarily mean 
that it 
is too complex.

 I tried to refactor my SysViewImporter and the WorkspaceImporter but it 
 implies a new code for the WorkspaceImporter and the SysViewImporter. Here is 
 its skeleton! I first wanted to gather the community feedback before stepping 
 in. I tried moving all work code away from the startNode method and 
 reorganise it for readilibility.

what's the use of the private 'isAddabble'(?),  checkNode etc methods? why are 
they private?



 WorkspaceImporter Refactoring
 -

 Key: JCR-569
 URL: http://issues.apache.org/jira/browse/JCR-569
 Project: Jackrabbit
  Issue Type: Improvement
Reporter: Nicolas Toper
 Attachments: SysViewImporter.patch


 Hi,
 As you know, I have run into an issue with the backup tool using the 
 WorkspaceImporter. I ended up copy/pasting large body of code since the 
 current WorkspaceImporter was not flexible enough for my use (in my class 
 called SysViewImporter). This solution was perfectly valid for Google SoC (a 
 lot of time constraints) but unacceptable in the long run for any project (we 
 hate large body of duplicate code :p).
 Also, some issues have been raised with this class (i.e. jcr:root 
 importation, allowsSameNameSiblings issue). 
 Overall I feel this class  is circumvoluted and really hard to understand. 
 For instance, the current code contains at most 5 imbricated if and uses a 
 lot of different ways to pass information (stacks, objects set on null). 
 I tried to refactor my SysViewImporter and the WorkspaceImporter but it 
 implies a new code for the WorkspaceImporter and the SysViewImporter. Here is 
 its skeleton! I first wanted to gather the community feedback before stepping 
 in. I tried moving all work code away from the startNode method and 
 reorganise it for readilibility.
 Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

2006-09-13 Thread Nicolas Toper (JIRA)
[ 
http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434496 ] 

Nicolas Toper commented on JCR-569:
---

Sure. Here it is:

Introduction
The current o.a.j.core.xml.WorspaceImporter class has one main responsability: 
to import data provided by a ContentHandler to the repository. It needs to 
check imported data to maintain consistancy and coherence of the repository and 
in some cases rejects them.

The calling stack is:
- WorkspaceImporter implements Importer
- SysViewImportHandler
- SaxParser (and various classes associated)
- Other classes

The Importer interface is composed of 4 methods: start, end, startNode, 
endNode. The current WorkspaceImporter contains only two protected classe

Currently the code is complex to grasp and have some issues I would like to 
solve with this iteration of the backup tool (I am using this class heavily).

Issues I am trying to solve:
- Methods are too long: there are 7 methods in this class for 619 lines. In 
average one method per 88 lines. This goes with the 4 levels of if/else. This 
code is hard to debug.

- Complexity: there is no clear separation of concerns between methods. This is 
partly due to the Importer interface quite close to the XML. For instance, 
startNode() checks the validity of the import ad endNode do. We need to create 
more methods giving a role to each one.

- Factorizing/Code duplicate: Right now we check several times the same thing 
and sometimes not at all. For instance we check the sanity of the workspace at 
the end of each node and at the end of the import. We also check several times 
if a node can be added without disrupting the consistancy of the repository. We 
should have a private dedicated method for that.

- Bugs: jcr:root is not handled correctly and other bugs have been detected. 
Since the code is hardly readable,

-  Lack of flexibility: I will give just an example. Because of this monolothic 
design (only 7 methods), the checks are embedded in each method (cf. upper). I 
had to subclass the 4 methods of the Importer interface. Also the 
WorkspaceImporter initiates versioning if needed and remap some UUID all the 
time.

Solution
My proposal is in four parts:

- Add some private methods to abstract some of the complex issue this class is 
managing

- Add more control in the constructor (for instance for our backup tool): add a 
constructor with the BatchedItemOperation so we can choose if needed the needed 
ItemStateManager. (For instance to restore the version history).

- Add a raw flag in the constructor to import the content as it is without 
initializing versioning and escaping some checks. This is a use case already 
encountered

- With this new design, solve remaining bugs and handle the jcr:root issue.

This new class will be subclassed by WorkspaceImporter to handle the workspace 
(I am not using it for now) and for the sanityCheck(): therefore it will be 
only a few lines and we will have good SoC.

Please have a look as the patch proposed earlier to have a look at the design 
and the new code (it is far from over and a lot of methods are incomplete).

BR
Nicolas

 WorkspaceImporter Refactoring
 -

 Key: JCR-569
 URL: http://issues.apache.org/jira/browse/JCR-569
 Project: Jackrabbit
  Issue Type: Improvement
Reporter: Nicolas Toper
 Attachments: SysViewImporter.patch


 Hi,
 As you know, I have run into an issue with the backup tool using the 
 WorkspaceImporter. I ended up copy/pasting large body of code since the 
 current WorkspaceImporter was not flexible enough for my use (in my class 
 called SysViewImporter). This solution was perfectly valid for Google SoC (a 
 lot of time constraints) but unacceptable in the long run for any project (we 
 hate large body of duplicate code :p).
 Also, some issues have been raised with this class (i.e. jcr:root 
 importation, allowsSameNameSiblings issue). 
 Overall I feel this class  is circumvoluted and really hard to understand. 
 For instance, the current code contains at most 5 imbricated if and uses a 
 lot of different ways to pass information (stacks, objects set on null). 
 I tried to refactor my SysViewImporter and the WorkspaceImporter but it 
 implies a new code for the WorkspaceImporter and the SysViewImporter. Here is 
 its skeleton! I first wanted to gather the community feedback before stepping 
 in. I tried moving all work code away from the startNode method and 
 reorganise it for readilibility.
 Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

2006-09-13 Thread Nicolas Toper (JIRA)
[ 
http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434497 ] 

Nicolas Toper commented on JCR-569:
---

Hi Stefan,

About your first point, you are right: this is why I sent yesterday a more in 
depth analysis. This JIRA issue will be used as Jukka suggested it, for 
refactoring issue of this class only.

I would like to create two classes: one generic Importer (name to be defined) 
and one dedicated to import a workspace. I would use the generic Importer for 
the backup and others could use it when they need to batch load some data (of 
course, it is more complex to use and set up). The WorkspaceImporter would be 
used for the same use case as now. The WorkspaceImporter constructor would stay 
the same to avoid backward compatibility issue.

Here is a quick descriptive of each methods of the generic Importer. They are 
private since no other class need them (this class is used with the 
ContentHandler).

 private boolean checkNode(NodeInfo nodeInfo)
Check if the node can be imported to the repository. For instance, it can try 
to delete a protected node, the same node might already exist (it can depend on 
the ImportUUIDBehavior used) or the node can be incorrect according to the node 
type in the repository.

Some checks are escaped if raw mode is chosen.

If the node is not correct, it puts the skip mode on true to exclude this 
subtree. When the subtree is over, the skip mode is put to off. If the error is 
serious (constraint issue for instance), an exception will be thrown.


 private NodeState createNode(NodeState parent, NodeInfo nodeInfo)
Create the NodeState depending according to ImportUUIDBehavior flag

 private void createProperties(NodeState myNode, List propInfos)
Create the properties depending according to ImportUUIDBehavior flag

 private boolean isSkipped(NodeInfo nodeInfo)
if we are in skip mode return true; else false.


private void postProcess(NodeState node)
Initialize if raw == false,  this method is not called. It will be empty for 
the generic Importer (we don't need to initialize any thing) but 
WorkspaceImporter will overload it.

protected NodeState resolveUUIDConflict
Resolve UUID conflict :p

 private boolean isAddabble(NodeState parent, NodeInfo nodeInfo) throws 
ConstraintViolationException, AccessDeniedException, VersionException, 
LockException, ItemNotFoundException, ItemExistsException, RepositoryException
This method is not needed.

 private boolean isCorrect(NodeState parent, NodeInfo nodeInfo, List propInfos)
Not needed.

If you are OK with this approach, I will work on this and propose you soon a 
patch implementing this refactoring.

BR,
Nico
my blog! http://www.deviant-abstraction.net !!

 WorkspaceImporter Refactoring
 -

 Key: JCR-569
 URL: http://issues.apache.org/jira/browse/JCR-569
 Project: Jackrabbit
  Issue Type: Improvement
Reporter: Nicolas Toper
 Attachments: SysViewImporter.patch


 Hi,
 As you know, I have run into an issue with the backup tool using the 
 WorkspaceImporter. I ended up copy/pasting large body of code since the 
 current WorkspaceImporter was not flexible enough for my use (in my class 
 called SysViewImporter). This solution was perfectly valid for Google SoC (a 
 lot of time constraints) but unacceptable in the long run for any project (we 
 hate large body of duplicate code :p).
 Also, some issues have been raised with this class (i.e. jcr:root 
 importation, allowsSameNameSiblings issue). 
 Overall I feel this class  is circumvoluted and really hard to understand. 
 For instance, the current code contains at most 5 imbricated if and uses a 
 lot of different ways to pass information (stacks, objects set on null). 
 I tried to refactor my SysViewImporter and the WorkspaceImporter but it 
 implies a new code for the WorkspaceImporter and the SysViewImporter. Here is 
 its skeleton! I first wanted to gather the community feedback before stepping 
 in. I tried moving all work code away from the startNode method and 
 reorganise it for readilibility.
 Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

2006-09-13 Thread Jukka Zitting (JIRA)
[ 
http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434503 ] 

Jukka Zitting commented on JCR-569:
---

Nicolas:
 I would like to create two classes: one generic Importer (name to be defined) 
 and one dedicated to import a workspace.

This would still be a part of the flexibility goal (E in my message to the 
mailing list). To keep things simple I'd suggest that you limit this issue to a 
patch that refactors the methods within WorkspaceImporter without introducing 
another class.

Note also that the responsibility of WorkspaceImporter is not to import a 
workspace but to import content *into* a workspace.

 [The methods of the generic Importer] are private since no other class need 
 them (this class is used with the ContentHandler).

How would a subclass modify the behaviour of the class if those methods are 
private?

 If you are OK with this approach, I will work on this and propose you soon a 
 patch implementing this refactoring. 

It looks OK, but I think the main issue is seeing how the existing code gets 
mapped to the proposed structure. A patch would be the perfect way to show 
this. :-)

 WorkspaceImporter Refactoring
 -

 Key: JCR-569
 URL: http://issues.apache.org/jira/browse/JCR-569
 Project: Jackrabbit
  Issue Type: Improvement
Reporter: Nicolas Toper
 Attachments: SysViewImporter.patch


 Hi,
 As you know, I have run into an issue with the backup tool using the 
 WorkspaceImporter. I ended up copy/pasting large body of code since the 
 current WorkspaceImporter was not flexible enough for my use (in my class 
 called SysViewImporter). This solution was perfectly valid for Google SoC (a 
 lot of time constraints) but unacceptable in the long run for any project (we 
 hate large body of duplicate code :p).
 Also, some issues have been raised with this class (i.e. jcr:root 
 importation, allowsSameNameSiblings issue). 
 Overall I feel this class  is circumvoluted and really hard to understand. 
 For instance, the current code contains at most 5 imbricated if and uses a 
 lot of different ways to pass information (stacks, objects set on null). 
 I tried to refactor my SysViewImporter and the WorkspaceImporter but it 
 implies a new code for the WorkspaceImporter and the SysViewImporter. Here is 
 its skeleton! I first wanted to gather the community feedback before stepping 
 in. I tried moving all work code away from the startNode method and 
 reorganise it for readilibility.
 Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

2006-09-12 Thread Jukka Zitting (JIRA)
[ 
http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434205 ] 

Jukka Zitting commented on JCR-569:
---

Please be more specific in identifying the problems. The code being 
circumvoluted [sic] and really hard to understand is probably correct, but 
doesn't help us to make it better.

Identify the design structures that could be better organized, explain your 
approach to solving the issue, and propose a patch for doing that. See for 
example JCR-565 where I identify the design issue, discuss how to solve it, and 
provide a patch for doing that. What would the WorkspaceImporter look like 
after the addition of the class you propose? How does it make WorkspaceImporter 
better?

Note also that the Importer interface is supposed to be independent of the XML 
serialization format being used as both SysView and DocView ImportHandlers use 
the interface to save the imported data. Thus the name SysViewImporter is 
incorrect.

 WorkspaceImporter Refactoring
 -

 Key: JCR-569
 URL: http://issues.apache.org/jira/browse/JCR-569
 Project: Jackrabbit
  Issue Type: Improvement
Reporter: Nicolas Toper
 Attachments: SysViewImporter.patch


 Hi,
 As you know, I have run into an issue with the backup tool using the 
 WorkspaceImporter. I ended up copy/pasting large body of code since the 
 current WorkspaceImporter was not flexible enough for my use (in my class 
 called SysViewImporter). This solution was perfectly valid for Google SoC (a 
 lot of time constraints) but unacceptable in the long run for any project (we 
 hate large body of duplicate code :p).
 Also, some issues have been raised with this class (i.e. jcr:root 
 importation, allowsSameNameSiblings issue). 
 Overall I feel this class  is circumvoluted and really hard to understand. 
 For instance, the current code contains at most 5 imbricated if and uses a 
 lot of different ways to pass information (stacks, objects set on null). 
 I tried to refactor my SysViewImporter and the WorkspaceImporter but it 
 implies a new code for the WorkspaceImporter and the SysViewImporter. Here is 
 its skeleton! I first wanted to gather the community feedback before stepping 
 in. I tried moving all work code away from the startNode method and 
 reorganise it for readilibility.
 Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: [jira] Commented: (JCR-569) WorkspaceImporter Refactoring

2006-09-12 Thread Nicolas

Sure. Here it is:

Introduction
The current o.a.j.core.xml.WorspaceImporter class has one main
responsability: to import data provided by a ContentHandler to the
repository. It needs to check imported data to maintain consistancy and
coherence of the repository and in some cases rejects them.

The calling stack is:
- WorkspaceImporter implements Importer
- SysViewImportHandler
- SaxParser (and various classes associated)
- Other classes

The Importer interface is composed of 4 methods: start, end, startNode,
endNode. The current WorkspaceImporter contains only two protected classe

Currently the code is complex to grasp and have some issues I would like to
solve with this iteration of the backup tool (I am using this class
heavily).

Issues I am trying to solve:
- Methods are too long: there are 7 methods in this class for 619 lines. In
average one method per 88 lines. This goes with the 4 levels of if/else.
This code is hard to debug.

- Complexity: there is no clear separation of concerns between methods. This
is partly due to the Importer interface quite close to the XML. For
instance, startNode() checks the validity of the import ad endNode do. We
need to create more methods giving a role to each one.

- Factorizing/Code duplicate: Right now we check several times the same
thing and sometimes not at all. For instance we check the sanity of the
workspace at the end of each node and at the end of the import. We also
check several times if a node can be added without disrupting the
consistancy of the repository. We should have a private dedicated method for
that.

- Bugs: jcr:root is not handled correctly and other bugs have been detected.
Since the code is hardly readable,

-  Lack of flexibility: I will give just an example. Because of this
monolothic design (only 7 methods), the checks are embedded in each method
(cf. upper). I had to subclass the 4 methods of the Importer interface. Also
the WorkspaceImporter initiates versioning if needed and remap some UUID all
the time.

Solution
My proposal is in four parts:

- Add some private methods to abstract some of the complex issue this class
is managing

- Add more control in the constructor (for instance for our backup tool):
add a constructor with the BatchedItemOperation so we can choose if needed
the needed ItemStateManager. (For instance to restore the version history).

- Add a raw flag in the constructor to import the content as it is without
initializing versioning and escaping some checks. This is a use case already
encountered

- With this new design, solve remaining bugs and handle the jcr:root issue.

This new class will be subclassed by WorkspaceImporter to handle the
workspace (I am not using it for now) and for the sanityCheck(): therefore
it will be only a few lines and we will have good SoC.

Please have a look as the patch proposed earlier to have a look at the
design and the new code (it is far from over and a lot of methods are
incomplete).

BR
Nicolas


Re: [jira] Commented: (JCR-569) WorkspaceImporter Refactoring

2006-09-12 Thread Jukka Zitting

Hi,

On 9/12/06, Nicolas [EMAIL PROTECTED] wrote:

Sure. Here it is:


Cool, thanks.


Issues I am trying to solve:


I pretty much agree with your analysis of the issues, as I've faced
similar problems working on JCR-325. I labeled the issues for brevity:

  A: Methods are too long
  B: Complexity
  C: Factorizing/Code duplicate
  D: Bugs
  E: Lack of flexibility


Bugs: jcr:root is not handled correctly and other bugs have been detected.


For reference: the jcr:root issue and the preferred fix is discussed
in JCR-535. Please file and refer to separate bug reports for other
specific issues. It's better to keep the refactoring issue separate
from functional changes.


Solution
My proposal is in four parts:


In my opinion it would be best to split the solution to separate
issues to make it easier to review the changes and to understand the
rationale for each change.


- Add some private methods to abstract some of the complex issue this class
is managing


Do you mean the isAddable, checkNode, createProperties, createNode,
isSkipped, isCorrect, and postProcess methods? Are they supposed to be
private or protected, i.e. can a subclass use them directly? Are they
final, i.e. can a subclass override them?

I can see how this would help with goals A, B, and C. Could you make a
patch that changes the WorkspaceImporter class accordingly so we can
see these methods in effect. The patch should improve A, B, and C and
still pass all unit tests for me to apply it.


- Add more control in the constructor (for instance for our backup tool):
add a constructor with the BatchedItemOperation so we can choose if needed
the needed ItemStateManager. (For instance to restore the version history).

- Add a raw flag in the constructor to import the content as it is without
initializing versioning and escaping some checks. This is a use case already
encountered


Both of these parts are related to goal E. They are extra features and
pretty much independent of structural refactoring. I'd propose these
changes as a a separate issue and a separate patch.


- With this new design, solve remaining bugs and handle the jcr:root issue.


This would be goal D. Again, I would propose a fix for the jcr:root
issue in JCR-535 and in separate reports for any other issues.


Please have a look as the patch proposed earlier to have a look at the
design and the new code (it is far from over and a lot of methods are
incomplete).


I like your approach, but as you mention it still needs some work. I'd
be happy to review an updated version.

I'm a bit concerned that the flow in the proposed class, especially in
the startNode() method isn't directly based on the flow within the
existing WorkspaceImporter. Are you certain that all special cases are
handled as before? Could we add some unit tests to verify that all
those special cases remain operational? I don't think the current
WorkspaceImporter class has full test coverage.

BR,

Jukka Zitting

--
Yukatan - http://yukatan.fi/ - [EMAIL PROTECTED]
Software craftsmanship, JCR consulting, and Java development