[jira] Commented: (JCR-569) WorkspaceImporter Refactoring
[ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12454691 ] Stefan Guggisberg commented on JCR-569: --- Jukka Zitting commented on JCR-569: --- Committed some small-scale improvements in revision 477142. Looking at the class more closely, it seems like the key to simplifying the code flows would be to get rid of the aborted and succeeded marker flags, relying more on code structure to indicate the current state of the computation. See the patch attached to JCR-546 for one possible approach to removing (or reducing the visibility of) the succeeded flag. The aborted flag could possibly be replaced by using the State pattern like in the JCR-RMI Value implementation. +1 cheers stefan WorkspaceImporter Refactoring - Key: JCR-569 URL: http://issues.apache.org/jira/browse/JCR-569 Project: Jackrabbit Issue Type: Improvement Reporter: Nicolas Toper Assigned To: Jukka Zitting Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch Hi, As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p). Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). Overall I feel this class is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all work code away from the startNode method and reorganise it for readilibility. Please give me your feedback. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (JCR-569) WorkspaceImporter Refactoring
[ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12442223 ] Nicolas Toper commented on JCR-569: --- [[ Old comment, sent by email on Wed, 13 Sep 2006 18:00:23 +0200 ]] Hi Stefan, About your first point, you are right: this is why I sent yesterday a more in depth analysis. This JIRA issue will be used as Jukka suggested it, for refactoring issue of this class only. I would like to create two classes: one generic Importer (name to be defined) and one dedicated to import a workspace. I would use the generic Importer for the backup and others could use it when they need to batch load some data (of course, it is more complex to use and set up). The WorkspaceImporter would be used for the same use case as now. The WorkspaceImporter constructor would stay the same to avoid backward compatibility issue. Here is a quick descriptive of each methods of the generic Importer. They are private since no other class need them (this class is used with the ContentHandler). private boolean checkNode(NodeInfo nodeInfo) Check if the node can be imported to the repository. For instance, it can try to delete a protected node, the same node might already exist (it can depend on the ImportUUIDBehavior used) or the node can be incorrect according to the node type in the repository. Some checks are escaped if raw mode is chosen. If the node is not correct, it puts the skip mode on true to exclude this subtree. When the subtree is over, the skip mode is put to off. If the error is serious (constraint issue for instance), an exception will be thrown. private NodeState createNode(NodeState parent, NodeInfo nodeInfo) Create the NodeState depending according to ImportUUIDBehavior flag private void createProperties(NodeState myNode, List propInfos) Create the properties depending according to ImportUUIDBehavior flag private boolean isSkipped(NodeInfo nodeInfo) if we are in skip mode return true; else false. private void postProcess(NodeState node) Initialize if raw == false, this method is not called. It will be empty for the generic Importer (we don't need to initialize any thing) but WorkspaceImporter will overload it. protected NodeState resolveUUIDConflict Resolve UUID conflict :p private boolean isAddabble(NodeState parent, NodeInfo nodeInfo) throws ConstraintViolationException, AccessDeniedException, VersionException, LockException, ItemNotFoundException, ItemExistsException, RepositoryException This method is not needed. private boolean isCorrect(NodeState parent, NodeInfo nodeInfo, List propInfos) Not needed. If you are OK with this approach, I will work on this and propose you soon a patch implementing this refactoring. BR, Nico my blog! http://www.deviant-abstraction.net !! WorkspaceImporter Refactoring - Key: JCR-569 URL: http://issues.apache.org/jira/browse/JCR-569 Project: Jackrabbit Issue Type: Improvement Reporter: Nicolas Toper Assigned To: Jukka Zitting Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch Hi, As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p). Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). Overall I feel this class is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all work code away from the startNode method and reorganise it for readilibility. Please give me your feedback. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (JCR-569) WorkspaceImporter Refactoring
[ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12435341 ] Jukka Zitting commented on JCR-569: --- XML import functionality is tested as a part of the JCR test suite that is automatically invoked when you run maven jar to build Jackrabbit. Currently the patch causes 14 unit test errors due to NullPointerExceptions. WorkspaceImporter Refactoring - Key: JCR-569 URL: http://issues.apache.org/jira/browse/JCR-569 Project: Jackrabbit Issue Type: Improvement Reporter: Nicolas Toper Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch Hi, As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p). Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). Overall I feel this class is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all work code away from the startNode method and reorganise it for readilibility. Please give me your feedback. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (JCR-569) WorkspaceImporter Refactoring
[ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12435342 ] Jukka Zitting commented on JCR-569: --- Two other issues about the patch: 1) The modified class doesn't implement the Importer interface 2) There's a reference to GenericImporter that causes a compile error WorkspaceImporter Refactoring - Key: JCR-569 URL: http://issues.apache.org/jira/browse/JCR-569 Project: Jackrabbit Issue Type: Improvement Reporter: Nicolas Toper Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch Hi, As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p). Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). Overall I feel this class is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all work code away from the startNode method and reorganise it for readilibility. Please give me your feedback. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (JCR-569) WorkspaceImporter Refactoring
[ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12435345 ] Nicolas Toper commented on JCR-569: --- Just to be sure everything is all right: I have checked with my machine. Only 2 tests are run when calling o.a.j.test.core.xml. Are they the only ones run? Is the JCR test suite what is in /jackrabbit/src/test? Nico WorkspaceImporter Refactoring - Key: JCR-569 URL: http://issues.apache.org/jira/browse/JCR-569 Project: Jackrabbit Issue Type: Improvement Reporter: Nicolas Toper Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch Hi, As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p). Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). Overall I feel this class is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all work code away from the startNode method and reorganise it for readilibility. Please give me your feedback. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (JCR-569) WorkspaceImporter Refactoring
[ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12435349 ] Jukka Zitting commented on JCR-569: --- The patch still fails. The failing tests are in org.apache.jackrabbit.test.api. I'm getting: [junit] Tests run: 584, Failures: 0, Errors: 14, Time elapsed: 17,734 sec [junit] [ERROR] TEST org.apache.jackrabbit.test.api.TestAll FAILED You can run the unit tests individually, but I'd really recommend you to use maven jar (or even maven clean jar) at least as the final verification step before submitting patches. Detailed test reports get saved in target/test-reports. WorkspaceImporter Refactoring - Key: JCR-569 URL: http://issues.apache.org/jira/browse/JCR-569 Project: Jackrabbit Issue Type: Improvement Reporter: Nicolas Toper Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch Hi, As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p). Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). Overall I feel this class is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all work code away from the startNode method and reorganise it for readilibility. Please give me your feedback. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (JCR-569) WorkspaceImporter Refactoring
[ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434386 ] Stefan Guggisberg commented on JCR-569: --- Overall I feel this class is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). sorry, i can't follow you here. what does 'circumvoluted' mean? what does 'imbricated' mean? please note that if something is 'hard to understand' does not necessarily mean that it is too complex. I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all work code away from the startNode method and reorganise it for readilibility. what's the use of the private 'isAddabble'(?), checkNode etc methods? why are they private? WorkspaceImporter Refactoring - Key: JCR-569 URL: http://issues.apache.org/jira/browse/JCR-569 Project: Jackrabbit Issue Type: Improvement Reporter: Nicolas Toper Attachments: SysViewImporter.patch Hi, As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p). Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). Overall I feel this class is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all work code away from the startNode method and reorganise it for readilibility. Please give me your feedback. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (JCR-569) WorkspaceImporter Refactoring
[ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434496 ] Nicolas Toper commented on JCR-569: --- Sure. Here it is: Introduction The current o.a.j.core.xml.WorspaceImporter class has one main responsability: to import data provided by a ContentHandler to the repository. It needs to check imported data to maintain consistancy and coherence of the repository and in some cases rejects them. The calling stack is: - WorkspaceImporter implements Importer - SysViewImportHandler - SaxParser (and various classes associated) - Other classes The Importer interface is composed of 4 methods: start, end, startNode, endNode. The current WorkspaceImporter contains only two protected classe Currently the code is complex to grasp and have some issues I would like to solve with this iteration of the backup tool (I am using this class heavily). Issues I am trying to solve: - Methods are too long: there are 7 methods in this class for 619 lines. In average one method per 88 lines. This goes with the 4 levels of if/else. This code is hard to debug. - Complexity: there is no clear separation of concerns between methods. This is partly due to the Importer interface quite close to the XML. For instance, startNode() checks the validity of the import ad endNode do. We need to create more methods giving a role to each one. - Factorizing/Code duplicate: Right now we check several times the same thing and sometimes not at all. For instance we check the sanity of the workspace at the end of each node and at the end of the import. We also check several times if a node can be added without disrupting the consistancy of the repository. We should have a private dedicated method for that. - Bugs: jcr:root is not handled correctly and other bugs have been detected. Since the code is hardly readable, - Lack of flexibility: I will give just an example. Because of this monolothic design (only 7 methods), the checks are embedded in each method (cf. upper). I had to subclass the 4 methods of the Importer interface. Also the WorkspaceImporter initiates versioning if needed and remap some UUID all the time. Solution My proposal is in four parts: - Add some private methods to abstract some of the complex issue this class is managing - Add more control in the constructor (for instance for our backup tool): add a constructor with the BatchedItemOperation so we can choose if needed the needed ItemStateManager. (For instance to restore the version history). - Add a raw flag in the constructor to import the content as it is without initializing versioning and escaping some checks. This is a use case already encountered - With this new design, solve remaining bugs and handle the jcr:root issue. This new class will be subclassed by WorkspaceImporter to handle the workspace (I am not using it for now) and for the sanityCheck(): therefore it will be only a few lines and we will have good SoC. Please have a look as the patch proposed earlier to have a look at the design and the new code (it is far from over and a lot of methods are incomplete). BR Nicolas WorkspaceImporter Refactoring - Key: JCR-569 URL: http://issues.apache.org/jira/browse/JCR-569 Project: Jackrabbit Issue Type: Improvement Reporter: Nicolas Toper Attachments: SysViewImporter.patch Hi, As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p). Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). Overall I feel this class is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all work code away from the startNode method and reorganise it for readilibility. Please give me your feedback. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (JCR-569) WorkspaceImporter Refactoring
[ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434497 ] Nicolas Toper commented on JCR-569: --- Hi Stefan, About your first point, you are right: this is why I sent yesterday a more in depth analysis. This JIRA issue will be used as Jukka suggested it, for refactoring issue of this class only. I would like to create two classes: one generic Importer (name to be defined) and one dedicated to import a workspace. I would use the generic Importer for the backup and others could use it when they need to batch load some data (of course, it is more complex to use and set up). The WorkspaceImporter would be used for the same use case as now. The WorkspaceImporter constructor would stay the same to avoid backward compatibility issue. Here is a quick descriptive of each methods of the generic Importer. They are private since no other class need them (this class is used with the ContentHandler). private boolean checkNode(NodeInfo nodeInfo) Check if the node can be imported to the repository. For instance, it can try to delete a protected node, the same node might already exist (it can depend on the ImportUUIDBehavior used) or the node can be incorrect according to the node type in the repository. Some checks are escaped if raw mode is chosen. If the node is not correct, it puts the skip mode on true to exclude this subtree. When the subtree is over, the skip mode is put to off. If the error is serious (constraint issue for instance), an exception will be thrown. private NodeState createNode(NodeState parent, NodeInfo nodeInfo) Create the NodeState depending according to ImportUUIDBehavior flag private void createProperties(NodeState myNode, List propInfos) Create the properties depending according to ImportUUIDBehavior flag private boolean isSkipped(NodeInfo nodeInfo) if we are in skip mode return true; else false. private void postProcess(NodeState node) Initialize if raw == false, this method is not called. It will be empty for the generic Importer (we don't need to initialize any thing) but WorkspaceImporter will overload it. protected NodeState resolveUUIDConflict Resolve UUID conflict :p private boolean isAddabble(NodeState parent, NodeInfo nodeInfo) throws ConstraintViolationException, AccessDeniedException, VersionException, LockException, ItemNotFoundException, ItemExistsException, RepositoryException This method is not needed. private boolean isCorrect(NodeState parent, NodeInfo nodeInfo, List propInfos) Not needed. If you are OK with this approach, I will work on this and propose you soon a patch implementing this refactoring. BR, Nico my blog! http://www.deviant-abstraction.net !! WorkspaceImporter Refactoring - Key: JCR-569 URL: http://issues.apache.org/jira/browse/JCR-569 Project: Jackrabbit Issue Type: Improvement Reporter: Nicolas Toper Attachments: SysViewImporter.patch Hi, As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p). Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). Overall I feel this class is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all work code away from the startNode method and reorganise it for readilibility. Please give me your feedback. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (JCR-569) WorkspaceImporter Refactoring
[ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434503 ] Jukka Zitting commented on JCR-569: --- Nicolas: I would like to create two classes: one generic Importer (name to be defined) and one dedicated to import a workspace. This would still be a part of the flexibility goal (E in my message to the mailing list). To keep things simple I'd suggest that you limit this issue to a patch that refactors the methods within WorkspaceImporter without introducing another class. Note also that the responsibility of WorkspaceImporter is not to import a workspace but to import content *into* a workspace. [The methods of the generic Importer] are private since no other class need them (this class is used with the ContentHandler). How would a subclass modify the behaviour of the class if those methods are private? If you are OK with this approach, I will work on this and propose you soon a patch implementing this refactoring. It looks OK, but I think the main issue is seeing how the existing code gets mapped to the proposed structure. A patch would be the perfect way to show this. :-) WorkspaceImporter Refactoring - Key: JCR-569 URL: http://issues.apache.org/jira/browse/JCR-569 Project: Jackrabbit Issue Type: Improvement Reporter: Nicolas Toper Attachments: SysViewImporter.patch Hi, As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p). Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). Overall I feel this class is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all work code away from the startNode method and reorganise it for readilibility. Please give me your feedback. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (JCR-569) WorkspaceImporter Refactoring
[ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434205 ] Jukka Zitting commented on JCR-569: --- Please be more specific in identifying the problems. The code being circumvoluted [sic] and really hard to understand is probably correct, but doesn't help us to make it better. Identify the design structures that could be better organized, explain your approach to solving the issue, and propose a patch for doing that. See for example JCR-565 where I identify the design issue, discuss how to solve it, and provide a patch for doing that. What would the WorkspaceImporter look like after the addition of the class you propose? How does it make WorkspaceImporter better? Note also that the Importer interface is supposed to be independent of the XML serialization format being used as both SysView and DocView ImportHandlers use the interface to save the imported data. Thus the name SysViewImporter is incorrect. WorkspaceImporter Refactoring - Key: JCR-569 URL: http://issues.apache.org/jira/browse/JCR-569 Project: Jackrabbit Issue Type: Improvement Reporter: Nicolas Toper Attachments: SysViewImporter.patch Hi, As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p). Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). Overall I feel this class is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all work code away from the startNode method and reorganise it for readilibility. Please give me your feedback. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [jira] Commented: (JCR-569) WorkspaceImporter Refactoring
Sure. Here it is: Introduction The current o.a.j.core.xml.WorspaceImporter class has one main responsability: to import data provided by a ContentHandler to the repository. It needs to check imported data to maintain consistancy and coherence of the repository and in some cases rejects them. The calling stack is: - WorkspaceImporter implements Importer - SysViewImportHandler - SaxParser (and various classes associated) - Other classes The Importer interface is composed of 4 methods: start, end, startNode, endNode. The current WorkspaceImporter contains only two protected classe Currently the code is complex to grasp and have some issues I would like to solve with this iteration of the backup tool (I am using this class heavily). Issues I am trying to solve: - Methods are too long: there are 7 methods in this class for 619 lines. In average one method per 88 lines. This goes with the 4 levels of if/else. This code is hard to debug. - Complexity: there is no clear separation of concerns between methods. This is partly due to the Importer interface quite close to the XML. For instance, startNode() checks the validity of the import ad endNode do. We need to create more methods giving a role to each one. - Factorizing/Code duplicate: Right now we check several times the same thing and sometimes not at all. For instance we check the sanity of the workspace at the end of each node and at the end of the import. We also check several times if a node can be added without disrupting the consistancy of the repository. We should have a private dedicated method for that. - Bugs: jcr:root is not handled correctly and other bugs have been detected. Since the code is hardly readable, - Lack of flexibility: I will give just an example. Because of this monolothic design (only 7 methods), the checks are embedded in each method (cf. upper). I had to subclass the 4 methods of the Importer interface. Also the WorkspaceImporter initiates versioning if needed and remap some UUID all the time. Solution My proposal is in four parts: - Add some private methods to abstract some of the complex issue this class is managing - Add more control in the constructor (for instance for our backup tool): add a constructor with the BatchedItemOperation so we can choose if needed the needed ItemStateManager. (For instance to restore the version history). - Add a raw flag in the constructor to import the content as it is without initializing versioning and escaping some checks. This is a use case already encountered - With this new design, solve remaining bugs and handle the jcr:root issue. This new class will be subclassed by WorkspaceImporter to handle the workspace (I am not using it for now) and for the sanityCheck(): therefore it will be only a few lines and we will have good SoC. Please have a look as the patch proposed earlier to have a look at the design and the new code (it is far from over and a lot of methods are incomplete). BR Nicolas
Re: [jira] Commented: (JCR-569) WorkspaceImporter Refactoring
Hi, On 9/12/06, Nicolas [EMAIL PROTECTED] wrote: Sure. Here it is: Cool, thanks. Issues I am trying to solve: I pretty much agree with your analysis of the issues, as I've faced similar problems working on JCR-325. I labeled the issues for brevity: A: Methods are too long B: Complexity C: Factorizing/Code duplicate D: Bugs E: Lack of flexibility Bugs: jcr:root is not handled correctly and other bugs have been detected. For reference: the jcr:root issue and the preferred fix is discussed in JCR-535. Please file and refer to separate bug reports for other specific issues. It's better to keep the refactoring issue separate from functional changes. Solution My proposal is in four parts: In my opinion it would be best to split the solution to separate issues to make it easier to review the changes and to understand the rationale for each change. - Add some private methods to abstract some of the complex issue this class is managing Do you mean the isAddable, checkNode, createProperties, createNode, isSkipped, isCorrect, and postProcess methods? Are they supposed to be private or protected, i.e. can a subclass use them directly? Are they final, i.e. can a subclass override them? I can see how this would help with goals A, B, and C. Could you make a patch that changes the WorkspaceImporter class accordingly so we can see these methods in effect. The patch should improve A, B, and C and still pass all unit tests for me to apply it. - Add more control in the constructor (for instance for our backup tool): add a constructor with the BatchedItemOperation so we can choose if needed the needed ItemStateManager. (For instance to restore the version history). - Add a raw flag in the constructor to import the content as it is without initializing versioning and escaping some checks. This is a use case already encountered Both of these parts are related to goal E. They are extra features and pretty much independent of structural refactoring. I'd propose these changes as a a separate issue and a separate patch. - With this new design, solve remaining bugs and handle the jcr:root issue. This would be goal D. Again, I would propose a fix for the jcr:root issue in JCR-535 and in separate reports for any other issues. Please have a look as the patch proposed earlier to have a look at the design and the new code (it is far from over and a lot of methods are incomplete). I like your approach, but as you mention it still needs some work. I'd be happy to review an updated version. I'm a bit concerned that the flow in the proposed class, especially in the startNode() method isn't directly based on the flow within the existing WorkspaceImporter. Are you certain that all special cases are handled as before? Could we add some unit tests to verify that all those special cases remain operational? I don't think the current WorkspaceImporter class has full test coverage. BR, Jukka Zitting -- Yukatan - http://yukatan.fi/ - [EMAIL PROTECTED] Software craftsmanship, JCR consulting, and Java development