BTW, I am already beta testing Reto's latest builds by using nothing else and can sey, they work perfectly so far. Not a glitch! Very well done Reto.
Regards, Andreas On 23/03/2014, at 16:52 , Christiaan Hofman wrote: On Mar 23, 2014, at 15:32, Adam R. Maxwell wrote: On Mar 23, 2014, at 6:35, Christiaan Hofman <cmhof...@gmail.com<mailto:cmhof...@gmail.com>> wrote: Let me add a few remarks on this. And, btw, also from me thanks for doing this. I never much looked at these classes because I I never really understood much of what was going on, and the code is not very clean, to use an understatement. On Mar 23, 2014, at 12:19, Reto Stöckli wrote: Adam A few answers from my side, well knowing that I might be wrong: 1) I am frankly pissed that CitedReferences/CitingArticles/etc is not documented in the code or the online help, if the comment is still correct. This accounts for a huge amount of boilerplate-looking code, which in my experience is a disaster for testing after rewriting/refactoring. I suggest removing them in your rewrite. Since these parts do not seem to be user-accessible anyway, I agree. I however leave the final decision to you guys. That seems reasonable for me too, also cleans up the code a lot. 2) BDSKISISourceXMLTagPriorityKey is apparently an array of fields. No idea what the hell it does, as it's not documented in the code or on the wiki. I suggest removing it in your rewrite. Same as above. idem 3) Style: is testing isKindOfClass: really the only way to distinguish the body parts? It's generally frowned on in Obj-C. Can you make an example to me as a non Obj-C programmer so that I can test the correct way? I don't think this is possible. This is a consequence of the rather bad generated code, there should be a lot of subclassing going on which there isn't. If I look at where this stuff is added (I think that should be WokSearchService.m around line 6577) these objects are just added to an array, and nothing is added to identify what was added, and the objects themselves have no common property to get to figure out what type they are, so checking the class is the only way you can know. Okay, that's what I suspected as well. It is what it is, in that case. 4) Style: as far as I know, the remainder of BibDesk does not use the Obj-C dot syntax for accessors, but it looks like that was used in the previous iteration of this class. Personally, I detest it and would prefer square brackets, but this isn't your fault. That I can change very easily, use brackets instead of dot-syntax. I have, as you say, followed the syntax of the original code since I learned it this way. Not a big deal, especially if Christiaan doesn't mind. I initially thought you were dealing with C structs :). Not a big deal, but I also don't like it 5) Style: downloadWithSearchTerm:database:options: is >300 lines, and createPublicationInfoWithRecord() is >460 lines. This is really unfortunate, as they are both unmaintainably long. I agree. With 1 this should help a lot. 6) There's a hardcoded date of "2020-01-01" in there. This needs to be +[NSDate distantFuture] or +[NSDate date]. 2020 is only 6 years away, and I've been contributing code to BibDesk for the past 12 years :-). I have not checked that part of the code, but I can change that. Thanks for the hint. It seems to me the dates aren't really needed. In the search option it is not used at all, so if we drop the rest this becomes mute. The chief points are really the last two. Can Xpath queries be used to avoid walking the tree(s) so many times? The multiple nested while() loops make me really nervous from a readability standpoint. Ok, here's the answer from a novice Obj-C programmer: I've tried the Xpath syntax since it was also used in the original code. It failed for WOKSearch V3 since sub-tree identifiers are not unique wihtin the whole tree (e.g. the identifier "names" appears several times in the tree, naming several different sub-tree key's, such as author names, publication names, title names etc.). Searching for XPath:name resulted in a wild match of totally different things. I've avoided this by walking through individual sub-trees, one by one. Now my question: can the XPath query be used to match these subtrees uniquely ("XPath:names" matches both)? a) summary->names b) summary->publishers->publisher->names If it is possible, I'll replace the complex walk through sub-trees with single XPath statements. After a quick look to the loops, it seems to me that should be possible, and not so difficult. It should not matter that "names" occur at different places, you should just look at them in the correct place by building a more specific path. So don' look for all "names" nodes with something like .//names (double slash) or "XPath:names", but more specific paths with single slashes and parents included. For instance, use ./static_data/summary from the record to get the summary node, and then use things like ./pub_info, ./titles, etc applied to that node to get its children, and work through its children analogously. For instance, this could be part of the code to get the Author field (typed in email, not tested): NSXMLNode *summaryNode = [[record nodesForXPath:@"./static_data/summary" error:NULL] lastObject]; ... NSArray *names = [summaryNode nodesForXPath:@"./names/name/full_name" error:NULL]; NSMutableString *nameString = nil; for (child in names) { if (name == nil) nameString = [NSMutableString string]; else [nameString appendString:@" and "]; [nameString appendString:[child stringValue]]; } Also, for the rest the following Xpaths should immediately give you arrays of nodes you can use to build the abstract, keywords, and the IDs: ./static_data/fullrecord_metadata/abstracts/abstract/abstract_text/p ./static_data/item/keywords_plus/keyword ./dynamic_data/cluster_related/identifiers/identifier Perhaps you have used the wrong XPaths. It should not matter whether some identifier occurs at different places in the tree, unless you search a whole tree like with .//names (with double slash), which you probably should never do. Yes, what he said! Also, I'll add that Xpath is more tricky than it seems, especially if you don't have deep knowledge of it. If you have a sample of the XML returned for v3, please post that, and we can all look at it. I have WoS at work, but no time to try your code this week, unfortunately. Adam Though I don't think you need the tricky parts of Xpath, because it seems all to be simple descending according to a set path without the need for conditionals and such. AFAICS these are all the Xpaths that are needed (at least for the uncommented part) (the indentation is associated with the hierarchy of sub nodes): ./UID ./dynamic_data/cluster_related/identifiers/identifier ./static_data ./fullrecord_metadata/abstracts/abstract/abstract_text/p ./item/keywords_plus/keyword ./summary ./doctypes/doctype ./pub_info ./page ./titles/title ./names/name/full_name ./publishers/publisher ./address_spec ./names/name/full_name Christiaan ------------------------------------------------------------------------------ Learn Graph Databases - Download FREE O'Reilly Book "Graph Databases" is the definitive new guide to graph databases and their applications. Written by three acclaimed leaders in the field, this first edition is now available. Download your free book today! http://p.sf.net/sfu/13534_NeoTech_______________________________________________ Bibdesk-develop mailing list Bibdesk-develop@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bibdesk-develop ------------------------------------------------------------------------------ Learn Graph Databases - Download FREE O'Reilly Book "Graph Databases" is the definitive new guide to graph databases and their applications. Written by three acclaimed leaders in the field, this first edition is now available. Download your free book today! http://p.sf.net/sfu/13534_NeoTech _______________________________________________ Bibdesk-develop mailing list Bibdesk-develop@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bibdesk-develop