Comments inline
On 1/28/13 3:39 PM, "Andy Seaborne" <[email protected]> wrote: >Rob, > >There are YarcData copyright statements in some classes and no ASF >License. It really does have to be the copyright holder that removes >these. Could you fix this please? Sorry I have done this yet again :-(( > >- - - - - - > >You're right this is an area that can be improved - I'd be interested in >hearing your experiences here as, I guess, this is an issue that has >arisen for you in some usage. Yes the existing implementation is fine for current usages which are primarily for parsing and so only really use the expand() method heavily. However in some work I was doing I found that when you had a lot of namespaces defined and were using it for output (I.e. calling abbreviate() heavily) performance started to suffer. This is because for every URI you want to abbreviate you potentially have to consider whether it matches every one of the declared namespaces. In the worst case this makes it an O(n) operation. So my obvious choice for improving the situation for output was to use a Trie since it's an ideal structure for prefix lookup, but as PrefixMap was a class and not an interface I had to extract the interface in order to make the new FastPrefixMap implementation interoperable. > >There seems to be a mix up over PrefixMapping (old, jena-core, in the >comments for LightweightPrefixMap) and PrefixMap (RIOT). Should be slightly clarified now > >How does this all fit together? I hope we can discuss the technical >changes - do you prefer email or JIRA? FastPrefixMap is an alternative implementation of the newly extracted LightweightPrefixMap interface I.e. it is a drop in replacement for the existing prefixMap. It's intended for scenarios which are more output heavy (lots of use of abbreviate()) since an abbreviation call will be close to a O(1) operation even in the worst case since traversing a trie is a linear time operation. By contrast the existing PrefixMap is O(n) worst case since it potentially has to consider every possible namespace. > >I'll write more when I have looked more proper response but >LightweightPrefixMap is being used in parser and output uses. Maybe >there are two interfaces really. PrefixMap was completely reformatted - >I'm having difficulty seeing what the real changes are but I don't see >why it does not have a chance or Trie added. Main changes are, implements the extracted interface. Some of the static helper methods were extracted into an abstract base class (LightweightPrefixMapBase) and made protected instance methods instead since they weren't being used anywhere elsewhere in the codebase. > >Also: > >Any tests for Trie? Not yet, will add soon >Is there any effect on parser speed? Shouldn't be, the existing PrefixMap implementation is still used throughout the codebase, the code just refers to the interface rather than the implementation now. Using FastPrefixMap would involve a marginally higher memory overhead and a small performance drop (extra time to populate the Trie as namespaces are added) but likely so little difference as to be negligible. It may be better renamed as OutputPrefixMap since that better describes its intended usage? Rob > > Andy > >
