Hey Paul,

In my data-mining $dayjob I do a fair amount of annotation of text with attributes, similar to what you're talking about. People in this field tend to call it "stand-off annotation", meaning the annotations are stored out-of-band, separately from the text they describe, as opposed to "in-line markup" like vanilla XML.
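To make the distinction concrete, here is a minimal sketch of stand-off annotation. It is in Python purely for illustration; the `Annotation` class, its fields, and the two helper functions are hypothetical names, not the API of UIMA, GATE, or your module. The point is only that the text stays untouched and each annotation refers to it by character offsets, so ranges can overlap without having to nest the way in-line tags would.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One stand-off annotation (hypothetical structure for illustration)."""
    start: int  # offset of the first character covered
    end: int    # offset one past the last character covered
    kind: str   # attribute name, e.g. "bold"


# The text itself carries no markup at all.
text = "The quick brown fox"

# Annotations live beside the text and point into it by offset.
# These two ranges overlap without nesting.
annotations = [
    Annotation(4, 15, "bold"),     # covers "quick brown"
    Annotation(10, 19, "italic"),  # covers "brown fox"
]


def covered(text, ann):
    """Return the substring an annotation refers to."""
    return text[ann.start:ann.end]


def kinds_at(annotations, pos):
    """Return the kinds of all annotations covering character offset pos."""
    return [a.kind for a in annotations if a.start <= pos < a.end]


for ann in annotations:
    print(ann.kind, repr(covered(text, ann)))
```

Asking `kinds_at(annotations, 12)` returns both kinds, since offset 12 falls inside both ranges; how to resolve conflicts between overlapping annotations is exactly where precedence rules come in.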
The precedence rules and types of annotation you've defined seem a bit arbitrary though - you might want to have a look at UIMA and GATE, both of which define structures like yours but with a few differences:

http://incubator.apache.org/uima/downloads/releaseDocs/2.2.2-incubating/docs/html/overview_and_setup/overview_and_setup.html#ugr.ovv.conceptual.representing_results_in_cas
http://www.gate.ac.uk/releases/gate-4.0-build2752-ALL/doc/javadoc/gate/TextualDocument.html

-Ken

On Fri, Jan 30, 2009 at 4:25 PM, Paul LeoNerd Evans <leon...@leonerd.org.uk> wrote:
> On Fri, Jan 30, 2009 at 02:08:24PM -0800, Bill Ward wrote:
>> Or String::Substrate? The meaning of "substrate" doesn't really fit
>> here but it's so close to SubStrAttr that I bet you could get away
>> with it, with a suitable comment explaining the name :)
>
> I can't help thinking we're getting a bit side-tracked by the name here.
>
> There's a lot of interesting API in the code, I feel the name is
> somewhat overshadowing any other discussion on the API design or other
> details...
>
> --
> Paul "LeoNerd" Evans
>
> leon...@leonerd.org.uk | CPAN ID: PEVANS
> srand($,=" ");print sort{rand>0.5}grep{0.8>rand}qw(another Just hacker of
> Perl)