Unless someone with experience in biojava development wants to take this on, I volunteer to do it. I ended up using the PhyloXML forester-atv parser (and moving to phyloxml instead of nexus), but since I reported this, I might as well sort it out...
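To make the proposal in the quoted thread below concrete, here is a rough sketch of how the configurable prefix, the validity check, and the pre-parse for clash avoidance could fit together. It is not the existing biojava parser code: the class NodeNamer and the methods collectNames() and nextName() are hypothetical names made up for illustration, and the label scan is deliberately crude (it ignores quoted Newick labels and also collects numeric tokens such as branch lengths, which only makes it skip a few more names than strictly necessary). Only setDefaultPrefix() and getDefaultPrefix() come from the thread.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: hands out internal node names that avoid clashes. */
final class NodeNamer {
    // Crude label matcher (assumption): runs of letters, digits and underscores.
    private static final Pattern LABEL = Pattern.compile("[A-Za-z0-9_]+");

    private String defaultPrefix = "p";   // same default as the hardcoded value
    private final Set<String> usedNames = new HashSet<>();
    private int counter = 0;

    String getDefaultPrefix() {
        return defaultPrefix;
    }

    void setDefaultPrefix(String prefix) {
        // Accept only alphanumeric prefixes: no spaces, no Newick-sensitive characters.
        if (prefix == null || !prefix.matches("[A-Za-z0-9]+")) {
            throw new IllegalArgumentException("Invalid prefix: " + prefix);
        }
        this.defaultPrefix = prefix;
    }

    /** Pre-parse step: record every label-like token already present in the tree string. */
    void collectNames(String newick) {
        Matcher m = LABEL.matcher(newick);
        while (m.find()) {
            usedNames.add(m.group());
        }
    }

    /** Returns the next prefix + number name that is not already in use. */
    String nextName() {
        String candidate;
        do {
            counter++;
            candidate = defaultPrefix + counter;
        } while (!usedNames.add(candidate)); // add() returns false if the name is taken
        return candidate;
    }
}
```

With the tree string "((p2:0.1,B:0.2):0.3,C:0.4);" passed to collectNames() first, three calls to nextName() give p1, p3, p4, which matches the example in the thread. Doing the scan before any name is assigned is what covers the case where a clashing label only appears later in the string.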

2009/11/4 Richard Holland <[email protected]>:
> ah... except a problem! The parser does not know all names in the string in
> advance, so if it auto-assigns one that is then used later in the string, we
> have the same problem with name clashes as before.
>
> The names the parser assigns cannot totally avoid all clashes unless it has
> already parsed the string to find out what names were used in the string
> itself already. So some kind of pre-parse would be necessary.
>
> On 4 Nov 2009, at 12:46, Richard Holland wrote:
>
>> Sounds good.
>>
>> On 4 Nov 2009, at 12:40, Tiago Antão wrote:
>>
>>> 2009/11/3 Richard Holland <[email protected]>:
>>>>
>>>> The prefix for the parser currently is hardcoded as p. Two new methods -
>>>> set and getDefaultPrefix which accept a string should be provided (it
>>>> should check that the string is valid, i.e. all alphanumeric and with no
>>>> spaces or other Newick-sensitive characters). The parser should be
>>>> changed to use the output from getDefaultPrefix() instead of the
>>>> hardcoded p. The default behaviour should be such that it behaves the
>>>> same as at present unless the user explicitly says otherwise by calling
>>>> the setDefaultPrefix() method.
>>>
>>> This default behavior would still raise an exception with nodes called
>>> p* . I would suggest a minor change: If there is a clash, the parser
>>> would try the next p* (or whatever defaultPrefix) ...
>>>
>>> Example to make it clear: if there is a leaf called p2, internal nodes
>>> generated would be p1, p3, p4, ....
>>
>> --
>> Richard Holland, BSc MBCS
>> Operations and Delivery Director, Eagle Genomics Ltd
>> T: +44 (0)1223 654481 ext 3 | E: [email protected]
>> http://www.eaglegenomics.com/

--
"The hottest places in hell are reserved for those who, in times of
moral crisis, maintain a neutrality." - Dante
_______________________________________________
Biojava-l mailing list - [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l
