It should already be on CruiseControl.

Standards in bioinformatics are a pain - people write them to describe the format of the files their software outputs, and then the very same people produce files that break those standards without any additional documentation or explanation. (GenBank is one of the biggest offenders!) It makes it very hard to write parsers: if you stick to the official spec, there will always be files that don't parse yet that people insist are still valid, and without documented examples of these spec-breaking but supposedly valid files, it is impossible to write a parser that caters for them. :)

cheers,
Richard

On 11 Aug 2009, at 11:12, David Johnson wrote:

Hi Richard,

OK that's good to know... I suppose that's the problem with specifications - people don't always follow them!

But I get the impression that either some people think using interleave=yes/no is standard practice, or it is being generated by some other phylo software (e.g. maybe PAUP or some other tools).

I had a talk with my supervisor and he can't actually identify the specific programs that have been putting that in, but looking at a range of his Nexus files, quite a few seem to put in the yes/no bits, including some files he received from other researchers.

Are the modifications available in the latest automated build (on CruiseControl)?

Cheers,
-David

2009/8/11 Richard Holland <[email protected]>
I've found the problem - "interleave=yes" is not valid according to the official NEXUS format spec (Maddison et al., 1997), which the parser was written against.

Interleaved files are supposed to include only the word "interleave" - it takes no parameters. Non-interleaved files shouldn't mention it at all.

I've modified the parser to tolerate this, but I'd be interested to know where the invalid token came from - was it added manually, or by an existing piece of publicly available software?

The modification has been made in the trunk of the biojava-live subversion repository.
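
To illustrate the kind of tolerance described above, here is a minimal sketch in Python. It is not the BioJava code: the function name is_interleaved and the regular expression are hypothetical, and it assumes the FORMAT command is available as a single string. It accepts the bare "interleave" token from the spec, and also the non-standard "interleave=yes" and "interleave=no" forms.

```python
import re

# Hypothetical helper, not part of BioJava. Matches the INTERLEAVE token,
# optionally followed by a non-standard "=value" suffix.
_INTERLEAVE = re.compile(r"\binterleave\b(?:\s*=\s*(\w+))?", re.IGNORECASE)


def is_interleaved(format_command: str) -> bool:
    """Return True if a NEXUS FORMAT command marks the matrix as interleaved."""
    match = _INTERLEAVE.search(format_command)
    if match is None:
        # Per the spec, non-interleaved files do not mention the token at all.
        return False

    value = match.group(1)
    if value is None:
        # Spec-compliant form: the bare word "interleave" with no parameter.
        return True

    # Tolerated non-standard forms seen in the wild: interleave=yes / interleave=no.
    value = value.lower()
    if value == "yes":
        return True
    if value == "no":
        return False
    raise ValueError(f"Unrecognised interleave value: {value!r}")
```

For example, is_interleaved("FORMAT DATATYPE=DNA INTERLEAVE=yes;") and is_interleaved("FORMAT DATATYPE=DNA INTERLEAVE;") are both treated as interleaved, while any other value after "=" is rejected rather than guessed at.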

cheers,
Richard



--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: [email protected]
http://www.eaglegenomics.com/

_______________________________________________
Biojava-l mailing list  -  [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l
