I've found the problem - "interleave=yes" is not valid according to the official NEXUS format spec (Maddison et al., 1997), which the parser was written against.

Interleaved files are supposed to include only the word "interleave" - it takes no parameters. Non-interleaved files shouldn't mention it at all.

I've modified the parser to tolerate this, but I'd be interested to know where the invalid token came from - was it added manually, or by an existing piece of publicly available software?

The modification has been made in the trunk of the biojava-live subversion repository.
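
For anyone still on a release that predates this change (such as the BioJava 1.7 binary mentioned further down the thread), one possible workaround is to rewrite the Format line before handing the file to the parser. The following is only a minimal sketch under that assumption: InterleaveTokenNormalizer and normalize are hypothetical names, not part of BioJava, and the code uses nothing beyond java.util.regex. It turns "interleave=yes" into the bare "interleave" keyword and drops "interleave=no" altogether, matching the rule described above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical pre-processing helper; NOT part of BioJava. */
final class InterleaveTokenNormalizer {

    // Matches "interleave=yes" or "interleave=no" as a whole token, ignoring case.
    private static final Pattern TOKEN =
            Pattern.compile("(?i)\\binterleave\\s*=\\s*(yes|no)\\b");

    /**
     * Rewrites one Format line: "interleave=yes" becomes "interleave",
     * "interleave=no" is removed. Lines without such a token are returned
     * with only runs of spaces collapsed.
     */
    static String normalize(String formatLine) {
        Matcher m = TOKEN.matcher(formatLine);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            boolean yes = m.group(1).equalsIgnoreCase("yes");
            m.appendReplacement(out, yes ? "interleave" : "");
        }
        m.appendTail(out);
        // Removing a token can leave two adjacent spaces; collapse them.
        return out.toString().replaceAll(" {2,}", " ");
    }
}
```

The caller is expected to apply normalize only to the Format line of the Data or Characters block, so that matrix rows are never touched.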

cheers,
Richard

On 7 Aug 2009, at 11:10, David Johnson wrote:

Hi Richard,

Actually the original exception was thrown in a different file that my
supervisor tried uploading to a Web app I'm developing that uses the
BioJava Nexus parser, but I can't get hold of that particular file
today. So the one I provided the link for was just another example of
an interleaved Nexus file I Googled for when I got your first email
this morning, as I figured they'd probably be the same formatting. But
I remember it's definitely the same exception in both cases.

I had a quick look in the example I provided today, and the
interleave=yes token is definitely in the header of the data block,
and is also definitely in the Format line.

Oh, just FYI, I'm using the BioJava 1.7 binary distribution
(http://www.biojava.org/download/bj17/bin/biojava.jar).

Cheers,
-David

2009/8/7 Richard Holland <[email protected]>:
Thanks David. One more quick question - is this the exact file that is
throwing the exception? I haven't tested it yet - but if I could test
against the real file that is throwing the problem, that would help me find
out exactly what's going wrong.

For what it's worth, the exception is normally thrown when more than one
interleave=yes/no token is found in the header of the Data or Characters
block, or when the interleave token appears in a line other than the Format
line of the header.
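
The two conditions above can be checked by hand before involving the parser. The following is a rough diagnostic sketch, not the parser's actual logic: InterleaveHeaderCheck, countInterleaveTokens and findProblem are hypothetical names, and the header is assumed to be passed in as the lines of the Data or Characters block that precede the matrix.

```java
import java.util.List;

/** Hypothetical diagnostic helper; NOT part of BioJava. */
final class InterleaveHeaderCheck {

    /** Counts tokens on one line that start with "interleave" (any case). */
    static int countInterleaveTokens(String line) {
        int count = 0;
        // Tokens are separated by whitespace and/or the terminating semicolon.
        for (String token : line.trim().split("[\\s;]+")) {
            if (token.toLowerCase().startsWith("interleave")) {
                count++;
            }
        }
        return count;
    }

    /**
     * Returns a description of the first problem found in the header lines,
     * or null if neither of the two conditions applies.
     */
    static String findProblem(List<String> headerLines) {
        int total = 0;
        for (String line : headerLines) {
            int n = countInterleaveTokens(line);
            boolean isFormatLine = line.trim().toLowerCase().startsWith("format");
            if (n > 0 && !isFormatLine) {
                return "interleave token outside the Format line: " + line.trim();
            }
            total += n;
        }
        return total > 1 ? "more than one interleave token in the header" : null;
    }
}
```

A null result only means that neither condition holds; the file can still fail to parse for other reasons.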

cheers,
Richard

On 7 Aug 2009, at 10:28, David Johnson wrote:

Hi Richard,

Thanks for your mail. An example of an interleaved file can be found here:

http://www.molecularevolution.org/si/resources/fileformats/files/dna.nex

where the link pointing to the example file is from
http://www.molecularevolution.org/si/resources/fileformats/ and under
the NEXUS section.

The specific error message is: "org.biojava.bio.seq.io.ParseException:
Found unexpected token interleave=yes in CHARACTERS block"

So it looks like the error is thrown reading the "interleave"
parameter in the top of the data block, and before reaching the actual
interleaved matrix data. Full stacktrace in attached .txt.

Cheers,
-David

2009/8/7 Richard Holland <[email protected]>:

Could you point me to an example of an interleaved file?

And also the full stack trace of the exception that gets thrown?

cheers,
Richard

On 6 Aug 2009, at 18:03, David Johnson wrote:

Hi everyone,

A quick question about the BioJava Nexus parser. I've been trying to
use the Nexus file parser, simply by doing something like:

     NexusFileBuilder builder = new NexusFileBuilder();
     NexusFileFormat.parseFile(builder, f);

However, when parsing Nexus files that are interleaved, I get a
ParseException.

Is there a way to set up the parser provided by BioJava to handle
interleaved Nexus files?

Thanks,
-David
--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: [email protected]
http://www.eaglegenomics.com/

_______________________________________________
Biojava-l mailing list  -  [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l
