I've found the problem - "interleave=yes" is not valid according to
the official NEXUS format spec (Maddison et al., 1997), which the
parser was written against.
Interleaved files are supposed to include only the word "interleave" -
it takes no parameters. Non-interleaved files shouldn't mention it at
all.
I've modified the parser to tolerate this, but I'd be interested to
know where the invalid token came from - was it added manually, or by
an existing piece of publicly available software?
The modification has been made in the trunk of the biojava-live
subversion repository.
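Until a release containing that change is available, one possible workaround is to normalise the token in the file text before handing it to the parser. The following is only a sketch under that assumption: the class and method names are hypothetical and are not part of BioJava. It does a plain text substitution, so it would also rewrite a matching string that happened to appear inside a comment or the matrix itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of BioJava: rewrites the invalid
// "interleave=yes" / "interleave=no" tokens into the form the spec expects.
class InterleaveNormalizer {

    // Matches interleave=yes or interleave=no, ignoring case and
    // allowing whitespace around the '='.
    private static final Pattern TOKEN = Pattern.compile(
            "\\binterleave\\s*=\\s*(yes|no)\\b", Pattern.CASE_INSENSITIVE);

    static String normalize(String nexusText) {
        Matcher m = TOKEN.matcher(nexusText);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // "yes" becomes the bare keyword; "no" is dropped entirely,
            // since non-interleaved files should not mention it at all.
            String replacement =
                    m.group(1).equalsIgnoreCase("yes") ? "interleave" : "";
            m.appendReplacement(out, replacement);
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

A Format line that already uses the bare "interleave" keyword passes through unchanged, so the helper is safe to apply to valid files as well.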
cheers,
Richard
On 7 Aug 2009, at 11:10, David Johnson wrote:
Hi Richard,
Actually the original exception was thrown by a different file that my
supervisor tried uploading to a Web app I'm developing that uses the
BioJava Nexus parser, but I can't get hold of that particular file
today. So the one I provided the link for was just another example of
an interleaved Nexus file I Googled for when I got your first email
this morning, as I figured they'd probably use the same formatting. But
I remember it's definitely the same exception in both cases.
I had a quick look in the example I provided today, and the
interleave=yes token is definitely in the header of the data block,
and is also definitely in the Format line.
Oh, just FYI, I'm using the BioJava 1.7 binary distribution
(http://www.biojava.org/download/bj17/bin/biojava.jar).
Cheers,
-David
2009/8/7 Richard Holland <[email protected]>:
Thanks David. One more quick question - is this the exact file that is
throwing the exception? I haven't tested it yet - but if I could test
against the real file that is causing the problem, that would help me
find out exactly what's going wrong.
For what it's worth, the exception is normally thrown when more than
one interleave=yes/no token is found in the header of the Data or
Characters block, or when the interleave token appears in a line other
than the Format line of the header.
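As a rough illustration of those two conditions (this is not the parser's actual code, and the class and method names are hypothetical), a check over the header commands of a Data or Characters block might look like the sketch below. It assumes each command has already been split out as one string, such as "format datatype=dna interleave".

```java
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only, NOT BioJava's implementation.
class InterleaveHeaderCheck {

    // Returns a description of the problem, or null if none is found.
    static String findProblem(List<String> headerCommands) {
        int seen = 0;
        for (String command : headerCommands) {
            String[] tokens = command.trim().split("\\s+");
            boolean isFormat = tokens[0].equalsIgnoreCase("format");
            for (String token : tokens) {
                if (!token.toLowerCase(Locale.ROOT).startsWith("interleave")) {
                    continue;
                }
                // Condition 2: token on a line other than the Format line.
                if (!isFormat) {
                    return "interleave token outside the Format command";
                }
                // Condition 1: more than one interleave token in the header.
                if (++seen > 1) {
                    return "more than one interleave token";
                }
            }
        }
        return null;
    }
}
```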
cheers,
Richard
On 7 Aug 2009, at 10:28, David Johnson wrote:
Hi Richard,
Thanks for your mail. An example of an interleaved file can be found
here:
http://www.molecularevolution.org/si/resources/fileformats/files/dna.nex
The link to the example file is on
http://www.molecularevolution.org/si/resources/fileformats/ under the
NEXUS section.
The specific error message is:
"org.biojava.bio.seq.io.ParseException:
Found unexpected token interleave=yes in CHARACTERS block"
So it looks like the error is thrown while reading the "interleave"
parameter at the top of the data block, before reaching the actual
interleaved matrix data. Full stack trace in attached .txt.
Cheers,
-David
2009/8/7 Richard Holland <[email protected]>:
Could you point me to an example of an interleaved file?
And also the full stack trace of the exception that gets thrown?
cheers,
Richard
On 6 Aug 2009, at 18:03, David Johnson wrote:
Hi everyone,
A quick question about the BioJava Nexus parser. I've been trying to
use the Nexus file parser, simply by doing something like:
    NexusFileBuilder builder = new NexusFileBuilder();
    NexusFileFormat.parseFile(builder, f);
However, when parsing Nexus files that are interleaved, I get a
ParseException.
Is there a way to set up the parser provided by BioJava to handle
interleaved Nexus files?
Thanks,
-David
--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: [email protected]
http://www.eaglegenomics.com/
_______________________________________________
Biojava-l mailing list - [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l