On Tue, Apr 30, 2002 at 09:12:59AM -0400, Simon Foote wrote:
> I've recently run across a problem with parsing of Genbank files
> containing unbounded locations.
> Anyone have any idea what's causing it. I tried to trace it back
> through but got lost. But I think it has to do with the single <1 for
> the -35_signal as shown in the example.
>
>      -35_signal      <1
>                      /gene="entD"

The default Feature implementations in the BioJava development tree
explicitly forbid construction of Features with locations which aren't
contained by the sequence to which they're attached. As a quick fix,
you can just remove the check from the constructor of
org.biojava.bio.seq.impl.SimpleFeature (lines 281--283 in my copy).

I'm not sure what the proper solution for this problem is. Normally,
features which extend beyond the sequence can be transformed into
RemoteFeatures. However, this particular feature is nasty in that
it doesn't even partially overlap the sequence. To my mind, it's
actually pretty much meaningless, and the best thing to do would be
to drop it. But some people like to be able to represent the whole
of Genbank.

Does anyone know how many more `wholly remote' features there are in
the databases? And any great ideas about how they could be usefully
represented?

Thomas.
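
To make the three cases above concrete (contained, partially
overlapping, wholly remote), here is a minimal standalone sketch. It
does not use the BioJava API: the class LocationCheck, the enum
Relation and the method classify are all hypothetical names invented
for illustration, and the coordinates assume 1-based inclusive
numbering as in Genbank. The third call stands in for a feature lying
entirely before position 1, which is how this message reads the `<1'
location; how a parser actually maps `<1' to coordinates is an
assumption here.

```java
// Hypothetical, standalone sketch: none of these names come from BioJava.
public class LocationCheck {

    /** How a feature's span relates to its parent sequence. */
    public enum Relation { CONTAINED, PARTIAL_OVERLAP, WHOLLY_REMOTE }

    /**
     * Classify a feature spanning [min, max] against a sequence whose
     * residues are numbered 1..seqLength (inclusive).
     */
    public static Relation classify(int min, int max, int seqLength) {
        if (min > max) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        // Entirely inside the sequence: the only case a strict
        // containment check would accept.
        if (min >= 1 && max <= seqLength) {
            return Relation.CONTAINED;
        }
        // No residue in common with the sequence at all.
        if (max < 1 || min > seqLength) {
            return Relation.WHOLLY_REMOTE;
        }
        // Hangs off one or both ends but still shares some residues.
        return Relation.PARTIAL_OVERLAP;
    }

    public static void main(String[] args) {
        int seqLength = 1000;
        System.out.println(classify(10, 50, seqLength));   // CONTAINED
        System.out.println(classify(-20, 50, seqLength));  // PARTIAL_OVERLAP
        System.out.println(classify(-40, -5, seqLength));  // WHOLLY_REMOTE
    }
}
```

A parser could use a classification like this to decide whether to
build an ordinary feature, hand the feature to some remote
representation, or drop it, rather than failing in the constructor.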