Ewan:
> BTW - Andrew's suggestion of writing out the header portion is almost
> certainly a bad idea. A format definition is a mixture of the precise
> definition and its common usage. Also the old adage rings true: "be
> permissive in what you accept but strict in what you output".

Well, I didn't say that it was a good idea. I said that the GenBank format specification from NCBI requires a header, so what bioperl, biojava, biopython, etc. accept and generate as "genbank" isn't strictly the GenBank format. I also said that

] In reality the software should
] respond to a request for something in GenBank format by saying it
] can't generate a GenBank file but can generate something which is
] GenBank-like, and somehow describe the differences. Documenting
] this nuance and those differences is itself hard and tedious,
] which is why it is rarely done outside of pointers to source code.
] [It] then happens that there's a de facto consensus among those
] in the know about what constitutes a "correct" GenBank file. This
] consensus isn't documented and is learned mostly from experience.

The consensus is that the header is optional. In my view as well, sticking in a fake header with false data to meet the strict format spec is a bad thing.

But there's apparently no consensus on what is strict or permissive enough for a LOCUS line, since neither biojava's nor biopython's parser will accept bioperl's stock output. That's easy to fix once it's pointed out -- make the output stricter or agree on a more permissive consensus.

Andrew
[EMAIL PROTECTED]
_______________________________________________
Biojava-l mailing list - [EMAIL PROTECTED]
http://biojava.org/mailman/listinfo/biojava-l
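
To make the "permissive in, strict out" point about the LOCUS line concrete, here is a minimal sketch in Python. It is illustrative only: the function names are hypothetical, the fixed-width layout is an assumption made up for the example and is not the NCBI column specification, and it is not how bioperl, biojava, or biopython actually handle the line.

```python
# Assumed fixed-width layout, for illustration only (NOT the NCBI column spec).
STRICT_TEMPLATE = "LOCUS       {name:<16} {length:>11} bp    {mol:<6}"


def parse_locus_permissive(line):
    """Accept a LOCUS line whatever its spacing; return (name, length, mol)."""
    tokens = line.split()
    if len(tokens) < 3 or tokens[0].upper() != "LOCUS":
        raise ValueError("not a LOCUS line: %r" % line)
    name = tokens[1]
    # Treat the first all-digit token after the name as the sequence length.
    for i, tok in enumerate(tokens[2:], start=2):
        if tok.isdigit():
            # Skip an optional unit token, then take the next token as the
            # molecule type if one is present.
            rest = [t for t in tokens[i + 1:] if t.lower() not in ("bp", "aa")]
            mol = rest[0] if rest else None
            return name, int(tok), mol
    raise ValueError("no sequence length found in: %r" % line)


def write_locus_strict(name, length, mol):
    """Always emit the same fixed-width layout, regardless of the input's."""
    line = STRICT_TEMPLATE.format(name=name, length=length, mol=mol or "")
    return line.rstrip()


if __name__ == "__main__":
    loose = "locus AB000001   1234 DNA"
    print(write_locus_strict(*parse_locus_permissive(loose)))
```

The reader tolerates case, spacing, and a missing "bp", while the writer has exactly one output form, so two tools built this way can exchange files even if their writers differ in detail.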