>>>>> "David" == David Waring <[EMAIL PROTECTED]> writes:
[...]

    David> This parses the blast file and builds
    David> SequenceDBSearchResults into a list. It is a little bit
    David> more complicated than that really. But this complication
    David> gives very great functionality. The SearchResultBuilder
    David> must have two things that you might not expect, a
    David> SequenceDB with all the query sequences that blast was
    David> called with, and a SequenceDBInstallation which contains a
    David> SequenceDB with the same name as that found in the blast
    David> output file, in the demo this is 'genome'. With these
    David> things in place you can get both the subject, and query
    David> sequences of any hit from the SequenceDBSearchResult. I
    David> included a little sample of how to do this below since it
    David> is not in the demo.

    David> But, you say, this is a blast against some foreign
    David> database, How can I have a SequenceDB with all this
    David> data. The truth is you do not really need it. You just need
    David> an empty SequenceDB with the correct name inside your
    David> SequenceDBInstallation. But then of course you can not get
    David> the subject sequences from the search result.

Yeah, the added complexity issue has been bugging me since the
bootcamp. I'm just finishing off a dotplot-style viewer for pairwise
comparisons which has to read Blast/Fasta/whatever. As an end-user
application it has to cope with this robustly (e.g. where the sequence
name of the query/subject or database does not match up with the
search output).

As you say, there are tricks to get round the problem. The tests don't
contain a copy of EMBL (!), but use a dummy SequenceDB in the way you
describe. In cases where a user has said "this was my query, no matter
what your code thinks" I use a SingleSequenceDB (which contains one
sequence, has no ID list, and always gives you back that sequence when
you request one).
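To make those two tricks concrete, here is a minimal sketch. It deliberately uses simplified stand-in interfaces rather than the real BioJava types (which have more methods and throw checked exceptions), so every name in it (SimpleSequenceDB, SimpleInstallation, DummySequenceDB, OneSequenceDB, DummyDBSketch) is hypothetical and only illustrates the idea, not my actual SingleSequenceDB code.

```java
// Simplified stand-ins for illustration only. These are NOT the real
// BioJava interfaces, which have more methods and checked exceptions.
interface SimpleSequenceDB {
    String getName();
    String getSequence(String id); // null when the ID is unknown
}

interface SimpleInstallation {
    SimpleSequenceDB getSequenceDB(String name); // null when no such DB
}

// An empty database that exists only to carry the expected name.
final class DummySequenceDB implements SimpleSequenceDB {
    private final String name;
    DummySequenceDB(String name) { this.name = name; }
    @Override public String getName() { return name; }
    @Override public String getSequence(String id) { return null; }
}

// Holds exactly one sequence and returns it for every requested ID.
final class OneSequenceDB implements SimpleSequenceDB {
    private final String name;
    private final String sequence;
    OneSequenceDB(String name, String sequence) {
        this.name = name;
        this.sequence = sequence;
    }
    @Override public String getName() { return name; }
    @Override public String getSequence(String id) { return sequence; }
}

public class DummyDBSketch {
    // Throwaway installation that knows about a single database.
    static SimpleInstallation installationFor(final SimpleSequenceDB db) {
        return new SimpleInstallation() {
            @Override public SimpleSequenceDB getSequenceDB(String name) {
                return db.getName().equals(name) ? db : null;
            }
        };
    }

    public static void main(String[] args) {
        SimpleSequenceDB queries = new OneSequenceDB("queries", "ACGTACGT");
        SimpleInstallation subjects =
            installationFor(new DummySequenceDB("genome"));

        // The query comes back whatever ID the search output used.
        System.out.println(queries.getSequence("some_mismatched_id"));
        // The named database is found, but it holds no subject sequences.
        System.out.println(subjects.getSequenceDB("genome").getSequence("hit1"));
    }
}
```

The trade-off is the one David describes: the dummy database lets the name lookup succeed, but any request for a subject sequence comes back empty.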
It's also possible to compact things like the SequenceDBInstallation
to anonymous inner classes which behave exactly as you want (such as
making assumptions about the identity of sequences/databases which you
wouldn't normally allow).

Keith

--
-= Keith James - [EMAIL PROTECTED] - http://www.sanger.ac.uk/Users/kdj =-
Pathogen Sequencing Unit, Wellcome Trust Sanger Institute, Cambridge, UK
_______________________________________________
Biojava-l mailing list - [EMAIL PROTECTED]
http://biojava.org/mailman/listinfo/biojava-l