Hi,
I am new to BioJava and would like to try it out, with a view to integrating it with KNIME. My first project is parsing BLAST output for large files. The example in the cookbook is very good, and I had no problems setting everything up in Eclipse and getting it to work.

Now here is my problem: I am interested in parsing the summary table at the beginning of the BLAST output, and I haven't found a way to get at this information.

I am blasting short sequences (20 nt to 300 nt) against genomic databases (mouse/human/RefSeq/miRBase). I want to know whether a given sequence (out of a set of sequences) aligns to a specific genome with high identity. I then want to split the input FASTA file into one set that aligns to the genome and one that doesn't (and potentially a third list of dubious sequences where there is no clear answer). For this I only need the length of the query sequence, the score, and the first few characters of the header line. At least that's the way I am currently doing it. I have set the BLAST parameters to report only the first alignment, but the first 50 or so entries in the summary.

Any help or comments are appreciated.

Thanks,
Bernd

Bernd Jagla
Bioinformatician
Institut Pasteur
Plate-forme puces a ADN
Genopole / Institut Pasteur
28 rue du Docteur Roux
75724 Paris Cedex 15
France
tel: +33 (0) 140 61 35 13
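
P.S. To make the goal concrete, here is a rough sketch in plain Java of the kind of summary-table parsing I mean. It does not use the BioJava API. It assumes the classic NCBI pairwise text report, where each query starts with a "Query=" line, the query length appears as "(25 letters)" or "Length=25", and the table follows the line "Sequences producing significant alignments" with the bit score and E-value as the last two columns. The class and method names (BlastSummarySketch, parse, Hit, QuerySummary) and the sample identifiers are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, plain Java only; not part of BioJava.
public class BlastSummarySketch {

    /** One row of the "Sequences producing significant alignments" table. */
    public static final class Hit {
        public final String header;   // subject id and description as printed
        public final double bitScore;
        public final String eValue;   // kept as text; older reports print forms like "e-105"

        Hit(String header, double bitScore, String eValue) {
            this.header = header;
            this.bitScore = bitScore;
            this.eValue = eValue;
        }
    }

    /** Query header, query length and summary rows for one query. */
    public static final class QuerySummary {
        public final String queryHeader;
        public int length = -1;       // -1 means no length line was seen
        public final List<Hit> hits = new ArrayList<>();

        QuerySummary(String queryHeader) { this.queryHeader = queryHeader; }
    }

    // Assumed length lines: "(25 letters)" or "Length=25".
    private static final Pattern LENGTH =
            Pattern.compile("^\\s*(?:\\((\\d[\\d,]*) letters\\)|Length\\s*=\\s*(\\d+))");

    private enum State { HEADER, TABLE, DONE }

    public static List<QuerySummary> parse(List<String> lines) {
        List<QuerySummary> result = new ArrayList<>();
        QuerySummary current = null;
        State state = State.DONE;
        for (String line : lines) {
            if (line.startsWith("Query=")) {
                current = new QuerySummary(line.substring("Query=".length()).trim());
                result.add(current);
                state = State.HEADER;
            } else if (current == null) {
                continue;                          // preamble before the first query
            } else if (line.startsWith("Sequences producing significant alignments")) {
                state = State.TABLE;
            } else if (line.startsWith(">")) {
                state = State.DONE;                // alignment section begins
            } else if (state == State.TABLE) {
                if (line.trim().isEmpty()) {
                    // A blank line after at least one row ends the table.
                    if (!current.hits.isEmpty()) state = State.DONE;
                } else {
                    Hit hit = parseHitLine(line);
                    if (hit != null) current.hits.add(hit);
                }
            } else if (state == State.HEADER && current.length < 0) {
                Matcher m = LENGTH.matcher(line);
                if (m.find()) {
                    String digits = m.group(1) != null ? m.group(1) : m.group(2);
                    current.length = Integer.parseInt(digits.replace(",", ""));
                }
            }
        }
        return result;
    }

    // Assumes the last two whitespace-separated columns are bit score and E-value.
    private static Hit parseHitLine(String line) {
        String[] tok = line.trim().split("\\s+");
        if (tok.length < 3) return null;
        try {
            double score = Double.parseDouble(tok[tok.length - 2]);
            String header = String.join(" ", Arrays.copyOfRange(tok, 0, tok.length - 2));
            return new Hit(header, score, tok[tok.length - 1]);
        } catch (NumberFormatException e) {
            return null;                           // not a table row in the assumed layout
        }
    }

    public static void main(String[] args) {
        List<String> report = Arrays.asList(
                "Query= seq1 test read",
                "         (25 letters)",
                "",
                "                                                  Score    E",
                "Sequences producing significant alignments:       (bits) Value",
                "",
                "subjA hypothetical subject one                      50.1   1e-05",
                "",
                ">subjA hypothetical subject one",
                "Query= seq2",
                "         (40 letters)",
                " ***** No hits found ******");
        for (QuerySummary q : parse(report)) {
            System.out.println(q.queryHeader + " length=" + q.length + " hits=" + q.hits.size());
        }
    }
}
```

With something like this, the FASTA file could be split by checking whether hits is empty for a query, or by comparing the top bit score against a cutoff of my choosing relative to the query length.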
