Hi everybody,

I'm in the same situation as George: I want to extract data from BLAST files. I can already get the alignments without any problem. Now I also want the line shown between the 2 sequences (the lines with "+", "-" and some letters in George's example), because it shows at a glance whether the alignment between the 2 sequences is really good or not.
Is it possible, Docs? Thank you.

Sebastien

--- Richard HOLLAND <[EMAIL PROTECTED]> wrote:

> BioJava's BLAST framework parses files and fires events for every
> piece of information it finds. The SeqSimilarityAdapter class is an
> example of how to catch these events and construct basic BLAST result
> objects (SimpleSeqSimilarityHit); however, they are not comprehensive
> and do not record full details of every hit.
>
> If you want the kind of detail you mention below you will have to
> write your own content handler for BLAST parsing and pass it to the
> BLASTLikeSAXParser when parsing a file. This event handler should
> implement the ContentHandler interface. Look at the source of
> SeqSimilarityAdapter for guidance. You will then receive events for
> every part of the file, from which you can construct your own custom
> BLAST result objects to describe them.
>
> If you're not sure what tag names to listen for in your
> ContentHandler the easiest thing to do is just run it once and dump
> them all out to see what you get.
>
> cheers,
> Richard
>
>
> -----Original Message-----
> From: [EMAIL PROTECTED] on behalf of Y D Sun
> Sent: Sun 6/26/2005 5:42 PM
> To: biojava-l@biojava.org
> Cc:
> Subject: [Biojava-l] BLAST Parser for extracting all BLAST data?
>
> Hi,
>
> I want to extract all data from BLASTP results. In the following hit,
> for example, I need to get the lengths of query and subject proteins,
> the identities (including all data 54, 124 and 43%), the positives
> (all data 79, 124 and 63%), and the gaps (3, 124 and 2%). Can the
> BLASTLikeSAXParser filter all this information? I can't find the
> methods in SeqSimilaritySearchHit and SeqSimilaritySearchSubHit APIs
> to retrieve these data. Does Biojava provide any methods for this
> purpose?
>
> Thanks,
>
> George
>
>
> BLASTP 2.2.5 [Nov-16-2002]
>
> Query= Prot0001
>          (138 letters)
>
> Database: /work/nys1/fasta/protein/AE000782.pro.fasta
>            2407 sequences; 662,866 total letters
>
> Searching.....done
>
>                                                    Score     E
> Sequences producing significant alignments:        (bits)  Value
>
> Prot0002                                             100   1e-23
> Prot0003                                              74   2e-15
> Prot0004                                              43   3e-06
>
> >Prot0002
>           Length = 138
>
>  Score = 100 bits (250), Expect = 1e-23
>  Identities = 54/124 (43%), Positives = 79/124 (63%), Gaps = 3/124 (2%)
>
> Query: 18  NARTKFTDIAKTLNLTEAAIRKRIKKLEENQIIKRYSIDIDYKKLGYNMAIIGLDIDMDY 77
>            NAR   T IAK LN+TEAA+RKRI  LE  + I  Y   I+YKK+G + ++ G+D+D D
> Sbjct: 15  NARIPKTRIAKELNVTEAAVRKRIANLERREEILGYKAIINYKKVGLSASLTGVDVDPDK 74
>
> Query: 78  FPKIIKELEKRKEFLHIYSSAGDHDIMVIAIYK---DLEEIYNYLKNLKGVKRVCPAIII 134
>              K+++EL+  +    ++ + GDH IM   I K   +L EI+  +  ++GVKRVCP+II
> Sbjct: 75  LWKVVEELKDLESVKSLWLTTGDHTIMAEIIAKSVQELSEIHQKIAEMEGVKRVCPSIIT 134
>
> Query: 135 DQIK 138
>            D +K
> Sbjct: 135 DIVK 138
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
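
For Richard's suggestion of running once and dumping every tag name, here is a minimal sketch of such a handler. It only uses the standard org.xml.sax classes from the JDK. The class name TagDumper, the helper dumpTagNames and the element names in the sample XML are all made up for illustration; they are not the names the BioJava parser actually emits, and the JDK's own SAX parser stands in here for the BLASTLikeSAXParser so that the example runs without BioJava on the classpath.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/**
 * Hypothetical helper: remembers each distinct element name the first
 * time a SAX parser reports it. DefaultHandler implements ContentHandler.
 */
final class TagDumper extends DefaultHandler {

    // LinkedHashSet keeps first-seen order and drops repeats.
    private final Set<String> seen = new LinkedHashSet<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        seen.add(qName);
    }

    /** Distinct element names in the order they were first seen. */
    List<String> tagNames() {
        return new ArrayList<>(seen);
    }

    /**
     * Feeds XML text through the JDK SAX parser. This is an assumption
     * made for the demo: in real use the handler would be given to the
     * BLAST parser instead, as described in the quoted message.
     */
    static List<String> dumpTagNames(String xml) throws Exception {
        TagDumper handler = new TagDumper();
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.tagNames();
    }

    public static void main(String[] args) throws Exception {
        // Invented element names, NOT the real BioJava event names.
        String xml =
            "<ExampleResult>"
          + "<ExampleHit><ExampleAlignment/></ExampleHit>"
          + "<ExampleHit><ExampleAlignment/></ExampleHit>"
          + "</ExampleResult>";
        for (String name : dumpTagNames(xml)) {
            System.out.println(name);
        }
    }
}
```

Once the real tag names are known, the same handler could be extended with a characters() override to collect the text of whichever elements carry the alignment lines.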