Hi Guillermo,

Thank you for your patience. One of our engineers has this to say: "The other
alignments are being dropped because of the global near-best criterion, which
saves only alignments that score within 0.25% of the top-scoring alignment. The
top-scoring one aligns with 100% identity; the next one is 96%."
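
To illustrate the near-best idea described in that quote, here is a minimal
sketch in Python. It is not the code used in the UCSC pipeline; the function
name, the (query_name, score) tuple layout, and the example scores are all
hypothetical, and the only value taken from the quote is the 0.25% window.

```python
from collections import defaultdict


def global_near_best(alignments, near_best=0.0025):
    """Keep, per query, only alignments scoring within `near_best` of the top score.

    `alignments` is an iterable of (query_name, score) tuples. This layout is
    an assumption made for the sketch, not a real PSL record.
    """
    # Group all alignments by the query sequence they belong to.
    by_query = defaultdict(list)
    for aln in alignments:
        by_query[aln[0]].append(aln)

    kept = []
    for hits in by_query.values():
        best = max(score for _, score in hits)
        # 0.0025 corresponds to the 0.25% window mentioned in the quote.
        threshold = best * (1.0 - near_best)
        kept.extend(hit for hit in hits if hit[1] >= threshold)
    return kept


# Made-up scores: only the first two fall within 0.25% of the best (1000).
hits = [("BC096042", 1000), ("BC096042", 998), ("BC096042", 960)]
print(global_near_best(hits))
```

Under this kind of rule, a second alignment at 96% identity scores too far
below a 100% identity alignment to be kept, even though it is above 95%.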

I hope this helps to clarify things for you. If you have further questions, 
please contact the mailing list: [email protected].

Vanessa Kirkup Swing
UCSC Genome Bioinformatics Group

----- Original Message -----
From: "Guillermo Parada" <[email protected]>
To: "Vanessa Kirkup Swing" <[email protected]>
Sent: Friday, April 29, 2011 8:00:58 PM
Subject: Re: [Genome] Alignment filter by identity



Hi Vanessa,

I found Perl code that calculates PIDs, based on the source you published in the
BLAT FAQ, and I adapted it to my own requirements. I ran it
over the Table Browser mm9 data (all_mrna and all_est) and found 1491 cDNAs
(0.64%) and 129268 ESTs (2.96%) with a PID below 95%.

So these data suggest that you did not filter out alignments with a PID below
95%. I really need you to tell me the criteria applied to filter out the
low-quality alignments in the Table Browser data, because they will be my
gold-standard criteria for filtering alignments produced by other programs.

I'm very interested in how you distinguish sequences that align only once
from those that align to more than one locus in the genome (putative pseudogenes).

Thanks for your kind attention.

Best Regards



2011/4/29 Vanessa Kirkup Swing < [email protected] > 


Hi Guillermo, 

We are currently working on your question. We hope to have some answers for you 
sometime next week. Thank you for your patience. 

Vanessa Kirkup Swing 
UCSC Genome Bioinformatics Group 





----- Original Message ----- 
From: "Guillermo Parada" < [email protected] > 
To: [email protected] 
Sent: Wednesday, April 27, 2011 9:49:18 AM 
Subject: [Genome] Alignment filter by identity 

Hello UCSC Genome Browser staff!

My name is Guillermo. I downloaded the alignment data for the cDNAs and ESTs
against the mm9 genome from the Table Browser, and I am now comparing them with
my gmap alignment results for the same cDNAs and ESTs.

I am writing to you because I am not clear about the BLAT alignment filter
parameters you used to generate the alignments. The BLAT program
specification says "Blat produces .... at the DNA level between two
sequences that are of 95% or greater identity ..." (
https://cgwb.nci.nih.gov/goldenPath/help/blatSpec.html ). In addition, you
may have configured the pslReps filter options to keep only alignments above
a certain coverage (-minCover flag). Did you?

I found some cDNA cases, such as BC096042, that have only one alignment in the
Table Browser results. When I submit the BC096042 sequence to the web version
of BLAT in the Genome Browser, it shows many alignments, which is expected
because those results are unfiltered. What makes no sense to me is that it
shows many alignments with identity over 95%. Why aren't these alignments in
the Table Browser? Maybe because, in this particular case, the BC096042
alignment has 100% identity and the other, suboptimal alignments were
automatically deleted. Is that right? (See the 100% identity alignment in the
Genome Browser as well:
http://genome.ucsc.edu/cgi-bin/hgc?hgsid=193400045&o=125111868&t=125115317&g=mrna&i=BC096042
)

In order to use your BLAT alignment results from the Table Browser, I need to
know your filter parameters. I would also like to confirm the formula (the one
used on the hyperlink line) for getting the identity from a psl line. I need
to apply it to my gmap results, and after many attempts I arrived at a logical
way to calculate it:

[match*100/(match + mismatch + Q gap bases) + match*100/(sum of block sizes)] / 2

Please tell me if that's right. 
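
For reference, the formula proposed above can be written as a small Python
sketch. It only restates that proposal and is not confirmed to be the
calculation UCSC uses. The function name is hypothetical, the argument names
follow the PSL columns matches, misMatches, qBaseInsert and blockSizes, and
the example numbers are made up.

```python
def proposed_identity(matches, mis_matches, q_base_insert, block_sizes):
    """Average of two percent-identity estimates, per the formula proposed above.

    First term:  matches over matches + mismatches + query gap bases.
    Second term: matches over the sum of the alignment block sizes.
    """
    aligned_with_gaps = matches + mis_matches + q_base_insert
    block_total = sum(block_sizes)
    # Assumption for this sketch: report 0.0 rather than divide by zero.
    if aligned_with_gaps == 0 or block_total == 0:
        return 0.0
    first = matches * 100.0 / aligned_with_gaps
    second = matches * 100.0 / block_total
    return (first + second) / 2.0


# Made-up example: 96 matches and 4 mismatches in two 50-base blocks, no gaps.
print(proposed_identity(96, 4, 0, [50, 50]))
```

Note that repeat matches (the repMatches column) are not part of the proposal
as written, so they are left out of the sketch as well.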

Thank you very much for your time.
I look forward to your reply. 

-- 
Guillermo Parada. Undergraduate Biochemistry student at Pontificia Universidad
Católica de Chile.
_______________________________________________ 
Genome maillist - [email protected] 
https://lists.soe.ucsc.edu/mailman/listinfo/genome 



-- 
Guillermo Parada. Undergraduate Biochemistry student at Pontificia Universidad
Católica de Chile.

