Hello All,

Please forgive me if this post comes off as inexperienced, but if any of you have the time I would like to hear your suggestions on the following problem.

I've got a set of genomic DNA sequences for a number of species. What I want to do is obtain only the full-length cDNA sequences in GenBank that match these genomic sequences, excluding RefSeq sequences. What I've been doing so far is blasting the genomic sequences against the nr nucleotide database and manually deciding which hits to keep or discard, depending on how much of the subject sequence is covered by its alignment to the query. While this may be workable for organisms with little expression data, for mouse or human the number of hits makes the task entirely daunting.
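
For concreteness, the manual filtering described above could be sketched roughly as follows. This is only a minimal sketch under several assumptions that are not part of my current workflow: the hits are in tab-separated BLAST output with the standard twelve columns plus the subject length as a thirteenth column (for example, what BLAST+ writes with -outfmt "6 std slen"), RefSeq entries are recognised by the two-letters-plus-underscore accession prefix (NM_, XM_, NR_ and so on), and the 95% coverage and identity cutoffs are arbitrary placeholders. The function names are made up for illustration. It does not try to detect "pseudo" full-length cDNAs.

```python
import re
from collections import defaultdict

# RefSeq accessions look like NM_000546.5: two letters, an underscore, digits.
# The pattern also matches inside pipe-delimited ids such as gi|...|ref|NM_...|
REFSEQ_RE = re.compile(r"(?:^|\|)[A-Z]{2}_\d+")


def is_refseq(subject_id):
    """Return True if the subject id carries a RefSeq-style accession."""
    return REFSEQ_RE.search(subject_id) is not None


def covered_length(intervals):
    """Number of subject bases covered by the union of 1-based closed intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end + 1:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def full_length_hits(lines, min_coverage=0.95, min_identity=95.0):
    """Keep non-RefSeq subjects whose HSPs together cover most of the subject.

    Assumed columns: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore slen.
    The default cutoffs are placeholders, not recommended values.
    """
    hsps = defaultdict(list)
    subject_len = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        query, subject = f[0], f[1]
        identity = float(f[2])
        s_start, s_end, s_len = int(f[8]), int(f[9]), int(f[12])
        if is_refseq(subject) or identity < min_identity:
            continue
        # Minus-strand hits report sstart > send, so normalise the order.
        hsps[(query, subject)].append((min(s_start, s_end), max(s_start, s_end)))
        subject_len[subject] = s_len

    kept = []
    for (query, subject), intervals in hsps.items():
        coverage = covered_length(intervals) / subject_len[subject]
        if coverage >= min_coverage:
            kept.append((query, subject, round(coverage, 3)))
    return sorted(kept)
```

Because a spliced cDNA aligns to genomic DNA as several exon-sized HSPs, the sketch sums coverage over all HSPs of a subject rather than looking at the single best one. It could be run over an open file handle, e.g. full_length_hits(open("hits.tsv")), where hits.tsv is a hypothetical file name.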

So my question is this:

What is the most efficient way to obtain a set of cDNA sequences that match a set of genomic DNA sequences while excluding spurious hits, RefSeq sequences and "pseudo" full-length cDNAs?

As you can imagine, I am interested in finding alternative splice variants for a number of genes.

Any information or help that you could offer would be very much appreciated.

with sincere regards,

dale richardson

_______________________________________________
BBB mailing list
[email protected]
http://www.bioinformatics.org/mailman/listinfo/bbb
