Re: [BiO BB] Efficient way to retrieve full length cDNA sequences from GenBank?

Mike Marchywka Thu, 02 Apr 2009 09:14:25 -0700

----------------------------------------
> Date: Thu, 2 Apr 2009 16:41:51 +0100
> From: [email protected]
> To: [email protected]
> Subject: Re: [BiO BB] Efficient way to retrieve full length cDNA sequences 
> from GenBank?
>
> Hi
>
> I would do it programmatically. You do not even need to know much Perl to
> create your own simple scripts with the ENSEMBL APIs.
>


I was using bash scripts with the usual tools (sed/awk) to parse BLAST output
from short probe queries, and then wget or curl to request the
genome sequence near the hits. Alternatively, you can download
the complete genomes locally and use your favorite random-access
facility (Perl would work) to pull out the pieces you want.
IIRC, I then ran my own C++ code for various tests.
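
As a rough illustration of the parse-then-fetch step described above (in Python
rather than the bash/sed/awk I actually used), here is a minimal sketch. It
assumes BLAST was run with the standard 12-column tabular output, and that the
subject sequence is already in memory as a string. The function names
parse_hits and flank, and the padding value, are made up for this example.

```python
def parse_hits(lines):
    """Parse BLAST 12-column tabular output into a list of hit dicts.

    Columns assumed: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        f = line.split("\t")
        sstart, send = int(f[8]), int(f[9])
        hits.append({
            "query": f[0],
            "subject": f[1],
            # BLAST reports minus-strand hits with sstart > send
            "start": min(sstart, send),
            "end": max(sstart, send),
            "strand": "+" if sstart <= send else "-",
        })
    return hits


def flank(seq, start, end, pad):
    """Return seq[start..end] (1-based, inclusive) plus pad bases each side,
    clipped to the ends of the sequence."""
    lo = max(0, start - 1 - pad)
    hi = min(len(seq), end + pad)
    return seq[lo:hi]


if __name__ == "__main__":
    genome = "A" * 10 + "CCGGTT" + "A" * 10  # toy subject sequence
    blast = ["probe1\tchr1\t100.00\t6\t0\t0\t1\t6\t16\t11\t1e-5\t12.0"]
    for h in parse_hits(blast):
        print(h["query"], h["strand"], flank(genome, h["start"], h["end"], 2))
```

For whole genomes you would want an indexed FASTA reader instead of holding the
sequence in a string, but the coordinate handling stays the same.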

In unrelated work on splicing, many putative splicing cues could be
formulated as regular expressions, with matches also checked against the
reverse complement. You can also set up your own local BLAST DB, or get
other patterns or rules to search against. Not sure if there are canned
tools, but it isn't hard to do a lot of this locally once you
have coarse hits for marginal candidates.
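
The regular-expression idea above can be sketched like this, again in Python.
The pattern GT[AG]AGT is only an illustrative donor-site-like motif, not a
validated splicing cue, and revcomp and find_both_strands are hypothetical
helper names. Minus-strand matches are mapped back to forward-strand
coordinates so both kinds of hit can be compared directly.

```python
import re

# Translation table for complementing unambiguous DNA bases
_COMP = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMP)[::-1]


def find_both_strands(pattern, seq):
    """Find regex matches on seq and on its reverse complement.

    Returns sorted (start, end, strand) tuples, 0-based half-open,
    always expressed in forward-strand coordinates.
    """
    rx = re.compile(pattern)
    n = len(seq)
    hits = [(m.start(), m.end(), "+") for m in rx.finditer(seq)]
    for m in rx.finditer(revcomp(seq)):
        # position p on the reverse complement is n - 1 - p on the forward strand
        hits.append((n - m.end(), n - m.start(), "-"))
    return sorted(hits)


if __name__ == "__main__":
    motif = "GT[AG]AGT"  # illustrative pattern only
    for s in ("TTGTAAGTCC", "CCACTTACGG"):
        print(s, find_both_strands(motif, s))
```

Note that finditer only reports non-overlapping matches, which is fine for
coarse screening but may miss overlapping occurrences of a motif.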



>
> Go to http://www.ensembl.org and look for the APIs in the Docs & FAQ's 
> section.
> It is full of instructions and examples.
>
> Good luck
> Pedro
>
> --
> Pedro Fernandes
> Centro Português de Bioinformática

> Quoting dale richardson :
>
>>
>> So my question is this:
>>
>> What is the most efficient way to obtain a set of cDNA sequences that
>> match to a set of genomic DNA sequences while excluding spurious
>> hits, RefSeq sequences and "pseudo" full length cDNAs?
>>
>> As you can imagine, I am interested in looking for alternative splice
>> variants for a number of genes.


