Hi, Jennifer:

Sorry to bother you again. I think I have just a few last questions about my 
analysis of the honeybee genome. I can now run BLAT on my server and get 
PSL-format output. The problem is that the database file I downloaded is 
the scaffold sequence from the UCSC Table Browser, under the scaffold track. 
After searching against this database, I compared the output to the result from 
web-based BLAT. For the same RefSeq ID, the output has the same hit but 
different coordinates. I think the difference is due to the different 
coordinate systems (one based on linkage groups and the other based on 
individual scaffolds). In this case I can't extract the right sequence based on 
the scaffold coordinates. So, I was wondering if there is a resource, such as 
the sequence for a whole linkage group, that I can use as the database? Thank you!
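
To make the coordinate problem concrete, here is a minimal sketch of the 
conversion I think would be needed, assuming I had a table saying where each 
scaffold sits on its linkage group. The scaffold names, offsets, and the 
Placement and scaffold_to_group names are all made up for illustration; I have 
not confirmed which file provides the real placements. Intervals are 0-based 
and half-open, as in PSL target coordinates.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    """Where one scaffold sits on a linkage group (hypothetical values)."""
    group: str        # linkage group name
    group_start: int  # 0-based start of the scaffold on the group
    size: int         # scaffold length in bases
    strand: str       # "+" or "-": orientation of the scaffold on the group


def scaffold_to_group(p, start, end):
    """Map a 0-based, half-open [start, end) scaffold interval to the group."""
    if not (0 <= start <= end <= p.size):
        raise ValueError("interval falls outside the scaffold")
    if p.strand == "+":
        return p.group, p.group_start + start, p.group_start + end
    # Reverse-oriented scaffold: the interval is flipped end over end.
    return p.group, p.group_start + p.size - end, p.group_start + p.size - start


# Made-up placement table, for illustration only.
placements = {
    "scaffold_A": Placement("Group1", 1000, 500, "+"),
    "scaffold_B": Placement("Group1", 2000, 300, "-"),
}

print(scaffold_to_group(placements["scaffold_A"], 10, 60))
print(scaffold_to_group(placements["scaffold_B"], 10, 60))
```

If the whole linkage group sequence is available as a database, none of this 
would be necessary, which is why I am asking about it first.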

Jia   
----- Original Message -----
From: "Jennifer Jackson" <[email protected]>
To: "Jia Zeng" <[email protected]>
Cc: [email protected]
Sent: Tuesday, August 18, 2009 5:02:46 PM GMT -05:00 US/Canada Eastern
Subject: Re: [Genome] question about honey bee NCBI gene track

Hello,

Since you have access to a unix cluster, I would suggest getting an academic 
license for BLAT and installing it. Go to the bottom of the UCSC BLAT page and 
contact Jim Kent. Once you have the software we can help guide you a bit in 
terms of parameters, although the documentation that comes with the software is 
very good. There is also an FAQ for BLAT that describes how to mimic web-blat 
parameters (but these are not really recommended for users like yourself, who 
have a specific need for a full length alignment and have the resources to 
spend on a sensitive query against a smaller genome).

Try this and I will also alert our local BLAT expert that you will be 
installing and may have questions,

Thanks, jen


------------------------------------------------ 
Jennifer Jackson 
UCSC Genome Bioinformatics Group 

----- "Jia Zeng" <[email protected]> wrote:

> From: "Jia Zeng" <[email protected]>
> To: "Jennifer Jackson" <[email protected]>
> Cc: [email protected]
> Sent: Tuesday, August 18, 2009 1:34:12 PM GMT -08:00 US/Canada Pacific
> Subject: Re: [Genome] question about honey bee NCBI gene track
>
> Hi, Jennifer:
> 
> Thank you for these suggestions. I just tried one of the honeybee genes
> from NCBI and it worked. I could extract the genomic sequence based on the
> mRNA sequence from NCBI. Since I have a list of around 10000 genes, I
> think it will be very slow if I run them in the web-based browser, so I
> definitely should run BLAT in batch mode. I have access to a Unix
> cluster but I have never run BLAT on it before. In order to search
> against the honey bee genome, do I have to connect to the relational
> database at UCSC first, or do I have to download the database myself?
> Can you give me some instructions for that, or is there any
> documentation I can refer to? Thank you
> 
> Jia 
> ----- Original Message -----
> From: "Jennifer Jackson" <[email protected]>
> To: [email protected]
> Cc: [email protected]
> Sent: Tuesday, August 18, 2009 3:18:58 PM GMT -05:00 US/Canada
> Eastern
> Subject: Re: [Genome] question about honey bee NCBI gene track
> 
> Hello again,
> 
> Very glad that helped to resolve the discrepancies.
> 
> One option is to run BLAT in a batch mode, with the new sequences, and
> save the output in BED or PSL format. Or, obtain similar alignment
> data from NCBI and format it. With either, the results can be loaded as a
> custom track. Using the table browser, select the custom track and
> output "sequence". All of the regular options will come up.
> 
> File formats are described in detail in the FAQ. There are also some
> utilities to transform files in the code tree. I am not sure if you
> have programming/unix resources or capability. We can provide specific
> pointers to Help/FAQ documents if you are interested in these tools
> and need help locating them, but our team cannot actually do the
> programming/code tree installation for you. Write back and let us know
> if and how we can help more if this is the type of analysis you wish
> to pursue,
> 
> Jennifer
> 
> 
> ------------------------------------------------ 
> Jennifer Jackson 
> UCSC Genome Bioinformatics Group 
> 
> ----- [email protected] wrote:
> 
> > From: [email protected]
> > To: "Jennifer Jackson" <[email protected]>
> > Cc: [email protected]
> > Sent: Tuesday, August 18, 2009 11:33:46 AM GMT -08:00 US/Canada Pacific
> > Subject: Re: [Genome] question about honey bee NCBI gene track
> >
> > Dear Jennifer:
> > 
> > Thank you so much for your quick response. I think these are helpful,
> > and I just checked the NCBI ftp site to find the most current
> > honey bee RNA sequence, which was updated in October 2006.
> > 
> > However, I still have a problem and need to ask for your help. In my
> > analysis, I want to extract the genomic sequence (which includes both
> > introns and exons) for a bunch of genes. I know that in the UCSC Genome
> > Browser it is very easy to do that because there are many options to
> > select, such as 5' UTR Exons, 3' UTR Exons, CDS Exons, and Introns. But
> > in NCBI I can only get the mRNA sequence for a given RefSeq ID instead
> > of the whole gene sequence. I think if I make a custom track in the UCSC
> > Genome Browser from NCBI data, I still can't extract the sequence as I
> > mentioned above. Do you have any idea how to extract the sequence I need
> > based on the most current information from NCBI? Thank you.
> > 
> > 
> > Jia Zeng
> > ----- Original Message -----
> > From: "Jennifer Jackson" <[email protected]>
> > To: [email protected]
> > Cc: [email protected]
> > Sent: Tuesday, August 18, 2009 1:25:45 PM GMT -05:00 US/Canada
> > Eastern
> > Subject: Re: [Genome] question about honey bee NCBI gene track
> > 
> > Hello,
> > 
> > This is likely related to updates between our version of the track
> > and the data at NCBI.
> > 
> > For apiMel2, the NCBI Gene Model track's description page notes that:
> > Data last updated: 2005-05-26
> > There may have been updates since then - check at NCBI.
> > 
> > If you want to, the entire current dataset could be extracted from
> > NCBI, formatted as a custom track, and uploaded. Instructions for
> > data formats, custom tracks, and other tools are in our FAQ. If you
> > need help locating something specific, please let us know.
> > 
> > Thanks, Jen
> > 
> > ------------------------------------------------ 
> > Jennifer Jackson 
> > UCSC Genome Bioinformatics Group 
> > 
> > ----- [email protected] wrote:
> > 
> > > From: [email protected]
> > > To: [email protected]
> > > Sent: Tuesday, August 18, 2009 9:18:44 AM GMT -08:00 US/Canada Pacific
> > > Subject: [Genome] question about honey bee NCBI gene track
> > >
> > > Dear Genome:
> > > 
> > > I am using the UCSC Genome Browser to analyze the honey bee
> > > (A. mellifera) genome. For my purpose, I use the NCBI Genes track
> > > under the A. mellifera genome. However, when I manually compare the
> > > sequence for the same RefSeq ID between UCSC and NCBI, I always find
> > > some differences between them. The attached file is a sequence
> > > alignment for XM_392354 between the two resources mentioned above.
> > > My analysis is very sensitive to gene length. Could you tell me why
> > > the same RefSeq gene differs between the two resources, and which
> > > resource is the most current one? Thank you very much
> > > 
> > > Jia Zeng  
> > > _______________________________________________
> > > Genome maillist  -  [email protected]
> > > https://lists.soe.ucsc.edu/mailman/listinfo/genome