Hi Jennifer, Thank you for these suggestions. I just tried one of the honey bee genes from NCBI and it worked: I could extract the genomic sequence based on the mRNA sequence from NCBI. Since I have a list of around 10,000 genes, I think it would be very slow to run them through the web-based browser, so I should definitely run BLAT in batch mode. I have access to a Unix cluster, but I have never run BLAT on it before. In order to search against the honey bee genome, do I have to connect to the relational database at UCSC first, or do I have to download the database myself? Can you give me some instructions for that, or is there any documentation I can refer to? Thank you
Jia

----- Original Message -----
From: "Jennifer Jackson" <[email protected]>
To: [email protected]
Cc: [email protected]
Sent: Tuesday, August 18, 2009 3:18:58 PM GMT -05:00 US/Canada Eastern
Subject: Re: [Genome] question about honey bee NCBI gene track

Hello again,

Very glad that helped to resolve the discrepancies.

One option is to run BLAT in batch mode with the new sequences and save the output in BED or PSL format. Or, obtain similar alignment data from NCBI and format it. With either, the results can be loaded as a custom track. Using the Table Browser, select the custom track and output "sequence". All of the regular options will come up.

File formats are described in detail in the FAQ. There are also some utilities to transform files in the code tree. I am not sure if you have programming/Unix resources or capability. We can provide specific pointers to Help/FAQ documents if you are interested in these tools and need help locating them, but our team cannot actually do the programming/code tree installation for you.

Write back and let us know if and how we can help more if this is the type of analysis you wish to pursue,
Jennifer

------------------------------------------------
Jennifer Jackson
UCSC Genome Bioinformatics Group

----- [email protected] wrote:

> From: [email protected]
> To: "Jennifer Jackson" <[email protected]>
> Cc: [email protected]
> Sent: Tuesday, August 18, 2009 11:33:46 AM GMT -08:00 US/Canada Pacific
> Subject: Re: [Genome] question about honey bee NCBI gene track
>
> Dear Jennifer:
>
> Thank you so much for your quick response. I think these are helpful,
> and I just checked the NCBI ftp site to find the most current
> honey bee RNA sequence, which was updated in October 2006.
>
> However, I still have a problem I need your help with. In my analysis, I
> want to extract the genomic sequence (which includes both introns and
> exons) for a bunch of genes. I know in the UCSC Genome Browser it is very
> easy to do that because there are many options to select, such as 5' UTR
> Exons, 3' UTR Exons, CDS Exons, and Introns. But at NCBI I can only
> get the mRNA sequence for a certain RefSeq ID instead of the whole
> gene sequence. I think if I make a custom track in the UCSC Genome Browser
> from NCBI, I still can't extract the sequence as I mentioned above.
> Do you have any idea how to extract the sequence I need based on
> the most current information from NCBI? Thank you.
>
>
> Jia Zeng
> ----- Original Message -----
> From: "Jennifer Jackson" <[email protected]>
> To: [email protected]
> Cc: [email protected]
> Sent: Tuesday, August 18, 2009 1:25:45 PM GMT -05:00 US/Canada Eastern
> Subject: Re: [Genome] question about honey bee NCBI gene track
>
> Hello,
>
> This is likely related to updates between our version of the track and
> the data at NCBI.
>
> For apiMel2, the NCBI Gene Model track's description page notes that:
> Data last updated: 2005-05-26
> There may have been updates since then - check at NCBI.
>
> If you want to, the entire current dataset could be extracted from
> NCBI, formatted as a custom track, and uploaded. Instructions for data
> formats, custom tracks, and other tools are in our FAQ. If you need
> help locating something specific, please let us know.
>
> Thanks, Jen
>
> ------------------------------------------------
> Jennifer Jackson
> UCSC Genome Bioinformatics Group
>
> ----- [email protected] wrote:
>
> > From: [email protected]
> > To: [email protected]
> > Sent: Tuesday, August 18, 2009 9:18:44 AM GMT -08:00 US/Canada Pacific
> > Subject: [Genome] question about honey bee NCBI gene track
> >
> > Dear Genome:
> >
> > I am using the UCSC Genome Browser to do some analysis of the honey
> > bee (A. mellifera) genome. For my purpose, I use the NCBI Genes track
> > under the A. mellifera genome. However, when I manually check the
> > sequence for the same RefSeq ID between UCSC and NCBI, I find there are
> > always some differences between them. The attached file is a sequence
> > alignment for XM_392354 between the resources I mentioned above. My
> > analysis is very sensitive to the length of the gene. Could you tell me
> > why there is a difference between the same RefSeq gene from different
> > resources and which resource is the most current one? Thank you very
> > much
> >
> > Jia Zeng
> > _______________________________________________
> > Genome maillist - [email protected]
> > https://lists.soe.ucsc.edu/mailman/listinfo/genome
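
For the batch route discussed in this thread (run BLAT on the cluster, save PSL, then load the results as a BED custom track), the pieces might be sketched roughly as below. This is only an illustration under stated assumptions, not the UCSC team's procedure: the standalone `blat` program takes a local genome file (.2bit, .nib, or FASTA) as its database argument, so the sketch assumes the honey bee assembly has been downloaded to the cluster. The helper names, the batch size, and every file name are hypothetical, and the PSL-to-BED step handles only untranslated alignments (single-character strand) and writes a placeholder score of 0.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def batches(records, size):
    """Group records into lists of at most `size`, e.g. one list per cluster job."""
    batch = []
    for rec in records:
        batch.append(rec)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def blat_command(genome_file, query_fa, out_psl):
    """Build an argv list following BLAT's usage: blat database query [options] output.psl.

    -q=rna treats the query as mRNA; -noHead omits the PSL header lines.
    All three paths are placeholders supplied by the caller.
    """
    return ["blat", genome_file, query_fa, "-q=rna", "-noHead", out_psl]
    # subprocess.run(cmd, check=True) on the returned list would launch the job.


def psl_to_bed12(psl_line):
    """Convert one header-less, untranslated PSL line to a BED12 line."""
    f = psl_line.rstrip("\n").split("\t")
    if len(f) != 21:
        raise ValueError("expected 21 tab-separated PSL fields")
    strand, q_name = f[8], f[9]
    if strand not in ("+", "-"):
        raise ValueError("translated PSL (two-character strand) is not handled")
    t_name, t_start, t_end = f[13], int(f[15]), int(f[16])
    block_sizes = [int(x) for x in f[18].split(",") if x]
    t_starts = [int(x) for x in f[20].split(",") if x]
    # BED block starts are relative to chromStart; PSL tStarts are absolute.
    block_starts = [s - t_start for s in t_starts]
    fields = [
        t_name, str(t_start), str(t_end), q_name,
        "0",                          # placeholder score (assumption)
        strand,
        str(t_start), str(t_end),     # thickStart / thickEnd
        "0",                          # itemRgb
        str(len(block_sizes)),
        ",".join(map(str, block_sizes)),
        ",".join(map(str, block_starts)),
    ]
    return "\t".join(fields)
```

One possible use: split the mRNA FASTA file with `read_fasta` and `batches`, write each batch to its own query file, submit one `blat_command(...)` per batch to the cluster, then run `psl_to_bed12` over the combined PSL output and upload the resulting BED lines as a custom track.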
