Hi Arnoldo,

Your question seems to be which gene models are best, and that depends on
the species. Your species seem to be rare enough that they have no native
RefSeq genes. You could map the available ESTs yourself to the gene set of a
closely related species and then use the most-5' one to get the promoter. An
easier option is to try other EST-based gene predictions: NCBI UniGene or
Ensembl might have gene models that are better suited to your problem.
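
Whichever gene set you end up with, picking the most-5' start per gene and
taking the region upstream of it could look roughly like the sketch below.
It assumes refFlat-style tab-separated lines (geneName, name, chrom, strand,
txStart, txEnd, ...) with 0-based, half-open coordinates. The function name
and the 1000 bp default are made up for illustration, and the minus-strand
interval is not clipped to the chromosome length.

```python
def promoter_intervals(refflat_lines, upstream=1000):
    """Map (gene, chrom, strand) -> (start, end) of the upstream region.

    Hypothetical helper: keeps the most-5' transcription start per gene.
    """
    tss = {}
    for line in refflat_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        gene, chrom, strand = fields[0], fields[2], fields[3]
        tx_start, tx_end = int(fields[4]), int(fields[5])
        key = (gene, chrom, strand)
        if strand == "+":
            # On the plus strand the TSS is txStart; most-5' means smallest.
            if key not in tss or tx_start < tss[key]:
                tss[key] = tx_start
        elif strand == "-":
            # On the minus strand the TSS is txEnd; most-5' means largest.
            if key not in tss or tx_end > tss[key]:
                tss[key] = tx_end

    promoters = {}
    for key, pos in tss.items():
        if key[2] == "+":
            promoters[key] = (max(0, pos - upstream), pos)
        else:
            # Assumption: no chromosome sizes at hand, so no clipping here.
            promoters[key] = (pos, pos + upstream)
    return promoters
```

The resulting intervals can then be cut out of the chromosome sequence, and
minus-strand regions need to be reverse-complemented afterwards.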

cheers
Max



On Wed, Oct 6, 2010 at 11:04 PM, Mary Goldman <[email protected]> wrote:

> Hi Arnoldo,
>
> While we do not provide advice on research direction, one of our
> engineers had these comments about your tables and tracks of interest:
>
> "Some species don't have refFlat files because they have no native
> RefSeq mRNAs.
>
> The xeno refseq alignments are done using protein translated
> blat. They have the drawback that they easily align to paralogs
> and pseudogenes. They also tend not to align UTRs very well, and
> hence are not a good indication of transcription start.
>
> TransMap would probably be a better source of genes. They are
> filtered by synteny and do a much better job of aligning UTRs.
> The drawback is the gene alignments are restricted to genes
> from the species with pairwise genomic alignments. I would
> suggest using either the TransMap RefSeq or mRNA
> alignments. TransMap RefSeq would be a cleaner set of data; however,
> the mRNAs would be more comprehensive."
>
> Best,
> Mary
> ---------------------
> Mary Goldman
> UCSC Bioinformatics Group
>
> On 10/5/10 4:08 PM, Arnoldo Jose Muller-Molina wrote:
> > Hello Mary, members of the list:
> >
> > Well, I would like to do some alignments on the different promoters of
> > different species.
> > I am running a phylogenetic foot-printing technique and I want to have
> > as many genes from different species as possible.
> >
> > From what I could gather from the mailing list, xenoRefFlat consists of
> > genes from other species aligned to the organism's genome. This would
> > give me a larger list of genes because some organisms only have 1000+
> > gene annotations in refFlat. Do you think using xenoRefFlat for this
> > purpose makes sense?
> >
> > Regards,
> >
> > Arnoldo Muller
> >
> > On Tue, 2010-10-05 at 15:51 -0700, Mary Goldman wrote:
> >
> >> Hi Arnoldo,
> >>
> >> The answer to your question depends on what you are going to use the
> >> data for. Please keep in mind that the UCSC Genome Browser simply displays
> >> data; it does not say what is or is not acceptable analysis of this data.
> >>
> >> Please feel free to contact the mail list again if you require further
> >> assistance.
> >>
> >> Best,
> >> Mary
> >> ------------------
> >> Mary Goldman
> >> UCSC Bioinformatics Group
> >>
> >> ----- Original Message -----
> >> From: "Arnoldo Jose Muller-Molina" <[email protected]>
> >> To: [email protected]
> >> Sent: Tuesday, October 5, 2010 1:31:56 PM GMT -08:00 US/Canada Pacific
> >> Subject: [Genome] About xenorefflat
> >>
> >> Hello!
> >>
> >> I would like to extract promoter regions of various sizes from different
> >> vertebrates. I am aware that you provide the upstreamXXXX.fa.gz files
> >> but I would like to have them repeatmasked.
> >>
> >> I decided to extract my data directly from the chromosomes using
> >> refFlat.txt files. I have noticed that some organisms have a small
> >> number of entries. Some organisms like the Panda do not have refFlat.txt
> >> at all. Would it be safe to approximate promoter regions with
> >> xenoRefFlat?
> >>
> >> Regards,
> >>
> >>
> >
> >
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>
