Hello Maximilian, Mary:
Thank you for your helpful insights!
Regards,
Arnoldo
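
For reference, the promoter extraction discussed in this thread can be sketched as below. This is only a minimal sketch, assuming the usual refFlat column order (geneName, name, chrom, strand, txStart, txEnd, ...) with 0-based, half-open coordinates. The function name promoter_interval, the 1000 bp default, and the optional chrom_sizes dictionary are illustrative choices, not something taken from the UCSC tools. The same idea should apply to a xenoRefFlat or TransMap-derived table with the same layout, subject to the caveats about transcription starts raised in the replies.

```python
def promoter_interval(refflat_line, upstream=1000, chrom_sizes=None):
    """Return (chrom, start, end, strand, name) for the region upstream of the TSS.

    Assumes refFlat column order: geneName, name, chrom, strand, txStart, txEnd, ...
    Coordinates are treated as 0-based, half-open.
    """
    fields = refflat_line.rstrip("\n").split("\t")
    name, chrom, strand = fields[1], fields[2], fields[3]
    tx_start, tx_end = int(fields[4]), int(fields[5])

    if strand == "+":
        # Transcription starts at txStart; the promoter lies before it.
        start, end = tx_start - upstream, tx_start
    elif strand == "-":
        # Transcription starts at txEnd; the promoter lies after it.
        start, end = tx_end, tx_end + upstream
    else:
        raise ValueError("unexpected strand: %r" % strand)

    # Clamp to the chromosome; the upper bound is only applied
    # when the (hypothetical) chrom_sizes mapping knows this chromosome.
    start = max(start, 0)
    if chrom_sizes is not None and chrom in chrom_sizes:
        end = min(end, chrom_sizes[chrom])
    return chrom, start, end, strand, name
```

The returned interval can then be cut out of the chromosome sequence; minus-strand regions would still need to be reverse-complemented before alignment.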
On Thu, 2010-10-07 at 14:20 +0100, Maximilian Haussler wrote:
> Hi Arnoldo,
>
>
> Your question seems to be about which gene models are best. This
> depends on the species. Your species seem to be so rare that they
> don't have RefSeq genes. You could map the available ESTs yourself to
> a gene set of a close species and then use the most-5' one to get the
> promoter. An easier option is to try other EST-based gene
> predictions. NCBI UniGene or Ensembl might have gene models that are
> better suited to your problem.
>
>
> cheers
> Max
>
>
>
> On Wed, Oct 6, 2010 at 11:04 PM, Mary Goldman <[email protected]>
> wrote:
> Hi Arnoldo,
>
> While we do not provide advice on research direction, one of our
> engineers had these comments about your tables and tracks of
> interest:
>
> "Some species don't have refFlat files because they have no native
> RefSeq mRNAs.
>
> The xeno RefSeq alignments are done using protein-translated
> blat. They have the drawback that they easily align to paralogs
> and pseudogenes. They also tend not to align UTRs very well, and
> hence are not a good indication of transcription start.
>
> TransMap would probably be a better source of genes. They are
> filtered by synteny and do a much better job of aligning UTRs.
> The drawback is the gene alignments are restricted to genes
> from the species with pairwise genomic alignments. I would
> suggest using either the TransMap RefSeq or mRNA alignments.
> TransMap RefSeq would be a cleaner set of data; however,
> mRNAs would be more comprehensive."
>
> Best,
> Mary
> ---------------------
> Mary Goldman
> UCSC Bioinformatics Group
>
>
>
> On 10/5/10 4:08 PM, Arnoldo Jose Muller-Molina wrote:
> > Hello Mary, members of the list:
> >
> > Well, I would like to do some alignments on the different promoters of
> > different species.
> > I am running a phylogenetic foot-printing technique and I want to have
> > as many genes from different species as possible.
> >
> > From what I could gather from the mailing list, xenoRefFlat consists of
> > genes from other species aligned to the organism. This would
> > give me a larger list of genes because some organisms only have 1000+
> > gene annotations in refFlat. Do you think using xenoRefFlat for this
> > purpose makes sense?
> >
> > Regards,
> >
> > Arnoldo Muller
> >
> > On Tue, 2010-10-05 at 15:51 -0700, Mary Goldman wrote:
> >
> >> Hi Arnoldo,
> >>
> >> The answer to your question depends on what you are going to use the
> >> data for. Please keep in mind that the UCSC Genome Browser simply
> >> displays data; it does not say what is or is not acceptable analysis
> >> of this data.
> >>
> >> Please feel free to contact the mail list again if you require further
> >> assistance.
> >>
> >> Best,
> >> Mary
> >> ------------------
> >> Mary Goldman
> >> UCSC Bioinformatics Group
> >>
> >> ----- Original Message -----
> >> From: "Arnoldo Jose Muller-Molina"<[email protected]>
> >> To: [email protected]
> >> Sent: Tuesday, October 5, 2010 1:31:56 PM GMT -08:00 US/Canada Pacific
> >> Subject: [Genome] About xenorefflat
> >>
> >> Hello!
> >>
> >> I would like to extract promoter regions of various sizes from different
> >> vertebrates. I am aware that you provide the upstreamXXXX.fa.gz files
> >> but I would like to have them repeatmasked.
> >>
> >> I decided to extract my data directly from the chromosomes using
> >> refFlat.txt files. I have noticed that some organisms have a small
> >> number of entries. Some organisms like the Panda do not have refFlat.txt
> >> at all. Would it be safe to approximate promoter regions with
> >> xenoRefFlat?
> >>
> >> Regards,
> >>
> >>
> >
> >
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>
>
>