Thanks for the clarification.

If that's the case though, perhaps it would be good to reconsider the
rationale of the track. Considering that RefSeq is chiefly an annotation and
curation project, it doesn't make sense to me that you would take only the
transcript sequences from RefSeq, then re-align and re-annotate them - and
then call the resulting track "RefSeq", since it doesn't actually agree with
RefSeq itself. Transcript sequences are a dime a dozen; it's the curation
and annotation processes that distinguish one database's set of genes from
another. So when a user sees a track called "RefSeq" (or whatever else),
they would expect the information there to represent that database, not a
reworking of the database's raw data.
That's my suggestion, anyway.
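
For anyone who wants to check how often an accession ends up with more than
one placement, here is a minimal sketch. It assumes a tab-separated export
with the columns name, chrom, strand, txStart and txEnd (my assumption about
what a refGene-style dump would contain). The accessions and coordinates
below are made-up placeholders, not real RefSeq records, and the function
name multi_mapped is just for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows: name, chrom, strand, txStart, txEnd.
# These are placeholders for illustration, not real alignments.
SAMPLE = (
    "NM_EXAMPLE1\tchr1\t+\t1000\t5000\n"
    "NM_EXAMPLE1\tchr7\t-\t200\t4200\n"
    "NM_EXAMPLE2\tchr2\t+\t300\t900\n"
)


def multi_mapped(rows):
    """Return {accession: [placements]} for accessions with more than one placement."""
    placements = defaultdict(list)
    for name, chrom, strand, start, end in rows:
        placements[name].append((chrom, strand, int(start), int(end)))
    return {name: p for name, p in placements.items() if len(p) > 1}


rows = list(csv.reader(io.StringIO(SAMPLE), delimiter="\t"))
result = multi_mapped(rows)

for name, places in sorted(result.items()):
    # Also flag the case where the placements are on different strands.
    strands = {strand for _, strand, _, _ in places}
    print(name, places, "opposite strands" if len(strands) > 1 else "")
```

On real data you would read the rows from the exported file instead of
SAMPLE; the grouping logic stays the same.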


On Mon, Nov 29, 2010 at 9:58 PM, Brooke Rhead <[email protected]> wrote:

> Hi Maayan,
>
> One of our engineers has offered this further explanation:
>
> The UCSC RefGene track contains BLAT alignments of the RefSeq mRNA and RNA
> entries. These RefSeq entries are transcript sequences, not genomic
> annotations, and are independent of any given assembly. The UCSC RefGene
> alignments are analogous to, but not the same as, the genomic mappings of
> these transcripts produced by NCBI. NCBI uses a different alignment process
> than UCSC, and the processes don't always agree.
>
> --
> Brooke Rhead
> UCSC Genome Bioinformatics Group
>
>
>
> On 11/24/10 19:34, maayan kreitzman wrote:
>
>>  Hi All,
>>
>> The explanation supplied is not adequate.
>> The RefSeq project supplies information on transcripts that are unique -
>> and at NCBI (which created RefSeq), indeed, there is only ONE record per
>> accession. (Try a simple search in Entrez.) Indeed, the RefSeq project
>> often supplies multiple accessions for the same or similar loci with
>> various splices. That's the whole point. It's a conservative approach -
>> one name, one transcript.
>> There is a mistake in the adaptation of their database to yours. Your
>> explanation makes no sense unless you went and did all the alignments and
>> selection from scratch - and if that's the case, why would you call it a
>> RefSeq track?
>>
>> maayan
>>
>>
>> On Thu, Nov 25, 2010 at 12:08 AM, Pauline Fujita <[email protected]> wrote:
>>
>>> Hello Maayan,
>>>
>>> Please see this previously answered mailing list question about the same
>>> issue:
>>>
>>> https://lists.soe.ucsc.edu/pipermail/genome/2010-November/024242.html
>>>
>>> Hopefully this information was helpful and answers your question. If you
>>> have further questions or require clarification feel free to contact the
>>> mailing list at [email protected].
>>>
>>> Regards,
>>>
>>> Pauline Fujita
>>> UCSC Genome Bioinformatics Group
>>> http://genome.ucsc.edu
>>>
>>>
>>>
>>> On 11/24/10 01:08, maayan kreitzman wrote:
>>>
>>>> Hi there,
>>>> I've found a fairly serious problem with your database, which is based
>>>> on the RefSeq project.
>>>> Many of the RefSeq accessions, when queried from the genome browser,
>>>> return more than one gene, IN COMPLETELY DIFFERENT LOCATIONS.
>>>> If you search, say, NM_198181, this is the case. Sometimes, as in the
>>>> case of NM_020364, the different entries are even on opposite strands.
>>>> If you want a longer list of examples like this, I can send you some
>>>> more.
>>>> The mistake is somewhere in the conversion from the RefSeq database to
>>>> your software, because if you search the same accessions in Entrez you
>>>> get, as expected, ONE gene.
>>>> RefSeq documents specific, unique, verified transcripts. There should
>>>> not be more than one set of coordinates for each RefSeq accession.
>>>> maayan
>>>> _______________________________________________
>>>> Genome maillist  -  [email protected]
>>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome