I recently downloaded human expression data via the UCSC Genome Table
Browser using the following query parameters: Mammal, Human, Assembly:
Feb. 2009 (GRCh37/hg19), Group: Expression, Track: GNFAtlas2, Table:
hgFixed.gnfHumanAtlas2All. I chose this table because I wanted all
available replicates for each probe.

However, the output file is very difficult to interpret. Data are
available for 44775 probes (as expected). Each probe has a
corresponding 'hgFixed.gnfHumanAtlas2All.expCount' value of 158,
suggesting there should be 158 expression values per probe, and the
column headed 'hgFixed.gnfHumanAtlas2All.expScores' does in fact
contain 158 comma-separated absolute expression values.
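
For reference, this is roughly how I checked that the number of
values matches expCount. It is only a minimal Python sketch: the
function name parse_exp_scores is my own, and the example row is
shortened and made up (real rows have 158 values).

```python
def parse_exp_scores(exp_count, exp_scores):
    """Split a comma-separated expScores string and check its length."""
    # Skip empty fields so a trailing comma does not add a phantom value.
    values = [float(v) for v in exp_scores.split(",") if v.strip()]
    if len(values) != int(exp_count):
        raise ValueError(
            f"expected {int(exp_count)} values, found {len(values)}"
        )
    return values


# Shortened, made-up row for illustration only.
scores = parse_exp_scores("4", "3621,3212,1078,1130,")
print(len(scores))
```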

However, I am not able to obtain the EXP ids (i.e., tissue names)
associated with each of the 158 expression values in the sequence. How
is one supposed to figure out which tissue each of the 158 expression
scores corresponds to?

I have attempted to obtain those expression IDs in several ways, by
selecting different associated tables to join and seemingly relevant
variables, to no avail. Even more confusingly, when I select from the
associated table gnfHumanAtlas2MedianExps the variables
'hgFixed.gnfHumanAtlas2AllExps.id' and
'hgFixed.gnfHumanAtlas2AllExps.name', which would seem to be the
desired information, I get a series of comma-separated EXP ids and the
corresponding tissue names (e.g., 112 and Pancreas, respectively).
However, there are generally not 158 entries in each of these cells,
and many probes have 'n/a' in both columns.

So, for example, probe '1007_s_at' has the following associated data:
hgFixed.gnfHumanAtlas2All.expCount='158',
hgFixed.gnfHumanAtlas2All.expScores=
'3621,3212,1078,1130,475,408,375,528,...' (158 distinct values
comma-separated)
hgFixed.gnfHumanAtlas2AllExps.id= '112'
hgFixed.gnfHumanAtlas2AllExps.name='Pancreas'

While probe '117_at' gives:
hgFixed.gnfHumanAtlas2All.expCount='158',
hgFixed.gnfHumanAtlas2All.expScores=
'338,277,2383,2456,617,423,...'(158 comma-separated values)
hgFixed.gnfHumanAtlas2AllExps.id= '52,74,75,85,94,96,98,112,121,127,129,137,'
hgFixed.gnfHumanAtlas2AllExps.name='cerebellum,CingulateCortex,CingulateCortex
2,Lung 2,Uterus,Thyroid,fetalThyroid,Pancreas,TestisGermCell
2,salivarygland 2,trachea 2,skin 2,'

Since the number of expression values listed under
'hgFixed.gnfHumanAtlas2All.expScores' does not match the number of
expression IDs or names listed under
'hgFixed.gnfHumanAtlas2AllExps.id' and
'hgFixed.gnfHumanAtlas2AllExps.name', how is one supposed to figure
out which tissue each of the 158 expression scores corresponds to?
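
To make the question concrete, here is a minimal Python sketch of what
I am hoping to end up with. It rests on two assumptions I have not
been able to confirm: that the score at position i in expScores
belongs to the experiment with id i (counting from zero), and that a
complete id-to-name lookup can be obtained separately. The function
name label_scores, the lookup dictionary, and the tissue names in the
example are all hypothetical.

```python
def label_scores(exp_scores, id_to_name):
    """Pair each score with a tissue name by its position in the list.

    ASSUMPTION (unconfirmed): position i corresponds to experiment id i.
    """
    values = [v for v in exp_scores.split(",") if v.strip()]
    labeled = []
    for position, value in enumerate(values):
        # Fall back to "unknown" when the lookup has no entry for this id.
        name = id_to_name.get(position, "unknown")
        labeled.append((position, name, float(value)))
    return labeled


# Made-up lookup and shortened score string, for illustration only.
toy_lookup = {0: "tissue_a", 1: "tissue_b"}
for exp_id, tissue, score in label_scores("3621,3212,1078,", toy_lookup):
    print(exp_id, tissue, score)
```

If the positional assumption is wrong, the labels produced this way
would be wrong as well, which is exactly why I would like to know how
the ordering is defined.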

-- 
Kathleen Askland, MD
Assistant Professor
Department of Psychiatry & Human Behavior
Warren Alpert School of Medicine
Brown University/Butler Hospital