I recently downloaded human expression data via the UCSC Genome Table Browser using the following query parameters: Mammal, human, Assembly: Feb 2009 (GRCh37/hg19), Group: Expression, Track: GNFAtlas2, Table: hgFixed.gnfHumanAtlas2All. I chose this table because I wanted all available replicates for each probe.

However, the output file is very difficult to understand. Data are available for 44775 probes (as expected). Each probe has a corresponding 'hgFixed.gnfHumanAtlas2All.expCount' value of 158, suggesting there should be 158 expression values per probe, and the column headed 'hgFixed.gnfHumanAtlas2All.expScores' does in fact contain 158 comma-separated absolute expression values. However, I am not able to obtain the EXP ids (i.e., tissue names) associated with each of the 158 expression values in the sequence. I have attempted to obtain those expression IDs in several ways, by selecting different associated tables to join and seemingly relevant variables, to no avail.

Even more confusingly, when I select from the associated table gnfHumanAtlas2MedianExps the variables 'hgFixed.gnfHumanAtlas2AllExps.id' and 'hgFixed.gnfHumanAtlas2AllExps.name', which would seem to be the desired information, I get a series of comma-separated EXP ids and the corresponding tissue names (e.g., 112 and Pancreas, respectively), but there are generally not 158 entries in each of these cells, and many probes have 'n/a' in both columns.

For example, probe '1007_s_at' has the following associated data:

hgFixed.gnfHumanAtlas2All.expCount='158'
hgFixed.gnfHumanAtlas2All.expScores='3621,3212,1078,1130,475,408,375,528,...' (158 distinct values, comma-separated)
hgFixed.gnfHumanAtlas2AllExps.id='112'
hgFixed.gnfHumanAtlas2AllExps.name='Pancreas'

While probe '117_at' gives:

hgFixed.gnfHumanAtlas2All.expCount='158'
hgFixed.gnfHumanAtlas2All.expScores='338,277,2383,2456,617,423,...' (158 comma-separated values)
hgFixed.gnfHumanAtlas2AllExps.id='52,74,75,85,94,96,98,112,121,127,129,137,'
hgFixed.gnfHumanAtlas2AllExps.name='cerebellum,CingulateCortex,CingulateCortex 2,Lung 2,Uterus,Thyroid,fetalThyroid,Pancreas,TestisGermCell 2,salivarygland 2,trachea 2,skin 2,'

Since the number of expression values listed under 'hgFixed.gnfHumanAtlas2All.expScores' does not correspond to the number of expression IDs and names listed under 'hgFixed.gnfHumanAtlas2AllExps.id' and 'hgFixed.gnfHumanAtlas2AllExps.name', respectively, how is one supposed to figure out which tissue each of the 158 expression scores corresponds to?

--
Kathleen Askland, MD
Assistant Professor
Department of Psychiatry & Human Behavior
Warren Alpert School of Medicine
Brown University/Butler Hospital

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
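
The mismatch described in this message can be made concrete with a short Python sketch. It assumes each Table Browser cell is a plain comma-separated string, possibly ending in a trailing comma, and that 'n/a' marks an empty cell, as in the examples quoted above. The helper names split_list and check_row are hypothetical. Because the expScores cell for probe '117_at' is truncated in the message, a placeholder of 158 dummy values stands in for it. The sketch only counts entries; it does not determine which tissue each score belongs to.

```python
def split_list(cell):
    """Split a comma-separated cell into items.

    Empty cells and 'n/a' give an empty list; the empty item left by a
    trailing comma is dropped.
    """
    if cell is None or cell.strip() in ("", "n/a"):
        return []
    return [item.strip() for item in cell.split(",") if item.strip()]


def check_row(exp_count, exp_scores, exp_ids, exp_names):
    """Compare the number of scores with expCount and with the joined id/name cells."""
    scores = split_list(exp_scores)
    ids = split_list(exp_ids)
    names = split_list(exp_names)
    return {
        "n_scores": len(scores),
        "scores_match_count": len(scores) == int(exp_count),
        "n_ids": len(ids),
        "n_names": len(names),
        "labels_cover_scores": len(ids) == len(scores) and len(names) == len(scores),
    }


# id and name cells for probe '117_at', copied from the message.
ids_117 = "52,74,75,85,94,96,98,112,121,127,129,137,"
names_117 = (
    "cerebellum,CingulateCortex,CingulateCortex 2,Lung 2,Uterus,Thyroid,"
    "fetalThyroid,Pancreas,TestisGermCell 2,salivarygland 2,trachea 2,skin 2,"
)

# ASSUMPTION: placeholder only. The real expScores cell is truncated in the
# message, so 158 dummy zeros are used purely to exercise the counting logic.
placeholder_scores = ",".join(["0"] * 158)

report = check_row("158", placeholder_scores, ids_117, names_117)
print(report)
```

With the quoted cells, the joined id and name columns supply far fewer labels than there are scores, which is the discrepancy the question is about.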
