Re: [Genome] How to correctly display exon boundaries for a microarray track

Pauline Fujita Wed, 04 May 2011 15:06:08 -0700

Hello again Carlos,

Our developer had this to say about your additional questions:


expData tables have 3 columns:

(from kent/src/hg/lib/expData.sql):

CREATE TABLE expData (
name varchar(255) not null, # Name of gene/target/probe etc.
expCount int unsigned not null, # Number of scores
expScores longblob not null, # Scores. May be absolute or relative ratio
#Indices
INDEX(name)
);

The first column is the unique ID that cross-references the "name" field
in the BED12; bedMergeExpData uses it to join the two tables. The second
column should be the number of arrays (or experiments, if merging
replicates, etc.). Very important: this number should not change from row
to row, i.e. all the rows should have the same number of scores. The third
column (the scores) is a string of comma-separated floating-point numbers.
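
As a minimal sketch of the row layout described above (not part of the
kent source), the following Python formats one tab-separated expData row
per probe and refuses input where the number of scores varies between
rows. The function name and the example scores are hypothetical and only
illustrate the three columns.

```python
from typing import Dict, List, Sequence


def format_exp_data_rows(scores_by_probe: Dict[str, Sequence[float]]) -> List[str]:
    """Return tab-separated expData rows: name, expCount, expScores."""
    rows: List[str] = []
    expected_count = None
    for name, scores in scores_by_probe.items():
        # Every row must carry the same number of scores.
        if expected_count is None:
            expected_count = len(scores)
        elif len(scores) != expected_count:
            raise ValueError(
                f"{name} has {len(scores)} scores, expected {expected_count}"
            )
        # Third column: comma-separated floating-point numbers.
        score_text = ",".join(str(float(s)) for s in scores)
        rows.append(f"{name}\t{len(scores)}\t{score_text}")
    return rows


if __name__ == "__main__":
    # Illustrative values only; real scores come from your own arrays.
    example = {
        "1007_s_at": [1.5, -0.25, 2.0],
        "1053_at": [0.0, 3.25, -1.0],
    }
    for row in format_exp_data_rows(example):
        print(row)
```

Writing these rows to a file, one per line, gives a tab-separated file
with the three columns in the order of the CREATE TABLE statement.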

If you have created a ~/.hg.conf file with MySQL username/password and 
other info, then the following utilities may be used to load BED and 
expData-formatted files into a MySQL database:

1. kent/src/hg/makeDb/hgLoadBed
2. kent/src/hg/makeDb/hgLoadSqlTab (using kent/src/hg/lib/expData.sql)
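
For a scripted pipeline, the sequence could be sketched in Python as
below. This only builds the command lines; it assumes the usual argument
order "hgLoadBed database table file.bed" and "hgLoadSqlTab database
table file.sql file.tab", plus the bedMergeExpData usage quoted further
down. The database, table and file names are hypothetical placeholders,
so check each utility's own usage message before running anything.

```python
from typing import List


def build_load_commands(
    database: str,
    bed_table: str,
    bed_file: str,
    exp_table: str,
    exp_sql: str,
    exp_tab: str,
    merged_bed: str,
) -> List[List[str]]:
    """Return argv lists for loading both tables and then merging them."""
    return [
        # Load the BED12 probe positions into a table.
        ["hgLoadBed", database, bed_table, bed_file],
        # Load the expData rows, creating the table from expData.sql.
        ["hgLoadSqlTab", database, exp_table, exp_sql, exp_tab],
        # Merge: database.expDataTable database.bedTable merged.bed
        [
            "bedMergeExpData",
            f"{database}.{exp_table}",
            f"{database}.{bed_table}",
            merged_bed,
        ],
    ]


if __name__ == "__main__":
    # All names below are placeholders, not real tables.
    commands = build_load_commands(
        "hg19", "myProbes", "probes.bed",
        "myExpData", "expData.sql", "scores.tab", "merged.bed",
    )
    for argv in commands:
        print(" ".join(argv))
```

Each list can be passed to subprocess.run(argv, check=True) once the
printed commands look right for your setup.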

Also note that the ~/.hg.conf file should be similar to the hg.conf file
in the server's cgi-bin directory, except that the personal .hg.conf
should refer to a MySQL user with both read and write access, whereas the
server hg.conf should have read-only access to MySQL (aside from the
hgcentral database). You can read more about how we use the hg.conf file
on this wiki page:

http://genomewiki.ucsc.edu/index.php/Hg.conf
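
A quick way to sanity-check a personal .hg.conf before running the
loaders might look like the sketch below. It assumes the file uses
key=value lines and that the connection settings are named db.host,
db.user and db.password; treat those key names as assumptions and
confirm them against the wiki page above.

```python
from typing import Dict, Iterable, List

# Assumed connection keys; verify against the hg.conf documentation.
REQUIRED_KEYS = ("db.host", "db.user", "db.password")


def parse_hg_conf(lines: Iterable[str]) -> Dict[str, str]:
    """Collect key=value settings, skipping blanks, comments and other lines."""
    settings: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without "=", e.g. include directives
            settings[key.strip()] = value.strip()
    return settings


def missing_keys(settings: Dict[str, str]) -> List[str]:
    """Return the assumed-required keys that are absent from the settings."""
    return [key for key in REQUIRED_KEYS if key not in settings]


if __name__ == "__main__":
    # Placeholder content; a real check would read ~/.hg.conf instead.
    sample = ["# personal settings", "db.host=localhost", "db.user=readwrite"]
    print(missing_keys(parse_hg_conf(sample)))
```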

Best regards,

Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu



On 05/04/11 12:05, Carlos Javier Borroto wrote:
> Hi Pauline,
> 
> This has been extremely helpful. I do have a couple of questions, though.
> 
> When exporting the affyU133Plus2 data to BED I realized that the info
> for chromStarts is not exactly the same as qStarts. I wrongly assumed
> they were the same, which is why taking the values directly from MySQL
> didn't produce the same output. Out of curiosity, and for future
> reference, could you tell me why they differ?
> 
> I have never used Galaxy. I'll try it for this now, but it seems
> bedMergeExpData would fit my needs better, as I would like to have the
> option of running my data through a script pipeline. I compiled the
> utility and I get this from the help:
> $ bedMergeExpData
> bedMergeExpData - Merge probe position information (bed table) with
> an expData table and make a new bed file from that.
> usage:
>    bedMergeExpData database.expDataTable database.bedTable merged.bed
> 
> As the developer pointed out, I need to get the BED data I got from the
> table browser into a MySQL table. That's fine with me, but what about
> the format for expDataTable? Could you point me to any documentation
> with a description?
> 
> Thanks,
> --
> Carlos Borroto
> Baltimore, MD
> 
> 
> 
> On Fri, Apr 29, 2011 at 2:54 AM, Pauline Fujita <[email protected]> wrote:
>> Hello Carlos,
>>
>> One of our developers had this to say about your question:
>>
>> The coordinates you are using from the annotation are indeed not at the
>> precision you want. They're essentially the region of the gene, introns
>> included. You will want to download a BED12 of Affy data and append your
>> data in columns 13-15. The U133+2.0 seems to be based on multiple gene sets:
>> refseq and ensembl, so it's just a matter of making a bed12 for each probe.
>>
>> To do so,  go to the table browser (http://genome.ucsc.edu/cgi-bin/hgTables)
>> and select:
>>
>> clade = Mammal
>> genome = Human
>> assembly = Feb 2009 (hg19), or hg18 if desired
>> group = Expression
>> track = Affy U133Plus2
>> table = affyU133Plus2
>> region = genome
>> identifiers, filter, intersection, correlation = <don't change>
>> output format = BED - browser extensible data
>> output file = something.bed.gz
>> file type returned = gzip compressed
>> ... then click "get output"
>>
>> To append your data to the BED12 you have obtained you can try using the
>> kent source utility bedMergeExpData but be advised that this is designed for
>> working on tables rather than files. Alternatively you might try joining the
>> data using the utilities at Galaxy (http://galaxy.psu.edu/). If you decide
>> to use Galaxy you can take advantage of the "Send output to Galaxy" function
>> in the table browser.
>>
>> Hopefully this information was helpful and answers your question. If you
>> have further questions or require clarification feel free to contact the
>> mailing list at [email protected].
>>
>> Best regards,
>>
>> Pauline Fujita
>>
>> UCSC Genome Bioinformatics Group
>> http://genome.ucsc.edu
>>
>>
>>
>> On 4/20/11 7:46 AM, Carlos Javier Borroto wrote:
>>> Hi,
>>>
>>> I'm working on getting a microarray track into our local mirror,
>>> following directions from the wiki page I was able to do so.
>>>
>>> The coordinates I was using initially were from this file:
>>>
>>> http://www.affymetrix.com/analysis/downloads/na31/ivt/HG-U133_Plus_2.na31.annot.csv.zip
>>>
>>> But these coordinates span very large areas. I would like to display
>>> data only for the exon areas, as in the "GNF Atlas 2" track. I found I
>>> could add "expDrawExons on" to activate this option, but after
>>> selecting it I still don't get the same results. So I focused my
>>> attention on getting blockCount, blockSizes and chromStarts right.
>>> I tried to take the coordinates for the probes from the affyU133Plus2
>>> track, but that didn't do the trick either, as I could see the
>>> coordinates in affyU133Plus2 and gnfAtlas2 aren't exactly the same. My
>>> last resort was to use the coordinates directly from gnfAtlas2, but
>>> that doesn't cover all of the probes we have. Is there a better way to
>>> get this right?
>>>
>>> Thanks for your help,
>>> --
>>> Carlos Borroto
>>> Baltimore, MD
>>> _______________________________________________
>>> Genome maillist  -  [email protected]
>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>>
>>

