Hi Angie,

Thanks for the advice.

I'll convert to bigBed if the current setup doesn't suffice ... I already stuffed `I`s in the QUAL column, so that's fine, but the NNN's in the sequence do look a bit wonky.

Out of curiosity, how does "the general public" know when a new release of your software goes live?

Also -- thanks for fixing the issue so quickly.

-steve

On Fri, Jul 29, 2011 at 8:11 PM, Angie Hinrichs <[email protected]> wrote:
> Hi Steve,
>
> I have fixed the code, so '*' should be handled correctly on our main site
> after the next software release (2.5-3 weeks).
>
> Using NNNN's as the sequence might cause some strange display effects in the
> browser (I expect all bases would be marked as mismatches). If your user is
> going to click on an item to see item details, the page will terminate early
> when it tries to display missing qual scores, so you might want to stuff some
> characters there too (at least for the next few weeks), if you continue to
> use the BAM for viewing.
>
> Since BED is already a UCSC Genome Browser custom track format, why not send
> the GEO bed directly to the browser? You could submit a track line followed
> by a GEO URL(s), if there is a stable URL for the bed file(s). Some GEO
> short-read files are so large that they may time out the upload, but it seems
> like it's worth a shot because of our code's alignment-centric display of
> BAM. (Of course, bigBed would work great and avoid the bulk upload time
> issue.)
>
> Angie
>
> ----- Original Message -----
> From: "Steve Lianoglou" <[email protected]>
> To: "Angie Hinrichs" <[email protected]>
> Cc: [email protected]
> Sent: Friday, July 29, 2011 4:41:50 PM
> Subject: Re: [Genome] Mimimal BAM files is tripping up browser
>
> Hi Angie,
>
> Thanks for the quick response.
>
> I converted the bed to bam because I wanted to use some data I pulled
> down from GEO that is supposed to represent alignments, but was only
> given as bed files.
>
> The tools I've written to deal with NGS data work off of bam files, so
> I thought the easiest thing to do was to convert it to a bam file and
> move along -- and my collaborator likes to look at data through the
> "UCSC Genome Browser lens", which is where I got tripped up.
>
> Converting the data to bigBed didn't even cross my mind, really ... I
> regenerated bam files from the bed files, but put NNN's in the SEQ
> column for now.
>
> Thanks for the help,
>
> -steve
>
> On Fri, Jul 29, 2011 at 7:20 PM, Angie Hinrichs <[email protected]> wrote:
>> Hi Steve,
>>
>> You're right -- that is legal BAM, but our code expects to see a query
>> sequence that corresponds to the CIGAR (in this case, it wants 32 bases
>> corresponding to "32M"). The code assumes that BAM contains sequence
>> alignments. Thanks for reporting this, I will work on a fix.
>>
>> In the meantime, can you tell us why you converted bed to bam? The bigBed
>> format (http://genome.ucsc.edu/goldenPath/help/bigBed.html) was designed for
>> large remote BED tracks, and might be a better solution (at least for
>> viewing in the genome browser :).
>>
>> Thanks,
>> Angie
>>
>> ----- Original Message -----
>> From: "Steve Lianoglou" <[email protected]>
>> To: [email protected]
>> Sent: Friday, July 29, 2011 2:48:45 PM
>> Subject: Re: [Genome] Mimimal BAM files is tripping up browser
>>
>> As a follow up to this -- I think the "*" in either the sequence or
>> quality columns (columns 10 or 11) of a BAM file is what's causing the
>> error ...
>>
>> -steve
>>
>> On Fri, Jul 29, 2011 at 4:59 PM, Steve Lianoglou
>> <[email protected]> wrote:
>>> Hi,
>>>
>>> [sorry, I sent this email to the genome-mirror list previously, so I'm
>>> resending here]
>>>
>>> I've converted some bed files to bam files using bedtools.
>>>
>>> The alignments in the bam file look like so:
>>>
>>> * 16 chr1 13517 255 32M * 0 0 * *
>>> * 16 chr1 16275 255 32M * 0 0 * *
>>> * 16 chr1 16458 255 32M * 0 0 * *
>>> * 16 chr1 16461 255 32M * 0 0 * *
>>>
>>> When I add the bam file as a custom track and try to hop to the
>>> genome, I get this error:
>>>
>>> baseColorDrawSetup: *: mRNA size (0) != psl qSize (32)
>>>
>>> I've put this BAM file online to help smoke this problem for testing
>>> purposes, which you can add as a custom track like so:
>>>
>>> track type="bam" name="test-bam" bigDataUrl="http://cbio.mskcc.org/leslielab/files/ucsc/test.bam" genome="hg19" visibility="squish"
>>>
>>> I'm guessing that the genome browser doesn't like "*" in the QNAME
>>> (1st) column of the BAM file (or maybe one of the "*"s in another
>>> column(?)).
>>>
>>> It's easy enough for me to change whatever column to a bogus value to
>>> fix this (some points as to which column that would be are welcome),
>>> but as far as I can tell this is a valid bam file. Each column that
>>> has a "*" is allowed to do so according to the spec:
>>>
>>> http://samtools.sourceforge.net/SAM1.pdf
>>>
>>> While this is easy enough for me to fix on my end, I thought it would
>>> be worth reporting since it seems like a (somehow minor) bug on your
>>> side as well (assuming such a bam file is valid, of course).
>>>
>>> Thanks,
>>> -steve
>>>
>>> --
>>> Steve Lianoglou
>>> Graduate Student: Computational Systems Biology
>>> | Memorial Sloan-Kettering Cancer Center
>>> | Weill Medical College of Cornell University
>>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>>
>>
>> _______________________________________________
>> Genome maillist - [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
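
For anyone hitting the same error: the workaround described in this thread (replacing "*" in the SEQ and QUAL columns with filler whose length matches the CIGAR) can be sketched roughly as below. This is only an illustration that works on SAM text lines, not the code either poster actually used. The function names `query_length` and `fill_placeholders` are made up for this example, and the filler characters ("N" for bases, "I" for qualities, which is Phred 40 in the usual Phred+33 encoding) simply follow what was mentioned above.

```python
import re

# CIGAR operations that consume query bases according to the SAM spec.
QUERY_CONSUMING_OPS = set("MIS=X")
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")


def query_length(cigar):
    """Return the number of query bases implied by a CIGAR string."""
    return sum(int(count) for count, op in CIGAR_TOKEN.findall(cigar)
               if op in QUERY_CONSUMING_OPS)


def fill_placeholders(sam_line, base="N", qual="I"):
    """Replace '*' in SEQ (column 10) and QUAL (column 11) with filler.

    Header lines, lines with fewer than 11 columns, and records without
    a CIGAR are returned unchanged.
    """
    if sam_line.startswith("@"):
        return sam_line
    ending = "\n" if sam_line.endswith("\n") else ""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11 or fields[5] == "*":
        return sam_line
    if fields[9] == "*":
        # Hypothetical filler: one 'N' per query base implied by the CIGAR.
        fields[9] = base * query_length(fields[5])
    if fields[10] == "*":
        # Qualities must be the same length as the sequence.
        fields[10] = qual * len(fields[9])
    return "\t".join(fields) + ending
```

One way to use it, assuming samtools is installed, would be to dump the BAM to SAM text with `samtools view -h`, pass each line through `fill_placeholders`, and convert the result back with `samtools view -b`. As Angie notes, bases filled in this way will probably all be drawn as mismatches, so bigBed is likely the better route if the file is only needed for viewing.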
