Ryan Golhar wrote:
> Speaking of which, the Wiki site hasn't been update to reflect that
> step 1 uses "hg clone https://bitbucket.org/galaxy/galaxy-dist/";
> 
> and step 5 moves to galaxy-central.  Is this supposed to be the case?

That was just a bit of incorrect output from 'hg incoming', but the
instructions were still correct.  I've fixed the output, thanks for the
heads up.

> 
> 
> On 4/7/11 12:13 PM, Assaf Gordon wrote:
> >Kelly Vincent wrote, On 04/07/2011 11:35 AM:
> >>That opening-the-entire-file inefficiency in our sam_to_bam.py was
> >>fixed quite some time ago (4/30/2010, in 3724:007f175c8b88). It has
> >>used getsize since then.
> >
> >Fixed is one thing, pushed to "dist" is another :)
> >====
> >$ hg clone https://bitbucket.org/galaxy/galaxy-dist galaxy_test2
> >requesting all changes
> >adding changesets
> >adding manifests
> >adding file changes
> >added 5071 changesets with 21893 changes to 4940 files
> >updating to branch default
> >3256 files updated, 0 files merged, 0 files removed, 0 files unresolved
> >$ cd galaxy_test2/
> >$ hg tip
> >changeset:   5070:ca0c4ad2bb39
> >tag:         tip
> >user:        Greg Von Kuster<g...@bx.psu.edu>
> >date:        Wed Feb 16 10:14:24 2011 -0500
> >summary:     When rendering the contents of a data library, make sure a 
> >LibraryDataset has an associated LibraryDatasetDatasetAssociation.  There 
> >should always be an ldda, but some users running their own instances have 
> >reported that some of their LibraryDatasets are missing these associations.
> >$ sed -n '148,152p' tools/samtools/sam_to_bam.py
> >         if returncode != 0:
> >             raise Exception, stderr
> >         if len( open( tmp_aligns_file_name ).read() ) == 0:
> >             raise Exception, 'Initial BAM file empty'
> >     except Exception, e:
> >====
> >It only exists in "galaxy-central", not in "galaxy-dist" which is what you 
> >recommend people to use.
> >
> >Both me and Ryan have this line of code, and I'm sure we've both pulled 
> >since April 2010.
> >
> >>It's really just intended to catch weird errors that don't throw an
> >>actual error (also, I think that if the sam file has no header, no
> >>info would be output into the bam file).
> >
> >Nope - look at my example below - an invalid sam file still generates a 
> >non-empty BAM file.
> >This happens because of the "-t" parameter: samtools takes the header 
> >information from another file and generates the BAM file before even reading 
> >the SAM file.
> >
> >>It seems somewhat more
> >>informative to tell the user the file is empty rather than just
> >>"successfully" output an empty file.
> >>
> >
> >Depends on your definition of "informative" - in this case the error message 
> >was not so informative at all, to a point a traceback was needed...
> >
> >
> >-gordon
> >
> >>
> >>
> >>On Apr 7, 2011, at 1:05 AM, Assaf Gordon wrote:
> >>
> >>>Just another example why python's misleadingly simple idioms are
> >>>quite dangerous in production code (couldn't help myself from
> >>>teasing about python... sorry about that).
> >>>
> >>>Seems like line 150 in "sam_to_bam.py" tries to read the entire
> >>>BAM file into memory just to find out if it's empty or not...
> >>>
> >>>As a stop gap solution with minimal changes, change line 150 from:
> >>>  if len( open( tmp_aligns_file_name ).read() ) == 0: to if len(
> >>>open( tmp_aligns_file_name ).read(10) ) == 0:
> >>>
> >>>Which will read up to the first 10 bytes (instead of the entire
> >>>file).
> >>>
> >>>A slightly better (but still wrong) solution is to simply check
> >>>the file size, with: if os.path.getsize(tmp_aligns_file_name) ==
> >>>0:
> >>>
> >>>But it's still wrong because even an invalid sam file will create
> >>>a non-empty BAM file (when using "samtools view -bt") - the BAM
> >>>file will still contain the chromosome names and sizes.
> >>>
> >>>Example: ======== $ cat mm9.fa.fai chr1    197195432       6 50
> >>>51 chr10   129993255       201139354       50      51 chr11
> >>>121843856       333732482       50      51 chr12   121257530
> >>>458013223       50      51 chr13   120284312       581695911 50
> >>>51 chr13_random    400311  704385924       50      51 chr14
> >>>125194864       704794249       50      51 chr15   103494974
> >>>832493018       50      51 ... ...
> >>>
> >>>$ cat 1.sam Hello World This is not a SAM file
> >>>
> >>>$ samtools view -bt mm9.fa.fai -o 1.bam 1.sam [sam_header_read2]
> >>>35 sequences loaded. [sam_read1] reference 'This is not a SAM file'
> >>>is recognized as '*'. [main_samview] truncated file.
> >>>
> >>>$ ls -l 1.* -rw-r--r-- 1 gordon hannon 348 Apr  7 00:57 1.bam
> >>>-rw-r--r-- 1 gordon hannon  35 Apr  7 00:57 1.sam ========
> >>>
> >>>
> >>>So in short, this whole sam-to-bam wrapper tool is not suitable
> >>>for large SAM files (if they don't fit entirely in memory), and not
> >>>for error checking of invalid SAM files.
> >>>
> >>>
> >>>-gordon
> >>>
> >>>
> >>>On 04/07/2011 12:30 AM, Ryan Golhar wrote:
> >>>>Here's what I get:
> >>>>
> >>>>(galaxy_env)[galaxy@vail pbs]$ sh ./big.sh Samtools Version:
> >>>>0.1.14 (r933:170) Traceback (most recent call last): File
> >>>>"/home/galaxy/galaxy-dist/tools/samtools/sam_to_bam.py", line
> >>>>150, in __main__ if len( open( tmp_aligns_file_name ).read() )
> >>>>== 0: MemoryError Error extracting alignments from
> >>>>(/home/galaxy/galaxy-dist/database/files/000/dataset_785.dat),
> >>>>(galaxy_env)[galaxy@vail pbs]$
> >>>>
> >>>>
> >>>>
> >>>>On 4/6/11 7:29 PM, Assaf Gordon wrote:
> >>>>>Ryan,
> >>>>>
> >>>>>Since we're shooting in the dark here, best to try and
> >>>>>understand what's the exception.
> >>>>>
> >>>>>Add the following line to the beginning of "sam_to_bam.py":
> >>>>>import traceback
> >>>>>
> >>>>>and add the following line to "sam_to_bam.py" line 156 (before
> >>>>>the call to "stop_err"): traceback.print_exc()
> >>>>>
> >>>>>Hopefully this will print out which exception you're getting,
> >>>>>and where is it thrown from.
> >>>>>
> >>>>>-gordon
> >>>>>
> >>>___________________________________________________________ Please
> >>>keep all replies on the list by using "reply all" in your mail
> >>>client.  To manage your subscriptions to this and other Galaxy
> >>>lists, please use the interface at:
> >>>
> >>>http://lists.bx.psu.edu/
> >>
> >
> >___________________________________________________________
> >Please keep all replies on the list by using "reply all"
> >in your mail client.  To manage your subscriptions to this
> >and other Galaxy lists, please use the interface at:
> >
> >   http://lists.bx.psu.edu/
> >
> 
> -- 
> CONFIDENTIALITY NOTICE: This email communication may contain
> private, confidential, or legally privileged information intended
> for the sole use of the designated and/or duly authorized
> recipient(s). If you are not the intended recipient or have received
> this email in error, please notify the sender immediately by email
> and permanently delete all copies of this email including all
> attachments without reading them. If you are the intended recipient,
> secure the contents in a manner that conforms to all applicable
> state and/or federal requirements related to privacy and
> confidentiality of such information.

> begin:vcard
> fn:Ryan Golhar, Ph.D.
> n:Golhar;Ryan
> org:The Cancer Institute of NJ;Cancer Informatics Core/Bioinformatics
> adr:5th floor;;120 Albany St;New Brunswick;NJ;08901;USA
> email;internet:golha...@umdnj.edu
> title:NGS Bioinformatics Specialist
> tel;work:(732) 235-6613
> tel;fax:(732) 235-6267
> tel;cell:(732) 236-1176
> x-mozilla-html:FALSE
> url:http://www.cinj.org
> version:2.1
> end:vcard
> 

> ___________________________________________________________
> Please keep all replies on the list by using "reply all"
> in your mail client.  To manage your subscriptions to this
> and other Galaxy lists, please use the interface at:
> 
>   http://lists.bx.psu.edu/

___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

Reply via email to