On Thu, Feb 16, 2012 at 10:47 AM, Peter Cock <p.j.a.c...@googlemail.com> wrote:
> On Wed, Feb 15, 2012 at 6:07 PM, Dannon Baker <dannonba...@me.com> wrote:
>> Good luck, let me know how it goes, and again - contributions are certainly
>> welcome :)
>
> I think I found the first bug: the split method of class Sequence in
> lib/galaxy/datatypes/sequence.py assumes four lines per sequence. This
> would make
> sense as the split method of the Fastq class (after grooming to remove
> any line wrapping) but is a very bad idea on most sequence file formats
> (e.g. FASTA).
>
> It looks like a little refactoring is needed, defining a Sequence split method
> which raises not implemented, and moving the current code to the Fastq
> class, then writing something similar but allowing multiple lines per record
> for the Fasta class.
>
> Does that sound reasonable? I'll do this on a new branch for review...

I have refactored the split method in lib/galaxy/datatypes/sequence.py here:
https://bitbucket.org/peterjc/galaxy-central/changeset/762777618073
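
To illustrate the difference between the two cases, here is a rough
standalone sketch. It is not the code from the changeset above, and the
function names (iter_fastq_records, iter_fasta_records, chunk_records)
are made up for this example rather than taken from Galaxy. It assumes
the FASTQ input has been groomed so that each record is exactly four
lines, while FASTA records may wrap over any number of lines:

```python
import io


def iter_fastq_records(handle):
    """Yield FASTQ records, assuming exactly four lines each (no wrapping)."""
    while True:
        lines = [handle.readline() for _ in range(4)]
        if not lines[0]:
            return  # clean end of file
        if not lines[3] or not lines[0].startswith("@"):
            raise ValueError("Truncated or malformed FASTQ record")
        yield "".join(lines)


def iter_fasta_records(handle):
    """Yield FASTA records; each starts at a '>' line and may span many lines."""
    record = []
    for line in handle:
        if line.startswith(">"):
            if record:
                yield "".join(record)
            record = [line]
        elif record:
            record.append(line)
        # Lines before the first '>' header are ignored in this sketch.
    if record:
        yield "".join(record)


def chunk_records(records, chunk_size):
    """Group records into lists of up to chunk_size, one list per split file."""
    chunk = []
    for rec in records:
        chunk.append(rec)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


# Hypothetical usage: split a three-record FASTA into chunks of two records.
fasta = io.StringIO(">a\nACGT\nACGT\n>b\nGG\n>c\nTT\nTT\nTT\n")
for chunk in chunk_records(iter_fasta_records(fasta), 2):
    print(len(chunk), "records")
```

Counting four lines at a time on the FASTA input above would cut record
"c" in the middle, which is why the format-specific code belongs in the
Fastq and Fasta classes rather than in Sequence.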

This is part of a work-in-progress "split_blast" branch to try splitting
BLAST jobs, for which I will need to split FASTA files as inputs, and
also merge BLAST XML output:
https://bitbucket.org/peterjc/galaxy-central/src/split_blast

Peter