Duddy, John wrote:
> I had considered cat, but from what I have read, not all readers understand 
> it, or understand it as a single compressed stream. Since the resulting file 
> would differ from a block gzipped file (embedded gzip headers with filenames, 
> embedded per-file trailers, and the final CRC/size not matching the entire 
> contents) I worry cat would be a brittle solution.
> 
> Also, since AFAIK gzjoin is not part of any installable package (more of an 
> example program), I don't know if it's something that can be addressed by a 
> dependency installation system, unless you host the installations.
> 
> There is stuff that happens on first start, such as copying *example files. 
> Would it be far-fetched to compile the program at that stage?

We probably can't make the assumption that the user has the zlib dev
package with the headers, or even a C compiler, but this is a general
problem that we will need to solve as we implement tight dependency
control.

My initial thought on this is automatic fetching of precompiled
binaries, and if not available for your platform then we can try to
automatically compile.  Although really, I would be surprised if anyone
is running tools on anything other than Mac or Linux.

For the immediate solution, we could probably fetch the binary much like
we do for eggs.

--nate
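
For reference, the "just use cat" idea from the quoted thread can be
checked with a small sketch. It assumes Python's standard gzip module as
the reader; other readers may behave differently, which is exactly the
concern raised above. The helper name concat_gzip_members is made up for
illustration and is not part of Galaxy or gzjoin.

```python
import gzip
import io


def concat_gzip_members(parts):
    """Compress each chunk as its own gzip member and join the raw bytes.

    This mimics `cat a.gz b.gz > joined.gz`: no recompression, each member
    keeps its own header and CRC/size trailer.
    """
    return b"".join(gzip.compress(part) for part in parts)


parts = [b"first file\n", b"second file\n"]
joined = concat_gzip_members(parts)

# gzip.decompress walks every member and returns the concatenated payload.
whole = gzip.decompress(joined)

# The file-like reader also continues past the first member's trailer.
with gzip.GzipFile(fileobj=io.BytesIO(joined)) as fh:
    streamed = fh.read()
```

Both reads return the payloads back to back, so for this reader the
concatenated file behaves like one stream. It says nothing about tools
that stop after the first member.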

> 
> 
> ________________________________________
> From: James Taylor [ja...@jamestaylor.org]
> Sent: Tuesday, May 24, 2011 11:51 PM
> To: Duddy, John
> Cc: galaxy-...@bx.psu.edu Dev
> Subject: Re: [galaxy-dev] Getting binary programs into Galaxy distribution?
> 
> John, I'll take a look at the program. There isn't a great way to do
> this until the dependency installation system is working. A thin
> python wrapper (using Cython) would be the usual trick we would use.
> 
> However: have you considered just using cat? This should be completely
> valid for gzip (at the cost of an extra 15 bytes per source file or so
> for duplicate headers). It looks like gzjoin does require
> decompression of all input data so this trade off may be worthwhile.
> 
> 
> On May 24, 2011, at 3:13 PM, Duddy, John wrote:
> 
> > There is a C program for merging Gzip files (gzjoin) that I’d love
> > to rely on for a core Galaxy capability. Is there a standard way to
> > get things like this included in Galaxy? Recoding it in Python would
> > be a bit of a pain, and might be a lot slower due to the IO layer
> > not allowing the reuse of buffers.
> >
> > Thanks -
> >
> > John Duddy
> > Sr. Staff Software Engineer
> > Illumina, Inc.
> > 9885 Towne Centre Drive
> > San Diego, CA 92121
> > Tel: 858-736-3584
> > E-mail: jdu...@illumina.com
> >
> > ___________________________________________________________
> > Please keep all replies on the list by using "reply all"
> > in your mail client.  To manage your subscriptions to this
> > and other Galaxy lists, please use the interface at:
> >
> >  http://lists.bx.psu.edu/
> 
> 