Yes, I think this should work as I have seen it work for another binary type I
made before. See below:
import zipfile

class FileSet( Binary ):
    """FileSet containing N files"""
    file_ext = "prims.fileset.zip"
    blurb = "(zipped) FileSet containing multiple files"

    def sniff( self, filename ):
        # Return True if the zip file contains multiple files, False otherwise
        zf = zipfile.ZipFile(filename)
        return len(zf.namelist()) > 1
# the if is just for backwards compatibility...could remove this at some point
if hasattr(Binary, 'register_sniffable_binary_format'):
Now the question I have is: what would be good logic to use in the sniff
method? I need something that uniquely distinguishes this zipped file from
other zip files, right? In the example above I solved this by checking
whether the zip file has multiple files inside and returning True if this
is the case. With RData, does this mean I have to parse the binary
contents and come up with a good heuristic/rule? Just wondering if someone
has already thought about such a rule, specifically for RData.
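One possible heuristic, offered as a sketch rather than a tested Galaxy datatype: files written by R's save() start with a short magic header once any compression is removed ("RDX2\n" for the default binary format, "RDA2\n" for ascii=TRUE, and a "3" instead of the "2" in newer R versions). The function name looks_like_rdata and the constants below are hypothetical names chosen for illustration, and the code assumes Python 3 with only the standard library.

```python
import bz2
import gzip
import lzma
import zlib

# Magic headers written by R's save(): RDX = XDR binary (default), RDA = ASCII.
# Format version 2 is the long-standing one; version 3 appears in newer R releases.
RDATA_MAGICS = (b"RDX2\n", b"RDX3\n", b"RDA2\n", b"RDA3\n")

# save() may compress with gzip (default), bzip2 or xz; map file signatures to openers.
COMPRESSION_SIGNATURES = (
    (b"\x1f\x8b", gzip.open),
    (b"BZh", bz2.open),
    (b"\xfd7zXZ\x00", lzma.open),
)


def looks_like_rdata(filename):
    """Return True if the (possibly compressed) file starts with an RData magic header."""
    with open(filename, "rb") as fh:
        head = fh.read(6)

    # Fall back to a plain open() when no compression signature matches.
    opener = open
    for signature, candidate in COMPRESSION_SIGNATURES:
        if head.startswith(signature):
            opener = candidate
            break

    try:
        with opener(filename, "rb") as fh:
            return fh.read(5) in RDATA_MAGICS
    except (OSError, EOFError, zlib.error, lzma.LZMAError):
        # Corrupt or truncated compressed data: not something we can identify.
        return False
```

A sniff( self, filename ) method on a Binary subclass could then simply return looks_like_rdata(filename). Only the first few bytes are decompressed, so the check should stay cheap even for large workspaces.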
From: John Chilton [mailto:jmchil...@gmail.com]
Sent: Thursday, 23 October 2014 3:02
To: Lukasse, Pieter
Subject: Re: [galaxy-dev] strange issue with .RData files
Sorry, I am swamped right now, so I don't have time to dig into this in detail,
but I have encountered this before with datatypes that are compressed
(zipped, gzipped, etc.). Galaxy will attempt to decompress them in order to
figure out what they are. I believe this is what is happening to your data. If
you register the type as a sniffable binary it looks like it should skip the
- unless I am reading this logic wrong in tools/data_source/upload.py
E.g. like bam datatypes:
class Bam( Binary ):
Binary.register_sniffable_binary_format("bam", "bam", Bam)
Have you registered a sniffable binary datatype for RData?
On Wed, Oct 22, 2014 at 9:38 AM, Lukasse, Pieter <pieter.luka...@wur.nl> wrote:
> When I upload any .RData file to my Galaxy server it seems to be
> unpacking/changing it. The resulting file in my history is different
> and around 2x larger than the uploaded file. The tool that needs to
> use it also aborts with an error due to this erroneous file.
> What are the workarounds?
> Pieter Lukasse
> Wageningen UR, Plant Research International
> Department of Bioinformatics (Bioscience)
> Wageningen Campus, Building 107, Droevendaalsesteeg 1, 6708 PB,
> Wageningen, the Netherlands
> T: +31-317481122;
> M: +31-628189540;
> skype: pieter.lukasse.wur
> Please keep all replies on the list by using "reply all"
> in your mail client. To manage your subscriptions to this and other
> Galaxy lists, please use the interface at:
> To search Galaxy mailing lists use the unified search at: