Christian Hundsrucker wrote:
> Hi all!
> 
> I want to load an R workspace (.rdat file, R-Project) within a Galaxy
> module and therefore built a Galaxy .rdat datatype (binary).
> .rdat files are gzipped and are only recognized by R if they are
> still compressed.
> However, the corresponding .dat file is an uncompressed version of the
> original .rdat file, as I found out using a hex editor.
> I couldn't find any documentation on how to change this behaviour, nor
> answers to similar questions on this list.
> 
> I would be happy for any answer that points me in the right direction.

Hi Christian,

Have a look at the upload tool, tools/data_source/upload.py, which is
where the decompression occurs.  There's a spot where we bypass
decompression for certain formats like BAM, and your rdat datatype
would need the same bypass.
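
As a rough illustration of the kind of check such a bypass could rely on,
here is a minimal sketch. It is not the actual code in upload.py, and
is_gzipped_rdata is a hypothetical helper name; it only assumes, as the
sniffer quoted below does, that a decompressed rdat file starts with the
bytes RDX2.

```python
import gzip

# Signature expected at the start of the decompressed workspace,
# taken from the sniffer in the quoted message.
RDATA_MAGIC = b"RDX2"


def is_gzipped_rdata(path):
    """Hypothetical helper: True if `path` is a gzip file whose
    decompressed content begins with the RDX2 signature."""
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(len(RDATA_MAGIC)) == RDATA_MAGIC
    except (OSError, EOFError):
        # Not a gzip file, unreadable, or truncated.
        return False
```

If a check like this returned True for an uploaded file, the upload code
could leave the file compressed instead of gunzipping it.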

Sorry it's a bit of a hack; eventually the goal is to make this more
pluggable, but this is the solution for now.

--nate

> 
> Details
> #####
> 
> datatypes_conf.xml:
> -----------------------------------
> 
> <?xml version="1.0"?>
> <datatypes>
>     <registration converters_path="lib/galaxy/datatypes/converters"
> display_path="display_applications">
> [...]
>         <datatype extension="rdat" type="galaxy.datatypes.binary:Rdat"
> mimetype="application/octet-stream" display_in_upload="true"/>       
> [...]
>     </registration>
>     <sniffers>
> [...]
>         <sniffer type="galaxy.datatypes.binary:Rdat"/>       
> [...]
>     </sniffers>
> </datatypes>
> 
> 
> binary.py:
> ------------------
> 
> [...]
> class Rdat( Binary ):
>     """Class describing an rdat binary file (R-workspace)"""
>     file_ext = "rdat"
>     #MetadataElement( name="Rdat", desc="R-workspace",
>     #    param=metadata.FileParameter, readonly=True, no_value=None,
>     #    visible=False, optional=True )
> 
>     """
>     def __init__( self, **kwd ):
>         Binary.__init__( self, **kwd )       
>         self._name = "Rdat"
>     """
> 
>     def set_peek( self, dataset, is_multi_byte=False ):
>         if not dataset.dataset.purged:
>             dataset.peek  = "Binary rdat file (R-workspace)"
>             dataset.blurb = data.nice_size( dataset.get_size() )
>         else:
>             dataset.peek = 'file does not exist'
>             dataset.blurb = 'file purged from disk'
>     def display_peek( self, dataset ):
>         try:
>             return dataset.peek
>         except:
>             return "Binary rdat file (%s)" % ( data.nice_size(
> dataset.get_size() ) )
>     def get_mime( self ):
>         """Returns the mime type of the datatype"""
>         return 'application/octet-stream'
>     def sniff( self, filename ):
>         # rdat is compressed in the gzip format, and must not be
>         # uncompressed in Galaxy.
>         # The first 4 bytes of any rdat file are RDX2
>         # (requires the gzip and binascii modules to be imported).
>         try:
>             # Usual case: the file is still gzipped.
>             header = gzip.open( filename ).read(4) #(4)=>4Bytes
>         except:
>             try:
>                 # Fallback: the file has already been uncompressed.
>                 header = open( filename, 'rb' ).read(4) #(4)=>4Bytes
>             except:
>                 return False
>         # check if there is the RDX2 signature
>         return binascii.b2a_hex( header ) == binascii.hexlify( 'RDX2' )
> 
> -- 
> Dr. Christian Hundsrucker
> Institute for Functional Genomics
> Computational Diagnostics Group
> University of Regensburg
> Josef Engertstr. 9 
> 93053 Regensburg, Germany              
> 
> 
> 
> _______________________________________________
> To manage your subscriptions to this and other Galaxy lists, please use the 
> interface at:
> 
>   http://lists.bx.psu.edu/