Thanks, Mark. The Boeing encryption handles the problem nicely by letting you pass in your encryption "key" when you register the filter with HDF5. It also supports having multiple keys, so conceivably you could allow someone access to some parts of the data but not others. Anyone interested should take a look at the link Gerd sent (http://www.hdfgroup.uiuc.edu/HDF5/projects/boeing/encryption/).
Block encryption itself--such as AES--doesn't change the size of the data at all. However, paired with compression, the order is critical. You absolutely need to compress it first, then encrypt it.

Warm Regards,
Jim

-----Original Message-----
From: Hdf-forum [mailto:[email protected]] On Behalf Of Miller, Mark C.
Sent: Friday, March 21, 2014 11:17 AM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] symmetric encryption filters?

Hmm. This is an interesting discussion. Let me see if I can add two cents...

The HDF5 library allows you to define your own 'filters' which operate on the data in transit as it is written to and read from the file. The filters are just callbacks made from the HDF5 library to your user-defined code to operate on chunks of the dataset as they are emitted underneath the H5Dwrite and H5Dread calls.

If you write data via some user-defined filter, then any reader will need to have access to the code that does the reverse operation (decryption in your case and, of course, any decryption keys). So, there is already implied in this that if you define some 'weird' filter, none of the existing HDF5 tools will be able to read your data (hdfview, h5dump, or third-party applications that read HDF5 like IDL, MATLAB, VisIt, etc.). But, given that you are talking about encryption here, I suspect that such an outcome is actually perfectly fine.

So, only applications that have access to your reader code (decryption filters) will be able to read the data. And why not handle that the way something like ssh does it now? Your reader 'filter' would have to acquire the key from ~/.ssh/id_rsa and then use what it gets to decrypt the chunks getting read during H5Dread. Failure to acquire the key would result in a filter error and ultimately a read error in H5Dread's error stack. You could do some work to detect this case and report a useful error message (e.g. "no appropriate key to read encrypted data").
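As a rough illustration of the ssh-like key handling described above, here is a minimal Python sketch, not HDF5 code. It assumes the key lives in a file outside the data file, and that a failed lookup should surface as a clear error. The names `load_key`, `read_chunk`, `toy_xor`, and `KeyUnavailableError` are hypothetical, and the XOR keystream is only a stand-in for a real cipher such as AES; it is not secure.

```python
import hashlib
import os
import tempfile
from pathlib import Path


class KeyUnavailableError(RuntimeError):
    """Hypothetical error raised when the reader cannot obtain a key."""


def load_key(path):
    """Read key material from a file kept outside the data file (a side channel)."""
    try:
        raw = Path(path).read_bytes()
    except OSError as exc:
        raise KeyUnavailableError("no appropriate key to read encrypted data") from exc
    # Derive a fixed-length key from whatever bytes the key file holds.
    return hashlib.sha256(raw).digest()


def toy_xor(data, key):
    """Toy stand-in for a real cipher such as AES. NOT secure; illustration only."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # XOR is its own inverse, so the same function encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, stream))


def read_chunk(stored, key_path):
    """Reverse 'filter': acquire the key, then decrypt one stored chunk."""
    return toy_xor(stored, load_key(key_path))


with tempfile.TemporaryDirectory() as tmp:
    key_file = os.path.join(tmp, "demo_key")  # stands in for ~/.ssh/id_rsa
    Path(key_file).write_bytes(b"demo key material")

    chunk = b"sensor readings " * 4
    stored = toy_xor(chunk, load_key(key_file))
    roundtrip = read_chunk(stored, key_file)

    try:
        read_chunk(stored, os.path.join(tmp, "missing_key"))
        missing_key_error = None
    except KeyUnavailableError as exc:
        missing_key_error = str(exc)

print(roundtrip == chunk, missing_key_error)
```

In a real filter, the equivalent of `KeyUnavailableError` would have to be reported through whatever failure mechanism the filter callback provides, so that it shows up in the read error the application sees.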
Would you have a single HDF5 file with datasets encrypted for different ids? If so, I think the ssh-like mechanism still works.

Because 'filter' operations apply only to the raw data of a dataset, the metadata is not encrypted. This means things like the names, dimensions, datatypes, etc. (and any attributes defined on the datasets) cannot be encrypted via the 'filter' approach. Perhaps this is why another responder mentioned the introduction of a Virtual File Driver that collects metadata together and encrypts that separately. I could see how that could be important in certain circumstances.

Some other issues are that 'filters' can be applied only when datasets are 'chunked'. And the filters are then applied independently to each chunk. So, what you get for a single dataset is a bunch of chunks, each chunk independently encrypted. So, you don't have the whole dataset encrypted in one fell swoop. I don't think that would cause problems, but I thought I would mention it.

HDF5 can be 'smart' about applying filters and wind up NOT applying a requested filter in circumstances where you tell it the filter is optional. So, you have to take care that HDF5 won't treat your filter that way and wind up skipping an encryption filter it should not have. Just be sure to set up the filters correctly when you define them to HDF5.

Will encryption *increase* the size of the data being written? I don't think it does, but I guess it's always possible depending on what you are doing. If so, HDF5 may not be able to tolerate that. It may expect filtered chunks to be smaller than or equal in size to the unfiltered chunks, and error out (or skip such a filter) if that is not the case. So, just be sure to review the documentation on these details.

I guess this is a long-winded way of saying I think you could make it work within the limitations of some of the issues I mention above. And I think you can invent a way to handle the keys that can probably be made to work.
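To make the per-chunk behavior and the size question concrete, here is a small Python sketch that does not use HDF5. It splits a buffer into chunks and compares compress-then-encrypt against encrypt-then-compress, the ordering issue raised in the reply at the top of this thread. Everything here is an assumption for illustration: `CHUNK_SIZE`, `toy_encrypt`, and the sample data are hypothetical, and the hash-based XOR keystream is a length-preserving stand-in for a real cipher, not AES and not secure. A real block mode that pads its input can add up to one block per chunk.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # hypothetical chunk size in bytes


def toy_encrypt(data, key, chunk_index):
    """Length-preserving toy stream cipher. NOT AES and NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        # Mix the chunk index in so each chunk gets its own keystream.
        block_id = chunk_index.to_bytes(8, "big") + counter.to_bytes(8, "big")
        stream += hashlib.sha256(key + block_id).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


toy_decrypt = toy_encrypt  # XOR with the same keystream undoes itself

key = hashlib.sha256(b"demo key").digest()
dataset = bytes(range(16)) * 1024  # 16 KiB of highly compressible sample data
chunks = [dataset[i:i + CHUNK_SIZE] for i in range(0, len(dataset), CHUNK_SIZE)]

# Each chunk goes through the pipeline independently of the others.
compress_then_encrypt = [
    toy_encrypt(zlib.compress(c), key, i) for i, c in enumerate(chunks)
]
encrypt_then_compress = [
    zlib.compress(toy_encrypt(c, key, i)) for i, c in enumerate(chunks)
]

# Reading reverses the pipeline: decrypt first, then decompress.
restored = b"".join(
    zlib.decompress(toy_decrypt(s, key, i))
    for i, s in enumerate(compress_then_encrypt)
)

size_good = sum(len(s) for s in compress_then_encrypt)
size_bad = sum(len(s) for s in encrypt_then_compress)
print(len(dataset), size_good, size_bad)
```

Encrypted output looks random, so compressing it afterwards gains essentially nothing, while compressing first keeps the savings and leaves the cipher with less data to process.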
Hope that was helpful.

Mark

On 3/21/14 3:23 AM, "huebbe" <[email protected]> wrote:

>While it is possible to perform some encryption in a filter, the filter
>mechanism is not designed for encryption. The problem is the key:
>Filters don't get arbitrary data from the calling application to do the
>decryption, they get only data that is stored in the file. Otherwise,
>the HDF5 library would not be able to do the decoding in a completely
>transparent way. And if you put the key into the file (as filter
>options, or similar), the NSA will be happy.
>
>To use the filter mechanism for encryption, you would need to get the
>key via a side-channel. This is possible, but it will be hard to do
>this in a usable and portable fashion. For instance, you cannot just
>pop up a dialog asking for a key, because many programs using HDF5
>don't even have a text terminal connected to them while they run.
>
>Also note that filtering does not touch the metadata in the file. I. e.
>the NSA will be able to see the entire description of what is encoded
>in the file, they will just not have the actual data.
>
>If you want security, just use gpg to encrypt the entire file.
>
>Cheers,
>Nathanael Hübbe
>
>On 03/21/2014 12:44 AM, Rowe, Jim wrote:
>> Hello has anyone used a symmetric encryption filter with HDF5? I
>> would like to introduce encryption (AES, DES, 3DES) in the pipeline
>> after zlib compression to encrypt some datasets.
>>
>> Any examples, starting points, or suggestions would help.
>>
>> Thanks!
>>
>> --Jim
>>
>> _______________________________________________
>> Hdf-forum is for HDF software users discussion.
>> [email protected]
>> http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
>
>--
>Please be aware that the enemies of your civil rights and your freedom
>are on CC of all unencrypted communication. Protect yourself.
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
