Thanks, Mark.  The Boeing encryption handles the problem nicely by letting you 
pass in your encryption "key" when you register the filter with HDF5.  It also 
supports having multiple keys, so conceivably you could allow someone access to 
parts of the data, but not others.  Anyone interested should take a look at the 
link Gerd sent (http://www.hdfgroup.uiuc.edu/HDF5/projects/boeing/encryption/).

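Block encryption itself, such as AES, doesn't change the size of the data, apart 
from any padding up to the cipher's block size.  However, paired with 
compression, the order is critical.  You absolutely need to compress first, then 
encrypt: well-encrypted output looks random, so it will not compress afterwards.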
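
To make the ordering point concrete, here is a minimal sketch using only the Python standard library. The cipher is a toy SHA-256 keystream XOR that I made up as a stand-in (it is NOT AES and not the Boeing filter); it is only there to produce random-looking output so the two orderings can be compared.

```python
import hashlib
import zlib


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream (illustration only, NOT AES)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # zip() stops at the shorter input, so the extra keystream bytes are ignored.
    return bytes(a ^ b for a, b in zip(data, stream))


key = b"demo key, not a real secret"
raw = b"temperature=21.5;pressure=1013;" * 2000  # highly repetitive sample data

compress_then_encrypt = toy_encrypt(key, zlib.compress(raw))
encrypt_then_compress = zlib.compress(toy_encrypt(key, raw))

# Sizes: original, compress-then-encrypt, encrypt-then-compress.
print(len(raw), len(compress_then_encrypt), len(encrypt_then_compress))
```

Compress-then-encrypt should come out far smaller than the input, while encrypt-then-compress stays at roughly the original size, because zlib finds nothing to exploit in the scrambled bytes.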


Warm Regards,
Jim

-----Original Message-----
From: Hdf-forum [mailto:[email protected]] On Behalf Of 
Miller, Mark C.
Sent: Friday, March 21, 2014 11:17 AM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] symmetric encryption filters?

Hmm. This is an interesting discussion. Let me see if I can add two cents...

The HDF5 library allows you to define your own 'filters' which operate on the 
data in-transit as it is written to and read from the file. The filters are 
just call backs made from the HDF5 library to your user-defined code to operate 
on chunks of the dataset as they are emitted underneath the H5Dwrite and 
H5Dread calls.

If you write data via some user-defined filter, then any reader will need 
access to the code that does the reverse operation (decryption, in your case) 
and, of course, to any decryption keys. So, it is already implied that if you 
define some 'weird' filter, none of the existing HDF5 tools will be able to read 
your data (hdfview, h5dump, or third-party applications that read HDF5, like 
IDL, MATLAB, VisIt, etc.). But, given that you are talking about encryption 
here, I suspect that such an outcome is actually perfectly fine.

So, only applications that have access to your reader code (decryption filters) 
will be able to read the data.

And, why not handle that the way something like ssh does it now? Your reader 
'filter' would have to acquire the key from ~/.ssh/id_rsa and then use what it 
gets to decrypt the chunks getting read during H5Dread. Failure to acquire the 
key would result in a filter error and ultimately a read error in H5Dread's 
error stack. You could do some work to detect this case and report a useful 
error message (e.g. "no appropriate key to read encrypted data").
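
As a rough sketch of that key-acquisition step (plain Python, not HDF5 filter code): the names `load_key` and `KeyUnavailableError`, and the assumption that the key is a raw 32-byte file, are hypothetical choices for illustration. A real filter would do the equivalent in its callback and report failure back to the library.

```python
from pathlib import Path


class KeyUnavailableError(RuntimeError):
    """Raised when the decryption key cannot be obtained."""


def load_key(path: Path, expected_len: int = 32) -> bytes:
    """Read a raw symmetric key from `path`, failing with a clear message."""
    try:
        key = path.read_bytes()
    except OSError as exc:
        # Missing file, bad permissions, etc.: report something a user can act on.
        raise KeyUnavailableError(
            f"no appropriate key to read encrypted data ({path}: {exc.strerror})"
        ) from exc
    if len(key) != expected_len:
        raise KeyUnavailableError(
            f"key in {path} has {len(key)} bytes, expected {expected_len}"
        )
    return key
```

The reader would call something like `load_key(Path.home() / ".ssh" / "my_hdf5_key")` once (the file name here is made up) and turn a `KeyUnavailableError` into the filter failure that surfaces in H5Dread's error stack.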

Would you have a single HDF5 file with datasets encrypted for different ids?
If so, I think the ssh-like mechanism still works.

Because 'filter' operations apply only to the raw data of a dataset, the 
metadata is not encrypted. This means things like the names, dimensions, 
datatypes, etc. (and any attributes defined on the datasets) cannot be encrypted 
via the 'filter' approach. Perhaps this is why another responder mentioned the 
introduction of a Virtual File Driver that collects metadata together and 
encrypts that separately. I could see how that could be important in certain 
circumstances.

Some other issues are that 'filters' can be applied only when datasets are 
'chunked', and the filters are then applied independently to each chunk. So, 
what you get for a single dataset is a bunch of chunks, each chunk independently 
encrypted. You don't have the whole dataset encrypted in one fell swoop. I don't 
think that would cause problems but thought I would mention it.
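
To picture what "each chunk independently encrypted" means, here is a small standard-library sketch. It only models the idea and does not call HDF5; the cipher is a made-up SHA-256 keystream XOR (NOT AES), and tying the keystream to the chunk index is my own assumption so that identical chunks do not produce identical ciphertext.

```python
import hashlib
from typing import List


def keystream_xor(key: bytes, chunk_index: int, chunk: bytes) -> bytes:
    """Toy cipher (NOT AES): XOR with a SHA-256 keystream tied to the chunk index."""
    stream = bytearray()
    block = 0
    while len(stream) < len(chunk):
        seed = key + chunk_index.to_bytes(8, "big") + block.to_bytes(8, "big")
        stream += hashlib.sha256(seed).digest()
        block += 1
    return bytes(a ^ b for a, b in zip(chunk, stream))


def split_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Cut data into fixed-size pieces, the way a chunked dataset is stored."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


key = b"demo key"
data = bytes(range(256)) * 4          # 1024 bytes of sample data
chunks = split_chunks(data, 256)      # 4 chunks
stored = [keystream_xor(key, i, c) for i, c in enumerate(chunks)]

# Reading chunk 2 needs only that chunk's stored bytes, its index, and the key.
recovered = keystream_xor(key, 2, stored[2])
```

Nothing about chunks 0, 1, or 3 is needed to recover chunk 2, which is why partial reads of an encrypted chunked dataset can still work.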

HDF5 can be 'smart' about applying filters and wind up NOT applying a requested 
filter in circumstances where you tell it the filter is optional. So, you have 
to take care that HDF5 won't treat your filter that way and wind up skipping an 
encryption filter it should not have. Just be sure to set up the filters 
correctly when you define them to HDF5 (that is, do not add the filter to the 
pipeline with the H5Z_FLAG_OPTIONAL flag in H5Pset_filter).

Will encryption *increase* the size of the data being written? I don't think it 
does, but I guess it's always possible depending on what you are doing. If so, 
HDF5 may not be able to tolerate that. It may expect filtered chunks to be no 
larger than the un-filtered chunks and error out (or skip such a filter) if that 
is not the case. So, just be sure to review the documentation on these details.
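
For a sense of scale, here is a back-of-the-envelope sketch of how much a padded block cipher could grow a chunk. It assumes PKCS#7 padding and a 16-byte IV stored next to the ciphertext, which is one common layout and not necessarily what any particular filter does; the function names are made up for the example.

```python
AES_BLOCK_BYTES = 16  # AES block size is 128 bits regardless of key length


def pkcs7_padded_len(n: int, block: int = AES_BLOCK_BYTES) -> int:
    """Length of n bytes after PKCS#7 padding, which always adds 1..block bytes."""
    return (n // block + 1) * block


def stored_len(n: int, iv_bytes: int = 16) -> int:
    """Padded ciphertext plus an IV kept alongside it (assumed layout)."""
    return iv_bytes + pkcs7_padded_len(n)


for chunk_bytes in (1000, 1024, 65536):
    print(chunk_bytes, pkcs7_padded_len(chunk_bytes), stored_len(chunk_bytes))
```

Under these assumptions the growth is at most 32 bytes per chunk, so it is small but not zero; a mode that needs no padding would avoid even that.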

I guess this is a long-winded way of saying I think you could make it work 
within the limitations of some of the issues I mention above. And, I think you 
can invent a way to handle the keys that can probably be made to work.

Hope that was helpful.

Mark


On 3/21/14 3:23 AM, "huebbe" <[email protected]>
wrote:

>While it is possible to perform some encryption in a filter, the filter 
>mechanism is not designed for encryption. The problem is the key:
>Filters don't get arbitrary data from the calling application to do the 
>decryption, they get only data that is stored in the file. Otherwise, 
>the HDF5 library would not be able to do the decoding in a completely 
>transparent way. And if you put the key into the file (as filter 
>options, or similar), the NSA will be happy.
>
>To use the filter mechanism for encryption, you would need to get the 
>key via a side-channel. This is possible, but it will be hard to do 
>this in a usable and portable fashion. For instance, you cannot just 
>pop up a dialog asking for a key, because many programs using HDF5 
>don't even have a text terminal connected to them while they run.
>
>Also note that filtering does not touch the metadata in the file, i.e. 
>the NSA will be able to see the entire description of what is encoded 
>in the file; they will just not have the actual data.
>
>If you want security, just use gpg to encrypt the entire file.
>
>Cheers,
>Nathanael Hübbe
>
>
>
>On 03/21/2014 12:44 AM, Rowe, Jim wrote:
>> Hello - has anyone used a symmetric encryption filter with HDF5?  I 
>> would like to introduce encryption (AES, DES, 3DES) in the pipeline 
>> after zlib compression to encrypt some datasets.
>> 
>> Any examples, starting points, or suggestions would help.
>> 
>> Thanks!
>> 
>> --Jim
>> 
>> _______________________________________________
>> Hdf-forum is for HDF software users discussion.
>> [email protected]
>> 
>> http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
>> 
>
>
>--
>Please be aware that the enemies of your civil rights and your freedom 
>are on CC of all unencrypted communication. Protect yourself.
>

