Hello,

On Monday 08 March 2010 11:11:19 Alex Ehrlich wrote:
> Hello,
>
> I am in the way of implementing EFS (Encrypted File System) support on
> Windows platform.

That is a *very* important project.  It is definitely something that we need.

>
> The current state of code is "proof of concept". It allows to back up
> encrypted files and restore them to NTFS *if* information about
> encryption is hardcoded (filenames hardcoded), so the very basic flow
> kind of works. 

No problem. That is a good way to start.

> The base source code is 5.0.0 (neither the latest 5.0.1 
> nor development) as I wanted something "stable enough" as a baseline to
> add my two cents to and I started before 5.0.1; I believe it is easy to
> merge the changes into the latest versions when they are completed.

Yes, merging into the current code should not be a problem.  In any case, it 
is something that we have done *many* times.

> I attach the changed files for your consideration in full, not in diff
> format (as I found no suitable diff tool right now), they are packed by
> 7-zip (www.7-zip.org) as sourceforge lists block zip attachments iirc.
>
> Currently, I've added
>    bool use_efs_api;
> to win32-struct BFILE in the same way as "bool use_backup_api" is
> defined. It is not very nice (maybe a switch {normal, backupapi, efsapi}
> should be used instead) but at least it works -- with minimal changes to
> the rest of code. Please forgive me the fact that my code changes are
> pretty ugly -- I am not a C/C++ developer.

OK, I will take a look ...
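
For reference, the "switch" alternative mentioned above could look roughly like the sketch below. It is only an illustration: BfileApiMode, BfileSketch and bfile_choose_mode are hypothetical names, not the actual BFILE code, and the rule that an encrypted file always takes the EFS path is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical three-way mode replacing the two booleans
// (use_backup_api, use_efs_api). Names are illustrative only.
enum BfileApiMode {
   BF_API_NORMAL,   // ordinary file I/O
   BF_API_BACKUP,   // BackupRead()/BackupWrite()
   BF_API_EFS       // raw EFS access (e.g. ReadEncryptedFileRaw())
};

// Cut-down stand-in for the win32 BFILE struct; the real struct has more fields.
struct BfileSketch {
   BfileApiMode api_mode;
};

// Same value as FILE_ATTRIBUTE_ENCRYPTED in the Windows SDK, repeated here
// so the sketch builds without <windows.h>.
const unsigned long SKETCH_ATTR_ENCRYPTED = 0x4000;

// Assumption: an encrypted file always goes through the EFS path, and the
// backup API is used for everything else when it is available.
void bfile_choose_mode(BfileSketch *bfd, unsigned long win_attrs,
                       bool backup_api_available)
{
   assert(bfd != NULL);
   if (win_attrs & SKETCH_ATTR_ENCRYPTED) {
      bfd->api_mode = BF_API_EFS;
   } else if (backup_api_available) {
      bfd->api_mode = BF_API_BACKUP;
   } else {
      bfd->api_mode = BF_API_NORMAL;
   }
}
```

A single enum keeps the modes mutually exclusive, which two independent booleans cannot guarantee.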

>
> Ramifications made *currently* for EFS-encrypted files are:
> * no compression: compressing encrypted data is of no good usually,
> however, for small files compressing the EFS header could result into
> reasonable compression rate)
> * no encryption (by means of bacula): the file contents are already
> encrypted, but the file header (EFS stream) still needs encryption
> * no sparse support: while sparse files can be EFS-encrypted, sparse
> files could still be handled as "regular files filled by zeros"
> * no portable: see "other missing parts" below
> All these have been made for the sake of code simplicity only and I
> would like to get rid of them, see #3 below.
>
> The data stream type for EFS-encrypted files is still STREAM_WIN32_DATA.
> Is it OK or shall a new stream data type be added?

Yes, we could add STREAM_WIN32_EFS_DATA.

>
> There are 3 main outstanding issues where I hope to get advice from
> architects.
>
> (1) How to get "encrypted" attribute of the file being backed up from
> "stat" functions (in src/win32/compat/compat.cpp) up to bopen/backup
> level (FF_PKT->BFILE) -- as those stat() functions are the right places
> to read it? 

It will most likely need a special call to a Win API, or we could get the 
information and return it in a new field in the stat packet.

> Currently src/findlib/find_one.c calls those stat functions, 
> but the only output data is "structure stat" that is "very much minimal
> unix info". 

Yes, that is for the Catalog only.  On the Volume, we have *all* information 
that is needed.

> Can I just add more attributes to this structure? (I got 
> access violation when I tried it in an "easy way"; there are lots of
> defines in the headers about the stat structure). 

It is not so easy to add new fields to the structure, but it is possible.  
Most of the structure is "compressed" by converting it to base 64 prior to 
putting it in the catalog.

> Or, alternatively, can 
> I use some other attribute (like sb->st_rdev that is currently used to
> keep reparse/mount point flag)? Or what would be the "right" way? Again,
> in the current "proof of concept" this is hardcoded based on sample
> filenames in backup.c/restore.c.

There is really no "right" way.  It is just a matter of finding the 
right "kludge" to put it in.  Adding a new bit to st_rdev is a very 
reasonable idea, but I need to look at the code first.
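
A minimal sketch of the st_rdev idea follows. It assumes st_rdev can be treated as a set of bit flags; the existing reparse/mount point encoding in compat.cpp may use plain values instead, so this needs checking before reuse. StatSketch and the SKETCH_* names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical flag bits packed into st_rdev. The first bit stands in for
// whatever the reparse/mount point marker really is.
const uint32_t SKETCH_RDEV_REPARSE   = 1u << 0;
const uint32_t SKETCH_RDEV_ENCRYPTED = 1u << 1;   // the new bit

// Same value as FILE_ATTRIBUTE_ENCRYPTED in the Windows SDK, repeated here
// so the sketch builds without <windows.h>.
const uint32_t SKETCH_ATTR_ENCRYPTED = 0x4000;

// Cut-down stand-in for the stat structure filled in by the stat() functions.
struct StatSketch {
   uint32_t st_rdev;
};

// Called where the Windows attributes are already known (the stat() level).
// Sets or clears only the new bit and leaves the other bits alone.
void set_efs_flag(StatSketch *sb, uint32_t win_attrs)
{
   assert(sb != NULL);
   if (win_attrs & SKETCH_ATTR_ENCRYPTED) {
      sb->st_rdev |= SKETCH_RDEV_ENCRYPTED;
   } else {
      sb->st_rdev &= ~SKETCH_RDEV_ENCRYPTED;
   }
}

// Called at the bopen/backup level to decide how the file must be opened.
bool is_efs_encrypted(const StatSketch *sb)
{
   assert(sb != NULL);
   return (sb->st_rdev & SKETCH_RDEV_ENCRYPTED) != 0;
}
```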

>
> (2) How to store (on backup) and read (on restore) the *new*
> "efs-encrypted" attribute of a file? This attribute does not fit into
> extended attributes imo as it shall be available "very early on
> backup/restore" -- at the moment of bopen(), since the way of opening
> file depends on this attribute. So how to add a new "basic-level"
> attribute?

A new stream type is defined in src/baconfig.h, and the File daemon then uses 
it as it wants.  In general, only the File daemon understands the contents of 
a particular stream.  The exception is STREAM_UNIX_ATTRIBUTES, which is sent 
to the Director (in addition to going to the Volume) to be written into the 
catalog.
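
As a rough illustration of how a dedicated stream could carry the "EFS" information, here is a sketch. The numeric ids are placeholders, not the values in src/baconfig.h, and all names are hypothetical. Whether the stream id is seen early enough on restore to influence bopen() is an assumption that has to be checked against restore.c.

```cpp
#include <cassert>

// Placeholder ids for illustration only; the real stream numbers live in
// src/baconfig.h and must be chosen there.
const int SKETCH_STREAM_WIN32_DATA     = 9001;
const int SKETCH_STREAM_WIN32_EFS_DATA = 9002;

enum OpenModeSketch { OPEN_BACKUP_API, OPEN_EFS_RAW };

// Backup side: the File daemon picks the stream from the file's attribute.
int select_data_stream(bool file_is_efs_encrypted)
{
   return file_is_efs_encrypted ? SKETCH_STREAM_WIN32_EFS_DATA
                                : SKETCH_STREAM_WIN32_DATA;
}

// Restore side: the stream id alone says how the target must be opened,
// so no extra per-file attribute is needed in this sketch.
OpenModeSketch open_mode_for_stream(int stream)
{
   // This sketch only knows the two streams defined above.
   assert(stream == SKETCH_STREAM_WIN32_DATA ||
          stream == SKETCH_STREAM_WIN32_EFS_DATA);
   return stream == SKETCH_STREAM_WIN32_EFS_DATA ? OPEN_EFS_RAW
                                                 : OPEN_BACKUP_API;
}
```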

>
> (3) The contents of the main data sending loop in src/filed/backup.c
> send_data()
>     while ((sd->msglen=(uint32_t)bread(&ff_pkt->bfd, rbuf, rsize)) > 0) {
>       ...
>       }
> should be rewritten into a separate function imo that gets a block of
> data and processes it. In this case callback-based processing EFS data
> could be easily fit into this "process data block" function logic: EFS
> callback to be an alternative to this "while bread" loop. Any opinion?

Normally, I would expect the same loop to be used, but a different stream 
would be set.  I need to look at the code though.

> So far I've made a "cut-down" copy-paste of the while loop contents to
> be called by EFS processing -- only sending data to SD, skipping all
> kinds of compression/encryption/etc.

Yes, we will need to look at that. 

The code you refer to needs to be rewritten.  It started as one or two 
different types of processing and now has grown enormously, and is too 
complicated.  We need to parameterize it and turn it into either classes or 
some sort of "filters" that could be stacked on each other.  Doing so will 
simplify it a lot and permit easier addition of new functionality such as 7zip 
compression.
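
To make the "stacked filters" idea a bit more concrete, here is a small sketch. It is not the send_data() code. Filter, Pipeline and feed_in_chunks are hypothetical names, and the two filters are trivial stand-ins for real compression or encryption stages.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<unsigned char> Block;

// One processing stage (compression, encryption, ...). Hypothetical interface.
struct Filter {
   virtual ~Filter() {}
   virtual Block process(const Block &in) = 0;
};

// Stand-in for an encryption stage: XOR every byte with a key.
struct XorFilter : Filter {
   unsigned char key;
   explicit XorFilter(unsigned char k) : key(k) {}
   Block process(const Block &in) {
      Block out(in);
      for (size_t i = 0; i < out.size(); i++) {
         out[i] ^= key;
      }
      return out;
   }
};

// Stand-in for a framing stage: prepend a one-byte tag.
struct TagFilter : Filter {
   unsigned char tag;
   explicit TagFilter(unsigned char t) : tag(t) {}
   Block process(const Block &in) {
      Block out;
      out.push_back(tag);
      out.insert(out.end(), in.begin(), in.end());
      return out;
   }
};

// The "process one data block" step: run a block through every stage in
// order, then hand it to the sink (recorded here instead of sent to the SD).
struct Pipeline {
   std::vector<Filter *> stages;   // not owned by the pipeline
   Block last_sent;
   size_t blocks_sent;
   Pipeline() : blocks_sent(0) {}
   void push(const Block &in) {
      Block cur(in);
      for (size_t i = 0; i < stages.size(); i++) {
         assert(stages[i] != NULL);
         cur = stages[i]->process(cur);
      }
      last_sent = cur;
      blocks_sent++;
   }
};

// Stand-in for the "while bread" loop: feed data in fixed-size chunks.
// An EFS export callback could call push() in exactly the same way.
void feed_in_chunks(Pipeline &p, const Block &data, size_t chunk)
{
   assert(chunk > 0);
   for (size_t off = 0; off < data.size(); off += chunk) {
      size_t n = data.size() - off < chunk ? data.size() - off : chunk;
      p.push(Block(data.begin() + off, data.begin() + off + n));
   }
}
```

With this shape, an EFS file would simply use a pipeline with fewer stages, rather than a cut-down copy of the loop body.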

>
> Other missing parts:
> * a known bug in src/filed/restore.c line 1175: only tiny files can be
> restored
> * only files are efs-encrypted/decrypted right now, not directories
> (contents of an "efs-encrypted" directory are visible without decryption
> in windows anyway, "encrypted directory" means just "encrypt any file
> created in this directory", not "encrypt directory listing")
> * check if target restore machine supports EFS (W2K or later)
> * check if target restore volume on win32 supports EFS (ie volume
> information states it supports encryption -- not available for FAT32 etc)
> * portable restore (when the target platform/filesystem does not support
> EFS) -- I believe it is not needed actually, as (in contrast to
> backupread/backupwrite format parsing) there is not much to do with raw
> EFS data; however, if anybody can provide a good reason for it this can
> be implemented
> * error handling should be reviewed and improved
> * set_portable_backup logic should be reviewed to work with efs

Yes, as with every project, it initially seems simple to most people, but 
there are lots of sticky details that take time to work through before the 
code reaches production quality.
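
One of those details, the check that the restore target can hold EFS data, might be sketched as below. On Windows the flags would come from GetVolumeInformation(); here they are passed in so the sketch builds anywhere. The function and enum names are hypothetical, and "skip the file" is only one possible policy.

```cpp
#include <cassert>
#include <cstdint>

// Same value as FILE_SUPPORTS_ENCRYPTION in the Windows SDK, repeated here
// so the sketch builds without <windows.h>.
const uint32_t SKETCH_FILE_SUPPORTS_ENCRYPTION = 0x00020000;

enum EfsRestoreDecision {
   EFS_RESTORE_RAW,    // target can take the raw EFS data
   EFS_RESTORE_SKIP    // target cannot; assumed policy is to skip the file
};

// os_has_efs: true on W2K or later (the caller determines this).
// fs_flags:   file system flags of the target volume.
EfsRestoreDecision decide_efs_restore(bool os_has_efs, uint32_t fs_flags)
{
   if (os_has_efs && (fs_flags & SKETCH_FILE_SUPPORTS_ENCRYPTION)) {
      return EFS_RESTORE_RAW;
   }
   return EFS_RESTORE_SKIP;
}

// Text for a job message, so the user can see why a file was not restored.
const char *efs_decision_text(EfsRestoreDecision d)
{
   switch (d) {
   case EFS_RESTORE_RAW:
      return "restoring raw EFS data";
   case EFS_RESTORE_SKIP:
      return "skipped: target does not support EFS";
   }
   assert(!"unknown decision");
   return "";
}
```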

>
> Additional questions:
> * do I miss something in the overall picture (standalone utilities,
> verify, some other parts of bacula broken due to this way of changes)?

I am not sure what you are referring to, but the standalone utilities are not 
meant to do everything.  For example, I see no reason to include verify in 
them.  They are meant to assure users that they do not need all of Bacula to 
access Volumes.  If I were doing it over today, I might not even write all 
the utility routines -- very few people use them, and only Zmanda is making a 
big noise about the Bacula "proprietary" volume format, because they are 
using "standard" Unix utilities that are not in fact compatible nor do they 
have the necessary functionality.  So, aside from the presumed competitors, 
these standalone utilities are not critical.

> * does anybody know about other "things" in/for Windows using the same
> FILE_ATTRIBUTE_ENCRYPTED for a purpose other than EFS (maybe, BitLocker,
> 3rd party encryption tools)?

I don't know anything about FILE_ATTRIBUTE_ENCRYPTED.

> * looking at the backup vs restore code, it *seems* that backup reads
> and sends file data in chunks, while restore reads the whole file data
> from SD and then processes it as a single piece; have I understood the
> logic correctly? 

No, the restore gets back exactly what the backup sends.

> if yes then has anybody ever restored a 4Gb file 
> successfully? a 40Gb file?

Yes, of course, we handle any size file.  Bacula is not constrained by the 2GB 
file limitations of older systems.

>
> Any other advices/hints/ideas are appreciated, especially about the
> overall architecture (using STREAM_WIN32_DATA, working with attributes
> and so on).

For the moment, I have given all my comments.  

Best regards,

Kern

PS: could you send me the files in normal zip format (or tar format)?  I use 
Linux, and there doesn't seem to be anything immediately available to read 
your attachment.  You can even send the files directly to me uncompressed.


>
> Regards,
>
> Alex Ehrlich



_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel
