Hi Jean-Pierre,

I've made a few updates to the "system compression" branch.

I finally got around to testing files with uncompressed size >= 4 GiB.  It turns
out that Windows *does* permit system compression on such files.  The file
format changes slightly to accommodate 64-bit offsets rather than 32-bit offsets
(exactly the same as in WIM archives), so I updated the code accordingly.
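
The offset-width rule above can be sketched roughly as follows.  This is only an
illustration, assuming the WIM convention that chunk table entries are
little-endian and widen from 4 to 8 bytes once the uncompressed size reaches
4 GiB.  The helper names are hypothetical and are not taken from the branch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: size in bytes of one chunk table entry.
 * Assumes entries are 32-bit below 4 GiB and 64-bit at or above it. */
static size_t chunk_entry_size(uint64_t uncompressed_size)
{
	return uncompressed_size >= (UINT64_C(1) << 32) ? 8 : 4;
}

/* Hypothetical helper: read entry 'i' from a raw little-endian chunk
 * table whose entries are 'entry_size' bytes wide (4 or 8). */
static uint64_t read_chunk_offset(const uint8_t *table, size_t i,
				  size_t entry_size)
{
	uint64_t v = 0;
	size_t b;

	for (b = 0; b < entry_size; b++)
		v |= (uint64_t)table[i * entry_size + b] << (8 * b);
	return v;
}
```

Reading byte by byte keeps the sketch independent of host endianness and
alignment, at the cost of some speed.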

I added a check in ntfs_fuse_open() to forbid writing to the unnamed data stream
of system compressed files, since writing to them is not supported.  Such files
are effectively read-only; the write bit is now cleared in the mode as well.  I
suppose it would be possible to implement Windows' behavior, where it
automatically decompresses the file if you try to write to it, but I'm passing
on that for now.
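
For illustration, the read-only policy described above might look something like
the sketch below.  It is not the actual ntfs_fuse_open() code: the function
names, the 'is_system_compressed' flag and the choice of EOPNOTSUPP as the error
are all assumptions.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical check: refuse any open that requests write access to a
 * system compressed file.  Returns 0 if allowed, a negative errno
 * value otherwise (the error code is an assumption). */
static int check_open_flags(int is_system_compressed, int open_flags)
{
	if (is_system_compressed && (open_flags & O_ACCMODE) != O_RDONLY)
		return -EOPNOTSUPP;
	return 0;
}

/* Hypothetical helper: clear the write bits from the reported mode so
 * that the file also looks read-only to stat(). */
static mode_t effective_mode(int is_system_compressed, mode_t mode)
{
	if (is_system_compressed)
		mode &= ~(mode_t)(S_IWUSR | S_IWGRP | S_IWOTH);
	return mode;
}
```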

I simplified chunk caching in the decompression context.  Now it just holds the
most recently decompressed chunk, which should be good enough for library users
who are unaware of the precise compression chunk size.  However, the FUSE driver
still just opens the inode and allocates a new decompression context for every
read.  Since the FUSE driver --- the high-level one, at least --- doesn't
currently maintain file descriptor structures, there wasn't much that could be
done.  But it does do big reads, as you mentioned.
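
To illustrate the single-chunk cache described above, here is a minimal sketch.
The structure, the function names, the fixed 4096-byte chunk size and the
stand-in decompressor are all hypothetical; the real context also has to handle
the chunk table, the end of the file and errors.

```c
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 4096	/* assumed chunk size, for illustration only */

struct decompress_ctx {
	int64_t cached_chunk;		/* index of cached chunk, or -1 */
	uint8_t chunk_data[CHUNK_SIZE];	/* most recently decompressed chunk */
	unsigned long num_decompressions; /* counter for demonstration */
};

/* Stand-in for the real XPRESS/LZX decompressor: fills the chunk with
 * a byte pattern derived from the chunk index. */
static void fake_decompress_chunk(int64_t idx, uint8_t *out)
{
	memset(out, (int)(idx & 0xff), CHUNK_SIZE);
}

static void ctx_init(struct decompress_ctx *ctx)
{
	ctx->cached_chunk = -1;
	ctx->num_decompressions = 0;
}

/* Copy 'count' bytes starting at 'pos' into 'buf'.  A chunk is only
 * decompressed when it differs from the one already cached. */
static void ctx_read(struct decompress_ctx *ctx, uint64_t pos,
		     uint8_t *buf, size_t count)
{
	while (count > 0) {
		int64_t idx = (int64_t)(pos / CHUNK_SIZE);
		size_t off = (size_t)(pos % CHUNK_SIZE);
		size_t n = CHUNK_SIZE - off;

		if (n > count)
			n = count;
		if (idx != ctx->cached_chunk) {
			fake_decompress_chunk(idx, ctx->chunk_data);
			ctx->cached_chunk = idx;
			ctx->num_decompressions++;
		}
		memcpy(buf, ctx->chunk_data + off, n);
		buf += n;
		pos += n;
		count -= n;
	}
}
```

With this layout, a caller doing small sequential reads through one context
decompresses each chunk once, whereas a fresh context per read loses the cache
every time.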
  
(Side note: in the FUSE filesystem I have in wimlib for mounting WIM images, I
set the 'fh' member of the 'struct fuse_file_info' to a file descriptor
structure in the ->open() operation, and I have 'flag_nullpath_ok' set in the
'struct fuse_operations'.  Then, I just get the file descriptor structure, with
no path, passed to operations such as ->read().  If something like that could be
done with NTFS-3g and objects like inodes could be left open for many reads or
writes, I expect it would make things a bit faster for all users.  Maybe it's
not possible because you could end up with the same inode opened multiple times
at once, in different file descriptors...)

Finally, I made a few other code cleanups and added a short subsection to the
ntfs-3g man page.

Eric


On Tue, Sep 22, 2015 at 10:54:10PM -0500, Eric Biggers wrote:
> I've pushed changes to my repository that address a few things you brought
> up:
> 
> - compiler warnings addressed
> - decompression memory allocated on heap rather than stack
> - a couple optimizations for decompression speed
> 
> I'll take a closer look at the interaction with the NTFS-3g driver when I
> have time.
> 
> 
> 
> On Tue, Sep 22, 2015 at 10:49 PM, Eric Biggers <ebigge...@gmail.com> wrote:
> 
> > Hi,
> >
> > "WOF compression" is as good as the other names.  It still seems slightly
> > wrong because WOF (the "Windows Overlay Filesystem Filter") is a more
> > general feature, and this is actually the *second* compression technology
> > that Microsoft has built on top of it (the first was "WIMBoot").  For now,
> > I'll keep the code the way it is, using the "system compression" name.  It
> > could be that Microsoft will release more documentation for this.
> >
> > Yes, your reparse data indicates XPRESS4K compression (the fourth 32-bit
> > little-endian word is 0).  FYI, here are the compressed sizes I get with
> > the Silesia corpus (uncompressed size: 211,938,580 bytes total):
> >
> > LZNT1 (NTFS compression): 121,049,088 bytes
> > XPRESS4K: 104,124,416 bytes
> > XPRESS8K: 95,465,472 bytes
> > XPRESS16K: 90,460,160 bytes
> > LZX: 69,144,576 bytes
> >
> > Even though FUSE makes big reads, it would be nice to not have to allocate
> > a decompression context for every read.  That would avoid doing all of the
> > following on a per-read basis:
> > - open WofCompressedData attribute
> > - allocate heap memory for ntfs_system_decompression_ctx
> > - allocate heap memory for XPRESS or LZX
> > - read chunk offsets from the compressed file's chunk table
> >
> > Having an external tool to create "system compressed" files, if people
> > want that support, is probably the way to go.  Probably that would be
> > possible even with no changes in libntfs-3g.
> >
> > Eric
> >
> >

------------------------------------------------------------------------------
_______________________________________________
ntfs-3g-devel mailing list
ntfs-3g-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ntfs-3g-devel
