[ntfs-3g-devel] [PATCH] attrib.c: fix expanding $STANDARD_INFORMATION with almost-full MFT record

2016-12-30 Thread Eric Biggers
When setting a security descriptor on an NTFS v1.2 format file in an
NTFS v3.0+ volume, NTFS-3G would migrate $STANDARD_INFORMATION to the
new format, which requires extending its size from 48 to 72 bytes.  If
this happened while the file's MFT record was almost full, and none of
the file's attributes could be made non-resident, and the file did not
have an attribute list attribute, then the operation would unexpectedly
fail with ENOENT.  Fix this by adding an attribute list to the file in
this situation.

Note that this bug would have been very difficult to hit under normal
usage because it required the MFT record to be filled to just the right
amount with attributes that cannot be made nonresident, such as
$FILE_NAME attributes.  The $SECURITY_DESCRIPTOR attribute must also
have already been made nonresident, since otherwise space could be freed
by making it nonresident.  Nevertheless, here's a script which
reproduces the bug:

fallocate -l 100M ntfs.img
mkntfs --fast --force ntfs.img
mkdir -p mnt
ntfs-3g ntfs.img mnt
touch mnt/file
ln mnt/file mnt/0001
ln mnt/file mnt/0002
ln mnt/file mnt/0003
ln mnt/file mnt/004
setfattr mnt/file -n system.ntfs_object_id -v 0x
setfattr mnt/file -n system.ntfs_acl -v 0x010004801400240034000102000520002002010200052000200202001c0001031400ff011f0001010001

The hard links make the MFT record of "mnt/file" nearly full.  Then,
assigning an object ID forces $SECURITY_DESCRIPTOR to be nonresident in
favor of $OBJECT_ID, while still keeping the MFT record nearly full.
Finally, the last command, which sets the file's security descriptor,
should succeed; but in fact it failed with "No such file or directory".
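
To make the space arithmetic concrete, here is a minimal sketch (not
NTFS-3G code) of why the resize cannot happen in place.  It assumes a
simplified model in which the MFT record only tracks its bytes in use and
bytes allocated, and in which resident attribute lengths are rounded up to
8 bytes.  The helper names align8() and fits_in_record() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of libntfs-3g: round a length up to the
 * 8-byte boundary assumed for attribute records in this model. */
static uint32_t align8(uint32_t len)
{
	return (len + 7u) & ~7u;
}

/* Simplified model: return 1 if a resident attribute value can grow from
 * old_len to new_len bytes without leaving the MFT record, else 0. */
static int fits_in_record(uint32_t bytes_in_use, uint32_t bytes_allocated,
			  uint32_t old_len, uint32_t new_len)
{
	uint32_t growth;

	assert(new_len >= old_len);	/* this sketch only models growth */
	growth = align8(new_len) - align8(old_len);
	return bytes_in_use + growth <= bytes_allocated;
}
```

In this model, growing $STANDARD_INFORMATION from 48 to 72 bytes needs 24
spare bytes.  A 1024-byte record with 1008 bytes in use cannot provide them,
which is the situation where an attribute list becomes necessary.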

This bug was found using the wlfuzz program from wimlib.

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 libntfs-3g/attrib.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/libntfs-3g/attrib.c b/libntfs-3g/attrib.c
index a5a6549a..1cc3ef64 100644
--- a/libntfs-3g/attrib.c
+++ b/libntfs-3g/attrib.c
@@ -5142,6 +5142,10 @@ static int ntfs_resident_attr_resize_i(ntfs_attr *na, const s64 newsize,
 */
if (na->type==AT_STANDARD_INFORMATION || na->type==AT_ATTRIBUTE_LIST) {
ntfs_attr_put_search_ctx(ctx);
+   if (!NInoAttrList(na->ni) && ntfs_inode_add_attrlist(na->ni)) {
+   ntfs_log_perror("Could not add attribute list");
+   return -1;
+   }
if (ntfs_inode_free_space(na->ni, offsetof(ATTR_RECORD,
non_resident_end) + 8)) {
ntfs_log_perror("Could not free space in MFT record");
-- 
2.11.0


--
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
ntfs-3g-devel mailing list
ntfs-3g-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ntfs-3g-devel


[ntfs-3g-devel] [PATCH] Allow setting DOS name when long name has trailing dot or space

2016-11-20 Thread Eric Biggers
Windows places filenames with a trailing dot or space in the Win32
namespace and allows setting DOS names on such files.  This is true even
though on Windows such filenames can only be created and accessed using
WinNT-style paths and will confuse most Windows software.  Regardless,
because libntfs-3g did not allow setting DOS names on such files, in
some cases it was impossible to correctly restore, using libntfs-3g, a
directory structure that was created under Windows.

Update ntfs_set_ntfs_dos_name() to permit operating on a file that has a
long name with a trailing dot or space.  But continue to forbid creating
such names on a filesystem FUSE-mounted with the windows_name option.
Additionally, continue to forbid a trailing dot or space in DOS names;
this matches the Windows behavior.
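
The rule above can be sketched as a small standalone check.  This is only
an illustration of the idea behind the new "strict" argument, not the code
in unistr.c: the function names are hypothetical, and the name is assumed
to be an array of UTF-16 code units already in host byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if the last code unit is '.' or ' '. */
static bool ends_with_dot_or_space(const uint16_t *name, size_t len)
{
	assert(name != NULL || len == 0);
	return len > 0 && (name[len - 1] == '.' || name[len - 1] == ' ');
}

/* Sketch of the strict/non-strict split: a trailing dot or space is only
 * rejected when strict is true (DOS names), and tolerated otherwise
 * (existing long names).  Other forbidden-character checks are omitted. */
static bool name_forbidden(const uint16_t *name, size_t len, bool strict)
{
	return strict && ends_with_dot_or_space(name, len);
}
```

With this split, a long name such as "foo." passes the non-strict check,
while the same string would be rejected as a DOS name.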

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 include/ntfs-3g/layout.h | 11 ---
 include/ntfs-3g/unistr.h |  4 ++--
 libntfs-3g/dir.c | 10 +++---
 libntfs-3g/unistr.c  | 21 +++--
 src/lowntfs-3g.c |  6 +++---
 src/ntfs-3g.c|  8 
 6 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/include/ntfs-3g/layout.h b/include/ntfs-3g/layout.h
index 564167c..0aba464 100644
--- a/include/ntfs-3g/layout.h
+++ b/include/ntfs-3g/layout.h
@@ -1068,12 +1068,17 @@ typedef enum {
FILE_NAME_WIN32 = 0x01,
/* The standard WinNT/2k NTFS long filenames. Case insensitive.
   All Unicode chars except: '\0', '"', '*', '/', ':', '<',
-  '>', '?', '\' and '|'. Further, names cannot end with a '.'
-  or a space. */
+  '>', '?', '\' and '|'.  Trailing dots and spaces are allowed,
+  even though on Windows a filename with such a suffix can only
+  be created and accessed using a WinNT-style path, i.e.
+  \\?\-prefixed.  (If a regular path is used, Windows will
+  strip the trailing dots and spaces, which makes such
+  filenames incompatible with most Windows software.) */
FILE_NAME_DOS   = 0x02,
/* The standard DOS filenames (8.3 format). Uppercase only.
   All 8-bit characters greater space, except: '"', '*', '+',
-  ',', '/', ':', ';', '<', '=', '>', '?' and '\'. */
+  ',', '/', ':', ';', '<', '=', '>', '?' and '\'.  Trailing
+  dots and spaces are forbidden. */
FILE_NAME_WIN32_AND_DOS = 0x03,
/* 3 means that both the Win32 and the DOS filenames are
   identical and hence have been saved in this single filename
diff --git a/include/ntfs-3g/unistr.h b/include/ntfs-3g/unistr.h
index 7ea0038..76e4ced 100644
--- a/include/ntfs-3g/unistr.h
+++ b/include/ntfs-3g/unistr.h
@@ -65,9 +65,9 @@ extern ntfschar *ntfs_str2ucs(const char *s, int *len);
 
 extern void ntfs_ucsfree(ntfschar *ucs);
 
-extern BOOL ntfs_forbidden_chars(const ntfschar *name, int len);
+extern BOOL ntfs_forbidden_chars(const ntfschar *name, int len, BOOL strict);
 extern BOOL ntfs_forbidden_names(ntfs_volume *vol,
-   const ntfschar *name, int len);
+   const ntfschar *name, int len, BOOL strict);
 extern BOOL ntfs_collapsible_chars(ntfs_volume *vol,
const ntfschar *shortname, int shortlen,
const ntfschar *longname, int longlen);
diff --git a/libntfs-3g/dir.c b/libntfs-3g/dir.c
index fdc87fa..a66f807 100644
--- a/libntfs-3g/dir.c
+++ b/libntfs-3g/dir.c
@@ -2654,9 +2654,12 @@ int ntfs_set_ntfs_dos_name(ntfs_inode *ni, ntfs_inode *dir_ni,
shortlen = ntfs_mbstoucs(newname, &shortname);
if (shortlen > MAX_DOS_NAME_LENGTH)
shortlen = MAX_DOS_NAME_LENGTH;
-   /* make sure the short name has valid chars */
+
+   /* Make sure the short name has valid chars.
+* Note: the short name cannot end with dot or space, but the
+* corresponding long name can. */
if ((shortlen < 0)
-   || ntfs_forbidden_names(ni->vol,shortname,shortlen)) {
+   || ntfs_forbidden_names(ni->vol,shortname,shortlen,TRUE)) {
ntfs_inode_close_in_dir(ni,dir_ni);
ntfs_inode_close(dir_ni);
res = -errno;
@@ -2667,7 +2670,8 @@ int ntfs_set_ntfs_dos_name(ntfs_inode *ni, ntfs_inode *dir_ni,
if (longlen > 0) {
oldlen = get_dos_name(ni, dnum, oldname);
if ((oldlen >= 0)
-   && !ntfs_forbidden_names(ni->vol, longname, longlen)) {
+   && !ntfs_forbidden_names(ni->vol, longname, longlen,
+FALSE)) {
if (oldlen > 0) {
if (flag

Re: [ntfs-3g-devel] [PATCH 2/2] ACE validation fixes

2016-10-29 Thread Eric Biggers
On Sat, Oct 29, 2016 at 11:21:37AM +0200, Jean-Pierre André wrote:
> Hi again,
> 
> Eric Biggers wrote:
> > Hi Jean-Pierre,
> > 
> > Sorry for the late response.
> 
> No problem. I also did not do much about it.
> 
> The intent of ntfs_valid_descr() was to guard against
> processing security descriptors with invalid or unknown
> features, but your need is to check whether a descriptor
> is valid for Windows. The purpose of ntfs-3g is to map
> Unix concepts to an ntfs file system, which is somehow
> different from emulating the Windows behavior (a moving
> target, Windows 8 and Windows 10 brought significant
> changes).
> 
> The translations of Windows ACLs to Posix ones rely on
> heuristics which will be defeated if the ACEs are not
> as expected.
> 
> Maybe having two variants of ntfs_valid_descr() would
> be the way to go, as you do not need translations,
> inheritance, etc.
> 
> Jean-Pierre

Maybe.  I suppose that would mean callers would be updated to specify whether
they want the stricter validation or not, and ntfs_get_ntfs_acl() and
ntfs_set_ntfs_acl() wouldn't require the stricter validation?  Would the
stricter validation apply to the SACL as well as the DACL?  It seems that it
shouldn't, i.e. having entries in the SACL, such as system audit entries or
integrity labels or whatever, shouldn't prevent NTFS-3G from attempting to map
the DACL to UNIX permissions.

Eric



[ntfs-3g-devel] [PATCH] ntfs-3g, lowntfs-3g: remove failed_secure variable

2016-10-29 Thread Eric Biggers
This is no longer used.

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 src/lowntfs-3g.c | 3 ---
 src/ntfs-3g.c| 3 ---
 2 files changed, 6 deletions(-)

diff --git a/src/lowntfs-3g.c b/src/lowntfs-3g.c
index 9d933d2..59fbcd0 100644
--- a/src/lowntfs-3g.c
+++ b/src/lowntfs-3g.c
@@ -4227,7 +4227,6 @@ int main(int argc, char *argv[])
fuse_fstype fstype = FSTYPE_UNKNOWN;
 #endif
const char *permissions_mode = (const char*)NULL;
-   const char *failed_secure = (const char*)NULL;
 #if defined(HAVE_SETXATTR) && defined(XATTR_MAPPINGS)
struct XATTRMAPPING *xattr_mapping = (struct XATTRMAPPING*)NULL;
 #endif /* defined(HAVE_SETXATTR) && defined(XATTR_MAPPINGS) */
@@ -4454,8 +4453,6 @@ int main(int argc, char *argv[])
ntfs_log_info("%s", fuse26_kmod_msg);
 #endif  
setup_logging(parsed_options);
-   if (failed_secure)
-   ntfs_log_info("%s\n",failed_secure);
if (permissions_mode)
ntfs_log_info("%s, configuration type %d\n",permissions_mode,
5 + POSIXACLS*6 - KERNELPERMS*3 + CACHEING);
diff --git a/src/ntfs-3g.c b/src/ntfs-3g.c
index 702d676..c6ee014 100644
--- a/src/ntfs-3g.c
+++ b/src/ntfs-3g.c
@@ -4034,7 +4034,6 @@ int main(int argc, char *argv[])
fuse_fstype fstype = FSTYPE_UNKNOWN;
 #endif
const char *permissions_mode = (const char*)NULL;
-   const char *failed_secure = (const char*)NULL;
 #if defined(HAVE_SETXATTR) && defined(XATTR_MAPPINGS)
struct XATTRMAPPING *xattr_mapping = (struct XATTRMAPPING*)NULL;
 #endif /* defined(HAVE_SETXATTR) && defined(XATTR_MAPPINGS) */
@@ -4260,8 +4259,6 @@ int main(int argc, char *argv[])
ntfs_log_info("%s", fuse26_kmod_msg);
 #endif 
setup_logging(parsed_options);
-   if (failed_secure)
-   ntfs_log_info("%s\n",failed_secure);
if (permissions_mode)
ntfs_log_info("%s, configuration type %d\n",permissions_mode,
4 + POSIXACLS*6 - KERNELPERMS*3 + CACHEING);
-- 
2.10.1




Re: [ntfs-3g-devel] [PATCH] unistr.c: make utf16_to_utf8_size() always honor @outs_len

2016-10-29 Thread Eric Biggers
On Sat, Oct 29, 2016 at 12:07:17PM +0200, Jean-Pierre André wrote:
> Eric Biggers wrote:
> > On Sat, Oct 29, 2016 at 09:45:57AM +0200, Jean-Pierre André wrote:
> > > 
> > > I am waiting for a green light from Tuxera for merging them
> > > into the git.
> > 
> > Is there any particular reason you need their permission to do so?
> 
> Well, they are the owner of the project...
> 

Why?  What exactly do they contribute?  I don't see patches from them, or them
reviewing patches on the mailing list.

From what I can see the real work is done by you and other independent
contributors such as myself.  It's been claimed Tuxera helps with testing.  How
exactly?  Where can I find information about the tests they run on NTFS-3G?  Is
it helpful and do they actually find bugs?  Wouldn't it be more useful to
develop more open source tests and add NTFS support to xfstests?

You've mentioned several times that not all changes are going into the next
version.  Normally this means that some time before the release, a release
branch would be cut, so there would be two branches in the repository: one for
the development version, and one for the next release.  Right now I only see an
"edge" branch.  Which version is that supposed to be?  Is there going to be
another branch created?  Where can I find the version destined for the next
release so I can help test it?

Thanks,

Eric



Re: [ntfs-3g-devel] [PATCH] lowntfs-3g: correctly pass file info to reparse plugins

2016-10-29 Thread Eric Biggers
uintptr_t can always hold a pointer but long may not.  In practice it doesn't
matter on non-Windows, though.  Anyway, the important part is not the cast at
all but rather the fact that 'fi->fh' needs to be used and not 'fi'...
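
As a minimal sketch of the pattern in question: the pointer is stored in
the 64-bit file handle and later recovered from it, going through uintptr_t
in both directions.  The struct and function names here are hypothetical
stand-ins, and the handle is modeled as a plain uint64_t rather than the
real 'fh' member of struct fuse_file_info.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's per-open-file state. */
struct open_file_sketch {
	int state;
};

/* Store a pointer in a 64-bit handle.  Casting through uintptr_t keeps all
 * pointer bits, whereas casting through long would not on platforms where
 * long is narrower than a pointer. */
static uint64_t handle_from_ptr(struct open_file_sketch *of)
{
	assert(sizeof(uintptr_t) <= sizeof(uint64_t));
	return (uint64_t)(uintptr_t)of;
}

/* Recover the pointer from the handle value, not from the structure that
 * merely contains the handle. */
static struct open_file_sketch *ptr_from_handle(uint64_t fh)
{
	return (struct open_file_sketch *)(uintptr_t)fh;
}
```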

On Sat, Oct 29, 2016 at 10:38:45AM +0200, Jean-Pierre André wrote:
> Hi again,
> 
> The "#ifdef PLUGIN_ENABLED" part is a major bug eligible
> for the next version.
> 
> The "uintptr_t" casts are more debatable, because this is
> defined as an optional type in C11 (section 7.20.1.4).
> This is probably why fh is not declared as uintptr_t in
> "fuse_common.h". Nevertheless, you are right, the cast to
> long is wrong in 64-bit environments which define long as
> a 32-bit integer (and the equivalent type for that on
> Windows is ULONG_PTR). The real question is "Is there an
> environment which defines uintptr_t differently from unsigned
> long?"
> 
> As doing something about it in "configure" would be
> overkill, I will probably take the proposal for the next
> version.
> 
> Jean-Pierre
> 
> Eric Biggers wrote:
> > Signed-off-by: Eric Biggers <ebigge...@gmail.com>
> > ---
> >   src/lowntfs-3g.c | 14 +++---
> >   1 file changed, 7 insertions(+), 7 deletions(-)
> > 
> > diff --git a/src/lowntfs-3g.c b/src/lowntfs-3g.c
> > index a91d123..9d933d2 100644
> > --- a/src/lowntfs-3g.c
> > +++ b/src/lowntfs-3g.c
> > @@ -1493,15 +1493,15 @@ close:
> > of->parent = 0;
> > of->ino = ino;
> > of->state = state;
> > -#ifdef PLUGIN_ENABLED
> > +#ifndef PLUGINS_DISABLED
> > memcpy(&of->fi, fi, sizeof(struct fuse_file_info));
> > -#endif /* PLUGIN_ENABLED */
> > +#endif /* PLUGINS_DISABLED */
> > of->next = ctx->open_files;
> > of->previous = (struct open_file*)NULL;
> > if (ctx->open_files)
> > ctx->open_files->previous = of;
> > ctx->open_files = of;
> > -   fi->fh = (long)of;
> > +   fi->fh = (uintptr_t)of;
> > }
> > }
> > if (res)
> > @@ -1542,7 +1542,7 @@ static void ntfs_fuse_read(fuse_req_t req, fuse_ino_t ino, size_t size,
> > REPARSE_POINT *reparse;
> > struct open_file *of;
> > 
> > -   of = (struct open_file*)fi;
> > +   of = (struct open_file*)(uintptr_t)fi->fh;
> > res = CALL_REPARSE_PLUGIN(ni, read, buf, size, offset, &of->fi);
> > if (res >= 0) {
> > goto stamps;
> > @@ -1623,7 +1623,7 @@ static void ntfs_fuse_write(fuse_req_t req, fuse_ino_t ino, const char *buf,
> > REPARSE_POINT *reparse;
> > struct open_file *of;
> > 
> > -   of = (struct open_file*)fi;
> > +   of = (struct open_file*)(uintptr_t)fi->fh;
> > res = CALL_REPARSE_PLUGIN(ni, write, buf, size, offset,
> > &of->fi);
> > if (res >= 0) {
> > @@ -2283,7 +2283,7 @@ exit:
> > if (ctx->open_files)
> > ctx->open_files->previous = of;
> > ctx->open_files = of;
> > -   fi->fh = (long)of;
> > +   fi->fh = (uintptr_t)of;
> > }
> > }
> > return res;
> > @@ -2774,7 +2774,7 @@ static void ntfs_fuse_release(fuse_req_t req, fuse_ino_t ino,
> > char ghostname[GHOSTLTH];
> > int res;
> > 
> > -   of = (struct open_file*)(long)fi->fh;
> > +   of = (struct open_file*)(uintptr_t)fi->fh;
> > /* Only for marked descriptors there is something to do */
> > if (!of
> > || !(of->state & (CLOSE_COMPRESSED | CLOSE_ENCRYPTED
> > 
> 



Re: [ntfs-3g-devel] [PATCH] unistr.c: make utf16_to_utf8_size() always honor @outs_len

2016-10-28 Thread Eric Biggers
Hi Jean-Pierre,

Are you going to be reviewing/applying any of these other patches?  (Excluding
the "ACE validation fixes" one which will need to be reworked once the desired
behavior is agreed on.)

I've also found a bug in lowntfs-3g regarding the reparse plugin support, so
I'll send a patch for that too.

Thanks,

Eric

On Wed, Sep 14, 2016 at 11:39:07PM -0700, Eric Biggers wrote:
> utf16_to_utf8_size() was not guaranteed to fail with ENAMETOOLONG if the
> computed length was greater than @outs_len.  This could cause a buffer
> overrun in ntfs_utf16_to_utf8().  This was a bug introduced by the
> patches to allow broken Unicode.  Fix it.
> 
> Signed-off-by: Eric Biggers <ebigge...@gmail.com>
> ---
>  libntfs-3g/unistr.c | 26 +-
>  1 file changed, 17 insertions(+), 9 deletions(-)
> 
> diff --git a/libntfs-3g/unistr.c b/libntfs-3g/unistr.c
> index 4d33bb4..190dbd8 100644
> --- a/libntfs-3g/unistr.c
> +++ b/libntfs-3g/unistr.c
> @@ -458,10 +458,15 @@ void ntfs_file_value_upcase(FILE_NAME_ATTR *file_name_attr,
>  */
>   
>  /* 
> - * Return the amount of 8-bit elements in UTF-8 needed (without the terminating
> - * null) to store a given UTF-16LE string.
> + * Return the number of bytes in UTF-8 needed (without the terminating null) to
> + * store the given UTF-16LE string.
>   *
> - * Return -1 with errno set if string has invalid byte sequence or too long.
> + * On error, -1 is returned, and errno is set to the error code. The following
> + * error codes can be expected:
> + *   EILSEQ  The input string is not valid UTF-16LE (only possible
> + *   if compiled without ALLOW_BROKEN_UNICODE).
> + *   ENAMETOOLONGThe length of the UTF-8 string in bytes (without the
> + *   terminating null) would exceed @outs_len.
>   */
>  static int utf16_to_utf8_size(const ntfschar *ins, const int ins_len, int outs_len)
>  {
> @@ -470,7 +475,7 @@ static int utf16_to_utf8_size(const ntfschar *ins, const int ins_len, int outs_l
>   BOOL surrog;
>  
>   surrog = FALSE;
> - for (i = 0; i < ins_len && ins[i]; i++) {
> + for (i = 0; i < ins_len && ins[i] && count <= outs_len; i++) {
>   unsigned short c = le16_to_cpu(ins[i]);
>   if (surrog) {
>   if ((c >= 0xdc00) && (c < 0xe000)) {
> @@ -511,17 +516,20 @@ static int utf16_to_utf8_size(const ntfschar *ins, const int ins_len, int outs_l
>   count += 3;
>   else 
>   goto fail;
> - if (count > outs_len) {
> - errno = ENAMETOOLONG;
> - goto out;
> - }
>   }
> - if (surrog) 
> +
> + if (surrog && count <= outs_len) {
>  #if ALLOW_BROKEN_UNICODE
>   count += 3; /* ending with a single surrogate */
>  #else
>   goto fail;
>  #endif /* ALLOW_BROKEN_UNICODE */
> + }
> +
> + if (count > outs_len) {
> + errno = ENAMETOOLONG;
> + goto out;
> + }
>  
>   ret = count;
>  out:
> -- 
> 2.9.3
> 



Re: [ntfs-3g-devel] [PATCH 2/2] ACE validation fixes

2016-10-28 Thread Eric Biggers
Hi Jean-Pierre,

Sorry for the late response.

On Mon, Sep 26, 2016 at 01:52:36PM +0200, Jean-Pierre André wrote:
> > 1. "Object" ACEs are mentioned as only being used for Active Directory
> >    objects [source: Windows Internals 6th edition].  On Windows, trying to
> >    use SetFileSecurity() to set an object ACE in the DACL of an NTFS
> >    directory fails with ERROR_INVALID_ACL.  This is different from how
> >    Windows treats truly unknown ACE types (see below).  But I think it
> >    would be fine for NTFS-3G to simplify things by treating object ACEs
> >    like any other unknown ACE type.
> 
> Just for me to understand : Unix stores special objects
> (such as pipes, sockets, etc.) as file system objects with
> access control attached. Does Windows also record special
> objects as file system objects? If so, how are these
> objects distinguished from files and directories ?

Active Directory objects are stored in a database, not directly on the
filesystem.  They just happen to both use the security descriptor format.

> The purpose of ntfs_valid_descr() is to reject descriptors
> which ntfs-3g cannot process properly. I am not keen on
> letting special ACEs leaking into translations to Linux.
> 
> Inheritance is of particular concern, because this is rather
> complex and there are undocumented special cases which have
> to be found by trial and error. Moreover the result is
> generally not satisfactory for Linux users who have
> expectations different from Windows ones (typical example:
> inheritance of execute permissions).
> 
> Also: if objects can inherit special ACEs from their parent
> directory, what prevents them from being inherited by plain
> files created in the same directory?
> 
> Cannot these specific needs be implemented within wimlib ?

I am not sure what you mean regarding the purpose of ntfs_valid_descr(), but the
existing behavior is that it allows unknown ACEs in both the DACL and SACL
except in certain cases.  And in those certain cases, such as a callback ACE
that is not the last ACE in the list, it is broken for backup/restore
applications like wimlib which expect to be able to set security descriptors
that were created on Windows.

Note that as a backup/restore application, wimlib does not want ACE inheritance
at all, nor does it want translation of Windows ACLs to POSIX or Linux
permissions.  It simply wants to stamp a security descriptor on each file or
directory.  It can be assumed that the security descriptors are valid in the
sense that Windows would accept them.  As part of this, it can be assumed that
the ACEs in each ACL have the common ACE_HEADER.  But it can *not* be assumed
that every ACE is of a known type.  The expected behavior is that unknown ACEs
are settable but are then skipped during access evaluation.  This would match the
Windows behavior.  I think Windows users might actually find it quite annoying
for Windows to do otherwise because then people could not, for example, restore
security descriptors intended for a later version of Windows from a live system
running an older version of Windows.  Essentially the same argument applies to
NTFS-3G.

With regards to inheritance, as I pointed out Windows simply performs the
standard inheritance algorithm on unknown ACEs per the standard ACE_HEADER
flags.  It's true that there could be special rules that it is missing, but it
seems like the most logical behavior and probably the easiest to implement too.
Of course, this isn't currently relevant to wimlib which as I mentioned does not
care about inheritance.  I just thought I'd suggest it as an improvement (though
it's still not clear to me what the current behavior is "supposed" to be, and
that's part of the problem).

Eric



Re: [ntfs-3g-devel] [PATCH 2/2] ACE validation fixes

2016-09-25 Thread Eric Biggers
On Wed, Sep 21, 2016 at 12:01:23PM +0200, Jean-Pierre André wrote:
> 
> I have never met an object ACE and they might be irrelevant
> for a file system which only deals with files and directories.
> 
> Is there a point in ntfs-3g accepting ACE types controlling
> entities which are not emulated on Linux (callbacks, labels,
> policies, etc.) ?

Yes, because it should be --- and already is, except in certain cases ---
possible to use NTFS-3G to restore security descriptors that were created under
Windows.  This can be done by using wimlib to extract a WIM image to an NTFS
volume (for example).

I think the emulation of ACEs under Linux is a separate concern which for some
of the new ACE types isn't really possible or meaningful.

I also did some research, and some experiments on Windows:

1. "Object" ACEs are mentioned as only being used for Active Directory objects
   [source: Windows Internals 6th edition].  On Windows, trying to use
   SetFileSecurity() to set an object ACE in the DACL of an NTFS directory fails
   with ERROR_INVALID_ACL.  This is different from how Windows treats truly
   unknown ACE types (see below).  But I think it would be fine for NTFS-3G to
   simplify things by treating object ACEs like any other unknown ACE type.

2. "Callback" ACEs, also known as "conditional" ACEs, are mentioned as only
   existing for use by the AuthZ API, which is a userspace API for access
   control.  The Windows kernel does *not* evaluate such ACEs when performing
   access checks [source: Windows Internals 6th edition].  However, I *was* able
   to set such an ACE in the DACL of an NTFS directory using SetFileSecurity().
   In addition, on Windows such an ACE is inherited per the standard ACE header
   flags, and the generic rights and SID mapping is performed.  Still, I don't
   yet know exactly *why* recent Windows 10 builds have been observed to use
   such ACEs.

3. Truly unknown ACE types are accepted by SetFileSecurity().  They also are
   inherited per the standard ACE header flags.  However, they are not evaluated
   during access checks.  In addition, SetFileSecurity() does no validation of
   the ACCESS_MASK or SID fields of unknown ACEs --- which makes sense because
   the format of such ACEs is actually unknown beyond the ACE_HEADER.  Instead,
   the ACE size field is simply required to be at least sizeof(ACE_HEADER) and a
   multiple of 4.  No generic rights or SID mapping is performed during
   inheritance of unknown ACEs.

So, given the requirements and these observations, I'd like to propose that
NTFS-3G handle unknown ACE types as follows:

* ntfs_valid_descr() accepts them and checks only the size (like Windows)
* ntfs_inherit_acl() performs inheritance on unknown ACE types per the ACE
  header flags but without the generic mapping (like Windows).  Optionally,
  generic rights and SID mapping can be done for callback ACEs.
* NTFS-3G otherwise ignores unknown ACEs (like Windows)
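
The size-only check proposed for unknown ACE types could look roughly like
the following.  This is a sketch under stated assumptions, not the actual
ntfs_valid_descr() code: the struct and function names are hypothetical,
the size is assumed to be already converted to host byte order, and the
check that the ACE does not run past the end of the ACL is an extra bound
that any parser would need on top of the rules listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the common ACE header: type, flags, total size. */
struct ace_header_sketch {
	uint8_t type;
	uint8_t flags;
	uint16_t size;
};

/* Accept an ACE of unknown type if its size is at least the header size,
 * is a multiple of 4, and fits in the bytes remaining in the ACL.  The
 * access mask and SID are deliberately not inspected, because their layout
 * is not known for unknown ACE types. */
static bool unknown_ace_size_ok(uint16_t ace_size, size_t bytes_left_in_acl)
{
	assert(sizeof(struct ace_header_sketch) == 4);
	return ace_size >= sizeof(struct ace_header_sketch)
	    && (ace_size % 4) == 0
	    && ace_size <= bytes_left_in_acl;
}
```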

Any thoughts on this?

Eric



Re: [ntfs-3g-devel] [PATCH 2/2] Remove unused argument from ntfs_make_symlink()

2016-09-21 Thread Eric Biggers
On Wed, Sep 21, 2016 at 10:47:57AM +0200, Jean-Pierre André wrote:
> Hi Eric,
> 
> There has been a recent request for ntfs-3g to return
> the st_size for symlinks as the size of the target
> path (as described in the stat manual), so the target
> is now useful
> 
> Regards
> 

Well, the previous patch does that, which is why 'attr_size' became unused.

Eric



[ntfs-3g-devel] [PATCH 2/2] Remove unused argument from ntfs_make_symlink()

2016-09-21 Thread Eric Biggers
Now that the size of the reparse point attribute is no longer used by
the FUSE drivers to populate st_size for symlinks and junctions, it no
longer needs to be returned by ntfs_make_symlink().

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 include/ntfs-3g/reparse.h |  4 ++--
 libntfs-3g/reparse.c  |  4 +---
 src/lowntfs-3g.c  | 14 --
 src/ntfs-3g.c | 12 
 4 files changed, 11 insertions(+), 23 deletions(-)

diff --git a/include/ntfs-3g/reparse.h b/include/ntfs-3g/reparse.h
index 27e9050..76af915 100644
--- a/include/ntfs-3g/reparse.h
+++ b/include/ntfs-3g/reparse.h
@@ -24,8 +24,8 @@
 #ifndef REPARSE_H
 #define REPARSE_H
 
-char *ntfs_make_symlink(ntfs_inode *ni, const char *mnt_point,
-   int *pattr_size);
+char *ntfs_make_symlink(ntfs_inode *ni, const char *mnt_point);
+
 BOOL ntfs_possible_symlink(ntfs_inode *ni);
 
 int ntfs_get_ntfs_reparse_data(ntfs_inode *ni, char *value, size_t size);
diff --git a/libntfs-3g/reparse.c b/libntfs-3g/reparse.c
index b0f96ae..2e92fbb 100644
--- a/libntfs-3g/reparse.c
+++ b/libntfs-3g/reparse.c
@@ -724,8 +724,7 @@ static char *ntfs_get_rellink(ntfs_inode *ni, ntfschar *junction, int count)
  * symbolic link or directory junction
  */
 
-char *ntfs_make_symlink(ntfs_inode *ni, const char *mnt_point,
-   int *pattr_size)
+char *ntfs_make_symlink(ntfs_inode *ni, const char *mnt_point)
 {
s64 attr_size = 0;
char *target;
@@ -820,7 +819,6 @@ char *ntfs_make_symlink(ntfs_inode *ni, const char *mnt_point,
}
free(reparse_attr);
}
-   *pattr_size = attr_size;
if (bad)
errno = EOPNOTSUPP;
return (target);
diff --git a/src/lowntfs-3g.c b/src/lowntfs-3g.c
index bc9770a..b05492e 100644
--- a/src/lowntfs-3g.c
+++ b/src/lowntfs-3g.c
@@ -634,11 +634,10 @@ static int junction_getstat(ntfs_inode *ni,
struct stat *stbuf)
 {
char *target;
-   int attr_size;
int res;
 
errno = 0;
-   target = ntfs_make_symlink(ni, ctx->abs_mnt_point, &attr_size);
+   target = ntfs_make_symlink(ni, ctx->abs_mnt_point);
/*
 * If the reparse point is not a valid
 * directory junction, and there is no error
@@ -713,11 +712,9 @@ static int ntfs_fuse_getstat(struct SECURITY_CONTEXT *scx,
goto ok;
 #else /* PLUGINS_DISABLED */
char *target;
-   int attr_size;
 
errno = 0;
-   target = ntfs_make_symlink(ni, ctx->abs_mnt_point,
-   &attr_size);
+   target = ntfs_make_symlink(ni, ctx->abs_mnt_point);
/*
 * If the reparse point is not a valid
 * directory junction, and there is no error
@@ -1020,12 +1017,11 @@ static int junction_readlink(ntfs_inode *ni,
const REPARSE_POINT *reparse __attribute__((unused)),
char **pbuf)
 {
-   int attr_size;
int res;
 
errno = 0;
res = 0;
-   *pbuf = ntfs_make_symlink(ni, ctx->abs_mnt_point, &attr_size);
+   *pbuf = ntfs_make_symlink(ni, ctx->abs_mnt_point);
if (!*pbuf) {
if (errno == EOPNOTSUPP) {
*pbuf = strdup(ntfs_bad_reparse);
@@ -1068,11 +1064,9 @@ static void ntfs_fuse_readlink(fuse_req_t req, fuse_ino_t ino)
res = -errno;
}
 #else /* PLUGINS_DISABLED */
-   int attr_size;
-
errno = 0;
res = 0;
-   buf = ntfs_make_symlink(ni, ctx->abs_mnt_point, &attr_size);
+   buf = ntfs_make_symlink(ni, ctx->abs_mnt_point);
if (!buf) {
if (errno == EOPNOTSUPP)
buf = strdup(ntfs_bad_reparse);
diff --git a/src/ntfs-3g.c b/src/ntfs-3g.c
index f4af89b..3633ac3 100644
--- a/src/ntfs-3g.c
+++ b/src/ntfs-3g.c
@@ -698,11 +698,10 @@ static int junction_getattr(ntfs_inode *ni,
struct stat *stbuf)
 {
char *target;
-   int attr_size;
int res;
 
errno = 0;
-   target = ntfs_make_symlink(ni, ctx->abs_mnt_point, &attr_size);
+   target = ntfs_make_symlink(ni, ctx->abs_mnt_point);
/*
 * If the reparse point is not a valid
 * directory junction, and there is no error
@@ -805,10 +804,9 @@ static int ntfs_fuse_getattr(const char *org_path, struct stat *stbuf)
goto exit;
 #else /* PLUGINS_DISABLED */
char *target;
-   int attr_size;
 
errno = 0;
-   target = ntfs_make_symlink(ni, ctx->abs_mnt_point, 

[ntfs-3g-devel] [PATCH 1/2] ntfs-3g, lowntfs-3g: set correct st_size for symlinks

2016-09-21 Thread Eric Biggers
NTFS-3G used several different conventions for setting st_size of
symlinks.  Make it use the standard POSIX convention of setting st_size
to the length of the link target without a terminating null.
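
The convention described above can be sketched as follows. This is an illustration only, not code from the patch: `symlink_size_matches()` and `bad_reparse_example` are hypothetical names, and the string is a stand-in for whatever placeholder target a driver reports. It shows the two points the patch relies on: `st_size` from `lstat()` is compared against `strlen()` of the target, and `sizeof` on a string array counts the terminating null, hence the `- 1`.

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 1 if lstat() reports an st_size equal to
 * the length of 'target' without the terminating null, 0 if it differs,
 * and -1 if 'linkpath' cannot be examined or is not a symlink.
 */
static int symlink_size_matches(const char *linkpath, const char *target)
{
	struct stat st;

	if (lstat(linkpath, &st) != 0)
		return -1;
	if (!S_ISLNK(st.st_mode))
		return -1;
	return st.st_size == (off_t)strlen(target);
}

/*
 * Stand-in placeholder target (an assumption for this sketch).  For an
 * array initialized from a string literal, sizeof includes the
 * terminating null, so the POSIX-style length is sizeof(...) - 1.
 */
static const char bad_reparse_example[] = "unsupported reparse point";
```

With this convention, a caller that allocates `st_size + 1` bytes before calling `readlink()` gets exactly enough room for the target plus a terminator.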

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 src/lowntfs-3g.c | 35 +++
 src/ntfs-3g.c| 35 +++
 2 files changed, 54 insertions(+), 16 deletions(-)

diff --git a/src/lowntfs-3g.c b/src/lowntfs-3g.c
index a91d123..bc9770a 100644
--- a/src/lowntfs-3g.c
+++ b/src/lowntfs-3g.c
@@ -645,11 +645,10 @@ static int junction_getstat(ntfs_inode *ni,
 * we still display as a symlink
 */
if (target || (errno == EOPNOTSUPP)) {
-   /* returning attribute size */
if (target)
-   stbuf->st_size = attr_size;
+   stbuf->st_size = strlen(target);
else
-   stbuf->st_size = sizeof(ntfs_bad_reparse);
+   stbuf->st_size = sizeof(ntfs_bad_reparse) - 1;
stbuf->st_blocks = (ni->allocated_size + 511) >> 9;
stbuf->st_mode = S_IFLNK;
free(target);
@@ -705,7 +704,7 @@ static int ntfs_fuse_getstat(struct SECURITY_CONTEXT *scx,
apply_umask(stbuf);
} else {
stbuf->st_size =
-   sizeof(ntfs_bad_reparse);
+   sizeof(ntfs_bad_reparse) - 1;
stbuf->st_blocks =
(ni->allocated_size + 511) >> 9;
stbuf->st_mode = S_IFLNK;
@@ -725,12 +724,11 @@ static int ntfs_fuse_getstat(struct SECURITY_CONTEXT *scx,
 * we still display as a symlink
 */
if (target || (errno == EOPNOTSUPP)) {
-   /* returning attribute size */
if (target)
-   stbuf->st_size = attr_size;
+   stbuf->st_size = strlen(target);
else
stbuf->st_size = 
-   sizeof(ntfs_bad_reparse);
+   sizeof(ntfs_bad_reparse) - 1;
stbuf->st_blocks =
(ni->allocated_size + 511) >> 9;
stbuf->st_nlink =
@@ -837,8 +835,29 @@ static int ntfs_fuse_getstat(struct SECURITY_CONTEXT *scx,
le64_to_cpu(
intx_file->minor));
}
-   if (intx_file->magic == INTX_SYMBOLIC_LINK)
+   if (intx_file->magic == INTX_SYMBOLIC_LINK) {
+   char *target = NULL;
+   int len;
+
+   /* st_size should be set to length of
+* symlink target as multibyte string */
+   len = ntfs_ucstombs(
+   intx_file->target,
+   (na->data_size -
+   offsetof(INTX_FILE,
+target)) /
+  sizeof(ntfschar),
+   &target, 0);
+   if (len < 0) {
+   res = -errno;
+   free(intx_file);
+   ntfs_attr_close(na);
+   goto exit;
+   }
+   free(target);
stbuf->st_mode = S_IFLNK;
+   stbuf->st_size = len;
+   }
free(intx_file);
}
ntfs_attr_close(na);
diff --git a/src/ntfs-3g.c b/src/ntfs-3g.c
index 702d676..f4af89b 100644
--- a/src/ntfs-3g.c
+++ b/src/ntfs-3g.c
@@ -709,11 +709,10 @@ static int junction_getattr(ntfs_inode *ni,
 * we still display as a symlink
 */
if (target || (errno == EOPNOTSUPP)) {
-   /* returni

[ntfs-3g-devel] [PATCH 2/2] ACE validation fixes

2016-09-21 Thread Eric Biggers
- Allow extra data after the SID.  Recent Windows 10 images have been
  reported to contain DACLs with an ACCESS_ALLOWED_CALLBACK_ACE with
  extra data after the SID.  The ACE is not necessarily at the end of
  the DACL, so the recent fix to allow extra data for the last ACE only
  was insufficient.  I also suspect extra data is sometimes used by
  other ACE types.  In fact, SetFileSecurity() on Windows permits extra
  data for all ACE types.  So make NTFS-3G do the same.

- Validate the SID at the correct offset for "object" ACEs.  (Most
  likely this bug wasn't noticed because object ACEs are rarely used.)

- Only validate the SID for recognized ACE types.  The placement or
  presence of the SID should not be assumed for future ACE types.
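
The size checks implied by the points above can be sketched in isolation. This is a simplified illustration under stated assumptions, not the patch itself: `entries_fit()`, `get_le16()` and `ACE_HDR_SIZE` are hypothetical names, the 4-byte header layout (type, flags, little-endian 16-bit size) is assumed, and SID validation is left out. It shows a walk over variable-length entries that accepts extra trailing bytes inside an entry but never reads past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed header size: 1 byte type, 1 byte flags, 2 bytes size. */
#define ACE_HDR_SIZE 4u

/* Read a little-endian 16-bit value without alignment assumptions. */
static unsigned int get_le16(const unsigned char *p)
{
	return p[0] | ((unsigned int)p[1] << 8);
}

/*
 * Walk 'count' variable-length entries starting at buf[offset], where
 * 'end' is the total number of valid bytes in 'buf'.  Each entry stores
 * its own size in bytes 2..3 of its header.  Returns 1 if every header
 * and every declared size fits within 'end', 0 otherwise.  Bytes beyond
 * the fields this sketch knows about are allowed inside an entry.
 */
static int entries_fit(const unsigned char *buf, size_t offset, size_t end,
		       unsigned int count)
{
	while (count--) {
		size_t size;

		/* The header itself must be in range before it is read. */
		if (offset > end || ACE_HDR_SIZE > end - offset)
			return 0;
		size = get_le16(buf + offset + 2);
		/* The declared size must cover the header and stay in range. */
		if (size < ACE_HDR_SIZE || size > end - offset)
			return 0;
		offset += size;
	}
	return 1;
}
```

Writing the comparisons as `x > end - offset` rather than `offset + x > end` avoids any wraparound in the addition.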

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 libntfs-3g/acls.c | 114 +++---
 1 file changed, 82 insertions(+), 32 deletions(-)

diff --git a/libntfs-3g/acls.c b/libntfs-3g/acls.c
index b91e041..06c44d9 100644
--- a/libntfs-3g/acls.c
+++ b/libntfs-3g/acls.c
@@ -536,45 +536,95 @@ gid_t ntfs_find_group(const struct MAPPING* groupmapping, const SID * gsid)
 }
 
 /*
+ * Does the ACE have the same format as ACCESS_ALLOWED_ACE?
+ */
+
+static BOOL is_regular_ace(const ACE_HEADER *pace)
+{
+   switch (pace->type) {
+   case ACCESS_ALLOWED_ACE_TYPE:
+   case ACCESS_DENIED_ACE_TYPE:
+   case SYSTEM_AUDIT_ACE_TYPE:
+   case ACCESS_ALLOWED_CALLBACK_ACE_TYPE:
+   case ACCESS_DENIED_CALLBACK_ACE_TYPE:
+   case SYSTEM_AUDIT_CALLBACK_ACE_TYPE:
+   case SYSTEM_MANDATORY_LABEL_ACE_TYPE:
+   case SYSTEM_RESOURCE_ATTRIBUTE_ACE_TYPE:
+   case SYSTEM_SCOPED_POLICY_ID_ACE_TYPE:
+   case SYSTEM_PROCESS_TRUST_LABEL_ACE_TYPE:
+   return TRUE;
+   default:
+   return FALSE;
+   }
+}
+
+/*
+ * Does the ACE have the same format as ACCESS_ALLOWED_OBJECT_ACE?
+ */
+
+static BOOL is_object_ace(const ACE_HEADER *pace)
+{
+   switch (pace->type) {
+   case ACCESS_ALLOWED_OBJECT_ACE_TYPE:
+   case ACCESS_DENIED_OBJECT_ACE_TYPE:
+   case SYSTEM_AUDIT_OBJECT_ACE_TYPE:
+   case ACCESS_ALLOWED_CALLBACK_OBJECT_ACE_TYPE:
+   case ACCESS_DENIED_CALLBACK_OBJECT_ACE_TYPE:
+   case SYSTEM_AUDIT_CALLBACK_OBJECT_ACE_TYPE:
+   return TRUE;
+   default:
+   return FALSE;
+   }
+}
+
+/*
  * Check the validity of the ACEs in a DACL or SACL
+ *
+ * If an ACE is recognized, we validate its SID.
+ * Otherwise, we validate its size only.
  */
 
 static BOOL valid_acl(const ACL *pacl, unsigned int end)
 {
-   const ACCESS_ALLOWED_ACE *pace;
-   unsigned int offace;
-   unsigned int acecnt;
-   unsigned int acesz;
-   unsigned int nace;
-   unsigned int wantsz;
-   BOOL ok;
+   unsigned int ace_count;
+   unsigned int ace_offset;
+   unsigned int ace_size;
+   unsigned int sid_offset;
+   const ACE_HEADER *pace;
+   const SID *psid;
 
-   ok = TRUE;
-   acecnt = le16_to_cpu(pacl->ace_count);
-   offace = sizeof(ACL);
-   for (nace = 0; (nace < acecnt) && ok; nace++) {
-   /* be sure the beginning is within range */
-   if ((offace + sizeof(ACCESS_ALLOWED_ACE)) > end)
-   ok = FALSE;
-   else {
-   pace = (const ACCESS_ALLOWED_ACE*)
-   &((const char*)pacl)[offace];
-   acesz = le16_to_cpu(pace->size);
-   if (((offace + acesz) > end)
-  || !ntfs_valid_sid(&pace->sid))
-ok = FALSE;
-   else {
-   /* Win10 may insert garbage in the last ACE */
-   wantsz = ntfs_sid_size(&pace->sid) + 8;
-   if (((nace < (acecnt - 1))
-   && (wantsz != acesz))
-   || (wantsz > acesz))
-   ok = FALSE;
-   }
-   offace += acesz;
-   }
+   for (ace_count = le16_to_cpu(pacl->ace_count), ace_offset = sizeof(ACL);
+ace_count != 0;
+ace_count--, ace_offset += ace_size)
+   {
+   if (sizeof(ACE_HEADER) > end - ace_offset)
+   return FALSE;
+
+   pace = (const ACE_HEADER *)((char *)pacl + ace_offset);
+   ace_size = le16_to_cpu(pace->size);
+   if (ace_size < sizeof(ACE_HEADER) ||
+   ace_size > end - ace_offset)
+   return FALSE;
+
+   if (is_regular_ace(pace))
+   sid_offset = offsetof(ACCESS_ALLOWED_ACE, sid);
+   else if (is_object_ace

[ntfs-3g-devel] [PATCH 1/2] Add definitions of ACE types up to Windows 10

2016-09-21 Thread Eric Biggers
The ACE types defined in layout.h were significantly out of date, as
Microsoft has defined a number of new ACE types over the years.

None of the new ACEs uses a new base structure, though it seems that
some can have (or usually have) additional data after the SID.

More information about the new ACEs can be found in the public
documentation on MSDN.

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 include/ntfs-3g/layout.h | 90 +---
 1 file changed, 62 insertions(+), 28 deletions(-)

diff --git a/include/ntfs-3g/layout.h b/include/ntfs-3g/layout.h
index 98380de..564167c 100644
--- a/include/ntfs-3g/layout.h
+++ b/include/ntfs-3g/layout.h
@@ -1406,28 +1406,52 @@ typedef enum {
  * enum ACE_TYPES - The predefined ACE types (8-bit, see below).
  */
 typedef enum {
-   ACCESS_MIN_MS_ACE_TYPE  = 0,
-   ACCESS_ALLOWED_ACE_TYPE = 0,
-   ACCESS_DENIED_ACE_TYPE  = 1,
-   SYSTEM_AUDIT_ACE_TYPE   = 2,
-   SYSTEM_ALARM_ACE_TYPE   = 3, /* Not implemented as of Win2k. */
-   ACCESS_MAX_MS_V2_ACE_TYPE   = 3,
-
-   ACCESS_ALLOWED_COMPOUND_ACE_TYPE= 4,
-   ACCESS_MAX_MS_V3_ACE_TYPE   = 4,
-
-   /* The following are Win2k only. */
-   ACCESS_MIN_MS_OBJECT_ACE_TYPE   = 5,
-   ACCESS_ALLOWED_OBJECT_ACE_TYPE  = 5,
-   ACCESS_DENIED_OBJECT_ACE_TYPE   = 6,
-   SYSTEM_AUDIT_OBJECT_ACE_TYPE= 7,
-   SYSTEM_ALARM_OBJECT_ACE_TYPE= 8,
-   ACCESS_MAX_MS_OBJECT_ACE_TYPE   = 8,
-
-   ACCESS_MAX_MS_V4_ACE_TYPE   = 8,
-
-   /* This one is for WinNT&2k. */
-   ACCESS_MAX_MS_ACE_TYPE  = 8,
+   ACCESS_MIN_MS_ACE_TYPE  = 0,
+   ACCESS_ALLOWED_ACE_TYPE = 0,
+   ACCESS_DENIED_ACE_TYPE  = 1,
+   SYSTEM_AUDIT_ACE_TYPE   = 2,
+   SYSTEM_ALARM_ACE_TYPE   = 3, /* reserved */
+   ACCESS_MAX_MS_V2_ACE_TYPE   = 3,
+
+   ACCESS_ALLOWED_COMPOUND_ACE_TYPE= 4, /* reserved */
+   ACCESS_MAX_MS_V3_ACE_TYPE   = 4,
+
+   /* Win2k and later */
+   ACCESS_MIN_MS_OBJECT_ACE_TYPE   = 5,
+   ACCESS_ALLOWED_OBJECT_ACE_TYPE  = 5,
+   ACCESS_DENIED_OBJECT_ACE_TYPE   = 6,
+   SYSTEM_AUDIT_OBJECT_ACE_TYPE= 7,
+   SYSTEM_ALARM_OBJECT_ACE_TYPE= 8, /* reserved */
+   ACCESS_MAX_MS_OBJECT_ACE_TYPE   = 8,
+
+   ACCESS_MAX_MS_V4_ACE_TYPE   = 8,
+
+   /* Apparently, this was the max type in Win2k, but for some reason MS
+* chose not to update this constant in later Windows versions */
+   ACCESS_MAX_MS_ACE_TYPE  = 8,
+
+   /* Windows XP and later */
+   ACCESS_ALLOWED_CALLBACK_ACE_TYPE= 9,
+   ACCESS_DENIED_CALLBACK_ACE_TYPE = 10,
+   ACCESS_ALLOWED_CALLBACK_OBJECT_ACE_TYPE = 11,
+   ACCESS_DENIED_CALLBACK_OBJECT_ACE_TYPE  = 12,
+   SYSTEM_AUDIT_CALLBACK_ACE_TYPE  = 13,
+   SYSTEM_ALARM_CALLBACK_ACE_TYPE  = 14, /* reserved */
+   SYSTEM_AUDIT_CALLBACK_OBJECT_ACE_TYPE   = 15,
+   SYSTEM_ALARM_CALLBACK_OBJECT_ACE_TYPE   = 16, /* reserved */
+
+   /* Windows Vista and later */
+   SYSTEM_MANDATORY_LABEL_ACE_TYPE = 17,
+
+   /* Windows 8 and later */
+   SYSTEM_RESOURCE_ATTRIBUTE_ACE_TYPE  = 18,
+   SYSTEM_SCOPED_POLICY_ID_ACE_TYPE= 19,
+
+   /* Windows 10 and later */
+   SYSTEM_PROCESS_TRUST_LABEL_ACE_TYPE = 20,
+
+   ACCESS_MAX_MS_V5_ACE_TYPE   = 20,
+
 } __attribute__((__packed__)) ACE_TYPES;
 
 /**
@@ -1628,9 +1652,7 @@ typedef struct {
  */
 
 /**
- * struct ACCESS_DENIED_ACE -
- *
- * ACCESS_ALLOWED_ACE, ACCESS_DENIED_ACE, SYSTEM_AUDIT_ACE, SYSTEM_ALARM_ACE
+ * struct ACCESS_ALLOWED_ACE, etc. - Base structure for all regular ACEs
  */
 typedef struct {
 /*  0  ACE_HEADER; -- Unfolded here as gcc doesn't like unnamed structs. */
@@ -1641,7 +1663,15 @@ typedef struct {
/*  4*/ACCESS_MASK mask;   /* Access mask associated with the ACE. */
 /*  8*/SID sid;/* The SID associated with the ACE. */
 } __attribute__((__packed__)) ACCESS_ALLOWED_ACE, ACCESS_DENIED_ACE,
-  SYSTEM_AUDIT_ACE, SYSTEM_ALARM_ACE;
+  SYSTEM_AUDIT_ACE, SYSTEM_ALARM_ACE,
+  ACCESS_ALLOWED_CALLBACK_ACE,
+  ACCESS_DENIED_CALLBACK_ACE,
+  SYSTEM_AUDIT_CALLBACK_ACE,
+  SYSTEM_ALARM_CALLBACK_ACE,
+  SYSTEM_MANDATORY_LABEL_ACE,
+  SYSTEM_RESOURCE_ATTRIBUTE_ACE,
+  SYSTEM_SCOPED_POLICY_ID_ACE,
+  SYSTEM_PROCESS_TRUST_LABEL_ACE;
 
 /**
  * enum OBJECT_ACE_FLAGS - The object ACE flags (32-bit).
@@ -1652

[ntfs-3g-devel] [PATCH] Eliminate NTFS_BUG()

2016-09-15 Thread Eric Biggers
NTFS_BUG() was broken because it relied on dereferencing a NULL pointer.
This is undefined behavior, and gcc was compiling out the statement.
Crashing in library code is also unfriendly in general.

There were only two users.  Make them just use regular error handling.
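
As a rough sketch of the "regular error handling" idea, assuming a simplified setting: `struct sizes` and `check_sizes()` below are hypothetical, and setting `errno` to `EIO` is a choice made for this example only (the patch itself logs and jumps to an existing cleanup label). The point is that a failed sanity check is reported to the caller instead of relying on a NULL dereference to stop the program.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the three size fields being checked. */
struct sizes {
	long long allocated_size;
	long long data_size;
	long long initialized_size;
};

/*
 * Report an inconsistency through the normal error path: log a message,
 * set errno (EIO is an assumption of this sketch), and return -1 so the
 * caller can undo its work.  Returns 0 when the sizes are consistent.
 */
static int check_sizes(const struct sizes *s)
{
	if (s->data_size > s->allocated_size ||
	    s->initialized_size > s->data_size) {
		fprintf(stderr, "size sanity checks failed\n");
		errno = EIO;
		return -1;
	}
	return 0;
}
```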

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 include/ntfs-3g/debug.h |  8 
 libntfs-3g/mft.c| 12 
 2 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/include/ntfs-3g/debug.h b/include/ntfs-3g/debug.h
index f7f3c6f..dba400b 100644
--- a/include/ntfs-3g/debug.h
+++ b/include/ntfs-3g/debug.h
@@ -36,12 +36,4 @@ extern void ntfs_debug_runlist_dump(const struct _runlist_element *rl);
 static __inline__ void ntfs_debug_runlist_dump(const struct _runlist_element *rl __attribute__((unused))) {}
 #endif
 
-#define NTFS_BUG(msg)  \
-{  \
-   int ___i = 1;   \
-   ntfs_log_critical("Bug in %s(): %s\n", __FUNCTION__, msg);  \
-   ntfs_log_debug("Forcing segmentation fault!");  \
-   ___i = ((int*)NULL)[___i];  \
-}
-
 #endif /* defined _NTFS_DEBUG_H */
diff --git a/libntfs-3g/mft.c b/libntfs-3g/mft.c
index 29f1f4b..85cd120 100644
--- a/libntfs-3g/mft.c
+++ b/libntfs-3g/mft.c
@@ -1276,8 +1276,10 @@ static int ntfs_mft_record_init(ntfs_volume *vol, s64 size)

/* Sanity checks. */
if (mft_na->data_size > mft_na->allocated_size ||
-   mft_na->initialized_size > mft_na->data_size)
-   NTFS_BUG("mft_na sanity checks failed");
+   mft_na->initialized_size > mft_na->data_size) {
+   ntfs_log_critical("mft_na sanity checks failed");
+   goto undo_data_init;
+   }

/* Sync MFT to minimize data loss if there won't be clean unmount. */
if (ntfs_inode_sync(mft_na->ni))
@@ -1343,8 +1345,10 @@ static int ntfs_mft_rec_init(ntfs_volume *vol, s64 size)

/* Sanity checks. */
if (mft_na->data_size > mft_na->allocated_size ||
-   mft_na->initialized_size > mft_na->data_size)
-   NTFS_BUG("mft_na sanity checks failed");
+   mft_na->initialized_size > mft_na->data_size) {
+   ntfs_log_critical("mft_na sanity checks failed");
+   goto undo_data_init;
+   }
 out:   
ntfs_log_leave("\n");
return ret;
-- 
2.9.3


--
___
ntfs-3g-devel mailing list
ntfs-3g-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ntfs-3g-devel


[ntfs-3g-devel] [PATCH] unistr.c: fix another buffer overrun in ntfs_utf16_to_utf8()

2016-09-15 Thread Eric Biggers
If an output buffer was provided, ntfs_utf16_to_utf8() limited the
output string length without the terminating null to 'outs_len'.  This
was incorrect because a terminating null was always added to the string,
causing a buffer overrun if the output string happened to have exactly
the maximum length.  This was a longstanding bug.  Fix it by leaving
space for a terminating null.
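
The rule being fixed can be shown with a small standalone sketch. `copy_terminated()` is a hypothetical helper, not part of NTFS-3G; it assumes, as the message above does, that the capacity counts the terminating null, so at most `cap - 1` payload bytes are accepted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy 'len' bytes from 'src' into 'dst' and add a terminating null.
 * 'cap' is the size of 'dst' in bytes *including* the terminator, so
 * the payload is limited to cap - 1 bytes.  Returns the payload length,
 * or -1 with errno set to ENAMETOOLONG if it does not fit.
 */
static int copy_terminated(char *dst, size_t cap, const char *src, size_t len)
{
	if (cap == 0 || len > cap - 1) {
		errno = ENAMETOOLONG;
		return -1;
	}
	memcpy(dst, src, len);
	dst[len] = '\0';	/* always within dst because len <= cap - 1 */
	return (int)len;
}
```

A payload of exactly `cap` bytes is the case that overran the buffer before the fix: the data fits, but the terminator lands one byte past the end.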

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 libntfs-3g/unistr.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/libntfs-3g/unistr.c b/libntfs-3g/unistr.c
index 190dbd8..e70e316 100644
--- a/libntfs-3g/unistr.c
+++ b/libntfs-3g/unistr.c
@@ -544,7 +544,7 @@ fail:
  * @ins:   input utf16 string buffer
  * @ins_len:   length of input string in utf16 characters
  * @outs:  on return contains the (allocated) output multibyte string
- * @outs_len:  length of output buffer in bytes
+ * @outs_len:  length of output buffer in bytes (ignored if *@outs is NULL)
  *
  * Return -1 with errno set if string has invalid byte sequence or too long.
  */
@@ -563,10 +563,16 @@ static int ntfs_utf16_to_utf8(const ntfschar *ins, const int ins_len,
int halfpair;
 
halfpair = 0;
-   if (!*outs)
+   if (!*outs) {
+   /* If no output buffer was provided, we will allocate one and
+* limit its length to PATH_MAX.  Note: we follow the standard
+* convention of PATH_MAX including the terminating null. */
outs_len = PATH_MAX;
+   }
 
-   size = utf16_to_utf8_size(ins, ins_len, outs_len);
+   /* The size *with* the terminating null is limited to @outs_len,
+* so the size *without* the terminating null is limited to one less. */
+   size = utf16_to_utf8_size(ins, ins_len, outs_len - 1);
 
if (size < 0)
goto out;
@@ -877,7 +883,7 @@ fail:
  * @ins:   input Unicode string buffer
  * @ins_len:   length of input string in Unicode characters
  * @outs:  on return contains the (allocated) output multibyte string
- * @outs_len:  length of output buffer in bytes
+ * @outs_len:  length of output buffer in bytes (ignored if *@outs is NULL)
  *
  * Convert the input little endian, 2-byte Unicode string @ins, of length
  * @ins_len into the multibyte string format dictated by the current locale.
-- 
2.9.3




[ntfs-3g-devel] [PATCH] unistr.c: remove unused function ntfs_file_value_upcase()

2016-09-15 Thread Eric Biggers
ntfs_file_value_upcase() is not called from anywhere in NTFS-3G, seems
unlikely to be used by third-party programs, and can be replaced with
calling ntfs_name_upcase() directly.  So remove it.

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 include/ntfs-3g/unistr.h |  3 ---
 libntfs-3g/unistr.c  | 17 -
 2 files changed, 20 deletions(-)

diff --git a/include/ntfs-3g/unistr.h b/include/ntfs-3g/unistr.h
index b6d428e..7ea0038 100644
--- a/include/ntfs-3g/unistr.h
+++ b/include/ntfs-3g/unistr.h
@@ -50,9 +50,6 @@ extern void ntfs_name_upcase(ntfschar *name, u32 name_len,
 extern void ntfs_name_locase(ntfschar *name, u32 name_len,
const ntfschar *locase, const u32 locase_len);
 
-extern void ntfs_file_value_upcase(FILE_NAME_ATTR *file_name_attr,
-   const ntfschar *upcase, const u32 upcase_len);
-
 extern int ntfs_ucstombs(const ntfschar *ins, const int ins_len, char **outs,
int outs_len);
 extern int ntfs_mbstoucs(const char *ins, ntfschar **outs);
diff --git a/libntfs-3g/unistr.c b/libntfs-3g/unistr.c
index e70e316..199aeba 100644
--- a/libntfs-3g/unistr.c
+++ b/libntfs-3g/unistr.c
@@ -416,23 +416,6 @@ void ntfs_name_locase(ntfschar *name, u32 name_len, const ntfschar *locase,
name[i] = locase[u];
 }
 
-/**
- * ntfs_file_value_upcase - Convert a filename to upper case
- * @file_name_attr:
- * @upcase:
- * @upcase_len:
- *
- * Description...
- *
- * Returns:
- */
-void ntfs_file_value_upcase(FILE_NAME_ATTR *file_name_attr,
-   const ntfschar *upcase, const u32 upcase_len)
-{
-   ntfs_name_upcase((ntfschar*)&file_name_attr->file_name,
-   file_name_attr->file_name_length, upcase, upcase_len);
-}
-
 /*
NTFS uses Unicode (UTF-16LE [NTFS-3G uses UCS-2LE, which is enough
for now]) for path names, but the Unicode code points need to be
-- 
2.9.3




[ntfs-3g-devel] [PATCH] unistr.c: make utf16_to_utf8_size() always honor @outs_len

2016-09-15 Thread Eric Biggers
utf16_to_utf8_size() was not guaranteed to fail with ENAMETOOLONG if the
computed length was greater than @outs_len.  This could cause a buffer
overrun in ntfs_utf16_to_utf8().  This was a bug introduced by the
patches to allow broken Unicode.  Fix it.
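
A minimal sketch of a size computation that honors its limit on every path, under simplifying assumptions: `utf8_size_of_utf16()` is a hypothetical function, it takes host-endian code units rather than `ntfschar`, and it is strict (unpaired surrogates are rejected, corresponding to a build without ALLOW_BROKEN_UNICODE). It is not the NTFS-3G implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the number of UTF-8 bytes (without a terminating null) needed
 * to encode 'len' UTF-16 code units.  On error return -1 with errno:
 *   EILSEQ        unpaired surrogate in the input
 *   ENAMETOOLONG  the result would exceed 'max_bytes'
 * 'max_bytes' is assumed to be non-negative.
 */
static int utf8_size_of_utf16(const uint16_t *in, size_t len, int max_bytes)
{
	int count = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		uint16_t c = in[i];

		if (c < 0x80) {
			count += 1;
		} else if (c < 0x800) {
			count += 2;
		} else if (c >= 0xd800 && c < 0xdc00) {
			/* High surrogate: a low surrogate must follow. */
			if (i + 1 >= len || in[i + 1] < 0xdc00 ||
			    in[i + 1] >= 0xe000) {
				errno = EILSEQ;
				return -1;
			}
			i++;		/* consume the low surrogate */
			count += 4;
		} else if (c >= 0xdc00 && c < 0xe000) {
			errno = EILSEQ;	/* lone low surrogate */
			return -1;
		} else {
			count += 3;
		}
		/* Checked after every character, so no path skips it. */
		if (count > max_bytes) {
			errno = ENAMETOOLONG;
			return -1;
		}
	}
	return count;
}
```

The sketch checks the limit inside the loop; the patch instead performs a single check after the loop and after the trailing-surrogate case, which gives the same guarantee.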

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 libntfs-3g/unistr.c | 26 +-
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/libntfs-3g/unistr.c b/libntfs-3g/unistr.c
index 4d33bb4..190dbd8 100644
--- a/libntfs-3g/unistr.c
+++ b/libntfs-3g/unistr.c
@@ -458,10 +458,15 @@ void ntfs_file_value_upcase(FILE_NAME_ATTR *file_name_attr,
 */
  
 /* 
- * Return the amount of 8-bit elements in UTF-8 needed (without the terminating
- * null) to store a given UTF-16LE string.
+ * Return the number of bytes in UTF-8 needed (without the terminating null) to
+ * store the given UTF-16LE string.
  *
- * Return -1 with errno set if string has invalid byte sequence or too long.
+ * On error, -1 is returned, and errno is set to the error code. The following
+ * error codes can be expected:
+ * EILSEQ  The input string is not valid UTF-16LE (only possible
+ * if compiled without ALLOW_BROKEN_UNICODE).
+ * ENAMETOOLONGThe length of the UTF-8 string in bytes (without the
+ * terminating null) would exceed @outs_len.
  */
 static int utf16_to_utf8_size(const ntfschar *ins, const int ins_len, int outs_len)
 {
@@ -470,7 +475,7 @@ static int utf16_to_utf8_size(const ntfschar *ins, const int ins_len, int outs_l
BOOL surrog;
 
surrog = FALSE;
-   for (i = 0; i < ins_len && ins[i]; i++) {
+   for (i = 0; i < ins_len && ins[i] && count <= outs_len; i++) {
unsigned short c = le16_to_cpu(ins[i]);
if (surrog) {
if ((c >= 0xdc00) && (c < 0xe000)) {
@@ -511,17 +516,20 @@ static int utf16_to_utf8_size(const ntfschar *ins, const int ins_len, int outs_l
count += 3;
else 
goto fail;
-   if (count > outs_len) {
-   errno = ENAMETOOLONG;
-   goto out;
-   }
}
-   if (surrog) 
+
+   if (surrog && count <= outs_len) {
 #if ALLOW_BROKEN_UNICODE
count += 3; /* ending with a single surrogate */
 #else
goto fail;
 #endif /* ALLOW_BROKEN_UNICODE */
+   }
+
+   if (count > outs_len) {
+   errno = ENAMETOOLONG;
+   goto out;
+   }
 
ret = count;
 out:
-- 
2.9.3




[ntfs-3g-devel] [PATCH] reparse.c: validate minimum size of mountpoint/symlink reparse points

2016-09-15 Thread Eric Biggers
valid_reparse_data() would read past the end of the reparse point buffer
if it was passed a malformed reparse point that had the tag for a
mountpoint or a symlink but had a data buffer smaller than expected.
Fix this by validating the buffer size.
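
The check can be illustrated with a self-contained sketch. The layouts below (`struct hdr`, `struct payload`) are simplified and hypothetical, not the NTFS-3G definitions, and `read_subst_offset()` returns the field in host byte order for brevity. The point is only the ordering: the buffer size is compared against header plus fixed payload before any payload field is read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified layouts used only for this sketch. */
struct hdr {
	uint32_t tag;
	uint16_t data_length;
	uint16_t reserved;
};

struct payload {
	uint16_t subst_name_offset;
	uint16_t subst_name_length;
	uint16_t print_name_offset;
	uint16_t print_name_length;
};

/*
 * Read the first payload field from a buffer of 'size' bytes.  Returns
 * 0 and stores the field in '*out' if the buffer is large enough to
 * hold the header plus the fixed payload fields, -1 otherwise.
 */
static int read_subst_offset(const unsigned char *buf, size_t size,
			     uint16_t *out)
{
	struct payload p;

	if (size < sizeof(struct hdr) + sizeof(struct payload))
		return -1;	/* too small: do not touch the payload */
	memcpy(&p, buf + sizeof(struct hdr), sizeof(p));
	*out = p.subst_name_offset;
	return 0;
}
```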

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 libntfs-3g/reparse.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/libntfs-3g/reparse.c b/libntfs-3g/reparse.c
index 354f7bb..b0f96ae 100644
--- a/libntfs-3g/reparse.c
+++ b/libntfs-3g/reparse.c
@@ -446,6 +446,11 @@ static BOOL valid_reparse_data(ntfs_inode *ni,
if (ok) {
switch (reparse_attr->reparse_tag) {
case IO_REPARSE_TAG_MOUNT_POINT :
+   if (size < sizeof(REPARSE_POINT) +
+  sizeof(struct MOUNT_POINT_REPARSE_DATA)) {
+   ok = FALSE;
+   break;
+   }
mount_point_data = (const struct MOUNT_POINT_REPARSE_DATA*)
reparse_attr->reparse_data;
offs = le16_to_cpu(mount_point_data->subst_name_offset);
@@ -458,6 +463,11 @@ static BOOL valid_reparse_data(ntfs_inode *ni,
ok = FALSE;
break;
case IO_REPARSE_TAG_SYMLINK :
+   if (size < sizeof(REPARSE_POINT) +
+  sizeof(struct SYMLINK_REPARSE_DATA)) {
+   ok = FALSE;
+   break;
+   }
symlink_data = (const struct SYMLINK_REPARSE_DATA*)
reparse_attr->reparse_data;
offs = le16_to_cpu(symlink_data->subst_name_offset);
-- 
2.9.3




Re: [ntfs-3g-devel] [PATCH] Correct validation of multi sector transfer protected records

2016-07-27 Thread Eric Biggers
On Wed, Jul 27, 2016 at 12:20:24PM +0200, Jean-Pierre André wrote:
> 
> Can you disambiguate the word "sector" here ? This is not
> a physical sector, but an ntfs logical sector whose size
> is NTFS_BLOCK_SIZE (512 bytes). This might not have been
> known to the original developer, and it would be useful
> to have it documented somewhere.
> 

Thanks, I've submitted a revised patch.



[ntfs-3g-devel] [PATCH] Correct validation of multi sector transfer protected records

2016-07-27 Thread Eric Biggers
I found that the validation contained an off-by-one error.  The
expression '(u32)(usa_ofs + (usa_count * 2)) > size' used 'usa_count'
after it had been decremented to skip the update sequence number entry.
Consequently, the code could read out of bounds, up to two bytes past the
end of the MST-protected record.

Furthermore, as documented in the comment in layout.h for "NTFS_RECORD"
and also on MSDN for "MULTI_SECTOR_HEADER", the update sequence array
must end before the last le16 in the first sector --- not merely before
the end of the record.

Fix the validation and move it into a helper function, as it was done
identically in the read and write paths.
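
To make the arithmetic concrete, here is a standalone version of the check using plain C types. It mirrors the helper added by the patch but is only a sketch: `BLOCK_SIZE` is a local stand-in fixed at 512 bytes, and the fixed-width types replace the NTFS-3G typedefs.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE 512u	/* assumed logical sector size for this sketch */

/*
 * 'usa_count' is the raw on-disk value: one update sequence number plus
 * one fixup entry per sector.  The array of usa_count 16-bit entries
 * must end before the last 16-bit word of the first sector, since that
 * word is itself replaced during fixup.
 */
static int is_valid_record(uint32_t size, uint16_t usa_ofs, uint16_t usa_count)
{
	return size % BLOCK_SIZE == 0 &&
	       usa_ofs % 2 == 0 &&
	       usa_count == 1 + size / BLOCK_SIZE &&
	       usa_ofs + (uint32_t)usa_count * 2 <= BLOCK_SIZE - 2;
}
```

For a 1024-byte record, `usa_count` must be 3. With `usa_ofs` = 506 the array would occupy bytes 506..511 and overlap the last word of the first sector, so the record is rejected. The old expression used the decremented count, computed 506 + 2 * 2 = 510, compared it against the full record size, and accepted it.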

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 libntfs-3g/mst.c | 45 +++--
 1 file changed, 27 insertions(+), 18 deletions(-)

diff --git a/libntfs-3g/mst.c b/libntfs-3g/mst.c
index 9dff773..88b9cdb 100644
--- a/libntfs-3g/mst.c
+++ b/libntfs-3g/mst.c
@@ -31,6 +31,21 @@
 #include "mst.h"
 #include "logging.h"
 
+/*
+ * Basic validation of a NTFS multi-sector record.  The record size must be a
+ * multiple of the sector size; and the update sequence array must be properly
+ * aligned, of the expected length, and must end before the last le16 in the
+ * first sector.
+ */
+static BOOL
+is_valid_record(u32 size, u16 usa_ofs, u16 usa_count)
+{
+   return size % NTFS_BLOCK_SIZE == 0 &&
+   usa_ofs % 2 == 0 &&
+   usa_count == 1 + (size / NTFS_BLOCK_SIZE) &&
+   usa_ofs + ((u32)usa_count * 2) <= NTFS_BLOCK_SIZE - 2;
+}
+
 /**
  * ntfs_mst_post_read_fixup - deprotect multi sector transfer protected data
  * @b: pointer to the data to deprotect
@@ -57,12 +72,9 @@ int ntfs_mst_post_read_fixup_warn(NTFS_RECORD *b, const u32 size,
 
/* Setup the variables. */
usa_ofs = le16_to_cpu(b->usa_ofs);
-   /* Decrement usa_count to get number of fixups. */
-   usa_count = le16_to_cpu(b->usa_count) - 1;
-   /* Size and alignment checks. */
-   if (size & (NTFS_BLOCK_SIZE - 1) || usa_ofs & 1 ||
-   (u32)(usa_ofs + (usa_count * 2)) > size ||
-   (size >> NTFS_BLOCK_SIZE_BITS) != usa_count) {
+   usa_count = le16_to_cpu(b->usa_count);
+
+   if (!is_valid_record(size, usa_ofs, usa_count)) {
errno = EINVAL;
if (warn) {
ntfs_log_perror("%s: magic: 0x%08lx  size: %ld "
@@ -91,7 +103,7 @@ int ntfs_mst_post_read_fixup_warn(NTFS_RECORD *b, const u32 size,
/*
 * Check for incomplete multi sector transfer(s).
 */
-   while (usa_count--) {
+   while (--usa_count) {
if (*data_pos != usn) {
/*
 * Incomplete multi sector transfer detected! )-:
@@ -109,10 +121,10 @@ int ntfs_mst_post_read_fixup_warn(NTFS_RECORD *b, const u32 size,
data_pos += NTFS_BLOCK_SIZE/sizeof(u16);
}
/* Re-setup the variables. */
-   usa_count = le16_to_cpu(b->usa_count) - 1;
+   usa_count = le16_to_cpu(b->usa_count);
data_pos = (u16*)b + NTFS_BLOCK_SIZE/sizeof(u16) - 1;
/* Fixup all sectors. */
-   while (usa_count--) {
+   while (--usa_count) {
/*
 * Increment position in usa and restore original data from
 * the usa into the data buffer.
@@ -171,12 +183,9 @@ int ntfs_mst_pre_write_fixup(NTFS_RECORD *b, const u32 size)
}
/* Setup the variables. */
usa_ofs = le16_to_cpu(b->usa_ofs);
-   /* Decrement usa_count to get number of fixups. */
-   usa_count = le16_to_cpu(b->usa_count) - 1;
-   /* Size and alignment checks. */
-   if (size & (NTFS_BLOCK_SIZE - 1) || usa_ofs & 1 ||
-   (u32)(usa_ofs + (usa_count * 2)) > size ||
-   (size >> NTFS_BLOCK_SIZE_BITS) != usa_count) {
+   usa_count = le16_to_cpu(b->usa_count);
+
+   if (!is_valid_record(size, usa_ofs, usa_count)) {
errno = EINVAL;
ntfs_log_perror("%s", __FUNCTION__);
return -1;
@@ -195,7 +204,7 @@ int ntfs_mst_pre_write_fixup(NTFS_RECORD *b, const u32 size)
/* Position in data of first le16 that needs fixing up. */
data_pos = (le16*)b + NTFS_BLOCK_SIZE/sizeof(le16) - 1;
/* Fixup all sectors. */
-   while (usa_count--) {
+   while (--usa_count) {
/*
 * Increment the position in the usa and save the
 * original data from the data buffer into the usa.
@@ -223,7 +232,7 @@ void ntfs_mst_post_write_fixup(NTFS_RECORD *b)
u16 *usa_pos, *data_pos;
 
u16 usa_ofs = le16_to_cpu(b->usa_ofs);
-   u16 usa_count = le16_to_cpu(b->usa_count) - 1

[ntfs-3g-devel] [RFC PATCH] Always open $Secure when mounting NTFS volume

2016-07-26 Thread Eric Biggers
Currently, applications that wish to access security descriptors have to
explicitly open the volume's security descriptor index ("$Secure") using
ntfs_open_secure().  Applications are also responsible for closing the
index when done with it.  However, the cleanup function for doing so,
ntfs_close_secure(), cannot be called easily by all applications because
it requires a SECURITY_CONTEXT argument, not simply the ntfs_volume.
Some applications therefore have to close the inode and index contexts
manually in order to clean up properly.

This proposal updates libntfs-3g to open $Secure unconditionally as part
of ntfs_mount(), so that applications do not have to worry about it.

ntfs_close_secure() is updated to take in a ntfs_volume for internal use,
and ntfs_destroy_security_context() is now the function to call to free
memory associated with a SECURITY_CONTEXT rather than a ntfs_volume.

Some memory leaks in error paths of ntfs_open_secure() are also fixed.
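
The leak fix relies on the usual goto-based unwinding pattern, which can be sketched without any NTFS-3G types. Everything here is hypothetical: `struct vol`, `open_all()`, `close_all()` and the `fail_at` parameter (which simulates a failure at a given step) exist only for this illustration. Each resource acquired before a failure is released in reverse order, and the volume fields are assigned only once all three succeed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a volume holding three dependent resources. */
struct vol {
	void *ni;
	void *sii;
	void *sdh;
};

/*
 * Acquire three resources in order.  'fail_at' (1..3) simulates a
 * failure at that step; 0 means no simulated failure.  On failure,
 * everything acquired so far is released and 'v' is left untouched.
 */
static int open_all(struct vol *v, int fail_at)
{
	void *ni, *sii, *sdh;

	if (v->ni)		/* already open */
		return 0;

	ni = (fail_at == 1) ? NULL : malloc(1);
	if (!ni)
		goto err;
	sii = (fail_at == 2) ? NULL : malloc(1);
	if (!sii)
		goto err_free_ni;
	sdh = (fail_at == 3) ? NULL : malloc(1);
	if (!sdh)
		goto err_free_sii;

	v->ni = ni;
	v->sii = sii;
	v->sdh = sdh;
	return 0;

err_free_sii:
	free(sii);
err_free_ni:
	free(ni);
err:
	return -1;
}

/* Release everything and reset the fields so open_all() can run again. */
static void close_all(struct vol *v)
{
	free(v->sdh);
	free(v->sii);
	free(v->ni);
	v->ni = v->sii = v->sdh = NULL;
}
```

Because the close function takes only the volume, it can be called from the unmount path without any additional context object.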

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 include/ntfs-3g/security.h |  4 ++-
 libntfs-3g/security.c  | 87 --
 libntfs-3g/volume.c|  7 
 src/lowntfs-3g.c   |  6 +---
 src/ntfs-3g.c  |  6 +---
 5 files changed, 65 insertions(+), 45 deletions(-)

diff --git a/include/ntfs-3g/security.h b/include/ntfs-3g/security.h
index b5c6375..d27599e 100644
--- a/include/ntfs-3g/security.h
+++ b/include/ntfs-3g/security.h
@@ -256,7 +256,9 @@ int ntfs_set_owner_mode(struct SECURITY_CONTEXT *scx,
 le32 ntfs_inherited_id(struct SECURITY_CONTEXT *scx,
ntfs_inode *dir_ni, BOOL fordir);
 int ntfs_open_secure(ntfs_volume *vol);
-void ntfs_close_secure(struct SECURITY_CONTEXT *scx);
+void ntfs_close_secure(ntfs_volume *vol);
+
+void ntfs_destroy_security_context(struct SECURITY_CONTEXT *scx);
 
 #if POSIXACLS
 
diff --git a/libntfs-3g/security.c b/libntfs-3g/security.c
index ef036af..f7085cc 100644
--- a/libntfs-3g/security.c
+++ b/libntfs-3g/security.c
@@ -4466,55 +4466,75 @@ int ntfs_set_ntfs_attrib(ntfs_inode *ni,
 
 
 /*
- * Open $Secure once for all
+ * Open the volume's security descriptor index ($Secure)
+ *
  * returns zero if it succeeds
- * non-zero if it fails. This is not an error (on NTFS v1.x)
+ * non-zero if it fails and the NTFS version is at least v3.x
  */
-
-
 int ntfs_open_secure(ntfs_volume *vol)
 {
ntfs_inode *ni;
-   int res;
+   ntfs_index_context *sii;
+   ntfs_index_context *sdh;
 
-   res = -1;
-   vol->secure_ni = (ntfs_inode*)NULL;
-   vol->secure_xsii = (ntfs_index_context*)NULL;
-   vol->secure_xsdh = (ntfs_index_context*)NULL;
-   if (vol->major_ver >= 3) {
-   /* make sure this is a genuine $Secure inode 9 */
-   ni = ntfs_pathname_to_inode(vol, NULL, "$Secure");
-   if (ni && (ni->mft_no == 9)) {
-   vol->secure_reentry = 0;
-   vol->secure_xsii = ntfs_index_ctx_get(ni,
-   sii_stream, 4);
-   vol->secure_xsdh = ntfs_index_ctx_get(ni,
-   sdh_stream, 4);
-   if (ni && vol->secure_xsii && vol->secure_xsdh) {
-   vol->secure_ni = ni;
-   res = 0;
-   }
-   }
+   if (vol->secure_ni) /* Already open? */
+   return 0;
+
+   ni = ntfs_pathname_to_inode(vol, NULL, "$Secure");
+   if (!ni)
+   goto err;
+
+   /* Verify that $Secure has the expected inode number. */
+   if (ni->mft_no != FILE_Secure) {
+   errno = EINVAL;
+   goto err_close_ni;
}
-   return (res);
+
+   /* Allocate the needed index contexts. */
+   sii = ntfs_index_ctx_get(ni, sii_stream, 4);
+   if (!sii)
+   goto err_close_ni;
+
+   sdh = ntfs_index_ctx_get(ni, sdh_stream, 4);
+   if (!sdh)
+   goto err_close_sii;
+
+   vol->secure_ni = ni;
+   vol->secure_xsii = sii;
+   vol->secure_xsdh = sdh;
+   return 0;
+
+err_close_sii:
+   ntfs_index_ctx_put(sii);
+err_close_ni:
+   ntfs_inode_close(ni);
+err:
+   /* Failing on NTFS versions before 3.x is expected */
+   if (vol->major_ver < 3)
+   return 0;
+   ntfs_log_perror("error opening $Secure");
+   return -1;
 }
 
 /*
- * Final cleaning
- * Allocated memory is freed to facilitate the detection of memory leaks
+ * Close the volume's security descriptor index ($Secure)
  */
-
-void ntfs_close_secure(struct SECURITY_CONTEXT *scx)
+void ntfs_close_secure(ntfs_volume *vol)
 {
-   ntfs_volume *vol;
-
-   vol = scx->vol;
 

[ntfs-3g-devel] [PATCH] ntfscmp: fix tautological comparison

2016-07-26 Thread Eric Biggers
Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 ntfsprogs/ntfscmp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ntfsprogs/ntfscmp.c b/ntfsprogs/ntfscmp.c
index 555e401..cabc9c0 100644
--- a/ntfsprogs/ntfscmp.c
+++ b/ntfsprogs/ntfscmp.c
@@ -547,7 +547,7 @@ static void cmp_index_allocation(ntfs_attr *na1, ntfs_attr *na2)
/*
 *  FIXME: ia can be the same even if the bitmap sizes are different.
 */
-   if (cia1.bm_size != cia1.bm_size)
+   if (cia1.bm_size != cia2.bm_size)
goto out;
 
if (cmp_buffer(cia1.bitmap, cia2.bitmap, cia1.bm_size, na1))
-- 
2.9.0




[ntfs-3g-devel] [PATCH] xattrs.c: remove unused variables

2016-07-26 Thread Eric Biggers
Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 libntfs-3g/xattrs.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/libntfs-3g/xattrs.c b/libntfs-3g/xattrs.c
index f17e4ca..2b7e709 100644
--- a/libntfs-3g/xattrs.c
+++ b/libntfs-3g/xattrs.c
@@ -81,10 +81,6 @@ struct LE_POSIX_ACL {
 #endif
 #endif
 
-static const char xattr_ntfs_3g[] = "ntfs-3g.";
-static const char nf_ns_user_prefix[] = "user.";
-static const int nf_ns_user_prefix_len = sizeof(nf_ns_user_prefix) - 1;
-
 static const char nf_ns_xattr_ntfs_acl[] = "system.ntfs_acl";
 static const char nf_ns_xattr_attrib[] = "system.ntfs_attrib";
 static const char nf_ns_xattr_attrib_be[] = "system.ntfs_attrib_be";
-- 
2.9.0




[ntfs-3g-devel] [PATCH] Filename collation cleanups

2016-07-26 Thread Eric Biggers
- Update documentation for COLLATION_RULES
- Document how ntfs_names_full_collate() compares names
- Update comments and DEBUG code to reflect that ntfs_names_full_collate()
  always accesses 'upcase', even in CASE_SENSITIVE mode
- Remove unneeded assignments to 'c1' and 'c2' in IGNORE_CASE mode

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 include/ntfs-3g/layout.h | 33 -
 libntfs-3g/unistr.c  | 28 +++-
 2 files changed, 31 insertions(+), 30 deletions(-)

diff --git a/include/ntfs-3g/layout.h b/include/ntfs-3g/layout.h
index e23fa10..ddd29c1 100644
--- a/include/ntfs-3g/layout.h
+++ b/include/ntfs-3g/layout.h
@@ -515,16 +515,15 @@ typedef enum {
  * enum COLLATION_RULES - The collation rules for sorting views/indexes/etc
  * (32-bit).
  *
- * COLLATION_UNICODE_STRING - Collate Unicode strings by comparing their binary
- * Unicode values, except that when a character can be uppercased, the
- * upper case value collates before the lower case one.
- * COLLATION_FILE_NAME - Collate file names as Unicode strings. The collation
- * is done very much like COLLATION_UNICODE_STRING. In fact I have no idea
- * what the difference is. Perhaps the difference is that file names
- * would treat some special characters in an odd way (see
- * unistr.c::ntfs_collate_names() and unistr.c::legal_ansi_char_array[]
- * for what I mean but COLLATION_UNICODE_STRING would not give any special
- * treatment to any characters at all, but this is speculation.
+ * COLLATION_BINARY - Collate by binary compare where the first byte is most
+ * significant.
+ * COLLATION_FILE_NAME - Collate Unicode strings by comparing their 16-bit
+ * coding units, primarily ignoring case using the volume's $UpCase table,
+ * but falling back to a case-sensitive comparison if the names are equal
+ * ignoring case.
+ * COLLATION_UNICODE_STRING - TODO: this is not yet implemented and still needs
+ * to be properly documented --- is it really the same as
+ * COLLATION_FILE_NAME?
  * COLLATION_NTOFS_ULONG - Sorting is done according to ascending le32 key
  * values. E.g. used for $SII index in FILE_Secure, which sorts by
  * security_id (le32).
@@ -549,17 +548,9 @@ typedef enum {
  * equal then the second le32 values would be compared, etc.
  */
 typedef enum {
-   COLLATION_BINARY = const_cpu_to_le32(0), /* Collate by binary
-   compare where the first byte is most
-   significant. */
-   COLLATION_FILE_NAME  = const_cpu_to_le32(1), /* Collate file names
-   as Unicode strings. */
-   COLLATION_UNICODE_STRING = const_cpu_to_le32(2), /* Collate Unicode
-   strings by comparing their binary
-   Unicode values, except that when a
-   character can be uppercased, the upper
-   case value collates before the lower
-   case one. */
+   COLLATION_BINARY= const_cpu_to_le32(0),
+   COLLATION_FILE_NAME = const_cpu_to_le32(1),
+   COLLATION_UNICODE_STRING= const_cpu_to_le32(2),
COLLATION_NTOFS_ULONG   = const_cpu_to_le32(16),
COLLATION_NTOFS_SID = const_cpu_to_le32(17),
COLLATION_NTOFS_SECURITY_HASH   = const_cpu_to_le32(18),
diff --git a/libntfs-3g/unistr.c b/libntfs-3g/unistr.c
index 54cfd46..4d33bb4 100644
--- a/libntfs-3g/unistr.c
+++ b/libntfs-3g/unistr.c
@@ -143,14 +143,24 @@ BOOL ntfs_names_are_equal(const ntfschar *s1, size_t s1_len,
  * @name1_len: length of first Unicode name to compare
  * @name2: second Unicode name to compare
  * @name2_len: length of second Unicode name to compare
- * @ic:either CASE_SENSITIVE or IGNORE_CASE
- * @upcase:upcase table (ignored if @ic is CASE_SENSITIVE)
- * @upcase_len:upcase table size (ignored if @ic is CASE_SENSITIVE)
+ * @ic:either CASE_SENSITIVE or IGNORE_CASE (see below)
+ * @upcase:upcase table
+ * @upcase_len:upcase table size
  *
- *  -1 if the first name collates before the second one,
- *   0 if the names match,
- *   1 if the second name collates before the first one, or
+ * If @ic is CASE_SENSITIVE, then the names are compared primarily ignoring
+ * case, but if the names are equal ignoring case, then they are compared
+ * case-sensitively.  As an example, "abc" would collate before "BCD" (since
+ * "abc" and "BCD" differ ignoring case and 'A' < 'B') but after "ABC" (since
+ * "ABC" and "abc" are equal ignoring case and 'A' < 'a').  This matches the
+ * collation order of filenames as indexed in NTFS directories.
+ *
+ * If @ic is IGNOR
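
The two-level ordering that the new comment describes for CASE_SENSITIVE mode can be sketched as follows. This is only an illustration, not the code in unistr.c: collate_names_sketch() and upcase_unit() are hypothetical names, and the ASCII-only upcase_unit() is an assumed stand-in for the volume's $UpCase table.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the volume's $UpCase table: ASCII letters only. */
static uint16_t upcase_unit(uint16_t c)
{
	return (c >= 'a' && c <= 'z') ? (uint16_t)(c - 'a' + 'A') : c;
}

/*
 * Returns -1 if name1 collates before name2, 1 if after, 0 if identical.
 * The primary order ignores case; a case-sensitive comparison breaks ties.
 */
int collate_names_sketch(const uint16_t *name1, size_t len1,
			 const uint16_t *name2, size_t len2)
{
	size_t n = len1 < len2 ? len1 : len2;
	int tie_break = 0;	/* first case-sensitive difference, if any */
	size_t i;

	for (i = 0; i < n; i++) {
		uint16_t c1 = name1[i], c2 = name2[i];
		uint16_t u1 = upcase_unit(c1), u2 = upcase_unit(c2);

		if (u1 != u2)	/* names differ even when ignoring case */
			return u1 < u2 ? -1 : 1;
		if (!tie_break && c1 != c2)
			tie_break = c1 < c2 ? -1 : 1;
	}
	if (len1 != len2)	/* the shorter name is a prefix of the other */
		return len1 < len2 ? -1 : 1;
	return tie_break;
}
```

With these rules "abc" sorts before "BCD" but after "ABC", matching the example given in the comment.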

[ntfs-3g-devel] New repository for system compression plugin

2016-07-02 Thread Eric Biggers
Hello,

I have made the NTFS-3G system compression plugin available in a new repository
at https://github.com/ebiggers/ntfs-3g-system-compression.

I also made a few small updates and updated the build system to use autotools.
With libntfs-3g installed including headers, the plugin can be built and
installed with the standard './configure && make && sudo make install'.  Or at
least that's the intent --- I still need to do more testing on more platforms.

Eric



[ntfs-3g-devel] [PATCH 2/2] Conditionally compile debugging code in ntfs_delete()

2016-06-21 Thread Eric Biggers
Although ntfs_log_trace() is defined to a no-op in non-DEBUG builds,
ntfs_attr_name_get() is not.  This function performs a string conversion
and a memory allocation, so it is nice to have the call to it compiled
out when not needed.

Signed-off-by: Eric Biggers <ebigge...@gmail.com>
---
 libntfs-3g/dir.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/libntfs-3g/dir.c b/libntfs-3g/dir.c
index bd049d2..6e97ee7 100644
--- a/libntfs-3g/dir.c
+++ b/libntfs-3g/dir.c
@@ -1906,17 +1906,21 @@ int ntfs_delete(ntfs_volume *vol, const char *pathname,
 search:
while (!(err = ntfs_attr_lookup(AT_FILE_NAME, AT_UNNAMED, 0,
CASE_SENSITIVE, 0, NULL, 0, actx))) {
+   #ifdef DEBUG
char *s;
+   #endif
IGNORE_CASE_BOOL case_sensitive = IGNORE_CASE;
 
fn = (FILE_NAME_ATTR*)((u8*)actx->attr +
le16_to_cpu(actx->attr->value_offset));
+   #ifdef DEBUG
s = ntfs_attr_name_get(fn->file_name, fn->file_name_length);
ntfs_log_trace("name: '%s'  type: %d  dos: %d  win32: %d  "
   "case: %d\n", s, fn->file_name_type,
   looking_for_dos_name, looking_for_win32_name,
   case_sensitive_match);
ntfs_attr_name_free();
+   #endif
if (looking_for_dos_name) {
if (fn->file_name_type == FILE_NAME_DOS)
break;
-- 
2.9.0
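
As a standalone illustration of the pattern used in this patch, here is a minimal sketch. The names name_to_string(), visit_name() and conversions_done are hypothetical and exist only for this example; the point is that a helper which allocates memory is not compiled in at all unless DEBUG is defined, rather than relying on the logging macro being a no-op.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Counts conversions so that the cost is observable; illustration only. */
int conversions_done;

/* Hypothetical stand-in for a conversion that allocates memory. */
char *name_to_string(const char *raw, size_t len)
{
	char *s = malloc(len + 1);

	conversions_done++;
	if (s) {
		memcpy(s, raw, len);
		s[len] = '\0';
	}
	return s;
}

/* Returns the name length; only DEBUG builds pay for the conversion. */
size_t visit_name(const char *raw, size_t len)
{
#ifdef DEBUG
	char *s = name_to_string(raw, len);

	fprintf(stderr, "name: '%s'\n", s ? s : "(alloc failed)");
	free(s);
#else
	(void)raw;	/* not otherwise used in non-DEBUG builds */
#endif
	return len;
}
```

Built without -DDEBUG, visit_name() never calls name_to_string(), so conversions_done stays at zero.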




Re: [ntfs-3g-devel] Experimental support for Windows 10 "System Compressed" files

2015-12-05 Thread Eric Biggers
On Fri, Dec 04, 2015 at 10:03:41AM +0100, Jean-Pierre André wrote:
> For creating a new compressed file, the procedure would be :
> - create a new void file
> - "truncate" it to the desired size (hence a void sparse file)
> - set reparse data for the desired compression mode
> - feed the data sequentially.
> 
> Only the last step would go through the plugin, and the plugin
> knows how much space has to be reserved for the pointers to
> compression blocks.

Interesting idea.  I hadn't considered how, if at all, the plugin should support
*creating* system compressed files.

I think it is actually already possible for a 3rd party application to create a
system compressed file on a volume mounted by a released version of NTFS-3g ---
either with FUSE driver or directly with the library.  The process would be as
follows:

1. create empty file and truncate it to the desired size (same as you described)
2. open the "WOFCompressedData" named data stream of the file and write the
   compressed data to it
3. create the reparse point

The hard part is, of course, creating the compressed data stream.  I think what
you're suggesting is that *uncompressed* data would be written to the file and
the plugin would automatically compress it, which would mean users wouldn't have
to deal with that part.  I think it will be possible, provided that the
uncompressed size is known ahead of time and the writes are made sequentially.
The plugin could detect out-of-order writes and fail them.
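
A rough sketch of how such a check might look is shown below. Nothing here is taken from the plugin; struct seq_write_state and both functions are hypothetical, and the choice of EINVAL and EFBIG as error codes is an assumption.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-file state a plugin might keep while a file is filled. */
struct seq_write_state {
	uint64_t total_size;	/* uncompressed size, known ahead of time */
	uint64_t next_offset;	/* where the next write must start */
};

/*
 * Accept a write only if it starts exactly where the previous one ended and
 * stays within the declared size.  Returns 0 on success or a negative errno.
 */
int seq_write_accept(struct seq_write_state *st, uint64_t offset,
		     uint64_t count)
{
	if (offset != st->next_offset)
		return -EINVAL;		/* out-of-order or overlapping write */
	if (count > st->total_size - st->next_offset)
		return -EFBIG;		/* would exceed the declared size */
	st->next_offset += count;
	return 0;
}

/* All data has arrived once the running offset reaches the declared size. */
int seq_write_complete(const struct seq_write_state *st)
{
	return st->next_offset == st->total_size;
}
```

Once seq_write_complete() reports true, the plugin would know that every chunk has been seen exactly once and in order.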

I would still have to do my own thing in wimlib if I wanted to extract files as
system-compressed, since it uses libntfs-3g directly.  So this ability would be
for FUSE driver users.  Of course, the same will also apply to reading system
compressed files.

I think the audience for *reading* system compressed files is much larger than
the audience for *creating* system compressed files, since you always have the
option of just creating standard uncompressed files, whereas users might have no
choice in reading compressed files created by Windows.

Something else to keep in mind is that allowing a system-compressed file to be
opened for writing would conflict with any attempt by the plugin to emulate
Windows' behavior where it automatically turns the file into a standard
uncompressed file when it is opened for writing (perhaps it could otherwise be
done in the ->open() hook).

Eric



Re: [ntfs-3g-devel] Experimental support for Windows 10 "System Compressed" files

2015-12-05 Thread Eric Biggers
Hi,

On Fri, Dec 04, 2015 at 10:03:41AM +0100, Jean-Pierre André wrote:
> Hi Eric,
> 
> Please see http://jp-andre.pagesperso-orange.fr/systcomp.tar.gz
> with (most of) your comments taken into account.

It generally looks good.  I did a basic test of the system compression plugin
(reading files only) and it worked correctly.  Here are some comments on the
code:

- I think there should be a comment in plugin.h that describes the reparse
plugin architecture and when the header is needed.  It should clarify that the
header is there for plugin development for the FUSE drivers and is not part of
the libntfs-3g API.

- The 'size' argument to truncate() should be off_t.

- The documentation for init() should mention the errno to set (currently EINVAL
in your proposal) if the reparse tag is not supported.

- The documentation for readlink() says the link target must be encoded in
UTF-8, but I don't believe that's necessarily true because NTFS-3g supports
alternate locales.  The link target will need to be returned as a "multibyte
string" as provided by ntfs_ucstombs(), which is what ntfs_make_symlink() does.

- ntfs_get_reparse_data() needs a comment to document it.  It should note that
the return value, if not NULL, has already been validated as a proper reparse
point.  Also, there is the case where the file is not a reparse point.  It looks
like the function would fail with ENOENT in that case; is that the proper error
code or should it be something else like ENODATA?  Another issue is that it is
prone to be confused with ntfs_get_ntfs_reparse_data().  Maybe it would be
better to call it ntfs_get_reparse_point()?

- There is a lot of boilerplate code around calling the reparse point
operations.  I have two ideas for improvement.  First, there could be a function
that combines ntfs_get_reparse_data() and get_reparse_plugin():

const struct plugin_operations *select_reparse_plugin(ntfs_inode *ni,
						       ntfs_fuse_context_t *ctx,
						       REPARSE_POINT **reparse_ret)
{
	REPARSE_POINT *reparse;
	const struct plugin_operations *ops;

	reparse = ntfs_get_reparse_data(ni);

	if (!reparse)
		return NULL;

	ops = get_reparse_plugin(ctx, reparse->reparse_tag);
	if (ops)
		*reparse_ret = reparse;
	else
		free(reparse);
	return ops;
}

Second, there could be a macro which calls a plugin operation:

#define CALL_REPARSE_PLUGIN(ni, op_name, ...)				\
({									\
	const struct plugin_operations *ops;				\
	REPARSE_POINT *reparse;						\
	int res;							\
									\
	ops = select_reparse_plugin(ni, ctx, &reparse);			\
	if (ops) {							\
		if (ops->op_name)					\
			res = ops->op_name(ni, reparse, ##__VA_ARGS__);	\
		else							\
			res = -EOPNOTSUPP;				\
		free(reparse);						\
	} else {							\
		res = -errno;						\
	}								\
	res;								\
})

Maybe there would be too much magic going on behind the scenes with the macro,
but it does get rid of the boilerplate code.  Example for truncate():

	if (ni->flags & FILE_ATTR_REPARSE_POINT) {
		if (stream_name_len) {
			res = -EINVAL;
			goto exit;
		}
		res = CALL_REPARSE_PLUGIN(ni, truncate, size);
		if (res)
			goto exit;
		set_archive(ni);
		goto stamps;
	}

- Perhaps it should be possible to disable external plugins at build time as a
./configure option?

- In the final version, I think there should be a dedicated directory created
for plugins.  It's not really appropriate to drop plugins in the top-level
system library directory.  Probably the plugin directory should be settable
by ./configure and should default to something like ${libdir}/ntfs-3g/.

- The function names "set_reparse_plugin()" and "set_internal_reparse_plugins()"
seem a little nonstandard.  I think they should use the verb "register" instead
of "set".  You're "registering" a plugin, not "setting" a plugin.

- Interesting idea with the fi->fh value.  I'll have to see if I can do 

Re: [ntfs-3g-devel] Experimental support for Windows 10 "System Compressed" files

2015-11-27 Thread Eric Biggers
On Wed, Nov 25, 2015 at 10:32:06AM +0100, Jean-Pierre André wrote:
> Agreed. It should be plugin.h, but where should this be
> located ("src/plugin.h" ?)

There are a few options I can think of:

1.) Make it include/ntfs-3g/plugin.h and install it with the library headers
2.) Make it src/plugin.h and require that compiling a ntfs-3g plugin requires
access to the ntfs-3g source tree
3.) Make it src/plugin.h and install it as a separate "ntfs-3g-plugin-devel"
package for plugin developers to use

(3) sounds like "the right way", but for now I think it would be overkill to
make a separate package just for that header.

What do you think about (1) or (2)?

> Microsoft advertises IO_REPARSE_TAG_WIM (0x80000008)
> Is that not used for WIMBoot files ?

No.  IO_REPARSE_TAG_WIM is used to indicate a WIM image that has been "mounted"
with ImageX or DISM, whereas IO_REPARSE_TAG_WOF with the "WIM provider" is used
to indicate a single file whose data (specifically, the unnamed data stream) is
stored in a resource in a WIM file.  The former has been around since Windows
Vista, whereas the latter was added in Windows 8.1 to support the "WIMBoot"
feature.
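
To make the distinction concrete, here is a small sketch of how a tool might tell the cases apart.  The numeric tag and provider values are the ones I believe Microsoft's headers define; please verify them before relying on them.  The enum and classify_reparse() are hypothetical names used only for this illustration, and the "file provider" case is included on the assumption that it is what the newer system compression feature uses.

```c
#include <stdint.h>

/* Values as published in Microsoft's headers, to the best of my knowledge. */
#define IO_REPARSE_TAG_WIM	0x80000008u	/* mounted WIM image */
#define IO_REPARSE_TAG_WOF	0x80000017u	/* Windows Overlay Filter */
#define WOF_PROVIDER_WIM	1u		/* data lives in a WIM resource */
#define WOF_PROVIDER_FILE	2u		/* per-file "system compression" */

enum backing_kind {
	BACKING_NONE,			/* tag unrelated to WIM or WOF */
	BACKING_MOUNTED_WIM,		/* ImageX/DISM mount point */
	BACKING_WIMBOOT,		/* WOF with the WIM provider */
	BACKING_SYSTEM_COMPRESSED,	/* WOF with the file provider */
	BACKING_UNKNOWN_WOF,		/* WOF with an unrecognized provider */
};

/* Classify a reparse point by its tag and, for WOF, its provider number. */
enum backing_kind classify_reparse(uint32_t tag, uint32_t wof_provider)
{
	if (tag == IO_REPARSE_TAG_WIM)
		return BACKING_MOUNTED_WIM;
	if (tag != IO_REPARSE_TAG_WOF)
		return BACKING_NONE;
	if (wof_provider == WOF_PROVIDER_WIM)
		return BACKING_WIMBOOT;
	if (wof_provider == WOF_PROVIDER_FILE)
		return BACKING_SYSTEM_COMPRESSED;
	return BACKING_UNKNOWN_WOF;
}
```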



Re: [ntfs-3g-devel] Experimental support for Windows 10 "System Compressed" files

2015-11-20 Thread Eric Biggers
Hi Jean-Pierre,

I've made a few updates to the "system compression" branch.

I finally got around to testing files with uncompressed size >= 4 GiB.  It turns
out that Windows *does* permit system compression on such files.  The file
format changes slightly to accommodate 64-bit offsets rather than 32-bit offsets
(exactly the same as in WIM archives), so I updated the code accordingly.
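
For illustration, the width switch could be handled along these lines.  This is a sketch rather than the code in my branch: the function names are made up, and it assumes the chunk table is a plain array of little-endian offsets whose entries are 4 bytes wide until the uncompressed size no longer fits in 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Entry width in bytes: 8 once the uncompressed size needs more than 32 bits. */
size_t chunk_entry_size(uint64_t uncompressed_size)
{
	return uncompressed_size > UINT32_MAX ? 8 : 4;
}

/* Decode entry 'idx' of a little-endian chunk offset table. */
uint64_t chunk_table_entry(const uint8_t *table, size_t idx,
			   uint64_t uncompressed_size)
{
	size_t width = chunk_entry_size(uncompressed_size);
	const uint8_t *p = table + idx * width;
	uint64_t value = 0;
	size_t i;

	/* Assemble the value byte by byte so host endianness does not matter. */
	for (i = 0; i < width; i++)
		value |= (uint64_t)p[i] << (8 * i);
	return value;
}
```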

I added a check in ntfs_fuse_open() to forbid writing to the unnamed data stream
of system compressed files, since it is not supported.  Such files are
effectively read-only; the write bit is being cleared in the mode as well.  I
suppose it would be possible to implement Windows' behavior where it
automatically decompresses the file if you try to write to it, but I'm passing
on that for now.

I simplified chunk caching in the decompression context.  Now it just holds the
most recently decompressed chunk, which should be good enough for library users
who are unaware of the precise compression chunk size.  However, the FUSE driver
still just opens the inode and allocates a new decompression context for every
read.  Since the FUSE driver --- the high-level one, at least --- doesn't
currently maintain file descriptor structures, there wasn't much that could be
done.  But it does do big reads, as you mentioned.
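
The single-chunk cache amounts to something like the sketch below.  It is not the code from the branch: struct chunk_cache, cached_read_byte() and the fake decompressor are hypothetical, and the fixed 4096-byte chunk size is an assumption made to keep the example short.

```c
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 4096u	/* assumed chunk size for this sketch */

struct chunk_cache {
	int valid;			/* nonzero once 'data' holds a chunk */
	uint64_t chunk_idx;		/* index of the cached chunk */
	uint8_t data[CHUNK_SIZE];	/* its decompressed contents */
	unsigned decompress_calls;	/* instrumentation for the example */
};

/* Hypothetical decompressor: fills the chunk with a byte derived from idx. */
static void fake_decompress(uint64_t idx, uint8_t *out)
{
	memset(out, (int)(idx & 0xff), CHUNK_SIZE);
}

/* Read one byte at 'offset', decompressing only on a cache miss. */
uint8_t cached_read_byte(struct chunk_cache *c, uint64_t offset)
{
	uint64_t idx = offset / CHUNK_SIZE;

	if (!c->valid || c->chunk_idx != idx) {
		fake_decompress(idx, c->data);
		c->chunk_idx = idx;
		c->valid = 1;
		c->decompress_calls++;
	}
	return c->data[offset % CHUNK_SIZE];
}
```

A caller that issues many small reads inside the same chunk triggers only one decompression, but the benefit is lost if the context is thrown away after every read, which is what the FUSE driver currently does.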
  
(Side note: in the FUSE filesystem I have in wimlib for mounting WIM images, I
set the 'fh' member of the 'struct fuse_file_info' to a file descriptor
structure in the ->open() operation, and I have 'flag_nullpath_ok' set in the
'struct fuse_operations'.  Then, I just get the file descriptor structure, with
no path, passed to operations such as ->read().  If something like that could be
done with NTFS-3g and objects like inodes could be left open for many reads or
writes, I expect it would make things a bit faster for all users.  Maybe it's
not possible because you could end up with the same inode opened multiple times
at once, in different file descriptors...)

Finally, I made a few other code cleanups and added a short subsection to the
ntfs-3g man page.

Eric


On Tue, Sep 22, 2015 at 10:54:10PM -0500, Eric Biggers wrote:
> I've pushed changes to my repository that address a few things you brought
> up:
> 
> - compiler warnings addressed
> - decompression memory allocated on heap rather than stack
> - a couple optimizations for decompression speed
> 
> I'll take a closer look at the interaction with the NTFS-3g driver when I
> have time.
> 
> 
> 
> On Tue, Sep 22, 2015 at 10:49 PM, Eric Biggers <ebigge...@gmail.com> wrote:
> 
> > Hi,
> >
> > "WOF compression" is as good as the other names.  It still seems slightly
> > wrong because WOF (the "Windows Overlay Filesystem Filter") is a more
> > general feature, and this is actually the *second* compression technology
> > that Microsoft has built on top of it (the first was "WIMBoot").  For now,
> > I'll keep the code the way it is, using the "system compression" name.  It
> > could be that Microsoft will release more documentation for this.
> >
> > Yes, your reparse data indicates XPRESS4K compression (the fourth 32-bit
> > little endian word is 0).  FYI, here are the compressed sizes I get with
> > the Silesia corpus (uncompressed size: 211,938,580 bytes total):
> >
> > LZNT1 (NTFS compression): 121,049,088 bytes
> > XPRESS4K: 104,124,416 bytes
> > XPRESS8K: 95,465,472 bytes
> > XPRESS16K: 90,460,160 bytes
> > LZX: 69,144,576 bytes
> >
> > Even though FUSE makes big reads, it would be nice to not have to allocate
> > a decompression context for every read.  That would avoid doing all of the
> > following on a per-read basis:
> > - open WofCompressedData attribute
> > - allocate heap memory for ntfs_system_decompression_ctx
> > - allocate heap memory for XPRESS or LZX
> > - read chunk offsets from the compressed file's chunk table
> >
> > Having an external tool to create "system compressed" files, if people
> > want that support, is probably the way to go.  Probably that would be
> > possible even with no changes in libntfs-3g.
> >
> > Eric
> >
> >



[ntfs-3g-devel] [RESEND] Incorrect handling of attribute starting with single-cluster extent

2015-11-14 Thread Eric Biggers
[Resending with only the script attached, since the original apparently didn't
go through]

Hi,

I finally have more information, and a potential solution, for the apparent NTFS
corruption bug I've been encountering during randomized tests.  The bug, as I've
been experiencing it, results in an unreadable directory where readdir fails
with EIO.  I've attached a script that creates a small volume exhibiting this
bug.

Based on my analysis, the bug is actually read-side.  In the example volume, the
unreadable directory has an ATTRIBUTE_LIST attribute and an INDEX_ALLOCATION
attribute occupying two clusters, each in a different extent.  Therefore, the
first INDEX_ALLOCATION extent has lowest_vcn=0 and highest_vcn=0, and the second
has lowest_vcn=1 and highest_vcn=1.

This unusual case, which is apparently created by the combination of a small
volume and near-full MFT records, triggers some special-case behavior in
ntfs_mapping_pairs_decompress_i() near line 950 in runlist.c:

>   /*
>* A highest_vcn of zero means this is a single extent
>* attribute so simply terminate the runlist with LCN_ENOENT).
>*/

That behavior is incorrect if the attribute's first extent only contains a
single cluster, since in that case highest_vcn=0 as well.
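
To spell out the ambiguity: with two single-cluster extents, the base extent has highest_vcn=0 even though the attribute continues in another extent, so highest_vcn alone cannot tell the two situations apart.  Comparing the cluster count implied by allocated_size against the number of clusters actually decoded does distinguish them.  A minimal sketch of that comparison follows; classify_base_extent() and the enum are hypothetical names, and the corrupt case where the runs cover more clusters than allocated_size allows is deliberately left out.

```c
#include <stdint.h>

enum runlist_tail {
	TAIL_COMPLETE,		/* terminate with LCN_ENOENT */
	TAIL_MORE_EXTENTS,	/* terminate with LCN_RL_NOT_MAPPED */
};

/*
 * allocated_size:    the attribute's allocated size in bytes (only valid in
 *                    the base extent, i.e. when lowest_vcn is 0)
 * cluster_size_bits: log2 of the cluster size
 * next_vcn:          first VCN not covered by the runs decoded so far
 */
enum runlist_tail classify_base_extent(int64_t allocated_size,
				       unsigned cluster_size_bits,
				       int64_t next_vcn)
{
	int64_t cluster_size = (int64_t)1 << cluster_size_bits;
	/* Round up to whole clusters. */
	int64_t num_clusters = (allocated_size + cluster_size - 1) >>
			       cluster_size_bits;

	return num_clusters > next_vcn ? TAIL_MORE_EXTENTS : TAIL_COMPLETE;
}
```

For the example volume (two clusters allocated, base extent covering only VCN 0) this yields TAIL_MORE_EXTENTS, whereas a genuinely single-cluster attribute yields TAIL_COMPLETE.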

For what it's worth, I tested the volume on Windows and it *is* able to
successfully read the directory.  This supports the hypothesis that the volume
is valid and NTFS-3g has a bug on the read side.

I think that this bug could, in theory, occur with any non-resident attribute,
not just INDEX_ALLOCATION attributes.

Finally, here is a proposed patch to fix the bug; please read it carefully since
a lot of the code has been new to me:

diff --git a/libntfs-3g/runlist.c b/libntfs-3g/runlist.c
index 7e158d4..7bb9da9 100644
--- a/libntfs-3g/runlist.c
+++ b/libntfs-3g/runlist.c
@@ -939,40 +939,39 @@ mpa_err:
"attribute.\n");
goto err_out;
}
-   /* Setup not mapped runlist element if this is the base extent. */
+
+   /* If this is the base extent (if 'lowest_vcn' is 0), then
+* 'allocated_size' is valid, and we can use it to compute the total
+* number of clusters across all extents.  If the runlist covers all
+* clusters, then there was just a single extent and we can terminate
+* the runlist with LCN_ENOENT.  Otherwise, we must terminate the runlist
+* with LCN_RL_NOT_MAPPED and let the caller look for more extents.  */
if (!attr->lowest_vcn) {
-   VCN max_cluster;
+   VCN num_clusters;
 
-   max_cluster = ((sle64_to_cpu(attr->allocated_size) +
+   num_clusters = ((sle64_to_cpu(attr->allocated_size) +
vol->cluster_size - 1) >>
-   vol->cluster_size_bits) - 1;
-   /*
-* A highest_vcn of zero means this is a single extent
-* attribute so simply terminate the runlist with LCN_ENOENT).
-*/
-   if (deltaxcn) {
-   /*
-* If there is a difference between the highest_vcn and
-* the highest cluster, the runlist is either corrupt
-* or, more likely, there are more extents following
-* this one.
-*/
-   if (deltaxcn < max_cluster) {
-   ntfs_log_debug("More extents to follow; deltaxcn = "
-   "0x%llx, max_cluster = 0x%llx\n",
-   (long long)deltaxcn,
-   (long long)max_cluster);
-   rl[rlpos].vcn = vcn;
-   vcn += rl[rlpos].length = max_cluster - deltaxcn;
-   rl[rlpos].lcn = (LCN)LCN_RL_NOT_MAPPED;
-   rlpos++;
-   } else if (deltaxcn > max_cluster) {
-   ntfs_log_debug("Corrupt attribute. deltaxcn = "
-   "0x%llx, max_cluster = 0x%llx\n",
-   (long long)deltaxcn,
-   (long long)max_cluster);
-   goto mpa_err;
-   }
+   vol->cluster_size_bits);
+
+   if (num_clusters > vcn) {
+   /* The runlist doesn't cover all the clusters, so there
+* must be more extents.  */
+   ntfs_log_debug("More extents to follow; vcn = 0x%llx, "
+  "num_clusters = 0x%llx\n",
+   (long long)vcn,
+   (long 

Re: [ntfs-3g-devel] ENOSPC when adding file to directory with near-full MFT record

2015-11-05 Thread Eric Biggers
On Wed, Nov 04, 2015 at 08:10:56AM +0100, Jean-Pierre André wrote:
> Hi Eric,
> 
> Attached is the patch (simpler than I first thought).
> 
> Jean-Pierre

Thanks.  I tested the patch and it made the ENOSPC problem go away.

I'm currently trying to track down a corruption problem that seems to trigger
under a similar set of very specific circumstances.  It occurs both before and
after this patch, and I expect it is a different problem.  More information to
come...



Re: [ntfs-3g-devel] [BUG] ntfs_attr_pwrite() may return short count when writing to compressed attribute

2015-10-31 Thread Eric Biggers
Hi,

This is indeed a real problem for me; I discovered it via some new automated
tests of wimlib.  For extraction to an NTFS volume, I had code that assumed any
return value from ntfs_attr_pwrite() other than the count that was passed in
indicated an error.  In practice I was usually passing a count of 32768 which
happened to avoid the problem, but when dealing with highly compressed WIM
archives the count could, in fact, be much larger.

The solution for me is, of course, to keep calling ntfs_attr_pwrite() until all
bytes have been written or a real error has occurred.  However, I'm suggesting
that something should also be done on the libntfs-3g side to make it harder for
people to run into this problem, whether that is updating the documentation or
updating ntfs_attr_pwrite() itself to always try to write the full count.  I am
also concerned about whether all internal callers of ntfs_attr_pwrite() in
libntfs-3g itself handle short writes correctly.  There are quite a few callers
and it looks like most don't use retry loops; however, many callers probably
either write small amounts of data only or rarely operate on compressed
attributes, thereby avoiding short writes in practice.

What if the existing ntfs_attr_pwrite() was simply moved to an internal
function, and ntfs_attr_pwrite() was written as a retry loop around the internal
function?
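
Roughly, the wrapper I have in mind would look like the sketch below.  It is written against a generic function pointer so that it stands alone; pwrite_fn, pwrite_full(), struct mock_attr and mock_pwrite() are all hypothetical names, and the mock only imitates the behavior of stopping at a compression block boundary.

```c
#include <stdint.h>

/* Positional write callback; the signature is an assumption of this sketch. */
typedef int64_t (*pwrite_fn)(void *ctx, int64_t pos, int64_t count,
			     const void *buf);

/*
 * Keep calling 'write_once' until 'count' bytes are written or a call
 * fails.  Returns the number of bytes written, or a negative value if the
 * very first call failed.
 */
int64_t pwrite_full(pwrite_fn write_once, void *ctx, int64_t pos,
		    int64_t count, const void *buf)
{
	int64_t total = 0;

	while (total < count) {
		int64_t n = write_once(ctx, pos + total, count - total,
				       (const char *)buf + total);
		if (n < 0)
			return total ? total : n;  /* report progress first */
		if (n == 0)
			break;			   /* no progress; do not spin */
		total += n;
	}
	return total;
}

/* Mock writer that stops at block boundaries, like a compressed attribute. */
struct mock_attr {
	int64_t block_size;	/* must be a power of two */
	unsigned calls;		/* number of underlying write calls made */
};

int64_t mock_pwrite(void *ctx, int64_t pos, int64_t count, const void *buf)
{
	struct mock_attr *a = ctx;
	/* Bytes remaining until the end of the block containing 'pos'. */
	int64_t fullcount = (pos | (a->block_size - 1)) + 1 - pos;

	(void)buf;
	a->calls++;
	return count > fullcount ? fullcount : count;
}
```

With a 4096-byte block size, a 10000-byte write at offset 100 is split into three underlying calls (3996, 4096 and 1908 bytes), and the caller sees a single full-length result.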

Eric

On Sat, Oct 31, 2015 at 07:06:01PM +0100, Jean-Pierre André wrote:
> Hi Eric,
> 
> Eric Biggers wrote:
> >Hi,
> >
> >The return value of ntfs_attr_pwrite() is documented as follows:
> >
> >>On success, return the number of successfully written bytes. If this number
> >>is lower than @count this means that an error was encountered during the
> >>write so that the write is partial. 0 means nothing was written (also return
> >>0 when @count is 0).
> >
> >Hence, a short count implies that an error occurred.  However, I discovered
> >that a short count may, in fact, be returned when successfully writing to a
> >compressed attribute, since ntfs_attr_pwrite() truncates the count to a
> >single compression block only:
> >
> >>   if (compressed) {
> >>           fullcount = (pos | (na->compression_block_size - 1)) + 1 - pos;
> >>           if (count > fullcount)
> >>                   count = fullcount;
> >>   }
> >
> >There are two possible ways to fix this:
> >
> > 1) Update ntfs_attr_pwrite() to always try to write the full count
> > 2) Update ntfs_attr_pwrite() documentation to clarify that short returns
> >are allowed and applications should, generally, continue calling
> >ntfs_attr_pwrite() until all bytes have been written
> >
> >It looks like the callers of ntfs_attr_pwrite() in the FUSE drivers do retry
> >short writes, but this doesn't appear to be the case for all internal
> >callers in libntfs-3g itself.  So I think that option (1) is preferred, if it
> >is at all possible.
> 
> Actually, the current state is intentional, and motivated
> by the complexity of allocating clusters for compressed
> attributes. The promise should indeed have been mitigated
> for compressed attributes (a subset of data attributes).
> 
> If this is a problem (did you find a specific case ?), a
> reasonable solution is to insert a new function for writing
> to data attributes, which will repeat the call until done.
> 
> Jean-Pierre
> 
> >
> >Eric
> 



Re: [ntfs-3g-devel] Experimental support for Windows 10 "System Compressed" files

2015-09-22 Thread Eric Biggers
Hi,

"WOF compression" is as good as the other names.  It still seems slightly wrong
because WOF (the "Windows Overlay Filesystem Filter") is a more general feature,
and this is actually the *second* compression technology that Microsoft has
built on top of it (the first was "WIMBoot").  For now, I'll keep the code the
way it is, using the "system compression" name.  It could be that Microsoft will
release more documentation for this.

Yes, your reparse data indicates XPRESS4K compression (the fourth 32-bit little
endian word is 0).  FYI, here are the compressed sizes I get with the Silesia
corpus (uncompressed size: 211,938,580 bytes total):

LZNT1 (NTFS compression): 121,049,088 bytes
XPRESS4K: 104,124,416 bytes
XPRESS8K: 95,465,472 bytes
XPRESS16K: 90,460,160 bytes
LZX: 69,144,576 bytes

Even though FUSE makes big reads, it would be nice to not have to allocate a
decompression context for every read.  That would avoid doing all of the
following on a per-read basis:
- open WofCompressedData attribute
- allocate heap memory for ntfs_system_decompression_ctx
- allocate heap memory for XPRESS or LZX
- read chunk offsets from the compressed file's chunk table

Having an external tool to create "system compressed" files, if people want that
support, is probably the way to go.  Probably that would be possible even with
no changes in libntfs-3g.

Eric


[ntfs-3g-devel] Multiple unnamed data streams?

2015-08-17 Thread Eric Biggers
Hi,

I had a report of a file's data disappearing when an NTFS volume was archived
using wimlib, which uses libntfs-3g to read from NTFS volumes.  What seems to
have happened is that libntfs-3g reported two unnamed data streams for a file:
one nonempty and one empty, and wimlib happened to store the empty one
(arbitrarily).

Is it an expected or valid case for a file to have more than one unnamed data
stream like this?  When and how might this happen?

If it's relevant: in wimlib I am basically doing something like this:

	ntfs_attr_search_ctx *actx = ntfs_attr_get_search_ctx(ni, NULL);

	while (!ntfs_attr_lookup(AT_DATA, NULL, 0, CASE_SENSITIVE, 0, NULL, 0,
				 actx))
	{
		ATTR_RECORD *record = actx->attr;
		u32 name_nchars = record->name_length;
		ntfschar *name = (ntfschar *)((u8 *)record +
					      le16_to_cpu(record->name_offset));

		if (name_nchars == 0) {
			/* unnamed stream... */
		} else {
			/* named stream... */
		}
	}


I know that for the problem to have been apparent, ntfs_attr_lookup() must have
provided the nonempty version of the stream first and the empty version second.
This may explain why this problem isn't encountered more frequently, since
perhaps the first matching stream is used ordinarily and the empty duplicate
streams always appear second.

Eric

--
___
ntfs-3g-devel mailing list
ntfs-3g-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ntfs-3g-devel


Re: [ntfs-3g-devel] Experimental support for Windows 10 System Compressed files

2015-07-22 Thread Eric Biggers
[I'm re-sending this since it didn't reach the mailing list due to the
SourceForge outage.]

There is not too much information specifically about this feature available yet.
You can try googling Windows 10 System compression to find some articles.
If you are looking for information about the data format, it is not yet
documented in the context of the system compression feature but it seems that
Microsoft lifted the format of the compressed data directly from the Windows
Imaging (WIM) file format.

One way to create such files for testing is to use the Windows 10 version of the
compact program.  It has a new option for compressing files using one of the
new formats:

/exe:xpress4k
/exe:xpress8k
/exe:xpress16k
/exe:lzx

The format is designed for write-once, read-many files, such as executable
files.  If you try to write to such a file on Windows, Windows immediately
decompresses it and turns it into a standard uncompressed file.  There is no
need for manual cluster allocation as the feature is not implemented directly in
NTFS.

However, for reading, the compressed files can be accessed randomly with chunk
granularity.  Each chunk can be decompressed independently.  If, say, you want to
read starting from byte offset 1000000 and the chunks are 8192 bytes, then you
know you need to read starting from chunk floor(1000000/8192) = 122.  Then you
can load the offset of chunk 122, and of any later chunks that may be needed,
from the chunk table at the beginning of the file.  Those will tell you where in
the file the chunks are and what their compressed sizes are.
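
As a rough illustration of that arithmetic (a sketch only: the function names
are made up for this example and are not part of libntfs-3g or of my fork),
byte offset 1000000 with 8192-byte chunks lands 576 bytes into chunk 122:

```c
#include <assert.h>
#include <stdint.h>

/* Index of the chunk that contains the given uncompressed byte offset. */
static inline uint64_t chunk_index(uint64_t offset, uint32_t chunk_size)
{
	assert(chunk_size != 0);
	return offset / chunk_size;
}

/* Position of that byte inside its decompressed chunk. */
static inline uint32_t offset_in_chunk(uint64_t offset, uint32_t chunk_size)
{
	assert(chunk_size != 0);
	return (uint32_t)(offset % chunk_size);
}

/* Number of chunks touched by a read of 'length' bytes at 'offset'. */
static inline uint64_t chunks_spanned(uint64_t offset, uint64_t length,
				      uint32_t chunk_size)
{
	if (length == 0)
		return 0;
	return chunk_index(offset + length - 1, chunk_size) -
	       chunk_index(offset, chunk_size) + 1;
}
```

A reader would decompress chunks chunk_index(offset) through
chunk_index(offset + length - 1) and discard the first offset_in_chunk(offset)
bytes of the first one.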

Eric

On Thu, Jul 16, 2015 at 09:59:46AM +0200, Jean-Pierre André wrote:
> Hi Eric,
>
> Interesting.
>
> Where can I find more information about this feature,
> and how can I create such files on Windows 10 ?
>
> Glancing at your code, I do not see anything related
> to (sparse) cluster allocation. Does that mean these
> files are not seekable and must be read/written
> sequentially ?
>
> Regards
>
> Jean-Pierre
>
> Eric Biggers wrote:
> > Hello,
> >
> > I've made an experimental fork of ntfs-3g that supports reading the System
> > Compressed files that are / will be supported by Windows 10.  This feature
> > allows rarely-modified files to be stored using XPRESS or LZX compression, with
> > stronger compression than the LZNT1 compression built into NTFS.  Windows 10
> > will supposedly enable it on selected files automatically.
> >
> > Microsoft designed this feature to use a reparse point which redirects access to
> > a named data stream, which avoided changing NTFS itself.  The format of the
> > compressed stream is identical to that of a compressed resource stored in a
> > Windows Imaging (WIM) archive.
> >
> > I suspect it will be a while before NTFS-3g support would be useful to more
> > people and it ultimately may not be worthwhile adding it at all (especially
> > since this is a reparse-point based feature and therefore is not part of NTFS
> > itself, and it takes quite a bit of code to support), but I thought I'd post
> > this in case anyone else is interested.
> >
> > The source code is available as the system_compression branch of
> > https://github.com/ebiggers/ntfs-3g.git.
> >
> > Eric
 



[ntfs-3g-devel] Experimental support for Windows 10 System Compressed files

2015-07-15 Thread Eric Biggers
Hello,

I've made an experimental fork of ntfs-3g that supports reading the System
Compressed files that are / will be supported by Windows 10.  This feature
allows rarely-modified files to be stored using XPRESS or LZX compression, with
stronger compression than the LZNT1 compression built into NTFS.  Windows 10
will supposedly enable it on selected files automatically.

Microsoft designed this feature to use a reparse point which redirects access to
a named data stream, which avoided changing NTFS itself.  The format of the
compressed stream is identical to that of a compressed resource stored in a
Windows Imaging (WIM) archive.

I suspect it will be a while before NTFS-3g support would be useful to more
people and it ultimately may not be worthwhile adding it at all (especially
since this is a reparse-point based feature and therefore is not part of NTFS
itself, and it takes quite a bit of code to support), but I thought I'd post
this in case anyone else is interested.

The source code is available as the system_compression branch of
https://github.com/ebiggers/ntfs-3g.git.

Eric



[ntfs-3g-devel] [PATCH] acls.c: fix validation of SID subauthority count

2015-07-12 Thread Eric Biggers
ntfs_valid_sid() required that the subauthority count be between 1 and 8
inclusive.  However, Windows permits more than 8 subauthorities as well
as 0 subauthorities:

  - The install.wim file for the latest Windows 10 build contains a file
    whose DACL contains a SID with 10 subauthorities.
    ntfs_set_ntfs_acl() was failing on this file.

  - The IsValidSid() function on Windows returns true for any subauthority
    count less than or equal to 15, including 0.

There was actually already another SID validation function that had the
Windows-compatible behavior, so I merged the two together.
---
 include/ntfs-3g/security.h | 16 ----------------
 libntfs-3g/acls.c          | 16 +++++++++-------
 libntfs-3g/security.c      |  4 ++--
 3 files changed, 11 insertions(+), 25 deletions(-)

diff --git a/include/ntfs-3g/security.h b/include/ntfs-3g/security.h
index 8875c9c..9167155 100644
--- a/include/ntfs-3g/security.h
+++ b/include/ntfs-3g/security.h
@@ -222,22 +222,6 @@ enum {
 extern BOOL ntfs_guid_is_zero(const GUID *guid);
 extern char *ntfs_guid_to_mbs(const GUID *guid, char *guid_str);
 
-/**
- * ntfs_sid_is_valid - determine if a SID is valid
- * @sid:   SID for which to determine if it is valid
- *
- * Determine if the SID pointed to by @sid is valid.
- *
- * Return TRUE if it is valid and FALSE otherwise.
- */
-static __inline__ BOOL ntfs_sid_is_valid(const SID *sid)
-{
-   if (!sid || sid->revision != SID_REVISION ||
-   sid->sub_authority_count > SID_MAX_SUB_AUTHORITIES)
-   return FALSE;
-   return TRUE;
-}
-
 extern int ntfs_sid_to_mbs_size(const SID *sid);
 extern char *ntfs_sid_to_mbs(const SID *sid, char *sid_str,
size_t sid_str_size);
diff --git a/libntfs-3g/acls.c b/libntfs-3g/acls.c
index 925bb96..500d60f 100644
--- a/libntfs-3g/acls.c
+++ b/libntfs-3g/acls.c
@@ -362,16 +362,18 @@ unsigned int ntfs_attr_size(const char *attr)
return (attrsz);
 }
 
-/*
- * Do sanity checks on a SID read from storage
- * (just check revision and number of authorities)
+/**
+ * ntfs_valid_sid - determine if a SID is valid
+ * @sid:   SID for which to determine if it is valid
+ *
+ * Determine if the SID pointed to by @sid is valid.
+ *
+ * Return TRUE if it is valid and FALSE otherwise.
  */
-
 BOOL ntfs_valid_sid(const SID *sid)
 {
-   return ((sid->revision == SID_REVISION)
-       && (sid->sub_authority_count >= 1)
-       && (sid->sub_authority_count <= 8));
+   return sid && sid->revision == SID_REVISION &&
+   sid->sub_authority_count <= SID_MAX_SUB_AUTHORITIES;
 }
 
 /*
diff --git a/libntfs-3g/security.c b/libntfs-3g/security.c
index 3ac4790..e00bcf9 100644
--- a/libntfs-3g/security.c
+++ b/libntfs-3g/security.c
@@ -224,7 +224,7 @@ int ntfs_sid_to_mbs_size(const SID *sid)
 {
int size, i;
 
-   if (!ntfs_sid_is_valid(sid)) {
+   if (!ntfs_valid_sid(sid)) {
errno = EINVAL;
return -1;
}
@@ -298,7 +298,7 @@ char *ntfs_sid_to_mbs(const SID *sid, char *sid_str, size_t sid_str_size)
 * No need to check @sid if !@sid_str since ntfs_sid_to_mbs_size() will
 * check @sid, too.  8 is the minimum SID string size.
 */
-   if (sid_str && (sid_str_size < 8 || !ntfs_sid_is_valid(sid))) {
+   if (sid_str && (sid_str_size < 8 || !ntfs_valid_sid(sid))) {
errno = EINVAL;
return NULL;
}
-- 
2.4.5
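
For readers without the NTFS-3G tree at hand, the merged check can be sketched
standalone as below.  This is an illustration under assumptions, not the
library code: struct sketch_sid is a cut-down stand-in for the real SID layout,
and the SKETCH_* constants stand in for SID_REVISION and
SID_MAX_SUB_AUTHORITIES, with the values 1 and 15 that Windows uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for SID_REVISION and SID_MAX_SUB_AUTHORITIES (assumed values). */
#define SKETCH_SID_REVISION            1
#define SKETCH_SID_MAX_SUB_AUTHORITIES 15

/* Hypothetical, cut-down SID: only the fields the check looks at. */
struct sketch_sid {
	uint8_t revision;
	uint8_t sub_authority_count;
};

/* Accept a NULL-safe SID with the right revision and 0..15 subauthorities. */
static inline bool sketch_valid_sid(const struct sketch_sid *sid)
{
	return sid != NULL &&
	       sid->revision == SKETCH_SID_REVISION &&
	       sid->sub_authority_count <= SKETCH_SID_MAX_SUB_AUTHORITIES;
}
```

With this rule a SID with 10 subauthorities, like the one from install.wim, is
accepted, as is one with 0; only counts above 15 are rejected.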




Re: [ntfs-3g-devel] [PATCH] compress.c: Speed up NTFS compression algorithm

2014-08-10 Thread Eric Biggers
On Sun, Aug 10, 2014 at 10:45:16AM +0200, Jean-Pierre André wrote:
 
> You have defined the hash table on static data, and I do
> not want to enter into the meanings of static data in
> shared objects in various operating systems (allowed or
> not, shared by threads or not...). I prefer to have it
> dynamically allocated (hence never shared by mounts),
> and pointed to in the volume structure. Unfortunately
> this means adding an extra argument to
> ntfs_compress_block() and freeing the table when unmounting.
> (I will later post the code for that).
 

Yes, I wondered if that would cause issues.  Since the algorithm does not depend
on the specific hash function used, an alternative to jumping through hoops to
use the crc32_table is to swap ntfs_hash() with another 3-byte hash function,
one that does not rely on static data.  I will try some other ones (zlib-like
which I've already tested a little bit, and maybe multiplicative hashing) and
see how they affect the results.

> Also a minor issue : please use
> http://sourceforge.net/p/ntfs-3g/ntfs-3g/ci/edge/tree/libntfs-3g/compress.c
> as a reference for your patch for easier integration.

Will do next time.  I somehow missed the fact that this repository even exists!
Looks like the only conflict is in the change to the copyright block...

Eric



Re: [ntfs-3g-devel] [PATCH] compress.c: Speed up NTFS compression algorithm

2014-08-10 Thread Eric Biggers
On Sun, Aug 10, 2014 at 11:18:49AM +0200, Jean-Pierre André wrote:
> Hi,
>
> Did you compare with the Microsoft implementation ?
>
> I have only checked the biggest file in IE7 update for WinXP
> (WINDOWS/ie7updates/KB963027-IE7/ieframe.dll) with
> cluster size 4096 :
>
> Original size             6066688
> Microsoft implementation  3883008 (64.0%)
> current implementation    3682304 (60.7%)
> proposed implementation   3710976 (61.2%)

I have not done any comparisons with the Microsoft implementation yet.  Is there
a more precise way to test it than actually copying a file to an NTFS volume from
Windows?

I'm not surprised that it apparently produces a worse compression ratio than
NTFS-3g.  Although it's impossible to know for sure what their algorithm does,
my expectation is that they use hash chains --- similar to my proposal, perhaps
with a slightly less exhaustive search --- but use greedy parsing rather than
lazy parsing.

If there's a desire for even greater performance improvement, then greedy
parsing is the way to go.  But it will degrade the compression ratio, maybe
placing it closer to the Microsoft implementation.
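
To make the greedy versus lazy distinction concrete, here is a toy sketch.  It
is not the NTFS-3G code: it finds matches by brute force, the function names
are made up, and it approximates the output cost as 1 byte per literal and
2 bytes per match, ignoring LZNT1's flag bytes and chunk headers.

```c
#include <stddef.h>

#define MIN_MATCH 3

/* Brute-force length of the longest match for position i among all earlier
 * positions (overlapping matches allowed). */
static inline size_t longest_match(const unsigned char *buf, size_t size,
				   size_t i)
{
	size_t best = 0;

	for (size_t j = 0; j < i; j++) {
		size_t len = 0;

		while (i + len < size && buf[j + len] == buf[i + len])
			len++;
		if (len > best)
			best = len;
	}
	return best;
}

/* Approximate compressed size with greedy (lazy == 0) or one-step lazy
 * (lazy != 0) parsing. */
static inline size_t parse_cost(const unsigned char *buf, size_t size, int lazy)
{
	size_t cost = 0;
	size_t i = 0;

	while (i < size) {
		size_t len = longest_match(buf, size, i);

		if (len < MIN_MATCH) {
			cost += 1;	/* no usable match: literal */
			i += 1;
		} else if (lazy && i + 1 < size &&
			   longest_match(buf, size, i + 1) > len) {
			cost += 1;	/* better match starts one byte later */
			i += 1;
		} else {
			cost += 2;	/* take the match found here */
			i += len;
		}
	}
	return cost;
}
```

On the 16-byte input "abc_bcdef_abcdef", greedy parsing takes the 3-byte match
"abc" and then a second 3-byte match "def", while lazy parsing emits 'a' as a
literal and then a single 5-byte match "bcdef", which is one byte cheaper under
this cost model.  The price is the extra match search at i + 1.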

Eric



Re: [ntfs-3g-devel] [PATCH] compress.c: Speed up NTFS compression algorithm

2014-08-10 Thread Eric Biggers
On Sun, Aug 10, 2014 at 05:29:53PM +0200, Jean-Pierre André wrote:
 
> For a better way you would have to identify which is the dll
> which compresses , and submit compression tasks with some
> control over the durations.
 

RtlCompressBuffer() in ntdll.dll can do LZNT1 compression, which I think is the
same as NTFS compression.  But I wouldn't be surprised if there is actually
another implementation in Microsoft's NTFS driver, which would be inaccessible.
But either way, simply copying files to an NTFS volume is probably good enough
for approximate benchmarking; that's what I was doing with NTFS-3g, after all.

> I had analyzed the difference of results, and I was surprised
> to find that the full length of the matching string was not
> always used (such as found a matching string at some position
> with a matching length of 20, but only used a length of 12
> and the next match not being better than the expected 8 bytes),
> and there does not appear to be a fixed maximum length
> (when all bytes are the same, the matching length is 4095
> as would be expected).
>
> They probably bargained the duration against the compression
> rate.

This is strange.  If the algorithm does the work to find a match at some
position, it should at least extend it to its full length.  Although a
non-greedy parser will not necessarily choose that full length, it's unexpected
that the algorithm would actually choose a sequence of matches that is *worse*
than that which a greedy parser would choose.  Is it possible that the length 12
match was less distant than the length 20 match?  If it was, then this would be
an expected result of an incomplete search using hash chains.

Regardless, it's probably possible to implement something that beats the
Microsoft implementation (and I have done this for the XPRESS-Huffman and LZX
algorithms), so I personally wouldn't read too much into how they might be doing
things.

Eric



Re: [ntfs-3g-devel] [PATCH] compress.c: Speed up NTFS compression algorithm

2014-08-10 Thread Eric Biggers
Hi,

I did some more testing with zlib-like and multiplicative hashing.  The
differences for the files I was testing were quite small.  However, the proposed
hash function, as well as a slightly modified version of it, did come out
slightly ahead.  So it might as well be used.

I also did a few quick tests with greedy parsing.  In general, it seemed to
improve performance by 10-20% and increase the size of the compressed files by
1-2%.  If the performance improvement is considered more desirable, then I can
change the patch to use greedy parsing.  For now it's using lazy parsing, like
it was before but more optimized.

Here's the updated patch.



diff --git a/libntfs-3g/compress.c b/libntfs-3g/compress.c
index 73ad283..1fefc3e 100644
--- a/libntfs-3g/compress.c
+++ b/libntfs-3g/compress.c
@@ -6,6 +6,7 @@
  * Copyright (c) 2004-2006 Szabolcs Szakacsits
  * Copyright (c)  2005 Yura Pakhuchiy
  * Copyright (c) 2009-2014 Jean-Pierre Andre
+ * Copyright (c)  2014 Eric Biggers
  *
  * This program/include file is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as published
@@ -21,17 +22,6 @@
  * along with this program (in the main directory of the NTFS-3G
  * distribution in the file COPYING); if not, write to the Free Software
  * Foundation,Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
- *
- * A part of the compression algorithm is based on lzhuf.c whose header
- * describes the roles of the original authors (with no apparent copyright
- * notice, and according to http://home.earthlink.net/~neilbawd/pall.html
- * this was put into public domain in 1988 by Haruhiko OKUMURA).
- *
- * LZHUF.C English version 1.0
- * Based on Japanese version 29-NOV-1988   
- * LZSS coded by Haruhiko OKUMURA
- * Adaptive Huffman Coding coded by Haruyasu YOSHIZAKI
- * Edited and translated to English by Kenji RIKITAKE
  */
 
 #ifdef HAVE_CONFIG_H
@@ -81,96 +71,256 @@ typedef enum {
NTFS_SB_IS_COMPRESSED   =   0x8000,
 } ntfs_compression_constants;
 
+/* Match length at or above which ntfs_best_match() will stop searching for
+ * longer matches.  */
+#define NICE_MATCH_LEN 18
+
+/* Maximum number of potential matches that ntfs_best_match() will consider at
+ * each position.  */
+#define MAX_SEARCH_DEPTH 24
+
+/* Number of entries in the hash table for match-finding.
+ *
+ * This can be changed, but ntfs_hash() would need to be updated to use an
+ * appropriate shift.  Also, if this is made more than 1 << 16 (not recommended
+ * for the 4096-byte buffers used in NTFS compression!), then 'crc_table' would
+ * need to be updated to use 32-bit entries.  */
+#define HASH_LEN (1 << 14)
+
 struct COMPRESS_CONTEXT {
const unsigned char *inbuf;
int bufsize;
int size;
int rel;
int mxsz;
-   s16 head[256];
-   s16 lson[NTFS_SB_SIZE];
-   s16 rson[NTFS_SB_SIZE];
+   s16 head[HASH_LEN];
+   s16 prev[NTFS_SB_SIZE];
 } ;
 
 /*
- * Search for the longest sequence matching current position
+ * CRC table for hashing bytes for Lempel-Ziv match-finding.
  *
- * A binary tree is maintained to locate all previously met sequences,
- * and this function has to be called for all of them.
+ * We use a CRC32 for this purpose.  But since log2(HASH_LEN) <= 16, we only
+ * need 16 bit entries, each of which contains the low 16 bits of the entry in a
+ * real CRC32 table.  (CRC16 would also work, but it caused more collisions when
+ * I tried it.)
  *
- * This function is heavily used, it has to be optimized carefully
+ * Hard-coding the table avoids dealing with thread-safe initialization.
+ */
+static const u16 crc_table[256] = {
+   0x0000, 0x3096, 0x612C, 0x51BA, 0xC419, 0xF48F, 0xA535, 0x95A3,
+   0x8832, 0xB8A4, 0xE91E, 0xD988, 0x4C2B, 0x7CBD, 0x2D07, 0x1D91,
+   0x1064, 0x20F2, 0x7148, 0x41DE, 0xD47D, 0xE4EB, 0xB551, 0x85C7,
+   0x9856, 0xA8C0, 0xF97A, 0xC9EC, 0x5C4F, 0x6CD9, 0x3D63, 0x0DF5,
+   0x20C8, 0x105E, 0x41E4, 0x7172, 0xE4D1, 0xD447, 0x85FD, 0xB56B,
+   0xA8FA, 0x986C, 0xC9D6, 0xF940, 0x6CE3, 0x5C75, 0x0DCF, 0x3D59,
+   0x30AC, 0x003A, 0x5180, 0x6116, 0xF4B5, 0xC423, 0x9599, 0xA50F,
+   0xB89E, 0x8808, 0xD9B2, 0xE924, 0x7C87, 0x4C11, 0x1DAB, 0x2D3D,
+   0x4190, 0x7106, 0x20BC, 0x102A, 0x8589, 0xB51F, 0xE4A5, 0xD433,
+   0xC9A2, 0xF934, 0xA88E, 0x9818, 0x0DBB, 0x3D2D, 0x6C97, 0x5C01,
+   0x51F4, 0x6162, 0x30D8, 0x004E, 0x95ED, 0xA57B, 0xF4C1, 0xC457,
+   0xD9C6, 0xE950, 0xB8EA, 0x887C, 0x1DDF, 0x2D49, 0x7CF3, 0x4C65,
+   0x6158, 0x51CE, 0x0074, 0x30E2, 0xA541, 0x95D7, 0xC46D, 0xF4FB,
+   0xE96A, 0xD9FC, 0x8846, 0xB8D0, 0x2D73, 0x1DE5, 0x4C5F, 0x7CC9,
+   0x713C, 0x41AA, 0x1010, 0x2086, 0xB525, 0x85B3, 0xD409, 0xE49F,
+   0xF90E, 0xC998, 0x9822, 0xA8B4, 0x3D17, 0x0D81, 0x5C3B, 0x6CAD,
+   0x8320, 0xB3B6, 0xE20C, 0xD29A, 0x4739, 0x77AF, 0x2615, 0x1683,
+   0x0B12, 0x3B84, 0x6A3E, 0x5AA8, 0xCF0B

Re: [ntfs-3g-devel] [PATCH] compress.c: Speed up NTFS compression algorithm

2014-08-10 Thread Eric Biggers
Good news: I tried some different constants for multiplicative hashing, and the
results for two constants were as good as the CRC-based hash function, and
faster to compute (at least on x86).  So here's the revised patch that does away
with static data completely.
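
For reference, the idea can be sketched standalone and portably as follows.
HASH_SHIFT and HASH_MULTIPLIER are the values from the patch below; the name
hash3 and the use of <stdint.h> types in place of the NTFS-3G u8/u32 typedefs
are just for this illustration.

```c
#include <stdint.h>

#define HASH_SHIFT      14		/* log2 of the hash table size */
#define HASH_MULTIPLIER 0x1E35A7BDu

/* Hash the 3 bytes at p into a HASH_SHIFT-bit bucket index. */
static inline uint32_t hash3(const uint8_t *p)
{
	/* Assemble the bytes explicitly, so no unaligned load is needed
	 * and the result does not depend on the CPU's endianness. */
	uint32_t str = (uint32_t)p[0] |
		       ((uint32_t)p[1] << 8) |
		       ((uint32_t)p[2] << 16);

	/* The multiplication wraps modulo 2^32; the high bits of the
	 * product are the best mixed, so keep the top HASH_SHIFT bits. */
	return (uint32_t)(str * HASH_MULTIPLIER) >> (32 - HASH_SHIFT);
}
```

No table is involved, so nothing needs to be initialized or shared between
threads or mounts.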

---

diff --git a/libntfs-3g/compress.c b/libntfs-3g/compress.c
index 73ad283..b356e35 100644
--- a/libntfs-3g/compress.c
+++ b/libntfs-3g/compress.c
@@ -6,6 +6,7 @@
  * Copyright (c) 2004-2006 Szabolcs Szakacsits
  * Copyright (c)  2005 Yura Pakhuchiy
  * Copyright (c) 2009-2014 Jean-Pierre Andre
+ * Copyright (c)  2014 Eric Biggers
  *
  * This program/include file is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as published
@@ -21,17 +22,6 @@
  * along with this program (in the main directory of the NTFS-3G
  * distribution in the file COPYING); if not, write to the Free Software
  * Foundation,Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
- *
- * A part of the compression algorithm is based on lzhuf.c whose header
- * describes the roles of the original authors (with no apparent copyright
- * notice, and according to http://home.earthlink.net/~neilbawd/pall.html
- * this was put into public domain in 1988 by Haruhiko OKUMURA).
- *
- * LZHUF.C English version 1.0
- * Based on Japanese version 29-NOV-1988   
- * LZSS coded by Haruhiko OKUMURA
- * Adaptive Huffman Coding coded by Haruyasu YOSHIZAKI
- * Edited and translated to English by Kenji RIKITAKE
  */
 
 #ifdef HAVE_CONFIG_H
@@ -81,96 +71,183 @@ typedef enum {
NTFS_SB_IS_COMPRESSED   =   0x8000,
 } ntfs_compression_constants;
 
+/* Match length at or above which ntfs_best_match() will stop searching for
+ * longer matches.  */
+#define NICE_MATCH_LEN 18
+
+/* Maximum number of potential matches that ntfs_best_match() will consider at
+ * each position.  */
+#define MAX_SEARCH_DEPTH 24
+
+/* log base 2 of the number of entries in the hash table for match-finding.  */
+#define HASH_SHIFT 14
+
+/* Constant for the multiplicative hash function.  */
+#define HASH_MULTIPLIER 0x1E35A7BD
+
 struct COMPRESS_CONTEXT {
const unsigned char *inbuf;
int bufsize;
int size;
int rel;
int mxsz;
-   s16 head[256];
-   s16 lson[NTFS_SB_SIZE];
-   s16 rson[NTFS_SB_SIZE];
+   s16 head[1 << HASH_SHIFT];
+   s16 prev[NTFS_SB_SIZE];
 } ;
 
 /*
+ * Hash the next 3-byte sequence in the input buffer
+ */
+static inline unsigned int ntfs_hash(const u8 *p)
+{
+   u32 str;
+
+#if defined(__i386__) || defined(__x86_64__)
+   /* Unaligned access okay  */
+   str = *(u32 *)p & 0xFFFFFF;
+#else
+   str = ((u32)p[0] << 0) | ((u32)p[1] << 8) | ((u32)p[2] << 16);
+#endif
+
+   return (str * HASH_MULTIPLIER) >> (32 - HASH_SHIFT);
+}
+
+/*
  * Search for the longest sequence matching current position
  *
- * A binary tree is maintained to locate all previously met sequences,
- * and this function has to be called for all of them.
+ * A hash table, each entry of which points to a chain of sequence
+ * positions sharing the corresponding hash code, is maintained to speed up
+ * searching for matches.  To maintain the hash table, either
+ * ntfs_best_match() or ntfs_skip_position() has to be called for each
+ * consecutive position.
+ *
+ * This function is heavily used; it has to be optimized carefully.
+ *
+ * This function sets pctx->size and pctx->rel to the length and offset,
+ * respectively, of the longest match found.
+ *
+ * The minimum match length is assumed to be 3, and the maximum match
+ * length is assumed to be pctx->mxsz.  If this function produces
+ * pctx->size < 3, then no match was found.
+ *
+ * Note: for the following reasons, this function is not guaranteed to find
+ * *the* longest match up to pctx->mxsz:
  *
- * This function is heavily used, it has to be optimized carefully
+ * (1) If this function finds a match of NICE_MATCH_LEN bytes or greater,
+ * it ends early because a match this long is good enough and it's not
+ * worth spending more time searching.
  *
- * Returns the size of the longest match,
- * zero if no match is found.
+ * (2) If this function considers MAX_SEARCH_DEPTH matches with a single
+ * position, it ends early and returns the longest match found so far.
+ * This saves a lot of time on degenerate inputs.
  */
-
-static int ntfs_best_match(struct COMPRESS_CONTEXT *pctx, int i)
+static void ntfs_best_match(struct COMPRESS_CONTEXT *pctx, const int i,
+   int best_len)
 {
-   s16 *prev;
-   int node;
-   register long j;
-   long maxpos;
-   long startj;
-   long bestj;
-   int bufsize;
-   int bestnode;
-   register const unsigned char *p1,*p2;
-
-   p1 = pctx->inbuf;
-   node = pctx->head[p1[i] & 255

[ntfs-3g-devel] [PATCH] compress.c: Speed up NTFS compression algorithm

2014-08-09 Thread Eric Biggers
The current compression algorithm does lazy parsing of matches, backed
by a binary tree match-finder with one byte hashing.  Performance-wise,
this approach is not very good for several reasons:

(1) One byte hashing results in a lot of hash collisions, which slows
down searches.

(2) With lazy parsing, many positions never actually need to be
matched.  But when using binary trees, all the work needs to be done
anyway, because the sequence at each position needs to be inserted
into the appropriate binary tree.  This makes binary trees better
suited for optimal parsing --- but that isn't being done and is
probably too slow to be practical for NTFS.

(3) There was also no hard cut-off on the amount of work done per
position.  This did not matter too much because the buffer size is
never greater than 4096 bytes, but in degenerate cases the binary
trees could degenerate into linked lists and there could be hundreds of
matches considered at each position.

This patch changes the algorithm to use hash chains instead of binary
trees, with much stronger hashing.  It also introduces useful (for
performance) parameters, such as the nice match length and maximum
search depth, that are similar to those used in other commonly used
compression algorithms such as zlib's DEFLATE implementation.
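
A minimal standalone sketch of a hash-chain match-finder with these two
parameters is shown here for illustration.  It is not the code in this patch:
the mf_* names and the multiplicative stand-in hash are assumptions (the patch
itself hashes with CRC32), while NICE_MATCH_LEN, MAX_SEARCH_DEPTH and the
2^14-entry table use the values from the patch.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE       4096	/* maximum input buffer size */
#define HASH_BITS        14
#define NICE_MATCH_LEN   16
#define MAX_SEARCH_DEPTH 24
#define MIN_MATCH_LEN    3

struct match_finder {
	int16_t head[1 << HASH_BITS];	/* newest position per hash, or -1 */
	int16_t prev[BLOCK_SIZE];	/* older position, same hash, or -1 */
};

/* Stand-in 3-byte hash (an assumption, not the hash used in the patch). */
static inline unsigned int mf_hash(const uint8_t *p)
{
	uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
		     ((uint32_t)p[2] << 16);

	return (v * 2654435761u) >> (32 - HASH_BITS);
}

static inline void mf_init(struct match_finder *mf)
{
	memset(mf->head, 0xFF, sizeof(mf->head));	/* every entry = -1 */
}

/* Link position i into the chain for its hash.  Needs 3 bytes at buf[i]. */
static inline void mf_insert(struct match_finder *mf, const uint8_t *buf, int i)
{
	unsigned int h = mf_hash(&buf[i]);

	mf->prev[i] = mf->head[h];
	mf->head[h] = (int16_t)i;
}

/* Search the chain for position i, then insert i.  Returns the match length
 * (0 if shorter than MIN_MATCH_LEN) and stores the distance in *offset. */
static inline int mf_find(struct match_finder *mf, const uint8_t *buf,
			  int size, int i, int *offset)
{
	int max_len = size - i;
	int best_len = 0;
	int depth = MAX_SEARCH_DEPTH;
	int cur;

	if (max_len < MIN_MATCH_LEN)
		return 0;
	cur = mf->head[mf_hash(&buf[i])];
	mf_insert(mf, buf, i);
	while (cur >= 0 && depth-- > 0) {	/* hard cap on work done */
		int len = 0;

		while (len < max_len && buf[cur + len] == buf[i + len])
			len++;
		if (len > best_len) {
			best_len = len;
			*offset = i - cur;
			if (len >= NICE_MATCH_LEN)
				break;		/* good enough, stop early */
		}
		cur = mf->prev[cur];
	}
	return best_len >= MIN_MATCH_LEN ? best_len : 0;
}
```

Skipping a position costs only one mf_insert() call, which is what makes hash
chains a better fit than binary trees when lazy parsing leaves many positions
unmatched.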

The performance improvement is very significant, but data-dependent.
Compressing text files is faster by about 3x; x86 executable files by
about 3x; random data by about 1.7x; all zeroes by about 1.2x; some
degenerate cases by over 10x.  (I did my tests on an x86_64 CPU.)

The compression ratio is the same or slightly worse.  It is less than 1%
worse on all files I tested except an ASCII representation of a genome.

No changes were made to the decompressor.
---
 libntfs-3g/compress.c | 484 +-
 1 file changed, 324 insertions(+), 160 deletions(-)

diff --git a/libntfs-3g/compress.c b/libntfs-3g/compress.c
index 69b39ed..e62c7dd 100644
--- a/libntfs-3g/compress.c
+++ b/libntfs-3g/compress.c
@@ -6,6 +6,7 @@
  * Copyright (c) 2004-2006 Szabolcs Szakacsits
  * Copyright (c)  2005 Yura Pakhuchiy
  * Copyright (c) 2009-2013 Jean-Pierre Andre
+ * Copyright (c)  2014 Eric Biggers
  *
  * This program/include file is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as published
@@ -21,17 +22,6 @@
  * along with this program (in the main directory of the NTFS-3G
  * distribution in the file COPYING); if not, write to the Free Software
  * Foundation,Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
- *
- * A part of the compression algorithm is based on lzhuf.c whose header
- * describes the roles of the original authors (with no apparent copyright
- * notice, and according to http://home.earthlink.net/~neilbawd/pall.html
- * this was put into public domain in 1988 by Haruhiko OKUMURA).
- *
- * LZHUF.C English version 1.0
- * Based on Japanese version 29-NOV-1988   
- * LZSS coded by Haruhiko OKUMURA
- * Adaptive Huffman Coding coded by Haruyasu YOSHIZAKI
- * Edited and translated to English by Kenji RIKITAKE
  */
 
 #ifdef HAVE_CONFIG_H
@@ -81,96 +71,210 @@ typedef enum {
NTFS_SB_IS_COMPRESSED   =   0x8000,
 } ntfs_compression_constants;
 
+/* Match length at or above which ntfs_best_match() will stop searching for
+ * longer matches.  */
+#define NICE_MATCH_LEN 16
+
+/* Maximum length at which a lazy match will be attempted.  */
+#define MAX_LAZY_MATCH_LEN 20
+
+/* Maximum number of potential matches that ntfs_best_match() will consider at
+ * each position.  */
+#define MAX_SEARCH_DEPTH 24
+
+/* Number of entries in the hash table for match-finding.  This can be changed,
+ * but it should be a power of 2 so that computing the hash bucket is fast.  */
+#define HASH_LEN (1 << 14)
+
 struct COMPRESS_CONTEXT {
const unsigned char *inbuf;
int bufsize;
int size;
int rel;
int mxsz;
-   s16 head[256];
-   s16 lson[NTFS_SB_SIZE];
-   s16 rson[NTFS_SB_SIZE];
+   s16 head[HASH_LEN];
+   s16 prev[NTFS_SB_SIZE];
 } ;
 
+#define CRC32_POLYNOMIAL 0xEDB88320
+
+static u32 crc32_table[256];
+
+static void do_crc32_init(void)
+{
+   int i, j;
+   u32 r;
+
+   for (i = 0; i < 256; i++) {
+   r = i;
+   for (j = 0; j < 8; j++)
+   r = (r >> 1) ^ (CRC32_POLYNOMIAL & ~((r & 1) - 1));
+   crc32_table[i] = r;
+   }
+}
+
+/*
+ * Initialize the CRC32 table for ntfs_hash() if not done already
+ */
+static void crc32_init(void)
+{
+   static int done = 0;
+
+   if (!done) {
+   do_crc32_init();
+   done = 1;
+   }
+}
+
+/*
+ * Hash the next 3-byte sequence in the input buffer
+ *
+ * Currently, we use a hash function similar to that used in LZMA.  It
+ * takes slightly longer to compute than zlib's hash

Re: [ntfs-3g-devel] [PATCH] compress.c: Speed up NTFS compression algorithm

2014-08-09 Thread Eric Biggers
Also, here are results from tests I did copying a file to a compressed directory
on an NTFS-3g mount, with time elapsed and compressed sizes shown:

silesia_corpus.tar (211,957,760 bytes)
Current     43.318s     111,230,976 bytes
Proposed    12.903s     111,751,168 bytes

canterbury_corpus.tar (2,826,240 bytes)
Current     1.778s      1,232,896 bytes
Proposed    0.142s      1,241,088 bytes

Firefox-11-windows-bin.tar (38,225,920 bytes)
Current     5.685s      27,418,624 bytes
Proposed    1.992s      27,492,352 bytes

boot.wim (361,315,088 bytes, no internal compression)
Current     64.682s     189,124,608 bytes
Proposed    20.990s     190,386,176 bytes

mp3-files.tar (201,646,080 bytes)
Current     14.547s     200,916,992 bytes
Proposed    8.585s      200,937,472 bytes

linux-2.4.31-src.tar (174,417,920 bytes)
Current     36.115s     75,751,424 bytes
Proposed    10.262s     76,251,136 bytes

gcc-4.7.3.tar.bz2 (82,904,224 bytes)
Current     5.637s      82,907,136 bytes
Proposed    3.276s      82,907,136 bytes

ntoskrnl.exe (3,911,040 bytes)
Current     0.492s      2,789,376 bytes
Proposed    0.186s      2,789,376 bytes

E_coli_genome.fasta (4,706,040 bytes)
Current     1.101s      2,060,288 bytes
Proposed    0.458s      2,351,104 bytes

shakespeare.txt (5,328,042 bytes)
Current     0.909s      3,321,856 bytes
Proposed    0.303s      3,321,856 bytes

zeroes.bin (134,217,728 bytes)
Current     3.053s      0 bytes
Proposed    2.601s      0 bytes

ascending_digrams.bin (16,777,216 bytes)
Current     4.309s      16,777,216 bytes
Proposed    0.673s      16,777,216 bytes

degenerate_trigrams.bin (16,777,216 bytes)
Current     9.292s      5,242,880 bytes
Proposed    0.684s      5,242,880 bytes
