Re: [PATCH 1/2] ext4: Handle casefolding with encryption

2021-02-25 Thread Andreas Dilger
On Feb 18, 2021, at 4:21 PM, Daniel Rosenberg  wrote:
> 
> On Wed, Feb 17, 2021 at 2:48 PM Andreas Dilger  wrote:
>> 
>> On Feb 17, 2021, at 9:08 AM, Theodore Ts'o  wrote:
>>> 
>>> The problem is in how the space after the filename in a directory is
>>> encoded.  The dirdata format is (mildly) expandable, supporting up to
>>> 4 different metadata chunks after the filename, using a very
>>> compactly encoded TLV (or moral equivalent) scheme.  For directory
>>> inodes that have both the encryption and casefold flags set, we have
>>> a single blob which gets used as the IV for the crypto.
>>> 
>>> So it's the difference between a simple blob that is only used for one
>>> thing in this particular case, and something which is the moral
>>> equivalent of simple ASN.1 or protobuf encoding.
>>> 
>>> Currently, dirdata has defined uses for 2 of the 4 "chunks", which is
>>> used in Lustre servers.  The proposal which Andreas has suggested is that,
>>> if the dirdata feature is supported, then the 3rd dirdata chunk would
>>> be used for the data currently stored by the
>>> encrypted-casefolded extension, and the 4th would get reserved for a
>>> to-be-defined extension mechanism.
>>> 
>>> If ext4 encrypted/casefold is not yet in use, and we can get the
>>> changes out to all potential users before they release products out
>>> into the field, then one approach would be to only support
>>> encrypted/casefold when dirdata is also enabled.
>>> 
>>> If ext4 encrypted/casefold is in use, my suggestion is that we support
>>> both encrypted/casefold && !dirdata as you have currently implemented
>>> it, and encrypted/casefold && dirdata as Andreas has proposed.
>>> 
>>> IIRC, supporting Andreas's scheme essentially means that we use
>>> the top four bits in the file_type field to indicate which chunks are
>>> present, and then for each chunk which is present, there is a 1 byte
>>> length followed by payload.  So that means in the case where it's
>>> encrypted/casefold && dirdata, the required storage of the directory
>>> entry would take one additional byte, plus setting a bit indicating
>>> that the encrypted/casefold dirdata chunk was present.
>> 
>> I think your email already covers pretty much all of the points.
>> 
>> One small difference between current "raw" encrypted/casefold hash vs.
>> dirdata is that the former is 4-byte aligned within the dirent, while
>> dirdata is packed.  So in 3/4 cases dirdata would take the same amount
>> of space (the 1-byte length would use one of the 1-3 bytes of padding
>> vs. the raw format), since the next dirent needs to be aligned anyway.
>> 
>> The other implication here is that the 8-byte hash may need to be
>> copied out of the dirent into a local variable before use, due to
>> alignment issues, but I'm not sure if that is actually needed or not.
>> 
>>> So, no, they aren't incompatible ultimately, but it might require a
>>> tiny bit more work to integrate the combined support for dirdata plus
>>> encrypted/casefold.  One way we can do this, if we have to support the
>>> current encrypted/casefold format because it's out there in deployed
>>> implementations already, is to integrate encrypted/casefold &&
>>> !dirdata first upstream, and then when we integrate dirdata into
>>> upstream, we'll have to add support for the encrypted/casefold &&
>>> dirdata case.  This means that we'll have two variants of the on-disk
>>> format to test and support, but I don't think it's going to be
>>> that difficult.
>> 
>> It would be possible to detect if the encrypted/casefold+dirdata
>> variant is in use, because the dirdata variant would have the 0x40
>> bit set in the file_type byte.  It isn't possible to positively
>> identify the "raw" non-dirdata variant, but the assumption would be
>> if (rec_len >= round_up(name_len, 4) + 8) in an encrypted+casefold
>> directory that the "raw" hash must be present in the dirent.
> 
> So sounds like we're going with the combined version. Andreas, do you
> have any suggestions for changes to the casefolding patch to ease the
> eventual merging with dirdata? A bunch of the changes are already
> pretty similar, so some of it is just calling essentially the same
> functions different things.

One thing I would suggest is to change the "is_fake_entry()" from using
offsets in the leaf block to using the content of the dirent to make
that decision.  Comparing entries against "." and ".." is trivial (and
already done in many places), and the checksum entry/tail has a "magic"
file type that can be used.  This will avoid potential problems if e.g.
encrypted entries are stored inline with the inode, and/or dirdata that
also adds fields to "." and "..".
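
As a rough illustration only (this is not from the patch; `struct dirent_view`
and the helper names are made up for the sketch, and the little-endian
conversions of the on-disk fields are left out), a content-based check could
look something like the following. The 0xDE value is the file type ext4 uses
for the checksum tail entry.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, already-decoded view of one directory entry. */
struct dirent_view {
	uint32_t inode;
	uint16_t rec_len;
	uint8_t name_len;
	uint8_t file_type;
	const char *name;	/* not NUL-terminated on disk */
};

#define FT_DIR_CSUM 0xde	/* "magic" file type of the checksum tail */

/* True for "." and "..", decided purely from the entry's name. */
static bool is_dot_or_dotdot(const struct dirent_view *de)
{
	if (de->name_len == 1)
		return de->name[0] == '.';
	if (de->name_len == 2)
		return de->name[0] == '.' && de->name[1] == '.';
	return false;
}

/* True for the checksum tail: no inode, no name, magic file type. */
static bool is_csum_tail(const struct dirent_view *de)
{
	return de->inode == 0 && de->name_len == 0 &&
	       de->file_type == FT_DIR_CSUM;
}

/* No block number or in-block offset is needed to make the decision. */
static bool is_fake_entry(const struct dirent_view *de)
{
	return is_dot_or_dotdot(de) || is_csum_tail(de);
}
```

Since nothing here depends on where the entry sits in the block, the same
test would work for inline directories as well.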

Also, the patch adds the use of "lblk" all around the code, but that
wouldn't be needed if is_fake_entry() was updated as above?

Note in find_group_orlov() the filename hash doesn't strictly need to
match the actual hash used in the directory.  That is only for finding
a suitable group for 

Re: [PATCH 1/2] ext4: Handle casefolding with encryption

2021-02-19 Thread Theodore Ts'o
On Wed, Feb 17, 2021 at 03:48:39PM -0700, Andreas Dilger wrote:
> It would be possible to detect if the encrypted/casefold+dirdata
> variant is in use, because the dirdata variant would have the 0x40
> bit set in the file_type byte.  It isn't possible to positively
> identify the "raw" non-dirdata variant, but the assumption would be
> if (rec_len >= round_up(name_len, 4) + 8) in an encrypted+casefold
> directory that the "raw" hash must be present in the dirent.

Consider a 4k directory block which has only three entries,
".", "..", and "a".  The directory entry for "a" will have a rec_len
substantially larger than name_len, so the rec_len test would pass
whether or not a hash is actually stored.

Fortunately, the "raw" non-dirdata variant case can easily be
detected.  If the directory has the encryption and casefold flags set, and
the 0x40 bit is not set, then raw must be present, assuming that the
directory block has not been corrupted (but if it's corrupted, all
bets are off).
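
A minimal sketch of that rule, just to pin it down (the enum, the function
name and DIRDATA_HASH_BIT are invented for illustration; the two inode flag
values are the existing ext4 ones):

```c
#include <stdbool.h>
#include <stdint.h>

#define EXT4_ENCRYPT_FL   0x00000800u	/* inode flag: encrypted */
#define EXT4_CASEFOLD_FL  0x40000000u	/* inode flag: casefolded directory */
#define DIRDATA_HASH_BIT  0x40u		/* hypothetical name for the 0x40 file_type bit */

enum hash_location { HASH_NONE, HASH_RAW, HASH_DIRDATA };

/*
 * Decide where the stored hash lives for one entry, using only the
 * directory inode's flags and the entry's file_type byte.  rec_len is
 * deliberately not consulted.  The caller is assumed to have skipped
 * "." and "..", which never carry a hash.
 */
static enum hash_location locate_hash(uint32_t dir_inode_flags,
				      uint8_t file_type)
{
	bool enc_cf = (dir_inode_flags & EXT4_ENCRYPT_FL) &&
		      (dir_inode_flags & EXT4_CASEFOLD_FL);

	if (!enc_cf)
		return HASH_NONE;
	return (file_type & DIRDATA_HASH_BIT) ? HASH_DIRDATA : HASH_RAW;
}
```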

   - Ted
   


Re: [PATCH 1/2] ext4: Handle casefolding with encryption

2021-02-18 Thread Daniel Rosenberg
On Wed, Feb 17, 2021 at 2:48 PM Andreas Dilger  wrote:
>
> On Feb 17, 2021, at 9:08 AM, Theodore Ts'o  wrote:
> >
> > On Tue, Feb 16, 2021 at 08:01:11PM -0800, Daniel Rosenberg wrote:
> >> I'm not sure what the conflict is, at least format-wise. Naturally,
> >> there would need to be some work to reconcile the two patches, but my
> >> patch only alters the format for directories which are encrypted and
> >> casefolded, which always must have the additional hash field. In the
> >> case of dirdata along with encryption and casefolding, couldn't we
> >> have the dirdata simply follow after the existing data? Since we
> >> always already know the length, it'd be unambiguous where that would
> >> start. Casefolding can only be altered on an empty directory, and you
> >> can only enable encryption for an empty directory, so I'm not too
> >> concerned there. I feel like having it swapping between the different
> >> methods makes it more prone to bugs, although it would be doable. I've
> >> started rebasing the dirdata patch on my end to see how easy it is to
> >> mix the two. At a glance, they touch a lot of the same areas in
> >> similar ways, so it shouldn't be too hard. It's more of a question of
> >> which way we want to resolve that, and which patch goes first.
> >>
> >> I've been trying to figure out how many devices in the field are using
> >> casefolded encryption, but haven't found out yet. The code is
> >> definitely available though, so I would not be surprised if it's being
> >> used, or is about to be.
> >
> > The problem is in how the space after the filename in a directory is
> > encoded.  The dirdata format is (mildly) expandable, supporting up to
> > 4 different metadata chunks after the filename, using a very
> > compactly encoded TLV (or moral equivalent) scheme.  For directory
> > inodes that have both the encryption and casefold flags set, we have
> > a single blob which gets used as the IV for the crypto.
> >
> > So it's the difference between a simple blob that is only used for one
> > thing in this particular case, and something which is the moral
> > equivalent of simple ASN.1 or protobuf encoding.
> >
> > Currently, dirdata has defined uses for 2 of the 4 "chunks", which is
> > used in Lustre servers.  The proposal which Andreas has suggested is that,
> > if the dirdata feature is supported, then the 3rd dirdata chunk would
> > be used for the data currently stored by the
> > encrypted-casefolded extension, and the 4th would get reserved for a
> > to-be-defined extension mechanism.
> >
> > If ext4 encrypted/casefold is not yet in use, and we can get the
> > changes out to all potential users before they release products out
> > into the field, then one approach would be to only support
> > encrypted/casefold when dirdata is also enabled.
> >
> > If ext4 encrypted/casefold is in use, my suggestion is that we support
> > both encrypted/casefold && !dirdata as you have currently implemented
> > it, and encrypted/casefold && dirdata as Andreas has proposed.
> >
> > IIRC, supporting Andreas's scheme essentially means that we use
> > the top four bits in the file_type field to indicate which chunks are
> > present, and then for each chunk which is present, there is a 1 byte
> > length followed by payload.  So that means in the case where it's
> > encrypted/casefold && dirdata, the required storage of the directory
> > entry would take one additional byte, plus setting a bit indicating
> > that the encrypted/casefold dirdata chunk was present.
>
> I think your email already covers pretty much all of the points.
>
> One small difference between current "raw" encrypted/casefold hash vs.
> dirdata is that the former is 4-byte aligned within the dirent, while
> dirdata is packed.  So in 3/4 cases dirdata would take the same amount
> of space (the 1-byte length would use one of the 1-3 bytes of padding
> vs. the raw format), since the next dirent needs to be aligned anyway.
>
> The other implication here is that the 8-byte hash may need to be
> copied out of the dirent into a local variable before use, due to
> alignment issues, but I'm not sure if that is actually needed or not.
>
> > So, no, they aren't incompatible ultimately, but it might require a
> > tiny bit more work to integrate the combined support for dirdata plus
> > encrypted/casefold.  One way we can do this, if we have to support the
> > current encrypted/casefold format because it's out there in deployed
> > implementations already, is to integrate encrypted/casefold &&
> > !dirdata first upstream, and then when we integrate dirdata into
> > upstream, we'll have to add support for the encrypted/casefold &&
> > dirdata case.  This means that we'll have two variants of the on-disk
> > format to test and support, but I don't think it's going to be
> > that difficult.
>
> It would be possible to detect if the encrypted/casefold+dirdata
> variant is in use, because the dirdata variant would have the 0x40
> bit set in the 

Re: [PATCH 1/2] ext4: Handle casefolding with encryption

2021-02-17 Thread Andreas Dilger
On Feb 17, 2021, at 9:08 AM, Theodore Ts'o  wrote:
> 
> On Tue, Feb 16, 2021 at 08:01:11PM -0800, Daniel Rosenberg wrote:
>> I'm not sure what the conflict is, at least format-wise. Naturally,
>> there would need to be some work to reconcile the two patches, but my
>> patch only alters the format for directories which are encrypted and
>> casefolded, which always must have the additional hash field. In the
>> case of dirdata along with encryption and casefolding, couldn't we
>> have the dirdata simply follow after the existing data? Since we
>> always already know the length, it'd be unambiguous where that would
>> start. Casefolding can only be altered on an empty directory, and you
>> can only enable encryption for an empty directory, so I'm not too
>> concerned there. I feel like having it swapping between the different
>> methods makes it more prone to bugs, although it would be doable. I've
>> started rebasing the dirdata patch on my end to see how easy it is to
>> mix the two. At a glance, they touch a lot of the same areas in
>> similar ways, so it shouldn't be too hard. It's more of a question of
>> which way we want to resolve that, and which patch goes first.
>> 
>> I've been trying to figure out how many devices in the field are using
>> casefolded encryption, but haven't found out yet. The code is
>> definitely available though, so I would not be surprised if it's being
>> used, or is about to be.
> 
> The problem is in how the space after the filename in a directory is
> encoded.  The dirdata format is (mildly) expandable, supporting up to
> 4 different metadata chunks after the filename, using a very
> compactly encoded TLV (or moral equivalent) scheme.  For directory
> inodes that have both the encryption and casefold flags set, we have
> a single blob which gets used as the IV for the crypto.
> 
> So it's the difference between a simple blob that is only used for one
> thing in this particular case, and something which is the moral
> equivalent of simple ASN.1 or protobuf encoding.
> 
> Currently, dirdata has defined uses for 2 of the 4 "chunks", which is
> used in Lustre servers.  The proposal which Andreas has suggested is that,
> if the dirdata feature is supported, then the 3rd dirdata chunk would
> be used for the data currently stored by the
> encrypted-casefolded extension, and the 4th would get reserved for a
> to-be-defined extension mechanism.
> 
> If ext4 encrypted/casefold is not yet in use, and we can get the
> changes out to all potential users before they release products out
> into the field, then one approach would be to only support
> encrypted/casefold when dirdata is also enabled.
> 
> If ext4 encrypted/casefold is in use, my suggestion is that we support
> both encrypted/casefold && !dirdata as you have currently implemented
> it, and encrypted/casefold && dirdata as Andreas has proposed.
> 
> IIRC, supporting Andreas's scheme essentially means that we use
> the top four bits in the file_type field to indicate which chunks are
> present, and then for each chunk which is present, there is a 1 byte
> length followed by payload.  So that means in the case where it's
> encrypted/casefold && dirdata, the required storage of the directory
> entry would take one additional byte, plus setting a bit indicating
> that the encrypted/casefold dirdata chunk was present.

I think your email already covers pretty much all of the points.

One small difference between current "raw" encrypted/casefold hash vs.
dirdata is that the former is 4-byte aligned within the dirent, while
dirdata is packed.  So in 3/4 cases dirdata would take the same amount
of space (the 1-byte length would use one of the 1-3 bytes of padding
vs. the raw format), since the next dirent needs to be aligned anyway.

The other implication here is that the 8-byte hash may need to be
copied out of the dirent into a local variable before use, due to
alignment issues, but I'm not sure if that is actually needed or not.
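
To put numbers on both points, here is a small sketch.  It assumes the usual
8-byte dirent header, an 8-byte hash (hash + minor_hash), and that the
dirdata length byte is stored in addition to the 8 payload bytes; the helper
names are invented.  With those assumptions a 1-, 2- or 3-byte name needs 20
bytes in either format, while a 4-byte name needs 20 bytes raw and 24 bytes
with dirdata.  The byte-wise load at the end is one way to read the hash
without caring about alignment (in the kernel, get_unaligned_le32() would
serve the same purpose).

```c
#include <stddef.h>
#include <stdint.h>

#define DIRENT_HDR 8u	/* inode(4) + rec_len(2) + name_len(1) + file_type(1) */
#define HASH_LEN   8u	/* hash(4) + minor_hash(4) */

static size_t round_up4(size_t n)
{
	return (n + 3u) & ~(size_t)3u;
}

/* Raw variant: the name is padded to 4 bytes, then the hash follows. */
static size_t raw_dirent_size(size_t name_len)
{
	return round_up4(DIRENT_HDR + name_len) + HASH_LEN;
}

/* dirdata variant: length byte + hash packed right after the name,
 * and only the end of the entry is padded to 4 bytes. */
static size_t dirdata_dirent_size(size_t name_len)
{
	return round_up4(DIRENT_HDR + name_len + 1u + HASH_LEN);
}

struct dirent_hash {
	uint32_t hash;
	uint32_t minor_hash;
};

/* Load one little-endian 32-bit value from a possibly unaligned address. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Copy the packed hash out of the dirent into an aligned local struct. */
static struct dirent_hash read_hash(const uint8_t *p)
{
	struct dirent_hash h;

	h.hash = load_le32(p);
	h.minor_hash = load_le32(p + 4);
	return h;
}
```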

> So, no, they aren't incompatible ultimately, but it might require a
> tiny bit more work to integrate the combined support for dirdata plus
> encrypted/casefold.  One way we can do this, if we have to support the
> current encrypted/casefold format because it's out there in deployed
> implementations already, is to integrate encrypted/casefold &&
> !dirdata first upstream, and then when we integrate dirdata into
> upstream, we'll have to add support for the encrypted/casefold &&
> dirdata case.  This means that we'll have two variants of the on-disk
> format to test and support, but I don't think it's going to be
> that difficult.

It would be possible to detect if the encrypted/casefold+dirdata
variant is in use, because the dirdata variant would have the 0x40
bit set in the file_type byte.  It isn't possible to positively
identify the "raw" non-dirdata variant, but the assumption would be
if (rec_len >= round_up(name_len, 4) + 8) in an encrypted+casefold
directory that the "raw" hash 

Re: [PATCH 1/2] ext4: Handle casefolding with encryption

2021-02-17 Thread Theodore Ts'o
On Tue, Feb 16, 2021 at 08:01:11PM -0800, Daniel Rosenberg wrote:
> I'm not sure what the conflict is, at least format-wise. Naturally,
> there would need to be some work to reconcile the two patches, but my
> patch only alters the format for directories which are encrypted and
> casefolded, which always must have the additional hash field. In the
> case of dirdata along with encryption and casefolding, couldn't we
> have the dirdata simply follow after the existing data? Since we
> always already know the length, it'd be unambiguous where that would
> start. Casefolding can only be altered on an empty directory, and you
> can only enable encryption for an empty directory, so I'm not too
> concerned there. I feel like having it swapping between the different
> methods makes it more prone to bugs, although it would be doable. I've
> started rebasing the dirdata patch on my end to see how easy it is to
> mix the two. At a glance, they touch a lot of the same areas in
> similar ways, so it shouldn't be too hard. It's more of a question of
> which way we want to resolve that, and which patch goes first.
> 
> I've been trying to figure out how many devices in the field are using
> casefolded encryption, but haven't found out yet. The code is
> definitely available though, so I would not be surprised if it's being
> used, or is about to be.

The problem is in how the space after the filename in a directory is
encoded.  The dirdata format is (mildly) expandable, supporting up to
4 different metadata chunks after the filename, using a very
compactly encoded TLV (or moral equivalent) scheme.  For directory
inodes that have both the encryption and casefold flags set, we have
a single blob which gets used as the IV for the crypto.

So it's the difference between a simple blob that is only used for one
thing in this particular case, and something which is the moral
equivalent of simple ASN.1 or protobuf encoding.

Currently, dirdata has defined uses for 2 of the 4 "chunks", which is
used in Lustre servers.  The proposal which Andreas has suggested is that,
if the dirdata feature is supported, then the 3rd dirdata chunk would
be used for the data currently stored by the
encrypted-casefolded extension, and the 4th would get reserved for a
to-be-defined extension mechanism.

If ext4 encrypted/casefold is not yet in use, and we can get the
changes out to all potential users before they release products out
into the field, then one approach would be to only support
encrypted/casefold when dirdata is also enabled.

If ext4 encrypted/casefold is in use, my suggestion is that we support
both encrypted/casefold && !dirdata as you have currently implemented
it, and encrypted/casefold && dirdata as Andreas has proposed.

IIRC, supporting Andreas's scheme essentially means that we use
the top four bits in the file_type field to indicate which chunks are
present, and then for each chunk which is present, there is a 1 byte
length followed by payload.  So that means in the case where it's
encrypted/casefold && dirdata, the required storage of the directory
entry would take one additional byte, plus setting a bit indicating
that the encrypted/casefold dirdata chunk was present.
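
To illustrate the encoding, here is a sketch of a chunk walker.  It is not
the Lustre code: it assumes the presence bits are the high four bits of the
file_type byte (0x10, 0x20, 0x40, 0x80, as Andreas describes it elsewhere in
this thread), that chunks are stored in ascending bit order directly after
the name, and that each length byte counts only its payload.  All names are
made up.

```c
#include <stddef.h>
#include <stdint.h>

#define DIRDATA_NCHUNKS 4

/*
 * Find chunk number `want` (0..3) in the bytes following the name.
 * Returns a pointer to its payload and stores the payload length in
 * *len_out, or returns NULL if the chunk is absent or the data is
 * truncated.  Chunks we do not understand are skipped by length.
 */
static const uint8_t *dirdata_find(uint8_t file_type, const uint8_t *data,
				   size_t data_len, int want,
				   uint8_t *len_out)
{
	size_t off = 0;

	for (int i = 0; i < DIRDATA_NCHUNKS; i++) {
		if (!(file_type & (0x10u << i)))
			continue;		/* chunk i not present */
		if (off >= data_len)
			return NULL;		/* no room for the length byte */

		uint8_t len = data[off];

		if ((size_t)len > data_len - off - 1)
			return NULL;		/* payload runs past the entry */
		if (i == want) {
			*len_out = len;
			return data + off + 1;
		}
		off += 1 + (size_t)len;		/* skip length byte + payload */
	}
	return NULL;
}
```

Under these assumptions the encrypted/casefold hash would be chunk index 2
(the 0x40 bit), so a lookup would pass want = 2 and expect a length of 8.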

So, no, they aren't incompatible ultimately, but it might require a
tiny bit more work to integrate the combined support for dirdata plus
encrypted/casefold.  One way we can do this, if we have to support the
current encrypted/casefold format because it's out there in deployed
implementations already, is to integrate encrypted/casefold &&
!dirdata first upstream, and then when we integrate dirdata into
upstream, we'll have to add support for the encrypted/casefold &&
dirdata case.  This means that we'll have two variants of the on-disk
format to test and support, but I don't think it's going to be
that difficult.

Andreas, anything you'd like to correct or add in this summary?

- Ted


Re: [PATCH 1/2] ext4: Handle casefolding with encryption

2021-02-16 Thread Daniel Rosenberg
I'm not sure what the conflict is, at least format-wise. Naturally,
there would need to be some work to reconcile the two patches, but my
patch only alters the format for directories which are encrypted and
casefolded, which always must have the additional hash field. In the
case of dirdata along with encryption and casefolding, couldn't we
have the dirdata simply follow after the existing data? Since we
always already know the length, it'd be unambiguous where that would
start. Casefolding can only be altered on an empty directory, and you
can only enable encryption for an empty directory, so I'm not too
concerned there. I feel like having it swapping between the different
methods makes it more prone to bugs, although it would be doable. I've
started rebasing the dirdata patch on my end to see how easy it is to
mix the two. At a glance, they touch a lot of the same areas in
similar ways, so it shouldn't be too hard. It's more of a question of
which way we want to resolve that, and which patch goes first.

I've been trying to figure out how many devices in the field are using
casefolded encryption, but haven't found out yet. The code is
definitely available though, so I would not be surprised if it's being
used, or is about to be.

-Daniel
On Tue, Feb 9, 2021 at 8:03 PM Theodore Ts'o  wrote:
>
> On Tue, Feb 09, 2021 at 08:03:10PM -0700, Andreas Dilger wrote:
> > Depending on the size of the "escape", it probably makes sense to move
> > toward having e2fsck migrate from the current mechanism to using dirdata
> > for all deployments.  In the current implementation, tools don't really
> > know for sure if there is data beyond the filename in the dirent or not.
>
> It's actually quite well defined.  If dirdata is enabled, then we
> follow the dirdata rules.  If dirdata is *not* enabled, then if a
> directory inode has the case folding and encryption flags set, then
> there will be cryptographic data immediately following the filename.
> Otherwise, there is no valid data after the filename.
>
> > For example, what if casefold is enabled on an existing filesystem that
> > already has an encrypted directory?  Does the code _assume_ that there is
> > a hash beyond the name if the rec_len is long enough for this?
>
> No, we will only expect there to be a hash beyond the name if
> EXT4_CASEFOLD_FL and EXT4_ENCRYPT_FL flags are set on the inode.  (And
> if the rec_len is not large enough, then that's a corrupted directory
> entry.)
>
> > I guess it is implicit with the casefold+encryption case for dirents in
> > directories that have the encryption flag set in a filesystem that also
> > has casefold enabled, but it's definitely not friendly to these features
> > being enabled on an existing filesystem.
>
> No, it's fine.  That's because the EXT4_CASEFOLD_FL inode flag can
> only be set if the EXT4_FEATURE_INCOMPAT_CASEFOLD is set in the
> superblock, and EXT4_ENCRYPT_FL inode flag can only be set if
> EXT4_FEATURE_INCOMPAT_ENCRYPT is set in the superblock.  This is why it
> will be safe to enable these features, since merely enabling the
> file system features only allows new directories to be created with
> both CASEFOLD_FL and ENCRYPT_FL set.
>
> The only restriction we would have is that if a file system has both the
> case folding and encryption features, it will *not* be safe to set the
> dirdata feature flag without first scanning all of the directories to
> see if there are any directories that have both the casefold and
> encrypt flags set on that inode, and if so, to convert all of the
> directory entries to use dirdata.  I don't think this is going to be a
> significant restriction in practice, though.
>
> - Ted


Re: [PATCH 1/2] ext4: Handle casefolding with encryption

2021-02-09 Thread Theodore Ts'o
On Tue, Feb 09, 2021 at 08:03:10PM -0700, Andreas Dilger wrote:
> Depending on the size of the "escape", it probably makes sense to move
> toward having e2fsck migrate from the current mechanism to using dirdata
> for all deployments.  In the current implementation, tools don't really
> know for sure if there is data beyond the filename in the dirent or not.

It's actually quite well defined.  If dirdata is enabled, then we
follow the dirdata rules.  If dirdata is *not* enabled, then if a
directory inode has the case folding and encryption flags set, then
there will be cryptographic data immediately following the filename.
Otherwise, there is no valid data after the filename.

> For example, what if casefold is enabled on an existing filesystem that
> already has an encrypted directory?  Does the code _assume_ that there is
> a hash beyond the name if the rec_len is long enough for this?

No, we will only expect there to be a hash beyond the name if
EXT4_CASEFOLD_FL and EXT4_ENCRYPT_FL flags are set on the inode.  (And
if the rec_len is not large enough, then that's a corrupted directory
entry.)
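
Spelled out as a sketch for the non-dirdata case (the function and enum names
are invented; rec_len is taken to be the already-decoded byte count, and the
8-byte header and 8-byte hash sizes are the usual ones):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXT4_ENCRYPT_FL   0x00000800u	/* inode flag: encrypted */
#define EXT4_CASEFOLD_FL  0x40000000u	/* inode flag: casefolded directory */

enum dirent_check { DIRENT_OK, DIRENT_CORRUPT };

static size_t round_up4(size_t n)
{
	return (n + 3u) & ~(size_t)3u;
}

/*
 * Whether a hash is expected depends only on the directory inode's
 * flags (and on the entry not being "." or "..").  rec_len is then
 * used solely as a sanity check, never as the trigger.
 */
static enum dirent_check check_raw_dirent(uint32_t dir_inode_flags,
					  uint16_t rec_len, uint8_t name_len,
					  bool is_dot_or_dotdot)
{
	size_t need = round_up4(8u + name_len);	/* header + padded name */
	bool hashed = (dir_inode_flags & EXT4_ENCRYPT_FL) &&
		      (dir_inode_flags & EXT4_CASEFOLD_FL) &&
		      !is_dot_or_dotdot;

	if (hashed)
		need += 8u;			/* hash + minor_hash */
	return rec_len >= need ? DIRENT_OK : DIRENT_CORRUPT;
}
```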

> I guess it is implicit with the casefold+encryption case for dirents in
> directories that have the encryption flag set in a filesystem that also
> has casefold enabled, but it's definitely not friendly to these features
> being enabled on an existing filesystem.

No, it's fine.  That's because the EXT4_CASEFOLD_FL inode flag can
only be set if the EXT4_FEATURE_INCOMPAT_CASEFOLD is set in the
superblock, and EXT4_ENCRYPT_FL inode flag can only be set if
EXT4_FEATURE_INCOMPAT_ENCRYPT is set in the superblock.  This is why it
will be safe to enable these features, since merely enabling the
file system features only allows new directories to be created with
both CASEFOLD_FL and ENCRYPT_FL set.

The only restriction we would have is that if a file system has both the
case folding and encryption features, it will *not* be safe to set the
dirdata feature flag without first scanning all of the directories to
see if there are any directories that have both the casefold and
encrypt flags set on that inode, and if so, to convert all of the
directory entries to use dirdata.  I don't think this is going to be a
significant restriction in practice, though.

- Ted




Re: [PATCH 1/2] ext4: Handle casefolding with encryption

2021-02-09 Thread Andreas Dilger
On Feb 9, 2021, at 4:22 PM, Theodore Ts'o  wrote:
> 
> On Wed, Feb 03, 2021 at 11:31:28AM -0500, Theodore Ts'o wrote:
>> On Wed, Feb 03, 2021 at 03:55:06AM -0700, Andreas Dilger wrote:
>>> 
>>> It looks like this change will break the dirdata feature, which is similarly
>>> storing a data field beyond the end of the dirent. However, that feature also
>>> provides for flags stored in the high bits of the type field to indicate
>>> which of the fields are in use there.
>>> The first byte of each field stores
>>> the length, so it can be skipped even if the content is not understood.
>> 
>> Daniel, for context, the dirdata field is an out-of-tree feature which
>> is used by Lustre, and so has a fairly large deployed base.  So if there
>> is a way that we can accommodate not breaking dirdata, that would be
>> good.
>> 
>> Did the ext4 casefold+encryption implementation escape out to any
>> Android handsets?
> 
> So from an OOB chat with Daniel, it appears that the ext4
> casefold+encryption implementation did in fact escape out to Android
> handsets.  So I think what we will need to do, ultimately, is support
> one way of supporting the casefold IV in the case where "encryption &&
> casefold", and another way when "encryption && casefold && dirdata".
> 
> That's going to be a bit sucky, but I don't think it should be that
> complex.  Daniel, Andreas, does that make sense to you?

I was just going to ping you about this, whether it made sense to remove
this feature addition from the "maint" branch (i.e. make a 1.45.8 without
it), and keep it only in 1.46 or "next" to reduce its spread?

Depending on the size of the "escape", it probably makes sense to move
toward having e2fsck migrate from the current mechanism to using dirdata
for all deployments.  In the current implementation, tools don't really
know for sure if there is data beyond the filename in the dirent or not.

I guess it is implicit with the casefold+encryption case for dirents in
directories that have the encryption flag set in a filesystem that also
has casefold enabled, but it's definitely not friendly to these features
being enabled on an existing filesystem.

For example, what if casefold is enabled on an existing filesystem that
already has an encrypted directory?  Does the code _assume_ that there is
a hash beyond the name if the rec_len is long enough for this?  There will
definitely be some pre-existing dirents that will have a large rec_len
(e.g. those at the end of the block, or with deleted entries immediately
following), that do *not* have the proper hash stored in them.  There may
be random garbage at the end of the dirent, and since every value in the
hash is valid, there is no way to know whether it is good or bad.

With the dirdata mechanism, there would be a bit set in the "file_type"
field that will indicate if the hash was present, as well as a length
field (0x08) that is a second confirmation that this field is valid.

Cheers, Andreas









Re: [PATCH 1/2] ext4: Handle casefolding with encryption

2021-02-09 Thread Theodore Ts'o
On Wed, Feb 03, 2021 at 11:31:28AM -0500, Theodore Ts'o wrote:
> On Wed, Feb 03, 2021 at 03:55:06AM -0700, Andreas Dilger wrote:
> > 
> > It looks like this change will break the dirdata feature, which is similarly
> > storing a data field beyond the end of the dirent. However, that feature also
> > provides for flags stored in the high bits of the type field to indicate
> > which of the fields are in use there.
> > The first byte of each field stores
> > the length, so it can be skipped even if the content is not understood.
> 
> Daniel, for context, the dirdata field is an out-of-tree feature which
> is used by Lustre, and so has a fairly large deployed base.  So if there
> is a way that we can accommodate not breaking dirdata, that would be
> good.
> 
> Did the ext4 casefold+encryption implementation escape out to any
> Android handsets?

So from an OOB chat with Daniel, it appears that the ext4
casefold+encryption implementation did in fact escape out to Android
handsets.  So I think what we will need to do, ultimately, is support
one way of supporting the casefold IV in the case where "encryption &&
casefold", and another way when "encryption && casefold && dirdata".

That's going to be a bit sucky, but I don't think it should be that
complex.  Daniel, Andreas, does that make sense to you?

   - Ted


Re: [PATCH 1/2] ext4: Handle casefolding with encryption

2021-02-03 Thread Theodore Ts'o
On Wed, Feb 03, 2021 at 03:55:06AM -0700, Andreas Dilger wrote:
> 
> It looks like this change will break the dirdata feature, which is similarly
> storing a data field beyond the end of the dirent. However, that feature also
> provides for flags stored in the high bits of the type field to indicate
> which of the fields are in use there.
> The first byte of each field stores
> the length, so it can be skipped even if the content is not understood.

Daniel, for context, the dirdata field is an out-of-tree feature which
is used by Lustre, and so has a fairly large deployed base.  So if there
is a way that we can accommodate not breaking dirdata, that would be
good.

Did the ext4 casefold+encryption implementation escape out to any
Android handsets?

Thanks,

- Ted


[PATCH 1/2] ext4: Handle casefolding with encryption

2021-02-03 Thread Daniel Rosenberg
This adds support for encryption with casefolding.

Since the name on disk is case preserving, and also encrypted, we can no
longer just recompute the hash on the fly. Additionally, to avoid
leaking extra information from the hash of the unencrypted name, we use
siphash via an fscrypt v2 policy.

The hash is stored at the end of the directory entry for all entries
inside of an encrypted and casefolded directory apart from those that
deal with '.' and '..'. This way, the change is backwards compatible
with existing ext4 filesystems.
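
As an illustration of the layout described above (not the code in this
patch), the following userspace sketch shows how the record length
grows when the two 32-bit hash values are appended after the name, and
how they could be read back.  The names sketch_rec_len(),
sketch_read_hash() and struct dirent_hash are hypothetical; the 8-byte
dirent header and the padding to a multiple of 4 bytes follow the
existing ext4 directory entry format.

```c
#include <stdbool.h>
#include <stdint.h>

#define DIRENT_HEADER_LEN 8u  /* inode (4) + rec_len (2) + name_len (1) + file_type (1) */
#define DIRENT_ROUND      3u  /* records are padded to a multiple of 4 bytes */

/* Hypothetical mirror of the two fields stored after the name. */
struct dirent_hash {
	uint32_t hash;
	uint32_t minor_hash;
};

/* Minimum record length for a name, with or without the stored hash. */
static unsigned int sketch_rec_len(uint8_t name_len, bool has_hash)
{
	unsigned int len = DIRENT_HEADER_LEN + name_len;

	if (has_hash)
		len += (unsigned int)sizeof(struct dirent_hash);
	return (len + DIRENT_ROUND) & ~DIRENT_ROUND;
}

/* Decode one little-endian 32-bit value from raw bytes. */
static uint32_t sketch_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read the hash pair that sits immediately after the name bytes. */
static struct dirent_hash sketch_read_hash(const uint8_t *name, uint8_t name_len)
{
	const uint8_t *p = name + name_len;
	struct dirent_hash h;

	h.hash = sketch_le32(p);
	h.minor_hash = sketch_le32(p + 4);
	return h;
}
```

With these definitions, a one-character name without a hash needs 12
bytes, which is why '..' starts at offset 12 in the first block, and a
255-byte name with a hash has 8 + 255 + 8 = 271 bytes of content,
padded to 272.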

Signed-off-by: Daniel Rosenberg 
Signed-off-by: Paul Lawrence 
---
 Documentation/filesystems/ext4/directory.rst |  27 ++
 fs/ext4/dir.c|  46 ++-
 fs/ext4/ext4.h   |  62 +++-
 fs/ext4/hash.c   |  25 +-
 fs/ext4/ialloc.c |   5 +-
 fs/ext4/inline.c |  41 +--
 fs/ext4/namei.c  | 308 +--
 fs/ext4/super.c  |   6 -
 8 files changed, 373 insertions(+), 147 deletions(-)

diff --git a/Documentation/filesystems/ext4/directory.rst b/Documentation/filesystems/ext4/directory.rst
index 073940cc64ed..55f618b37144 100644
--- a/Documentation/filesystems/ext4/directory.rst
+++ b/Documentation/filesystems/ext4/directory.rst
@@ -121,6 +121,31 @@ The directory file type is one of the following values:
    * - 0x7
      - Symbolic link.
 
+To support directories that are both encrypted and casefolded, we must also
+include hash information in the directory entry. We append
+``ext4_extended_dir_entry_2`` to ``ext4_dir_entry_2`` except for the entries
+for dot and dotdot, which are kept the same. The structure follows immediately
+after ``name`` and is included in the size listed by ``rec_len``. If a directory
+entry uses this extension, it may be up to 271 bytes.
+
+.. list-table::
+   :widths: 8 8 24 40
+   :header-rows: 1
+
+   * - Offset
+     - Size
+     - Name
+     - Description
+   * - 0x0
+     - \_\_le32
+     - hash
+     - The hash of the directory name
+   * - 0x4
+     - \_\_le32
+     - minor\_hash
+     - The minor hash of the directory name
+
+
 In order to add checksums to these classic directory blocks, a phony
 ``struct ext4_dir_entry`` is placed at the end of each leaf block to
 hold the checksum. The directory entry is 12 bytes long. The inode
@@ -322,6 +347,8 @@ The directory hash is one of the following values:
      - Half MD4, unsigned.
    * - 0x5
      - Tea, unsigned.
+   * - 0x6
+     - Siphash.
 
 Interior nodes of an htree are recorded as ``struct dx_node``, which is
 also the full length of a data block:
diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index ca50c90adc4c..9da6db183d4f 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -30,6 +30,8 @@
 #include "ext4.h"
 #include "xattr.h"
 
+#define DOTDOT_OFFSET 12
+
 static int ext4_dx_readdir(struct file *, struct dir_context *);
 
 /**
@@ -55,6 +57,19 @@ static int is_dx_dir(struct inode *inode)
return 0;
 }
 
+static bool is_fake_entry(struct inode *dir, ext4_lblk_t lblk,
+			  unsigned int offset, unsigned int blocksize)
+{
+	/* Entries in the first block before this value refer to . or .. */
+	if (lblk == 0 && offset <= DOTDOT_OFFSET)
+		return true;
+	/* Check if this is likely the csum entry */
+	if (ext4_has_metadata_csum(dir->i_sb) && offset % blocksize ==
+				blocksize - sizeof(struct ext4_dir_entry_tail))
+		return true;
+	return false;
+}
+
 /*
  * Return 0 if the directory entry is OK, and 1 if there is a problem
  *
@@ -67,22 +82,28 @@ int __ext4_check_dir_entry(const char *function, unsigned int line,
   struct inode *dir, struct file *filp,
   struct ext4_dir_entry_2 *de,
   struct buffer_head *bh, char *buf, int size,
+  ext4_lblk_t lblk,
   unsigned int offset)
 {
const char *error_msg = NULL;
const int rlen = ext4_rec_len_from_disk(de->rec_len,
dir->i_sb->s_blocksize);
const int next_offset = ((char *) de - buf) + rlen;
+   unsigned int blocksize = dir->i_sb->s_blocksize;
+   bool fake = is_fake_entry(dir, lblk, offset, blocksize);
+   bool next_fake = is_fake_entry(dir, lblk, next_offset, blocksize);
 
-   if (unlikely(rlen < EXT4_DIR_REC_LEN(1)))
+   if (unlikely(rlen < ext4_dir_rec_len(1, fake ? NULL : dir)))
error_msg = "rec_len is smaller than minimal";
else if (unlikely(rlen % 4 != 0))
error_msg = "rec_len % 4 != 0";
-   else if (unlikely(rlen < EXT4_DIR_REC_LEN(de->name_len)))
+   else if (unlikely(rlen < ext4_dir_rec_len(de->name_len,
+   fake ? NULL :