Re: [PATCH] loop: remember whether sysfs_create_group() succeeded

2018-05-08 Thread Milan Broz
On 05/05/2018 01:49 PM, Tetsuo Handa wrote:
> Milan Broz wrote:
>>> Do we want to abort LOOP_SET_FD request if sysfs_create_group() failed?
>>
>> I would prefer failure - there are several utilities that expect the
>> attributes in sysfs to be valid (for example, cryptsetup status prints
>> info from there if the backing device is an image file), so ignoring
>> the failure puts the system in an inconsistent state.
> 
> I see. But can we for now send the v1 patch for the 4.17 release (and
> postpone making the LOOP_SET_FD request fail if sysfs_create_group()
> failed)? This bug has so far crashed syzbot tests 6432 times in 190 days.

Jens already merged it into the block git tree. So syzbot should be happier now :)
> We have a lot of bugs in the loop module which prevent syzbot from
> finding other bugs. I want to squash the bugs in block/loop immediately
> so that we can reduce false-positive hung task reports.

Sure, syzbot is definitely a very useful idea, thanks!

Milan



Re: [PATCH] loop: remember whether sysfs_create_group() succeeded

2018-05-05 Thread Milan Broz
On 05/04/2018 04:40 PM, Tetsuo Handa wrote:
> The loop module ignores sysfs_create_group() failure and pretends that
> the LOOP_SET_FD request succeeded. I guess that the author of commit
> ee86273062cbb310 ("loop: add some basic read-only sysfs attributes")
> assumed that it is not an error fatal enough to abort the LOOP_SET_FD
> request.

IIRC we added the sysfs attributes to give a regular user easy access to
loop info in the lsblk command (and perhaps even in udev rules).

The ioctl interface was still controlling the loop device, so the failure
here was not meant to be fatal. TBH I think that was not a great idea.
 
> Do we want to abort LOOP_SET_FD request if sysfs_create_group() failed?

I would prefer failure - there are several utilities that expect the
attributes in sysfs to be valid (for example, cryptsetup status prints
info from there if the backing device is an image file), so ignoring
the failure puts the system in an inconsistent state.
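To illustrate the kind of consumer described above, here is a minimal userspace sketch of an attribute reader (the helper name and the test path are hypothetical; cryptsetup actually reads attributes such as /sys/block/loopN/loop/backing_file). If the loop driver silently skips sysfs_create_group(), the open below fails even though the device itself works:

```c
#include <stdio.h>
#include <string.h>

/* Read a small sysfs-style attribute into buf.
 * Returns 0 on success, -1 if the attribute is missing -- the
 * inconsistent state that results when sysfs_create_group() failed
 * but LOOP_SET_FD still reported success. */
static int read_attr(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;              /* attribute group was never created */
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';  /* sysfs values end with '\n' */
    return 0;
}
```

A utility calling this has no way to distinguish "old kernel without the attributes" from "attribute creation silently failed", which is why failing LOOP_SET_FD is the cleaner contract.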

Thanks for fixing this,
Milan


Re: [PATCH v2 0/3] loop: LO_FLAGS_BLOCKSIZE fixes

2017-08-22 Thread Milan Broz
On 08/18/2017 09:27 PM, Omar Sandoval wrote:
> From: Omar Sandoval 
> 
> Patches 1 and 3 are from the original series.
> 
> Patch 2 gets rid of the redundant struct loop_device.lo_logical_blocksize
> in favor of using the queue's own logical_block_size. Karel, I decided
> against adding another sysfs entry since it will always be the same as
> queue/logical_block_size, is that alright with you?
> 
> Omar Sandoval (3):
>   loop: fix hang if LOOP_SET_STATUS gets invalid blocksize or encrypt
> type
>   loop: use queue limit instead of private lo_logical_blocksize
>   loop: always return block size in LOOP_GET_STATUS

I tested these with cryptsetup & loop mapping with the added block size
setting, and they fix the loop problems I see in 4.13.

I think all these patches should go into 4.13.

Thanks!
Milan


Re: [PATCH] loop: Fix freeze if configured block size is not supported

2017-08-21 Thread Milan Broz
On 08/21/2017 08:47 PM, Omar Sandoval wrote:
> On Fri, Aug 18, 2017 at 03:07:33PM +0200, Milan Broz wrote:
>> The commit f2c6df7dbf9a60e1cd9941f9fb376d4d9ad1e8dd
>>
>> loop: support 4k physical blocksize
>>
>> adds support for setting the loop device block size, but only for
>> specific block sizes.
>>
>> If the size is not supported, the code returns -EINVAL while keeping
>> the loop queue frozen. This can leave the device locked for a long
>> time by processes trying to scan the device (udev).
>> (It also causes subsequent LOOP_CLR_FD operations to become no-ops.)
>>
>> Fix it by using a goto to the proper exit location that unfreezes
>> the queue.
>>
>> (The same bug exists for setting the crypt attribute, but that code
>> is probably no longer used. The patch fixes it as well, though.)
>>
>> Signed-off-by: Milan Broz <gmazyl...@gmail.com>
> 
> Heh, I sent the same patch [1] only hours before :) v2 is here [2] if
> you want to give it a reviewed-by.

Yes, and I noticed it 10 seconds after I sent my patch :)
You can add my Reviewed-by, if it helps anything...

Actually, you fixed other problems there with the follow-up patches
(physical block sizes); we discussed with Karel that it needed some
changes, because the original patch caused the reported block sizes to
differ between old and new kernels (lsblk -t), even when the block size
setting was not used.

I will test them as well (we have code in cryptsetup that allocates a
loop device automatically if the image is in a file, and I would like
to add a block size setting there).

Thanks,
Milan

> 
> 1: https://marc.info/?l=linux-block&m=150303694018659&w=2
> 2: https://marc.info/?l=linux-block&m=150308446732748&w=2
> 


[PATCH] loop: Fix freeze if configured block size is not supported

2017-08-18 Thread Milan Broz
The commit f2c6df7dbf9a60e1cd9941f9fb376d4d9ad1e8dd

loop: support 4k physical blocksize

adds support for setting the loop device block size, but only for
specific block sizes.

If the size is not supported, the code returns -EINVAL while keeping
the loop queue frozen. This can leave the device locked for a long
time by processes trying to scan the device (udev).
(It also causes subsequent LOOP_CLR_FD operations to become no-ops.)

Fix it by using a goto to the proper exit location that unfreezes
the queue.

(The same bug exists for setting the crypt attribute, but that code
is probably no longer used. The patch fixes it as well, though.)

Signed-off-by: Milan Broz <gmazyl...@gmail.com>
---
 drivers/block/loop.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index ef8334949b42..26548e07bc31 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1125,11 +1125,15 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
if (info->lo_encrypt_type) {
unsigned int type = info->lo_encrypt_type;
 
-   if (type >= MAX_LO_CRYPT)
-   return -EINVAL;
+   if (type >= MAX_LO_CRYPT) {
+   err = -EINVAL;
+   goto exit;
+   }
xfer = xfer_funcs[type];
-   if (xfer == NULL)
-   return -EINVAL;
+   if (xfer == NULL) {
+   err = -EINVAL;
+   goto exit;
+   }
} else
xfer = NULL;
 
@@ -1144,10 +1148,14 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
if (LO_INFO_BLOCKSIZE(info) != 512 &&
LO_INFO_BLOCKSIZE(info) != 1024 &&
LO_INFO_BLOCKSIZE(info) != 2048 &&
-   LO_INFO_BLOCKSIZE(info) != 4096)
-   return -EINVAL;
-   if (LO_INFO_BLOCKSIZE(info) > lo->lo_blocksize)
-   return -EINVAL;
+   LO_INFO_BLOCKSIZE(info) != 4096) {
+   err = -EINVAL;
+   goto exit;
+   }
+   if (LO_INFO_BLOCKSIZE(info) > lo->lo_blocksize) {
+   err = -EINVAL;
+   goto exit;
+   }
}
 
if (lo->lo_offset != info->lo_offset ||
-- 
2.14.1
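The shape of the fix can be demonstrated outside the kernel. This is a hedged userspace sketch, not the real driver code: `frozen` stands in for the queue freeze depth that blk_mq_freeze_queue()/blk_mq_unfreeze_queue() manage, and the constants mimic the checks in the hunks above. The point is that every error path goes through the single exit label, so the queue can never stay frozen:

```c
/* frozen models the lo->lo_queue freeze depth; the real code calls
 * blk_mq_freeze_queue()/blk_mq_unfreeze_queue() */
static int frozen;

static void freeze_queue(void)   { frozen++; }
static void unfreeze_queue(void) { frozen--; }

/* simplified loop_set_status(): every failure path must reach the
 * single exit label so the queue never stays frozen */
static int set_status(unsigned int encrypt_type, unsigned int block_size)
{
    int err = 0;

    freeze_queue();

    if (encrypt_type >= 4) {      /* MAX_LO_CRYPT-style bound check */
        err = -22;                /* -EINVAL */
        goto exit;
    }
    if (block_size != 512 && block_size != 1024 &&
        block_size != 2048 && block_size != 4096) {
        err = -22;
        goto exit;
    }
exit:
    unfreeze_queue();             /* runs on success and failure alike */
    return err;
}
```

Returning -EINVAL directly from inside the checks, as the code did before the patch, would leave `frozen` at 1 forever - which is exactly the hang udev scans were hitting.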



Re: [PATCH] bio-integrity: revert "stop abusing bi_end_io"

2017-08-08 Thread Milan Broz
Hi,

On 08/07/2017 05:48 PM, Martin K. Petersen wrote:
> 
>> If you create the integrity tag at or above device mapper level, you
>> will run into problems because the same device can be accessed using
>> device mapper and using physical volume /dev/sd*. If you create
>> integrity tags at device mapper level, they will contain device
>> mapper's logical sector number and the sector number won't match if
>> you access the device directly using /dev/sd*.
> 
> For writes, the bip seed value is adjusted every time a bio is cloned,
> sliced and diced as it traverses partitioning/MD/DM. And then at the
> bottom of the stack, the ref tag values in the protection information
> buffer are verified against the adjusted seed value and remapped
> according to the starting logical block number. The reverse is taking
> place for reads.
> 
> This is much faster than doing a remapping of the actual protection
> buffer values every time the I/O transitions from one address space to
> the other. In addition, some HBA hardware allows us to program the PI
> engine with the seed value. So the submitter value to LBA conversion can
> be done on the fly in hardware.

Yes, this is how enterprise storage works and how the PI was designed.

For authenticated encryption with additional data (AEAD) we have to
authenticate the sector number as well (as part of the additional data -
the authenticated but not encrypted part).

Unfortunately the whole design cannot work this way if the AEAD is
implemented on some higher layer. (A higher layer has the big advantage
that you do not need to trust the underlying storage stack, because you
will detect even intentional tampering with the data.)
So only the layer that owns the keys (here dm-crypt) can calculate or
verify the auth tags (and then decrypt the data). Nobody else can "remap"
the sector in the tag - without the key you cannot recalculate the auth
tags.

In other words, we just treat the whole additional data (delivered in
bio-integrity) as the "Application tag" (if using DIF terminology...)
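The key property - that a tag binds the sector number and can only be recomputed by the key owner - can be shown with a toy keyed checksum (this is NOT real cryptography, just an illustration of why a lower layer cannot "remap" the sector inside an auth tag without the key):

```c
#include <stdint.h>
#include <stddef.h>

/* Toy keyed checksum standing in for a real AEAD tag (FNV-1a-style
 * mixing, NOT cryptographically secure): the tag binds the key, the
 * sector number and the data together. */
static uint64_t toy_tag(uint64_t key, uint64_t sector,
                        const uint8_t *data, size_t len)
{
    uint64_t h = 1469598103934665603ULL ^ key;

    h = (h ^ sector) * 1099511628211ULL;     /* sector number as "AAD" */
    for (size_t i = 0; i < len; i++)
        h = (h ^ data[i]) * 1099511628211ULL;
    return h;
}

/* Verification must use the same key AND the same sector number;
 * a lower layer without the key cannot recompute the tag for a
 * remapped sector. */
static int toy_verify(uint64_t key, uint64_t sector,
                      const uint8_t *data, size_t len, uint64_t tag)
{
    return toy_tag(key, sector, data, len) == tag;
}
```

Re-verifying the same data under a remapped sector number fails, exactly as a real AEAD tag would; only the layer holding the key can produce a tag for the new sector.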


Anyway, could we please fix the 4.13 kernel now so it does not crash with
this dm-integrity use? This is urgent for people who already play with
dm-integrity from 4.12.


If you want a follow-up discussion, I would really welcome seeing what
exactly is wrong, because I think we used the bio-integrity interface
through the existing API and according to the doc.
We are just, unfortunately, the first user with our own profile (different
from those defined in T10).
And I think it can help people in the future who will try to implement a
bio-integrity-aware filesystem as well.

Thanks,
Milan



Re: [RFC PATCH] bio-integrity: Fix regression if profile verify_fn is NULL

2017-08-02 Thread Milan Broz

On 08/02/2017 04:11 PM, Martin K. Petersen wrote:
>> And the integrity profile is the perfect interface for this; we register
>> our own profile through the proper interface.  (Any other solution for
>> per-sector metadata would be worse, I tried...)
> 
> The DM use case seems a bit weird and I would like to take a closer look
> later (a storm took out my power and internet so I'm a bit logistically
> impaired right now).

Hi Martin,

thanks!

I think it is actually pretty straightforward (if that can be said
about anything in the device-mapper's little walled garden :-)

For authenticated encryption (on the sector level) we cannot use
length-preserving encryption (as dm-crypt does in normal mode); we need
some per-sector metadata to store the additional authentication tag.

This is exactly what DIF/T10-PI already does, we just need more
flexibility (and larger metadata, to use cryptographically sound modes).

So we developed the dm-integrity target, which takes a normal block device
and stores per-sector metadata on-disk in special sectors (interleaved
with the normal data sectors).
(There is journalling to provide atomicity of data + metadata writes.)

The dm-integrity target can calculate some non-cryptographic integrity
data itself, but this mode does not use the block integrity extension at
all, so it is not important here.

The second use case (the one currently broken by the block layer 4.13
changes) is that dm-integrity can register a new integrity profile for the
virtual device (on top of a normal block device) and present the data
sectors as the normal bio layer and the metadata as the bio-integrity
layer (dm-crypt will receive the bio with the metadata in the integrity
fields).

IOW, dm-integrity is just an "emulator" of an integrity-enabled block
device, with a configurable per-sector metadata size.

Then we can place dm-crypt above this device and process the bio the same
way dm-crypt usually does; it will just add the authentication tag mapped
to the metadata of that special integrity profile.

dm-crypt already creates bio clones (and already owns the bio clone
memory), so we do exactly the same for the integrity part of the bio
(clearing the BIP_BLOCK_INTEGRITY flag to indicate that the block layer
should not free this memory).

Otherwise I think we use the bio the exact way a driver should when it
registers and allocates memory for its own integrity profile (see the
dm_crypt_integrity_io_alloc function).
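The sector interleaving can be sketched with simple offset arithmetic. This is a simplified model with hypothetical helper names - real dm-integrity additionally has a superblock and journal areas - but it shows how a logical data sector shifts on the underlying device once metadata sectors are interleaved:

```c
/* Toy layout: every "interleave" data sectors are followed by one
 * metadata sector.  (Real dm-integrity is more involved.) */

/* device sector that stores logical data sector lsec */
static unsigned long long data_to_device_sector(unsigned long long lsec,
                                                unsigned int interleave)
{
    /* one metadata sector is skipped per full group before lsec */
    return lsec + lsec / interleave;
}

/* device sector holding the metadata for logical data sector lsec */
static unsigned long long metadata_device_sector(unsigned long long lsec,
                                                 unsigned int interleave)
{
    unsigned long long group = lsec / interleave;

    return group * (interleave + 1ULL) + interleave;
}
```

With an interleave of 8, logical sectors 0-7 map straight through, device sector 8 holds their metadata, and logical sector 8 lands on device sector 9.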

Milan


> But the intent of the integrity infrastructure was
> to be able to carry arbitrary metadata pinned to an I/O. The callback
> hooks in the profile were really only intended for the case where the
> block layer itself needed to generate and verify.
> 
> We already do sanity checking on the callback pointers in the prep
> function. I guess don't have a problem doing the same in the completion
> path.
> 
> Fundamentally, though, the verify function should only ever be called if
> the profile has BLK_INTEGRITY_VERIFY set.
> 
> Previously that was done in the prep function by only adding the verify
> endio hook when it was needed. Christoph's patch to kill the double
> endio changed that subtly (FWIW, Jens originally objected to having an
> integrity conditional in the regular bio completion path. That's why the
> verification handling abused the endio function).
> 
> Anyway. So I think that the BLK_INTEGRITY_VERIFY logic needs to be
> carried over to __bio_integrity_endio()...
> 


[RFC PATCH] bio-integrity: Fix regression if profile verify_fn is NULL

2017-08-02 Thread Milan Broz
In the dm-integrity target we register an integrity profile that has
both the generate_fn and verify_fn callbacks set to NULL.

This is used when dm-integrity is stacked under a dm-crypt device
for authenticated encryption (the integrity payload contains the
authentication tag and IV seed).

In this case the verification is done through its own crypto API
processing inside dm-crypt; the integrity profile is only a holder
of this data. (And the memory is owned by dm-crypt as well.)

After the commit (and previous changes)
  Commit 7c20f11680a441df09de7235206f70115fbf6290
  Author: Christoph Hellwig <h...@lst.de>
  Date:   Mon Jul 3 16:58:43 2017 -0600

bio-integrity: stop abusing bi_end_io

we get this crash:

: BUG: unable to handle kernel NULL pointer dereference at   (null)
: IP:   (null)
: *pde = 
...
:
: Workqueue: kintegrityd bio_integrity_verify_fn
: task: f48ae180 task.stack: f4b5c000
: EIP:   (null)
: EFLAGS: 00210286 CPU: 0
: EAX: f4b5debc EBX: 1000 ECX: 0001 EDX: 
: ESI: 1000 EDI: ed25f000 EBP: f4b5dee8 ESP: f4b5dea4
:  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
: CR0: 80050033 CR2:  CR3: 32823000 CR4: 001406d0
: Call Trace:
:  ? bio_integrity_process+0xe3/0x1e0
:  bio_integrity_verify_fn+0xea/0x150
:  process_one_work+0x1c7/0x5c0
:  worker_thread+0x39/0x380
:  kthread+0xd6/0x110
:  ? process_one_work+0x5c0/0x5c0
:  ? kthread_worker_fn+0x100/0x100
:  ? kthread_worker_fn+0x100/0x100
:  ret_from_fork+0x19/0x24
: Code:  Bad EIP value.
: EIP:   (null) SS:ESP: 0068:f4b5dea4
: CR2: 

The patch just skips the whole verify workqueue if verify_fn is set to NULL.

Signed-off-by: Milan Broz <gmazyl...@gmail.com>
---
 block/bio-integrity.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 83e92beb3c9f..b9d1580bfc13 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -387,7 +387,9 @@ static void bio_integrity_verify_fn(struct work_struct *work)
  */
 bool __bio_integrity_endio(struct bio *bio)
 {
-   if (bio_op(bio) == REQ_OP_READ && !bio->bi_status) {
+   struct blk_integrity *bi = bdev_get_integrity(bio->bi_bdev);
+
+   if (bi->profile->verify_fn && bio_op(bio) == REQ_OP_READ && !bio->bi_status) {
struct bio_integrity_payload *bip = bio_integrity(bio);
 
        INIT_WORK(&bip->bip_work, bio_integrity_verify_fn);
-- 
2.13.3
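The guard pattern of the fix - treat a NULL callback as "nothing to verify" instead of dereferencing it - can be sketched in plain C (hypothetical names, not kernel code):

```c
#include <stddef.h>

typedef int (*verify_fn_t)(const void *buf, unsigned long len);

struct profile {
    verify_fn_t verify_fn;   /* may be NULL, as in the dm-integrity profile */
};

/* example verify callback for the usage note below */
static int example_verify(const void *buf, unsigned long len)
{
    (void)buf;
    return len > 0 ? 0 : -1;
}

/* mirrors the fix: only perform verification when the profile actually
 * provides a verify_fn; otherwise complete immediately, skipping the
 * workqueue round-trip entirely */
static int maybe_verify(const struct profile *p,
                        const void *buf, unsigned long len)
{
    if (!p->verify_fn)
        return 0;            /* nothing to verify for this profile */
    return p->verify_fn(buf, len);
}
```

Before the fix, the equivalent of `p->verify_fn(...)` was called unconditionally from the work item, which is exactly the NULL pointer dereference in the oops above.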



Re: [RFC PATCH 0/4] Allow file systems to selectively bypass dm-crypt

2017-06-15 Thread Milan Broz
On 06/15/2017 07:24 PM, Michael Halcrow wrote:
...
>> If this is accepted, we basically allow an attacker to trick the system
>> into writing plaintext to the media just by setting this flag. This must
>> never ever happen with FDE - BY DESIGN.
> 
> That's an important point.  This expands the attack surface to include
> the file system, so if an adversary can provide a bad encryption key
> or policy at the file system layer or if there's a bug in the file
> system that an adversary can exploit, then users setting the
> allow_encrypt_override option on dmcrypt would be vulnerable.
> 
>> For me, ABSOLUTE NACK to this approach.
> 
> I can understand where you're coming from.  Given our challenges on
> mobile, do you have any suggestions for an alternative approach?

Well, what I really do not like here is that you are solving the problem
of a missing, properly designed cryptographic filesystem by hacking
together layers that were not designed for it.

The obvious solution is to implement encryption properly in the fs and
encrypt the fs metadata there. I guess this is your long-term goal?

Is the double encryption really such a big problem?
With some hw support it should not cost much time and energy.
Do you have some numbers showing that it is a real problem?

What I am missing here is the definition of your threat model.

If the encryption of metadata in the block layer is ok for you, why is it
not ok for the data?

What are you solving here? Is it that one user must not see other users'
data? (And who is a user on a mobile device - an application in its own
sandbox?)

Because confidentiality in the case of a stolen device is provided by
encryption on any layer. And ext4 encryption uses the same algorithms
as dm-crypt, IIRC.

It would be better to go with some model that actually increases security.

For example, if your patch marks IO operations coming from the fs as
REQ_NOENCRYPT, could the fs instead send some owner metadata in the bio
(user, translated to a key id) and encrypt the sector with a different
key?

I am not sure how complicated this would be to implement, but we already
have the concept of multiple keys in dm-crypt (there are 64 keys for the
loopAES mapping, and the used key id is calculated as sector# modulo 64).

Maybe the keys could be taken from the kernel keyring and loaded
dynamically, according to optional table parameters that define the
policy for it. The IV could be shared (in XTS mode it is not a problem).
If the key is not available, the bio should simply fail early.
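The loopAES-style key selection mentioned above is a one-liner; the sketch below (hypothetical names, userspace C) also shows the "fail early when the key slot is empty" policy:

```c
#include <stddef.h>

#define NUM_KEYS 64   /* matches the loopAES-style multi-key mapping */

/* Pick the key slot the same way the loopAES compat mode does:
 * slot = sector number modulo the number of keys.  A NULL return
 * means the slot was never loaded (e.g. from the keyring) and the
 * caller should fail the bio early. */
static const unsigned char *key_for_sector(const unsigned char *keys[NUM_KEYS],
                                           unsigned long long sector)
{
    return keys[sector % NUM_KEYS];
}
```

Sector 65 therefore uses key slot 1, sector 128 slot 0, and so on; an I/O to a sector whose slot holds no key would be rejected before any encryption is attempted.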

Dunno, just an idea - it really depends on what you are trying to solve.

Milan


Re: [RFC PATCH 0/4] Allow file systems to selectively bypass dm-crypt

2017-06-15 Thread Milan Broz
On 06/15/2017 01:40 AM, Michael Halcrow wrote:
> Several file systems either have already implemented encryption or are
> in the process of doing so.  This addresses usability and storage
> isolation requirements on mobile devices and in multi-tenant
> environments.
> 
> While distinct keys locked down to user accounts protect the names and
> contents of individual files, a volume-level encryption key should
> protect the file system metadata.  When using dm-crypt to do this, the
> blocks that the file system encrypts undergo another round of
> encryption.  In many contexts this isn't necessary, and it results in
> decreased performance and increased power consumption.  This
> discourages users and vendors from taking steps to protect file system
> metadata in addition to file contents.
> 
> This patchset provides a new I/O request flag that suggests to lower
> layers that encryption may not be necessary because the file system
> has already done it.  The first patch targets the block subsystem and
> adds the REQ_NOENCRYPT flag.  The second patch targets dm-crypt and
> adds logic to observe the flag only when the user has opted-in by
> passing allow_encrypt_override as one of the optional parameters to
> the dm-crypt target.  The third patch targets ext4 and sets the
> REQ_NOENCRYPT flag for blocks it encrypts and decrypts.  The fourth
> patch does the same for f2fs.
> 
> In this patchset I'm proposing that we make dm-crypt's observation of
> the file system "don't encrypt" request optional, but I'm not sure
> that's a good tradeoff.  Under the assumption that encryption in
> upstream file systems is sound, the most common way I expect things
> can go awry is if the file system keys don't meet security
> requirements while dm-crypt keys do.  However I'm not convinced that
> will end up being a widespread problem in practice, and there's a real
> data corruption concern with making dm-crypt's observation of the flag
> optional.
> 
> The problem is that once dm-crypt observes the flag, it must always
> continue to do so or dm-crypt might decrypt content that it didn't
> encrypt.  This can lead to scenarios where a vendor sets the dm-crypt
> option to observe the "don't encrypt" flag, and then down the road a
> user follows one of the many online guides to manually recover a
> dm-crypt partition without setting this new option.
> 
> Should there be an encryption disable flag?  I'm interested in
> considering other solutions.

The whole reason for full disk encryption (FDE) is that it is FULL disk
encryption.

If you do not need encryption at the dm-crypt level, just do not use it
there; configure the storage stack properly instead!

File-level encryption and dm-crypt encryption can have completely
different purposes; one can even be authenticated (providing integrity)
while the second provides just confidentiality.

It is not "redundant" encryption; it is encryption for a different
purpose on a different layer.

IMO what you are proposing is an incredible security hack and a very ugly
layer violation.

If this is accepted, we basically allow an attacker to trick the system
into writing plaintext to the media just by setting this flag. This must
never ever happen with FDE - BY DESIGN.

For me, ABSOLUTE NACK to this approach.

(cc dmcrypt list as well)

Milan


> 
> Michael Halcrow (4):
>   block: Add bio req flag to disable encryption in block
>   dm-crypt: Skip encryption of file system-encrypted blocks
>   ext4: Set the bio REQ_NOENCRYPT flag
>   f2fs: Set the bio REQ_NOENCRYPT flag
> 
>  drivers/md/dm-crypt.c | 17 +
>  fs/crypto/bio.c   |  2 +-
>  fs/ext4/ext4.h|  3 +++
>  fs/ext4/inode.c   | 13 -
>  fs/ext4/page-io.c |  5 +
>  fs/ext4/readpage.c|  3 ++-
>  fs/f2fs/data.c| 10 --
>  include/linux/blk_types.h |  2 ++
>  8 files changed, 42 insertions(+), 13 deletions(-)
> 



Re: dm-crypt: Fix error with too large bios

2016-08-31 Thread Milan Broz
On 08/31/2016 12:27 AM, Mikulas Patocka wrote:

...
> 
> Drop that "#ifdef CONFIG_BCACHE". Anyone should be allowed to create a big 
> bio, not just bcache.

Yes. Please, do not hide it behind an #ifdef.
If it is in the code, it should always be enabled.

There can be third-party modules, or some new code may appear, and
creating a strange config dependency only adds more problems later.

Milan

> 
> That one test has no performance impact, there is no need to hide it 
> behind #ifdef.
> 
> Mikulas
> 
>> From: Mikulas Patocka 
>> Date: Tue, 30 Aug 2016 16:38:42 -0400
>> Subject: [PATCH] dm crypt: fix error with too large bcache bios
>>
>> When dm-crypt processes writes, it allocates a new bio in
>> crypt_alloc_buffer().  The bio is allocated from a bio set and it can
>> have at most BIO_MAX_PAGES vector entries, however the incoming bio can be
>> larger if it was allocated by bcache.  If the incoming bio is larger,
>> bio_alloc_bioset() fails and an error is returned.
>>
>> To avoid the error, we test for a too large bio in the function
>> crypt_map() and use dm_accept_partial_bio() to split the bio.
>> dm_accept_partial_bio() trims the current bio to the desired size and
>> asks DM core to send another bio with the rest of the data.
>>
>> This fix is wrapped with a check for CONFIG_BCACHE because there
>> currently isn't any other code that generates too large bios.  So unless
>> bcache is configured there is no point wasting time making this check.
>>
>> Signed-off-by: Mikulas Patocka 
>> Cc: sta...@vger.kernel.org # v3.16+
>> ---
>>  drivers/md/dm-crypt.c | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
>> index eedba67..743f548 100644
>> --- a/drivers/md/dm-crypt.c
>> +++ b/drivers/md/dm-crypt.c
>> @@ -1924,6 +1924,12 @@ static int crypt_map(struct dm_target *ti, struct bio *bio)
>>  return DM_MAPIO_REMAPPED;
>>  }
>>  
>> +#ifdef CONFIG_BCACHE
>> +if (unlikely(bio->bi_iter.bi_size > (BIO_MAX_PAGES << PAGE_SHIFT)) &&
>> +bio_data_dir(bio) == WRITE)
>> +dm_accept_partial_bio(bio, ((BIO_MAX_PAGES << PAGE_SHIFT) >> SECTOR_SHIFT));
>> +#endif
>> +
>>  io = dm_per_bio_data(bio, cc->per_bio_data_size);
>> crypt_io_init(io, cc, bio, dm_target_offset(ti, bio->bi_iter.bi_sector));
>>  io->ctx.req = (struct skcipher_request *)(io + 1);
>> -- 
>> 2.7.4 (Apple Git-66)
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-block" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
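For reference, the limit in the quoted check follows from simple arithmetic, shown here with the values of that era (BIO_MAX_PAGES was 256 in these kernels; 4 KiB pages and 512-byte sectors assumed):

```c
#define BIO_MAX_PAGES 256U  /* historical value in v4.x kernels */
#define PAGE_SHIFT     12U  /* 4 KiB pages */
#define SECTOR_SHIFT    9U  /* 512-byte sectors */

/* largest bio payload crypt_alloc_buffer() can clone in one allocation */
static unsigned int max_bio_bytes(void)
{
    return BIO_MAX_PAGES << PAGE_SHIFT;
}

/* the same limit expressed in 512-byte sectors, as passed to
 * dm_accept_partial_bio() in the quoted patch */
static unsigned int max_bio_sectors(void)
{
    return (BIO_MAX_PAGES << PAGE_SHIFT) >> SECTOR_SHIFT;
}
```

So any bio over 1 MiB (2048 sectors) has to be trimmed, and DM core resubmits the remainder as a fresh bio.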
