Re: [f2fs-dev] [PATCH 03/13] fscrypt: rename fscrypt_do_page_crypto() to fscrypt_crypt_block()

2019-05-02 Thread Chandan Rajendra
On Thursday, May 2, 2019 4:15:05 AM IST Eric Biggers wrote:
> From: Eric Biggers 
> 
> fscrypt_do_page_crypto() only does a single encryption or decryption
> operation, with a single logical block number (single IV).  So it
> actually operates on a filesystem block, not a "page" per se.  To
> reflect this, rename it to fscrypt_crypt_block().
>

Looks good to me,

Reviewed-by: Chandan Rajendra 

-- 
chandan





___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [BUG] watchdog: BUG: soft lockup.. after heavy disk access

2019-05-02 Thread Hagbard Celine
2019-05-01 0:35 GMT+02:00, Hagbard Celine :
> 2019-04-30 8:25 GMT+02:00, Chao Yu :
>> On 2019/4/29 21:39, Hagbard Celine wrote:
>>> 2019-04-29 9:36 GMT+02:00, Chao Yu :
 Hi Hagbard,

 At a glance, may I ask what's your device size? Since I notice that you
 used
 -i
 option during mkfs, if size is large enough, checkpoint may crash
 nat/sit_bitmap
 online.

 mkfs-f2fs -a 1 -f -i -l NVME_Gentoo-alt -o7 /dev/nvme0n1p7
>>>
>>> Both partitions are 128GB.
>>
>> 128GB is safe. :)
>>
>>>
 On 2019/4/29 15:25, Hagbard Celine wrote:
> First I must admit that I'm not 100% sure if this is
> f2fs/NMI/something-else bug, but it only triggers after significant
> disk access.
>
> I've hit this bug several times since kernel 5.0.7 (have not tried
> earlier kernels) while compiling many packages in batch. But in those
> occasions it would hang all cores and loose the latest changes to the
> filesystem so I could not get any logs.
> This time it triggered while I was in the process of moving two of my
> installations to new partitions and locked only one core, so I was
> able to get the log after rebooting.
> The I did to trigger was make a new partition and run commands:
> #mkfs-f2fs -a 1 -f -i -l NVME_Gentoo-alt -o7 /dev/nvme0n1p7
> #mount -o
> "rw,realtime,lazytime,background_gc=on,disable_ext_identity,discard,heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,flush_merge,extent_cache,data_flush,mode=adaptive,active_logs=6,whint_mode=fs-based,alloc_mode=default,fsync_mode=strict"
>>
>> Could you revert below commit to check whether the issue is caused by it?
>
> This did not seem to help. I was not able to reproduce the bug by
> copying partition contents to a newly made partition, but the again I
> was not able to reproduce it that way on unpatched kernel either.
>
> Goth with and without the mentioned patch reverted the only way I am
> able to somewhat reliably reproduce is by compiling a big batch of
> packages in Exherbo (using paludis/sydbox), if the batch contains one
> of a handful big packages (util-linux is the most common to trigger)
> and that big package is not first in the batch. In those cases it will
> trigger one of the following ways in most runs:
>
> -NMI watchdog: BUG: soft lockup.. on from 4- to 8-cores.
>
> -NMI watchdog: Watchdog detected hard LOCKUP on cpu.. on 2- to 4-cores
>
> -no error message at all, everything suddenly goes to a grinding halt.

After some more testing, it seems that this is not quite accurate. I
might be facing at least two different issues. After about a day of
compiling on a pure ext4 install I have not got any of the hangs with
"NMI watchdog" errors, but one of the package that sometimes hung with
no errors at all (dev-lang/llvm, during test phase) still exhibits
this behavior.

I'll do another full system compile on ext4 just to make sure it's not
pure luck I did not get any NMI watchdog errors there. Then I'll copy
the system to f2fs partition and try dropping mount options (starting
with data_flush), running a full system compile for each, to see if
that makes any difference. (unless anyone got a better idea on how to
proceed, or comes up with another patch to test)

> Most of the time I also loose the ability to switch vt by alt+F-key
> and it's not even responding to "magic sysreq key".
>
> The few times I've been able to do vt-switch to the vt with output
> from syslog there is usually nothing in the time-frame of crash but
> one or two "kernel: perf: interrupt took too long..."
> ..or some random subsystem messaging because something timed out
> (usually some message from i915). Just to be clear the i915 is not in
> use during the tests, console is on amdgpudrmfb and X is not running,
> I even tested with a kernel without i915 to completely rule it out.
>
>> commit 50fa53eccf9f911a5b435248a2b0bd484fd82e5e
>> Author: Chao Yu 
>> Date:   Thu Aug 2 23:03:19 2018 +0800
>>
>> f2fs: fix to avoid broken of dnode block list
>>
>> The only risk of reverting this patch is we may potentially lose fsynced
>> data
>> after sudden power-cut, although it's rare, anyway please backup your
>> data
>> before try it.
>>
>> Thanks,
>>
> /dev/nvme0n1p7 /mnt/gentoo-alt-new
> #cd /mnt/gentoo-alt
> #cp -a . /mnt/gentoo-alt-new
> #umount /nmt/gentoo-alt
> The bug triggers just after last command and last command was run
> within 20 seconds ofter the second-last command returned.
> "/mnt/gentoo-alt" was already mounted with same options as
> "/mnt/gentoo-alt-new", and contained a Gentoo-rootfs of 64GB data
> (according df -h).
>
> This was on kernel 5.0.10 with "[PATCH] f2fs: fix potential recursive
> call when enabling data_flush" by Chao Yu
>
> Syslog for bug follows:
>
> Apr 29 07:02:52 40o2 kernel: watchdog: BUG: soft lockup - CPU#4 stuck
> for 21s! [irq/61-nvme0q5:383]
> Apr 29 07:02:52 40o2 kernel: 

Re: [f2fs-dev] [PATCH V2 10/13] fscrypt_encrypt_page: Loop across all blocks mapped by a page range

2019-05-02 Thread Eric Biggers
Hi Chandan,

On Thu, May 02, 2019 at 11:22:05AM +0530, Chandan Rajendra wrote:
> On Thursday, May 2, 2019 3:59:01 AM IST Eric Biggers wrote:
> > Hi Chandan,
> > 
> > On Wed, May 01, 2019 at 08:19:35PM +0530, Chandan Rajendra wrote:
> > > On Wednesday, May 1, 2019 4:38:41 AM IST Eric Biggers wrote:
> > > > On Tue, Apr 30, 2019 at 10:11:35AM -0700, Eric Biggers wrote:
> > > > > On Sun, Apr 28, 2019 at 10:01:18AM +0530, Chandan Rajendra wrote:
> > > > > > For subpage-sized blocks, this commit now encrypts all blocks 
> > > > > > mapped by
> > > > > > a page range.
> > > > > > 
> > > > > > Signed-off-by: Chandan Rajendra 
> > > > > > ---
> > > > > >  fs/crypto/crypto.c | 37 +
> > > > > >  1 file changed, 25 insertions(+), 12 deletions(-)
> > > > > > 
> > > > > > diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> > > > > > index 4f0d832cae71..2d65b431563f 100644
> > > > > > --- a/fs/crypto/crypto.c
> > > > > > +++ b/fs/crypto/crypto.c
> > > > > > @@ -242,18 +242,26 @@ struct page *fscrypt_encrypt_page(const 
> > > > > > struct inode *inode,
> > > > > 
> > > > > Need to update the function comment to clearly explain what this 
> > > > > function
> > > > > actually does now.
> > > > > 
> > > > > >  {
> > > > > > struct fscrypt_ctx *ctx;
> > > > > > struct page *ciphertext_page = page;
> > > > > > +   int i, page_nr_blks;
> > > > > > int err;
> > > > > >  
> > > > > > BUG_ON(len % FS_CRYPTO_BLOCK_SIZE != 0);
> > > > > >  
> > > > > 
> > > > > Make a 'blocksize' variable so you don't have to keep calling 
> > > > > i_blocksize().
> > > > > 
> > > > > Also, you need to check whether 'len' and 'offs' are 
> > > > > filesystem-block-aligned,
> > > > > since the code now assumes it.
> > > > > 
> > > > >   const unsigned int blocksize = i_blocksize(inode);
> > > > > 
> > > > > if (!IS_ALIGNED(len | offs, blocksize))
> > > > > return -EINVAL;
> > > > > 
> > > > > However, did you check whether that's always true for ubifs?  It 
> > > > > looks like it
> > > > > may expect to encrypt a prefix of a block, that is only padded to the 
> > > > > next
> > > > > 16-byte boundary.
> > > > >   
> > > > > > +   page_nr_blks = len >> inode->i_blkbits;
> > > > > > +
> > > > > > if (inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES) {
> > > > > > /* with inplace-encryption we just encrypt the page */
> > > > > > -   err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, 
> > > > > > lblk_num, page,
> > > > > > -ciphertext_page, len, offs,
> > > > > > -gfp_flags);
> > > > > > -   if (err)
> > > > > > -   return ERR_PTR(err);
> > > > > > -
> > > > > > +   for (i = 0; i < page_nr_blks; i++) {
> > > > > > +   err = fscrypt_do_page_crypto(inode, FS_ENCRYPT,
> > > > > > +   lblk_num, page,
> > > > > > +   ciphertext_page,
> > > > > > +   i_blocksize(inode), 
> > > > > > offs,
> > > > > > +   gfp_flags);
> > > > > > +   if (err)
> > > > > > +   return ERR_PTR(err);
> > > > 
> > > > Apparently ubifs does encrypt data shorter than the filesystem block 
> > > > size, so
> > > > this part is wrong.
> > > > 
> > > > I suggest we split this into two functions, 
> > > > fscrypt_encrypt_block_inplace() and
> > > > fscrypt_encrypt_blocks(), so that it's conceptually simpler what each 
> > > > function
> > > > does.  Currently this works completely differently depending on whether 
> > > > the
> > > > filesystem set FS_CFLG_OWN_PAGES in its fscrypt_operations, which is 
> > > > weird.
> > > > 
> > > > I also noticed that using fscrypt_ctx for writes seems to be 
> > > > unnecessary.
> > > > AFAICS, page_private(bounce_page) could point directly to the pagecache 
> > > > page.
> > > > That would simplify things a lot, especially since then fscrypt_ctx 
> > > > could be
> > > > removed entirely after you convert reads to use read_callbacks_ctx.
> > > > 
> > > > IMO, these would be worthwhile cleanups for fscrypt by themselves, 
> > > > without
> > > > waiting for the read_callbacks stuff to be finalized.  Finalizing the
> > > > read_callbacks stuff will probably require reaching a consensus about 
> > > > how they
> > > > should work with future filesystem features like fsverity and 
> > > > compression.
> > > > 
> > > > So to move things forward, I'm considering sending out a series with 
> > > > the above
> > > > cleanups for fscrypt, plus the equivalent of your patches:
> > > > 
> > > > "fscrypt_encrypt_page: Loop across all blocks mapped by a page 
> > > > range"
> > > > "fscrypt_zeroout_range: Encrypt all zeroed out blocks of a page"
> > > > "Add decryption support for sub-pagesized blocks" 

Re: [f2fs-dev] [PATCH 02/13] fscrypt: remove the "write" part of struct fscrypt_ctx

2019-05-02 Thread Chandan Rajendra
On Thursday, May 2, 2019 4:15:04 AM IST Eric Biggers wrote:
> From: Eric Biggers 
> 
> Now that fscrypt_ctx is not used for writes, remove the 'w' fields.
>

Looks good to me,

Reviewed-by: Chandan Rajendra 

-- 
chandan





___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH 01/13] fscrypt: simplify bounce page handling

2019-05-02 Thread Chandan Rajendra
On Thursday, May 2, 2019 4:15:03 AM IST Eric Biggers wrote:
> From: Eric Biggers 
> 
> Currently, bounce page handling for writes to encrypted files is
> unnecessarily complicated.  A fscrypt_ctx is allocated along with each
> bounce page, page_private(bounce_page) points to this fscrypt_ctx, and
> fscrypt_ctx::w::control_page points to the original pagecache page.
> 
> However, because writes don't use the fscrypt_ctx for anything else,
> there's no reason why page_private(bounce_page) can't just point to the
> original pagecache page directly.
> 
> Therefore, this patch makes this change.  In the process, it also cleans
> up the API exposed to filesystems that allows testing whether a page is
> a bounce page, getting the pagecache page from a bounce page, and
> freeing a bounce page.

Looks good to me,

Reviewed-by: Chandan Rajendra 

-- 
chandan





___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel