Bug#949336: Mapped integrity devices of size ≥2TiB are unusable on 32-bits platforms

2021-05-12 Thread nbf

Control: tag -1 - moreinfo
Control: found -1 + 2:2.3.4-2

Hi,
I am sorry for the misleading guesses in my previous messages.
I finally got around to running some more tests, and it is not a 32-bit or ARM
problem.
Long-keyed volumes created by 2.0.2 cannot be opened by 2.2.2 and vice versa.
I also tested with 2:2.3.4-2 amd64 from bullseye, and it behaves the same as 2.2.2.


Here are the steps that reproduce across all kernel versions I tried:

1) Create the required files and volumes
# head -c 4096 /dev/urandom >integrity-key-file
^ Note: the source code says key-size is in BITS, but integritysetup wants 4096 BYTES(!) here

# head -c 10M /dev/zero >integrity-volume2.0.2
# head -c 10M /dev/zero >integrity-volume2.2.2

2) Format one volume with 2.0.2 and the other with 2.2.2
# 2.0.2/integritysetup --sector-size 4096 --tag-size 32 --integrity hmac-sha256 --integrity-key-size=4096 --integrity-key-file=integrity-key-file format integrity-volume2.0.2
# 2.2.2/integritysetup -qv --sector-size 4096 --tag-size 32 --integrity hmac-sha256 --integrity-key-size=4096 --integrity-key-file=integrity-key-file format integrity-volume2.2.2


3a) Open and test the 2.0.2 volume with 2.0.2 - no problem
# 2.0.2/integritysetup --integrity hmac-sha256 --integrity-key-size=4096 --integrity-key-file=integrity-key-file open integrity-volume2.0.2 DM-INTEGRITY
# md5sum /dev/mapper/DM-INTEGRITY
e5601036ed0b1020b0179cd3d0d276d8  /dev/mapper/DM-INTEGRITY
# dmsetup status DM-INTEGRITY
0 19952 integrity 0 19952 -
# 2.0.2/integritysetup close DM-INTEGRITY

3b) Open and test the 2.2.2 volume with 2.2.2 - no problem
# 2.2.2/integritysetup --integrity hmac-sha256 --integrity-key-size=4096 --integrity-key-file=integrity-key-file open integrity-volume2.2.2 DM-INTEGRITY
# md5sum /dev/mapper/DM-INTEGRITY
e5601036ed0b1020b0179cd3d0d276d8  /dev/mapper/DM-INTEGRITY
# dmsetup status DM-INTEGRITY
0 19952 integrity 0 19952 -
# 2.2.2/integritysetup close DM-INTEGRITY

3c) Open and test the 2.0.2 volume with 2.2.2 - FAIL
# 2.2.2/integritysetup --integrity hmac-sha256 --integrity-key-size=4096 --integrity-key-file=integrity-key-file open integrity-volume2.0.2 DM-INTEGRITY
# md5sum /dev/mapper/DM-INTEGRITY
md5sum: /dev/mapper/DM-INTEGRITY: Input/output error
# dmsetup status DM-INTEGRITY
0 19952 integrity 14 19952 -
# 2.2.2/integritysetup close DM-INTEGRITY

3d) Open and test the 2.2.2 volume with 2.0.2 - FAIL
# 2.0.2/integritysetup --integrity hmac-sha256 --integrity-key-size=4096 --integrity-key-file=integrity-key-file open integrity-volume2.2.2 DM-INTEGRITY
# md5sum /dev/mapper/DM-INTEGRITY
md5sum: /dev/mapper/DM-INTEGRITY: Input/output error
# dmsetup status DM-INTEGRITY
0 19952 integrity 6 19952 -
# 2.0.2/integritysetup close DM-INTEGRITY

3e) Opening the volume with --integrity-recovery-mode always works
I assume this means the data layout is the same, but the integrity tags are different.
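
The `dmsetup status` lines in steps 3a to 3d differ only in the field right after the target name (0, 0, 14, 6). As far as the dm-integrity target documents its status line, that field is the number of integrity mismatches seen so far. A minimal sketch for extracting it, so the four cases can be compared in a script; the function name `integrity_mismatches` is made up for illustration:

```shell
# Hypothetical helper: read one "dmsetup status NAME" line on stdin and print
# the integrity mismatch counter.
# Assumed field layout: start, length, target, mismatches, data sectors, recalc position.
integrity_mismatches() {
    awk '$3 == "integrity" { print $4 }'
}

# Example with the status line from step 3c:
echo "0 19952 integrity 14 19952 -" | integrity_mismatches
```

A non-zero value after a read pass (such as the md5sum above) would indicate that the tags on disk do not match what the kernel computed with the supplied key.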


4) I tried to find out how many key bytes are actually used ("BITS vs. BYTES"):
integritysetup 2.0.2 cares only about the first 106 bytes (strange number)
integritysetup 2.2.2 cares only about the first 114 bytes (strange number, +8)
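
The message does not say how the 106 and 114 byte limits were measured. One possible approach, sketched here purely as an assumption, is to invert a single byte of a copy of the key file at a chosen offset and check whether the volume still opens and reads cleanly with that copy. The helper name `flip_key_byte` and the file names are hypothetical:

```shell
# Hypothetical helper: copy key file $1 to $2 and invert the byte at
# zero-based offset $3 in the copy. The original file is left untouched.
flip_key_byte() {
    src=$1; dst=$2; offset=$3
    cp "$src" "$dst" || return 1
    # Read the original byte as an unsigned decimal number.
    old=$(od -An -tu1 -j "$offset" -N 1 "$src" | tr -d ' \n')
    [ -n "$old" ] || return 1
    new=$(( old ^ 255 ))
    # Write the inverted byte back at the same offset without truncating.
    printf "\\$(printf '%03o' "$new")" |
        dd of="$dst" bs=1 seek="$offset" conv=notrunc 2>/dev/null
}

# Intended use (not run here): flip byte 106, then repeat the "open" command
# from step 3a with --integrity-key-file pointing at the modified copy.
#   flip_key_byte integrity-key-file integrity-key-file.flipped 106
```

If reads still succeed with a byte flipped at offset N, that byte presumably does not contribute to the key in use; walking N downward would then locate the boundary.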

integritysetup 2.3.4-2

6) Based on these results I managed to open the 2.0.2 volume with 2.2.2
# 2.2.2/integritysetup --integrity hmac-sha256 --integrity-key-size=106 --integrity-key-file=integrity-key-file open integrity-volume2.0.2 DM-INTEGRITY
# md5sum /dev/mapper/DM-INTEGRITY
e5601036ed0b1020b0179cd3d0d276d8  /dev/mapper/DM-INTEGRITY
# dmsetup status DM-INTEGRITY
19952 integrity 0 19952 -
# 2.2.2/integritysetup close DM-INTEGRITY


Best,
n.b.f.



Bug#979740: Processes blocked in io_schedule when using disks behind JMB575 JBOD

2021-01-10 Thread nbf

Package: linux-image-armmp
Version: 5.9.15-1

Dear Maintainer,
since upgrading to the 5.9.0-x-armmp armhf kernels I am seeing
processes getting randomly stuck in D-state.
I observed this issue with 5.9.0-1-armmp and 5.9.0-4-armmp.
I did not observe the issue with 5.8.0-3-armmp or any previous kernel.

Caveat: the system is running a Debian kernel, but the userspace is Ubuntu focal.
I thought you might want to know anyway, since it looks like a kernel problem.


--

Hardware:
Three HDDs connected via JMB575 SataIII port multiplier to i.MX6q's 
SataII port.
Probe log: ata1.15: Port Multiplier 1.2, 0x197b:0x5755 r0, 5 ports, feat 0x5/0xf


Steps to reproduce: Start a large file copy from one disk to another.
I can reproduce this problem copying multi-GB files either via cp or rsync.
The copy gets blocked in D-state within a few hours, and one CPU core is fully
consumed by I/O wait.
I've seen rsync getting blocked both while reading the source and while writing
the destination.
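
While the copy runs, stuck tasks can be spotted by filtering ps output for the D state. The following is only a sketch: the function name `dstate_only` is invented, and the `wchan:32` column width syntax assumes the procps version of ps.

```shell
# Hypothetical filter: keep the ps header line and every task whose STAT
# column (second field) starts with "D" (uninterruptible sleep).
dstate_only() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Intended use on the affected machine (not run here); wchan shows the
# kernel function the task is sleeping in:
#   ps -eo pid,stat,wchan:32,comm | dstate_only
```

For kernel-side detail, `echo w > /proc/sysrq-trigger` (as root, with SysRq enabled) asks the kernel to log the stacks of all blocked tasks, which may be easier to collect than per-process traces.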


Details:
As I said, the JMB575 PMP is connected to the i.MX6q on-chip SATA controller.
All its ports get configured "SATA link up 3.0 Gbps (SStatus 123 SControl 320)"
There are three HDDs connected to it - I am going to call them sda, sdb, sdc.
Disks "sda" and "sdb" are partitioned and used for regular ext4 filesystems.
Disk "sdc" is backing a dm-integrity device. I am going to call it "dm-sdc".

DM device "dm-sdc" is used for an ext4 filesystem.


Generally the blocked processes arrive via io_schedule or io_schedule_timeout:
 [] (io_schedule_timeout) from [] (wait_for_completion_io+0xb4/0x134)
 [] (schedule) from [] (io_schedule+0x50/0x70)
 [] (io_schedule) from [] (dm_integrity_map_continue+0xb28/0xbc0 [dm_integrity])
 [] (io_schedule) from [] (wait_on_page_bit_common+0x18c/0x420)
 [] (io_schedule) from [] (rq_qos_wait+0x120/0x18c)
 [] (io_schedule) from [] (bit_wait_io+0x1c/0x68)
 [] (schedule_timeout) from [] (io_schedule_timeout+0x58/0x78)


I have many full stacktraces, but I am not sure if they are safe to 
share.

I'd really like to avoid disclosing user data or crypto keys.
I managed to find one that should be safe:
task:cronjob1l.shstate:D stack:0 pid:28125 ppid: 28124 flags:0x
Backtrace:
[] (__schedule) from [] (schedule+0x78/0xf4)
 r10:00c3 r9:0002 r8:0002 r7:c0c9da4c r6:e5a27ae8 r5:e5a26000
 r4:e598be00
[] (schedule) from [] (io_schedule+0x50/0x70)
 r5:c1304a84 r4:
[] (io_schedule) from [] (bit_wait_io+0x1c/0x68)
 r5:c1304a84 r4:0002
[] (bit_wait_io) from [] (__wait_on_bit+0x70/0xc8)
 r5:c1304a84 r4:e5a27adc
[] (__wait_on_bit) from [] (out_of_line_wait_on_bit+0xa4/0xcc)
 r9:e5a27c30 r8:bf1562a8 r7:0001 r6: r5:c5fe43a8 r4:e5a27af4
[] (out_of_line_wait_on_bit) from [] (__wait_on_buffer+0x40/0x44)
 r4:c00b3b80
[] (__wait_on_buffer) from [] (ext4_bread+0x124/0x130 [ext4])
 r5:c5fe43a8 r4:c00b3b80
[] (ext4_bread [ext4]) from [] (__ext4_read_dirblock+0x3c/0x45c [ext4])
 r4:e5a27cc0
[] (__ext4_read_dirblock [ext4]) from [] (dx_probe+0x54/0x6d8 [ext4])
 r10:00c3 r9:e5a27c30 r8: r7:e5a27cc0 r6:d4a5024c r5:c5fe43a8
 r4:e5a27cc0
[] (dx_probe [ext4]) from [] (__ext4_find_entry+0x418/0x620 [ext4])
 r10:00c3 r9: r8:0009 r7:c5fe43a8 r6:d4a5024c r5:e3092800
 r4:e5a27cc0
[] (__ext4_find_entry [ext4]) from [] (ext4_lookup+0xe4/0x2ec [ext4])
 r10:00c3 r9: r8:0009 r7:c5fe43a8 r6:c5fe43a8 r5:d4a50220
 r4:e5a27ca0
[] (ext4_lookup [ext4]) from [] (__lookup_slow+0x98/0x164)
 r9: r8:0009 r7:c5fe43a8 r6:e5a27db8 r5:e40df990 r4:d4a50220
[] (__lookup_slow) from [] (walk_component+0x158/0x1cc)
 r8:0009 r7:c5fe4430 r6:0001 r5:e5a27db0 r4:e40df990
[] (walk_component) from [] (path_lookupat+0x80/0x1c0)
 r9: r8:e5a27db0 r7:e5a27ea4 r6:0001 r5:e5a27db0 r4:
[] (path_lookupat) from [] (filename_lookup+0xb0/0x1ac)
 r7:e5a27ea4 r6:0001 r5:e5509000 r4:0001
[] (filename_lookup) from [] (user_path_at_empty+0x7c/0x98)
 r8:0800 r7:014c04c4 r6:e5a27ea4 r5:ff9c r4:0001
[] (user_path_at_empty) from [] (vfs_statx+0x7c/0x13c)
 r7:014c04c4 r6:ff9c r5:0001 r4:e5a27ee8
[] (vfs_statx) from [] (__do_sys_stat64+0x40/0x80)
 r10:00c3 r9:e5a26000 r8:c03002c4 r7:00c3 r6:014c051c r5:004c2384
 r4:be94d758
[] (__do_sys_stat64) from [] (sys_stat64+0x18/0x1c)
 r4:0004
[] (sys_stat64) from [] (ret_fast_syscall+0x0/0x54)
Exception stack(0xe5a27fa8 to 0xe5a27ff0)
7fa0:   0004 004c2384 014c04c4 be94d758 be94d758 0003
7fc0: 0004 004c2384 014c051c 00c3 014c0518 0003 004b0204 004c24ec
7fe0: 00c3 be94d6e4 b6ee69db b6e6dbe6


What information would be helpful to collect next time this happens?
My attempt to document this misbehavior follows:

1st occurrence:  5.9.0-1-armmp #1 Debian 5.9.1-1
One recursive copy (cp -ar) reading from dm-sdc, writing to sdb3.
uninterruptibly sleeping / D-state'd processes:
cp -ar  

Bug#931344: d-s-s slows apt-get too much

2021-01-10 Thread nbf

Dear Maintainer,
I ran into this when upgrading from stretch to buster during the 2020/21 holidays,
which means I was definitely running d-s-s 1:9+2020.12.04 from buster-security.
The system was low on RAM and both /tmp and swap were backed by a slow SD card.

This is caused by /usr/share/debian-security-support/check-support-status.hook,
which runs either for too long or too frequently.

It is most likely the latter, because when I repeatedly execute 'ps -A f'
during an apt operation, check-support-status is present >60% of the time.
I guess each invocation takes only a few seconds, but it gets invoked too often.



Cheers,
n-b-f



Bug#972950: [regression] cal does not highlight current day

2021-01-10 Thread nbf

notfound 972950 11.1.2+b1



Bug#979738: ntpdate aborts on serv-fail DNSSEC response

2021-01-10 Thread nbf

Package: ntpdate
Version: 1:4.2.8p12+dfsg-4
Severity: wishlist
Tags: patch


Dear Maintainer,
the current behavior of ntpdate is sub-optimal when it is run on a mix
of DNS names and IP addresses and the DNS resolver performs DNSSEC validation.

I propose two alternative solutions to handle DNS errors:
patch-alt1 disables further DNS lookups, but allows IP addresses to be used.
patch-alt2 continues resolving in the hope of finding a working server.


Cheers,
n-b-f



Bug#979739: checkrestart cannot override built-in exclusion list

2021-01-10 Thread nbf

Package: debian-goodies
Version: 0.87
Severity: wishlist


Dear Maintainer,
checkrestart ships with a list of exclusions that cannot be overridden.
I believe many of those are unwarranted and would like to remove them.
The exclusion list can be appended to using the '-i' flag.
I propose implementing an '-I' flag for replacing/overriding the list.

Thank you!



Bug#972950: [regression] cal does not highlight current day

2021-01-10 Thread nbf

Dear Maintainer,
I have also been harmed by this regression in the 'ncal' binary.

This feature was removed in #904839 due to compatibility concerns.
All systems I used in the last 5 years highlighted the current day,
so I believe those compatibility concerns are very questionable.


Also #833226, #867995 and #898463 all seem relevant to this issue.


Cheers,
n-b-f



Bug#949336: Mapped integrity devices of size ≥2TiB are unusable on 32-bits platforms

2020-02-02 Thread nbf

Hi,


I won't be able to help debugging this as I don't have a >2TiB disk
around.  (Could fake the size and format with --no-wipe, but ...


I recall integrity checks failing on all sectors. It should be enough
to format only a part of the device with 32-bit 2:2.0.

e.g.
prepare# head -c 4096 /dev/urandom >/IDISK.ikey
2:2.0.x# integritysetup --sector-size 4096 --tag-size 32 --integrity hmac-sha256 --integrity-key-size=4096 --integrity-key-file=/IDISK.ikey format /dev/sda
2:2.0.x# ^C after a gigabyte
2:2.2.2# integritysetup --integrity hmac-sha256 --integrity-key-size=4096 --integrity-key-file=/IDISK.ikey open /dev/sda IDISK





I definitely need a clear reproducer (with the latest stable
- 2.2.2 or 2.3.0-rc0) - ideally with attached debug and system log.


I don't have that system at hand, but I am sure it was running the latest
stock linux-image-armmp from testing.


I upgraded/downgraded between 2:2.2.2-1 and some old 2:2.0.x multiple
times to try to rule out the kernel version.
The result was always the same. When I use 2:2.2.2-1, the disk seems to
return only checksum errors. When I use 2:2.0.x, it works as expected.

How do I add testing's 2:2.2.2-1 to affected versions?
Something like "Control: found -1 2:2.2.2-1"?



Any idea how dm-integrity is supposed to interact with truncation?
Does the device size or device block size affect the checksum layout?
If the underlying device is truncated, should the checksums of the remaining
sectors stay valid?
If the mapped length gets truncated like in #935702, should the checksums of the
available sectors stay valid?





Best,
n.b.f.



Bug#949336: clarification

2020-01-19 Thread nbf
clarification: I am testing it with a volume I created and used with
cryptsetup 2:2.0. With 2:2.1 and 2:2.2 integritysetup-open seems to
succeed, but the embedded ext4 filesystem cannot be used. Attempts to
read the superblock raise I/O errors due to integrity mismatches.


Best,
n.b.f.



Bug#949336: Mapped integrity devices of size ≥2TiB are unusable on 32-bits platforms

2020-01-19 Thread nbf

Package: cryptsetup-bin
Version: 2:2.1.0-5+deb10u2
Severity: important

Dear Maintainers,

this is an unfortunate follow-up to #935702. I thank Milan Broz very much
for fixing that one so quickly.


Unfortunately, even though the mapped size is now correct,
integritysetup still silently configures a DM device that cannot be read
and only gives I/O errors. Most likely this is another issue specific to
large volumes on 32-bit systems.


Affected versions:
"testing" 2:2.2.2-1
"stable"  2:2.1.0-5+deb10u2

Working version:
same as in #935702

Affected command:
integritysetup --integrity hmac-sha256 --integrity-key-size=4096 --integrity-key-file=/IDISK.ikey open /dev/sda IDISK



Best,
n.b.f.



Bug#935702: Wrong DM device size due to integer truncation

2019-08-25 Thread nbf

Package: cryptsetup-bin
Version: 2:2.1.0-5
Severity: serious

Dear Maintainer,

cryptsetup in Stable contains multiple severe integer handling issues.
The size of the created DM device is set incorrectly due to integer truncation.

Not only is access to the protected data lost; integritysetup's
"open" operation also succeeds without any error. All reads on the
incorrectly created DM device will of course fail with I/O errors due to
bad integrity tags, but all writes will happily write wrong tags at wrong
places! This makes it very easy for the administrator to destroy the data
while trying to recover with --integrity-recovery-mode.


The issue is caused by a new set of functions "dm_*_target_set", 
introduced with cryptsetup 2:2.1.0, whose arguments use haphazardly 
chosen integer types, even though the actual types are easy to find.


For example, "uint64_t size" is temporarily stored in a size_t variable.
1) stored in lib/utils_dm.h: struct crypt_dm_active_device { uint64_t size, ... }
2) passed to lib/libdevmapper.c dm_*_target_set(..., (size_t)dmd.size, ...
3) stored in lib/utils_dm.h: struct dm_target { uint64_t size, ... }
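
The effect of step 2 on a platform with a 32-bit size_t can be illustrated with plain shell arithmetic. This is not cryptsetup code, only a sketch of the truncation itself; the helper name `truncate_to_u32` is invented, sizes are assumed to be counted in 512-byte sectors, and the shell is assumed to do 64-bit arithmetic (as dash and bash do on current systems).

```shell
# Hypothetical helper: emulate storing a 64-bit sector count in a 32-bit
# size_t by keeping only the low 32 bits.
truncate_to_u32() {
    echo $(( $1 & 0xFFFFFFFF ))
}

# 19952 sectors (the 10M test volume elsewhere in this thread) fit unchanged.
truncate_to_u32 19952
# 2 TiB = 2^32 sectors of 512 bytes: the low 32 bits are all zero.
truncate_to_u32 4294967296
# 3 TiB = 6442450944 sectors: only 2147483648 sectors (1 TiB) survive.
truncate_to_u32 6442450944
```

Any size below 2^32 sectors passes through unchanged, which matches the observation that only devices of 2TiB and larger are affected.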

Seeing such carelessness in core crypto software makes me very uneasy.


Best,
n.b.f.

-- Notes:
64-bit systems, whose size_t is 64bit, are safe from this bug.
Partitions smaller than 2TiB (2^32 * 512) are safe from this bug.
Severity: grave may be appropriate due to the potential for data loss.