Hi,
Please find attached a hopefully better variant of the fix.
Jean-Pierre
Jean-Pierre André wrote on 8/1/20 12:35 PM:
Hi,
Thank you for your report including a test pointing out
the issue.
Attached is a patch expected to fix it.
Please test and report.
Jean-Pierre
Chris Roehrig wrote on 7/30/20 1:01 AM:
I'm trying to get my Linux-based NTFS backup drive to pass a CHKDSK
and came upon this curious situation where CHKDSK finds errors.
It seems to be some issue with how ntfs-3g modifies a directory index
when renaming many files.
The CHKDSK error always seems to be of the form:
Stage 2: Examining file name linkage ...
The first free byte, 0xc0, and bytes available, 0x150, for root index
$I30 in file 0x40 are not equal.
I've attached a Python script (mkbaddir.py) that creates two
(apparently) identical directories, one of which reliably causes this
CHKDSK error; the other doesn't.
How to demonstrate:
- Format an NTFS partition or thumbdrive using Windows or mkfs.ntfs.
- Mount the partition on a Linux system.
I used Mint 20 with ntfs-3g 2017.3.23AR.3 integrated FUSE 28 and
python 3.8.2.
- Chdir to the new NTFS partition and run the script:
    /tmp/mkbaddir.py         # creates 'baddir' in current dir.
    /tmp/mkbaddir.py -G      # creates 'gooddir' in current dir.
    diff -r baddir gooddir   # no difference
    du -sB1 baddir gooddir   # same size (128K)
- Boot into Windows (10 v1903) and run (from a terminal) chkdsk
X: (where X: is the NTFS drive).
- This will say:
"Errors found. CHKDSK cannot continue in read-only mode."
- Delete baddir (I used cygwin's rm -rf), and run chkdsk X: again.
- This will now have no errors.
My guess at what's happening:
The script creates a directory of 410 empty files and then renames
them with slightly longer names, which, as I understand it, leaves a
bunch of unused nodes in the b-tree. The -G option just renames the 410
known files; without -G, the script uses os.walk() to traverse the
directory, which I'm guessing leaves the b-tree in a slightly
different state with even more unused nodes.
The 410 was chosen by trial and error so that some internal
threshold is just exceeded by the baddir but not by the gooddir.
With more than 410 (using the -c option; say -c 500), both baddir and
gooddir will cause CHKDSK errors.
If I run the script on Windows/cygwin (Python 3.6.9) to create the
folders, it does not give any CHKDSK errors even with many more files.
So there seems to be some issue with how ntfs-3g modifies the b-tree
when renaming many files that is causing CHKDSK to complain.
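For readers without the attachment, the behaviour described above can be sketched roughly as follows. This is not the attached mkbaddir.py, only a minimal reconstruction from the description: the function name make_dir, the file name pattern and the "_x" suffix are all made up for illustration.

```python
import os

def make_dir(path, count=410, use_walk=True):
    """Create `count` empty files in `path`, then rename each one to a
    slightly longer name. Returns the sorted final listing."""
    os.mkdir(path)
    # Hypothetical alphanumeric names; the real script's names are not known.
    names = [f"file{i:05d}" for i in range(count)]
    for name in names:
        open(os.path.join(path, name), "w").close()

    if use_walk:
        # 'baddir' case: discover the files by traversing the directory.
        for root, _dirs, files in os.walk(path):
            for name in files:
                os.rename(os.path.join(root, name),
                          os.path.join(root, name + "_x"))
    else:
        # 'gooddir' case (-G): rename the known files in creation order.
        for name in names:
            os.rename(os.path.join(path, name),
                      os.path.join(path, name + "_x"))
    return sorted(os.listdir(path))
```

Both branches end with the same set of names; only the order of the rename operations differs, which is presumably what leaves the directory index in a different state.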
I encountered this issue when trying to get my Linux-based NTFS
backup drive to consistently pass a CHKDSK. I use a script to first
rename POSIX names to valid Windows names, replacing '?' with '@@3F',
etc., so I can reverse the renaming afterwards. I have some website
mirror folders with many files of the form:
details.asp?id=xxxxx&key=val
which gave rise to this issue. (In the mkbaddir script I use only
alphanumeric names to make clear that this is not an illegal-character issue.)
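A reversible renaming of this kind might look like the sketch below. Only the '?' to '@@3F' mapping comes from the description above; the set of escaped characters, the escaping of '@' itself (so that decoding is unambiguous) and the function names are assumptions. It does not handle reserved names such as CON or trailing dots and spaces.

```python
import re

# Characters Windows rejects in a file name component ('/' cannot occur
# inside a POSIX name component, so it is not listed).
WINDOWS_ILLEGAL = set('<>:"\\|?*')

def to_windows_name(name):
    """Replace each illegal character with '@@' plus its two-digit hex code.
    '@' is escaped too (an assumption) so the mapping can be reversed."""
    return "".join(
        f"@@{ord(ch):02X}"
        if ch in WINDOWS_ILLEGAL or ch == "@" or ord(ch) < 32
        else ch
        for ch in name
    )

def from_windows_name(name):
    """Undo to_windows_name()."""
    return re.sub(r"@@([0-9A-F]{2})",
                  lambda m: chr(int(m.group(1), 16)), name)

print(to_windows_name("details.asp?id=xxxxx&key=val"))
# prints: details.asp@@3Fid=xxxxx&key=val
```

Since every encoded name is at least as long as the original, a directory with many such files gets exactly the "rename to a slightly longer name" pattern that the test script exercises.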
--- libntfs-3g/index.c.ref 2017-03-23 10:42:44.000000000 +0100
+++ libntfs-3g/index.c 2020-08-03 09:06:07.579013300 +0200
@@ -1803,6 +1803,7 @@
 int ntfs_index_rm(ntfs_index_context *icx)
 {
 	INDEX_HEADER *ih;
+	int freed_space;
 	int err, ret = STATUS_OK;
 
 	ntfs_log_trace("Entering\n");
@@ -1835,6 +1836,19 @@
 	} else {
 		if (ntfs_index_rm_leaf(icx))
 			goto err_out;
+		/*
+		 * Removing a leaf may lead to removing an entry from
+		 * the root index as a side effect.
+		 * Recover the space thus made available in the root index.
+		 */
+		ih = &icx->ir->index;
+		freed_space = le32_to_cpu(ih->allocated_size)
+				- le32_to_cpu(ih->index_length);
+		if ((freed_space > 0) && !(freed_space & 7)) {
+			if (ntfs_ir_truncate(icx,
+					le32_to_cpu(ih->index_length)))
+				goto err_out;
+		}
 	}
 out:
 	return ret;
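
As a quick check of the condition in the patch against the reported numbers, here is the same arithmetic in Python. It assumes that CHKDSK's "first free byte" corresponds to index_length and "bytes available" to allocated_size in the root index header; that mapping is my reading of the message, not something stated in the thread, and the function name is made up.

```python
def reclaimable_root_space(allocated_size, index_length):
    """Mirror the patch's test: the slack in the root index is reclaimed
    only if it is positive and a multiple of 8 bytes."""
    freed_space = allocated_size - index_length
    if freed_space > 0 and not (freed_space & 7):
        return freed_space
    return 0

# Values from the CHKDSK message: 0x150 available, first free byte at 0xc0.
print(hex(reclaimable_root_space(0x150, 0xC0)))   # prints: 0x90
```

With these values the slack is 0x90 (144) bytes, a multiple of 8, so the patched code would call ntfs_ir_truncate() and shrink the root index to index_length.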
_______________________________________________
ntfs-3g-devel mailing list
ntfs-3g-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ntfs-3g-devel