[bug #64253] Suggestion - Add support for libmagic and xattr

Bernhard Voelker Sat, 03 Jun 2023 14:39:58 -0700

Update of bug #64253 (project findutils):

                  Status:             In Progress => None


    _______________________________________________________

Follow-up Comment #6:

No, I don't plan to add a -printf format for mime/magic.
The output doesn't fit in well with the other output formats anyway,
because it's quite verbose.

If I were to add such a format, I'd start making use of the "%{...}"
syntax, which is currently reserved for future use; hence "%{magic}"
and "%{mime}" would fit - not sure about rawhide, though.

Re. magic/mime implementation:

First of all, it would be the first time find looks at file content,
and that processing (open/read/lookup/close) is orders of magnitude
slower than the other tests in find(1).

Having played with it for some time now, I have major qualms about
adding libmagic support:

a) file(1) has some more options than the -i (for mime output) option.
Of course, they're all available via flags in libmagic.
It would be strange to have to add further flags or knobs
to find(1) to support these options as well.
But people will ask for them - "just that one little thing".
Those are discussions we should avoid.

b) error handling:
While libmagic has the flag MAGIC_ERROR to indicate an error
via return value NULL instead of placing the error string
into the magic result buffer, that does not work for
all cases, e.g. the simple open/EPERM case: we still get the error
message "regular file, no read permission" as magic string
instead of NULL.
Likewise file(1):

$ file -E /etc/sudoers; echo $?
/etc/sudoers: regular file, no read permission
0

I looked into file/libmagic code, and found various such places.
Also the library function magic_error() does not report
an unreadable file as an error.
We'd have to single out every such error by string matching,
which I'm not willing to do.  Proper error handling seems to
be tough with libmagic.  I'm not sure which other projects use
libmagic, or how, but the current state of error handling
doesn't work for what find(1) would need.
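
To illustrate why matching on the result string is unattractive, here
is a minimal shell sketch of what a caller would have to do.  The
function name looks_like_magic_error and its pattern list are
hypothetical and certainly incomplete; only the "no read permission"
text is taken from the file(1) output shown above.

```shell
# Hypothetical helper: guess from the description text alone whether
# file(1)/libmagic reported a problem instead of a file type.
# The pattern list is an assumption and is not exhaustive.
looks_like_magic_error() {
    case $1 in
        *'no read permission'*) return 0 ;;  # text seen in the example above
        *'cannot open'*)        return 0 ;;  # assumed wording for open failures
        *)                      return 1 ;;
    esac
}

if looks_like_magic_error 'regular file, no read permission'; then
    echo 'treat as error'
fi
```

Any legitimate description that happens to contain one of these
phrases would be misclassified, and every new message in libmagic
would need another pattern.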

c)
It's not really find's business to look at the content of files,
and there are already ways to do the filtering with file(1) as shown
below (*): searching for magic strings or mime types can already be
done "the UNIX way" (i.e., one tool for one purpose).

Even if one wants to continue after the "magic check" with further
processing by find(1), the -files0-from option makes this safe for
any kind of exotic file name, incl. control or newline chars:


$ find -type f -size -40000c -mtime -10 -exec file -00 '{}' + \
    | sed -nz 'h;n;/^C source/{g;p}' \
    | find -files0-from - -printf "* %p\n  size: %s\n  inode: %i\n"
* ./find/defs.h
  size: 19707
  inode: 216534
* ./find/util.c
  size: 29571
  inode: 217027
* ./find/pred.c
  size: 37310
  inode: 152256
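
For readers unfamiliar with the sed step: with -00, file(1) emits each
file name and its description as two consecutive NUL-terminated
records, and the script keeps the name only if the description that
follows matches.  A small sketch on hand-made input (the function name
filter_c_sources and the sample records are made up for illustration;
GNU sed is assumed for -z):

```shell
# Hypothetical wrapper around the sed step of the pipeline above.
# Input:  NUL-terminated records alternating name, description.
# Output: NUL-terminated names whose description starts with "C source".
filter_c_sources() {
    # h:   save the name in the hold space
    # n:   load the next record (the description); -n suppresses auto-print
    # g;p: on a match, restore the saved name and print it
    sed -nz 'h;n;/^C source/{g;p}'
}

# Made-up sample records; tr makes the NUL-separated result readable.
printf '%s\0' 'a.c'       'C source, ASCII text' \
              'notes.txt' 'ASCII text' \
              'my file.c' 'C source, UTF-8 text' \
    | filter_c_sources | tr '\0' '\n'
```

Because the record separator is NUL rather than newline, a name
containing spaces or even newlines passes through the filter intact.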


Obviously, that would be much easier if file(1) provided options
to filter by certain magic/mime strings (as it does to exclude some tests).

I was quite enthusiastic about adding libmagic in the beginning,
but with the issues described above - above all the problematic error
handling - I'm afraid I can't add libmagic support now.
I'm inclined to abandon my local work.

(*) Some day, someone else may come up with the idea that there's
a little library to get the content of cell A1 of a spreadsheet file,
or the title of a PDF file.  I don't believe it's a good idea to link
all those libraries; instead, I'd encourage people to write tools which
fit well into UNIX pipes and pass file names along as safe,
zero-terminated strings.



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?64253>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
