David Kastrup <d...@gnu.org>:
> Marko Rauhamaa <ma...@pacujo.net> writes:
>> You probably cannot produce valid UTF-8 out of invalid UTF-8 snippets
>> with split(1). However split(1) does form filenames out of its
>> arguments by concatenation:
>> split --additional-suffix=suffix file prefix
>> produces these kinds of filenames:
> I don't really get your point here. Why would you start with invalid
> UTF-8 sequences in the filenames?
There's nothing preventing such filenames from appearing on a Linux
system. They might come from a zip file with Latin-1-encoded names, for
example. I also have files on my Linux system that predate UTF-8; some
of them are encoded in Latin-3.
Worst of all, they might be part of an attack on your system. For
example, files whose names contain invalid UTF-8 could evade file
listing altogether, make your program crash in unexpected ways, or
prove impossible to remove.
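
To make the failure mode concrete, here is a minimal Python sketch of what
happens when a Latin-1-encoded name is treated as UTF-8. The filename
"café.txt" and the helper names is_valid_utf8 and strict_encode_fails are
made up for illustration. The sketch works on byte strings only and never
touches the filesystem. As far as I know, os.fsdecode() on POSIX systems
applies the same "surrogateescape" error handler shown here, but treat that
as an assumption to check against your Python version.

```python
def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes decode as strict UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def strict_encode_fails(text: str) -> bool:
    """Return True if the string cannot be encoded as strict UTF-8."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


# The same (hypothetical) name as raw bytes in two encodings.
latin1_name = "café.txt".encode("latin-1")  # b'caf\xe9.txt'
utf8_name = "café.txt".encode("utf-8")      # b'caf\xc3\xa9.txt'

# The "surrogateescape" handler maps each undecodable byte to a lone
# surrogate, so the original bytes can be recovered without loss.
as_str = latin1_name.decode("utf-8", "surrogateescape")
round_trip = as_str.encode("utf-8", "surrogateescape")

if __name__ == "__main__":
    print(is_valid_utf8(utf8_name))     # valid UTF-8
    print(is_valid_utf8(latin1_name))   # not valid UTF-8
    print(round_trip == latin1_name)    # bytes survive the round trip
    print(strict_encode_fails(as_str))  # but strict encoding rejects the str
```

The last check is the one that bites in practice: a program that obtains
such a name as a str and then prints or logs it with a strict UTF-8 encoder
gets a UnicodeEncodeError unless it handles the name as bytes or uses an
error handler.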