Re: Prevent opening binary/other garbage files

Adam D. Ruppe via Digitalmars-d-learn Sat, 29 Sep 2018 09:06:02 -0700

On Saturday, 29 September 2018 at 15:52:30 UTC, helxi wrote:

I'm writing a utility that checks for specific keyword(s) found in the files in a given directory recursively. What's the best strategy to avoid opening a bin file or some sort of garbage dump? Check encoding of the given file?

Simplest might be to read the first few bytes (like a couple hundred probably) and if any of them are < 32 && != '\t' && != '\r' && != '\n' && != 0, there's a good chance it is a binary file.

Text files are frequently going to have tabs and newlines, but not so frequently other low bytes.

If you do find a bunch of 0's, but not the other values, you might have a UTF-16 file.
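
A minimal sketch of that check in D, for illustration only. The names looksBinary, mayBeUtf16 and fileLooksBinary, and the 256-byte default, are assumptions made up for this example and are not part of Phobos. NUL bytes are deliberately let through by looksBinary so that a possible UTF-16 file can be told apart afterwards.

```d
import std.stdio : File;

/// Heuristic: a chunk looks binary if it holds a control byte (< 32)
/// other than tab, CR, LF, or NUL.
bool looksBinary(const(ubyte)[] chunk)
{
    foreach (b; chunk)
    {
        if (b < 32 && b != '\t' && b != '\r' && b != '\n' && b != 0)
            return true;
    }
    return false;
}

/// Zeros but no other suspicious low bytes: possibly UTF-16 text.
bool mayBeUtf16(const(ubyte)[] chunk)
{
    if (looksBinary(chunk))
        return false;
    foreach (b; chunk)
    {
        if (b == 0)
            return true;
    }
    return false;
}

/// Reads at most `limit` bytes from the start of the file and applies
/// the binary check. `limit` must be greater than zero.
bool fileLooksBinary(string path, size_t limit = 256)
{
    auto f = File(path, "rb");
    auto buf = new ubyte[](limit);
    auto got = f.rawRead(buf); // slice holding only the bytes actually read
    return looksBinary(got);
}
```

Only the head of each file is read, so the cost per file stays small even when the directory contains large binaries.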

If so, what are the most popular encodings (in POSIX if that matters) and how do I detect them?

For text on POSIX computers they are likely going to be UTF-8, and you can try using Phobos' readText function. It will throw (a UTFException) if it encounters non-UTF-8, so you catch that and go on to the next one.
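
A rough sketch of the readText approach, assuming a plain substring match is what "checks for keyword" means here. The function name textFileContains is hypothetical; readText, UTFException and canFind are the Phobos pieces.

```d
import std.algorithm.searching : canFind;
import std.file : readText;
import std.utf : UTFException;

/// Returns true if the file is valid UTF-8 and contains `keyword`.
/// Files that fail UTF validation are treated as "not text" and skipped.
bool textFileContains(string path, string keyword)
{
    try
    {
        auto text = readText(path); // validates the contents as UTF-8
        return text.canFind(keyword);
    }
    catch (UTFException)
    {
        return false; // not UTF-8 text; go on to the next file
    }
}
```

Note that readText loads the whole file before validating it, and a missing or unreadable file still raises a FileException, which this sketch leaves to the caller.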

But the simpler check described above will also probably work and can read less of the file.
