Re: Prevent opening binary/other garbage files

Adam D. Ruppe via Digitalmars-d-learn Mon, 01 Oct 2018 12:45:55 -0700

On Monday, 1 October 2018 at 15:21:24 UTC, helxi wrote:

I tried out https://dlang.org/library/std/utf/validate.html before manually checking for encoding myself so I ended up with the code below. I was fairly surprised that "*.o" (object) files are UTF encoded! Is it normal?

Yes. Any random collection of bytes <= 127 is valid UTF-8. Lines will read until it sees a byte 10, and cut off from there.
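
To make that concrete, here is a minimal sketch of why std.utf.validate cannot tell such bytes from text. The helper name passesValidate and the sample bytes are made up for illustration; only validate and UTFException come from Phobos.

```d
import std.utf : validate, UTFException;

/// Hypothetical helper: true if `data` passes std.utf.validate
/// when its bytes are viewed as UTF-8 code units.
bool passesValidate(const(ubyte)[] data)
{
    try
    {
        validate(cast(const(char)[]) data);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // Made-up bytes, all <= 127: mostly control characters, no real text.
    ubyte[] lowBytes = [0x01, 0x03, 0x00, 0x7F, 0x45, 0x4C, 0x46, 0x0A, 0x02];
    assert(passesValidate(lowBytes));

    // 0xFF never occurs in well-formed UTF-8, so this buffer is rejected.
    ubyte[] highByte = [0x01, 0xFF, 0x0A];
    assert(!passesValidate(highByte));
}
```

The first buffer is clearly not text, yet it validates, because every byte <= 127 is a complete one-byte UTF-8 sequence on its own.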

Quite a few file formats have a 10 early on to detect text/binary transmission corruption, but even if they don't, it is a fairly common byte to see before too long and that cuts off your scan for later bytes.

You really are better off looking for those <32 bytes like I described earlier - a .o file will likely have some 1's and 3's early on which that will quickly detect, but those will also pass the validate test.
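
A rough sketch of that kind of check, under some assumptions: the names looksBinary and fileLooksBinary are invented here, treating tab, LF and CR as acceptable is my guess at a sensible whitelist, and the 4096-byte scan limit is arbitrary. The earlier message may have described the rule differently.

```d
import std.algorithm.searching : any;
import std.stdio : File;
import std.string : representation;

/// Heuristic (hypothetical helper): true if `data` contains a byte below 32
/// other than tab, LF or CR. That whitelist is an assumption of this sketch.
bool looksBinary(const(ubyte)[] data)
{
    return data.any!(b => b < 32 && b != '\t' && b != '\n' && b != '\r');
}

/// Applies the check to at most `limit` bytes from the start of the file.
/// Reads raw bytes rather than lines, so a byte 10 does not end the scan.
bool fileLooksBinary(string path, size_t limit = 4096)
{
    auto file = File(path, "rb");
    auto buffer = new ubyte[](limit);
    return looksBinary(file.rawRead(buffer));
}

void main()
{
    // Illustrative bytes loosely modelled on the start of an ELF object file.
    ubyte[] objectLike = [0x7F, 0x45, 0x4C, 0x46, 0x02, 0x01, 0x01, 0x00];
    assert(looksBinary(objectLike));

    // Ordinary text with a tab and a newline is not flagged.
    assert(!looksBinary("plain\ttext\n".representation));
}
```

It is only a heuristic: text files containing other control characters (form feeds, escape sequences) would be flagged too, so the whitelist may need adjusting for your inputs.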
