On 27/03/20, Andrew Gierth ([email protected]) wrote:
> >>>>> "Thomas" == Thomas Munro <[email protected]> writes:
>
> Thomas> Something like this approach might be useful for fixing the CSV file:
>
> Thomas> https://codereview.stackexchange.com/questions/185821/convert-a-mix-of-latin-1-and-utf-8-to-proper-utf-8
>
> Or:
>
> perl -MEncode -pe '
> use bytes;
> sub c { decode("UTF-8",shift,sub { decode("windows-1252", chr(shift)) }); }
> s/([\x80-\xFF]+)/encode("UTF-8",c($1))/eg' <infile >outfile
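Or, if the file is entirely Windows-1252 rather than a mix:

iconv -f WINDOWS-1252 -t UTF-8 -c < tempfile2 > tempfile3

Note that iconv treats every input byte as Windows-1252 here, so any
sequences in the file that are already valid UTF-8 would be converted a
second time. The -c flag silently drops bytes that cannot be converted.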
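
For anyone who would rather not use Perl, here is a rough Python sketch of
the same idea as the one-liner quoted above: decode as UTF-8, and fall back
to Windows-1252 only for the bytes that are not valid UTF-8. The names
fix_mixed, convert_file and "cp1252_fallback" are made up for this example,
and mapping undefined Windows-1252 bytes to U+FFFD is my assumption, not
something the Perl version does. Treat it as a starting point, not a tested
tool.

```python
import codecs


def _cp1252_fallback(err):
    """Error handler: reinterpret invalid UTF-8 bytes as Windows-1252."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # Windows-1252 leaves a few bytes undefined (e.g. 0x81); as an
    # assumption for this sketch, those become U+FFFD instead of failing.
    return bad.decode("cp1252", errors="replace"), err.end


# Hypothetical handler name, registered once at import time.
codecs.register_error("cp1252_fallback", _cp1252_fallback)


def fix_mixed(data: bytes) -> str:
    """Decode bytes that mix valid UTF-8 with stray Windows-1252 bytes."""
    return data.decode("utf-8", errors="cp1252_fallback")


def convert_file(src: str, dst: str) -> None:
    """Read src as raw bytes and write dst as clean UTF-8."""
    with open(src, "rb") as fin:
        text = fix_mixed(fin.read())
    with open(dst, "w", encoding="utf-8", newline="") as fout:
        fout.write(text)
```

Something like convert_file("infile", "outfile") would then take the place
of the shell redirection. The whole file is read into memory, which should
be fine for a typical CSV but may not suit very large inputs.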