On Tuesday, 3 November 2020 05:46:41 GMT the...@sys-concept.com wrote:
> I'm using sql-ledger, and when making a backup it uses the standard gzip program:
> $gzip = "gzip -S .gz";
>
> The backup works with some datasets, but one dataset is giving me an
> error while trying to perform a backup:
>
> Wide character in print at SL/AM.pm line 2044.
> Content-Type: application/file; Content-Disposition: attachment;
> filename=dataset_3-3.2.6-20201101.sql.gz
> ãù&ü_1604265628.dataset_3-3.2.6-20201101.sqlÏ\Yo;ñ~œØ –
>
> Since sql-ledger files are standard UTF-8 files, I was thinking that using:
> grep -axv '.*' file
>
> would find all non-UTF-8 characters, and it did. I used "nano" to remove
> them, but I'm still getting the same error while performing the backup.
>
> Any ideas?
I have not used sql-ledger, but have come across the following two symptoms
which may be relevant to your problem.
1. A SQL database which was created with an MS Windows application was using
UTF-16 instead of UTF-8. This added a UTF-16 null character at the start of
the SQL dump, which messed up the output. The offending character showed up
as a block when inspecting the dump with 'less' on Linux with its default
UTF-8 character encoding, and could be deleted with a text editor. I don't
think this relates to your problem, but I am mentioning it for completeness.
2. The word "print" in the error message is a hint worth following up. Perl,
which sql-ledger uses, converts bytes to characters and can be set to use
UTF-8 encoding. However, its conversion does not get things right every time,
and when it concatenates strings it can mistranslate them. You could fix this
by setting both the input *and* the output encoding to UTF-8. A good
explanation of the problem, with suitable solutions, is given here:
https://www.ahinea.com/en/tech/perl-unicode-struggle.html