The internal representation of character strings in Perl 5 is not
identical to UTF-8 or UTF-X, and both decoded characters and raw UTF-8
bytes may end up in the same string variable. There is no automatic
conversion; the "use utf8;" pragma only allows the Perl 5 source code
itself to be written in UTF-8 (see "perldoc utf8"). Therefore every
UTF-8 text coming from outside the program must be decoded, and all
data leaving the program as UTF-8 text must be encoded.
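A minimal sketch of that decode-on-input, encode-on-output rule, using only the core Encode module. The sample bytes and variable names are made up for illustration; in a real script the bytes would come from curl, a file, or a CGI parameter.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as they would arrive from outside the program (UTF-8 for "Zoë").
my $bytes = "Zo\xC3\xAB";

# Input boundary: UTF-8 bytes become Perl characters.
my $chars = decode('UTF-8', $bytes);

# Output boundary: Perl characters become UTF-8 bytes again.
my $out = encode('UTF-8', $chars);

# Prints: 4 bytes in, 3 characters, 4 bytes out
printf "%d bytes in, %d characters, %d bytes out\n",
    length($bytes), length($chars), length($out);
```

The differing lengths show that the undecoded and the decoded string are not the same thing, even though both fit into an ordinary scalar.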
So, after adding "use Encode;" (a module rather than a pragma), please replace your line
my $data_structure = decode_json(`curl -X GET $url`);
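with something like
my $data_structure = JSON::XS->new->decode(decode('UTF-8', `curl -X GET $url`));
(decode_json itself expects undecoded UTF-8 bytes, so a plain JSON::XS object takes the decoded string here)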
and replace analogously
my $Post = `curl -X PUT $url -d '$returnJSON'`;
with
my $JSONutf8 = encode('utf8', $returnJSON);
my $Post = `curl -X PUT $url -d '$JSONutf8'`;
This method helped me a lot to build and use a CouchDB database with many
international names in its texts. Since the error message you included
is related to UTF-8, it should be worthwhile to try it in your case.
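Since the failing characters are quotes and backslashes, it may also help to let the JSON module do the escaping instead of interpolating values into a JSON string by hand. The following is only a sketch under assumptions: add_comment and put_document are hypothetical helpers, "comments" is an assumed field name, and JSON::PP is used because it ships with Perl (JSON::XS offers the same encode_json and decode_json calls). put_document hands the body to curl on standard input, so no shell quoting is involved; it is defined here but not called.

```perl
use strict;
use warnings;
use JSON::PP;   # core module; JSON::XS provides the same two functions

# Hypothetical helper: takes the UTF-8 bytes of the fetched document and a
# decoded comment string, returns UTF-8 bytes usable as the PUT body.
sub add_comment {
    my ($doc_bytes, $comment) = @_;
    my $doc = decode_json($doc_bytes);       # UTF-8 bytes -> Perl data
    push @{ $doc->{comments} }, $comment;    # "comments" is an assumed field name
    return encode_json($doc);                # Perl data -> escaped UTF-8 JSON
}

# Hypothetical helper: pipes the body to curl without going through a shell.
sub put_document {
    my ($url, $body) = @_;
    open(my $curl, '|-', 'curl', '-s', '-X', 'PUT', $url,
         '-H', 'Content-Type: application/json',
         '--data-binary', '@-')
        or die "cannot start curl: $!";
    binmode $curl;                           # the body is already bytes
    print {$curl} $body;
    close $curl or die "curl failed";
}

# Sample document containing HTML with escaped double quotes.
my $body = add_comment(
    '{"_id":"post1","_rev":"1-abc","content":"<p class=\"x\">Hi</p>"}',
    qq{It's a "test" with a \\ backslash},
);
print "$body\n";
```

Because the values never pass through a hand-written JSON string or a single-quoted shell argument, double quotes, single quotes, and backslashes in the HTML or in a comment need no special treatment.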
Kind regards,
Raimund Riedel
On 07.04.2018 at 07:24, Bill Stephenson wrote:
I’ve been working on a “comments” feature for my “CherryPC blog”.
I don’t want readers to have to make a user account to comment so I’m wanting
to use a perl script on the server side that has the user credentials in the
$url variable below.
This is the code I’m using to update the document with the comment.
# Convert the JSON to a perl object
my $data_structure = decode_json(`curl -X GET $url`);
my $_id = $data_structure->{'_id'};
my $_rev = $data_structure->{'_rev'};
my $title = $data_structure->{'title'};
my $subtitle = $data_structure->{'subtitle'};
my $content = $data_structure->{'content'};
my $Text_publish = $data_structure->{'Text_publish'};
my $publishDate = $data_structure->{'publishDate'};
my $returnJSON = qq`{"$_id": "_id", "_rev": "$_rev", "title": "$title", "subtitle": "$subtitle", "content": "$content",
"docType": "text", "Text_publish": "yes", "publishDate": "$publishDate",$newCommentsList}`;
my $Post = `curl -X PUT $url -d '$returnJSON'`;
This works fine with plain text, but the blog posts are made with TinyMCE and
use HTML. I can update them fine with Javascript and PouchDB, but Perl is
dying on double quotes, single quotes, and backslashes:
' " \
I’ve narrowed it down to just those 3 characters. If I strip those from the
html and comments it will all post fine, but html doesn’t work without those so
that’s not an option.
I’m using these modules:
use strict;
use warnings;
use utf8;
use JSON::XS;
use Data::Dumper;
use CGI;
From what I understand, "use utf8" forces all the data to be UTF-8 encoded, and
I’ve used several different modules to encode the data. I’ve also built the entire
document in a Perl object and converted that to JSON, as opposed to a simple string
like above, but it still dies on those three characters.
This is what the curl error tells me:
PUT Error: bad_request
reason: invalid UTF-8 JSON
So, it’s those 3 characters that are not being encoded correctly.
If anyone has any ideas and/or advice on how to deal with this I’d sure
appreciate them. I’ve pretty much run out of them at this point.
Kindest Regards,
Bill Stephenson