Hi Mark,

It’s a bit of a mess. Let me start with some background, and then describe a 
possible workaround.

sml:PostRequest always uses the ISO-8859-1 encoding to encode the content. As a 
result, if your payload includes any characters that cannot be represented in 
that encoding, they will be converted to “?”, as happened in your case.
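To make the symptom concrete, here is a minimal Python sketch of the same kind of lossy conversion. It is only an illustration: Python's codec with errors="replace" stands in for whatever sml:PostRequest does internally, which I am assuming behaves similarly.

```python
# Illustration only: Python's codec is a stand-in for the encoding step
# inside sml:PostRequest (an assumption, not TopBraid code).
payload = "Салон Красоты Sono Day Spas"

# Characters that ISO-8859-1 cannot represent are replaced with "?".
encoded = payload.encode("iso-8859-1", errors="replace")
decoded = encoded.decode("iso-8859-1")

print(decoded)  # ????? ??????? Sono Day Spas
```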

N-Triples allows for two different ways of representing international 
characters. First, they can be represented in hexadecimal \uXXXX form. Second, 
they can be represented with normal UTF-8 encoding. Older versions of TopBraid 
used the first form, and it worked fine with sml:PostRequest. But recent 
versions use the second form, which was introduced with RDF 1.1, and doesn’t 
work with sml:PostRequest. There is no way to force the \uXXXX form in 
sml:convertRDFToText.
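For illustration, the difference between the two forms can be sketched in Python. The helper escape_non_ascii below is hypothetical and not part of TopBraid; it only handles non-ASCII characters and ignores the other N-Triples escapes (quotes, backslashes, newlines).

```python
def escape_non_ascii(text: str) -> str:
    # Hypothetical helper: rewrite each non-ASCII character in the
    # N-Triples escape form (4 hex digits, or 8 beyond the BMP).
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                # plain ASCII stays as-is
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")    # first form: \uXXXX
        else:
            out.append(f"\\U{cp:08X}")    # first form: \UXXXXXXXX
    return "".join(out)


literal = "Салон"                         # second form: raw UTF-8 text
escaped = escape_non_ascii(literal)
print(escaped)

# The escaped form is pure ASCII, so encoding it as ISO-8859-1 loses nothing.
safe_bytes = escaped.encode("iso-8859-1")
```

Because the escaped form contains only ASCII characters, it would pass through an ISO-8859-1 encoder unchanged, which is why the older output worked.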

Serializing in Turtle or RDF/XML instead of N-Triples would present exactly the 
same problem—the international characters would be represented in UTF-8 form, 
which doesn’t work with sml:PostRequest.

There is a possible workaround/hack, which may or may not work for you. It 
involves the following steps:

1. Use sml:ExportToRDFFile to write to a temp file (which will be UTF-8 encoded)
2. Use sml:ImportTextFile to read the content of the temp file, making sure 
that ISO-8859-1 encoding is used
3. Post the content with sml:PostRequest

The problem is how to force step 2 to use ISO-8859-1. The latest development 
builds of TopBraid Suite have an sml:encoding argument for sml:ImportTextFile 
that can be used to force the encoding, but this isn’t available in the latest 
release yet. In the latest release, sml:ImportTextFile will always use the 
JVM’s system encoding, which will depend on your environment/OS. If your system 
already uses ISO-8859-1, it may just work. Otherwise, you may be able to force 
it in the ini file with -Dfile.encoding=ISO-8859-1, but this may have other 
undesirable side effects.

(Note: This workaround doesn’t actually post the content as ISO-8859-1; it 
still posts it as UTF-8. But it just so happens that reading a UTF-8 file with 
ISO-8859-1 encoding (step 2), and writing the result back with ISO-8859-1 
encoding (step 3) yields correct UTF-8 encoding again, for the full range of 
characters. Keep this in mind in case you want to set sml:contentType for the 
post request—the content is actually in UTF-8.)
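The round trip described in the note can be checked with a small Python sketch. It simulates the three steps on in-memory bytes rather than calling any SPARQLMotion module, and it assumes the reading and posting steps do a plain one-to-one ISO-8859-1 byte mapping, as Python's codec does.

```python
original = "Салон Красоты Sono Day Spas"

# Step 1: the temp file holds UTF-8 bytes.
utf8_bytes = original.encode("utf-8")

# Step 2: read those bytes as if they were ISO-8859-1. The resulting string
# looks like gibberish, but every byte maps to exactly one character.
misread = utf8_bytes.decode("iso-8859-1")

# Step 3: the post encodes the string as ISO-8859-1, mapping each
# character back to its original byte.
posted_bytes = misread.encode("iso-8859-1")

# The bytes on the wire are identical to the original UTF-8 bytes.
assert posted_bytes == utf8_bytes
print(posted_bytes.decode("utf-8"))  # Салон Красоты Sono Day Spas
```

This works because ISO-8859-1 assigns a character to every one of the 256 byte values, so decoding and re-encoding with it never loses information.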

I will see about adding an sml:encoding argument to sml:PostRequest for the 
next release of TopBraid.

Best,
Richard



> On 11 Jan 2017, at 17:17, Mark van Berkel <[email protected]> wrote:
> 
> Hi Team,
> 
> I've got a SPARQLMotion Script that converts RDF to text (ntriple format) 
> using smf:convertRDFToText then sends the data using a POST request 
> (sml:PostRequest).  When sending international language characters, e.g. 
> Салон Красоты Sono Day Spas, all the diacritic characters change to "?". I 
> tried setting the content-type parameter to "text/html; charset=utf-8" and 
> tried this solution 
> <https://groups.google.com/d/msg/topbraid-users/g38efkznZGA/j5Q3YOFXJnUJ> 
> which is to add -Dfile.encoding=UTF-8 to TopBraid Composer.ini.orig without 
> any luck.
> 
> Is this, as suggested in the other thread, an issue with PostRequest 
> converting to string?  Is there a workaround?
> 
> Thanks and Regards,
> Mark
> 
> -- 
> You received this message because you are subscribed to the Google Group 
> "TopBraid Suite Users", the topics of which include the TopBraid Suite family 
> of products and its base technologies such as SPARQLMotion, SPARQL Web Pages 
> and SPIN.
> To post to this group, send email to [email protected]
> --- 
> You received this message because you are subscribed to the Google Groups 
> "TopBraid Suite Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected] 
> <mailto:[email protected]>.
> For more options, visit https://groups.google.com/d/optout 
> <https://groups.google.com/d/optout>.

