Hi Eliot,

I loaded it by first creating a new database and pointing to the CSV file as 
input. As far as I can tell, the default encoding is UTF-8, as shown in the 
attached screenshot. The CSV file was exported from Excel in UTF-8 encoding.

Perplexed,
Bit
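
For anyone following the thread, the kind of mismatch Eliot describes in the message quoted below can be reproduced outside BaseX. This is only a minimal sketch in Python: it assumes the single-byte code page is cp1252, which the thread does not confirm, and it says nothing about what BaseX itself does internally.

```python
# Illustrative sketch only; cp1252 is an assumed code page, not confirmed by the thread.
original = "Cañelas"

# UTF-8 stores "ñ" as two bytes (0xC3 0xB1); every other letter here is one byte.
utf8_bytes = original.encode("utf-8")

# Reading those bytes as a single-byte encoding turns each byte into its own character.
mangled = utf8_bytes.decode("cp1252")
print(mangled)  # CaÃ±elas

# As long as no bytes were lost, reversing the two steps restores the text.
repaired = mangled.encode("cp1252").decode("utf-8")
print(repaired == original)  # True
```

If the mangled text in a database looks like the output above, the input was most likely UTF-8 read with a single-byte encoding. A different pattern, such as a replacement character or missing letters, may point to a different mismatch.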

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On May 18, 2018 9:53 AM, Eliot Kimber <ekim...@contrext.com> wrote:

> That mangled string is the result of reading UTF-8 byte sequences as 
> single-byte characters, e.g. ASCII or some Windows code page.
>
> How are you loading it into BaseX? It seems unlikely that BaseX-provided code 
> would make this kind of basic mistake in reading text but it’s possible it 
> applied the incorrect encoding for some reason.
>
> Cheers,
>
> Eliot
>
> --
>
> Eliot Kimber
>
> http://contrext.com
>
> From: <basex-talk-boun...@mailman.uni-konstanz.de> on behalf of BitRider001 
> <bit.rider....@pm.me>
> Reply-To: BitRider001 <bit.rider....@pm.me>
> Date: Thursday, May 17, 2018 at 8:34 PM
> To: Bridger Dyson-Smith <bdysonsm...@gmail.com>
> Cc: "basex-talk@mailman.uni-konstanz.de" <basex-talk@mailman.uni-konstanz.de>
> Subject: Re: [basex-talk] about special characters
>
> Bridger,
>
> Indeed the file was exported from Excel in UTF-8 encoding. I've tried opening 
> the CSV file using Notepad/Wordpad and in Linux with vi in a terminal and in 
> both situations it displays the correct special character.
>
> It's only when I load it into a BaseX db and query it that it shows up, as 
> you said, as "mangled". Saving the results into a text file also produces the 
> "mangled" string.
>
> Strange.
>
> Bit
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>
> On May 18, 2018 9:21 AM, Bridger Dyson-Smith <bdysonsm...@gmail.com> wrote:
>
>> Bit -
>>
>> that's odd; it looks like the characters are being decomposed (or whatever 
>> the term is) and mangled but I'm not sure, unfortunately. Was the CSV an 
>> export from Excel? If so, I suppose this could be a Windows character set 
>> problem (cp-1252 or iso-8859-1 or something?).
>>
>> Bridger
>>
>> On Thu, May 17, 2018 at 9:11 PM BitRider001 <bit.rider....@pm.me> wrote:
>>
>>> Hi Bridger,
>>>
>>> Yes that is right. I'm on the latest (9.0.1). Attaching a screenshot here 
>>> for anyone to take a look.
>>>
>>> Bit
>>>
>>> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>>>
>>> On May 18, 2018 8:41 AM, Bridger Dyson-Smith <bdysonsm...@gmail.com> wrote:
>>>
>>>> Hi Bit - are you using the latest version? There was a problem with 9.0 
>>>> and some Unicode characters. Christian and co. have a fix in v9.0.1.
>>>>
>>>> HTH,
>>>>
>>>> Bridger
>>>>
>>>> On Thu, May 17, 2018, 7:54 PM BitRider001 <bit.rider....@pm.me> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I just joined the mailing list due to a problem I'm having displaying and 
>>>>> storing special characters.
>>>>>
>>>>> I started with a CSV file in UTF-8 and created a database from it. 
>>>>> However, when I query it, the special characters become garbled. I'm 
>>>>> using the GUI in Windows 10.
>>>>>
>>>>> It starts with this in the CSV:
>>>>>
>>>>> <name>Cañelas</name>
>>>>>
>>>>> Then it ends up like this when I export the query result into a text file:
>>>>>
>>>>> <name>Ca�las</name>
>>>>>
>>>>> Help please.
>>>>>
>>>>> Bit
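
Since the thread turns on whether the CSV really is UTF-8, one way to check is to look at the raw bytes rather than at how an editor displays them. The following is a hedged Python sketch; `describe_encoding` is a hypothetical helper written for this illustration, and the sample bytes are built in memory rather than read from the actual file.

```python
import codecs


def describe_encoding(data: bytes) -> str:
    """Report whether raw bytes decode as UTF-8 and whether they start with a UTF-8 BOM."""
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return f"not valid UTF-8 (first bad byte at offset {err.start})"
    return "valid UTF-8 with BOM" if has_bom else "valid UTF-8 without BOM"


# The same text encoded two ways, standing in for two possible exports of the CSV.
sample_utf8 = "<name>Cañelas</name>".encode("utf-8")
sample_cp1252 = "<name>Cañelas</name>".encode("cp1252")

print(describe_encoding(sample_utf8))
print(describe_encoding(sample_cp1252))
```

To run the check on a real file, read it in binary mode first, for example `describe_encoding(open(path, "rb").read())`, where `path` is a placeholder for the CSV location. Passing this check does not prove the file was meant to be UTF-8, but failing it does show that it is not.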
