I extracted an example. The main issue is curly quotes. The text came
from FileMaker in UTF8, which I textDecode to UTF16. You can assume that
all text is LC native throughout the app.
Here is the template I use for merge:
<p leftindent="10" spacebelow="20"><span metadata="[[tMETADATA]]"><font size="16" color="#C77C02">[[tSECTION]]</font>[[tCONCEPT]]</span></p>
In the field, this text is displayed accurately with curly quotes:
The New Testament Scholar <soft break> “Dare to reason!”
Here is a result of the merge:
<p leftindent="10" spacebelow="20" bgcolor="#FFDD71"><span metadata="The
New Testament Scholar	EN07_ힿ�Dare to
reason!ힿ�"><font size="16" color="#C77C02">The New
Testament Scholar</font>&ldquo;Dare to reason!&rdquo;</span></p>
Notice that the displayed text uses entity names (&ldquo;, &rdquo;), while
the metadata, which was created from the same text block as the field
text, has changed the quotes to two numbers in the high 5000s, with no
difference between left and right quotes. I was unable to paste the
actual text here because my mail client refused to render it, but the two
numerical references appear as a single pictograph in LC's variable
watcher and do not match the card path I need, which in this case is:
EN07_The New Testament Scholar<tab>“Dare to reason!”
Maybe you can make sense of this? I've written an ugly workaround that
pieces together the reference I need, but it would be better if I could
just use the metadata. The metadata works fine as long as there are no
quotes.
On 9/9/19 11:35 PM, dsc--- via use-livecode wrote:
I think I'm doing this wrong. This seems to work, too.
on mouseUp
   put empty into field 1
   put numToCodepoint(0x2200) into x -- U+2200, inside the BMP
   put numToCodepoint(0x1040F) & "V-" into y -- U+1040F, outside the BMP
   put merge(" é{ [[x]] }é [[y]]") into field 1
end mouseUp
On Sep 9, 2019, at 10:25 PM, dsc--- via use-livecode
<use-livecode@lists.runrev.com> wrote:
And this, too, looks OK to me.
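on mouseUp
   put empty into field 1
   put "A" into field 1
   -- one BMP code point, one code point outside the BMP, then plain ASCII
   get numToCodepoint(0x2200) & numToCodepoint(0x1040F) & "V-"
   set the metadata of char 1 of field 1 to it
   -- append the metadata to the field to check that it round-trips
   put the metadata of char 1 of field 1 after field 1
end mouseUp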
I guess the problem is in the merge as you thought.
I did notice in the dictionary that setting the metadata of a line is not the
same as setting the metadata of all of the characters of the line.
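To see that difference for yourself, here is a minimal sketch. The sample text and the two metadata strings are made up for illustration, and it makes no claim about what the resulting htmlText will look like; it only sets both forms and shows the result so they can be compared.

```livecode
on mouseUp
   put "Entry one" into field 1
   -- metadata set on the line as a whole
   set the metadata of line 1 of field 1 to "line-level"
   -- metadata set on every character of that same line
   set the metadata of char 1 to -1 of line 1 of field 1 to "char-level"
   -- inspect how each one is stored
   answer the htmlText of field 1
end mouseUp
```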
Dar Scott
On Sep 9, 2019, at 8:58 PM, Dar Scott Consulting via use-livecode
<use-livecode@lists.runrev.com> wrote:
This quick check seems to work for me.
on mouseUp
   put "A" into field 1
   set the metadata of char 1 of field 1 to "é"
   put the metadata of char 1 of field 1 after field 1
end mouseUp
On Sep 9, 2019, at 8:32 PM, J. Landman Gay via use-livecode
<use-livecode@lists.runrev.com> wrote:
Well, I've made some changes to the code since I started urlEncoding the text
before merging, so I'll check that again. Paul is right that Unicode in htmlText
needs to be in hex, but the numbers I'm getting back are very high (8,000+) and
render in the field as strange pictographs. Elsewhere, where there is no merge,
curly quotes translate to the named quote or apostrophe entities and are
correct.
By metadata I mean the LC term (see the dictionary) that allows you to attach
some text to a field text chunk. The metadata isn't displayed in the field but
you can use it for anything you want. In my case the field is a list of
clickable entries in a table of contents, each with its own metadata attached
that provides a path to the stack and card the entry needs to open.
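A minimal sketch of that kind of setup, assuming the handler sits in the script of a locked field. The variable name tPath and the answer dialog are placeholders for whatever actually opens the stack and card.

```livecode
on mouseUp
   -- the clickChunk is a chunk expression for the clicked text,
   -- for example "char 3 to 9 of field 1"
   if the clickChunk is empty then exit mouseUp
   -- read back the text attached to that run; it is not displayed in the field
   put the metadata of the clickChunk into tPath
   if tPath is not empty then answer tPath
end mouseUp
```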
When I use normal LC text as metadata, diacriticals aren't rendered correctly
(curly quotes become question marks), so the path is incorrect and the
click goes nowhere.
Since LC is supposed to be unicode throughout, I'd expect metadata to be
compatible. The same text appears correctly when not used as metadata.
--
Jacqueline Landman Gay | jac...@hyperactivesw.com
HyperActive Software | http://www.hyperactivesw.com
On September 9, 2019 7:25:28 PM Dar Scott Consulting via use-livecode
<use-livecode@lists.runrev.com> wrote:
I think you are trying to think too much about the LC implementation of text.
Maybe.
Text in LC is an abstraction of a sequence of code points. Whether it is UTF16
or not is (mostly) hidden from me.
So,
get textDecode( binaryFromServer, "UTF-8" )
should put that into the correct form, if it is really UTF-8.
Data (binary bytes) is interpreted as native-encoded text if one tries to use it
as text. I recommend against this. I try to always textDecode() everything
coming in, but I make exceptions at times for ASCII.
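As a rough sketch of that decode-at-the-boundary habit: the variable names are made up for illustration, and textEncode() here only stands in for bytes that would really arrive from a server.

```livecode
on mouseUp
   -- stand-in for server data: the UTF-8 bytes of a left curly quote (U+201C)
   put textEncode(numToCodepoint(0x201C), "UTF-8") into tBytes
   -- decode once, at the point where the bytes enter the app
   put textDecode(tBytes, "UTF-8") into tText
   -- report the byte count, the character count, and the decimal code point
   put the number of bytes in tBytes && the number of chars in tText \
         && codepointToNum(char 1 of tText)
end mouseUp
```

After that, tText is ordinary LC text and should not be decoded again. For reference, 8220 and 8221 are the decimal code points of the left and right curly quotes.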
I'm not sure what you mean by metadata. Are you referring to HTTP content-type?
Sorry if I am off on a bunny trail...
Dar
On Sep 9, 2019, at 4:38 PM, J. Landman Gay via use-livecode
<use-livecode@lists.runrev.com> wrote:
It's UTF8 text from a server, which I textDecode to UTF16. When I use the UTF16
text in a merge, diacriticals and/or curly quotes get mangled. (Same with
setting metadata on field text too.)
On 9/9/19 4:16 PM, Dar Scott Consulting via use-livecode wrote:
I'm not sure I understand.
Do you mean "encoded to UTF-16"? In that case you should decode that to convert
it to internal text. And then try merge. (Which still might have problems, I suppose.)
On Sep 9, 2019, at 12:08 PM, J. Landman Gay via use-livecode
<use-livecode@lists.runrev.com> wrote:
It seems that the merge function doesn't respect Unicode. Does anyone have a
workaround? The text I'm inserting is already decoded to UTF16.
--
Jacqueline Landman Gay | jac...@hyperactivesw.com
HyperActive Software | http://www.hyperactivesw.com