On Wed, Jun 17, 2015 at 2:12 AM, Richard Shann <[email protected]>
wrote:

> On Tue, 2015-06-16 at 21:09 -0500, Jeremiah Benham wrote:
> > I placed some code to debug what is going on in edit_label_for_button:
> >
> >         gboolean isvalid = g_utf8_validate (label,
> >                  sizeof(label),
> >                  NULL);
> >         printf("\nIs label valid utf8 == %i\n", isvalid);
> >         printf("\nlabel= %s\n",label);
> >
> >         isvalid = g_utf8_validate (newlabel,
> >                  sizeof(newlabel),
> >                  NULL);
> >         printf("\nIs newlabel valid utf8 == %i\n", isvalid);
> >         printf("\nnewlabel= %s\n",newlabel);
> >
> >
> > I copied in a treble clef symbol and label was not utf-8 and newlabel
> > was in fact utf8.
>
> So, what you have found is that newlabel on the Mac is UTF-8, as it is
> on Debian, which means that the GtkEntry widget is returning UTF-8
> encoded strings. So newlabel will point to the bytes 0xF0 0x9D 0x84 0x9E
> 0x00 when you have pasted a 𝄞 into the GtkEntry that is popped up when
> you do "Edit Label".
>
> You don't say in the snippet of debug code above where exactly the
>
> g_utf8_validate (label, sizeof(label),           NULL);
>
> was placed - was the value in "label" one attached to the button widget
> at line 241
>
> label = g_object_get_data (G_OBJECT(button), "icon");
>
> or was it one retrieved from the button's GtkLabel at line 243
>
>  label = gtk_button_get_label (GTK_BUTTON(button));
> >
>

It was at line 247. I modified the code a little to do some further
testing. Here is what I have done:

    label = g_object_get_data (G_OBJECT(button), "icon");
    if(label==NULL){
        label = gtk_button_get_label (GTK_BUTTON(button));
        printf("\nThe label was NULL gtk_button_get_label returns %s\n", label);
    } else {
        printf("\nlabel= %s\n",label);
    }
    gboolean isvalid = g_utf8_validate (label,
                                        sizeof(label),
                                        NULL);
    printf("\nIs label valid utf8 == %i\n", isvalid);


^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
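
A caveat about the check above: `sizeof(label)` is the size of the pointer (typically 8 on a 64-bit build), not the length of the string. The GLib documentation says to pass -1 as max_len for a NUL-terminated string, i.e. `g_utf8_validate (label, -1, NULL)`, and notes that the function returns FALSE when max_len is positive and any of those bytes is NUL. So a label shorter than the pointer size may be reported as invalid even when its bytes are fine. The sketch below imitates that rule in plain C, without GLib; `utf8_valid_n` is a hypothetical helper written for illustration, not the GLib implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a UTF-8 check over exactly `len` bytes.
   As with the documented behaviour of g_utf8_validate() for a positive
   max_len, a NUL byte inside the checked range counts as invalid. */
static int utf8_valid_n(const char *str, size_t len)
{
    const unsigned char *s = (const unsigned char *)str;
    size_t i = 0;

    assert(str != NULL);
    while (i < len) {
        unsigned char c = s[i];
        size_t extra;          /* number of continuation bytes expected */
        unsigned long cp;      /* code point being decoded */

        if (c == 0x00) return 0;
        else if (c < 0x80) { extra = 0; cp = c; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; }
        else return 0;         /* stray continuation or invalid lead byte */

        if (extra > len - i - 1) return 0;   /* sequence cut off by len */
        for (size_t k = 1; k <= extra; k++) {
            if ((s[i + k] & 0xC0) != 0x80) return 0;
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }
        /* Reject overlong forms, surrogates and values above U+10FFFF. */
        if ((extra == 1 && cp < 0x80) || (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000) || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
        i += extra + 1;
    }
    return 1;
}
```

With the four UTF-8 bytes of the treble clef, checking exactly 4 bytes succeeds, while checking 8 bytes runs into the terminating NUL and fails. That may be why "label" looked invalid here, so it seems worth repeating the test with -1 before drawing conclusions.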

When I ran this after deleting .denemo-* I got some interesting results.
The first time, and every subsequent time, the label was NULL, and the check
always returned invalid UTF-8. I then tried deleting the field completely,
leaving it blank. This resulted in label *not* being NULL. Every time I edited
after that, the label was non-NULL and nothing I placed in the field was
saved after hitting OK. If I tried to edit the label, the dialog box popped
up empty. Whatever I put in the field would appear (incorrectly, as usual) on
the button, but when I edited it again the dialog would be empty again. The
UTF-8 validation check failed every single time except once, when I used an
official Apple font. That time it returned valid UTF-8, but the symbol still
did not display correctly in the button or the dialog box when pasted. Each
time, I can see the intended symbol in the terminal output from the printf!
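
Since the terminal renders the symbol regardless, printing with %s does not show which bytes are actually in the string. A hex dump would settle it. The following is a minimal sketch in plain C; `hex_dump` is a hypothetical helper, not something that exists in Denemo or GLib.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the bytes of a NUL-terminated string into
   `out` as space-separated uppercase hex, e.g. "F0 9D 84 9E".
   Returns the number of bytes dumped; stops early if `out` is too small. */
static size_t hex_dump(const char *s, char *out, size_t outsz)
{
    size_t len, n, used = 0;

    assert(s != NULL && out != NULL && outsz > 0);
    len = strlen(s);
    out[0] = '\0';
    for (n = 0; n < len; n++) {
        if (outsz - used < 4)      /* need room for " XX" plus the NUL */
            break;
        used += (size_t)snprintf(out + used, outsz - used,
                                 n ? " %02X" : "%02X", (unsigned char)s[n]);
    }
    return n;
}
```

Calling this on label and printing the result would show F0 9D 84 9E for a UTF-8 treble clef. Because it stops at the first NUL byte, any encoding that contains zero bytes will show up as a truncated dump.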

> If it is attached to the button widget as data, it could be that the
> libxml2 library is bringing in the strings as UTF-16 (I noticed that the
> xml format itself encodes them as Unicode, but the library on reading
> palettes.xml will convert to some encoding, I would guess dependent on
> the machine, perhaps we have a version of libxml2 that is encoding
> UTF-16).
>

Almost every time it seems to be getting label from gtk_button_get_label
because label is always NULL except when I deleted it entirely.

>
> It might be useful as well to do other validations - when you say "label
> was not utf8" can you put in a check to see if it is UTF-16 or the
> Unicode code point?
>

I did not see any way to validate utf-16. I could try converting it from
utf-16 to utf-8 then validate it.
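
For UTF-16, "validating" largely comes down to checking that surrogates are correctly paired; GLib's g_utf16_to_utf8() (the function linked earlier in the thread) reports an error on malformed input. As a rough illustration of what that involves, here is a plain C sketch that combines a surrogate pair into a code point and encodes it as UTF-8. Both function names are hypothetical, and it assumes the two 16-bit units have already been read from the buffer in the right byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine a UTF-16 surrogate pair into a code point.
   Returns 0 if the two units are not a well-formed high/low pair. */
static uint32_t surrogates_to_codepoint(uint16_t hi, uint16_t lo)
{
    if (hi < 0xD800 || hi > 0xDBFF) return 0;
    if (lo < 0xDC00 || lo > 0xDFFF) return 0;
    return 0x10000u + (((uint32_t)(hi - 0xD800) << 10) |
                       (uint32_t)(lo - 0xDC00));
}

/* Encode one code point as UTF-8 into out[0..3].
   Returns the number of bytes written, or 0 for an invalid code point. */
static size_t codepoint_to_utf8(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;   /* lone surrogate */
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For the pair D834 DD1E this gives U+1D11E and then the bytes F0 9D 84 9E, matching the values quoted further down. A D834 with no low surrogate after it is rejected.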

> It also occurred to me that the rasterized box with D834 in it may not
> be intended to convey that two bytes (0xD8 0x34) are present at this
> point in the string, it may be that it just displays the first two bytes
> of anything it can't find a glyph for.


I believe this to be true because it is the same [D834] no matter what I
paste into that dialog box (unless it is one of the symbols that already
work, like the flat symbol).


> However, what you have written
> above points the other way, namely that Gtk is consistently working with
> UTF-8 and it is somewhere in the backend that the conversion to UTF-16
> is being done, resulting in the appearance of those bytes.
>

Since label is always NULL, do you still think this is so?

Jeremiah


> Richard
>
> > Jeremiah
> >
> >
> >
> > On Tue, Jun 16, 2015 at 1:55 PM, Richard Shann
> > <[email protected]> wrote:
> >         On Tue, 2015-06-16 at 12:14 -0500, Jeremiah Benham wrote:
> >         >
> >         >
> >         > On Tue, Jun 16, 2015 at 10:22 AM, Richard Shann
> >         > <[email protected]> wrote:
> >         >
> >         >
> >         >         >         Another example is the first button in the
> >         general
> >         >         palette,
> >         >         >         the one at
> >         >         >         the top of the display. That is a treble
> >         clef sign
> >         >         in a large
> >         >         >         size, its
> >         >         >         label is this
> >         >         >
> >         >         >         <span font='16'> 𝄞   </span>
> >         >         >
> >         >         >         That one displays as D834 in your
> >         screenshot. I can
> >         >         only guess
> >         >         >         that on
> >         >         >         the Mac these embedded characters are
> >         being expected
> >         >         in a
> >         >         >         different
> >         >         >         format (UTF-16 instead of UTF-8 ?).
> >         >         >         Looking in the source of this label, the
> >         file
> >         >         >         actions/palettes.xml I see
> >         >         >
> >         >         >         label="&lt;span font='16'&gt; &#x1D11E;
> >         >          &lt;/span&gt;"
> >         >         >
> >         >         >         which means that 0x1D11E is the character
> >         code being
> >         >         inserted,
> >         >         >         this is
> >         >         >         what is called the unicode codepoint (I
> >         think what
> >         >         would be
> >         >         >         written U
> >         >         >         +1D11E). I don't know what else might work
> >         in that
> >         >         position.
> >         >         >         Looking up
> >         >         >         this unicode value I see that its UTF-16
> >         >         representation is
> >         >         >
> >         >         >
> >         >         >         D8 34 DD 1E
> >         >         >
> >         >         >         which hints to me that the (gtk routines
> >         for) the
> >         >         mac is just
> >         >         >         seeing the
> >         >         >         D834 bit - which would explain why your
> >         screenshots
> >         >         seem to
> >         >         >         show this
> >         >         >         same code on several buttons - they are
> >         all in the
> >         >         musical
> >         >         >         symbols
> >         >         >         block, which is perhaps what the D834
> >         refers to (the
> >         >         bass
> >         >         >         clef, for
> >         >         >         example, is D8 34 DD 22 in UTF-16).
> >         >         >
> >         >
> >         >         > Can we convert the UTF-16 to UTF-8? Something
> >         like:
> >         >         >
> >         >
> >
> https://developer.gnome.org/glib/stable/glib-Unicode-Manipulation.html#g-utf16-to-utf8
> >         >         >
> >         >         > Are these characters expected to be UTF-16 in
> >         windows?
> >         >         >
> >         >
> >         >         I used gdb to stop Denemo just as it is making the
> >         call to
> >         >         write the
> >         >         label on a palette button.
> >         >         this is in palettes.c at line 257
> >         >
> >         >         257             gtk_label_set_markup (GTK_LABEL
> >         >         (label_widget), newlabel);
> >         >
> >         >         I then enquired what bytes were contained in the
> >         string
> >         >         newlabel that is
> >         >         being passed to that function. On my Debian windows
> >         system,
> >         >         the bytes
> >         >         are these:
> >         >
> >         >         0xF0 0x9D 0x84 0x9E
> >         >
> >         >         Looking this up, I see that this is the UTF-8
> >         encoding for the
> >         >         treble
> >         >         clef sign (𝄞) which has the unicode value U+1D11E
> >         >
> >         >         So, the text entry widget is returning a UTF-8
> >         string
> >         >         representation for
> >         >         the text you enter into it on Debian. Specifically
> >         if you
> >         >         paste 𝄞 the
> >         >         text entry widget returns a pointer to the bytes
> >         0xF0 0x9D
> >         >         0x84 0x9E
> >         >         0x00.
> >         >
> >         >         We don't know what bytes that widget is returning on
> >         the Mac
> >         >         but one guess is that it is returning 0xD8 0x34 0xDD
> >         0x1E 0x00
> >         >         that is
> >         >         it is returning the UTF-16 encoding.
> >         >
> >         >         I tried setting newlabel to have this value 0xD8
> >         0x34 0xDD
> >         >         0x1E 0x00
> >         >         from inside gdb and this caused a warning
> >         >
> >         >          Gtk - WARNING : Failed to set text from markup due
> >         to error
> >         >         parsing
> >         >         markup: Error on line 1 char 13: Invalid UTF-8
> >         encoded text in
> >         >         name -
> >         >         not valid '�4�"
> >         >
> >         >         Because of this, it fails to update the label. So
> >         (in Debian)
> >         >         the call
> >         >         to gtk_label_set_markup() is expecting a UTF-8
> >         encoded string
> >         >         and fails
> >         >         when given the string 0xD8 0x34 0xDD 0x1E 0x00
> >         (label is not
> >         >         written
> >         >         to).
> >         >
> >         >         So, if you are able to test on the Mac, do
> >         >
> >         >         Right click on a palette button
> >         >         Edit Label
> >         >         delete all the text and paste in a single 𝄞
> >         character
> >         >         press enter and see if the label updates to a box
> >         with D834 in
> >         >         it, or if
> >         >         it fails to update.
> >         >
> >         >
> >         > I have done this and it fails.  It has the letters D834 like
> >         the
> >         > others. I thought we tested this already.
> >
> >
> >         Sorry, what you have written is ambiguous: did it fail to
> >         update the
> >         label, or did it update it to become D834 in a box?
> >         (That is, to test, start with a label that works, just ascii,
> >         and then
> >         try to edit it to be the single 𝄞 character).
> >
> >         If it fails to update (stays as the ASCII you had before), then
> >         we can't
> >         be sure what the text entry widget is returning.
> >
> >         If it updates to D834 in a box, then we can guess that it is
> >         the
> >         text_entry widget that is returning a UTF_16 string which the
> >         gtk_label_set_markup() function is failing to display.
> >
> >         Perhaps, nailing down what the mis-match is won't be as
> >         important as
> >         getting the right set of libraries that work together on the
> >         Mac. We
> >         don't know if the Mac code is supposed to be using UTF_16 or
> >         8, whatever
> >         it is, should be consistent between the GtkLabel and GtkEntry
> >         widgets.
> >
> >         Richard
> >
> >
> >
> >
> >
> >
> >         >
> >         >
> >         > Jeremiah
> >         >
> >         >
> >         >         Richard
> >         >
> >         >
> >         >         >
> >         >         > Jeremiah
> >         >         >
> >         >         >
> >         >         >         I'm not sure what the way through all this
> >         is,
> >         >         perhaps asking
> >         >         >         someone in
> >         >         >         the gtk mac world about the representation
> >         of
> >         >         characters - or,
> >         >         >         if gtk2
> >         >         >         works, then something in the upgrade
> >         documentation
> >         >         for gtk3
> >         >         >         might help.
> >         >         >
> >         >         >         Richard
> >         >         >
> >         >         >         >
> >         >         >
> >         >         >
> >         >         >
> >         >
> >         >
> >         >
> >         >
> >         >
> >
> >
> >
> >
> >
>
>
>
_______________________________________________
Denemo-devel mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/denemo-devel
