On Mon, Mar 21, 2016 at 3:30 PM, Randall Sawyer
wrote:
> Frankly, the use of the term "character" when referring to a "UTF-8
> encoded Unicode code point" was for me a source of confusion
A character means a "Unicode character". That's independent of encoding,
so,
Frankly, the use of the term "character" when referring to a "UTF-8
encoded Unicode code point" was for me a source of confusion when I
leapt to the conclusion of the unmet need of a UTF-8-length-aware
wrapped string type - be it called "G_UTF8String" or "GUString".
I recommend that all Glib
Thank you once again to all who have responded.
I have changed my mind.
I DO grasp the nature of responders' objections.
My understanding has now reached a "tipping point".
What is the tipping point?
On 03/21/2016 04:30 PM, Behdad Esfahbod wrote:
I like to voice my opinion as well:
-
I like to voice my opinion as well:
- Bundling data and its length in a boxed type is useful, but that's
gblob,
- Bundling number-of-Unicode-character is rarely useful indeed,
- A string API that would require any changes to the string content to go
through editing function calls is
On Fri, Mar 18, 2016 at 9:57 AM, Randall Sawyer
wrote:
> 2) If the former is true - which it is - then the developer will need to
> call g_utf8_strlen() to determine if there are multi-byte sequences to
> navigate - and if there are - g_utf8_offset_to_pointer() to
Sure, code point works too. Anyway, enough with the ontology, we're
not a standards body
I still don't think that we need a utf8-string datatype.
___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
On 03/19/2016 02:04 PM, Randall Sawyer wrote:
>> It's possible you are focusing on implementation before measuring the
>> problem. DRY alone is not a sufficient argument.
>
> "DRY" is not a term I know - or at least in the way you are using it
> here.
On 03/19/2016 04:09 PM, Christian Hergert wrote:
It's possible you are focusing on implementation before measuring the
problem. DRY alone is not a sufficient argument.
"DRY" is not a term I know - or at least in the way you are using it here.
One topic I'm interested in covering at the
On 03/19/2016 12:25 PM, Randall Sawyer wrote:
>
> If there already were such a structure, then it could already have been
> employed by existing objects and structures such as GtkEntryBuffer and
> PangoLayout - to name two - eliminating the need for extra lines of
> redundant code. In fact - as I
length-aware string object. That's all.
-------- Forwarded Message --------
Subject: Re: G_UTF8String: Boxed Type Proposal
Date: Sat, 19 Mar 2016 15:11:23 -0400
From: Randall Sawyer <srandallsaw...@hushmail.me>
To: Emmanuele Bassi <eba...@gmail.com>
On 03/19/2016 02:57
On Fri, Mar 18, 2016 at 2:57 PM Randall Sawyer
wrote:
> how about the following modifications?
> Change "gstring.h":
> ...
> struct _GString
> {
>   gchar *str;
>   gsize len;
>   gsize allocated_len;
>   gsize utf8_len;
> };
> ...
>
Changing the size of a
Hi;
On 19 March 2016 at 18:03, Randall Sawyer wrote:
> The concision of "GUString" over "G_UTF8String" reflects the concision of my
> thoughts over what they were at the beginning of this thread.
Since you've brought it up multiple times, I wanted to ensure you
On 03/19/2016 01:38 PM, Christian Hergert wrote:
On 03/19/2016 06:57 AM, Randall Sawyer wrote:
Some object classes - such as GtkEntryBuffer - store this value and
update it as text is inserted or deleted. That is efficient. The fact
that developers need to write equivalent code for each such
On Thu, Mar 17, 2016 at 4:09 PM, Jasper St. Pierre
wrote:
> The major issue is that "Unicode character" doesn't have a good
> definition. The most likely definition is a "Unicode code point",
> however, Windows uses "Unicode character" to mean a UTF-16 byte
> sequence,
On 03/19/2016 06:57 AM, Randall Sawyer wrote:
>
> Some object classes - such as GtkEntryBuffer - store this value and
> update it as text is inserted or deleted. That is efficient. The fact
> that developers need to write equivalent code for each such class is
> inefficient.
A string abstraction
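The bookkeeping described in the quoted text (store the character count and update it as text is inserted) can be sketched in plain C. This is only an illustration under stated assumptions: `CountedText` and `counted_text_append()` are hypothetical names, not GLib or GTK+ API, the input is assumed to be valid UTF-8, and only appending is covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for illustration only; not part of GLib. */
typedef struct
{
  char  *str;      /* NUL-terminated UTF-8 bytes, or NULL when empty */
  size_t len;      /* length in bytes, excluding the terminating NUL */
  size_t n_chars;  /* cached number of Unicode code points */
} CountedText;

/* Append valid UTF-8 text and update both the byte length and the cached
 * code point count. `utf8` must not point into `t->str`.
 * Returns 0 on success, -1 if memory could not be allocated. */
int
counted_text_append (CountedText *t, const char *utf8)
{
  size_t add_len, i;
  char *grown;

  assert (t != NULL && utf8 != NULL);

  add_len = strlen (utf8);
  grown = realloc (t->str, t->len + add_len + 1);
  if (grown == NULL)
    return -1;

  /* Copy the new bytes together with their terminating NUL. */
  memcpy (grown + t->len, utf8, add_len + 1);
  t->str = grown;

  /* Only the appended bytes are scanned: every byte that is not a
   * continuation byte (10xxxxxx) starts a new code point. */
  for (i = 0; i < add_len; i++)
    if (((unsigned char) utf8[i] & 0xC0) != 0x80)
      t->n_chars++;

  t->len += add_len;
  return 0;
}
```

Starting from `CountedText t = { NULL, 0, 0 };`, each append costs one pass over the new bytes and the count is then available without rescanning the whole string. Deletion and insertion in the middle would need the same adjustment for the affected byte range.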
[ Replying a little randomly to this message. ]
Randall Sawyer:
> 3) Wouldn't it be helpful to keep track of how many code points
> ("characters") are stored in the GString - a number which may be less than
> the value of GString.len - without needing to call g_utf8_strlen() each time
> to find
On 17/03/16 20:29, Matthias Clasen wrote:
> Terminology can certainly be confusing at times, but I think that a
> Unicode character is a perfectly well-defined entity, notwithstanding
> the fact that it can be represented in various encodings (a utf8
> sequence, a ucs4 word, a utf-16 surrogate
On 03/19/2016 03:41 AM, Errol van de l'Isle wrote:
Just to add my two cents worth as a user of glibmm.
Glib::ustring uses g_utf8_pointer_to_offset() to obtain the length of
the string in characters in the method Glib::ustring::length. The
method Glib::ustring::bytes returns the length in bytes;
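The difference between the two lengths mentioned here (characters versus bytes) can be shown with a short sketch that uses only the C standard library. `utf8_code_points()` is a hypothetical helper written for this illustration, not a GLib or glibmm function, and it assumes the input is valid UTF-8; in GLib code, g_utf8_strlen() serves this purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the Unicode code points in a NUL-terminated, valid UTF-8 string.
 * Every code point starts with exactly one byte that is not a
 * continuation byte (10xxxxxx), so counting those bytes is enough. */
size_t
utf8_code_points (const char *s)
{
  size_t n = 0;

  assert (s != NULL);

  for (; *s != '\0'; s++)
    if (((unsigned char) *s & 0xC0) != 0x80)
      n++;

  return n;
}
```

For the string `"h\xC3\xA9llo"` ("héllo" with a two-byte é), `strlen()` reports 6 bytes while `utf8_code_points()` reports 5 code points. The count requires a full scan of the string each time it is needed.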
On 03/17/2016 09:30 AM, Matthias Clasen wrote:
Hi Randall,
thanks for contributing!
Pleased to be of service! Looking forward to learning how folks work
together in this community.
I believe that you haven't found such a proposal because most people
don't see much use in a separate boxed
On Wed, Mar 16, 2016 at 6:58 PM, Randall Sawyer
wrote:
> I have a question at the end of this! Please answer if you think it will
> help.
Hi Randall,
thanks for contributing!
>
> I propose the development of a new boxed type for the Glib API named
> "G_UTF8String".
Just to add my two cents worth as a user of glibmm.
Glib::ustring uses g_utf8_pointer_to_offset() to obtain the length of
the string in characters in the method Glib::ustring::length. The
method Glib::ustring::bytes returns the length in bytes;
At no point does it store the number of UTF-8
The major issue is that "Unicode character" doesn't have a good
definition. The most likely definition is a "Unicode code point",
however, Windows uses "Unicode character" to mean a UTF-16 byte
sequence, which means that any code point above the Basic Multilingual
Plane is really composed of two
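To make the UTF-16 point concrete: in UTF-16, a code point above the Basic Multilingual Plane is stored as a surrogate pair of two 16-bit code units. The sketch below shows the standard encoding arithmetic; `utf16_encode()` is a hypothetical helper written for this illustration, not an existing GLib or Windows function.

```c
#include <assert.h>
#include <stdint.h>

/* Encode one Unicode scalar value as UTF-16.
 * Writes one or two 16-bit code units to `out` and returns how many
 * were written. */
int
utf16_encode (uint32_t cp, uint16_t out[2])
{
  /* Valid input: at most U+10FFFF and outside the surrogate range. */
  assert (cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

  if (cp < 0x10000)
    {
      /* Inside the Basic Multilingual Plane: a single code unit. */
      out[0] = (uint16_t) cp;
      return 1;
    }

  /* Above the BMP: split the remaining 20 bits into two halves. */
  cp -= 0x10000;
  out[0] = (uint16_t) (0xD800 + (cp >> 10));    /* high surrogate */
  out[1] = (uint16_t) (0xDC00 + (cp & 0x3FF));  /* low surrogate */
  return 2;
}
```

U+20AC (the euro sign) fits in one code unit, while U+1F600 becomes the pair 0xD83D, 0xDE00. A count of UTF-16 code units therefore differs from a count of code points whenever such characters are present.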
On Thu, Mar 17, 2016 at 2:26 PM, Jasper St. Pierre
wrote:
> I'll also ask what "character" means in this case, even though I know
> glib also has the same confusion. Are you talking about the number of
> Unicode code points in the string, or the number of grapheme
I'll also ask what "character" means in this case, even though I know
glib also has the same confusion. Are you talking about the number of
Unicode code points in the string, or the number of grapheme clusters,
as defined by Unicode TR29 [0]? The number of code points isn't useful
for editing in
On 03/18/2016 10:10 AM, Florian Müllner wrote:
On Fri, Mar 18, 2016 at 2:57 PM Randall Sawyer
wrote:
how about the following modifications?
Change "gstring.h":
...
struct _GString
{
gchar *str;
On 03/17/2016 02:26 PM, Jasper St. Pierre wrote:
I'll also ask what "character" means in this case, even though I know
glib also has the same confusion. Are you talking about the number of
Unicode code points in the string, or the number of grapheme clusters,
as defined by Unicode TR29 [0]? The
On 03/17/2016 07:23 PM, Matthias Clasen wrote:
Sure, code point works too. Anyway, enough with the ontology, we're
not a standards body
I still don't think that we need a utf8-string datatype.
I have questions, then.
Here are excerpts from the current master files:
"gstring.h"
...
struct
On Fri, 18 Mar 2016 10:19:08 -0400
Randall Sawyer wrote:
> Also - I just discovered that glibmm has a class Glib::ustring
> (https://developer.gnome.org/glibmm/stable/classGlib_1_1ustring.html).
> I am going to take a look through its source to see what they have
>
On 03/17/2016 10:39 AM, Randall Sawyer wrote:
On 03/17/2016 09:30 AM, Matthias Clasen wrote:
I believe that you haven't found such a proposal because most people
don't see much use in a separate boxed type for utf8 strings. Every
string we pass around in GLib and GTK+, and every char * in