If you know your UTF-8 is correct (no errors), you can find the start of
the Nth Unicode code point (not character!) by finding the Nth byte
that is not in the range 0x80-0xBF.
However, if you think you need to print N characters, then you are not
using Unicode correctly. That will return the
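The byte-skipping rule described above can be sketched in plain C, without any FLTK calls. The helper name utf8_offset_of_codepoint is hypothetical, and the sketch assumes the input is valid, NUL-terminated UTF-8; it performs no error checking and counts code points, not glyphs or character cells.

```c
#include <stddef.h>

/* Hypothetical helper, not part of FLTK: returns the byte offset at which
 * the n-th code point (0-based) starts, assuming `s` is valid UTF-8.
 * If the string holds n or fewer code points, the string length is returned. */
size_t utf8_offset_of_codepoint(const char *s, size_t n)
{
    size_t i = 0;
    size_t seen = 0;  /* code point start bytes passed so far */

    while (s[i] != '\0') {
        unsigned char b = (unsigned char)s[i];
        if ((b & 0xC0) != 0x80) {  /* not in 0x80-0xBF, so a start byte */
            if (seen == n)
                return i;
            seen++;
        }
        i++;
    }
    return i;
}
```

Under these assumptions, utf8_offset_of_codepoint(str, n) is also the number of bytes occupied by the first n code points, which is the kind of byte length the fl_draw() documentation discussed in this thread refers to.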
On 09/21/2011 01:39 PM, Nikita Egorov wrote:
off topic: I'm not sure the word glyph is a proper one in our case.
IIRC the glyph can be only part of character. A few glyphs can be at
one character cell and make up grapheme, symbol. So I'm interested in
how many character cells will be
If you know your string is ASCII then the number of glyphs is equal to
the number of bytes. This may be reasonable if you are just trying to
print numbers.
Even using fltk's calls, you can use utf8_fwd to move from the start of
the string to find the Nth code point. This has the advantage that
On 09/22/2011 02:44 AM, Duncan Gibson wrote:
I have changed (r.9055) the Doxygen doc of fl_draw() functions
to state explicitly that all involved strings are UTF-8 encoded
and all lengths are in bytes.
I would suggest using the fl_utf8decode() function in your case.
It will successively
I have changed (r.9055) the Doxygen doc of fl_draw() functions
to state explicitly that all involved strings are UTF-8 encoded
and all lengths are in bytes.
I would suggest using the fl_utf8decode() function in your case.
It will successively compute the byte length of each Unicode character
in
I have changed (r.9055) the Doxygen doc of fl_draw() functions
to state explicitly that all involved strings are UTF-8 encoded
and all lengths are in bytes.
Good idea.
I would suggest using the fl_utf8decode() function in your case.
It will successively compute the byte length of each
Without looking into it in more detail, I'm going to say docs problem -
it should not say n characters but should probably say something like
the number of bytes needed to represent n characters in UTF8 or some
such thing... Or...?
Yes, the description should be changed to say number of bytes, but I
On 21.09.2011, at 14:50, Nikita Egorov wrote:
Without looking into it in more detail, I'm going to say docs problem -
it should not say n characters but should probably say something like
the number of bytes needed to represent n characters in UTF8 or some
such thing... Or...?
Yes,
But in UTF-16 all symbols are two bytes in size. There is no problem
setting a specified string size, as opposed to UTF-8, where every symbol
can have its own size (from 1 up to 5?).
Not true I'm afraid - only glyphs from the BMP are sure to be two bytes
in UTF16.
Any glyph from a higher plane will
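To illustrate the point about planes above the BMP, here is a minimal C sketch of how UTF-16 encodes a single code point. The function name utf16_encode is hypothetical and is not an FLTK call; the arithmetic follows the standard surrogate-pair scheme.

```c
#include <stdint.h>

/* Hypothetical helper: encodes one code point as UTF-16 into out[0..1]
 * and returns the number of 16-bit units written (1 or 2).
 * Returns 0 for values that are not valid Unicode scalar values. */
int utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                        /* out of range or a lone surrogate */

    if (cp < 0x10000) {                  /* BMP: one unit, two bytes */
        out[0] = (uint16_t)cp;
        return 1;
    }

    cp -= 0x10000;                       /* higher planes: surrogate pair, four bytes */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

For example, U+00E9 fits in one unit, while U+1F600 needs the pair 0xD83D 0xDE00, so a fixed "two bytes per symbol" assumption does not hold for UTF-16 either.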
The only reliable way to get the width of whatever is printed
is using fl_width() after setting the font and size.
Or my preferred option of fl_text_extents()
No, even if you use monospace fonts, you cannot assume that the number of
characters times the width of the font will give you the width of the string
that will be rendered on screen. There are characters and character
combinations in Unicode that need more or fewer pixels, even in
On 21.09.2011, at 16:52, Nikita Egorov wrote:
No, even if you use monospace fonts, you cannot assume that the number of
characters times the width of the font will give you the width of the string
that will be rendered on screen. There are characters and character
combinations in Unicode
characters 0x2e80 for example have a width of two monospace chars. These
are mostly Chinese, Japanese and Korean. Basically, for monospaced font in
Unicode, you can have non-spacing, single-width or double-width characters or
ligatures.
I know about the double-width characters.
This should work (untested):
int findFirstNCharacters(const char *str, int n)
{
  int bytes = 0;
  int maxBytes = strlen(str);
  while (n>0 && *str!=0) {
    int bytesInChar = fl_utf_nb_char(*str, maxBytes);
    if (bytesInChar==-1) break; // error in UTF-8
    bytes += bytesInChar;
    maxBytes
On 21 Sep 2011, at 18:20, Matthias Melcher wrote:
This should work (untested):
int findFirstNCharacters(const char *str, int n)
{
  int bytes = 0;
  int maxBytes = strlen(str);
  while (n>0 && *str!=0) {
    int bytesInChar = fl_utf_nb_char(*str, maxBytes);
    if (bytesInChar==-1) break;
If the intent is to trim glyphs off the end of a string until it only has the
required number of glyphs left, then I think you could do something useful
using:
/* F2: Move backward to the previous valid UTF8 sequence start */
FL_EXPORT const char* fl_utf8back(const char* p, const char*
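The backward-stepping idea can also be sketched in plain C, without calling fl_utf8back itself. The helper name utf8_len_without_last is hypothetical, and the sketch assumes valid UTF-8; it drops code points, which is not always the same as dropping glyphs or character cells.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of FLTK: returns the byte length of `s`
 * after dropping its last `drop` code points. Assumes `s` is valid UTF-8. */
size_t utf8_len_without_last(const char *s, size_t drop)
{
    size_t len = strlen(s);

    while (drop > 0 && len > 0) {
        len--;  /* step onto the last byte of the current sequence */
        /* walk back over continuation bytes (0x80-0xBF) to the start byte */
        while (len > 0 && ((unsigned char)s[len] & 0xC0) == 0x80)
            len--;
        drop--;
    }
    return len;
}
```

The returned length could then be passed as the byte count to a drawing call that takes lengths in bytes.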
On 21 Sep 2011, at 21:39, Nikita Egorov wrote:
off topic: I'm not sure the word glyph is a proper one in our case.
IIRC the glyph can be only part of character. A few glyphs can be at
one character cell and make up grapheme, symbol. So I'm interested in
how many character cells will be