2015-02-12 13:11 GMT+04:00 Rutledge Shawn shawn.rutle...@theqtcompany.com:
Consequently we have to do conversion each time we need the renderable
text, and/or cache the results to avoid converting repeatedly. Right?
Pnrftm... what? Cache what? And where? I've missed the point...
And we
On 12 Feb 2015, at 08:55, Konstantin Ritt ritt...@gmail.com wrote:
2015-02-12 11:53 GMT+04:00 Konstantin Ritt ritt...@gmail.com:
2015-02-12 11:39 GMT+04:00 Rutledge Shawn shawn.rutle...@theqtcompany.com:
On 11 Feb 2015, at 18:15, Konstantin Ritt ritt...@gmail.com wrote:
FYI: Unicode
On Wednesday 11 February 2015 17:20:04 Guido Seifert wrote:
Minor OT, but I am too curious... do you have an example?
Are there really cases where turning lower case into upper case or
vice versa changes the length of a string?
oﬃce (4 code points, with the ﬃ ligature U+FB03) - OFFICE (6 code points)
On 2015-02-11 11:29, Thiago Macieira wrote:
On Wednesday 11 February 2015 11:22:59 Julien Blanc wrote:
On 11/02/2015 10:32, Bo Thorsen wrote:
2) length() returns the number of chars I see on the screen, not a
random implementation detail of the chosen encoding.
How’s that supposed to work
On Wednesday 11 February 2015 10:32:22 Mark Gaiser wrote:
Have you tried to uppercase or lowercase a string using only the Standard
Library?
std::string s("hello");
std::transform(s.begin(), s.end(), s.begin(), ::toupper);
and
std::transform(s.begin(), s.end(), s.begin(), ::tolower);
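The std::transform calls above map one char to exactly one char, so they cannot express a mapping such as ß becoming SS. A minimal sketch of a length-changing conversion on a UTF-8 std::string follows. The function name upperUtf8Sketch is hypothetical, and the only special case it knows is ß; anything outside ASCII and ß is copied unchanged, so this is an illustration and not a Unicode-complete case mapping.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: upper-cases ASCII letters and expands the UTF-8
// sequence for U+00DF (ß, bytes 0xC3 0x9F) to "SS". All other bytes are
// copied unchanged, so this is NOT a full Unicode case mapping.
std::string upperUtf8Sketch(const std::string &in)
{
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        if (c == 0xC3 && i + 1 < in.size()
            && static_cast<unsigned char>(in[i + 1]) == 0x9F) {
            out += "SS";   // one code point becomes two
            ++i;           // skip the second byte of ß
        } else if (c >= 'a' && c <= 'z') {
            out += static_cast<char>(c - 'a' + 'A');
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}
```

Because the output is built in a separate string, the result is free to hold more code points than the input, which an in-place per-element transform cannot do.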
Yes, and he already said such example, ß becomes SS
The other example that was given is 'i' (UTF-8 0x69) becoming 'İ' under a
Turkish locale (UTF-8 0xc4 0xb0).
Ah sorry. I was too focused on the visible length. 'i' = 'İ' = 1. But of course
I have to look at the memory usage in the
On Wed, Feb 11, 2015 at 2:20 PM, Guido Seifert warg...@gmx.de wrote:
Minor OT, but I am too curious... do you have an example?
Are there really cases where turning lower case into upper case or
vice versa changes the length of a string?
Yes, and he already said such example, ß becomes SS
Minor OT, but I am too curious... do you have an example?
Are there really cases where turning lower case into upper case or
vice versa changes the length of a string?
Guido
std::string s("hello");
std::transform(s.begin(), s.end(), s.begin(), ::toupper);
and
std::transform(s.begin(),
On Wednesday 11 February 2015 11:22:59 Julien Blanc wrote:
On 11/02/2015 10:32, Bo Thorsen wrote:
2) length() returns the number of chars I see on the screen, not a
random implementation detail of the chosen encoding.
How’s that supposed to work with combining characters, which are part of
2015-02-11 20:35 GMT+04:00 Thiago Macieira thiago.macie...@intel.com:
There are probably more examples.
ftp://ftp.unicode.org/Public/UNIDATA/SpecialCasing.txt
___
Development mailing list
Development@qt-project.org
FYI: Unicode codepoint != character visual representation. Moreover, a
single character could be represented with a sequence of glyphs or vice
versa - a sequence of characters could be represented with a single glyph.
QString (and every other Unicode string class in the world) represents a
On Wednesday 11 February 2015 17:23:51 Christoph Feck wrote:
On Wednesday 11 February 2015 17:20:04 Guido Seifert wrote:
Minor OT, but I am too curious... do you have an example?
Are there really cases where turning lower case into upper case or
vice versa changes the length of a string?
On Wednesday 11 Feb 2015 17:20:04 Guido Seifert wrote:
Minor OT, but I am too curious... do you have an example?
Are there really cases where turning lower case into upper case or
vice versa changes the length of a string?
What is uppercase ß?
daniel
So forget proposing QString to operate on visual or logical glyphs. There
is QTextBoundaryFinder class that operates on logical items, and
QFontMetrics that operates on visual glyphs.
Regards,
Konstantin
2015-02-11 21:59 GMT+04:00 Matthew Woehlke mw_tr...@users.sourceforge.net:
On 2015-02-11
On Wednesday 11 February 2015 17:28:43 Daniel Teske wrote:
On Wednesday 11 Feb 2015 17:20:04 Guido Seifert wrote:
Minor OT, but I am too curious... do you have an example?
Are there really cases where turning lower case into upper case or
vice versa changes the length of a string?
What is
On Wednesday 11 February 2015 18:26:40 Guido Seifert wrote:
Yes, and he already said such example, ß becomes SS
The other example that was given is 'i' (UTF-8 0x69) becoming 'İ' under a
Turkish locale (UTF-8 0xc4 0xb0).
Ah sorry. I was too focused on the visible length. 'i' = 'İ' = 1.
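The two lengths being compared here can be checked directly. The sketch below assumes valid UTF-8 input and uses a hypothetical helper, codePointCount, that counts every byte that is not a continuation byte.

```cpp
#include <cstddef>
#include <string>

// Counts UTF-8 code points by skipping continuation bytes (10xxxxxx).
// Assumes the input is valid UTF-8; malformed input is not detected.
std::size_t codePointCount(const std::string &utf8)
{
    std::size_t n = 0;
    for (unsigned char c : utf8)
        if ((c & 0xC0) != 0x80)
            ++n;
    return n;
}
```

For std::string("i") both size() and codePointCount() give 1. For std::string("\xC4\xB0"), the UTF-8 encoding of 'İ', size() gives 2 while codePointCount() still gives 1, so the storage grows even though the visible length does not.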
On 11 Feb 2015, at 18:15, Konstantin Ritt ritt...@gmail.com wrote:
FYI: Unicode codepoint != character visual representation. Moreover, a single
character could be represented with a sequence of glyphs or vice versa - a
sequence of characters could be represented with a single glyph.
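A combining sequence shows the gap between bytes, code points, and what appears on screen. The sketch below is only an approximation: decodeUtf8 and approxVisibleCount are hypothetical names, the decoder assumes valid UTF-8, and the "visible" count only discounts the Combining Diacritical Marks block (U+0300..U+036F). Real segmentation follows the Unicode text segmentation rules, which is the job of a class such as QTextBoundaryFinder.

```cpp
#include <cstddef>
#include <string>

// Decodes valid UTF-8 into code points. No error handling: malformed input
// is not detected, which a real decoder would have to do.
std::u32string decodeUtf8(const std::string &s)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        char32_t cp;
        std::size_t extra;
        if (b < 0x80)      { cp = b;        extra = 0; }
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }
        else               { cp = b & 0x07; extra = 3; }
        ++i;
        // Fold in the continuation bytes, six payload bits each.
        for (; extra > 0 && i < s.size(); --extra, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
        out.push_back(cp);
    }
    return out;
}

// Rough stand-in for "characters seen on screen": counts code points outside
// the Combining Diacritical Marks block (U+0300..U+036F) only.
std::size_t approxVisibleCount(const std::u32string &cps)
{
    std::size_t n = 0;
    for (char32_t cp : cps)
        if (cp < 0x0300 || cp > 0x036F)
            ++n;
    return n;
}
```

For the string "e\xCC\x81" ('e' followed by U+0301, combining acute accent) this gives 3 bytes, 2 code points, and an approximate visible count of 1.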
2015-02-12 11:39 GMT+04:00 Rutledge Shawn shawn.rutle...@theqtcompany.com:
On 11 Feb 2015, at 18:15, Konstantin Ritt ritt...@gmail.com wrote:
FYI: Unicode codepoint != character visual representation. Moreover, a
single character could be represented with a sequence of glyphs or vice
versa
2015-02-12 11:53 GMT+04:00 Konstantin Ritt ritt...@gmail.com:
2015-02-12 11:39 GMT+04:00 Rutledge Shawn shawn.rutle...@theqtcompany.com:
On 11 Feb 2015, at 18:15, Konstantin Ritt ritt...@gmail.com wrote:
FYI: Unicode codepoint != character visual representation. Moreover, a
single
Am 11.02.2015 um 10:11 schrieb Marc Mutz:
You overlooked where a corresponding character exists. Either uppercase ß
exists (it does, it was found in an old printing, so there's a movement to
adopt it, except Unicode doesn't have it), then it's not a problem, or it does
(as is the case in
On Wednesday 11 February 2015 10:32:31 Bo Thorsen wrote:
This would make me very unhappy. I'm doing a customer project right now
that uses std::string all over the place and there is real pain involved
in this. It's an almost empty layer over char* and brings none of the
features of QString.
On 11/02/2015 10:32, Bo Thorsen wrote:
2) length() returns the number of chars I see on the screen, not a
random implementation detail of the chosen encoding.
How’s that supposed to work with combining characters, which are part of
unicode ?
3) at(int) and [] gives the unicode char, not a
On Wednesday 11 February 2015 02:22:45 Thiago Macieira wrote:
charT do_toupper(charT c) const;
const charT* do_toupper(charT* low, const charT* high) const;
Effects: Converts a character or characters to upper case. The second
form replaces each character *p in the range [low,high)
Den 10-02-2015 kl. 23:17 skrev Allan Sandfeld Jensen:
On Tuesday 10 February 2015, Oswald Buddenhagen wrote:
On Wed, Feb 11, 2015 at 12:37:41AM +0400, Konstantin Ritt wrote:
Yes, that would be an ideal solution. Unfortunately, that would also
break a LOT of existing code.
i was thinking of
On Wed, Feb 11, 2015 at 12:33 AM, Thiago Macieira thiago.macie...@intel.com wrote:
On Tuesday 10 February 2015 23:17:21 Allan Sandfeld Jensen wrote:
Maybe with C++11 we don't need QString that much anymore. Use std::string
with UTF8 and std::u32string for UCS4.
For Qt6 it would be worth
On Tuesday 10 February 2015 17:22:45 Thiago Macieira wrote:
Because unlike std::vector, std::basic_string is woefully inadequate
compared to QString and QByteArray. I just mentioned the easy cases, but a
quick check shows how much more is lacking.
I rest my case. QString will be there at
On Wednesday 11 February 2015 11:11:36 Olivier Goffart wrote:
"UB could kick in" has no meaning.
In practice there is no reason why casting a pointer to member function to
remove the const would not work. Yet, you would not accept it[1].
Data races are undefined behavior according to the
On Wednesday 11 February 2015 01:38:12 Olivier Goffart wrote:
Eh... have you tried to convert a UTF-8 or UTF-16 or UCS-4 string to the
locale's narrow character set without using QString?
with std::ctype::tonarrow?
That's std::ctype::narrow, which I didn't realise existed until now. But I
On Wednesday 11 February 2015 01:59:40 Olivier Goffart wrote:
Unless it is a buffer of std::atomic, it is an undefined behavior, so not
only the contents of the buffer is unpredictable, but anything, really.
(A sufficiently smart conforming compiler could see that you are writing at
the same
On Feb 10, 2015, at 17:08, Julien Blanc julien.bl...@nmc-company.com wrote:
On 10/02/2015 16:33, Knoll Lars wrote:
IMO there’s simply too many questions that this one example doesn’t answer
to conclude that what we are doing is bad.
Two arguments :
- implicit sharing is convenient, and
16 bits is completely enough for most spoken languages (see the
Unicode's Blocks.txt and/or Scripts.txt for an approximated list), whereas
8 bits encoding only covers ASCII.
Despite what http://utf8everywhere.org/#conclusions says, UTF-16 is not
the worst choice; it is a trade-off between the
On Tuesday 10 February 2015 13:26:50 Thiago Macieira wrote:
But given the choice, I would choose to do nothing. Instead, I have a patch
pending for Qt 6 that caches the Latin1 version of the QString in an extra
block past the UTF-16 data.
Sorry, I remembered wrong. I have a patch that sets a
2015-02-11 1:26 GMT+04:00 Thiago Macieira thiago.macie...@intel.com:
On Wednesday 11 February 2015 00:37:41 Konstantin Ritt wrote:
Yes, that would be an ideal solution. Unfortunately, that would also break
a LOT of existing code.
In Qt4 times, I was doing some experiments with the QString
On Wed, Feb 11, 2015 at 12:37:41AM +0400, Konstantin Ritt wrote:
Yes, that would be an ideal solution. Unfortunately, that would also
break a LOT of existing code.
i was thinking of making it explicit with a smooth migration path - add
QUtf8String (basically QByteArray, but don't permit
On Wednesday 11 February 2015 00:37:41 Konstantin Ritt wrote:
Yes, that would be an ideal solution. Unfortunately, that would also break
a LOT of existing code.
In Qt4 times, I was doing some experiments with the QString adaptive
storage (similar to what NSString does behind the scenes).
I've
Err, s/utf8Data/utf16Data/
Regards,
Konstantin
2015-02-11 1:52 GMT+04:00 Konstantin Ritt ritt...@gmail.com:
2015-02-11 1:26 GMT+04:00 Thiago Macieira thiago.macie...@intel.com:
On Wednesday 11 February 2015 00:37:41 Konstantin Ritt wrote:
Yes, that would be an ideal solution.
On Tuesday 10 February 2015 22:58:58 Konstantin Ritt wrote:
16 bits is completely enough for most spoken languages (see the
s/most/all/
All *living* languages are encoded in the BMP. The SMP and other planes
contain only dead languages (Egyptian hieroglyphs, Linear A, Linear B, etc.),
plus
Yes, that would be an ideal solution. Unfortunately, that would also break
a LOT of existing code.
In Qt4 times, I was doing some experiments with the QString adaptive
storage (similar to what NSString does behind the scenes).
Konstantin
2015-02-11 0:22 GMT+04:00 Oswald Buddenhagen
On Tue, Feb 10, 2015 at 10:58:58PM +0400, Konstantin Ritt wrote:
Despite what http://utf8everywhere.org/#conclusions says, UTF-16 is not
the worst choice; it is a trade-off between the performance and the memory
consumption in the most-common use case (spoken languages and mixed
scripts).
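The memory side of that trade-off can be estimated per code point. The sketch below uses hypothetical helpers (utf8Bytes, utf16Bytes, encodedSize) and assumes the input holds valid Unicode scalar values; it only counts storage and says nothing about processing speed.

```cpp
#include <cstddef>
#include <string>

// Bytes needed to store one code point in UTF-8 (valid scalar values only).
std::size_t utf8Bytes(char32_t cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

// Bytes needed in UTF-16: one 16-bit unit inside the BMP, a surrogate pair
// (two units) above it.
std::size_t utf16Bytes(char32_t cp)
{
    return cp < 0x10000 ? 2 : 4;
}

struct EncodedSize { std::size_t utf8; std::size_t utf16; };

// Sums the storage cost of a whole text in both encodings.
EncodedSize encodedSize(const std::u32string &text)
{
    EncodedSize total{0, 0};
    for (char32_t cp : text) {
        total.utf8 += utf8Bytes(cp);
        total.utf16 += utf16Bytes(cp);
    }
    return total;
}
```

With these definitions, pure ASCII costs half as much in UTF-8, scripts in the U+0080..U+07FF range such as Cyrillic cost the same in both, and BMP code points from U+0800 upward such as CJK cost 3 bytes in UTF-8 against 2 in UTF-16.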
On Wednesday 11 February 2015 01:52:34 Konstantin Ritt wrote:
2015-02-11 1:26 GMT+04:00 Thiago Macieira thiago.macie...@intel.com:
On Wednesday 11 February 2015 00:37:41 Konstantin Ritt wrote:
Yes, that would be an ideal solution. Unfortunately, that would also break
a LOT of
On Tuesday 10 February 2015 23:17:21 Allan Sandfeld Jensen wrote:
Maybe with C++11 we don't need QString that much anymore. Use std::string
with UTF8 and std::u32string for UCS4.
For Qt6 it would be worth considering how many of our classes still make
sense. Those we want CoW semantics on
Can QChar represent a 32 bits codepoint, then?
Regards,
Konstantin
2015-02-11 2:11 GMT+04:00 Thiago Macieira thiago.macie...@intel.com:
On Wednesday 11 February 2015 01:52:34 Konstantin Ritt wrote:
2015-02-11 1:26 GMT+04:00 Thiago Macieira thiago.macie...@intel.com:
On Wednesday 11
On Tuesday 10 February 2015, Oswald Buddenhagen wrote:
On Wed, Feb 11, 2015 at 12:37:41AM +0400, Konstantin Ritt wrote:
Yes, that would be an ideal solution. Unfortunately, that would also
break a LOT of existing code.
i was thinking of making it explicit with a smooth migration path - add
On Tuesday 10 February 2015 15:33:12 Thiago Macieira wrote:
On Tuesday 10 February 2015 23:17:21 Allan Sandfeld Jensen wrote:
Maybe with C++11 we don't need QString that much anymore. Use std::string
with UTF8 and std::u32string for UCS4.
For Qt6 it would be worth considering how many of
On Wednesday 11 February 2015 04:05:02 Konstantin Ritt wrote:
Previously you said QString::data() must return QChar* (and not a generic
uchar*), so that QString with an adaptive storage would have to silently
convert the internal encoding into the one represented by QChar.
If QString has a
On Tuesday 10 February 2015 23:17:21 Allan Sandfeld Jensen wrote:
Maybe with C++11 we don't need QString that much anymore. Use std::string
with UTF8 and std::u32string for UCS4.
For Qt6 it would be worth considering how many of our classes still make
sense. Those we want CoW semantics on
Previously you said QString::data() must return QChar* (and not a generic
uchar*), so that QString with an adaptive storage would have to silently
convert the internal encoding into the one represented by QChar.
If QString has a UCS-4 indexes and length() that counts the amount of UCS-4
On Wednesday 11 February 2015 02:19:59 Konstantin Ritt wrote:
Can QChar represent a 32 bits codepoint, then?
Yes, it could be widened. But what's the advantage in using UCS-4?
--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center
On Tuesday 10 February 2015 22:26:50 Thiago Macieira wrote:
It's not insurmountable. I can think of two solutions:
1) pre-allocate enough space for the UTF-16 data (strlen(utf8) * 2), so
that the const functions can implicitly write to the UTF-16 block when
needed. Since the original UTF-8
On 2015-02-10 18:33, Thiago Macieira wrote:
Eh... have you tried to convert a UTF-8 or UTF-16 or UCS-4 string to the
locale's narrow character set without using QString?
Yup... we would need to standardize libiconv (or an equivalent) for that :-).
Have you tried to convert a number to
On 2015-02-10 18:40, Marc Mutz wrote:
On Tuesday 10 February 2015 22:26:50 Thiago Macieira wrote:
It's not insurmountable. I can think of two solutions:
1) pre-allocate enough space for the UTF-16 data (strlen(utf8) * 2), so
that the const functions can implicitly write to the UTF-16 block
On Wednesday 11 February 2015 00:40:28 Marc Mutz wrote:
On Tuesday 10 February 2015 22:26:50 Thiago Macieira wrote:
It's not insurmountable. I can think of two solutions:
1) pre-allocate enough space for the UTF-16 data (strlen(utf8) * 2), so
that the const functions can implicitly
On Tuesday 10 February 2015 19:07:09 Matthew Woehlke wrote:
Heh. That reminds me, when will Qt classes get emplace methods?
I added those methods to my local refactor of QVector, but..
Or the ability to accept movable-but-not-copyable types?
... they aren't useful because we'll never accept
On Tuesday 10 February 2015 19:10:29 Matthew Woehlke wrote:
On 2015-02-10 18:40, Marc Mutz wrote:
On Tuesday 10 February 2015 22:26:50 Thiago Macieira wrote:
It's not insurmountable. I can think of two solutions:
1) pre-allocate enough space for the UTF-16 data (strlen(utf8) * 2), so