On Monday, March 9, 2015 at 12:05:05 PM UTC+5:30, Steven D'Aprano wrote:
> Chris Angelico wrote:
>
> > As to the notion of rejecting the construction of strings containing
> > these invalid codepoints, I'm not sure. Are there any languages out
> > there that have a Unicode string type that require
On Mon, Mar 9, 2015 at 5:34 PM, Steven D'Aprano
wrote:
> Chris Angelico wrote:
>
>> As to the notion of rejecting the construction of strings containing
>> these invalid codepoints, I'm not sure. Are there any languages out
>> there that have a Unicode string type that requires that all
>> codepoi
Chris Angelico wrote:
> As to the notion of rejecting the construction of strings containing
> these invalid codepoints, I'm not sure. Are there any languages out
> there that have a Unicode string type that requires that all
> codepoints be valid (no surrogates, no U+FFFE, etc)?
U+FFFE and U+FFF
Ben Finney :
> Steven D'Aprano writes:
>
>> '\udd00' should be a SyntaxError.
>
> I find your argument convincing, that attempting to construct a
> Unicode string of a lone surrogate should be an error.
Then we're back to square one:
>>> b'\x80'.decode('utf-8', errors='surrogateescape')
'
On Sun, Mar 8, 2015, at 22:09, Ben Finney wrote:
> Steven D'Aprano writes:
>
> > '\udd00' should be a SyntaxError.
>
> I find your argument convincing, that attempting to construct a Unicode
> string of a lone surrogate should be an error.
>
> Shouldn't the error type be a ValueError, though? T
On Monday, March 9, 2015 at 7:39:42 AM UTC+5:30, Cameron Simpson wrote:
> On 07Mar2015 22:09, Steven D'Aprano wrote:
> >Rustom Mody wrote:
> >>[...big snip...]
> >> Some parts are here some earlier and from my memory.
> >> If details wrong please correct:
> >> - 200 million records
> >> - Containi
On Mon, Mar 9, 2015 at 1:09 PM, Ben Finney wrote:
> Steven D'Aprano writes:
>
>> '\udd00' should be a SyntaxError.
>
> I find your argument convincing, that attempting to construct a Unicode
> string of a lone surrogate should be an error.
>
> Shouldn't the error type be a ValueError, though? The
Steven D'Aprano writes:
> '\udd00' should be a SyntaxError.
I find your argument convincing, that attempting to construct a Unicode
string of a lone surrogate should be an error.
Shouldn't the error type be a ValueError, though? The statement is not,
to my mind, erroneous syntax.
--
\ “P
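A small sketch of the behaviour under discussion, assuming CPython 3.3 or later: the literal is accepted by the compiler, and the failure only appears when the string is encoded. The exception raised at that point is a UnicodeEncodeError, which is a subclass of ValueError. The variable names are illustrative only, and the sketch shows current behaviour rather than settling which error type a stricter language should use.

```python
# A lone surrogate is accepted as a str literal: no SyntaxError at compile time.
s = "\udd00"
assert len(s) == 1

# The error only shows up when the string is encoded to UTF-8.
caught = None
try:
    s.encode("utf-8")
except UnicodeEncodeError as exc:
    caught = exc

# UnicodeEncodeError -> UnicodeError -> ValueError
print(type(caught).__name__)
print(isinstance(caught, ValueError))
```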
On 07Mar2015 22:09, Steven D'Aprano
wrote:
Rustom Mody wrote:
[...big snip...]
Some parts are here some earlier and from my memory.
If details wrong please correct:
- 200 million records
- Containing 4 strings with SMP characters
- System made with python and mysql. SMP works with python, brea
Marko Rauhamaa wrote:
> Steven D'Aprano :
>
>> Marko Rauhamaa wrote:
>>> '\udd00' is a valid str object:
>>
>> Is it though? Perhaps the bug is not UTF-8's inability to encode lone
>> surrogates, but that Python allows you to create lone surrogates in
>> the first place. That's not a rhetorical q
On Mon, Mar 9, 2015 at 5:25 AM, Steven D'Aprano
wrote:
> Perhaps the bug is not UTF-8's inability to encode lone
> surrogates, but that Python allows you to create lone surrogates in the
> first place. That's not a rhetorical question. It's a genuine question.
As to the notion of rejecting the co
On Mon, Mar 9, 2015 at 5:25 AM, Steven D'Aprano
wrote:
> Marko Rauhamaa wrote:
>
>> Chris Angelico :
>>
>>> Once again, you appear to be surprised that invalid data is failing.
>>> Why is this so strange? U+DD00 is not a valid character.
>
> But it is a valid non-character code point.
>
>>> It is
Steven D'Aprano :
> Marko Rauhamaa wrote:
>> '\udd00' is a valid str object:
>
> Is it though? Perhaps the bug is not UTF-8's inability to encode lone
> surrogates, but that Python allows you to create lone surrogates in
> the first place. That's not a rhetorical question. It's a genuine
> questio
Rustom Mody wrote:
> On Saturday, March 7, 2015 at 4:39:48 PM UTC+5:30, Steven D'Aprano wrote:
>> Rustom Mody wrote:
>> > This includes not just bug-prone-system code such as Java and Windows
>> > but seemingly working code such as python 3.
>>
>> What Unicode bugs do you think Python 3.3 and abo
Marko Rauhamaa wrote:
> Chris Angelico :
>
>> Once again, you appear to be surprised that invalid data is failing.
>> Why is this so strange? U+DD00 is not a valid character.
But it is a valid non-character code point.
>> It is quite correct to throw this error.
>
> '\udd00' is a valid str o
On Sun, Mar 8, 2015 at 7:09 PM, Marko Rauhamaa wrote:
> Chris Angelico :
>
>> Once again, you appear to be surprised that invalid data is failing.
>> Why is this so strange? U+DD00 is not a valid character. It is quite
>> correct to throw this error.
>
> '\udd00' is a valid str object:
>
>>>>
Chris Angelico :
> Once again, you appear to be surprised that invalid data is failing.
> Why is this so strange? U+DD00 is not a valid character. It is quite
> correct to throw this error.
'\udd00' is a valid str object:
>>> '\udd00'
'\udd00'
>>> '\udd00'.encode('utf-32')
b'\xff\xfe
Steven D'Aprano wrote:
> Marko Rauhamaa wrote:
>
>> Steven D'Aprano :
>>
>>> Marko Rauhamaa wrote:
>>>
>>>> That said, UTF-8 does suffer badly from its not being
>>>> a bijective mapping.
>>>
>>> Can you explain?
>>
>> In Python terms, there are bytes objects b that don't satisfy:
>>
>>b.d
On Sun, Mar 8, 2015 at 6:20 PM, Marko Rauhamaa wrote:
> * it still isn't bijective between str and bytes:
>
>>>> '\udd00'.encode('utf-8', errors='surrogateescape')
>Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
>UnicodeEncodeError: 'utf-8' codec can't encode character
On Saturday, March 7, 2015 at 4:39:48 PM UTC+5:30, Steven D'Aprano wrote:
> Rustom Mody wrote:
> > This includes not just bug-prone-system code such as Java and Windows but
> > seemingly working code such as python 3.
>
> What Unicode bugs do you think Python 3.3 and above have?
Literal/Legalisti
Steven D'Aprano :
> For those cases where you do wish to take an arbitrary byte stream and
> round-trip it, Python now provides an error handler for that.
>
> py> import random
> py> b = bytes([random.randint(0, 255) for _ in range(1)])
> py> s = b.decode('utf-8')
> Traceback (most recent call
On Saturday, March 7, 2015 at 11:41:53 AM UTC+5:30, Terry Reedy wrote:
> On 3/6/2015 11:20 AM, Rustom Mody wrote:
>
> > =
> > pp = "💩"
> > print (pp)
> > =
> > Try open it in idle3 and you get (at least I get):
> >
> > $ idle3 ff.py
> > Traceback (most recent call last):
> >Fil
On Saturday, March 7, 2015 at 11:49:44 PM UTC+5:30, Mark Lawrence wrote:
> On 07/03/2015 17:16, Marko Rauhamaa wrote:
> > Mark Lawrence:
> >
> >> It would clearly help if you were to type in the correct UK English
> >> accent.
> >
> > Your ad-hominem-to-contribution ratio is alarmingly high.
> >
>
Marko Rauhamaa wrote:
> Steven D'Aprano :
>
>> Marko Rauhamaa wrote:
>>
>>> That said, UTF-8 does suffer badly from its not being
>>> a bijective mapping.
>>
>> Can you explain?
>
> In Python terms, there are bytes objects b that don't satisfy:
>
>b.decode('utf-8').encode('utf-8') == b
Are
On Sat, 07 Mar 2015 19:00:47 +, Mark Lawrence wrote:
> Isn't pathlib
> https://docs.python.org/3/library/pathlib.html#module-pathlib
> effectively a more recent attempt at smoothing or even removing (some
> of) the bumps? Has anybody here got experience of it as I've never
> used it?
I almos
--- Original Message -
> From: Chris Angelico
> To:
> Cc: "python-list@python.org"
> Sent: Saturday, March 7, 2015 6:26 PM
> Subject: Re: Newbie question about text encoding
>
> On Sun, Mar 8, 2015 at 4:14 AM, Marko Rauhamaa wrote:
>> See:
>&
Dan Sommers :
> I think we're all agreeing: not all file systems are the same, and
> Python doesn't smooth out all of the bumps, even for something that
> seems as simple as displaying the names of files in a directory. And
> that's *after* we've agreed that filesystems contain files in
> hierarch
On 07/03/2015 18:34, Dan Sommers wrote:
On Sun, 08 Mar 2015 05:13:09 +1100, Chris Angelico wrote:
On Sun, Mar 8, 2015 at 5:02 AM, Dan Sommers wrote:
On Sun, 08 Mar 2015 04:59:56 +1100, Chris Angelico wrote:
On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa wrote:
Correct. Linux pathnames a
On Sun, Mar 8, 2015 at 5:34 AM, Dan Sommers wrote:
> I think we're all agreeing: not all file systems are the same, and
> Python doesn't smooth out all of the bumps, even for something that
> seems as simple as displaying the names of files in a directory. And
> that's *after* we've agreed that
On Sun, 08 Mar 2015 05:13:09 +1100, Chris Angelico wrote:
> On Sun, Mar 8, 2015 at 5:02 AM, Dan Sommers wrote:
>> On Sun, 08 Mar 2015 04:59:56 +1100, Chris Angelico wrote:
>>
>>> On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa wrote:
>>
>>>> Correct. Linux pathnames are octet strings regardless o
On 07/03/2015 17:16, Marko Rauhamaa wrote:
Mark Lawrence :
It would clearly help if you were to type in the correct UK English
accent.
Your ad-hominem-to-contribution ratio is alarmingly high.
Marko
You've been a PITA ever since you first joined this list, what about it?
--
My fellow Py
On Sun, Mar 8, 2015 at 5:02 AM, Dan Sommers wrote:
> On Sun, 08 Mar 2015 04:59:56 +1100, Chris Angelico wrote:
>
>> On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa wrote:
>
>>> Correct. Linux pathnames are octet strings regardless of the locale.
>>>
>>> That's why Linux developers should refer to
On Sun, 08 Mar 2015 04:59:56 +1100, Chris Angelico wrote:
> On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa wrote:
>> Correct. Linux pathnames are octet strings regardless of the locale.
>>
>> That's why Linux developers should refer to filenames using bytes.
>> Unfortunately, Python itself viola
On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa wrote:
>> There are two things happening here:
>>
>> 1) The underlying file system is not UTF-8, and you can't depend on
>> that,
>
> Correct. Linux pathnames are octet strings regardless of the locale.
>
> That's why Linux developers should refer to
Chris Angelico :
> On Sun, Mar 8, 2015 at 4:14 AM, Marko Rauhamaa wrote:
>> File names encoded with Latin-X are quite commonplace even in UTF-8
>> locales.
>
> That is not a problem with UTF-8, though. I don't understand how
> you're blaming UTF-8 for that.
I'm saying it creates practical proble
On Sun, Mar 8, 2015 at 4:14 AM, Marko Rauhamaa wrote:
> See:
>
>$ mkdir /tmp/xyz
>$ touch /tmp/xyz/
> \x80'
>$ python3
>Python 3.3.2 (default, Dec 4 2014, 12:49:00)
>[GCC 4.8.3 20140911 (Red Hat 4.8.3-7)] on linux
>Type "help", "copyright", "credits" or "license" for more
Mark Lawrence :
> It would clearly help if you were to type in the correct UK English
> accent.
Your ad-hominem-to-contribution ratio is alarmingly high.
Marko
--
https://mail.python.org/mailman/listinfo/python-list
Chris Angelico :
> If you really REALLY can't use the bytes() type to work with something
> that is, yaknow, bytes, then you could use an alternative encoding
> that has a value for every byte. It's still not Unicode text, so it
> doesn't much matter which encoding you use. But it's much better to
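To illustrate the "encoding that has a value for every byte" idea, here is a minimal sketch using Latin-1, chosen here as an assumption since the message does not name a specific encoding. Latin-1 maps each byte 0x00..0xFF to the code point with the same number, so any bytes object survives a decode/encode round trip. As the message says, the result is still not meaningful Unicode text.

```python
# Every possible byte value, in order.
all_bytes = bytes(range(256))

# Latin-1 assigns a code point to every byte, so decoding cannot fail.
text = all_bytes.decode("latin-1")

# Each byte maps to the code point with the same numeric value.
code_points = [ord(c) for c in text]

# Encoding back gives exactly the original bytes.
round_tripped = text.encode("latin-1")
print(round_tripped == all_bytes)
```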
On 07/03/2015 16:48, Marko Rauhamaa wrote:
Mark Lawrence :
On 07/03/2015 16:25, Marko Rauhamaa wrote:
Here's an example:
b = b'\x80'
Yes, it generates an exception. IOW, UTF-8 is not a bijective mapping
from str objects to bytes objects.
Python 2 might, Python 3 doesn't.
Python
On Sun, Mar 8, 2015 at 3:54 AM, Marko Rauhamaa wrote:
> You can't operate on file names and text files using Python strings. Or
> at least, you will need to add (nontrivial) exception catching logic.
You can't operate on a JPG file using a Unicode string, nor an array
of integers. What of it? You
On Sun, Mar 8, 2015 at 3:54 AM, Marko Rauhamaa wrote:
>> All you've proven is that there are bit patterns which are not UTF-8
>> streams...
>
> And that causes problems.
Demonstrate.
ChrisA
Chris Angelico :
> On Sun, Mar 8, 2015 at 3:25 AM, Marko Rauhamaa wrote:
> Marko Rauhamaa wrote:
>> That said, UTF-8 does suffer badly from its not being
>> a bijective mapping.
>
>> Here's an example:
>>
>>b = b'\x80'
>>
>> Yes, it generates an exception. IOW, UTF-8 is not a
On Sun, Mar 8, 2015 at 3:40 AM, Mark Lawrence wrote:
>> Here's an example:
>>
>> b = b'\x80'
>>
>> Yes, it generates an exception. IOW, UTF-8 is not a bijective mapping
>> from str objects to bytes objects.
>>
>
> Python 2 might, Python 3 doesn't.
He was talking about this line of code:
b.de
Mark Lawrence :
> On 07/03/2015 16:25, Marko Rauhamaa wrote:
>> Here's an example:
>>
>> b = b'\x80'
>>
>> Yes, it generates an exception. IOW, UTF-8 is not a bijective mapping
>> from str objects to bytes objects.
>
> Python 2 might, Python 3 doesn't.
Python 3.3.2 (default, Dec 4 2014, 1
On 07/03/2015 16:25, Marko Rauhamaa wrote:
Chris Angelico :
On Sun, Mar 8, 2015 at 2:48 AM, Marko Rauhamaa wrote:
Steven D'Aprano :
Marko Rauhamaa wrote:
That said, UTF-8 does suffer badly from its not being
a bijective mapping.
Can you explain?
In Python terms, there are bytes object
On Sun, Mar 8, 2015 at 3:25 AM, Marko Rauhamaa wrote:
> Chris Angelico :
>
>> On Sun, Mar 8, 2015 at 2:48 AM, Marko Rauhamaa wrote:
>>> Steven D'Aprano :
>>>
>>>> Marko Rauhamaa wrote:
>>>>> That said, UTF-8 does suffer badly from its not being
>>>>> a bijective mapping.
>>>> Can you ex
Chris Angelico :
> On Sun, Mar 8, 2015 at 2:48 AM, Marko Rauhamaa wrote:
>> Steven D'Aprano :
>>
>>> Marko Rauhamaa wrote:
>>>
>>>> That said, UTF-8 does suffer badly from its not being
>>>> a bijective mapping.
>>>
>>> Can you explain?
>>
>> In Python terms, there are bytes objects b that don't
On Sun, Mar 8, 2015 at 2:48 AM, Marko Rauhamaa wrote:
> Steven D'Aprano :
>
>> Marko Rauhamaa wrote:
>>
>>> That said, UTF-8 does suffer badly from its not being
>>> a bijective mapping.
>>
>> Can you explain?
>
> In Python terms, there are bytes objects b that don't satisfy:
>
>b.decode('utf-
Steven D'Aprano :
> Marko Rauhamaa wrote:
>
>> That said, UTF-8 does suffer badly from its not being
>> a bijective mapping.
>
> Can you explain?
In Python terms, there are bytes objects b that don't satisfy:
b.decode('utf-8').encode('utf-8') == b
Marko
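The claim can be checked with a short sketch. The helper name `roundtrips` is hypothetical, and the example assumes Python 3.1 or later, where the `surrogateescape` error handler exists. Under the default strict handler a lone continuation byte fails to decode; with `surrogateescape` the same byte is smuggled through as a lone surrogate and restored on encoding.

```python
def roundtrips(b, errors="strict"):
    """Return True if b decodes as UTF-8 and re-encodes to the same bytes."""
    try:
        text = b.decode("utf-8", errors=errors)
        return text.encode("utf-8", errors=errors) == b
    except UnicodeError:
        # Covers both UnicodeDecodeError and UnicodeEncodeError.
        return False

valid = "na\u00efve".encode("utf-8")   # well-formed UTF-8
invalid = b"\x80"                      # lone continuation byte

print(roundtrips(valid))
print(roundtrips(invalid))
print(roundtrips(invalid, errors="surrogateescape"))
```

With `surrogateescape` the intermediate str contains U+DC80, which is why the same thread goes on to argue about whether lone surrogates should be allowed in str at all.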
Marko Rauhamaa wrote:
> That said, UTF-8 does suffer badly from its not being
> a bijective mapping.
Can you explain?
As far as I am aware, every code point has one and only one valid UTF-8
encoding, and every UTF-8 encoding has one and only one valid code point.
There are *invalid* UTF-8 encod
On 07/03/2015 11:09, Steven D'Aprano wrote:
Rustom Mody wrote:
This includes not just bug-prone-system code such as Java and Windows but
seemingly working code such as python 3.
What Unicode bugs do you think Python 3.3 and above have?
Methinks somebody has been drinking too much loony ju
On 07/03/2015 12:02, Chris Angelico wrote:
On Sat, Mar 7, 2015 at 10:53 PM, Marko Rauhamaa wrote:
The main dream was a fixed-width encoding scheme. People thought 16 bits
would be enough. The dream is so precious and true to us in the West
that people don't want to give it up.
So... use Pike,
On Sat, Mar 7, 2015 at 10:53 PM, Marko Rauhamaa wrote:
> The main dream was a fixed-width encoding scheme. People thought 16 bits
> would be enough. The dream is so precious and true to us in the West
> that people don't want to give it up.
So... use Pike, or Python 3.3+?
ChrisA
Steven D'Aprano :
> Rustom Mody wrote:
>> My conclusion: Early adopters of unicode -- Windows and Java -- were
>> punished for their early adoption. You can blame the unicode
>> consortium, you can blame the babel of human languages, particularly
>> that some use characters and some only (the equi
On Sat, Mar 7, 2015 at 10:09 PM, Steven D'Aprano
wrote:
> Stop using MySQL, which is a joke of a database[1], and use Postgres which
> does not have this problem.
I agree with the recommendation, though to be fair to MySQL, it is now
possible to store full Unicode. Though personally, I think the
Rustom Mody wrote:
> On Thursday, March 5, 2015 at 7:36:32 PM UTC+5:30, Steven D'Aprano wrote:
[...]
>> Chris is suggesting that going from BMP to all of Unicode is not the hard
>> part. Going from ASCII to the BMP part of Unicode is the hard part. If
>> you can do that, you can go the rest of the
On 3/6/2015 11:20 AM, Rustom Mody wrote:
=
pp = "💩"
print (pp)
=
Try open it in idle3 and you get (at least I get):
$ idle3 ff.py
Traceback (most recent call last):
File "/usr/bin/idle3", line 5, in <module>
main()
File "/usr/lib/python3.4/idlelib/PyShell.py", line 1562, in m
On Friday, March 6, 2015 at 8:20:22 PM UTC+5:30, Steven D'Aprano wrote:
> Rustom Mody wrote:
>
> > On Friday, March 6, 2015 at 10:50:35 AM UTC+5:30, Chris Angelico wrote:
>
> [snip example of an analogous situation with NULs]
>
> > Strawman.
>
> Sigh. If I had a dollar for every time somebody c
random...@fastmail.us wrote:
> My point is there are very few
> problems to which "count of Unicode code points" is the only right
> answer - that UTF-32 is good enough for but that are meaningfully
> impacted by a naive usage of UTF-16, to the point where UTF-16 is
> something you have to be "saf
On Sat, Mar 7, 2015 at 3:20 AM, Rustom Mody wrote:
> C's string is not bug-prone; it's plain buggy, as it cannot represent strings
> with nulls.
>
> I would not go that far for UTF-16.
> It is bug-inviting but it can also be implemented correctly
C's standard library string handling functions are re
On Sat, Mar 7, 2015 at 1:50 AM, Steven D'Aprano
wrote:
> Rustom Mody wrote:
>
>> On Friday, March 6, 2015 at 10:50:35 AM UTC+5:30, Chris Angelico wrote:
>
> [snip example of an analogous situation with NULs]
>
>> Strawman.
>
> Sigh. If I had a dollar for every time somebody cried "Strawman!" when
Rustom Mody wrote:
> On Friday, March 6, 2015 at 10:50:35 AM UTC+5:30, Chris Angelico wrote:
[snip example of an analogous situation with NULs]
> Strawman.
Sigh. If I had a dollar for every time somebody cried "Strawman!" when what
they really should say is "Yes, that's a good argument, I'm afr
On Fri, Mar 6, 2015, at 09:11, Chris Angelico wrote:
> To prevent people from putting three paragraphs of lipsum in and
> calling it a username.
Limiting by UTF-8 bytes or UTF-16 units works just as well for that.
> So you truncate to the desired length, then if the first character of
> the trimm
On Sat, Mar 7, 2015 at 1:03 AM, wrote:
> On Fri, Mar 6, 2015, at 08:39, Chris Angelico wrote:
>> Number of code points is the most logical way to length-limit
>> something. If you want to allow users to set their display names but
>> not to make arbitrarily long ones, limiting them to X code poin
On Fri, Mar 6, 2015, at 08:39, Chris Angelico wrote:
> Number of code points is the most logical way to length-limit
> something. If you want to allow users to set their display names but
> not to make arbitrarily long ones, limiting them to X code points is
> the safest way (and preferably do an N
On Sat, Mar 7, 2015 at 12:33 AM, wrote:
> However, when do you _really_ want the number of characters? You may
> want to use it for, for example, the number of columns in a 'monospace'
> font, which you've already screwed up because you haven't accounted for
> double-wide characters or combining
On Fri, Mar 6, 2015, at 04:06, Rustom Mody wrote:
> Also:
> Can a programmer who is away from UTF-16 in one part of the system (say
> by using python3)
> assume he is safe all over?
The most common failure of UTF-16 support, supposedly, is in programs
misusing the number of code units (for length
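The three candidate "lengths" discussed in this sub-thread can be compared directly. This is a sketch assuming Python 3.3 or later, where `len()` on a str counts code points. The helper name `lengths` is hypothetical, and the sample characters are chosen for illustration.

```python
def lengths(s):
    """Return the size of s measured three different ways."""
    return {
        "code_points": len(s),
        # UTF-16 code units are 2 bytes each; the -le codec adds no BOM.
        "utf16_units": len(s.encode("utf-16-le")) // 2,
        "utf8_bytes": len(s.encode("utf-8")),
    }

bmp = "\u00e9"        # LATIN SMALL LETTER E WITH ACUTE, inside the BMP
smp = "\U0001F4A9"    # a character outside the BMP (needs a surrogate pair)

print(lengths(bmp))
print(lengths(smp))
```

For the BMP character the code point count and the UTF-16 unit count agree. They differ only outside the BMP, which is the case where code that treats UTF-16 units as characters goes wrong.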
On Friday, March 6, 2015 at 3:24:48 PM UTC+5:30, Chris Angelico wrote:
> On Fri, Mar 6, 2015 at 8:02 PM, Rustom Mody wrote:
> >> Broken systems can be shown up by anything. Suppose you have a program
> >> that breaks when it gets a NUL character (not unknown in C code); is
> >> the fault with the U
On Fri, Mar 6, 2015 at 8:02 PM, Rustom Mody wrote:
>> Broken systems can be shown up by anything. Suppose you have a program
>> that breaks when it gets a NUL character (not unknown in C code); is
>> the fault with the Unicode consortium for allocating something at
>> codepoint 0, or the code that
On Friday, March 6, 2015 at 2:33:11 PM UTC+5:30, Rustom Mody wrote:
> Let's please stick to UTF-16 shall we?
>
> Now tell me:
> - Is it broken or not?
> - Is it widely used or not?
> - Should programmers be careful of it or not?
> - Should programmers be warned about it or not?
Also:
Can a program
On Friday, March 6, 2015 at 10:50:35 AM UTC+5:30, Chris Angelico wrote:
> On Fri, Mar 6, 2015 at 3:53 PM, Rustom Mody wrote:
> > My conclusion: Early adopters of unicode -- Windows and Java -- were
> > punished
> > for their early adoption. You can blame the unicode consortium, you can
> > blame
On Fri, Mar 6, 2015 at 3:53 PM, Rustom Mody wrote:
> My conclusion: Early adopters of unicode -- Windows and Java -- were punished
> for their early adoption. You can blame the unicode consortium, you can
> blame the babel of human languages, particularly that some use characters
> and some only
On Thursday, March 5, 2015 at 7:36:32 PM UTC+5:30, Steven D'Aprano wrote:
> Rustom Mody wrote:
>
> > On Wednesday, March 4, 2015 at 10:25:24 AM UTC+5:30, Chris Angelico wrote:
> >> On Wed, Mar 4, 2015 at 3:45 PM, Rustom Mody wrote:
> >> >
> >> > It lists some examples of software that somehow bre
random...@fastmail.us wrote:
> On Thu, Mar 5, 2015, at 09:06, Steven D'Aprano wrote:
>> I mostly agree with Chris. Supporting *just* the BMP is non-trivial in
>> UTF-8
>> and UTF-32, since that goes against the grain of the system. You would
>> have
>> to program in artificial restrictions that ot
On Thu, Mar 5, 2015, at 09:06, Steven D'Aprano wrote:
> I mostly agree with Chris. Supporting *just* the BMP is non-trivial in
> UTF-8
> and UTF-32, since that goes against the grain of the system. You would
> have
> to program in artificial restrictions that otherwise don't exist.
UTF-8 is alread
Rustom Mody wrote:
> On Wednesday, March 4, 2015 at 10:25:24 AM UTC+5:30, Chris Angelico wrote:
>> On Wed, Mar 4, 2015 at 3:45 PM, Rustom Mody wrote:
>> >
>> > It lists some examples of software that somehow break/goof going from
>> > BMP-only unicode to 7.0 unicode.
>> >
>> > IOW the suggestion
On Wednesday, March 4, 2015 at 10:25:24 AM UTC+5:30, Chris Angelico wrote:
> On Wed, Mar 4, 2015 at 3:45 PM, Rustom Mody wrote:
> >
> > It lists some examples of software that somehow break/goof going from
> > BMP-only
> > unicode to 7.0 unicode.
> >
> > IOW the suggestion is that the two-way
On Wed, Mar 4, 2015 at 3:45 PM, Rustom Mody wrote:
>
> It lists some examples of software that somehow break/goof going from BMP-only
> unicode to 7.0 unicode.
>
> IOW the suggestion is that the two-way classification
> - ASCII
> - Unicode
>
> is less useful and accurate than the 3-way
>
> - A
On Wednesday, March 4, 2015 at 12:07:06 AM UTC+5:30, jmf wrote:
> Le mardi 3 mars 2015 19:04:06 UTC+1, Rustom Mody a écrit :
> > On Thursday, February 26, 2015 at 10:33:44 PM UTC+5:30, Terry Reedy wrote:
> > > On 2/26/2015 8:24 AM, Chris Angelico wrote:
> > > > On Thu, Feb 26, 2015 at 11:40 PM, Rus
On Wednesday, March 4, 2015 at 9:35:28 AM UTC+5:30, Rustom Mody wrote:
> On Wednesday, March 4, 2015 at 8:24:40 AM UTC+5:30, Steven D'Aprano wrote:
> > Rustom Mody wrote:
> >
> > > On Thursday, February 26, 2015 at 10:33:44 PM UTC+5:30, Terry Reedy wrote:
> > >> On 2/26/2015 8:24 AM, Chris Angelic
On Wednesday, March 4, 2015 at 8:24:40 AM UTC+5:30, Steven D'Aprano wrote:
> Rustom Mody wrote:
>
> > On Thursday, February 26, 2015 at 10:33:44 PM UTC+5:30, Terry Reedy wrote:
> >> On 2/26/2015 8:24 AM, Chris Angelico wrote:
> >> > On Thu, Feb 26, 2015 at 11:40 PM, Rustom Mody wrote:
> >> >> Wrot
On Wed, Mar 4, 2015 at 1:54 PM, Steven D'Aprano
wrote:
> It is easy to mock what is not important to you. I daresay kids adding emoji
> to their 10 character tweets would mock all the useless maths symbols in
> Unicode too.
Definitely! Who ever sings "do you wanna build an integral sign"?
ChrisA
On Wednesday, March 4, 2015 at 12:14:11 AM UTC+5:30, Chris Angelico wrote:
> On Wed, Mar 4, 2015 at 5:03 AM, Rustom Mody wrote:
> > What I was trying to say expanded here
> > http://blog.languager.org/2015/03/whimsical-unicode.html
> > [Hope the word 'whimsical' is less jarring and more accurate t
Rustom Mody wrote:
> On Thursday, February 26, 2015 at 10:33:44 PM UTC+5:30, Terry Reedy wrote:
>> On 2/26/2015 8:24 AM, Chris Angelico wrote:
>> > On Thu, Feb 26, 2015 at 11:40 PM, Rustom Mody wrote:
>> >> Wrote something up on why we should stop using ASCII:
>> >> http://blog.languager.org/2015/
On 3/3/2015 1:03 PM, Rustom Mody wrote:
On Thursday, February 26, 2015 at 10:33:44 PM UTC+5:30, Terry Reedy wrote:
You should add emoticons, but not call them or the above 'gibberish'.
I think that this part of your post is more 'unprofessional' than the
character blocks. It is very jarring a
On Wed, Mar 4, 2015 at 5:03 AM, Rustom Mody wrote:
> What I was trying to say expanded here
> http://blog.languager.org/2015/03/whimsical-unicode.html
> [Hope the word 'whimsical' is less jarring and more accurate than
> 'gibberish']
Re footnote #4: ½ is a single character for compatibility rea
On Thursday, February 26, 2015 at 10:33:44 PM UTC+5:30, Terry Reedy wrote:
> On 2/26/2015 8:24 AM, Chris Angelico wrote:
> > On Thu, Feb 26, 2015 at 11:40 PM, Rustom Mody wrote:
> >> Wrote something up on why we should stop using ASCII:
> >> http://blog.languager.org/2015/02/universal-unicode.html
On Sat, 28 Feb 2015 04:45:04 +1100, Chris Angelico wrote:
> Perhaps, but on the other hand, the skill of squeezing code into less
> memory is being replaced by other skills. We can write code that takes
> the simple/dumb approach, let it use an entire megabyte of memory, and
> not care about the co
On Fri, 27 Feb 2015 19:14:00 +, MRAB wrote:
>>
> I suppose you could load the basic parts first so that the user can
> start working, and then load the additional features in the background.
>
quite possible
my opinion on this is very fluid
it may work for some applications, it probably would
On Sat, Feb 28, 2015 at 7:52 AM, Dave Angel wrote:
> If that's the case on the architectures you're talking about, then the
> problem of slow loading is not triggered by the memory usage, but by lots of
> initialization code. THAT's what should be deferred for seldom-used
> portions of code.
s/s
On 02/27/2015 11:00 AM, alister wrote:
On Sat, 28 Feb 2015 01:22:15 +1100, Chris Angelico wrote:
If you're trying to use the pagefile/swapfile as if it's more memory ("I
have 256MB of memory, but 10GB of swap space, so that's 10GB of
memory!"), then yes, these performance considerations are hu
On 2015-02-27 16:45, alister wrote:
On Sat, 28 Feb 2015 03:12:16 +1100, Chris Angelico wrote:
On Sat, Feb 28, 2015 at 3:00 AM, alister
wrote:
I think there is a case for bringing back the overlay file, or at least
loading larger programs in sections only loading the routines as they
are requi
On 2015-02-27, Grant Edwards wrote:
> On 2015-02-27, Steven D'Aprano wrote:
> Dave Angel wrote:
>>> On 02/27/2015 12:58 AM, Steven D'Aprano wrote: Dave Angel wrote:
> (Although I believe Seymour Cray was quoted as saying that virtual
> memory is a crock, because "you can't fake what
On 2015-02-27, Steven D'Aprano wrote:
> Dave Angel wrote:
>
>> On 02/27/2015 12:58 AM, Steven D'Aprano wrote:
>>> Dave Angel wrote:
>>>
>>>> (Although I believe Seymour Cray was quoted as saying that virtual
>>>> memory is a crock, because "you can't fake what you ain't got.")
>>>
>>> If I recall
On Sat, Feb 28, 2015 at 3:45 AM, alister
wrote:
> On Sat, 28 Feb 2015 03:12:16 +1100, Chris Angelico wrote:
>
>> On Sat, Feb 28, 2015 at 3:00 AM, alister
>> wrote:
>>> I think there is a case for bringing back the overlay file, or at least
>>> loading larger programs in sections only loading the
On Sat, 28 Feb 2015 03:12:16 +1100, Chris Angelico wrote:
> On Sat, Feb 28, 2015 at 3:00 AM, alister
> wrote:
>> I think there is a case for bringing back the overlay file, or at least
>> loading larger programs in sections only loading the routines as they
>> are required could speed up the star
On Sat, Feb 28, 2015 at 3:00 AM, alister
wrote:
> I think there is a case for bringing back the overlay file, or at least
> loading larger programs in sections
> only loading the routines as they are required could speed up the start
> time of many large applications.
> examples libre office, I ra
On Sat, 28 Feb 2015 01:22:15 +1100, Chris Angelico wrote:
>
> If you're trying to use the pagefile/swapfile as if it's more memory ("I
> have 256MB of memory, but 10GB of swap space, so that's 10GB of
> memory!"), then yes, these performance considerations are huge. But
> suppose you need to run
On 02/27/2015 09:22 AM, Chris Angelico wrote:
On Sat, Feb 28, 2015 at 1:02 AM, Dave Angel wrote:
The term "virtual memory" is used for many aspects of the modern memory
architecture. But I presume you're using it in the sense of "running in a
swapfile" as opposed to running in physical RAM.
On Sat, Feb 28, 2015 at 1:02 AM, Dave Angel wrote:
> The term "virtual memory" is used for many aspects of the modern memory
> architecture. But I presume you're using it in the sense of "running in a
> swapfile" as opposed to running in physical RAM.
Given that this started with a quote about "