Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Antoine Pitrou
Michael Foord writes:
> 
> Earlier this year I was at a pypy sprint helping to work on Python 2.7
> compatibility. The bytearray type has much of the string interface, including
> swapcase… So there was effort to implement this method with the correct
> semantics for pypy. Doubtless the same has been true for IronPython, and will
> also be true for Jython.

While I haven't used swapcase() a single time, I doubt there is much difficulty
in implementing pure ASCII semantics, is there?
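
As a rough illustration of what "pure ASCII semantics" could amount to for a bytes-like type, here is a minimal sketch. The helper name ascii_swapcase is hypothetical and is not taken from PyPy or CPython; it only flips bit 0x20 on ASCII letters and leaves every other byte alone.

```python
def ascii_swapcase(data):
    """Swap the case of ASCII letters only; all other bytes pass through."""
    out = bytearray(data)
    for i, byte in enumerate(out):
        # 'A'..'Z' is 0x41..0x5A and 'a'..'z' is 0x61..0x7A; the two
        # ranges differ only in bit 0x20.
        if 0x41 <= byte <= 0x5A or 0x61 <= byte <= 0x7A:
            out[i] = byte ^ 0x20
    return bytes(out)


sample = b"Hello, World 42! \xe9"
swapped = ascii_swapcase(sample)
```

On this sample the sketch agrees with the built-in bytes.swapcase(), and applying it twice gives back the input, which is not guaranteed for str.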

Regards

Antoine.


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] cpython: Issue #12567: Add curses.unget_wch() function

2011-09-06 Thread Victor Stinner

On 06/09/2011 07:50, Antoine Pitrou wrote:
> On Tue, 06 Sep 2011 01:53:32 +0200
> victor.stinner wrote:
>> http://hg.python.org/cpython/rev/b1e03d10391e
>> changeset:   72297:b1e03d10391e
>> user:        Victor Stinner
>> date:        Tue Sep 06 01:53:03 2011 +0200
>> summary:
>>   Issue #12567: Add curses.unget_wch() function
>>
>> Push a character so the next get_wch() will return it.
>
> Looks like you broke many buildbots.


Oh, thanks for notifying me. I expected failures, but I also forgot to skip 
the test if the function is missing.
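
For readers unfamiliar with the idiom, skipping a test when an optional function is missing might look roughly like the sketch below. The test class and method names are made up for illustration and are not the ones in the actual changeset.

```python
import unittest

try:
    import curses
except ImportError:  # e.g. a platform without the curses extension
    curses = None

# True only when the module imported and exposes unget_wch().
HAVE_UNGET_WCH = curses is not None and hasattr(curses, "unget_wch")


class UngetWchTest(unittest.TestCase):
    @unittest.skipUnless(HAVE_UNGET_WCH, "curses.unget_wch() is not available")
    def test_unget_wch_exists(self):
        # A real test would need an initialised screen; this only checks
        # that the attribute is callable.
        self.assertTrue(callable(curses.unget_wch))
```

On a platform without the function the test is reported as skipped rather than failed, so the buildbot stays green.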


I wrote a huge patch for this module to improve Unicode support, but I 
chose to split it into smaller patches. Since a single function broke 
most buildbots, that was a good idea :-)


Victor


Re: [Python-Dev] [Python-checkins] cpython (3.2): Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null character

2011-09-06 Thread Victor Stinner

On 06/09/2011 02:25, Nick Coghlan wrote:
> On Tue, Sep 6, 2011 at 10:01 AM, victor.stinner wrote:
>> Fix also spelling of the null character.
>
> While these cases are legitimately changed to 'null' (since they're
> lowercase descriptions of the character), I figure it's worth
> mentioning again that the ASCII name for '\0' actually *is* NUL (i.e.
> only one 'L'). Strange, but true [1].
>
> Cheers,
> Nick.
>
> [1] https://secure.wikimedia.org/wikipedia/en/wiki/ASCII


"NUL" is an abbreviation used in tables when you don't have enough space 
to write the full name: "null character".


Where do you want to mention this abbreviation?

Victor


Re: [Python-Dev] [Python-checkins] cpython (3.2): Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null character

2011-09-06 Thread Nick Coghlan
On Tue, Sep 6, 2011 at 6:04 PM, Victor Stinner
 wrote:
> "NUL" is an abbreviation used in tables when you don't have enough space to
> write the full name: "null character".

Yep, fair description.

> Where do you want to mention this abbreviation?

Sorry, I meant worth mentioning on the list, not anywhere particular
in the docs  - the topic came up recently when an instance of NUL was
incorrectly changed to read 'NULL' instead and it took me a moment to
figure out why the same reasoning *didn't* apply in this case.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


[Python-Dev] bigmemtests for really big memory too slow

2011-09-06 Thread martin

I benchmarked some of the bigmemtests when run with -M 80G. They run really
slowly, because they try to use all available memory, and then take a lot of
time processing it. Here are some runtimes:

test_capitalize (test.test_bigmem.StrTest) ... ok (420.490846s)
test_center (test.test_bigmem.StrTest) ... ok (149.431523s)
test_compare (test.test_bigmem.StrTest) ... ok (200.181986s)
test_concat (test.test_bigmem.StrTest) ... ok (154.282903s)
test_contains (test.test_bigmem.StrTest) ... ok (173.960073s)
test_count (test.test_bigmem.StrTest) ... ok (186.799731s)
test_encode (test.test_bigmem.StrTest) ... ok (53.752823s)
test_encode_ascii (test.test_bigmem.StrTest) ... ok (8.421414s)
test_encode_raw_unicode_escape (test.test_bigmem.StrTest) ... ok (3.752774s)
test_encode_utf32 (test.test_bigmem.StrTest) ... ok (9.732829s)
test_encode_utf7 (test.test_bigmem.StrTest) ... ok (4.998805s)
test_endswith (test.test_bigmem.StrTest) ... ok (208.022452s)
test_expandtabs (test.test_bigmem.StrTest) ... ok (614.490436s)
test_find (test.test_bigmem.StrTest) ... ok (230.722848s)
test_format (test.test_bigmem.StrTest) ... ok (407.471929s)
test_hash (test.test_bigmem.StrTest) ... ok (325.906271s)

In the test suite, we have the bigmemtest and precisionbigmemtest
decorators. I think bigmemtest cases should all be changed to
precisionbigmemtest, giving sizes of just above 2**31. With that
change, the runtime for test_capitalize would go down to 42s.

What do you think?

Regards,
Martin





Re: [Python-Dev] bigmemtests for really big memory too slow

2011-09-06 Thread Antoine Pitrou

Hello Martin,

> In the test suite, we have the bigmemtest and precisionbigmemtest
> decorators. I think bigmemtest cases should all be changed to
> precisionbigmemtest, giving sizes of just above 2**31. With that
> change, the runtime for test_capitalize would go down to 42s.

I have started working on this and other things in
http://hg.python.org/sandbox/antoine/, branch "bigmem".

I was planning to propose the same thing, which indeed makes tests pass
much more quickly, but I was waiting to try and solve some other
crashes in test_bigmem.

Regards

Antoine.




Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Joao S. O. Bueno
On Mon, Sep 5, 2011 at 8:56 AM, Michael Foord  wrote:
> Hey all,
> A while ago there was a discussion of the value of apis like str.swapcase,
> and it was suggested that even though it was acknowledged to be useless the
> effort of deprecating and removing it was thought to be more than the value
> in removing it.
> Earlier this year I was at a pypy sprint helping to work on Python 2.7
> compatibility. The bytearray type has much of the string interface,
> including swapcase… So there was effort to implement this method with the
> correct semantics for pypy. Doubtless the same has been true for IronPython,
> and will also be true for Jython.
> Whilst it is too late for Python 2.x, it *is* (in my opinion) worth removing
> unused and unneeded APIs. Even if the effort to remove them is more than any
> effort saved on the part of users it helps other implementations down the
> road that no longer need to provide these APIs.
> All the best,
> Michael Foord
>

On the other hand, for any users wanting to use this in the future, if it
is not there, they'd have to implement the logic for themselves. If it is
a "burden" for someone in a sprint, looking at other implementations, and
with all the Unicode knowledge/documentation around, it would be pretty
much undoable in the correct way by a casual user. Removing it would
mean explicitly "batteries removal".

If you get some traction on that, at least consider moving it to a
pure python function on the string module.
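
To make the suggestion concrete, a pure-Python version could be sketched as follows. The function name and its placement are hypothetical, and no claim is made that it matches str.swapcase() in every Unicode corner case.

```python
def swapcase(text):
    """Return text with upper-case characters lowered and vice versa."""
    pieces = []
    for char in text:
        if char.isupper():
            pieces.append(char.lower())
        elif char.islower():
            pieces.append(char.upper())
        else:
            # Digits, punctuation and caseless scripts are left untouched.
            pieces.append(char)
    return "".join(pieces)


example = swapcase("Hello, World")
```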


  js
 -><-

> --
> http://www.voidspace.org.uk/
>
> May you do good and not evil
> May you find forgiveness for yourself and forgive others
> May you share freely, never taking more than you give.
> -- the sqlite blessing http://www.sqlite.org/different.html
>


Re: [Python-Dev] bigmemtests for really big memory too slow

2011-09-06 Thread Antoine Pitrou

For the record, I've disabled automatic builds on the bigmem buildbot
until things get sorted out a bit (no need to eat huge amounts of RAM
and eight hours of CPU each time a commit is pushed, only to have the
process killed :-)). It's still possible to run custom builds, of
course.

Regards

Antoine.



On Tue, 06 Sep 2011 15:03:32 +0200
[email protected] wrote:
> I benchmarked some of the bigmemtests when run with -M 80G. They run really
> slow, because they try to use all available memory, and then take a lot of
> time processing it. Here are some runtimes:
> 
> test_capitalize (test.test_bigmem.StrTest) ... ok (420.490846s)
> test_center (test.test_bigmem.StrTest) ... ok (149.431523s)
> test_compare (test.test_bigmem.StrTest) ... ok (200.181986s)
> test_concat (test.test_bigmem.StrTest) ... ok (154.282903s)
> test_contains (test.test_bigmem.StrTest) ... ok (173.960073s)
> test_count (test.test_bigmem.StrTest) ... ok (186.799731s)
> test_encode (test.test_bigmem.StrTest) ... ok (53.752823s)
> test_encode_ascii (test.test_bigmem.StrTest) ... ok (8.421414s)
> test_encode_raw_unicode_escape (test.test_bigmem.StrTest) ... ok (3.752774s)
> test_encode_utf32 (test.test_bigmem.StrTest) ... ok (9.732829s)
> test_encode_utf7 (test.test_bigmem.StrTest) ... ok (4.998805s)
> test_endswith (test.test_bigmem.StrTest) ... ok (208.022452s)
> test_expandtabs (test.test_bigmem.StrTest) ... ok (614.490436s)
> test_find (test.test_bigmem.StrTest) ... ok (230.722848s)
> test_format (test.test_bigmem.StrTest) ... ok (407.471929s)
> test_hash (test.test_bigmem.StrTest) ... ok (325.906271s)
> 
> In the test suite, we have the bigmemtest and precisionbigmemtest
> decorators. I think bigmemtest cases should all be changed to
> precisionbigmemtest, giving sizes of just above 2**31. With that
> change, the runtime for test_capitalize would go down to 42s.
> 
> What do you think?
> 
> Regards,
> Martin
> 
> 
> 




Re: [Python-Dev] [Python-checkins] cpython (3.2): Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null character

2011-09-06 Thread Tres Seaver

On 09/06/2011 04:04 AM, Victor Stinner wrote:
> Le 06/09/2011 02:25, Nick Coghlan a écrit :
>> On Tue, Sep 6, 2011 at 10:01 AM, victor.stinner 
>>   wrote:
>>> Fix also spelling of the null character.
>> 
>> While these cases are legitimately changed to 'null' (since
>> they're lowercase descriptions of the character), I figure it's
>> worth mentioning again that the ASCII name for '\0' actually *is*
>> NUL (i.e. only one 'L'). Strange, but true [1].
>> 
>> Cheers, Nick.
>> 
>> [1] https://secure.wikimedia.org/wikipedia/en/wiki/ASCII
> 
> "NUL" is an abbreviation used in tables when you don't have enough
> space to write the full name: "null character".
> 
> Where do you want to mention this abbreviation?

FWIW, the RFC 20 (the ASCII spec) really really defines 'NUL'  as the
*name* of the \0 character, not just an "abbreviation used in tables":

 http://tools.ietf.org/html/rfc20#section-5.2



Tres.
- -- 
===
Tres Seaver  +1 540-429-0999  [email protected]
Palladion Software   "Excellence by Design"http://palladion.com


Re: [Python-Dev] [Python-checkins] cpython: Issue #9561: packaging now writes egg-info files using UTF-8

2011-09-06 Thread Éric Araujo
On 06/09/2011 00:11, victor.stinner wrote:
> http://hg.python.org/cpython/rev/56ab3257ca13
> changeset:   72296:56ab3257ca13
> user:        Victor Stinner
> date:        Tue Sep 06 00:11:13 2011 +0200
> summary:
>   Issue #9561: packaging now writes egg-info files using UTF-8
> 
> instead of the locale encoding

>  
>      def _distutils_pkg_info(self):
>          tmp = self._distutils_setup_py_pkg()
> -        self.write_file([tmp, 'PKG-INFO'], '')
> +        self.write_file([tmp, 'PKG-INFO'], '', encoding='UTF-8')

This function is writing an empty string; isn’t it the same bytes in
UTF-8 or in the locale encoding?  (Are there people that use encodings
with BOMs as locale? *shudders*)
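
The question can be checked with a quick sketch. The codec list below is an arbitrary selection chosen for illustration; "utf-8-sig" stands in for the BOM-writing case.

```python
import codecs

empty = ""

# Ordinary codecs produce no bytes at all for an empty string.
plain_codecs = ["utf-8", "ascii", "latin-1", "cp1252"]
encoded = {name: empty.encode(name) for name in plain_codecs}

# A codec that writes a signature emits the BOM even for empty input.
with_bom = empty.encode("utf-8-sig")
```

So the output only differs when the chosen encoding prepends a byte order mark.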


Re: [Python-Dev] [Python-checkins] cpython: Issue #9561: packaging now writes egg-info files using UTF-8

2011-09-06 Thread Victor Stinner

On 06/09/2011 17:17, Éric Araujo wrote:
> On 06/09/2011 00:11, victor.stinner wrote:
>> http://hg.python.org/cpython/rev/56ab3257ca13
>> changeset:   72296:56ab3257ca13
>> user:        Victor Stinner
>> date:        Tue Sep 06 00:11:13 2011 +0200
>> summary:
>>   Issue #9561: packaging now writes egg-info files using UTF-8
>>
>> instead of the locale encoding
>
>>      def _distutils_pkg_info(self):
>>          tmp = self._distutils_setup_py_pkg()
>> -        self.write_file([tmp, 'PKG-INFO'], '')
>> +        self.write_file([tmp, 'PKG-INFO'], '', encoding='UTF-8')
>
> This function is writing an empty string; isn’t it the same bytes in
> UTF-8 or in the locale encoding?


This patch is just cosmetic: it doesn't change anything (except that 
TextIOWrapper doesn't have to temporarily change the locale to get the 
locale encoding).


Victor


Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Stephen J. Turnbull
Joao S. O. Bueno writes:

 > Removing it would mean explicitly "batteries removal".

That's what we usually do with a dead battery, no?


Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Tres Seaver

On 09/06/2011 12:59 PM, Stephen J. Turnbull wrote:
> Joao S. O. Bueno writes:
> 
>> Removing it would mean explicitly "batteries removal".
> 
> That's what we usually do with a dead battery, no?

Normally one "replaces" dead batteries. :)



Tres.
- -- 
===
Tres Seaver  +1 540-429-0999  [email protected]
Palladion Software   "Excellence by Design"http://palladion.com


Re: [Python-Dev] [Python-checkins] cpython (3.2): Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null character

2011-09-06 Thread Terry Reedy

On 9/6/2011 11:11 AM, Tres Seaver wrote:

> FWIW, the RFC 20 (the ASCII spec) really really defines 'NUL'  as the
> *name* of the \0 character, not just an "abbreviation used in tables":
>
>   http://tools.ietf.org/html/rfc20#section-5.2


As I read the text, the 2 or 3 capital letter *symbols* are 
abbreviations of the names. Looking back up, I see

'''
4. Legend
4.1 Control Characters
   NUL Null                DLE Data Link Escape (CC)
...
4.2 Graphic Characters
   Column/Row  Symbol  Name
   2/0         SP      Space (Normally Non-Printing)
   2/1         !       Exclamation Point
'''
'NUL' and 'SP' are *symbols* that have the names 'Null' and 'Space', 
just as the symbol '!' is named 'Exclamation Point'. They just happen to 
be digraphs and trigraphs composed of 2 or 3 characters.


I am sure that the symbol SP does not appear in the docs. The symbol 
'LF' (for LineFeed) probably does not either. We just call it 'newline' 
or 'newline character' as that is how we use it.


--
Terry Jan Reedy



Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Terry Reedy

On 9/6/2011 12:58 PM, Tres Seaver wrote:
> On 09/06/2011 12:59 PM, Stephen J. Turnbull wrote:
>> Joao S. O. Bueno writes:
>>
>>> Removing it would mean explicitly "batteries removal".
>>
>> That's what we usually do with a dead battery, no?
>
> Normally one "replaces" dead batteries. :)


Not if it is dead and leaking because the device has been unused for years.

https://www.google.com/codesearch#search/&q=lang:^python$%20swapcase%20case:yes&type=cs

returns a mere 300 hits. At least half are definitions of the function, 
or tests thereof, or inclusions in lists. Some actual uses:


1. http://pytof.googlecode.com/svn/trunk/pytof/utils.py

def ListCurrentDirFileFromExt(ext, path):
    """ list file matching extension from a list
    in the current directory
    emulate a `ls *.{(',').join(ext)` with ext in both upper and 
    downcase}"""

    import glob
    extfiles = []
    for e in ext:
        extfiles.extend(glob.glob(join(path,'*' + e)))
        extfiles.extend(glob.glob(join(path,'*' + e.swapcase())))

If e is all upper or lower, using e.upper() and e.lower() will do the same. 
If e is mixed, using .upper and .lower is required to fulfill the spec. 
On *nix, where matching of letters is case sensitive, both will fail 
with '.Jpg'. On Windows, where letter matching ignores case, the above 
code will list everything twice.
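
A small sketch of the difference, using hypothetical helper names and made-up file names, with fnmatch.fnmatchcase standing in for case-sensitive matching:

```python
import fnmatch


def patterns_swapcase(ext):
    """Patterns as built by the quoted code: ext and ext.swapcase()."""
    return ["*" + ext, "*" + ext.swapcase()]


def patterns_upper_lower(ext):
    """Patterns covering the all-lower and all-upper spellings."""
    return ["*" + ext.lower(), "*" + ext.upper()]


def matches(filename, patterns):
    # fnmatchcase never folds case, whatever the platform.
    return any(fnmatch.fnmatchcase(filename, p) for p in patterns)


mixed = ".Jpg"
swap_patterns = patterns_swapcase(mixed)
case_patterns = patterns_upper_lower(mixed)
```

With a mixed-case extension the swapcase variant produces '*.Jpg' and '*.jPG', so it misses both 'photo.jpg' and 'photo.JPG'.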


2. http://ydict.googlecode.com/svn/trunk/ydict
k is a random word from the database.

result.replace(k, "").replace(k.upper(), "").replace(k[0].swapcase()+k[1:].lower(), "")


If k is lowercase, .lower() is redundant and 
k[0].swapcase()+k[1:].lower() == k.title(). If k is uppercase, previous 
.upper() is redundant. If k is mixed case, code may have problems.


3. http://migrid.googlecode.com/svn/trunk/mig/sftp-mount/migaccess.py

#This is how we could add stub extended attribute handlers...
#(We can't have ones which aptly delegate requests to the underlying fs
#because Python lacks a standard xattr interface.)
#
#def getxattr(self, path, name, size):
#    val = name.swapcase() + '@' + path
#    if size == 0:
#        # We are asked for size of the value.
#        return len(val)
#    return val

This is not actually used. Passing a name with all cases swapped from 
what they should be is a bit strange.


4.
elif char >= 'A' and  char <= 'Z':
    element = element + char.swapcase()

uppercasechar.swapcase() == uppercasechar.lower()


My perusal of the first 70 of 300 hits suggests that .swapcase is more 
of an attractive nuisance or redundant rather than actually useful.


--
Terry Jan Reedy



Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Steven D'Aprano

Terry Reedy wrote:
> On 9/6/2011 12:58 PM, Tres Seaver wrote:
>> On 09/06/2011 12:59 PM, Stephen J. Turnbull wrote:
>>> Joao S. O. Bueno writes:
>>>
>>>> Removing it would mean explicitly "batteries removal".
>>>
>>> That's what we usually do with a dead battery, no?
>>
>> Normally one "replaces" dead batteries. :)
>
> Not if it is dead and leaking because the device has been unused for years.



Can we please not make decisions about what code should be removed based 
on dodgy analogies? :)


Perhaps I missed something early on, but why are we proposing removing a 
function which (presumably) is stable and tested and works and is not 
broken? What maintenance is needed here?



> [...]
> If k is lowercase, .lower() is redundant and 
> k[0].swapcase()+k[1:].lower() == k.title(). 


Not so.

>>> k = 'aaaa bbbb'
>>> k.title()
'Aaaa Bbbb'
>>> k[0].swapcase()+k[1:].lower()
'Aaaa bbbb'
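
For what it's worth, for lowercase input the ydict expression appears to line up with str.capitalize() rather than str.title(). A minimal check, using made-up sample strings and a hypothetical helper name:

```python
def ydict_variant(k):
    """The expression from the ydict snippet, wrapped for comparison."""
    return k[0].swapcase() + k[1:].lower()


phrase = "spam eggs"
variant = ydict_variant(phrase)
capitalized = phrase.capitalize()
titled = phrase.title()
```

Only the first letter of the whole string is raised by the expression, whereas title() raises the first letter of every word.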


> If k is uppercase, previous 
> .upper() is redundant. If k is mixed case, code may have problems.


"May" have problems?


pERSONNALLY, i THINK THAT A SWAPCASE COMMAND IS ESSENTIAL FOR TEXT 
EDITOR APPLICATIONS, TO AVOID THOSE LITTLE cAPS lOCK ACCIDENTS.




--
Steven


Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Michael Foord
On 6 Sep 2011, at 20:36, Steven D'Aprano wrote:
> Terry Reedy wrote:
>> On 9/6/2011 12:58 PM, Tres Seaver wrote:
>>> -BEGIN PGP SIGNED MESSAGE-
>>> Hash: SHA1
>>> 
>>> On 09/06/2011 12:59 PM, Stephen J. Turnbull wrote:
>>>> Joao S. O. Bueno writes:
>>>> 
>>>>> Removing it would mean explicitly "batteries removal".
>>>> 
>>>> That's what we usually do with a dead battery, no?
>>> 
>>> Normally one "replaces" dead batteries. :)
>> Not if it is dead and leaking because the device has been unused for years.
> 
> 
> Can we please not make decisions about what code should be removed based on 
> dodgy analogies? :)
> 
> Perhaps I missed something early on, but why are we proposing removing a 
> function which (presumably) is stable and tested and works and is not broken? 
> What maintenance is needed here?


The maintenance burden is on other implementations. Even if there is no 
maintenance burden for CPython, having useless methods simply because it is 
less effort to leave them in place creates work for new implementations wanting 
to be fully compatible. 

> 
> 
> [...]
>> If k is lowercase, .lower() is redundant and k[0].swapcase()+k[1:].lower() 
>> == k.title(). 
> 
> Not so.
> 
> >>> k = 'aaaa bbbb'
> >>> k.title()
> 'Aaaa Bbbb'
> >>> k[0].swapcase()+k[1:].lower()
> 'Aaaa bbbb'
> 
> 
>> If k is uppercase, previous .upper() is redundant. If k is mixed case, code 
>> may have problems.
> 
> "May" have problems?
> 
> 
> pERSONNALLY, i THINK THAT A SWAPCASE COMMAND IS ESSENTIAL FOR TEXT EDITOR 
> APPLICATIONS, TO AVOID THOSE LITTLE cAPS lOCK ACCIDENTS.


Have you ever used str.swapcase for that purpose?

Michael


--
http://www.voidspace.org.uk/


May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing 
http://www.sqlite.org/different.html








Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Fred Drake
On Tue, Sep 6, 2011 at 3:36 PM, Steven D'Aprano  wrote:
> pERSONNALLY, i THINK THAT A SWAPCASE COMMAND IS ESSENTIAL FOR TEXT EDITOR
> APPLICATIONS, TO AVOID THOSE LITTLE cAPS lOCK ACCIDENTS.

There's a better solution to that, but the caps lock lobby has a stranglehold
on keyboard manufacturers.


-- 
Fred L. Drake, Jr.    
"A person who won't read has no advantage over one who can't read."
   --Samuel Langhorne Clemens


Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Barry Warsaw
On Sep 06, 2011, at 03:42 PM, Fred Drake wrote:

>On Tue, Sep 6, 2011 at 3:36 PM, Steven D'Aprano  wrote:
>> pERSONNALLY, i THINK THAT A SWAPCASE COMMAND IS ESSENTIAL FOR TEXT EDITOR
>> APPLICATIONS, TO AVOID THOSE LITTLE cAPS lOCK ACCIDENTS.
>
>There's a better solution to that, but the caps lock lobby has a stranglehold
>on keyboard manufacturers.

Fight The Man with xmodmap!

-Barry


Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Martin v. Löwis
>> Perhaps I missed something early on, but why are we proposing
>> removing a function which (presumably) is stable and tested and
>> works and is not broken? What maintenance is needed here?
> 
> 
> The maintenance burden is on other implementations.

It's not a maintenance burden (at least not in the sense in which
I understand the word "maintenance" - as an ongoing effort). When
they implement it once, the implementation can likely stay forever,
unmodified.

> Even if there is
> no maintenance burden for CPython having useless methods simply
> because  it is less effort to leave them in place creates work for
> new implementations wanting to be fully compatible.

That's true.

However, that alone is not enough reason to remove the feature, IMO.
The effort that is saved is not only on the developers of CPython,
but also on users of the feature. My claim is that for any little-used
feature, removing it costs more time world-wide than re-implementing
it in 10 alternative Python implementations (with the number 10 drawn
out of blue air), because of the cost of changing the applications that
actually do use the feature.

With the switch to Python 3, there would have been a chance to remove
little-used features. IMO, the next such chance is with Python 4.
It could be useful to start collecting little-used features that might
be removed with Python 4 - which I don't expect until 2020.

Regards,
Martin


Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Michael Foord
On 6 Sep 2011, at 21:18, Martin v. Löwis wrote:
>>> Perhaps I missed something early on, but why are we proposing
>>> removing a function which (presumably) is stable and tested and
>>> works and is not broken? What maintenance is needed here?
>> 
>> 
>> The maintenance burden is on other implementations.
> 
> It's not a maintenance burden (at least not in the sense in which
> I understand the word "maintenance" - as an ongoing effort). When
> they implement it once, the implementation can likely stay forever,
> unmodified.

Ok, burden rather than "maintenance" burden.

> 
>> Even if there is
>> no maintenance burden for CPython having useless methods simply
>> because  it is less effort to leave them in place creates work for
>> new implementations wanting to be fully compatible.
> 
> That's true.
> 
> However, that alone is not enough reason to remove the feature, IMO.
> The effort that is saved is not only on the developers of CPython,
> but also on users of the feature. My claim is that for any little-used
> feature, removing it costs more time world-wide than re-implementing
> it in 10 alternative Python implementations (with the number 10 drawn
> out of blue air), because of the cost of changing the applications that
> actually do use the feature.
> 

Which applications? I'm not sure the number of applications using str.swapcase 
gets even as high as ten.

> With the switch to Python 3, there would have been a chance to remove
> little-used features. IMO, the next such chance is with Python 4.
> It could be useful to start collecting little-used features that might
> be removed with Python 4 - which I don't expect until 2020.

We still have our standard deprecation policy that we can follow in Python 3. 
We don't have to wait until Python 4 to remove things. Changing semantics or 
syntax is harder because you can't really deprecate. Just removing methods is 
straightforward.
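
As a sketch of what the first step of such a deprecation usually looks like in Python code: the Text class below is purely hypothetical and only illustrates emitting a DeprecationWarning from a method scheduled for removal.

```python
import warnings


class Text:
    """Hypothetical wrapper type, used only to illustrate deprecation."""

    def __init__(self, value):
        self.value = value

    def swapcase(self):
        # stacklevel=2 attributes the warning to the caller's line.
        warnings.warn(
            "Text.swapcase() is deprecated and may be removed",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.value.swapcase()


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = Text("Hello").swapcase()
```

The method keeps working during the deprecation period; callers just see the warning.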

Michael

> 
> Regards,
> Martin
> 






Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Martin v. Löwis
> Which applications? I'm not sure the number of applications using
> str.swapcase gets even as high as ten.

I think this is what people underestimate. I can't name
applications either - but that doesn't mean they don't exist.
I'm deeply convinced that the majority of Python code (and
I mean *large* majority) is unpublished.

I expect thousands of uses world-wide.

> We still have our standard deprecation policy that we can follow in
> Python 3. We don't have to wait until Python 4 to remove things.

That's true. However, part of the deprecation procedure is also that
there should be a rationale for removing it. In the past, things have
been removed that had been superseded with something new, or things
that had been flawed in their design so that fixing it wasn't really
possible, or that did indeed cause ongoing maintenance effort for
a minority of users (such as the support for little-used platforms).

None of these motivations hold for str.swapcase, and I think the
"other implementations will have to implement it" is not sufficient
motivation. If the other implementations believe that the feature
is truly useless and also not used, they just can declare it a
deliberate deviation from CPython, and refuse to implement it.

If I had to pick a truly useless feature, I'd kill complex numbers,
not str.swapcase.

Regards,
Martin



Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Raymond Hettinger

On Sep 6, 2011, at 1:36 PM, Martin v. Löwis wrote:

> I think this is what people underestimate. I can't name
> applications either - but that doesn't mean they don't exist.

Google code search is a pretty good indicator that this method
has near zero uptake.   If it dies, I don't think anyone will cry.


Raymond


Re: [Python-Dev] [Python-checkins] cpython (3.2): Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null character

2011-09-06 Thread Greg Ewing

Victor Stinner wrote:

> "NUL" is an abbreviation used in tables when you don't have enough space 
> to write the full name: "null character".


It's also the official name of the character, for when you want
to be unambiguous about what you mean (e.g. "null character" as
opposed to "empty string" or "null pointer").

I expect it's 3 chars for consistency with all the other control
character names.
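
A short sketch of the distinction, plus the off-by-one that the commit in the subject line documents. ctypes is used here only as a convenient stand-in for a C wide-character buffer; it is not the C API function named in the subject.

```python
import ctypes

nul = "\0"          # the NUL character: one character, code point 0
empty = ""          # the empty string: no characters at all

text = "abc"
# create_unicode_buffer() allocates one extra item for the terminator,
# so the buffer is longer than the string it holds.
buf = ctypes.create_unicode_buffer(text)
terminator = buf[len(text)]
```

The string length excludes the terminating NUL, while the buffer size includes it.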

--
Greg


Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Nick Coghlan
On Wed, Sep 7, 2011 at 7:23 AM, Raymond Hettinger wrote:
> On Sep 6, 2011, at 1:36 PM, Martin v. Löwis wrote:
>
>> I think this is what people underestimate. I can't name
>> applications either - but that doesn't mean they don't exist.
>
> Google code search is a pretty good indicator that this method
> has near zero uptake.  If it dies, I don't think anyone will cry.

For str itself, I'm -0 on removing it - the Unicode implications mean
implementation isn't completely trivial and there's at least one
legitimate use case (i.e. providing, or deliberately reversing, Caps
Lock style functionality).
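
To make the "Unicode implications" concrete, here is a minimal sketch. It assumes CPython 3.3 or later, where the str case methods use full Unicode case mappings; older interpreters may give different results. The sample word is only an illustration.

```python
# Assumes CPython 3.3+ (full Unicode case mappings for str methods).
s = "Straße"

once = s.swapcase()    # the lowercase sharp s has no one-character uppercase form
twice = once.swapcase()

print(once)            # the string grows by one character
print(twice)           # swapping back does not restore the original
print(twice == s)
```

So swapcase() is not its own inverse once non-ASCII letters are involved, which is part of why a correct str implementation is more than a table lookup.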

However, a big +1 for deprecation in the case of bytes and bytearray.
That's nothing to do with the maintenance burden though, it's to do
with the semantic confusion between binary data and ASCII-encoded text
implied by the retention of methods like upper(), lower() and
swapcase().

Specifically, the methods I consider particularly problematic on that front are:
 'capitalize'
 'islower'
 'istitle'
 'isupper'
 'lower'
 'swapcase'
 'title'
 'upper'

These are all text operations, not something you do with binary data.

There are some other methods that make ASCII specific default
assumptions regarding whitespace and line separators, but ASCII
whitespace is often used as a delimiter in wire protocols so losing
those would be genuinely annoying. I've also left out the methods for
identifying ASCII letters and digits, since again, those are useful
for interpreting various wire encodings. The case-related methods,
though, have no place in sane wire protocol handling.
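
A short sketch of the distinction being drawn, using only built-in bytes methods. The sample request line and the Latin-1 sample text are made up for illustration.

```python
# 'é' encoded as Latin-1 is the single byte 0xE9, outside the ASCII range.
raw = "café au lait".encode("latin-1")

# Case methods on bytes only touch the bytes for ASCII letters.
upper = raw.upper()

# ASCII whitespace as a delimiter, and ASCII digit tests, as used in
# many wire formats.
fields = b"GET /index.html HTTP/1.1\r\n".split()
is_digits = b"200".isdigit()

print(upper)
print(fields)
print(is_digits)
```

The 0xE9 byte passes through upper() unchanged, because bytes has no idea which encoding, if any, the data is in.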

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Steven D'Aprano

Raymond Hettinger wrote:

> On Sep 6, 2011, at 1:36 PM, Martin v. Löwis wrote:
>
>> I think this is what people underestimate. I can't name
>> applications either - but that doesn't mean they don't exist.
>
> Google code search is a pretty good indicator that this method
> has near zero uptake.  If it dies, I don't think anyone will cry.


Near-zero is not zero, and Terry has already shown some examples of code 
which use, or misuse, swapcase.


In any case (pun intended *wink*) this was discussed in December and 
Guido expressed little enthusiasm for the idea:


http://mail.python.org/pipermail/python-dev/2010-December/106650.html

I can't exactly defend the existence of swapcase; it does seem to be a
fairly specialised function. But given that it exists, I'm -0.5 on
removal on the basis of "if it ain't broke, don't fix it".




--
Steven



Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Antoine Pitrou
On Wed, 7 Sep 2011 10:47:16 +1000
Nick Coghlan  wrote:
> 
> However, a big +1 for deprecation in the case of bytes and bytearray.
> That's nothing to do with the maintenance burden though, it's to do
> with the semantic confusion between binary data and ASCII-encoded text
> implied by the retention of methods like upper(), lower() and
> swapcase().

A big -1 on that.
Bytes objects are often used for partly ASCII strings, not arbitrary
"arrays of bytes". And making indexing of bytes objects return ints was
IMHO a mistake.

Besides, if you want an array of ints, there's already array.array()
with your typecode of choice. Not sure why other types should conform.
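
For readers who have not run into it, a minimal sketch of the indexing behaviour in question (Python 3 semantics), next to the array.array alternative. The variable names are illustrative only.

```python
import array

data = b"abc"

first = data[0]          # indexing a bytes object yields an int
first_slice = data[0:1]  # slicing yields a length-1 bytes object

# An explicit array of small unsigned ints, typecode "B".
as_array = array.array("B", data)

print(first, first_slice, as_array.tolist())
```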

Regards

Antoine.




Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Stephen J. Turnbull
Nick Coghlan writes:

 > However, a big +1 for deprecation in the case of bytes and bytearray.
 > That's nothing to do with the maintenance burden though, it's to do
 > with the semantic confusion between binary data and ASCII-encoded text
 > implied by the retention of methods like upper(), lower() and
 > swapcase().

[...]

 > These are all text operations, not something you do with binary data.

"Yea, Brother, Amen!"  I like the taste of this Kool-Aid.  But

 > The case-related methods, though, have no place in sane wire
 > protocol handling.

RFC 822 headers are a somewhat insane but venerable (isn't that true
of anything that's reached age 350 in dog-years?), and venerated,
counterexample.  Specifically, field names are case-insensitive (RFC
5322, section 1.2.2).  I'll bet you can find plenty of others if you
look.  You can call that "text" and say it should be processed in
Unicode, if you like, but you're not even going to convince me (and as
I say, I like the Kool-Aid).  Specifically, SMTP processes can (and
even MUST, under some circumstances IIRC) manipulate the RFC 822 header.
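
As a rough sketch of what case-insensitive field-name matching on raw bytes can look like: the helper name get_header and the sample header block are hypothetical, and the sketch deliberately ignores folded (continued) header lines.

```python
def get_header(raw_headers, name):
    """Return the value of the first header whose field name matches
    `name` case-insensitively, or None if there is no such header.

    Illustrative only: folded (multi-line) headers are not handled.
    """
    wanted = name.lower()
    for line in raw_headers.split(b"\r\n"):
        field, sep, value = line.partition(b":")
        if sep and field.strip().lower() == wanted:
            return value.strip()
    return None


headers = b"From: nick@example.org\r\nCONTENT-TYPE: text/plain\r\n"
print(get_header(headers, b"Content-Type"))
print(get_header(headers, b"Subject"))
```

No decoding step is needed here; lower() on bytes is enough to compare the ASCII field names.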

Sorry, Nick, no can do.

-1


Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Stephen J. Turnbull
Antoine Pitrou writes:

 > Bytes objects are often used for partly ASCII strings,

All I can say to that phrase is, "urk, ISO 2022 anyone?"

 > not arbitrary "arrays of bytes". And making indexing of bytes
 > objects return ints was IMHO a mistake.

Bytes objects are not ASCII strings, even though they can be used to
represent them.  The practice of using magic numbers that look like
English words is a useful one, but by the same token, it should not be
too easy to use bytes to represent *text* just because the programmer
doesn't know any words that don't fit into 7*N bits.  With PEP 393,
there isn't even really a space excuse.

AFAICS, anything that should be done with ASCII-punned magic numbers
("protocol tokens", if you prefer) can be done with slices and (ta-da!)
case conversion.  (Sorry, Nick!)  But the components of a bytes object
are just numbers; they are not characters until you've run them
through a codec.
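
A small sketch of that last paragraph, using a made-up SMTP-style command line: the token is matched with a slice plus case conversion, while individual components stay plain numbers until decoded.

```python
line = b"mail from:<user@example.org>\r\n"

# ASCII-punned protocol token: slice, normalise case, compare as bytes.
verb = line[:4].upper()
is_mail = verb == b"MAIL"

first_byte = line[0]                   # just a number
first_char = line[:1].decode("ascii")  # a character only after decoding

print(verb, is_mail, first_byte, first_char)
```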


Re: [Python-Dev] Maintenance burden of str.swapcase

2011-09-06 Thread Nick Coghlan
On Wed, Sep 7, 2011 at 11:53 AM, Stephen J. Turnbull  wrote:
> Nick Coghlan writes:
>  > The case-related methods, though, have no place in sane wire
>  > protocol handling.
>
> RFC 822 headers are a somewhat insane but venerable (isn't that true
> of anything that's reached age 350 in dog-years?), and venerated,
> counterexample.  Specifically, field names are case-insensitive (RFC
> 5322, section 1.2.2).  I'll bet you can find plenty of others if you
> look.  You can call that "text" and say it should be processed in
> Unicode, if you like, but you're not even going to convince me (and as
> I say, I like the Kool-Aid).  Specifically, SMTP processes can (and
> even MUST, under some circumstances IIRC) manipulate the RFC 822 header.
>
> Sorry, Nick, no can do.
>
> -1

Heh, I knew as soon as I sent that message that someone would be able
to point out a counter example. I agree that RFC 822 (and
case-insensitive ASCII comparison in general) is enough to save
lower() and upper() and co, but what about this even further reduced
list of text-specific methods:

 'capitalize'
 'istitle'
 'swapcase'
 'title'

While case-insensitive comparison makes sense for wire level data,
where do these methods fit in, even when embedded ASCII text fragments
are involved?

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


[Python-Dev] Deprecating bytes.swapcase and friends [was: Maintenance burden of str.swapcase]

2011-09-06 Thread Stephen J. Turnbull
This is all speculation and no hint of implementation at this point ...
redirecting this subthread to Python-Ideas.  Reply-To set accordingly.

Nick Coghlan writes:

 > Heh, I knew as soon as I sent that message that someone would be able
 > to point out a counter example. I agree that RFC 822 (and
 > case-insensitive ASCII comparison in general) is enough to save
 > lower() and upper() and co, but what about this even further reduced
 > list of text-specific methods:
 > 
 >  'capitalize'
 >  'istitle'
 >  'swapcase'
 >  'title'
 > 
 > While case-insensitive comparison makes sense for wire level data,
 > where do these methods fit in, even when embedded ASCII text fragments
 > are involved?

Well, 'capitalize' could theoretically be used to "beautify" RFC 822
field names, but realistically, to me they're a litmus test for
packages I probably don't want on my system.<0.5 wink>
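
For the record, a sketch of what such "beautifying" might look like on bytes. The helper name beautify_field_name is hypothetical, and the second example shows why the result is cosmetic at best.

```python
def beautify_field_name(name):
    """Capitalize each hyphen-separated part of a header field name."""
    return b"-".join(part.capitalize() for part in name.split(b"-"))


pretty = beautify_field_name(b"content-TYPE")
via_title = b"content-TYPE".title()        # same result for this input
odd = beautify_field_name(b"mime-version")  # conventionally "MIME-Version"

print(pretty, via_title, odd)
```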

I don't know if it's worth the effort to deprecate them, though.
There is a school of thought (represented on python-dev by Phillip Eby
and Antoine Pitrou, among others, I would say) that says that text
with an implicit encoding is still text if you can figure out what the
encoding is, and the syntactically important tokens are invariably
ASCII, which often is enough information to do the work.  So if you
can do some operation without first converting to str, let's save the
cycles and the bytes (especially in bit-shoveling applications like
WSGI)!  I disagree, but "consenting adults" and all that.

It occurs to me that the bit-shoveling applications would generally be
sufficiently well-served with a special "codec" that just stuffs the
data pointer in a bytes object into the latin1 member of the data
pointer union in a PEP 393 Unicode object, and marks the Unicode
object as "ascii-compatible", i.e., anything ASCII can be manipulated as
text, but anything non-ASCII is like a private character that Python
doesn't know anything about, and can't do anything useful with, except
delete or pass through verbatim (perhaps as a slice).

This may be nonsense; I don't know enough about Python internals to be
sure.  And it would be a change to PEP 393, since the encoding of the
8-bit representation would no longer be Unicode.  I wouldn't blame
Martin one bit if he hated the idea in principle!  On the other hand,
the "Latin-1 can be used to decode any binary content" end-around
makes that point moot IMO.  This would give a somewhat safer way of
doing that.
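
The Latin-1 end-around mentioned above can be sketched as follows. The last two lines are an added illustration, not something proposed in this thread, of why treating the result as real text is risky.

```python
# Every byte value maps to the code point with the same number.
blob = bytes(range(256))
text = blob.decode("latin-1")
round_trip = text.encode("latin-1")

print(round_trip == blob)

# The hazard: str methods now treat non-ASCII bytes as Latin-1 letters.
as_bytes = b"\xe9".upper()                                      # unchanged
as_text = b"\xe9".decode("latin-1").upper().encode("latin-1")   # changed

print(as_bytes, as_text)
```

The round trip is lossless, but text operations on the decoded string can silently rewrite bytes that were never Latin-1 letters in the first place.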

But if feasible and a Pythonic implementation could be devised, that
would take much of the wind out of the sails of the "implicitly it's
ASCII text" crowd.  The whole "it's inefficient in time and space to
work with 'str'" argument goes away, leaving them with "it's verbose"
as the only reason for not doing the conversion.

I don't know if there would be any use case left for bytes at that
point ... but that's clearly a py4k discussion.