Re: [Python-Dev] PEP 382: Namespace Packages

2009-04-02 Thread M.-A. Lemburg
On 2009-04-02 17:32, Martin v. Löwis wrote:
 I propose the following PEP for inclusion to Python 3.1.

Thanks for picking this up.

I'd like to extend the proposal to Python 2.7 and later.

 Please comment.
 
 Regards,
 Martin
 
 Specification
 =============
 
 Rather than using an imperative mechanism for importing packages, a
 declarative approach is proposed here, as an extension to the existing
 ``*.pkg`` mechanism.
 
 The import statement is extended so that it directly considers ``*.pkg``
 files during import; a directory is considered a package if it either
 contains a file named __init__.py, or a file whose name ends with
 .pkg.

That's going to slow down Python package detection a lot - you'd
replace an O(1) test with an O(n) scan.

Alternative Approach:
---------------------

Wouldn't it be better to stick with a simpler approach and look for
__pkg__.py files to detect namespace packages using that O(1) check ?

This would also avoid any issues you'd otherwise run into if you want
to maintain this scheme in an importer that doesn't have access to a list
of files in a package directory, but is perfectly capable of checking
for the existence of a file.
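
To make the cost difference concrete, here is a minimal sketch of the two
detection strategies. The function names are hypothetical and not part of
either proposal; `__pkg__.py` is the marker file name suggested above.

```python
import os

def is_package_by_scan(dirname):
    """PEP 382 style check: list the directory and look for __init__.py
    or any *.pkg file. The cost grows with the number of entries."""
    for name in os.listdir(dirname):
        if name == "__init__.py" or name.endswith(".pkg"):
            return True
    return False

def is_package_by_marker(dirname):
    """Alternative check: probe for fixed file names only. One stat call
    per name, independent of how many files the directory holds."""
    return (os.path.isfile(os.path.join(dirname, "__init__.py"))
            or os.path.isfile(os.path.join(dirname, "__pkg__.py")))
```

The second variant also works for importers that can answer "does this
file exist?" but cannot enumerate a directory.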

Mechanism:
----------

If the import mechanism finds a matching namespace package (a directory
with a __pkg__.py file), it then goes into namespace package scan mode and
scans the complete sys.path for more occurrences of the same namespace
package.

The import loads all __pkg__.py files of matching namespace packages
having the same package name during the search.

One of the namespace packages, the defining namespace package, will have
to include a __init__.py file.

After having scanned all matching namespace packages and loaded
the __pkg__.py files in the order of the search, the import mechanism
then sets the package's .__path__ attribute to include all namespace
package directories found on sys.path and finally executes the
__init__.py file.

(Please let me know if the above is not clear, I will then try to
follow up on it.)
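
The path-collection part of the mechanism can be sketched roughly as
follows. This is an illustration under assumptions, not an implementation
of the import machinery: the helper names are hypothetical, only top-level
package names are handled, and the actual loading of the __pkg__.py and
__init__.py files is left out.

```python
import os
import sys

def find_namespace_dirs(name, search_path=None):
    """Return every directory <entry>/<name> on the search path that
    contains a __pkg__.py marker, in search order."""
    if search_path is None:
        search_path = sys.path
    found = []
    for entry in search_path:
        candidate = os.path.join(entry, name)
        if os.path.isfile(os.path.join(candidate, "__pkg__.py")):
            found.append(candidate)
    return found

def find_defining_dir(dirs):
    """Return the first directory that also ships an __init__.py
    (the defining namespace package), or None if there is none."""
    for d in dirs:
        if os.path.isfile(os.path.join(d, "__init__.py")):
            return d
    return None
```

An importer following the description above would load each __pkg__.py in
the order returned, set the package's __path__ to that list, and then
execute the __init__.py of the defining directory.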

Discussion:
-----------

The above mechanism allows the same kind of flexibility we already
have with the existing normal __init__.py mechanism.

* It doesn't add yet another .pth-style sys.path extension (which are
difficult to manage in installations).

* It always uses the same naive sys.path search strategy. The strategy
is not determined by some file contents.

* The search is only done once - on the first import of the package.

* It's possible to have a defining package dir and add-on package
dirs.

* Namespace packages are easy to recognize by testing for a single
resource.

* Namespace __pkg__.py modules can provide extra meta-information,
logging, etc. to simplify debugging namespace package setups.

* It's possible to freeze such setups, to put them into ZIP files,
or only have parts of it in a ZIP file and the other parts in the
file-system.

Caveats:

* Changes to sys.path will not result in an automatic rescan for
additional namespace packages, if the package was already loaded.
However, we could have a function to make such a rescan explicit.
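
Such an explicit rescan could look something like the sketch below. The
function name `rescan_namespace` is hypothetical, and the code assumes a
top-level package whose __path__ is a plain list.

```python
import os
import sys

def rescan_namespace(package, search_path=None):
    """Append namespace directories that showed up on the search path
    after the package was first imported. Returns the added dirs."""
    if search_path is None:
        search_path = sys.path
    added = []
    for entry in search_path:
        # Assumption: package.__name__ has no dots (top-level package).
        candidate = os.path.join(entry, package.__name__)
        if (os.path.isfile(os.path.join(candidate, "__pkg__.py"))
                and candidate not in package.__path__):
            package.__path__.append(candidate)
            added.append(candidate)
    return added
```

Calling it a second time without further sys.path changes adds nothing,
so it is safe to call after every path manipulation.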

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Apr 02 2009)
 Python/Zope Consulting and Support ...http://www.egenix.com/
 mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/

2009-03-19: Released mxODBC.Connect 1.0.1  http://python.egenix.com/

::: Try our new mxODBC.Connect Python Database Interface for free ! 


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] setuptools has divided the Python community

2009-03-27 Thread M.-A. Lemburg
On 2009-03-27 04:19, Guido van Rossum wrote:
 - keep distutils, but start deprecating certain higher-level
 functionality in it (e.g. bdist_rpm)
 - don't try to provide higher-level functionality in the stdlib, but
 instead let third party tools built on top of these core APIs compete

Should this be read as:

- remove bdist_rpm from the stdlib and let it live on PyPI

?

Perhaps I just misunderstand the comment.

I think that esp. the bdist_* commands help developers a lot by
removing the need to know how to build e.g. RPMs or Windows
installers and let distutils deal with it.

The bdist_* commands don't really provide any higher level
functionality. They only provide interfaces to certain packaging
formats commonly used on the various platforms.

Instead of removing such functionality, I think we should add
more support for standard packaging formats to distutils, e.g.
bdist_deb, bdist_pkg, etc.

And for eggs, there should be a standard bdist_egg, written against
the core distutils APIs (*), creating archives which other Python
package managers can then use in whatever way they see fit.

Just please don't tie eggs to one specific package manager,
e.g. having to install setuptools just to run eggified packages
is just plain wrong. The format itself doesn't require this and
neither should the software shipped with those eggs.

(*) I've had a go at this a few months ago and then found out
that the egg format itself is not documented anywhere. As a result
you have to dig deep into setuptools to find out which files
are needed and where. That's something that needs to change
(Tarek is already working on a PEP for this, AFAIK).

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] setuptools has divided the Python community

2009-03-27 Thread M.-A. Lemburg
On 2009-03-27 13:58, David Cournapeau wrote:
 On Fri, Mar 27, 2009 at 9:49 PM, M.-A. Lemburg m...@egenix.com wrote:
 
 I think that esp. the bdist_* commands help developers a lot by
 removing the need to know how to build e.g. RPMs or Windows
 installers and let distutils deal with it.
 
 I think it is a bit dangerous to build rpm/deb without knowing how to
 build them, because contrary to windows .exe, rpm/deb install things
 system-wide, and you could easily break something. I don't think you
 can build deb/rpm without knowing quite a lot about them.

Well, at least the bdist_rpm command did a pretty good job: RedHat
used the RPM spec generated by it to ship egenix-mx-base in their
distribution.

 (*) I've had a go at this a few months ago and then found out
 that the egg format itself is not documented anywhere. As a result
 you have to dig deep into setuptools to find out which files
 are needed and where. That's something that needs to change
 (Tarek is already working on a PEP for this, AFAIK).
 
 It is documented here:
 
 http://peak.telecommunity.com/DevCenter/EggFormats

Thanks for the link. I must have missed that in my search for
a format spec.

 But as said in the preamble, people are not supposed to rely on this.
 I for one would be really happy if I could build eggs without
 setuptools - for example to build eggs from scons, scripts, etc...

Right, the .egg format should be a formal PEP standard.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] setuptools has divided the Python community

2009-03-27 Thread M.-A. Lemburg
On 2009-03-27 15:00, Ronald Oussoren wrote:
 
 On 27 Mar, 2009, at 7:49, M.-A. Lemburg wrote:
 
 On 2009-03-27 04:19, Guido van Rossum wrote:
 - keep distutils, but start deprecating certain higher-level
 functionality in it (e.g. bdist_rpm)
 - don't try to provide higher-level functionality in the stdlib, but
 instead let third party tools built on top of these core APIs compete

 Should this be read as:

 - remove bdist_rpm from the stdlib and let it live on PyPI

 ?

 Perhaps I just misunderstand the comment.

 I think that esp. the bdist_* commands help developers a lot by
 removing the need to know how to build e.g. RPMs or Windows
 installers and let distutils deal with it.

 The bdist_* commands don't really provide any higher level
 functionality. They only provide interfaces to certain packaging
 formats commonly used on the various platforms.

 Instead of removing such functionality, I think we should add
 more support for standard packaging formats to distutils, e.g.
 bdist_deb, bdist_pkg, etc.
 
 IIRC the reason for wanting to deprecate bdist_rpm (and not adding
 bdist_deb, ...) is that the various Linux distributions have varying
 policies for how to package Python code and those policies tend to vary
 on another schedule than the Python development schedule. The result of
 this is that the Linux distributors are unable to use bdist_rpm.

Hmm, I have heard different things - see my reply to Olemis :-)

Besides, the bdist_rpm command as well as the others are not only
for Linux distributors to use, but to help developers ship their
packages to their users *without* the help of some distribution,
ie. by providing an RPM or DEB file for download.

 It would therefore be better to ensure that Python packages / distutils can
 provide the metadata that's needed to build packages and move the actual
 creation of OS installers outside of the core where the tool can be
 maintained by people that have detailed knowledge about the needs of the
 packaging system and system policies.

Fair enough, though I'm sure we have enough developer knowledge
on board to maintain bdist_rpm and add a bdist_deb. Perhaps even
a bdist_dmg :-)

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] version compare function into main lib

2009-03-27 Thread M.-A. Lemburg
On 2009-03-27 17:01, Eric Smith wrote:
 Martin v. Löwis wrote:
 Correct me if I'm wrong, but shouldn't Python include a function for
 version comparisons?

 On the packaging summit yesterday, people agreed that yes, we should
 have something like that in the standard library, and it should be more
 powerful than what distutils currently offers.
 
 Yes.
 
 There was no conclusion of how specifically that functionality should
 be offered; several people agreed that Python should mandate a standard
 format, which it is then able to compare. So you might not be able to
 spell it 10.3.40-beta, but perhaps 10.3.40b1 or 10.3.40~beta.
 
 I got the impression that people are generally happy with what
 setuptools provides for version parsing and comparison.
 
 Does anyone think that's not a good model?

Instead of trying to parse some version string, distutils should
require defining the version as a tuple with well-defined entries -
much like what we have in sys.version_info for Python.

The developer can then still use whatever string format s/he wants.

The version compare function would then work on this version tuple
and probably be called cmp() (at least in Python 2.x ;-).
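
As a rough illustration of the idea, the sketch below uses five-entry
tuples modelled on sys.version_info. The layout, the helper names and the
string spelling are assumptions for the example, not an agreed format.

```python
# Hypothetical layout: (major, minor, micro, releaselevel, serial),
# the same shape as sys.version_info.
V_BETA = (10, 3, 40, "beta", 1)
V_FINAL = (10, 3, 40, "final", 0)
V_NEXT = (10, 4, 0, "alpha", 2)

def newest(versions):
    """Tuples compare element by element, so max() is enough. The
    level names 'alpha' < 'beta' < 'candidate' < 'final' happen to
    sort correctly as plain strings."""
    return max(versions)

def as_string(version):
    """One possible display format; the developer is free to pick another."""
    major, minor, micro, level, serial = version
    base = "%d.%d.%d" % (major, minor, micro)
    if level == "final":
        return base
    return "%s%s%d" % (base, level[0], serial)
```

The comparison operators work the same way in Python 2.x and 3.x, so no
dedicated parsing step is needed once the tuple exists.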

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] setuptools has divided the Python community

2009-03-27 Thread M.-A. Lemburg
On 2009-03-27 17:07, P.J. Eby wrote:
 At 11:37 PM 3/26/2009 -0500, Eric Smith wrote:
 P.J. Eby wrote:
  As someone else suggested, moving some of the functionality to PEP 302
  interfaces would also help.  Most of the code, though, deals with
  locating/inspecting installed distributions, resolving version
  requirements, and managing sys.path.  And most of the nastiest
  complexity comes from trying to support true filename access to
  resources -- if that were dropped from the stdlib, there'd be no need
  for egg caches and the like, along with all the complexity entailed.
 
  Application environments such as Chandler, Trac, Zope, etc. that want
  their plugins to live in .egg files wouldn't necessarily be able to use
  such an API, but the independent pkg_resources wouldn't be
  disappearing.  (Of course, they could also implement
  application-specific file extraction, if the stdlib API included the
  ability to inspect and open zipped resources.)

 Could you comment on why they couldn't use such an API?
 
 If a plugin includes C code (.so/.dll), or uses a library that operates
 on filenames rather than bytes in memory (e.g. gettext), then the
 resources would need to be extracted from the .egg.  pkg_resources
 transparently extracts such resources to a cache directory when you ask
 for a resource's filename, rather than asking for a stream or string of
 its contents.
 
 This feature represents a significant chunk of the complexity and code
 size of pkg_resources -- and I was proposing ways to cut down on that
 complexity and code size, for a (limited) stdlib version of the
 functionality.

This functionality is one of the more annoying setuptools
features. It causes each and every user of e.g. Trac to have
their own little version of the same piece of software in their
home dir cache.

The solution to this is simple: don't use ZIP files for installed
packages, instead unzip them into normal directories on sys.path.

This makes all these problems go away and allows users to access
embedded documentation, configuration, etc.

Zip files are great for shipping packages to the end-user, but
there's no need to keep them zipped once they get there.
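
A minimal sketch of such an install-time unzip step, using only the
standard zipfile module. The function name `unpack_egg` is hypothetical,
and the sketch assumes the egg file itself lives outside the target
directory and comes from a trusted source.

```python
import os
import zipfile

def unpack_egg(egg_path, site_dir):
    """Extract a zipped egg into <site_dir>/<egg file name> and
    return the directory that was created."""
    target = os.path.join(site_dir, os.path.basename(egg_path))
    with zipfile.ZipFile(egg_path) as archive:
        archive.extractall(target)
    return target
```

The resulting directory still has to be made importable (for example by
adding it to sys.path), but resources inside it are then ordinary files
and no per-user extraction cache is needed.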

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] setuptools has divided the Python community

2009-03-27 Thread M.-A. Lemburg
On 2009-03-27 17:19, P.J. Eby wrote:
 At 01:49 PM 3/27/2009 +0100, M.-A. Lemburg wrote:
 (*) I've had a go at this a few months ago and then found out
 that the egg format itself is not documented anywhere.
 
 It's been documented for just under three years now.  Here's where you
 quoted the email where I announced that documentation, provided links to
 it, and asked you to let me know if there's anything else you'd need in it:
 
 http://mail.python.org/pipermail/python-dev/2006-April/064496.html

Thanks for reminding me. I must have forgotten about that
wiki page and instead looked on the setuptools page.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] setuptools has divided the Python community

2009-03-27 Thread M.-A. Lemburg
On 2009-03-27 20:56, Guido van Rossum wrote:
 On Fri, Mar 27, 2009 at 8:02 AM, Eric Smith e...@trueblade.com wrote:
 M.-A. Lemburg wrote:
 On 2009-03-27 04:19, Guido van Rossum wrote:
 - keep distutils, but start deprecating certain higher-level
 functionality in it (e.g. bdist_rpm)
 - don't try to provide higher-level functionality in the stdlib, but
 instead let third party tools built on top of these core APIs compete
 Should this be read as:

 - remove bdist_rpm from the stdlib and let it live on PyPI

 ?
 As one of the people who proposed this, I think it means: move bdist_rpm,
 bdist_msi, etc. out of distutils, but provide some of them with the standard
 Python installation. I'm certain that as part of the refactoring and
 simplification of distutils we'll gradually move the existing bdist_*
 commands into separate, stand-alone things (scripts, callable modules, or
 something). We'll need to do this if only for testing, so we may as well
 make them work.
 
 One of the motivations for deprecating this (and for using this
 specific example) was that Matthias Klose, the Python packager for
 Debian, said he never uses bdist_rpm.

Why would a Debian developer want to use bdist_rpm ? Perhaps you
meant a bdist_deb command, but that's not part of the stdlib
distutils.

More importantly:

Why is the non-use of a command by a single Python developer enough
motivation to remove a useful feature of distutils that's been in
use by many others for years ?

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] splitting out bdist_* (was: interminable 'setuptools' thread)

2009-03-27 Thread M.-A. Lemburg
On 2009-03-27 21:49, gl...@divmod.com wrote:
 
 On 07:59 pm, fdr...@acm.org wrote:
 I'm actually in favor of removing the bdist_* from the standard
 library, and allowing 3rd-party tools to implement whatever they need
 for the distros.  But I don't think what you're presenting there
 supports it.
 
 I do think that it's relevant that the respective operating system
 packagers don't find bdist_rpm, bdist_deb, et al. useful.  It's not
 very useful to have a bdist_deb that nobody actually builds debs with.
 This has been a problem for me, personally, since debian has made
 various ad-hoc changes to Twisted or Twisted-based packages to break our
 plugin system, since the distutils metadata has been insufficient for
 their purposes.  If the deb-generating stuff were in a separate project
 with a faster release cycle that were easier to contribute packages to,
 perhaps debian packagers could be convinced to contribute their build-
 hacks there (and bdist_deb could invoke debhelper, or vice-versa).
 
 It would be great if someone could volunteer to maintain this stuff
 independently, put it in a Launchpad project or something.  IMHO it
 would be better for the community at large if this were spun as
 increasing the release speed of the bdist_* family, rather than
 removing, which seems to me like it would generate another teacup-
 tempest on the blogowebs.  Of course I'm not volunteering, but I will be
 best friends forever with whoever does this PR/maintenance :).
 
 Given that py2exe and py2app (which includes bdist_mpkg) are both
 based on distutils, it seems like we're on the way to independent
 maintenance anyway.  Perhaps bdist_wininst/_msi could be donated to the
 py2exe project if they would be willing to maintain it, and the new
 project for _deb and _rpm could be called py2packman or something.

Do you really think that splitting up the distutils package is
going to create a better user experience ?

What would benefit the bdist_* commands is externalized maintenance,
ie. have more frequent releases on PyPI, but still ship the
most up-to-date versions with core distutils in each new Python
release.

BTW: py2exe and py2app solve a different set of problems than distutils
is trying to solve. They are about packaging complete applications,
not individual packages, so I don't see how they relate to the
bdist_* commands. But perhaps I'm missing some context.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] setuptools has divided the Python community

2009-03-27 Thread M.-A. Lemburg
On 2009-03-27 20:24, s...@pobox.com wrote:
 mal Zip files are great for shipping packages to the end-user, but
 mal there's no need to keep them zipped once they get there.
 
 I thought one of the arguments for zip files was a performance increase
 (reduced stat(2) calls, etc).  I may misremember though.

True, and as Fred already mentioned, that's the main reason why we
have ZIP file package imports.

Putting the stdlib into a ZIP file does make our favorite interpreter
start up faster. E.g. py2exe makes use of this feature.

However, using eggs (which are ZIP files) directly on the sys.path
causes these to get scanned on every startup - regardless of whether
you use any of their content. Very much unlike standard Python package
directories that only get scanned if they are referenced.

Due to the nature of eggs (many small packages), you usually end up
having a whole Easter basket full of them in your sys.path.

Unless you want to maintain a separate Python installation per task,
your overall Python startup time will increase noticeably for every
single script you run with it.

Perhaps someone should start working on a tool called FryingPan to
create Omelettes, ie. all eggs squashed into a single ZIP file... ;-)
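
Taking the joke half seriously, the squashing step could be sketched as
below. `make_omelette` is a made-up name, and the sketch ignores the real
problem that every egg carries its own EGG-INFO metadata: on a name clash
the first archive simply wins.

```python
import zipfile

def make_omelette(egg_paths, omelette_path):
    """Copy the members of several ZIP archives into a single archive.
    Duplicate member names are skipped (first archive wins).
    Returns the sorted list of member names written."""
    seen = set()
    with zipfile.ZipFile(omelette_path, "w", zipfile.ZIP_DEFLATED) as out:
        for egg in egg_paths:
            with zipfile.ZipFile(egg) as src:
                for info in src.infolist():
                    if info.filename in seen:
                        continue
                    seen.add(info.filename)
                    # Passing the ZipInfo keeps timestamps and permissions.
                    out.writestr(info, src.read(info.filename))
    return sorted(seen)
```

With a single archive on sys.path, the interpreter has one ZIP directory
to read at startup instead of one per egg.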

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Integrate BeautifulSoup into stdlib?

2009-03-24 Thread M.-A. Lemburg
On 2009-03-24 23:47, Martin v. Löwis wrote:
 An installer for a pure-python package that made no attempt
 to bundle dependencies might be nice, but I don't quite see how that
 falls outside the scope of distutils/setuptools/etc.  In other words, I
 don't see why the installer can't bootstrap the 'normal' dependency
 management which would be used if the package was installed any other
 way or on other platforms.
 
 Perhaps that could be a solution. However, in package management
 systems that solve this properly, you also have proper uninstallation,
 which includes:
 - uninstallation is rejected if packages still depend on the
   to-be-removed package (or else offers to remove the relying packages
   as well)
 - uninstallation reference-counts, causing an automatically-installed
   package to be uninstalled if it is no longer needed, or else offers
   to compute-then-uninstall all packages which are no longer needed.
 The .exe/.msi installers do support uninstallation, but, alas, no
 dependency management.

Question is: who really needs such dependency management ?

* It may be helpful to developers who wrap up 3rd party code for
an application (e.g. Miro).

* It may also help users that want to install a few plugins for an
already installed application (e.g. Zope).

* It will help users who use OSes that rely on software management
tools to keep the initial distribution size small and prefer
sharing over application isolation (e.g. Ubuntu).

* It won't simplify things if such a system gets in the way of
how users or developers usually work or want to work in
a project.

* It introduces dependencies on network resources that may
potentially not be trusted.

* If the package dependencies are not managed with lots of quality
assurance, it can easily ruin your complete installation or
simply prevent you from installing two sets of packages at
the same time.

There are lots of reasons both for wanting dependency checking
and against it.

As a result, there is no definite answer as to whether it's
good or bad and there is no single system that would satisfy
all users/developers.

Instead, there needs to be freedom of choice and
distutils provides this freedom of choice by allowing you
to ship .exes, .msi installers, binary drop-in ZIP archives,
RPM packages, Debian packages, etc. etc.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Mark old distutils as deprecated

2009-02-16 Thread M.-A. Lemburg
On 2009-02-16 17:54, Tarek Ziadé wrote:
 2009/2/9 M.-A. Lemburg m...@egenix.com:
 On 2009-02-08 11:15, Tarek Ziadé wrote:
 Hello

 To avoid confusion, as suggested by Akira who works on cleaning the
 Distutils pages on the python.org website,
 I would like to move http://svn.python.org/view/distutils/trunk into a
 branch and add a README.txt in an empty trunk
 to explain the current status of the package.

 Any objection ?
 No.

 It would be worthwhile keeping just the setup code and adjusting that
 to take the distutils package from the python/ dir in order to
 build separate releases of the code for upload to PyPI (ones
 that basically provide the code as released in Python 2.x as
 separate download for 2.(x-1) and 2.(x-2)).
 
 Indeed, and for any Python version in fact, to get early feedback
 between two Python releases.
 
 I will make releases as you mentioned, and also a development release
 using the svn revision,
 so people can try out the current trunk.
 
 >>> '%sdev-r69676' % sys.version.split()[0]
 '2.7a0dev-r69676'

Great !

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Issues to be closed: objections?

2009-02-16 Thread M.-A. Lemburg
On 2009-02-16 18:50, Daniel (ajax) Diniz wrote:
 Hi,
 I've marked some issues (25 now) to close, mostly because:
 - there was no reply from OP, nor a clear justification for the issue;
 - there are messages explaining why the issue is invalid;
 - the OSes/versions of the report suggest the issue is currently invalid;
 
 However, I've been mistaken about the desirability of leaving an issue
 open a couple of times in the last few days. So, I'd really appreciate it
 if someone could take a quick look at the issues below to avoid any
 undesirable closing.
 
 I'll also mark them as pending later today, and plan to wait until the
 weekend before closing. Any suggestion/criticism about this plan would
 be welcome too.
 
 Thanks everybody for all the support, help, patience and enduring the spam!
 
 Daniel
 
 http://bugs.python.org/issue1231081 platform.processor() could be smarter

FYI: I've closed this one.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Mark old distutils as deprecated

2009-02-09 Thread M.-A. Lemburg
On 2009-02-08 11:15, Tarek Ziadé wrote:
 Hello
 
 To avoid confusion, as suggested by Akira who works on cleaning the
 Distutils pages on the python.org website,
 I would like to move http://svn.python.org/view/distutils/trunk into a
 branch and add a README.txt in an empty trunk
 to explain the current status of the package.
 
 Any objection ?

No.

It would be worthwhile keeping just the setup code and adjusting that
to take the distutils package from the python/ dir in order to
build separate releases of the code for upload to PyPI (ones
that basically provide the code as released in Python 2.x as a
separate download for 2.(x-1) and 2.(x-2)).

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Strange locale problem with Python 3

2009-02-02 Thread M.-A. Lemburg
On 2009-02-01 19:44, Reto Schüttel wrote:
 Hi
 
 While helping Brandon Rhodes to port PyEphem[1] to Python 3 we stumbled
 over a strange locale-related problem on OS X. PyEphem is a library which
 can do astronomical computations like tracking the position of stars,
 planets and earth satellites relative to the earth's position. When
 testing out the Python 3 release of PyEphem I noticed that on my OS X
 laptop a lot of calculations were wrong (not completely wrong, but clearly
 not accurate) compared to Python 2.5. We (well, mostly Brandon) were able
 to track down the problem to the TLE parser (TLE files are data files
 containing the orbital elements of an object), which appears to read most
 values wrong with Python 3. In fact it cut off the decimal parts of all
 floats (1.123232 became 1, etc). Manually setting LANG and LC_ALL to C
 solved the problem.
 
 It turns out that some parts of Python 3 got more locale dependent on some
 platforms. The only platform I am aware of is OS X; on Linux Python 3
 appears to behave like Python 2.x did.

This is probably due to the unconditional call to setlocale() in
pythonrun.c:

/* Set up the LC_CTYPE locale, so we can obtain
   the locale's charset without having to switch
   locales. */
setlocale(LC_CTYPE, "");

In Python 2, no such call is made and as a result the C lib defaults
to the C locale.

Calling setlocale() in an application is always dangerous due to the
many side-effects this can have on the C lib parsing and formatting
APIs.

If this is done just to figure out the environment's locale settings,
then it's better to reset the locale to the one that was active
before the setlocale() call. Python 2 uses this approach. Python 3
does not.
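
For illustration only, here is a minimal Python-level sketch of the
query-and-restore pattern described above. It uses the standard `locale`
module rather than the C code in pythonrun.c, and the helper name
`environment_ctype_locale` is made up for this example:

```python
import locale


def environment_ctype_locale():
    """Return the LC_CTYPE locale requested by the environment,
    leaving the process locale exactly as it was before the call."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query only, changes nothing
    try:
        try:
            # Temporarily adopt the environment's settings (LANG, LC_ALL, ...).
            return locale.setlocale(locale.LC_CTYPE, "")
        except locale.Error:
            # The environment names a locale the C library does not know.
            return saved
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # restore the old locale


before = locale.setlocale(locale.LC_CTYPE)
env_locale = environment_ctype_locale()
after = locale.setlocale(locale.LC_CTYPE)
print(env_locale, before == after)
```

The point of the `finally` clause is that C library parsing and formatting
functions see the same locale after the call as they did before it.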

 In the case of PyEphem the problem was in the C extension, which got more
 locale dependent: for example, atof() or scanf() with Python 3 now expected
 the German decimal delimiter ',' instead of the '.' in floats (3.14 vs.
 3,14). On the other hand, the constructor of float still expects '.' all
 the time. But the built-in function strptime() honors locales with Python 3
 and expects German weekday names.
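
 The difference between locale-independent and locale-aware parsing can be
 sketched from pure Python as follows. This is only an illustration, not
 the test case linked below: `parse_both` is a hypothetical helper, and the
 locale name `de_DE.UTF-8` is an assumption that may not be installed on a
 given system (the sketch skips that part if it is missing).

```python
import locale


def parse_both(text):
    """Parse text with float() and with the locale-aware locale.atof().

    A parser that rejects the text contributes None to the result.
    """
    results = {}
    for name, parser in (("float", float), ("locale.atof", locale.atof)):
        try:
            results[name] = parser(text)
        except ValueError:
            results[name] = None
    return results


# Under the default numeric locale both parsers accept a '.' delimiter.
print(parse_both("3.14"))

saved = locale.setlocale(locale.LC_NUMERIC)  # query only
try:
    # Assumed locale name; locale.Error is raised if it is not installed.
    locale.setlocale(locale.LC_NUMERIC, "de_DE.UTF-8")
    print(parse_both("3,14"))
except locale.Error:
    print("German locale not available, skipping")
finally:
    locale.setlocale(locale.LC_NUMERIC, saved)  # always restore
```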
 
 I've written a simple script and a simple C extension which illustrate the
 problem. Both the extension and the script run on Python 2.x and Python 3,
 so you can easily compare the results while executing the script in
 different environments.
 
 I was only able to reproduce the problem on OS X (10.5) and using a German
 locale like de_CH.UTF-8. When manually setting LC_ALL=C, the differences
 disappear.
 
 I can't imagine that this behavior was really intended, and I hope the
 test case helps you guys to identify/fix this problem.
 
 Download the test case from:
 http://github.com/retoo/py3k-locale-problem/tarball/master
 or get it using git:
 git://github.com/retoo/py3k-locale-problem.git
 
 You can use the following steps to build it:
 
 $ python2.5 setup.py build
 $ python3.0 setup.py build
 
 To run the tests with python 2.5, enter:
 $ (cd build/lib*-2.5; python2.5 py3k_locale_problem.py)
 ... for 3.0  ...
 $ (cd build/lib*-3.0; python3.0 py3k_locale_problem.py)
 
 In the file 'results.txt' you can see the output from my OS X system.
 
 Cheers,
 Reto Schüttel
 
 [1] http://rhodesmill.org/pyephem/

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python 3.0.1

2009-01-30 Thread M.-A. Lemburg
On 2009-01-30 11:40, Antoine Pitrou wrote:
 Aahz aahz at pythoncraft.com writes:
 There's absolutely no reason not to have a 3.0.2 before 3.1 comes out.
 You're probably right that what Raymond wants to do is best not done for
 3.0.1 -- but once we've agreed in principle that 3.0.x isn't a true
 production release of Python for PEP6 purposes, we can do release early,
 release often.
 
 It's a possibility. To be honest, I didn't envision us releasing a 3.0.2
 rather than focusing on 3.1 (which, as others said, can be released in a
 few months if we keep the amount of changes under control).
 
 But then it's only a matter of naming. We can continue the 3.0.x series and
 incorporate in them whatever was initially planned for 3.1 (including the
 IO-in-C branch, the dbm.sqlite module, etc.), and release 3.1 only when the
 whole thing is good enough.

That would be my preference.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python 3.0.1

2009-01-29 Thread M.-A. Lemburg
On 2009-01-29 01:59, Stephen J. Turnbull wrote:
 I think there is definitely something to the notion that the 3.x
 vs. 3.0.y distinction should signal something, and I personally like
 MAL's suggestion that 3.0.x should be marked some sort of beta in
 perpetuity, or at least until 3.1 is ready to ship as stable and
 production-ready.  (That's AIUI, MAL's intent may be somewhat
 different.)

That's basically it, yes.

I don't think that marking 3.0 as experimental is bad in any way,
as long as we're clear about it.

Having lots of incompatible changes in a patch level release will
start to get users worrying about the stability of the 3.x branch
anyway, so a heads-up message and clear perspective for the 3.1
release is a lot better than dumping 3.0 altogether or not
providing such a perspective at all.

That said, we should stick to the statement already made for
3.0 (too early as it now appears), i.e. that the same development and
release processes will apply to the 3.x branch as we have for 2.x -
starting with 3.1.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python 3.0.1

2009-01-28 Thread M.-A. Lemburg
On 2009-01-27 22:19, Raymond Hettinger wrote:
 From: Martin v. Löwis mar...@v.loewis.de
 Releasing 3.1 6 months after 3.0 sounds reasonable; I don't think
 it should be released earlier (else 3.0 looks fairly ridiculous).
 
 I think it should be released earlier and completely supplant 3.0
 before more third-party developers spend time migrating code.
 We needed 3.0 to get released so we could get the feedback
 necessary to shake it out.  Now, it is time for it to fade into history
 and take advantage of the lessons learned.
 
 The principles for the 2.x series don't really apply here.  In 2.x, there
 was always a useful, stable, clean release already fielded and there
 were tons of third-party apps that needed a slow rate of change.
 
 In contrast, 3.0 has a near zero installed user base (at least in terms
 of being used in production).  It has very few migrated apps.  It is
 not particularly clean and some of the work for it was incomplete
 when it was released.
 
 My preference is to drop 3.0 entirely (no incompatible bugfix release)
 and in early February release 3.1 as the real 3.x that migrators ought
 to aim for and that won't have incompatible bugfix releases.  Then at
 PyCon, we can have a real bug day and fix-up any chips in the paint.
 
 If 3.1 goes out right away, then it doesn't matter if 3.0 looks ridiculous.
 All eyes go to the latest release.  Better to get this done before more
 people download 3.0 to kick the tires.

Why don't we just mark 3.0.x as an experimental branch and keep updating/
fixing things that were not sorted out for the 3.0.0 release ?! I think
that's a fair approach, given that the only way to get field testing
for new open-source software is to release early and often.

A 3.1 release should then be the first stable release of the 3.x series
and mark the start of the usual deprecation mechanisms we have
in the 2.x series. Needless to say, rushing 3.1 out now would
only cause yet another experimental release... major releases do take
time to stabilize.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Copyright notices in modules

2009-01-20 Thread M.-A. Lemburg
On 2009-01-20 00:56, Raymond Hettinger wrote:
 Why does numbers.py say:
 
# Copyright 2007 Google, Inc. All Rights Reserved.
# Licensed to PSF under a Contributor Agreement.

Because that's where that file originated, I guess. This is part
of what you have to do for things that are licensed to the PSF
under a contributor agreement:

http://www.python.org/psf/contrib/contrib-form/


Contributor shall identify each Contribution by placing the following notice in
its source code adjacent to Contributor's valid copyright notice: "Licensed to
PSF under a Contributor Agreement."


 Weren't there multiple contributors including non-google people?

The initial contribution was done by Google (Jeffrey Yasskin
AFAIK) and that's where the above lines originated from.

 Does Google want to be associated with code that
 was submitted with no tests?

Only Google can comment on this.

 Do we want this sort of stuff in the code?

Yes, it is required by the contrib forms.

 If someone signs a contributor agreement, can we
 forgo the external copyright comments?

No. See above. Only the copyright owner can remove such
notices.

 Do we want to make a practice of every contributor
 commenting in the name of the company they were
 working for at the time (if so, I would have to add
 the comment to a lot of modules)?

That depends on the contract a contributor has with the
company that funded the work. It's quite common for such
contracts to include a clause stating that all IP generated
during work time is owned by the employer.

 Does the copyright concept even apply to an
 abstract base class (I thought APIs were not
 subject to copyright, just like database layouts
 and language definitions)?

It applies to the written program text. You are probably
thinking about other IP rights such as patents or designs.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Copyright notices in modules

2009-01-20 Thread M.-A. Lemburg
On 2009-01-20 11:02, Michael Foord wrote:
 M.-A. Lemburg wrote:
 [snip...]
  
 Does the copyright concept even apply to an
 abstract base class (I thought APIs were not
 subject to copyright, just like database layouts
 and language definitions)?
 

 It applies to the written program text. You are probably
 thinking about other IP rights such as patents or designs.

   
 
 You need to read Van Lindberg's excellent book on intellectual property
 rights and open source (which is about American law and European law
 will be different). Mere collections of facts are not copyrightable as
 they are not creative (the basis of copyright) and this is presumed to
 apply to parts of software like header files and interface descriptions
 - which could easily apply to ABCs in Python.

I doubt that you can make such assumptions in general. It's a
case-by-case decision and also one that depends on the copyright
law or convention you assume.

See e.g. the WIPO copyright treaty:

http://www.wipo.int/treaties/en/ip/wct/trtdocs_wo033.html#P56_5626

and the Berne Convention:

http://www.wipo.int/treaties/en/ip/berne/trtdocs_wo001.html#P85_10661

and TRIPS:

http://www.wto.org/english/docs_e/legal_e/27-trips_04_e.htm#1

That said, for numbers.py there's certainly enough creativity in that
file to enjoy copyright protection.

 I recommend his book by the way - I'm about half way through so far and
 it is highly readable

Thanks for the pointer.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Copyright notices in modules

2009-01-20 Thread M.-A. Lemburg
On 2009-01-20 16:54, Stephen J. Turnbull wrote:
 M.-A. Lemburg writes:
   On 2009-01-20 11:02, Michael Foord wrote:
 
Mere collections of facts are not copyrightable as they are not
creative (the basis of copyright)
 
 That's incorrect in the U.S.; what is copyrightable is an *original
 work of expression fixed in some medium*.  "Original" is closely
 related to "creative", but it's not the same.  The emphasis is on
 novelty, not on the intellectual power involved.  So, for example, you
 can copyright a set of paint splashes on paper, as long as they're
 *new* paint splashes.
 
 The real issue here, however, is expression.  What's important is
 whether there are different ways to say it.  So you can indeed
 copyright the phone book or a dictionary, which *does* protect such
 things as unusual use of typefaces or color to aid understanding.
 What you can't do is prevent someone from publishing another phone
 book or dictionary based on the same facts, and since "put it in
 alphabetical order" hasn't been an original form of expression since
 Aristotle or so, they can alphabetize their phone book or dictionary,
 and it is going to look a lot like yours.

The above argument is what makes copyright so complicated. Computer
software has been given the same status as a piece of literary work,
so all conventions for such works apply.

However, this doesn't necessarily mean that all computer software
is copyrightable per se. The key problem is defining the threshold of
originality needed for a work to become copyrightable at all and
that's where different jurisdictions use different definitions or
guidelines based on case law.

http://en.wikipedia.org/wiki/Threshold_of_originality

E.g. in Germany it is common not to grant copyright on logos that
are used as trademarks. OTOH, use of a logo in the trademark
sense automatically makes it a trademark (even without registration).

 On the other hand, ABCs are not a mere collection of facts. They are
 subject to various forms of organization (top down, bottom up,
 alphabetical order, etc), and that organization will in general be
 copyrightable.  Also, unless your ABCs are all independent of each
 other, you will be making choices about when to derive and when to
 define from scratch.  That aspect of organization is expressive, and
 once written down (fixed in a medium) it is copyrightable.
 
I recommend his book by the way - I'm about half way through so far and
it is highly readable
 
 Larry Rosen's book is also good.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Fixing incorrect indentations in C files (Decoder functions accept str in py3k)

2009-01-08 Thread M.-A. Lemburg
On 2009-01-08 01:01, Collin Winter wrote:
 On Wed, Jan 7, 2009 at 2:35 PM, Brett Cannon br...@python.org wrote:
 On Wed, Jan 7, 2009 at 10:57, M.-A. Lemburg m...@egenix.com wrote:
 [SNIP]
 BTW: The _codecsmodule.c file is a 4 spaces indent file as well (just
 like all Unicode support source files). Someone apparently has added
 tabs when adding support for Py_buffers.

 It looks like this formatting mix-up is just going to get worse for
 the next few years while the 2.x series is still being worked on.
 Should we just bite the bullet and start adding modelines for Vim and
 Emacs to .c/.h files that are written in the old 2.x style? For Vim I
 can then update the vimrc in Misc/Vim to then have 4-space indent be
 the default for C files.
 
 Or better yet, really bite the bullet and just reindent everything to
 spaces. Not everyone uses vim or emacs, nor do all tools understand
 their modelines. FYI, there are options to svn blame and git to skip
 whitespace-only changes.

+1... and this should be done for both trunk and the 3.x branch
in a single checkin to resync them.

svn blame -x -b will do the trick for SVN. Perhaps there's even
some .subversion/config option to set this globally.

The question really is: How often do Python developers use svn blame ?

If this is only done for a file or two every now and then, I don't
think that adding the above option to the command would be much
to ask for.

The question to put up against this is: How often do you get
irritated by lines not being correctly indented ?

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Fixing incorrect indentations in C files (Decoder functions accept str in py3k)

2009-01-08 Thread M.-A. Lemburg
On 2009-01-08 12:36, Kristján Valur Jónsson wrote:
 Oh dear. C code indented by spaces?
 I'll give up programming then.
 Just set your editor tab size to 4 and all is well.

I know this is flame bait, but TABs are 8 spaces in Python land :-)
and most C files in Python that contain TABs and mix them with
spaces rely on this.

BTW: I don't blame anyone for the mixup - some editors simply
go ahead and convert 8 spaces leading whitespace into TABs
without the user knowing about this... after all, white on white
looks all white in the end ;-) (there are even some steganographic
systems out there, applying this scheme to embed data into
text files).

In any case, I think I need to remind people of PEP 7: Style Guide
for C Code ...

http://www.python.org/dev/peps/pep-0007/

It already says: "At some point, the whole codebase may be
converted to use only 4-space indents."
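
As a rough sketch of what "TABs are 8 spaces" means for a conversion, the
following Python snippet expands TABs in the leading whitespace of each
line to 8-column tab stops and leaves everything after the indentation
alone. The helper name `expand_leading_tabs` and the sample C fragment are
made up for illustration; this is not an existing reindent tool.

```python
def expand_leading_tabs(line, tabsize=8):
    """Replace TABs in the leading whitespace with spaces, keeping columns."""
    stripped = line.lstrip(" \t")
    indent = line[: len(line) - len(stripped)]
    # str.expandtabs() pads each TAB up to the next multiple of tabsize.
    return indent.expandtabs(tabsize) + stripped


# Sample input mixing a pure TAB indent with a "4 spaces + TAB" indent.
source = "\tif (x) {\n    \tfoo();\t/* keep this tab */\n\t}\n"
converted = "".join(
    expand_leading_tabs(line) for line in source.splitlines(keepends=True)
)
print(converted)
```

Both kinds of indent end up at column 8, which is why files that mix the
two only look right when the editor's tab size is set to 8.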

 K
 
 -----Original Message-----
 From: python-dev-bounces+kristjan=ccpgames@python.org 
 [mailto:python-dev-bounces+kristjan=ccpgames@python.org] On Behalf Of 
 M.-A. Lemburg
 Sent: 8. janúar 2009 09:49
 To: Collin Winter
 Cc: Antoine Pitrou; python-dev@python.org
 Subject: Re: [Python-Dev] Fixing incorrect indentations in C files (Decoder 
 functions accept str in py3k)
 Or better yet, really bite the bullet and just reindent everything to
 spaces. Not everyone uses vim or emacs, nor do all tools understand
 their modelines. FYI, there are options to svn blame and git to skip
 whitespace-only changes.
 
 +1... and this should be done for both trunk and the 3.x branch
 in a single checkin to resync them.
 

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread M.-A. Lemburg
On 2009-01-07 16:34, Guido van Rossum wrote:
 Sounds like yet another remnant of the old philosophy, which indeed
 supported encode and decode operations on both string types. :-(

No, that's something I explicitly readded to Python 3k, since the
codecs interface is independent of the input and output types (the
codecs decide which combinations to support).

The bytes and Unicode *methods* do guarantee that you get either
Unicode or bytes as output.
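
A small sketch of that distinction, assuming a reasonably recent CPython 3
(3.4 or later, as far as I know), where the "hex" and "rot13" codec aliases
are available through the generic codecs functions:

```python
import codecs

# The str/bytes methods only cover text encodings, so the result type is fixed.
encoded = "é".encode("utf-8")        # str -> bytes
decoded = encoded.decode("utf-8")    # bytes -> str

# The generic codecs functions leave the type combination to the codec.
hexed = codecs.encode(b"abc", "hex")       # bytes -> bytes
rotated = codecs.encode("abc", "rot13")    # str -> str

# The methods refuse codecs that are not text encodings.
try:
    "abc".encode("rot13")
    method_refused = False
except LookupError:
    method_refused = True

print(type(encoded), type(decoded), hexed, rotated, method_refused)
```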

 On Wed, Jan 7, 2009 at 5:39 AM, Antoine Pitrou solip...@pitrou.net wrote:
 Hello,

 I've just noticed that in py3k, the decoding functions in the codecs module
 accept str objects as well as bytes:

  >>> import codecs
  >>> c = codecs.getdecoder('utf8')
  >>> c('aa')
  ('aa', 2)
  >>> c('éé')
  ('éé', 4)
  >>> c = codecs.getdecoder('latin1')
  >>> c('aa')
  ('aa', 2)
  >>> c('éé')
  ('Ã©Ã©', 4)

 Is it a bug?

 Regards

 Antoine.



-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread M.-A. Lemburg
On 2009-01-07 19:32, Antoine Pitrou wrote:
 M.-A. Lemburg mal at egenix.com writes:
 No, that's something I explicitly readded to Python 3k, since the
 codecs interface is independent of the input and output types (the
 codecs decide which combinations to support).
 
 But why would the utf8 decoder accept unicode as input?

It shouldn't.

Looks like the codec interfaces in the codecs module were not updated
to accept only bytes on decode for the Unicode codecs.
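
For reference, a short check of the behaviour one would expect from a
Unicode decoder function: bytes in, (text, bytes consumed) out, and a
TypeError for str input. As far as I know this is what current CPython 3
releases do; it is not what the 3.0 session quoted in this thread showed.

```python
import codecs

decode_utf8 = codecs.getdecoder("utf-8")

# Bytes input: returns (decoded text, number of bytes consumed).
text, consumed = decode_utf8("éé".encode("utf-8"))

# str input: expected to be rejected by the Unicode codecs.
try:
    decode_utf8("éé")
    str_rejected = False
except TypeError:
    str_rejected = True

print(text, consumed, str_rejected)
```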

BTW: The _codecsmodule.c file is a 4 spaces indent file as well (just
like all Unicode support source files). Someone apparently has added
tabs when adding support for Py_buffers.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] #ifdef __cplusplus?

2009-01-06 Thread M.-A. Lemburg
On 2009-01-05 23:17, Mark Hammond wrote:
 On 5/01/2009 11:13 PM, M.-A. Lemburg wrote:
 
 See above. Assertions are not meant to be checked in a production
 build. You use debug builds for debugging such low-level things.
 
 Although ironically, assertions have been disabled in debug builds on
 Windows - http://bugs.python.org/issue4804

Does this only affect asserts defined in the CRT or also ones defined
in the Python C code ? (I was only referring to the latter)

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] #ifdef __cplusplus?

2009-01-06 Thread M.-A. Lemburg
On 2009-01-06 15:15, Kristján Valur Jónsson wrote:
 Only crt asserts, and those assertion features accessible through the 
 crtdbg.h file, such as _ASSERT and _ASSERTE.

Thanks.

In that case, I don't see much of a problem... after all, if someone
runs a Python debug build, they won't be trying to debug the MS CRT,
only Python ;-)

 Kristján
 
 -Original Message-
 From: python-dev-bounces+kristjan=ccpgames@python.org 
 [mailto:python-dev-bounces+kristjan=ccpgames@python.org] On Behalf Of 
 M.-A. Lemburg
 Sent: 6 January 2009 13:23
 To: mhamm...@skippinet.com.au
 Cc: python-dev@python.org
 Subject: Re: [Python-Dev] #ifdef __cplusplus?
 
 On 2009-01-05 23:17, Mark Hammond wrote:
 On 5/01/2009 11:13 PM, M.-A. Lemburg wrote:

 See above. Assertions are not meant to be checked in a production
 build. You use debug builds for debugging such low-level things.
 Although ironically, assertions have been disabled in debug builds on
 Windows - http://bugs.python.org/issue4804
 
 Does this only affect asserts defined in the CRT or also ones defined
 in the Python C code ? (I was only referring to the latter)

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] #ifdef __cplusplus?

2009-01-05 Thread M.-A. Lemburg
On 2009-01-03 04:15, Adam Olsen wrote:
 On Fri, Jan 2, 2009 at 9:05 AM, M.-A. Lemburg m...@egenix.com wrote:
 On 2009-01-02 08:26, Adam Olsen wrote:
 Python's malloc wrappers are pretty messy.  Of your examples, only
 unicode->str isn't obvious what the result is, as the rest are local
 to that function.  Even that is obvious when you glance at the line
 above, where the size is calculated using sizeof(Py_UNICODE).

 If you're concerned about correctness then you'd do better eliminating
 the redundant malloc wrappers and giving them names that directly
 match what they can be used for.
 ??? Please read the comments in pymem.h and objimpl.h.
 
 I count 7 versions of malloc.  Despite the names, none of them are
 specific to PyObjects.  It's pretty much impossible to know what
 different ones do without a great deal of experience.

Is it? I suggest you read up on Python's memory management and the
comments in the header files. The APIs are pretty straightforward:

http://docs.python.org/c-api/allocation.html
http://docs.python.org/c-api/memory.html

 Only very specialized uses need to allocate PyObjects directly anyway.
  Normally PyObject_{New,NewVar,GC_New,GC_NewVar} are better.

Better for what ? The APIs you referenced are only used to
allocate Python objects.

The malloc() wrappers provide a sane interface not only for
allocating Python objects, but also for arbitrary memory
chunks, e.g. ones referenced by Python objects.

 If the size calculation bothers you you could include the semantics of
 the PyMem_New() API, which includes the cast you want.  I've no
 opposition to including casts in a single place like that (and it
 would catch errors even with C compilation).
 You should always use PyMem_NEW() (capital letters), if you ever
 intend to benefit from the memory allocation debug facilities
 in the Python memory allocation interfaces.
 
 I don't see why such debugging should require a full recompile, rather
 than having a hook inside the PyMem_Malloc (or even providing a
 different PyMem_Malloc).

Of course it does: you don't want the debug overhead in a
production build.

 The difference between using the _NEW() macros and the _MALLOC()
 macros is that the first apply overflow checking for you. However,
 the added overhead only makes sense if these overflow checks haven't
 already been applied elsewhere.
 
 They provide assertions.  There's no overflow checking in release builds.

See above. Assertions are not meant to be checked in a production
build. You use debug builds for debugging such low-level things.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] #ifdef __cplusplus?

2009-01-02 Thread M.-A. Lemburg
On 2009-01-02 08:26, Adam Olsen wrote:
 On Thu, Jan 1, 2009 at 11:24 PM, Alexander Belopolsky
 alexander.belopol...@gmail.com wrote:
 On Fri, Jan 2, 2009 at 12:58 AM, Adam Olsen rha...@gmail.com wrote:
 ..
 As C++ has more specific ways of allocating memory, they impose this
 restriction to annoy you into using them.
 And so does Python API: see PyMem_NEW and PyMem_RESIZE macros.
 
 An optional second API provides convenience, not annoyance.  Besides,
 they're not used much anymore.  I am curious what their history is
 though.

See Include/pymem.h and objimpl.h for details.

PyMem_MALLOC() et al. provide an abstraction layer on top of the system's
malloc() implementation. PyObject_MALLOC() et al. use the Python
memory allocator instead.

  We won't be using them, and the extra casts are nothing but noise.
 A quick grep through the sources shows that these casts are not just noise:

 Objects/stringobject.c: op = (PyStringObject *)PyObject_MALLOC(..
 Objects/typeobject.c:   remain = (int *)PyMem_MALLOC(..
 Objects/unicodeobject.c: unicode->str = (Py_UNICODE*) PyObject_MALLOC(..

 in many cases the type of object being allocated is not obvious from
 the l.h.s. name.  Redundant cast improves readability in these cases.
 
 Python's malloc wrappers are pretty messy.  Of your examples, only
 unicode->str isn't obvious what the result is, as the rest are local
 to that function.  Even that is obvious when you glance at the line
 above, where the size is calculated using sizeof(Py_UNICODE).
 
 If you're concerned about correctness then you'd do better eliminating
 the redundant malloc wrappers and giving them names that directly
 match what they can be used for.

??? Please read the comments in pymem.h and objimpl.h.

 If the size calculation bothers you you could include the semantics of
 the PyMem_New() API, which includes the cast you want.  I've no
 opposition to including casts in a single place like that (and it
 would catch errors even with C compilation).

You should always use PyMem_NEW() (capital letters), if you ever
intend to benefit from the memory allocation debug facilities
in the Python memory allocation interfaces.

The difference between using the _NEW() macros and the _MALLOC()
macros is that the former apply overflow checking for you. However,
the added overhead only makes sense if these overflow checks haven't
already been applied elsewhere.
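
To make the idea of an overflow-checked size calculation concrete, here is a minimal sketch in Python. It only models the arithmetic; it is not the body of the macros in pymem.h, which are written in C. The names `SIZE_MAX` and `checked_alloc_size` are hypothetical and chosen for illustration.

```python
# Illustrative only: models the arithmetic of an overflow-checked allocation
# size. SIZE_MAX and checked_alloc_size are hypothetical names.
SIZE_MAX = 2**32 - 1  # assumed: largest size representable on a 32-bit platform


def checked_alloc_size(count, item_size):
    """Return count * item_size, or None if the product would exceed SIZE_MAX."""
    if item_size <= 0 or count < 0:
        raise ValueError("count must be >= 0 and item_size must be > 0")
    # Compare before multiplying, as C code has to do, because in C the
    # product itself would silently wrap around.
    if count > SIZE_MAX // item_size:
        return None
    return count * item_size
```

If the caller has already validated `count` against the same limit, repeating the comparison here buys nothing, which is the overhead trade-off mentioned above.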

  Figure out a way to turn off the warnings instead.

 These are not warnings: these are compile errors in C++.  A compiler
 that allows suppressing them would not be a standard-compliant C++
 compiler.
 
 So long as the major compilers allow it I don't particularly care.
 Compiling as C++ is too obscure of a feature to warrant uglifying the
 code.
 
 

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-22 Thread M.-A. Lemburg
On 2008-12-20 23:16, Martin v. Löwis wrote:
 I will try next week to see if I can come up with a smaller,
 submittable example.  Thanks.
 These long exit times are usually caused by the garbage collection
 of objects. This can be a very time consuming task.
 
 I doubt that. The long exit times are usually caused by a bad
 malloc implementation.

With garbage collection I meant the process of Py_DECREF'ing the
objects in large containers or deeply nested structures, not the GC
mechanism for breaking circular references in Python.

This will usually also involve free() calls, so the malloc
implementation affects this as well. However, I've seen such long
exit times on Linux and Windows, which both have rather good
malloc implementations.

I don't think there's anything much we can do about it at the
interpreter level. Deleting millions of objects takes time and that's
not really surprising at all. It takes even longer if you have
instances with .__del__() methods written in Python.

Applications can choose other mechanisms for speeding up the
exit process in various (less clean) ways, if they have a need for
this.

BTW: Rather than using a huge in-memory dict, I'd suggest using either
an on-disk dictionary, such as the ones found in mxBeeBase, or
a database.
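
As a rough illustration of the on-disk dictionary idea, the sketch below uses the standard library's `shelve` module as a stand-in. This is an assumption made for the example: it is not mxBeeBase and makes no claim about that library's API. The helper names `build_on_disk_dict` and `lookup` are hypothetical.

```python
import os
import shelve
import tempfile


def build_on_disk_dict(path, items):
    """Store (key, value) pairs in a shelve file instead of an in-memory dict."""
    with shelve.open(path) as db:
        for key, value in items:
            db[key] = value  # keys must be str; values are pickled


def lookup(path, key):
    """Read one value back without loading the whole mapping into memory."""
    with shelve.open(path) as db:
        return db.get(key)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "counts")
        build_on_disk_dict(path, (("key%d" % i, i * i) for i in range(1000)))
        print(lookup(path, "key12"))
```

Because the data lives in a file, interpreter exit only has to close the file rather than deallocate every stored object, at the price of slower lookups.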

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-22 Thread M.-A. Lemburg
On 2008-12-22 19:13, Mike Coleman wrote:
 On Mon, Dec 22, 2008 at 6:20 AM, M.-A. Lemburg m...@egenix.com wrote:
 BTW: Rather than using a huge in-memory dict, I'd suggest using either
 an on-disk dictionary, such as the ones found in mxBeeBase, or
 a database.
 
 I really want this to work in-memory.  I have 64G RAM, and I'm only
 trying to use 45G of it (only 45G :-), and I don't need the results
 to persist after the program finishes.
 
 Python should be able to do this.  I don't want to hear "Just use Perl
 instead" from my co-workers...  ;-)

What kinds of objects are you storing in your dictionary ? Python
instances, strings, integers ?

The time it takes to deallocate the objects in your dictionary
depends a lot on the types you are using.
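
A quick way to see the effect on a small scale is to time how long it takes to drop a dict holding different kinds of values. This is a minimal sketch, assuming CPython's reference counting frees the dict as soon as the last reference goes away; `Record` and `time_teardown` are hypothetical names, and the absolute numbers will vary by machine and Python version.

```python
import time


class Record:
    """A plain Python instance, used as a comparatively heavyweight dict value."""

    def __init__(self, value):
        self.value = value


def time_teardown(make_value, count=200_000):
    """Build a dict of `count` values, then time how long dropping it takes."""
    data = {i: make_value(i) for i in range(count)}
    start = time.perf_counter()
    del data  # last reference: in CPython the dict and its values are freed here
    return time.perf_counter() - start


if __name__ == "__main__":
    for label, factory in (("int", int), ("str", str), ("instance", Record)):
        print("%-8s %.4f s" % (label, time_teardown(factory)))
```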

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-20 Thread M.-A. Lemburg
On 2008-12-20 17:57, Mike Coleman wrote:
 On Sat, Dec 20, 2008 at 4:02 AM, Kristján Valur Jónsson
 krist...@ccpgames.com wrote:
 Can you distill the program into something reproducible?
 Maybe with something slightly less than 45Gb but still exhibiting
 some degradation of exit performance?
 I can try to point our commercial profiling tools at it and see what
 it is doing.
 
 I will try next week to see if I can come up with a smaller,
 submittable example.  Thanks.

These long exit times are usually caused by the garbage collection
of objects. This can be a very time consuming task.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-20 Thread M.-A. Lemburg
On 2008-12-20 21:20, Leif Walsh wrote:
 On Sat, Dec 20, 2008 at 3:04 PM, M.-A. Lemburg m...@egenix.com wrote:
 These long exit times are usually caused by the garbage collection
 of objects. This can be a very time consuming task.
 
 In that case, the question would be "why is the interpreter collecting
 garbage when it knows we're trying to exit anyway?".

It cannot know until the very end, because there may still be
some try: ... except SystemExit: ... somewhere in the code
waiting to trigger and stop the system exit.

If you want a really fast exit, try this:

import os
os.kill(os.getpid(), 9)  # 9 is SIGKILL on POSIX: no cleanup, no atexit handlers

But you'd better know what you're doing if you take this approach...
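
Both points can be demonstrated with a short sketch. The first function shows that `sys.exit()` merely raises `SystemExit`, which surrounding code can still catch. The second runs a child interpreter that calls `os._exit()`, a standard-library way to leave the process without running cleanup handlers; it is used here as a somewhat gentler stand-in for the kill approach. The function names and the `CHILD` script are hypothetical.

```python
import subprocess
import sys


def exit_can_be_intercepted():
    """sys.exit() raises SystemExit, which surrounding code may still catch."""
    try:
        sys.exit(3)
    except SystemExit as exc:
        return exc.code


# Child script: registers an atexit handler, then leaves via os._exit().
CHILD = (
    "import atexit, os\n"
    "atexit.register(lambda: print('cleanup ran'))\n"
    "os._exit(0)\n"
)


def hard_exit_skips_cleanup():
    """Run the child; return its exit code and whatever it printed."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD], capture_output=True, text=True
    )
    return proc.returncode, proc.stdout
```

In the child, the atexit handler never runs, so nothing is printed. The same applies to buffered output and any other pending cleanup, which is why this kind of exit needs care.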

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Reindenting the C code base?

2008-12-15 Thread M.-A. Lemburg
On 2008-12-14 21:43, Martin v. Löwis wrote:
 Personally, I think the indentation of, at least,
 Objects/unicodeobject.c should be fixed. This file has become so
 mixed-up with tab and space indents that I have no-idea what to use
 when I edit it. Just to give an idea of how messy it is, there are 5214
 lines indented with tabs and 4272 indented with spaces (out of the 9733
 lines in the file).
 
 As an Emacs variables block is present in the file, I would consider
 this normative, and declare that the official indenting is 4 spaces
 for the file, no tabs.

All the Unicode C code I wrote at the time used 4 space indents. I
would welcome this being restored. It got diluted over time.
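
Counts like the ones quoted above can be reproduced with a few lines of Python. This is a minimal sketch; `count_indent_styles` is a hypothetical helper, and the exact numbers depend on how mixed tab-and-space indents are classified.

```python
def count_indent_styles(lines):
    """Classify each line by its leading whitespace: tabs, spaces, mixed, or none."""
    counts = {"tabs": 0, "spaces": 0, "mixed": 0, "none": 0}
    for line in lines:
        # Leading run of spaces and tabs only (newlines are not indentation).
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if not indent:
            counts["none"] += 1
        elif set(indent) == {"\t"}:
            counts["tabs"] += 1
        elif set(indent) == {" "}:
            counts["spaces"] += 1
        else:
            counts["mixed"] += 1
    return counts
```

Passing an open file object for Objects/unicodeobject.c as `lines` would give the per-style totals for that file.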

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Trap SIGSEGV and SIGFPE

2008-12-10 Thread M.-A. Lemburg
On 2008-12-10 21:05, Adam Olsen wrote:
 On Wed, Dec 10, 2008 at 12:22 PM, BJörn Lindqvist [EMAIL PROTECTED] wrote:
 One thing I think it would be useful for in the real world is
 unit testing extension modules. You can't profitably write unit tests
 for segfaults because that breaks the test harness. In situations like
 those, recovering would be likely (caveat emptor of course).
 
 The only safe option there is a subprocess.

True, but that still makes it a little difficult to report the errors
found in the module.

mxTools has an optional safecall() function that allows calling
functions which potentially segfault and still returns control
back to the calling application:

http://www.egenix.com/products/python/mxBase/mxTools/

It's not (yet) documented, but fairly straightforward to use
once you've enabled it in egenix_mx_base.py:

result = mx.Tools.safecall(callable, args, kws)

Using such a function is handy when you have a multi-process
application setup that sometimes needs to call out to external
libraries of varying quality, which is not uncommon in real-life
deployments.
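
For comparison, the subprocess approach mentioned in the quoted reply can be sketched with the standard library alone. This is not how mx.Tools.safecall() is implemented; it only shows how a crash in a child process can be reported to the caller. `run_isolated` is a hypothetical helper, and `os.abort()` stands in for a crashing extension call.

```python
import subprocess
import sys


def run_isolated(source, timeout=30):
    """Run Python source in a child interpreter and report how it ended.

    Returns (returncode, stdout). A crash in the child shows up as a
    non-zero return code instead of taking down the calling process.
    """
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    print(run_isolated("print(6 * 7)"))
    # os.abort() stands in for a crashing extension call.
    print(run_isolated("import os; os.abort()"))
```

The drawback raised above still applies: all the parent sees is an exit status and captured output, so detailed error reporting has to be arranged separately.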

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-09 Thread M.-A. Lemburg
On 2008-12-09 09:41, Anders J. Munch wrote:
 On Sun, Dec 7, 2008 at 3:53 PM, Terry Reedy [EMAIL PROTECTED] wrote:
 try:
  files = os.listdir(somedir, errors = strict)
 except OSError as e:
  log(verbose error message that includes somedir and e)
  files = os.listdir(somedir)
 
 Instead of a codecs error handler name, how about a callback for
 converting bytes to str?
 
 os.listdir(somedir, decoder=bytes.decode)
 os.listdir(somedir, decoder=lambda b: b.decode(preferredencoding, 
 errors='xmlcharrefreplace'))
 os.listdir(somedir, decoder=repr)
 
 ISTM that would be simpler and more flexible than going over the
 codecs registry.  One caveat though is that there's no obvious way of
 telling listdir to skip a name.  But if the default behaviour for
 decoder=None is to skip with a warning, then the need to explicitly
 ask for files to be skipped would be small.
 
 Terry's example would then be:
 
 try:
  files = os.listdir(somedir, decoder=bytes.decode)
 except UnicodeDecodeError as e:
  log(verbose error message that includes somedir and e)
  files = os.listdir(somedir)

Well, this is not too far away from just putting the whole decoding
logic into the application directly:

files = [filename.decode(filesystemencoding, errors='warnreplace')
         for filename in os.listdir(dir)]

(or os.listdirb() if that's where the discussion is heading)
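
In Python 3 as eventually released, passing a bytes path to os.listdir() returns bytes names, so the application-level decoding described above can be written roughly as follows. This is a minimal sketch: `list_names` is a hypothetical helper, and the built-in 'replace' handler is used because 'warnreplace' is only a proposal in this thread.

```python
import os
import sys


def list_names(directory, errors="replace"):
    """List a directory as bytes, then decode each name in application code."""
    encoding = sys.getfilesystemencoding()
    raw_names = os.listdir(os.fsencode(directory))  # bytes in, bytes out
    return [name.decode(encoding, errors=errors) for name in raw_names]
```

The raw bytes names remain available to the caller for opening the files, so the decoded form is only needed for display.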

... and that also tells us something about this discussion: we're
trying to come up with some magic to work around writing two
lines of Python code.

I'd just have all the os APIs return bytes and leave whatever
conversion to Unicode might be necessary to a higher level API.

Think of it: You really only need the Unicode values if you
ever want to output those values in text form somewhere.

In those cases, it's usually a human reading a log file or
screen output. Most other cases just care about getting
some form of file identifier in order to open the file,
and don't really care about the encoding of the file name
at all.

It's probably better to have two helper functions in the os module,
e.g. os.decodefilename() and os.encodefilename(), that take care of
the conversion on demand, rather than trying to force this conversion
even in cases where the application never really needs to write the
filename anywhere.

These should then provide some reasonable default logic, e.g.
use a 'warnreplace' error handler. Applications are then
free to use these converters or implement their own.
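
A pair of helpers along these lines might look like the sketch below. The names follow the proposal, but they are written as plain module-level functions because no such functions exist in the os module under these names. The default error handlers are assumptions: 'replace' stands in for the proposed 'warnreplace'.

```python
import sys


def decodefilename(raw, encoding=None, errors="replace"):
    """Turn a bytes file name into text for display or logging."""
    return raw.decode(encoding or sys.getfilesystemencoding(), errors)


def encodefilename(text, encoding=None, errors="strict"):
    """Turn a text file name back into bytes for the OS-level APIs."""
    return text.encode(encoding or sys.getfilesystemencoding(), errors)
```

Later Python 3 releases added os.fsdecode() and os.fsencode(), which play a similar role with a fixed encoding and error handler.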

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread M.-A. Lemburg
On 2008-12-06 01:48, Nick Coghlan wrote:
 You can't display a non-decodable filename to the user, hence the user
 will have no idea what they're working on. Non-filesystem related apps
 have no business trying to deal with insane filenames.

This is not entirely true: OSes, shells, and applications will
typically represent the file names using either ?-replacements or
some form of hex or decimal escapes for the characters they can't
decode. Since humans are usually very good at pattern recognition,
this goes a long way.
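
The two display styles can be tried with error handlers that ship with Python 3 ('backslashreplace' works for decoding from Python 3.5 on). This is a small sketch; `display_forms` and `RAW_NAME` are hypothetical names, and note that the built-in 'replace' handler inserts U+FFFD rather than a literal question mark when decoding.

```python
RAW_NAME = b"Andr\xe9.txt"  # Latin-1 bytes, not valid UTF-8


def display_forms(raw, encoding="utf-8"):
    """Decode an undecodable name two ways: replacement char and hex escape."""
    return {
        "replace": raw.decode(encoding, "replace"),
        "backslashreplace": raw.decode(encoding, "backslashreplace"),
    }
```

Either form leaves enough of the name intact for a person to recognise the file.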

Of course, how the application maps that partially converted file name
back to the real thing is another issue and that's something that
Python should not make harder than it should be.

 Linux is moving towards a standard of UTF-8 for filenames, and once we
 get to the point where the idea of encoding filenames and environment
 variables any other way is seen as crazy, then the Python 3 approach
 will work seamlessly.

It's going to take a long time before file names, environment variables
and command line parameters are all encoded using UTF-8, so "practicality
beats purity" will have to get more attention in this thread.

Python APIs should work out of the box most of the time.

Currently, if you live in a non-ASCII and non-pure-UTF-8 environment,
you have to deal with different and mixed encodings on a regular
basis.

Whether it's a USB stick you're trying to read, a ZIP file
you're trying to open, or a mounted network drive, the problem
pops up in many different areas.

If I write "do_something.py *", I expect Python to indeed work on
all the files in my directory, not just the ones that happen to
fit a particular encoding.

If I hook up a CGI script written in Python with a web server,
I expect all data to be received by the script, not just data
that happens to be UTF-8 encoded.

 In the meantime, raw bytes APIs will provide an alternative for those
 that disagree with that philosophy.

I think that's a wrong way to put it: The problems are not made
up by people who disagree with the one-encoding-for-everything
strategy.

The problems occur in real-life IT processing all the time - maybe
not so much in places where English scripts dominate, but certainly
in most other places with non-English scripts.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread M.-A. Lemburg
On 2008-12-08 19:26, Guido van Rossum wrote:
 On Sun, Dec 7, 2008 at 3:53 PM, Terry Reedy [EMAIL PROTECTED] wrote:
 Here is a possible use case: I want filenames as 3.0 strings and I
 anticipate no problems at present but, as you say above, something might
 happen years in the future.  I am using 3.0 *because* of the strings ==
 unicode feature.  I would like to write

 try:
  files = os.listdir(somedir, errors = strict)
 except OSError as e:
  log(verbose error message that includes somedir and e)
  files = os.listdir(somedir)

 and go on without the problem file but not without logging the problem so a
 future maintainer can consider what to do about it, but only when there is
 an actual need to think about it.

If that error parameter is the same as in unicode(value, errors),
then this would be a useful feature:

People could then choose among the already existing error handlers
('strict', 'ignore', 'replace', 'xmlcharrefreplace') or register
their own ones via the codecs module.

Such application specific error handlers could then also apply
whatever fancy round-trip safe encoding of non-decodable bytes
to Unicode escapes, private code points, etc. as seen fit by the
application.

Perhaps we should also add an encoding parameter that can be
set on a per-directory basis (if necessary) and defaults to the
global file system encoding.

If an application hits a directory that is known to cause problems,
it could then choose to receive the file names in a different,
more suitable encoding. This allows implementing fallback
mechanisms with a list of common encodings for a locale.
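
Such a fallback mechanism can be sketched at the application level without any change to os.listdir(). The helper `decode_with_fallbacks` and its default encoding list are assumptions for illustration; a real application would pick the list to suit its locale.

```python
def decode_with_fallbacks(raw, encodings=("utf-8", "cp1252", "latin-1")):
    """Try each encoding strictly; return (text, encoding) for the first that fits."""
    for encoding in encodings:
        try:
            return raw.decode(encoding, "strict"), encoding
        except UnicodeDecodeError:
            continue
    # Nothing matched: fall back to a lossy but non-failing decode.
    return raw.decode(encodings[0], "replace"), None
```

Order matters here: a permissive single-byte encoding such as latin-1 accepts any byte string, so it only makes sense as the last entry.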

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread M.-A. Lemburg
On 2008-12-08 21:45, Antoine Pitrou wrote:
 M.-A. Lemburg mal at egenix.com writes:
 Such application specific error handlers could then also apply
 whatever fancy round-trip safe encoding of non-decodable bytes
 to Unicode escapes, private code points, etc. as seen fit by the
 application.
 
 I'd argue that such fancy round-trip safe error handler should be provided by
 Python. It's not reasonable to expect application coders to come up with their
 own codec variation based on subtle details of the unicode spec.

Fair enough. We could add some, e.g.:

 * a round-trip safe escape error handler that uses a Unicode private
   code point area which we officially reserve for the Python
   interpreter

 * a human readable escape error handler that encodes the problem
   bytes to say hex escapes, e.g. gives Andr\xe9 for a Latin-1
   encoded directory name instead of failing

 * a warning error handler that replaces the problem cases with
   a question mark and issues a warning through the warning
   framework
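
The third idea in the list above can be sketched with the existing codecs.register_error() hook. The handler name 'warnreplace' and the choice of UnicodeWarning are assumptions; nothing by that name ships with Python.

```python
import codecs
import warnings


def warnreplace(exc):
    """Decode error handler: substitute '?' for bad bytes and emit a warning."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc  # this sketch only handles decoding errors
    bad = exc.object[exc.start:exc.end]
    warnings.warn(
        "undecodable bytes %r replaced with '?'" % (bad,), UnicodeWarning
    )
    # Return the replacement text and the position where decoding resumes.
    return "?" * len(bad), exc.end


codecs.register_error("warnreplace", warnreplace)
```

After registration, `b"Andr\xe9".decode("utf-8", "warnreplace")` goes through the handler, and the warning can be filtered or logged with the usual warnings machinery.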

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread M.-A. Lemburg
On 2008-12-08 22:32, Adam Olsen wrote:
 On Mon, Dec 8, 2008 at 2:01 PM, M.-A. Lemburg [EMAIL PROTECTED] wrote:
 On 2008-12-08 21:45, Antoine Pitrou wrote:
 M.-A. Lemburg mal at egenix.com writes:
 Such application specific error handlers could then also apply
 whatever fancy round-trip safe encoding of non-decodable bytes
 to Unicode escapes, private code points, etc. as seen fit by the
 application.
 I'd argue that such fancy round-trip safe error handler should be
 provided by Python. It's not reasonable to expect application coders to
 come up with their own codec variation based on subtle details of the
 unicode spec.
 Fair enough. We could add some e.g.

  * a round-trip safe escape error handler that uses a Unicode private
   code point area which we officially reserve for the Python
   interpreter
 
 This would of course alter the behaviour of those private code points,
 preventing them from round-tripping properly.
 
 I don't think round-tripping can be done from an error handler.  You
 need a full codec to do it.  A simple option is 8859-1.  Or, ya know,
 bytes.  This has long since gotten repetitive..

The error handler would just map the problem bytes to the private
area. The application would then have to decide what to do with
them, ie. the error handler only provides one half of the round-
tripping.

And that's on purpose: I don't believe we can come up with some magic
solution for the encodings problem. This is essentially something
that applications will have to solve on a case-by-case basis.
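
A minimal sketch of the first half described above, assuming Python 3 and
the codecs.register_error() hook. The handler name "puaescape", the base
code point U+F700 and the helper pua_unescape() are made up for this
illustration and are not an officially reserved range or API:

```python
import codecs

# Hypothetical choice: map byte 0xNN to the private use code point U+F7NN.
PUA_BASE = 0xF700


def pua_escape(exc):
    """Decode error handler: map each undecodable byte into the private area."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join(chr(PUA_BASE + b) for b in bad), exc.end


codecs.register_error("puaescape", pua_escape)


def pua_unescape(text, encoding="utf-8"):
    """Application-side second half: turn escaped code points back into bytes."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if PUA_BASE <= cp <= PUA_BASE + 0xFF:
            out.append(cp - PUA_BASE)
        else:
            out += ch.encode(encoding)
    return bytes(out)


raw = b"Andr\xe9"                      # Latin-1 bytes, not valid UTF-8
text = raw.decode("utf-8", "puaescape")
print(ascii(text))
print(pua_unescape(text) == raw)
```

The error handler only covers the decoding direction. Whether and how the
escaped code points are mapped back is left to the application, here via
the separate pua_unescape() helper. Text that already contains code points
from the chosen range would not round-trip, which is the objection raised
in the quoted reply.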

  * a human readable escape error handler that encodes the problem
   bytes to say hex escapes, e.g. gives Andr\xe9 for a Latin-1
   encoded directory name instead of failing
 
 Similar to 'ö'.encode('ascii', 'backslashreplace')?  I'm +1 on making that
 work.

Yes.
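
For illustration, and assuming Python 3.5 or later (where the standard
'backslashreplace' handler also works when decoding), the behaviour
described above looks like this:

```python
raw = b"Andr\xe9"                       # Latin-1 encoded name
text = raw.decode("ascii", "backslashreplace")
print(text)                             # the bad byte shows up as a hex escape
```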

  * a warning error handler that replaces the problem cases with
   a question mark and issues a warning through the warning
   framework
 
 I dub thee errors='warnreplace'.

Yep, something along those lines.
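
A rough sketch of what such a handler could look like, assuming Python 3.
The name 'warnreplace' is taken from the suggestion above; the choice of
UnicodeWarning and of one question mark per bad byte are assumptions made
for this example:

```python
import codecs
import warnings


def warn_replace(exc):
    """Decode error handler: warn, then replace each bad byte with '?'."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    warnings.warn("replaced undecodable bytes %r" % bad,
                  UnicodeWarning, stacklevel=2)
    return "?" * len(bad), exc.end


codecs.register_error("warnreplace", warn_replace)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    text = b"Andr\xe9".decode("ascii", "warnreplace")

print(text)
print(len(caught))
```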

Perhaps there are more and better alternatives. These suggestions
are just to show how the idea could be put to some real-life use.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread M.-A. Lemburg
On 2008-12-08 22:39, Victor Stinner wrote:
 ('strict', 'ignore', 'replace', 'xmlcharrefreplace')
 
 replace (or xmlcharrefreplace) is just useless because you will not be able
 to open or rename the file... You just know that there is a strange file in
 the directory.

Right, but that's already a lot better than not knowing of the
file's existence at all :-)

Note that the above are standard error handlers for Unicode
conversions. The rest of the email you cut away has more useful
error handlers for the purpose in question.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python for windows.

2008-11-28 Thread M.-A. Lemburg
On 2008-11-28 00:15, Christian Heimes wrote:
 Martin v. Löwis wrote:
 All, and not to start flames, but I still do not understand why
 applink.c isn't included in python's main (conditionally) instead of
 expecting users, many of them novices, to do the build.  ???

 One reason is that I don't know what applink is, and why I should
 care about it. (I may have known in the past, but then I have forgotten
 since).
 
 Applink is roughly explained at
 http://www.openssl.org/support/faq.html#PROG2. The matter was discussed
 about half a year ago but no decision was made. See
 http://mail.python.org/pipermail/python-dev/2008-March/077424.html
 
 applink.c is just a table of integer constants to function pointers. It
 makes mixing of different CRTs secure. You'll get the idea after reading
 the file, Martin. A similar approach could be useful for Python, too.

So that's why we don't see a problem with pyOpenSSL. From the first
link:


Your application must link against the same version of the Win32 C-Runtime
against which your openssl libraries were linked. The default version for
OpenSSL is /MD - Multithreaded DLL.


and later on:


As per 0.9.8 the above limitation is eliminated for .DLLs. ...
Instead of re-compiling OpenSSL toolkit, ...[you have to add]
install-root/include/openssl/applink.c ... to your application project
or simply #include-d in one [and only one] of your application source files.
...
[Note that] it is as important to add CRYPTO_malloc_init prior first call
to OpenSSL.


In our eGenix pyOpenSSL distribution we ship the Windows DLLs for
OpenSSL together with the compiled PYDs for pyOpenSSL - all compiled
using the same compiler settings.

Python for Windows does the same, so there should be no issue either.

From the comment it appears that you only see problems, if you try to use
those extensions from a Python executable that was compiled using
different settings, e.g. an embedded Python interpreter.

Note that neither Python nor pyOpenSSL call the required CRYPTO_malloc_init()
prior to using the other SSL APIs, so even including applink.c would
not help - you have to add this call to the used extensions as well.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] datetime and timedelta enhancement

2008-11-17 Thread M.-A. Lemburg
On 2008-11-16 02:14, Nick Coghlan wrote:
 M.-A. Lemburg wrote:
 Guess I could add a .weeks attribute to mxDateTime, but no one ever
 asked for that so far.
 
 Given that there are at least 3 different ways to define the number of
 weeks between two dates, it may be something best left to applications
 to worry about.

I'd just use the term "weeks" as meaning 7 full days and then
return a float value for fractions. That's the same convention used
for .days. Anything more complicated would need to use DateTime
values (see below).
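
With the standard library's datetime module (Python 3.2 or later, used here
as a stand-in since mxDateTime has no .weeks attribute), this convention
can be sketched as a plain division; the weeks() helper is hypothetical:

```python
from datetime import timedelta


def weeks(delta):
    """Return the length of delta in weeks of 7 full days, as a float."""
    return delta / timedelta(weeks=1)


print(weeks(timedelta(days=14)))
print(weeks(timedelta(days=10)))
```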

 OOo implements 2 of them [1] for its WEEKS() function, and there's then
 a fairly obvious 3rd variant based on a Sunday to Saturday week.
 
 Cheers,
 Nick.
 
 [1]
 http://wiki.services.openoffice.org/wiki/Documentation/How_Tos/Calc:_WEEKS_function

If you need ISO week counting or any other date based counting
mechanism, you need to know the two DateTime values you're dealing
with and possibly the calendar you're using.

mxDateTime has an .iso_week attribute to help with this, e.g.

>>> Date(2008,11,17).iso_week
(2008, 47, 1)

http://en.wikipedia.org/wiki/ISO_week
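
The standard library offers the same information via date.isocalendar().
A short example for comparison, using the stdlib instead of mxDateTime:

```python
from datetime import date

# ISO year, ISO week number and ISO weekday (1 = Monday)
year, week, weekday = date(2008, 11, 17).isocalendar()
print(year, week, weekday)
```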

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] datetime and timedelta enhancement

2008-11-15 Thread M.-A. Lemburg
On 2008-11-14 23:59, Victor Stinner wrote:
 Hi,
 
 There are some interesting tickets about the datetime module:
 #1673409: datetime module missing some important methods
 #1083: Confusing error message when dividing timedelta using /
 #2706: datetime: define division timedelta/timedelta
 #4291: Allow Division of datetime.timedelta Objects
 
 Wanted features:
 1- convert a datetime object to an epoch value (number of seconds since
1 January 1970), e.g. with a new totimestamp() method
 2- convert a timedelta to a specific unit (eg. seconds, days, weeks, etc.)
 3- compute the ratio of two timedelta, eg. for a progress bar

Since the datetime module turned out to be mostly a reimplementation
of mxDateTime, why not continue down that road ?

http://www.egenix.com/products/python/mxBase/mxDateTime/

Let's see:

>>> from mx.DateTime import *

>>> DateTime(2008,11,15).ticks()
1226703600.0

>>> TimeDelta(seconds=100)
<mx.DateTime.DateTimeDelta object for '00:01:40.00' at 837030>

>>> TimeDelta(seconds=100) / TimeDelta(seconds=50)
2.0

>>> TimeDelta(seconds=100).seconds
100.0

>>> TimeDelta(seconds=100).days
0.0011574074074074073

>>> TimeDelta(seconds=100).weeks
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: weeks

Guess I could add a .weeks attribute to mxDateTime, but no one ever
asked for that so far.
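
For comparison, here is a sketch of the three wanted features using the
standard datetime module as it exists in later Python 3 releases
(timestamp() needs 3.3, timedelta division needs 3.2). The UTC time zone is
an explicit choice made for this example so that the result does not depend
on the machine's local time:

```python
from datetime import datetime, timedelta, timezone

# 1. datetime -> seconds since the epoch (explicitly in UTC here)
stamp = datetime(2008, 11, 15, tzinfo=timezone.utc).timestamp()

# 2. timedelta -> a specific unit
delta = timedelta(seconds=100)
seconds = delta.total_seconds()
days = delta / timedelta(days=1)

# 3. ratio of two timedeltas, e.g. for a progress bar
ratio = delta / timedelta(seconds=50)

print(stamp, seconds, days, ratio)
```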

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [ANN] VPython 0.1

2008-10-24 Thread M.-A. Lemburg
On 2008-10-24 09:53, J. Sievers wrote:
 M.-A. Lemburg [EMAIL PROTECTED] writes:
 
 [snip]
 BTW: I hope you did not use pybench to get profiles of the opcodes.
 That would most certainly result in good results for pybench, but
 less good ones for general applications such as Django or Zope/Plone.
 
 Algorithm used for superinstruction selection:
 
 1) idea: LOAD_CONST/LOAD_FAST + some suffix
 2) potential suffixes:
$ grep '..*(..*--..*)$' ceval.vmg | grep 'a1 a2 --' > INSTRUCTIONS
 3) delete any instruction that I felt wouldn't be particularly frequent
from INSTRUCTIONS (e.g. raise_varargs)
 4) use awk to generate superinstruction definitions
 
 Not only is this relatively unbiased but also very low effort.

Well, the "I felt wouldn't be particularly frequent" part does sound
a bit biased, but you obviously made good choices ;-)

I thought you used the tracing functions that Vmgen provides for
determining which combinations occur more often. That's how I worked
back then - I instrumented the interpreter and then let it run for
a few days doing whatever I worked on or with at the time.
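
As a much simpler stand-in for that kind of instrumentation, the sketch
below counts adjacent opcode pairs statically with the dis module. This is
not the tracing approach described above (it looks at compiled bytecode,
not at what is actually executed), and opcode_pairs() and sample() are
names invented for the example:

```python
import dis
from collections import Counter


def opcode_pairs(func):
    """Count how often each pair of adjacent opcodes appears in func's bytecode."""
    names = [ins.opname for ins in dis.get_instructions(func)]
    return Counter(zip(names, names[1:]))


def sample(a, b):
    total = a + b
    return total * 2


pairs = opcode_pairs(sample)
for pair, count in pairs.most_common(3):
    print(pair, count)
```

The opcode names printed depend on the Python version, so no particular
output is implied here. Pairs that show up often in real workloads would be
the candidates for superinstructions.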

I then found that it makes sense to process LOAD_FAST completely
outside the switch statement and to move common opcodes such as
CALL_* to the switch with the most used opcodes. Inlining the code
for calling C functions/methods also made a difference, since most
calls in Python are to C functions/methods.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread M.-A. Lemburg
On 2008-10-23 09:08, J. Sievers wrote:
 a) It's fairly easy to implement different types of dispatch, simply by
 changing a few macros (and while I haven't done this, it shouldn't be a
 problem to add some switch dispatch #ifdefs for non-GCC platforms).
 
 In particular, direct threaded code leads to less horrible branch
 prediction than switch dispatch on many machines (exactly how
 pronounced this effect is depends heavily on the specific
 architecture).

Since VPython is GCC-only, how about naming the patch PyGCCVM ?!

I doubt that you'll find the same kind of performance increase
when using switch-based dispatching, but using more profile based
optimizations, it should be possible to come up with a solution
that provides a few 10% performance increase while still remaining
portable and readable.

When working on the switch statement in ceval some 10 years ago
it was easy to get a 10-20% performance increase by just moving
the switch cases around, breaking the switch in two (one group of
opcodes that are used a lot and one for the less often used ones) and
then introducing a few fast paths via goto.

However, at that time CPUs had much smaller internal caches and
the 1.5.2 ceval VM had obviously hit some cache size limit on
my machine, since breaking the switch in two provided the best
performance increase. With today's CPUs, this shouldn't be a
problem anymore.

BTW: I hope you did not use pybench to get profiles of the opcodes.
That would most certainly result in good results for pybench, but
less good ones for general applications such as Django or Zope/Plone.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [ANN] VPython 0.1

2008-10-23 Thread M.-A. Lemburg
On 2008-10-23 15:19, David Ripton wrote:
 On 2008.10.23 12:02:12 +0200, M.-A. Lemburg wrote:
 BTW: I hope you did not use pybench to get profiles of the opcodes.
 That would most certainly result in good results for pybench, but
 less good ones for general applications such as Django or Zope/Plone.
 
 I was wondering about Pybench-specific optimization too, so last night I
 ran a few dozen of my projecteuler.net solver programs with VPython.
 Excluding the ones that are so fast that startup overhead dominates
 runtime, the least speedup I saw versus vanilla 2.5.2 was ~10%, the best
 was ~50%, and average was ~30%.  Pretty consistent with Pybench.

Thanks. That's good to know.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [ANN] VPython 0.1

2008-10-22 Thread M.-A. Lemburg
On 2008-10-22 14:16, J. Sievers wrote:
 Hi,
 
 I implemented a variant of the CPython VM on top of Gforth's Vmgen; this made
 it fairly straightforward to add direct threaded code and superinstructions
 for the various permutations of LOAD_CONST, LOAD_FAST, and most of the
 two-argument VM instructions.

I suppose you get most of the speedup by using threaded code. Unfortunately,
that is only supported by gcc.

Do you get similar results for the switch based method that appears to be
available in VMgen ?

http://www.complang.tuwien.ac.at/anton/vmgen/html-docs/VM-engine.html

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python3UnicodeDecodeError

2008-10-07 Thread M.-A. Lemburg
On 2008-10-07 22:18, Fred Drake wrote:
 On Oct 7, 2008, at 4:06 PM, Martin v. Löwis wrote:
  b) I would propose that the notion of a default encoding is entirely
 eliminated from Python, along with sys.(get|set)defaultencoding
 
 +1

As already mentioned in my reply to Viktor: +1. It's not adjustable
anymore, so we might as well get rid of the sys module APIs.

The term "default encoding" itself still has some value in that it
is associated with the C API char* encoding used for PyUnicode
objects in Python 3.0.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-10-01 Thread M.-A. Lemburg
On 2008-10-01 09:54, Ulrich Eckhardt wrote:
 On Tuesday 30 September 2008, M.-A. Lemburg wrote:
 On 2008-09-30 08:00, Martin v. Löwis wrote:
 Change the default file system encoding to store bytes in Unicode is
 like introducing a new Python type: fake Unicode for filename hacks.
 Exactly. Seems like the best solution to me, despite your polemics.
 Not a bad idea... have os.listdir() return Unicode subclasses that work
 like file handles, ie. they have an extra buffer that holds the original
 bytes value received from the underlying C API.
 
 Why does it have to be a Unicode subclass? In my eyes, a Unicode object 
 promises a few things, in particular that it contains a Unicode string. If it 
 now suddenly contains bytes without any further meaning, that would be bad.

Please read my entire email. I was proposing to store the underlying
non-decodable byte string value in such a subclass. The Unicode value
of the object would then be that underlying value decoded as e.g.
Latin-1 in order to be able to work on it as text.

Path operations would have to be made aware of such subclasses and
operate on the underlying bytes value.

However, like Guido mentioned, this only works if all components are
indeed aware of such subclasses... and that's likely to fail for
code outside the stdlib.
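
A minimal sketch of such a subclass, assuming Python 3. The class name
FileName and its raw attribute are invented for this illustration; nothing
like it exists in the stdlib:

```python
class FileName(str):
    """Hypothetical str subclass that remembers the raw bytes of a file name."""

    def __new__(cls, raw, encoding="utf-8"):
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            # Fall back to Latin-1, which can decode any byte sequence.
            text = raw.decode("latin-1")
        self = super().__new__(cls, text)
        self.raw = raw          # original bytes as received from the OS
        return self


name = FileName(b"Andr\xe9")
print(ascii(name))              # text value, decoded via the Latin-1 fallback
print(name.raw)                 # underlying bytes, usable for opening the file

joined = name + ".txt"
print(type(joined))             # plain str: the raw bytes did not survive
```

The last two lines show the weakness mentioned above: any code that is not
aware of the subclass, including plain string concatenation, silently
returns an ordinary str and loses the bytes value.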

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread M.-A. Lemburg
On 2008-09-30 08:00, Martin v. Löwis wrote:
 Change the default file system encoding to store bytes in Unicode is like 
 introducing a new Python type: fake Unicode for filename hacks.
 
 Exactly. Seems like the best solution to me, despite your polemics.

Not a bad idea... have os.listdir() return Unicode subclasses that work
like file handles, ie. they have an extra buffer that holds the original
bytes value received from the underlying C API.

Passing these handles to open() would then do the right thing by using
whatever os.listdir() got back from the file system to open the file,
while still providing a sane way to display the filename, e.g. using
question marks for the invalid characters.

The only problem with this approach is concatenation of such handles
to form pathnames, but then perhaps those concatenations could just
work on the bytes value as well (I don't know of any OS that uses non-
ASCII path separators).

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread M.-A. Lemburg
On 2008-09-30 18:46, Guido van Rossum wrote:
 On Tue, Sep 30, 2008 at 8:20 AM, M.-A. Lemburg [EMAIL PROTECTED] wrote:
 In the end, I think it's better not to be clever and just return
 the filenames that cannot be decoded as bytes objects in os.listdir().
 
 Unfortunately that's going to break most code that is using
 os.listdir(), so it's hardly an improved experience.

Right, but this also signals a problem to the application and the
application is in the best position to determine a proper
work-around.

 Passing those to open() will then open the files as expected, in most
 other cases the application will have to provide explicit conversions
 in whatever way best fits the application.
 
 In most cases the app will try to concatenate a pathname given as a
 string and then it will fail.

True, and that's the right thing to do in those cases.
The application will have to deal with the problem, e.g. convert
the path to bytes and retry the joining, or convert the bytes string
to Latin-1 and then convert the result back to bytes (using Latin-1)
for passing it to open() (which will of course only work if there are
no non-Latin-1 characters in the path dir), or apply a different
filename encoding based on the path and then retry to convert the
bytes filename into Unicode, or ask the user what to do, etc.
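
Two of these work-arounds, sketched for Python 3. The directory "/data"
and the file name are made up for the example, and os.fsencode() (added in
Python 3.2) is used for the conversion of the text path to bytes:

```python
import os

raw_name = b"Andr\xe9.txt"      # as returned by the OS, not valid UTF-8
base = "/data"                  # hypothetical directory given as text

# Work-around 1: convert the text path to bytes and join on bytes.
path_bytes = os.path.join(os.fsencode(base), raw_name)

# Work-around 2: go through Latin-1, which maps every byte to one code
# point, join as text, then convert back to bytes for open().
as_text = raw_name.decode("latin-1")
path_text = os.path.join(base, as_text)
back_to_bytes = path_text.encode("latin-1")

print(path_bytes)
print(back_to_bytes == path_bytes)
```

The second variant only works as long as the rest of the path contains no
characters outside Latin-1, as noted above.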

There are many possibilities to solve the problem, apply a work-around,
or inform the user of ways to correct it.

 Also note that os.listdir() isn't the only source of filenames. You
 often read them from a file, a database, some socket, etc, so letting
 the application decide what to do is not asking too much, IMHO.
 
 In all those cases, the code that reads them is responsible for
 picking an encoding or relying on a default encoding, and the
 resulting filenames are always expressed as text, not bytes. I don't
 think it's the same at all.

What I was trying to say is that you run into the same problem
in other places as well. Trying to have os.listdir() implement
some strategy is not going to solve the problem at large.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-29 Thread M.-A. Lemburg
On 2008-09-29 12:50, Ulrich Eckhardt wrote:
 On Sunday 28 September 2008, Gregory P. Smith wrote:
 broken systems will always exist.  Code to deal with them must be
 possible to write in python 3.0.

 since any given path (not just fs) can have its own encoding it makes
 the most sense to me to let the OS deal with the errors and not try to
 enforce bytes vs string encoding type at the python lib. level.
 
 Actually I'm afraid that that isn't really useful. I, too, would like to kick
 people into fixing their systems or using the proper codepage while mounting,
 etc., but that is not going to happen soon. Just ignoring those broken systems
 is tempting, but alienating a large group of users isn't IMHO worth it.
 
 Instead, I'd like to present a different approach:
 
 1. For POSIX platforms (using a byte string for the path):
 Here, the first approach is to convert the path to Unicode, according to the 
 locale's CTYPE category. Hopefully, it will be UTF-8, but also codepages 
 should work. If there is a segment (a byte sequence between two path 
 separators) where it doesn't work, it uses an ASCII mapping where possible 
 and codepoints from the Private Use Area (PUA) of Unicode for the 
 non-decodable bytes.
 In order to pass this path to fopen(), each segment would be converted to a 
 byte string again, using the locale's CTYPE category except for segments 
 which use the PUA where it simply encodes the original bytes.

I'm not sure how this would work. How would you map the private use
code points back to bytes ? Using a special codec that knows about
these code points ? How would the fopen() know to use that special
codec instead of e.g. the UTF-8 codec ?

BTW: Private use areas in Unicode are meant for e.g. company specific
code points. Using them for escaping purposes is likely to cause problems
due to assignment clashes.
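
For reference, later Python 3 releases answer the "which codec maps them
back" question with an error handler that is used in both directions: the
'surrogateescape' handler from PEP 383, which escapes to lone surrogates
rather than to private use code points. A small example, assuming
Python 3.1 or later:

```python
raw = b"Andr\xe9"                               # not valid UTF-8

# Decoding: the bad byte 0xE9 becomes the lone surrogate U+DCE9.
text = raw.decode("utf-8", "surrogateescape")
print(ascii(text))

# Encoding with the same handler restores the original byte.
restored = text.encode("utf-8", "surrogateescape")
print(restored == raw)
```

The same handler has to be named when encoding, so the round trip still
relies on every caller knowing about the convention.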

Regarding the subject of file names:

On Unix, it's well possible to have to deal with 2-3 different file
systems mounted on a machine. Each of those may use a different file name
encoding or not support file name encoding at all.

If the OS doesn't guarantee a consistent file name encoding, then
why should Python try to emulate this on top of the OS ?

I think it's more important to be able to open a file, than to have
a readable file name when printing it to stdout, e.g. I wouldn't be able
to tell whether some Chinese file name makes sense or not, but if I know
that all files in a directory are meant for processing I should be able
to iterate over them regardless of whether they make sense or not.
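
In Python 3 this can be done by staying in bytes throughout, since
os.listdir() returns bytes names when given a bytes path and open() accepts
them. A small sketch follows; read_all() is a made-up helper and the
temporary directory only exists to make the example self-contained:

```python
import os
import tempfile


def read_all(directory):
    """Read every regular file in directory without decoding any file name."""
    bdir = os.fsencode(directory)
    contents = {}
    for name in os.listdir(bdir):           # bytes in, bytes out
        path = os.path.join(bdir, name)
        if os.path.isfile(path):
            with open(path, "rb") as fh:
                contents[name] = fh.read()
    return contents


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "a.txt"), "wb") as fh:
        fh.write(b"hello")
    result = read_all(tmp)

print(result)
```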

 2. For win32 platforms, the path is already Unicode (UTF-16) and the whole 
 problem is solved or not solved by the OS.
 
 In the end, both approaches yield a path represented by a Unicode string for 
 intermediate use, which provides maximum flexibility. Further, it 
 preserves broken encodings by simply mapping their byte-values to the PUA 
 of Unicode. Maybe not using a string to represent a path would be a good 
 idea, too. At least it would make it very clear that the string is not 
 completely free-form.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Add python.exe to PATH environment variable

2008-09-03 Thread M.-A. Lemburg
On 2008-09-03 04:12, Greg Ewing wrote:
 M.-A. Lemburg wrote:
 
 The problem is: how to undo those changes without accidentally
 undoing an explicit change made by the user ?
 
 Is that really much of an issue? If the PATH contains an
 entry corresponding to the Python installation that's
 being uninstalled, then it's not going to work once the
 installation is gone, so removing it isn't going to do
 much harm.

You have a point there :-)

 In any case, the danger could be reduced by picking
 some distinctive name for a new environment variable that
 a user isn't likely to come up with on their own, such
 as __AUTOPYEXECDIR__, setting that to the Python directory,
 and adding it to PATH. The uninstaller can then be fairly
 selective about what it removes.
 
 BTW: Adding the Python dir to the PATH per default would cause
 problems for users who regularly have multiple different
 Python installations on a machine.
 
 No more problem than having it set the file associations,
 as far as I can see. If you have multiple Pythons, you're
 going to have to be explicit about which one you want
 from the command shell anyway, and not rely on a PATH
 setting.

True, I have configured Windows to provide several
"Open with Python x.x" entries in order to have more flexibility.

However, always having the latest version on PATH is not
an option either, since e.g. I wouldn't want all .py scripts
to be run by Python 3.0 just because I installed it for
testing purposes.

 If this is done, it should be an install option and not forced.
 
 Certainly it should be an option. I'm not sure about
 having it disabled by default, though, since naive users
 are the ones that stand to benefit most from it, yet
 they're least likely to know that they need to turn it
 on.

In my experience, Windows apps seem to be moving away from
cluttering up PATH by adding each and every single app
dir to it.

Instead, if they bother at all, they simply place a .bat or
small .exe into the Windows system dir (which already is
on PATH).

Perhaps we could have an option to place a python.bat
into C:\Windows\ or C:\Windows\System\.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Add python.exe to PATH environment variable

2008-09-03 Thread M.-A. Lemburg
On 2008-09-03 10:15, Cesare Di Mauro wrote:
 On 03 sep 2008 at 00:50:13, M.-A. Lemburg [EMAIL PROTECTED] wrote:
 
 There already is a menu entry that starts the Python interpreter
 on Windows, so why not use that ?
 
 Because i need to start Python from folders which have
 files that define a specific environment.
 
 I have several servers and applications that I develop and test this way.

Same here, but I usually have an env.bat that sets up whatever
environment I need (including the required Python version) and
run that when opening a prompt to work on a particular project.

 Also .py files are automatically associated with the last installed
 Python interpreter, so the double-clicking on .py files works and is
 probably the most common way of starting a Python file on Windows.
 
 99% of time I run Python from a command prompt (on specific
 directories).
 
 I use the default menu entry only when I have to play with Python to test some
 pieces of code.

IMHO, the only point of having the installer do this for the user
is for the case where the user does not know how to manipulate
PATH on Windows, but still wants to use the command line to access it
directly.

How many users would fit that category ?

 Adding paths to the PATH variable is not easy on Windows, esp. if
 you want to support multiple Windows versions. The global PATH
 settings are not read from autoexec.bat anymore (only once at boot
 time). Instead those environment variables are managed via the
 registry.

 See e.g.

 http://agiletesting.blogspot.com/2005/06/handling-path-windows-registry-value.html

 for how to setup PATH to your liking using Python.

 The problem is: how to undo those changes without accidentally
 undoing an explicit change made by the user ?

 BTW: Adding the Python dir to the PATH per default would cause
 problems for users who regularly have multiple different
 Python installations on a machine. If this is done, it should
 be an install option and not forced.
 
 Let the user decide whether to update the PATH envar by marking a
 checkbox in the setup process, with a note that the
 changes will NOT be reverted when uninstalling.

Hmm, I don't think that's a good way to go about this. The uninstall
should undo all changes.

-- 
Marc-Andre Lemburg
eGenix.com



[Python-Dev] 2.6b3 Windows installers

2008-09-02 Thread M.-A. Lemburg
The download page doesn't list any Windows installer for 2.6b3:

http://www.python.org/download/releases/2.6/

I suppose this is because Martin builds the installers and is not
available at the moment.

Since Python on Windows will likely only get very few beta testers
without a Windows installer build, I'd suggest postponing the
RC1 release that's planned for tomorrow to get more feedback for the
Windows builds.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Add python.exe to PATH environment variable

2008-09-02 Thread M.-A. Lemburg
On 2008-09-02 23:14, Terry Reedy wrote:
 Tarek Ziadé wrote:
 
 So I don't see any good reason (besides the technical complexity)
 
 Unless and until someone is able and willing to deal with the technical
 complexity, that would seem to be a sufficient reason.
 
  to [not, I presume] add  it to the Windows installer.
 
 So I would love to see this ticket open again; I personally would be
 in favor of an automatic change of PATH by the installer.
 
 Martin said he would discuss a patch when there is a patch to discuss.
 He feels strongly about there being a clean uninstall, including PATH
 restoration if it is changed.
 
 The problem is that a) the Windows way to run user apps is via icons
 and menus and that b) the old DOS path/command way, based on Unix, has
 been neglected.
 
 An alternative to manipulating PATH would be to make and add to the
 Start Menu a Command Prompt shortcut, call it Command Window or
 something, that starts in the Python directory.  Then one could enter
 python or Scripts/goforit without chdir-ing to the Python directory
 first.  The background could even be set to green, for instance, to
 differentiate it from the standard Command Prompt window.

There already is a menu entry that starts the Python interpreter
on Windows, so why not use that ?

Also .py files are automatically associated with the last installed
Python interpreter, so the double-clicking on .py files works and is
probably the most common way of starting a Python file on Windows.

Adding paths to the PATH variable is not easy on Windows, esp. if
you want to support multiple Windows versions. The global PATH
settings are not read from autoexec.bat anymore (only once at boot
time). Instead those environment variables are managed via the
registry.

See e.g.

http://agiletesting.blogspot.com/2005/06/handling-path-windows-registry-value.html

for how to setup PATH to your liking using Python.

The problem is: how to undo those changes without accidentally
undoing an explicit change made by the user ?
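
One possible way to approach the undo problem is for the installer to
remember whether it actually added the entry itself. The following is only
a minimal sketch under that assumption: the helper names are hypothetical,
the code works on plain PATH strings, and it does not read or write the
registry. The "added" flag would have to be stored somewhere by the
installer so that the uninstaller can see it.

```python
def split_path(path_value, sep=";"):
    """Split a PATH string into its non-empty entries."""
    return [entry for entry in path_value.split(sep) if entry]


def _norm(entry):
    # Windows paths are case-insensitive and a trailing slash is not significant.
    return entry.rstrip("\\/").lower()


def add_to_path(path_value, directory, sep=";"):
    """Return (new_path_value, added).

    'added' is False when the directory was already listed, e.g. because
    the user put it there; in that case the value is left untouched.
    """
    entries = split_path(path_value, sep)
    if any(_norm(e) == _norm(directory) for e in entries):
        return path_value, False
    return sep.join(entries + [directory]), True


def remove_from_path(path_value, directory, added_by_installer, sep=";"):
    """Remove the directory only if the installer itself added it."""
    if not added_by_installer:
        return path_value
    kept = [e for e in split_path(path_value, sep) if _norm(e) != _norm(directory)]
    return sep.join(kept)
```

With this scheme an entry that the user had already added explicitly is
never removed on uninstall, because the installer records that it did not
add it.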

BTW: Adding the Python dir to the PATH per default would cause
problems for users who regularly have multiple different
Python installations on a machine. If this is done, it should
be an install option and not forced.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Documentation Error for __hash__

2008-08-29 Thread M.-A. Lemburg
On 2008-08-28 21:31, Michael Foord wrote:
 Hello all,
 
 The documentation for __hash__ seems to be outdated. I'm happy to submit
 a patch, so long as I am not misunderstanding something.
 
 http://docs.python.org/dev/reference/datamodel.html#object.__hash__
 
 The documentation states:
 
 If a class does not define a __cmp__() or __eq__() method it should not
 define a __hash__() operation either; if it defines __cmp__() or
 __eq__() but not __hash__(), its instances will not be usable as
 dictionary keys. If a class defines mutable objects and implements a
 __cmp__() or __eq__() method, it should not implement __hash__(), since
 the dictionary implementation requires that a key's hash value is
 immutable (if the object's hash value changes, it will be in the wrong
 hash bucket).
 
 
 This may have been true for old style classes, but as new style classes
 inherit a default __hash__ from object - mutable objects *will* be
 usable as dictionary keys (hashed on identity) *unless* they implement a
 __hash__ method that raises a type error.

Being hashable is different from being usable as dictionary key.

Dictionaries perform the lookup based on the hash value, but will
then have to check for hash collisions based on an equal comparison.

If an object does not define an equal comparison, then it is not
usable as dictionary key.
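
The two-step lookup (hash first, then an equality check to resolve
collisions) can be illustrated with a small sketch. The Key class is
hypothetical and is written for Python 3; it forces every instance into
the same hash value so that only the equality comparison can tell the
keys apart.

```python
class Key:
    """Hypothetical key type whose instances all share one hash value."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # deliberately force hash collisions

    def __eq__(self, other):
        return isinstance(other, Key) and self.name == other.name


# Both keys collide on the hash value, so the dictionary has to fall back
# on __eq__ to keep them apart and to find them again.
table = {Key("a"): 1, Key("b"): 2}
value_a = table[Key("a")]
value_b = table[Key("b")]
```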

 Shouldn't the advice be that classes that implement comparison methods
 should always implement __hash__ (wasn't this nearly enforced?),

It's probably a good idea to implement __hash__ for objects that
implement comparisons, but it won't always work and it is certainly
not needed, unless you intend to use them as dictionary keys.

 and that mutable objects should raise a TypeError in __hash__.

That's a good idea, even though it's not needed either ;-)

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Documentation Error for __hash__

2008-08-29 Thread M.-A. Lemburg
On 2008-08-29 13:03, Matt Giuca wrote:
 Being hashable is different from being usable as dictionary key.

 Dictionaries perform the lookup based on the hash value, but will
 then have to check for hash collisions based on an equal comparison.

 If an object does not define an equal comparison, then it is not
 usable as dictionary key.

 
 But if an object defines *neither* __eq__ nor __hash__, then by default it is
 usable as a dictionary key (using the id() of the object for both default
 equality and hashing, which is fine, and works for all user-defined types by
 default).
 
 An object defining __hash__ but not __eq__ is not problematic, since it
 still uses id() default for equality. (It just has potentially bad
 dictionary performance, if lots of things hash the same even though they
 aren't equal). This is not a problem by definition because *it is officially
 okay for two objects to hash the same, even if they aren't equal, though
 undesirable*.

Note that only instances have the default hash value id(obj). This
is not true in general. Most types don't implement the tp_hash
slot and thus are not hashable. Indeed, mutable types should not
implement that slot unless they know what they're doing :-)

 So all hashable objects are usable as dictionary keys, are they not? (As far
 as I know it isn't possible to have an object that does not have an equality
 comparison, unless you explicitly override __eq__ and have it raise a
 TypeError for some reason).

Sorry, I wasn't clear enough: by "not defining an equal comparison"
I meant that an equal comparison does not succeed, i.e. raises an
exception or returns Py_NotImplemented (at the C level).

Due to the default of using the id(obj) as fallback for the equal
comparison, this has to be explicitly coded for. If this is not
the case (and that's probably the most common situation),
then you're right: hashable implies usable as dictionary key.
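
Both cases can be sketched in Python 3 with two hypothetical classes: one
that relies on the defaults (identity-based hash and equality), and one
whose equal comparison is explicitly coded to fail.

```python
class Plain:
    """Relies on the default identity-based __hash__ and __eq__."""


class BrokenEq:
    """Hashable, but the equal comparison is explicitly coded to fail."""

    def __hash__(self):
        return 1

    def __eq__(self, other):
        raise TypeError("comparison not supported")


def lookup_fails(table, key):
    """Return True if looking up 'key' raises a TypeError."""
    try:
        table.get(key)
    except TypeError:
        return True
    return False


plain = Plain()
plain_table = {plain: "found"}

stored = BrokenEq()
broken_table = {stored: "found"}
```

Looking up plain_table with the same object works, and another Plain()
instance is simply a different key. In broken_table a second BrokenEq()
instance has the same hash as the stored key, so the dictionary has to
call __eq__, and the lookup raises.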

 It's probably a good idea to implement __hash__ for objects that
 implement comparisons, but it won't always work and it is certainly
 not needed, unless you intend to use them as dictionary keys.

 
 But from what I know, it is a *bad* idea to implement __hash__ for any
 mutable object with non-reference equality (where equality depends on the
 mutable state), as an unbreakable rule. This is because if they are inserted
 into a dictionary, then mutated, they may suddenly be in the wrong bucket.
 This is why all the mutable types in Python with non-reference equality (eg.
 list, set, dict) are explicitly not hashable, while the immutable types (eg.
 tuple, frozenset, str) are hashable, and so are the mutable types with
 reference equality (eg. functions, user-defined classes by default).

Right.
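
The "wrong bucket" effect described in the quoted text can be reproduced
with a small sketch. MutablePoint is a hypothetical class written for
Python 3; its hash and equality both depend on mutable state, which is
exactly the combination being warned about.

```python
class MutablePoint:
    """Hypothetical class: hash and equality depend on mutable state."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return (isinstance(other, MutablePoint)
                and (self.x, self.y) == (other.x, other.y))

    def __hash__(self):
        return self.x * 31 + self.y


point = MutablePoint(1, 2)
table = {point: "value"}

point.x = 100  # mutate the key while it is stored in the dictionary

# The entry was filed under the old hash but now compares equal only to
# the new state, so neither the old nor the new value finds it.
found_by_old_state = MutablePoint(1, 2) in table
found_by_new_state = MutablePoint(100, 2) in table
still_stored = any(key is point for key in table)
```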

By implementing __hash__ in classes you have the explicit choice of
either raising an exception or returning a useful hash value.

Again, the situation is better at the C level, since types
don't have a default tp_hash implementation, so have to explicitly
code such a slot in order for hash(obj) to work.

 and that mutable objects should raise a TypeError in __hash__.
 That's a good idea, even though it's not needed either ;-)

 
 So I think my above axioms are a better (less restrictive, and still
 correct) rule than this one. It's OK for a mutable object to define
 __hash__, as long as its __eq__ doesn't depend upon its mutable state. For
 example, you can insert a function object into a dictionary, and mutate its
 closure, and it won't matter, because neither the hash nor the equality of
 the object is changing. It's only types like list and dict, with deep
 equality, where you run into this hash table problem.

I think the documentation needs to be changed to make the defaults
explicit.

The documentation should probably say:

If you implement __cmp__ or
__eq__ on a class, also implement a __hash__ method (and either
have it raise an exception or return a valid non-changing hash
value for the object).

If you implement __hash__ on classes, you should consider implementing
__eq__ and/or __cmp__ as well, in order to control how dictionaries use
your objects.

In general, it's probably best to always implement both methods
on classes, even if the application will just use one of them.
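
As an illustration of this advice, here is a sketch with two hypothetical
classes written for Python 3: an immutable value type that returns a
non-changing hash, and a mutable type that compares by content and has
__hash__ raise an exception. Note that Python 3 already sets __hash__ to
None for a class that defines __eq__ without __hash__, so the explicit
version below mainly documents the intent.

```python
class Version:
    """Hypothetical immutable value: __eq__ and __hash__ use the same fields."""

    def __init__(self, major, minor):
        self._key = (major, minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key == other._key

    def __hash__(self):
        return hash(self._key)


class Settings:
    """Hypothetical mutable type: compares by content, refuses to be hashed."""

    def __init__(self, **values):
        self.values = dict(values)

    def __eq__(self, other):
        if not isinstance(other, Settings):
            return NotImplemented
        return self.values == other.values

    def __hash__(self):
        raise TypeError("Settings objects are mutable and not hashable")


releases = {Version(2, 6): "maintenance"}
status = releases[Version(2, 6)]  # an equal Version finds the same entry
```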

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Things to Know About Super

2008-08-27 Thread M.-A. Lemburg
On 2008-08-27 09:54, Greg Ewing wrote:
 Do you have a real-life example of this where multiple
 inheritance is actually used?
 
 A non-contrived example or two would be a good thing to
 have in tutorials etc. where super() is discussed. It
 would help to convey the kinds of situations in which
 use of super() is and is not appropriate.

The typical use is in mixin classes that can be used to
add functionality to base classes, something you often find
in application frameworks, e.g.

class NewComponent(Feature1Mixin, Feature2Mixin, BaseComponent):
   ...

If the mixin classes have to override one of the methods defined
in BaseComponent, then they must pay attention to all other mixin
classes used to define the NewComponent.

Without super() (or some other mechanism of accessing the base
method, like e.g. mxTools' basemethod() for classic classes), the
mixins could potentially override methods defined by other mixin
classes which would then not get called.

As example, think of a typical application server method

def process_request(self, request):
    ...

To work properly, each implementation of the method in the mixin classes
and base class will have to be called - in the order they were defined
in the class definition.
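
A minimal sketch of such a cooperative call chain, using the class names
from the example above. The method bodies are hypothetical, and the code
uses the Python 3 form super().process_request(...); in Python 2 the
equivalent spelling would be super(Feature1Mixin, self).process_request(...).

```python
class BaseComponent:
    def process_request(self, request):
        request.append("BaseComponent")
        return request


class Feature1Mixin:
    def process_request(self, request):
        request.append("Feature1Mixin")
        # Hand over to the next class in the method resolution order.
        return super().process_request(request)


class Feature2Mixin:
    def process_request(self, request):
        request.append("Feature2Mixin")
        return super().process_request(request)


class NewComponent(Feature1Mixin, Feature2Mixin, BaseComponent):
    pass


# Each implementation records its name, so the list shows the call order.
call_order = NewComponent().process_request([])
```

The resulting order follows the order of the bases in the class
definition. If Feature1Mixin did not call super(), the implementations in
Feature2Mixin and BaseComponent would never run.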

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] try-except slower in 2.6 (was: performance)

2008-08-25 Thread M.-A. Lemburg
On 2008-08-24 21:04, Antoine Pitrou wrote:
 TryRaiseExcept:   183ms   122ms  +49.6%   184ms   124ms  +48.2%
 Whoa, that's a big slowdown.  I wonder if it's consistent?
 
 Yes, I can definitely reproduce it.

That's a huge slow-down compared to 2.5.

Are there any obvious reasons for this ? Has the exception handling
mechanism changed that much between 2.5 and 2.6 ?
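
To check such a number independently of the benchmark suite, the cost of
raising and catching an exception can be timed with the standard timeit
module. This is only a rough sketch, not the benchmark that produced the
figures quoted above; the function names and iteration counts are
arbitrary choices.

```python
import timeit


def try_raise_except():
    """Raise and catch one exception."""
    try:
        raise ValueError
    except ValueError:
        pass


def try_without_raise():
    """Same try/except block, but nothing is raised."""
    try:
        pass
    except ValueError:
        pass


def measure(func, number=100000, repeat=5):
    """Return the best total time in seconds for 'number' calls of func."""
    return min(timeit.repeat(func, number=number, repeat=repeat))


if __name__ == "__main__":
    print("raise/except :", measure(try_raise_except))
    print("no exception :", measure(try_without_raise))
```

Running the same script under two interpreter versions gives a direct
comparison on one machine.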

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread M.-A. Lemburg
On 2008-08-22 03:25, Guido van Rossum wrote:
 On Thu, Aug 21, 2008 at 2:26 PM, M.-A. Lemburg [EMAIL PROTECTED] wrote:
 On 2008-08-21 22:35, Guido van Rossum wrote:
 I was just paid a visit by my Google colleague Mark Davis, co-founder
 of the Unicode project and the president of the Unicode Consortium. He
 would like to see improved Unicode support for Python. (Well duh. :-)
 On his list of top priorities are:

 1. Upgrade the unicodedata module to the Unicode 5.1.0 standard
 2. Extend the unicodedata module with some additional properties
 3. Add support for Unicode properties to the regex syntax, including
 Boolean combinations

 I've tried to explain our release schedule and
 no-new-features-in-point-releases policies to him, and he understands
 that it's too late to add #2 or #3 to 2.6 and 3.0, and that these will
 have to wait for 2.7 and 3.1, respectively. However, I've kept the
 door slightly ajar for adding #1 -- it can't be too much work and it
 can't have too much impact. Or can it? I don't actually know what the
 impact would be, so I'd like some input from developers who are
 closer to the origins of the unicodedata module.

 The two, quite separate, questions, then, are (a) how much work would
 it be to upgrade to version 5.1.0 of the database; and (b) would it be
 acceptable to do this post-beta3 (but before rc1). If the answer to
 (b) is positive, Google can help with (a).

 In general, Google has needs in this area that can't wait for 2.7/3.1,
 so what we may end up doing is create internal implementations of all
 three features (compatible with Python 2.4 and later), publish them as
 open source on Google Code, and fold them into core Python at the
 first opportunity, which would likely be 2.7 and 3.1.

 Comments?
 There are two things to consider:

 unicodedata is just an optimized database for accessing code
 point properties of a specific Unicode version (currently 4.1.0
 and 3.2.0). Adding support for a new version needs some work on
 the generation script, perhaps keeping the 4.1.0 version of it
 like we did for 3.2.0, but that's about it.

 However, there are other implications to consider when moving to
 Unicode 5.1.0.

 Just see the top of http://www.unicode.org/versions/Unicode5.1.0/
 for a summary of changes compared to 5.0, plus
 http://www.unicode.org/versions/Unicode5.0.0/ for changes between
 4.1.0 and 5.0.

 So while we could say: we provide access to the Unicode 5.1.0
 database, we cannot say: we support Unicode 5.1.0, simply because
 we have not reviewed all the necessary changes and implications.
 
 Mark's response to this was:
 
 
 I'd suspect that you'll be as conformant to U5.1.0 as you were to U4.1.0 ;-)
 
 More seriously, I don't think this is a roadblock -- I doubt that
 there are real differences between U5.1.0 and U4.1.0 in terms of
 conformance that would be touched by Python -- the conformance changes
 tend to be either completely backward compatible or very esoteric.
 What I can do is to review the Python support to see if and where
 there are any problems, but I wouldn't anticipate any.
 
 
 Which suggests that he believes that the differences in the database
 are very minor, and that upgrading just the database would not cause
 any problems for code that worked well with the 4.1.0 database.

Fine with me.

 I think it's better to look through all the changes and then come
 up with proper support for 2.7/3.1. If Google wants to contribute
 to this, even better. To avoid duplication of work or heading in
 different directions, it may be a good idea to create a
 unicode-sig to discuss things.
 
 Not me. :-)

I would really like to see more Unicode support in Python, e.g.
for collation, compression, indexing based on graphemes and
code points, better support for special casing situations (to
cover e.g. the dotted vs. non-dotted i in the Turkish scripts),
etc.
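
The Turkish case shows why the default case mappings are not enough. The
sketch below is written for Python 3; turkish_lower() and turkish_upper()
are hypothetical helpers that only handle the dotted and dotless i, not a
complete locale-aware implementation.

```python
DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE


def turkish_lower(text):
    """Lower-case text using the Turkish i rules (hypothetical helper)."""
    text = text.replace("I", DOTLESS_SMALL_I).replace(DOTTED_CAPITAL_I, "i")
    return text.lower()


def turkish_upper(text):
    """Upper-case text using the Turkish i rules (hypothetical helper)."""
    text = text.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_SMALL_I, "I")
    return text.upper()


# The default mapping is language-independent: "I" always lowers to "i".
default_lower = "ISTANBUL".lower()
# With the Turkish rules, "I" lowers to the dotless i instead.
tr_lower = turkish_lower("ISTANBUL")
tr_upper = turkish_upper("istanbul")
```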

There are also a few changes that we'd need to incorporate into
the UTF codecs, e.g. warn about more ill-formed byte sequences.

Would Google be willing to contribute such support or part
of it ?

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread M.-A. Lemburg
On 2008-08-25 19:34, Barry Warsaw wrote:
 On Aug 21, 2008, at 6:30 PM, Terry Reedy wrote:
 
 http://www.unicode.org/versions/Unicode5.1.0/
 Unicode 5.1.0 contains over 100,000 characters, and provides
 significant additions and improvements... to existing features,
 including new files and upgrades to existing files.  Sounds close to
 adding features ;-)
 
 I agree.  This seriously feels like new, potentially high risk code to
 be adding this late in the game.  The BDFL can always override, but
 unless someone is really convincing that this is low risk high benefit,
 I'd vote no for 2.6/3.0.

The above quote from the Unicode site is misleading in this context.

Guido's request was just for updating the Unicode database with
the data from 5.1 - without adding new support for properties or
changing the interfaces.

See this page for a list of changes to the Unicode database:

http://www.unicode.org/Public/UNIDATA/UCD.html

The main file used for the unicodedata module is called UnicodeData.txt.
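
To give an idea of what that file contains: each line of UnicodeData.txt
describes one code point as semicolon-separated fields. The sketch below
parses the entry for U+0041 and compares it with what the unicodedata
module reports. The parser is a hypothetical helper that reads only a few
of the fields; it is not the generation script used to build the module.

```python
import unicodedata

# The UnicodeData.txt entry for U+0041.
SAMPLE_LINE = "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"


def parse_unicodedata_line(line):
    """Extract a few properties from one UnicodeData.txt line."""
    fields = line.rstrip("\n").split(";")
    return {
        "char": chr(int(fields[0], 16)),
        "name": fields[1],
        "category": fields[2],       # general category
        "combining": int(fields[3]), # canonical combining class
        "bidirectional": fields[4],  # bidirectional class
        # Field 13 holds the simple lowercase mapping, if any.
        "lowercase": chr(int(fields[13], 16)) if fields[13] else None,
    }


record = parse_unicodedata_line(SAMPLE_LINE)
char = record["char"]
matches_module = (
    record["name"] == unicodedata.name(char)
    and record["category"] == unicodedata.category(char)
    and record["combining"] == unicodedata.combining(char)
    and record["bidirectional"] == unicodedata.bidirectional(char)
)
```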

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Unicode 5.1.0

2008-08-21 Thread M.-A. Lemburg

On 2008-08-21 22:35, Guido van Rossum wrote:

I was just paid a visit by my Google colleague Mark Davis, co-founder
of the Unicode project and the president of the Unicode Consortium. He
would like to see improved Unicode support for Python. (Well duh. :-)
On his list of top priorities are:

1. Upgrade the unicodedata module to the Unicode 5.1.0 standard
2. Extend the unicodedata module with some additional properties
3. Add support for Unicode properties to the regex syntax, including
Boolean combinations

I've tried to explain our release schedule and
no-new-features-in-point-releases policies to him, and he understands
that it's too late to add #2 or #3 to 2.6 and 3.0, and that these will
have to wait for 2.7 and 3.1, respectively. However, I've kept the
door slightly ajar for adding #1 -- it can't be too much work and it
can't have too much impact. Or can it? I don't actually know what the
impact would be, so I'd like some input from developers who are
closer to the origins of the unicodedata module.

The two, quite separate, questions, then, are (a) how much work would
it be to upgrade to version 5.1.0 of the database; and (b) would it be
acceptable to do this post-beta3 (but before rc1). If the answer to
(b) is positive, Google can help with (a).

In general, Google has needs in this area that can't wait for 2.7/3.1,
so what we may end up doing is create internal implementations of all
three features (compatible with Python 2.4 and later), publish them as
open source on Google Code, and fold them into core Python at the
first opportunity, which would likely be 2.7 and 3.1.

Comments?


There are two things to consider:

unicodedata is just an optimized database for accessing code
point properties of a specific Unicode version (currently 4.1.0
and 3.2.0). Adding support for a new version needs some work on
the generation script, perhaps keeping the 4.1.0 version of it
like we did for 3.2.0, but that's about it.
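
The parallel 3.2.0 database is visible from Python as
unicodedata.ucd_3_2_0. A short sketch, written for Python 3, of how the
two databases can be compared for a single character (the describe()
helper is hypothetical, and the "current" version depends on the Python
release in use):

```python
import unicodedata


def describe(char):
    """Return (database version, general category) for both databases."""
    old = unicodedata.ucd_3_2_0
    return {
        "current": (unicodedata.unidata_version, unicodedata.category(char)),
        "3.2.0": (old.unidata_version, old.category(char)),
    }


# U+20B9 INDIAN RUPEE SIGN was added long after Unicode 3.2.0, so the
# old database treats it as unassigned.
rupee = describe("\u20b9")
letter = describe("A")
```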

However, there are other implications to consider when moving to
Unicode 5.1.0.

Just see the top of http://www.unicode.org/versions/Unicode5.1.0/
for a summary of changes compared to 5.0, plus
http://www.unicode.org/versions/Unicode5.0.0/ for changes between
4.1.0 and 5.0.

So while we could say: we provide access to the Unicode 5.1.0
database, we cannot say: we support Unicode 5.1.0, simply because
we have not reviewed all the necessary changes and implications.

I think it's better to look through all the changes and then come
up with proper support for 2.7/3.1. If Google wants to contribute
to this, even better. To avoid duplication of work or heading in
different directions, it may be a good idea to create a
unicode-sig to discuss things.

Offline 'til next week-ly,
--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Maintaining old releases

2008-08-14 Thread M.-A. Lemburg

On 2008-08-14 08:43, Martin v. Löwis wrote:

For example, let's project dates for closing 2.6 and 3.0 now, and add
them to PEP 361.


My view is that they should be closed when 2.7 and 3.1 are released.


Since we don't have a fixed release cycle, making the 2.(n-1)
maintenance time frame depend on the 2.n release is not a reliable way
of defining the 2.(n-1) lifetime.

Instead, we should fix the dates based on the 2.(n-1) release date.


Following another informal policy, we were going for an 18 months
release cycle at some time (2.6 clearly took longer), which would
mean that those branches get closed on March 1, 2010. Security
releases will be available until October 1, 2013.


That would only allow 1.5 years for bug fixes - we were discussing
3 years for bug fixes and another 2 years for security fixes, ie.

2.6 bug fixes until  Oct 01 2011

2.6 security fixes until Oct 01 2013

Ditto for 3.0.
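
Fixing the dates relative to the release date is simple arithmetic. The
sketch below assumes a 2.6 release date of Oct 01 2008, which is what the
two dates above imply; the helper and constant names are hypothetical.

```python
from datetime import date

RELEASE_2_6 = date(2008, 10, 1)  # assumed release date, implied by the dates above
BUGFIX_YEARS = 3                 # bug fixes for 3 years
SECURITY_YEARS = 2               # security fixes for another 2 years


def add_years(day, years):
    """Return the same calendar day 'years' later (Feb 29 falls back to Feb 28)."""
    try:
        return day.replace(year=day.year + years)
    except ValueError:
        return day.replace(year=day.year + years, day=28)


bugfix_end = add_years(RELEASE_2_6, BUGFIX_YEARS)
security_end = add_years(bugfix_end, SECURITY_YEARS)
```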

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Maintaining old releases

2008-08-13 Thread M.-A. Lemburg

On 2008-08-13 07:53, Martin v. Löwis wrote:

Because there won't typically be sufficient testing and release
infrastructure to allow arbitrary bug fixes to be committed on the
branch. The buildbots are turned off, and nobody tests the release
candidate, no Windows binaries are provided - thus, chances are very
high that a bug fix release for some very old branch will be *worse*
than the previous release, rather than better.

Second, I don't think this is true. People using those patch
level releases will test and report bugs if they are introduced
by such backports.


They might be using releases, but they are *not* using the subversion
maintenance branches. Do you know anybody who regularly checks out the
2.4 maintenance branch and tests it?

So at best, people will only report bugs *after* the release was made,
meaning that there is a realistic chance that the release itself breaks
things.


I think that's an overly pessimistic view. There's always a chance
of breaking things when patching anything - whether that's a security
fix or a fix for a bug that's hard to work around in an application
using Python.

Note that those fixes will usually be backports from a more recent
release, so even if they don't receive enough direct testing on
the older branch before the release is cut, they will get their
share of testing in the context of the more recent branch.


As for using the releases themselves: there have been 80462 downloads
of 2.4.5 since it was released in March, as compared to 517325 downloads
of the 2.5.2 MSI in July alone. So I'm skeptical that many people do
actually use the 2.4.5 release.


It's difficult to use such download numbers as a hint for the number
of deployed installations. 2.4.5 was not released as binary, so
interested parties had to compile that version by themselves and
those installations don't show up in your statistics.

I'm sure that if we had released binaries as well, the number would
have looked different, esp. if you only look at the Windows binaries.


Besides, developers backporting such changes are diligent enough
to test their changes - they will usually have a reason for applying
the extra effort to backport.


My problem is that this backporting is not systematic. It's arbitrary
whether patches get backported or not. Part of the problem is that
it is/was also unclear whether there ever will be another release made
out of 2.4. 


That's a valid point, but does this really warrant backing out
changes that have already been applied ? Isn't it better to get
at least some bugs fixed rather than to not fix them at all ?


When 2.4.4 was released, Anthony announced, in

http://mail.python.org/pipermail/python-dev/2006-October/069326.html

This will be the last planned release in the Python 2.4 series

So anybody committing to the 2.4 branch after that should have expected
that the patches will never get released.


Perhaps we should just document the maintenance of Python releases
more clearly and also plan for a final bug fix release 3 years after
the initial branch release. That way developers and users can also
adjust their plans accordingly.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Maintaining old releases

2008-08-13 Thread M.-A. Lemburg

On 2008-08-13 04:57, Guido van Rossum wrote:

And there's a reason for this slow uptake of Python 2.5: as more
and more servers run 64-bit OSes, the Py_ssize_t changes cause
serious trouble with Python C extensions that were not updated
by their authors.


I'm not sure what that has to do with anything. The older releases
have *worse* support for 64-bit platforms!


This is one of the reasons why porting applications from 2.4 to 2.5
takes longer than e.g. moving from 2.3 to 2.4.

Python 2.4 works just fine on 64-bit platforms and so do the C
extensions that were written for it. Moving to 2.5 you often find
that those C extensions do not support the new Py_ssize_t types
and thus generate segfaults.

As a result, you either have to start using a different C extension,
patch the extension, stay with Python 2.4 or use a custom Python
interpreter that is patched to always map Py_ssize_t to int.
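
For readers who have not run into this: the problem is a size mismatch.
Code written for 2.4 uses int for lengths and indexes, while 2.5 uses
Py_ssize_t (PEP 353), which is wider than int on typical 64-bit
platforms. The following is only a rough illustration from the Python
side using ctypes; the function name is made up, and the exact sizes
depend on the platform.

```python
import ctypes

# Size of a C int versus the signed size type that Py_ssize_t maps to.
INT_SIZE = ctypes.sizeof(ctypes.c_int)
SSIZE_SIZE = ctypes.sizeof(ctypes.c_ssize_t)


def describe_size_mismatch():
    """Hypothetical helper: report whether int and Py_ssize_t differ in size."""
    if SSIZE_SIZE > INT_SIZE:
        return ("int is %d bytes, Py_ssize_t is %d bytes: an extension that "
                "hands the API an int* where a Py_ssize_t* is expected will "
                "have memory overwritten" % (INT_SIZE, SSIZE_SIZE))
    return "int and Py_ssize_t have the same size (%d bytes)" % INT_SIZE


print(describe_size_mismatch())
```

On a platform where both types have the same size, the old extension
code happens to keep working, which is why the problem only shows up
once servers move to 64-bit OSes.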

The move from 2.5 to 2.6 will be a lot easier and uptake a lot
faster.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Maintaining old releases

2008-08-13 Thread M.-A. Lemburg

On 2008-08-13 15:20, Steve Holden wrote:

M.-A. Lemburg wrote:

Perhaps we should just document the maintenance of Python releases
more clearly and also plan for a final bug fix release 3 years after
the initial branch release. That way developers and users can also
adjust their plans accordingly.

As always the problem is getting someone to do this not insignificant 
amount of work.


We'd just have to add a section to the release schedule PEP:

Release Maintenance

   Bug fix releases will be made available until September 19, 2009.
   The following bug fix releases have been posted:

   2.5.1: April 18, 2007
   2.5.2: February 22, 2008

   Security fix releases will continue to be made available as
   necessary until September 19, 2011. No further updates will
   be posted after that date. The following security fix releases
   have been posted:

   No security fix releases have been released.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Maintaining old releases

2008-08-13 Thread M.-A. Lemburg

On 2008-08-13 22:32, Martin v. Löwis wrote:

It's difficult to use such download numbers as a hint for the number
of deployed installations. 2.4.5 was not released as binary, so
interested parties had to compile that version by themselves and
those installations don't show up in your statistics.


You mean, they installed it *without* downloading it? How did they
do that?


What I was trying to say is that you only see a single source download,
which someone then takes, compiles and possibly redistributes or
integrates into a product. As a result, a single download can
easily map to quite a few installations - and that's what we should
base our assumptions on.


I'm sure that if we had released binaries as well, the number would
have looked different, esp. if you only look at the Windows binaries.


See, that's exactly the problem. We don't have the resources to provide
Windows binaries. So even if the release contained regular bug fixes,
I *still* would not have released Windows binaries.


I was just suggesting that the number of downloads would have
been higher had you released Windows binaries as well. We see that
all the time with the eGenix products.

Anyway, that's just statistics :-)


That's a valid point, but does this really warrant backing out
changes that have already been applied ? Isn't it better to get
at least some bugs fixed rather than to not fix them at all ?


Yes. Otherwise, neither developers nor users have a clear guideline
what to expect.


I disagree on that, but I'm fine with such a plan if it's documented
well in advance.


Perhaps we should just document the maintenance of Python releases
more clearly and also plan for a final bug fix release 3 years after
the initial branch release. That way developers and users can also
adjust their plans accordingly.


There was clear documentation. It said 2.4 is done, finished, closed,
over with, not maintained anymore. We had been doing that for many
releases in the past.


Right, but that documentation was only available after the release
manager decided to stop creating releases for that branch - ie.
around the time that final release was cut.

In order to plan for the end of lifetime of a software product,
you need this information well upfront - for both the developers
(so that they can get fixes in before the end-of-lifetime) and
users (who will then have to plan to upgrade their installations
and products relying on Python).


Now, you and me, we both want to change the policy. I want to change
to provide security releases for a period of five years, and I think
this is feasible with the resources that we have. You just suggested
to provide bug fix releases for three years, and I think that is
not feasible.


Actually, I was suggesting to have bug fix releases for 3 years
and security fixes for another 2 years (ie. 5 years lifetime
in total).


In addition, it still would mean that we should not have
done a bug fix release in 2008 (as 2.4.5 was released in March 2008);
instead, the last bug fix release should have been made in November
2007. Nobody (including yourself) stepped forward at that time and
offered to roll a release. 2.4 was released on November 30, 2004.


I don't want this written in stone, but there should be a pre-defined
roadmap for the whole lifetime of a Python release branch - from start
to end.

Please note that a policy is really just that: a guideline for
everyone to follow. It doesn't restrict us in maintaining a
release for more than the originally intended 3/5 years phases
or creating a bug fix release after the initial 3 years.

However, it should be seen as guideline for the minimum amount
of time a release is being maintained - for everyone to see
early (ie. in the Python release PEP) and use as basis for
making decisions on which release to take as basis for a software
project.

Regards,
--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [python-committers] next beta

2008-08-12 Thread M.-A. Lemburg

First, I'd like to know why this discussion is happening on
the committers list.

Python-dev is the right list for these things. I've adjusted the
CC accordingly.

On 2008-08-12 20:44, Martin v. Löwis wrote:

I am planning to offer a single file patch for 2.3 and 2.4. As far as
one more 2.5 release, I don't think there's going to be many changes
to the 2.5 branch between now and 2.6/3.0 final - although if there
is, we'll obviously have to do another release.

I would like to establish a tradition where, after some final bug fix
release (e.g. for 2.5), further mere bug fixes are banned from the
maintenance branch (and I did revert several bug fixes from the 2.4
branch).

I'm not sure I agree with this policy.  Can you elaborate on /why/ you
want this?


Because there won't typically be sufficient testing and release
infrastructure to allow arbitrary bug fixes to be committed on the
branch. The buildbots are turned off, and nobody tests the release
candidate, no Windows binaries are provided - thus, chances are very
high that a bug fix release for some very old branch will be *worse*
than the previous release, rather than better.


Second, I don't think this is true. People using those patch
level releases will test and report bugs if they are introduced
by such backports.

Besides, developers backporting such changes are diligent enough
to test their changes - they will usually have a reason for applying
the extra effort to backport.

I don't see any advantage in undoing already tested and committed
patches to an older branch.

Note that Python 2.4 is still widely used out there. As an
example, all the Zope and Plone installations run Python 2.4 and
will continue to do so for quite a while.

And there's a reason for this slow uptake of Python 2.5: as more
and more servers run 64-bit OSes, the Py_ssize_t changes cause
serious trouble with Python C extensions that were not updated
by their authors.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-08-06 Thread M.-A. Lemburg

On 2008-08-06 18:55, Antoine Pitrou wrote:

Martin v. Löwis martin at v.loewis.de writes:

URLs are just not made for non-ASCII characters.


Perhaps they are not, but every non-English wiki (just to take a simple, generic
example) potentially contains non-ASCII URLs.
e.g. http://fr.wikipedia.org/wiki/%C3%89l%C3%A9phant
http://wiki.python.org/moin/J%C3%BCrgenHermann
(notice the utf-8 encoding in both)


Implement IRIs if you want non-ASCII characters; the rules are much clearer
for these.

I think most people would expect something which works with the current World
Wide Web rather than a rigorous implementation of a specific RFC. Implementing
RFCs is fine but it does not magically eliminate all problems, especially when
the RFCs themselves are not in sync with real-world usage.


+1. Practicality beats purity...

The web is moving towards UTF-8 as standard Unicode encoding, so
it's probably wise to follow that approach for quote().

http://en.wikipedia.org/wiki/Percent-encoding

The other way around will also have to deal with old-style URLs
which typically still use the Latin-1 encoding which was the
basis for HTML:

http://www.w3schools.com/TAGS/ref_urlencode.asp

So unquote() should probably try to decode using UTF-8 first
and then fall back to Latin-1 if that doesn't work.
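
A minimal sketch of that fallback, written against the Python 3
urllib.parse module as it exists today. The function name
unquote_with_fallback() is made up for illustration; it is not a
proposal for the actual stdlib API.

```python
from urllib.parse import quote, unquote_to_bytes


def unquote_with_fallback(url_part):
    """Percent-decode, trying UTF-8 first and falling back to Latin-1."""
    raw = unquote_to_bytes(url_part)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this cannot fail.
        return raw.decode("latin-1")


# New-style URL (UTF-8) and old-style URL (Latin-1) for the same word.
print(unquote_with_fallback("%C3%89l%C3%A9phant"))
print(unquote_with_fallback("%C9l%E9phant"))
# quote() producing UTF-8 percent-encoding.
print(quote("Éléphant"))
```

Note that the fallback is a heuristic: a Latin-1 byte sequence that
happens to be valid UTF-8 will be decoded as UTF-8.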

Whether the result of quote()/unquote() should be bytes or
Unicode is a different story and probably also depends on
what the application does with the result. I don't think there's
a good general answer for that one, except maybe just going
for one possible combination and document that.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] 3.0 C API to decode bytes into unicode?

2008-08-01 Thread M.-A. Lemburg

On 2008-08-01 15:06, Barry Scott wrote:

I cannot see how to implement decode() for bytes objects using the C API
for the PyCXX library.

I'd assumed that I should find a PyBytes_Decode function, but I cannot
find it in beta 2.

What is the preferred way to do this?


PyUnicode_FromEncodedObject() should do the trick.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Fuzzing bugs: most bugs are closed

2008-07-21 Thread M.-A. Lemburg

On 2008-07-20 22:45, Victor Stinner wrote:

On Saturday 19 July 2008 21:52:09, A.M. Kuchling wrote:

Excellent work!  Another fruitful area for fuzzing might be the
miniature virtual machine used by the re module.  It's possible to
import _sre and call the compile() function directly (see the end of
Lib/sre_compile.py for how it's invoked); I wonder how the regex VM
copes with random strings of bytecode.


Hum... how can I say it? It's trivial to crash _sre :-) So I blacklisted 
_sre.compile() in my fuzzer.


For information, it's also very easy to crash CPython with a fuzzed .pyc file.

It's hard to check bytecode without executing it. It may be better to add
checks directly in the VM.


I don't see that as a big problem: if you execute untrusted byte code,
you are on your own anyway... whether that's byte code for the re
engine or ceval.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] No beta2 tonight

2008-07-18 Thread M.-A. Lemburg

On 2008-07-18 21:35, Charles Hixson wrote:

Invariably, when someone goes and removes a module, someone else is
going to complain, but I used feature X, not having feature X will
break my code.  We, as maintainers can then say, if you cared,
maintain it.  But I'm not sure that is the greatest thing to tell
people.  I suspect that we may have to include some sort of
work-alike for 2.7 and if not 3.0, 3.1 .  If I were to vote for a
work-alike, it would be based on sqlite.  For one of the most common
use-cases (bsddb.btree), simple sqlite code can be written to do the
right thing.  Recno is a little more tricky, but can also be done.
The bsddb hash may not be possible, because sqlite doesn't support
hashed indices :/.

Just an idea.

 - Josiah
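
The sqlite-based work-alike for the bsddb.btree use case mentioned above
might look roughly like this. It is a minimal sketch only: the class
name SqliteBTree, the table layout and the method first_at_or_after()
are made up for illustration and are not part of any existing module.

```python
import sqlite3


class SqliteBTree:
    """Hypothetical dict-like store that keeps its keys in sorted order."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        # The primary key index gives ordered access to the keys.
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS kv (k BLOB PRIMARY KEY, v BLOB)")

    def __setitem__(self, key, value):
        with self._db:  # commit on success, roll back on error
            self._db.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))

    def __getitem__(self, key):
        row = self._db.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]

    def keys(self):
        """Return all keys in sorted order."""
        return [k for (k,) in self._db.execute("SELECT k FROM kv ORDER BY k")]

    def first_at_or_after(self, key):
        """Return the (key, value) pair with the smallest key >= `key`."""
        row = self._db.execute(
            "SELECT k, v FROM kv WHERE k >= ? ORDER BY k LIMIT 1",
            (key,)).fetchone()
        if row is None:
            raise KeyError(key)
        return row


db = SqliteBTree()
db[b"banana"] = b"2"
db[b"apple"] = b"1"
print(db.keys())
print(db.first_at_or_after(b"b"))
```

Keys and values are stored as bytes here; whether that matches what
existing bsddb users need is an open question.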


Were I to vote for something it would be a B+Tree in collections.  One that 
didn't impose a requirement that the key be a string (and not, e.g., an 
integer or a float).


OTOH, I don't care enough to build it.  (I've proven this to myself 
repeatedly, as I've started to create such a thing, and then kludged a 
different solution.)


If pybsddb is sufficient as work-around for the stdlib bsddb module
(or perhaps even better), then I don't see much of a problem removing
the module from the stdlib using a PEP 4 process.

Can't really say, since I've never used any of these myself...
for on-disk dictionaries, we use mxBeeBase:

http://www.egenix.com/products/python/mxBase/mxBeeBase/

For anything more complex, a SQL database.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP: Consolidating names in the `unittest` module

2008-07-16 Thread M.-A. Lemburg

On 2008-07-16 02:20, Collin Winter wrote:

On Tue, Jul 15, 2008 at 6:58 AM, Ben Finney [EMAIL PROTECTED] wrote:

Significant updates include removing all reference to the
(already-resolved) new-style class issue, adding footnotes and
references, and a Rationale summary of discussion on both sides of the
divide for 'assert*' versus 'fail*' names.


:PEP:            XXX
:Title:          Consolidating names in the `unittest` module
:Version:        0.2
:Last-Modified:  2008-07-15
:Author:         Ben Finney [EMAIL PROTECTED]
:Status:         Draft
:Type:           Standards Track
:Content-Type:   text/x-rst
:Created:        2008-07-14
:Python-Version: 2.7, 3.1


+1 for doing this in 3.1.

-1 for Python 2.7.

The main reason is that there's just too much 2.x code out there
using the API names you are suggesting to change and/or remove
from the module.

Since this is a major change in the unit test API, I'd also like
to suggest that you use a new module name.

This is both a precaution to prevent tests failing due to not having
been upgraded and a way for old code to continue working by adding
the old unittest module on sys.path.

Please note that the required renaming of the methods in existing
tests is not going to be as straightforward as you may think,
since you may well rename method calls into the tested application
rather than just the unit test class method calls if you're not
careful.
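
To illustrate the pitfall with a made-up example: suppose the
application under test has its own failIf() method. The sample source,
the helper names and the choice of failIf/assertFalse as the rename pair
are all assumptions for the sake of the sketch.

```python
import re

# Hypothetical test code: self.parser.failIf() is an *application* method.
SOURCE = '''\
class ParserTest(unittest.TestCase):
    def test_flag(self):
        self.failIf(self.parser.failIf("x"))
'''


def naive_rename(text):
    """Plain search and replace: also rewrites the application call."""
    return text.replace("failIf", "assertFalse")


def receiver_aware_rename(text):
    """Only rewrite calls made directly on `self` (still just a heuristic)."""
    return re.sub(r"\bself\.failIf\(", "self.assertFalse(", text)


print(naive_rename(SOURCE))
print(receiver_aware_rename(SOURCE))
```

Even the second variant can go wrong, e.g. for a non-test class that
defines its own self.failIf(), so the converted tests still need to be
tested themselves.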


Abstract


This PEP proposes to consolidate the names that constitute the API of
the standard library `unittest` module, with the goal of removing
redundant names, and conforming with PEP 8.


--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP: Consolidating names in the `unittest` module

2008-07-16 Thread M.-A. Lemburg

On 2008-07-16 10:14, Ben Finney wrote:

M.-A. Lemburg [EMAIL PROTECTED] writes:


Since this is a major change in the unit test API, I'd also like
to suggest that you use a new module name.

This is both a precaution to prevent tests failing due to not having
been upgraded and a way for old code to continue working by adding
the old unittest module on sys.path.


Do you have a specific argument against the provisions already stated
in the PEP for backward compatibility? They seem to address your
concerns already.


The PEP doesn't mention changing the module name and deprecating
the old one. Instead it wants to deprecate all the old names (and cites
PEP 4 for this), but keeping the module name.

Note that PEP 4 targets deprecating use of whole modules, not single
APIs, or - like in your case - more or less the complete existing API
of a module.

Given the scope of the changes, you are really creating a completely
new API and as a result should also get a new module name. You can then
deprecate use of the old unittest module name and point users to the
new one.

Developers who don't feel like changing 1+ tests can then continue
to use the old module and start using the new module for new projects.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP: Consolidating names in the `unittest` module

2008-07-16 Thread M.-A. Lemburg

On 2008-07-16 14:02, Michael Foord wrote:

M.-A. Lemburg wrote:

On 2008-07-16 10:14, Ben Finney wrote:

M.-A. Lemburg [EMAIL PROTECTED] writes:


Since this is a major change in the unit test API, I'd also like
to suggest that you use a new module name.

This is both a precaution to prevent tests failing due to not having
been upgraded and a way for old code to continue working by adding
the old unittest module on sys.path.


Do you have a specific argument against the provisions already stated
in the PEP for backward compatibility? They seem to address your
concerns already.


The PEP doesn't mention changing the module name and deprecating
the old one. Instead it wants to deprecate all the old names (and cites
PEP 4 for this), but keeping the module name.

Note that PEP 4 targets deprecating use of whole modules, not single
APIs, or - like in your case - more or less the complete existing API
of a module.


Which PEP is usually referenced for the deprecation of individual APIs?


PEP 5 could be used for that.

Adding several 10s of deprecation warnings to the unittest module
is not going to make life easier for anyone. Adding just a single
one on import and following PEP 4 is.
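
For comparison, this is roughly what the per-method variant amounts to.
It is a sketch with made-up names (deprecated_alias, Case); it is not
the actual unittest code. Every old name becomes a wrapper that warns
and forwards, so a large test suite triggers a warning at each call
site.

```python
import warnings


def deprecated_alias(new_name):
    """Hypothetical helper: build a method that warns, then forwards."""
    def alias(self, *args, **kwargs):
        warnings.warn("use %s() instead" % new_name,
                      DeprecationWarning, stacklevel=2)
        return getattr(self, new_name)(*args, **kwargs)
    return alias


class Case:
    """Stand-in for a test case class with one new and two old names."""

    def assertTrue(self, expr):
        if not expr:
            raise AssertionError("%r is not true" % (expr,))

    failUnless = deprecated_alias("assertTrue")
    assert_ = deprecated_alias("assertTrue")


def count_deprecations(func):
    """Run func() and count the DeprecationWarnings it triggers."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func()
    return sum(1 for w in caught
               if issubclass(w.category, DeprecationWarning))


print(count_deprecations(lambda: Case().failUnless(True)))
```

The PEP 4 style alternative is a single warnings.warn(...,
DeprecationWarning) call at the top of the old module, issued once on
import.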

If you do want to apply major changes to a module without changing
the name, then this could be done as part of the 2.x to 3.x transition.
The 2.x branch is not the right place for such breakage.


Given the scope of the changes, you are really creating a completely
new API and as a result should also get a new module name. You can then
deprecate use of the old unittest module name and point users to the
new one.


You propose that we duplicate the entire module with a new name, 
maintaining both in parallel but with different method names?


No, I'm proposing to apply all the name changes to a new module
and deprecate the old one. unittest will then go unmaintained
until it is removed.


That doesn't sound wise to me.



Developers who don't feel like changing 1+ tests can then continue
to use the old module and start using the new module for new projects.



So we shouldn't bring the API in line with PEP 8 because it is widely used?


I didn't say that. However, if it's not required, then breaking
a complete module API isn't necessary - practicality beats purity.

Instead add a new module with all the changes and have developers
gradually migrate to the new code.

Even if it causes some pain (and the methods won't be removed for 
several years if we follow the normal deprecation schedule), the fact 
that the API is widely used would seem to be an argument in favor of 
making it follow the Python style guidelines.


So you're saying that because many people use the code, we should be
more inclined to make their life harder. That's an interesting
argument :-)

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP: Consolidating names in the `unittest` module

2008-07-16 Thread M.-A. Lemburg

On 2008-07-16 15:12, Ben Finney wrote:

M.-A. Lemburg [EMAIL PROTECTED] writes:


On 2008-07-16 14:02, Michael Foord wrote:

M.-A. Lemburg wrote:

Note that PEP 4 targets deprecating use of whole modules, not
single APIs, or - like in your case - more or less the complete
existing API of a module.

Which PEP is usually referenced for the deprecation of individual
APIs?

PEP 5 could be used for that.


That seems an even worse fit; it speaks of changing language features,
not library modules. At least PEP 4 talks about when to raise
DeprecationWarning.


Right and the methods described there are usually also applied
to language changes and API changes.

I just wanted to make clear that your "...see PEP 4 for how to handle
backwards compatibility..." statement doesn't apply to the changes
described in your PEP. However, it does point at a possible compromise
which would make the transition easier on everyone.


Adding several 10s of deprecation warnings to the unittest module is
not going to make life easier for anyone. Adding just a single one
on import and following PEP 4 is.


I don't see how the first is not going to make life easier if the
second somehow is. Is a programmer going to be helpless in the face of
some DeprecationWarnings but not others?


Using the first method (changing the API names), you force
developers to change existing code, which results in testing
the test code. Lots of work with no real benefit.

With the second method, they can use the new names with new test code
(which then imports the new module). They don't have to test
their existing tests for obscure search-and-replace errors.


If you do want to apply major changes to a module without changing
the name, then this could be done as part of the 2.x to 3.x
transition.


This has already been rejected
URL:http://mail.python.org/pipermail/python-dev/2008-April/078485.html


I wasn't suggesting to apply the change to 3.0, but instead
suggesting that if you want to implement such a major API change,
this should be done only on the 3.x branch and be dealt with in the
2to3 tool.


I'm inclined to agree that it's not right for 2.x. I'll revise the PEP
accordingly.


Thanks,
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jul 16 2008)
 Python/Zope Consulting and Support ...http://www.egenix.com/
 mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


 Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! 


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] UCS2/UCS4 default

2008-07-04 Thread M.-A. Lemburg

On 2008-07-03 21:59, Steve Holden wrote:

M.-A. Lemburg wrote:

On 2008-07-03 19:44, Terry Reedy wrote:
The premise of this thread seems to be that the majority should 
suffer for the benefit of a few.  That is not Python's philosophy.


In reality, most Unixes ship with UCS4 builds of Python. Windows
and Mac OS X ship with UCS2 builds. Still, anyone is free to build
their own favorite version - that's freedom of choice, which is good.

Programmers just need to be made aware of the differences in UCS2
and UCS4 builds and deal with it.

Here's a talk I've given many, many times over the years which explains
some of the details that a Python programmer needs to know when dealing
with Unicode:

http://www.egenix.com/files/python/PyConUK2007-Developing-Unicode-aware-applications-in-Python.pdf 



Perhaps I should add a section on UCS2 vs. UCS4 the next time around ;-)


The indications are that would be helpful to many people (including 
myself).


Ok, I'll add one for one of the next conferences.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] UCS2/UCS4 default

2008-07-03 Thread M.-A. Lemburg

I think the discussion is going in the wrong direction:

The choice between UCS2 and UCS4 builds is really only meant
to enhance the possibility to interface to native OS or
application APIs, e.g. Windows LIBC and Java use UTF-16, glibc
on Unix uses UCS4.

The problem of slicing Unicode objects is far more complicated
than just breaking a surrogate pair. Unicode is full of combining
code points - if you break such a sequence, the output will be
just as wrong; regardless of UCS2 vs. UCS4.
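
As a rough illustration of that point, written for Python 3 rather than the 2.x builds discussed in this thread: the sample string below is made up, and the slice position is chosen to fall between a base letter and its combining accent. No surrogate is involved, so the result is the same whatever the internal storage is.

```python
import unicodedata

# 'e' followed by U+0301 COMBINING ACUTE ACCENT renders as a single glyph.
text = "e\u0301t\u00e9"

# A slice boundary between the base letter and its accent drops the accent.
head = text[:1]
tail = text[1:]

head_is_bare_e = head == "e"
# A nonzero canonical combining class marks a combining code point.
tail_starts_with_mark = unicodedata.combining(tail[0]) != 0
```

The tail now begins with a combining mark that has lost its anchor, which is just as wrong for display purposes as a split surrogate pair.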

A long time ago we had a discussion about these problems. I had
suggested a new module (unicodeindex IIRC) which takes care of indexing
Unicode strings based on code points (with support for surrogates),
glyphs (taking combining code points into account) and words (with
support for various breaking/non-breaking separation code points).
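
A very small sketch of what the glyph-level indexing could look like. This is not the unicodeindex module mentioned above (it is only named in this message), and the function name `combining_clusters` is made up. It attaches combining code points to the preceding base code point and ignores everything else a full segmentation algorithm would handle.

```python
import unicodedata


def combining_clusters(text):
    """Group each base code point with the combining code points that follow it."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            # Combining mark: attach it to the cluster started by the base character.
            clusters[-1] += ch
        else:
            # Base character (or a mark with nothing to attach to): start a new cluster.
            clusters.append(ch)
    return clusters


sample = "e\u0301x"
clusters = combining_clusters(sample)
```

Indexing into `clusters` instead of `sample` never separates an accent from its base letter, at the cost of doing the grouping above the storage level.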

Trying to solve such issues at the storage level is the wrong
approach, since the problem is application specific and thus requires
a higher-level set of possible solutions.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] UCS2/UCS4 default

2008-07-03 Thread M.-A. Lemburg

On 2008-07-03 15:21, Jeroen Ruigrok van der Werven wrote:

-On [20080703 15:00], M.-A. Lemburg ([EMAIL PROTECTED]) wrote:

Unicode is full of combining code points - if you break such a sequence,
the output will be just as wrong; regardless of UCS2 vs. UCS4.


In my opinion you are confusing two related, but very separate things here.
Combining characters have nothing to do with breaking up the encoding of a
single codepoint. Sure enough, if you arbitrarily slice up codepoints that
consist of combining characters then your result is indeed odd looking.

I never said that nor is that the point I am making.


Please remember that lone surrogate pair code points are perfectly
valid Unicode code points, nevertheless. Just as a lone combining
code point is valid on its own.


Guido points out that Python supports surrogate pairs and says that if
Python is dealing wrongly with this in the core then it needs to be fixed.
I am pointing out that given the fact we allow surrogate pairs we deal
rather simplistically with them in the core. In fact, we do not consider them at
all. In essence: though we may accept full 21-bit codepoints in the form of
\U escape sequences and store them internally as UTF-16 (which I
still need to verify) we subsequently deal with them programmatically as
UCS-2, which is plain silly.


Python applies conversion from non-BMP code points to surrogates
for UCS2 builds in a few places and I agree that we should probably
do that at a few more places.

However, these are mainly conversion issues of encoded Unicode
representations vs. the internal Unicode storage where you want
to avoid exceptions in favor of finding a solution that preserves
data.

To make it clear: UCS2 builds of Python do not support non-BMP
code points out of the box.

A programmer will always have to use a codec to map the internal storage
on these builds to the full Unicode code point range. The following
codecs support surrogates on UCS2 builds:

 * UTF-8
 * UTF-16
 * UTF-32
 * unicode-escape
 * raw-unicode-escape
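
A short sketch of going through these codecs, written for Python 3 (where the narrow/wide build distinction no longer exists since 3.3, so it only shows the codec side). The sample code point U+10000 is an arbitrary choice.

```python
# U+10000 is the first code point outside the BMP.
ch = "\U00010000"

# UTF-16 represents it as a surrogate pair: two 16-bit code units.
utf16 = ch.encode("utf-16-be")
high = int.from_bytes(utf16[:2], "big")
low = int.from_bytes(utf16[2:], "big")

# Each codec listed above maps the code point to bytes and back without loss.
codec_names = ["utf-8", "utf-16", "utf-32", "unicode-escape", "raw-unicode-escape"]
round_trips = {name: ch.encode(name).decode(name) == ch for name in codec_names}
```

The `high` and `low` values are the two code units a UCS2 build would store for this character.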


You either commit yourself fully to UTF-16 and surrogate pairs or not. Not
some form in-between, because that will ultimately lead to more confusion
due to the difference in results when dealing with Unicode.


Programmers will have to be aware of the fact that on UCS2
builds of Python non-BMP code points will have to be treated
differently than on UCS4 builds.

I don't see that as a problem. It is in a way similar to
32-bit vs. 64-bit builds of Python or the fact that floating point
numbers work differently depending on the Python platform or
compiler being used.

BTW: Have you ever run into any problems with UCS2 vs. UCS4
in practice that were not easy to solve ?

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] UCS2/UCS4 default

2008-07-03 Thread M.-A. Lemburg

On 2008-07-03 19:21, Adam Olsen wrote:

On Thu, Jul 3, 2008 at 7:57 AM, M.-A. Lemburg [EMAIL PROTECTED] wrote:

On 2008-07-03 15:21, Jeroen Ruigrok van der Werven wrote:

-On [20080703 15:00], M.-A. Lemburg ([EMAIL PROTECTED]) wrote:

Unicode is full of combining code points - if you break such a sequence,
the output will be just as wrong; regardless of UCS2 vs. UCS4.

In my opinion you are confusing two related, but very separate things here.
Combining characters have nothing to do with breaking up the encoding of a
single codepoint. Sure enough, if you arbitrarily slice up codepoints that
consist of combining characters then your result is indeed odd looking.

I never said that nor is that the point I am making.

Please remember that lone surrogate pair code points are perfectly
valid Unicode code points, nevertheless. Just as a lone combining
code point is valid on its own.


That is a big part of these problems.  For all practical purposes, a
surrogate is like a UTF-8 code unit, and must be handled the same way,
so why the heck do they confuse everybody by saying "oh, it's a code
point too!"?


You have to take that up with the Unicode consortium :-)

It would have been better not to add surrogates to the standard
at all. To be fair, I don't think that anybody seriously assumed
at the time that more than 16 bits would be needed.

In practice, you do need to be able to build Unicode strings
that contain half a surrogate (ie. a single code point) or
a combining code point without its anchor code point, so trying
to be smart about detecting surrogates is going to create more
confusion than do good, e.g.

>>> x1 = u'\udbc0'
>>> x2 = u'\udc00'
>>> x1
u'\udbc0'
>>> x2
u'\udc00'
>>> len(x1)
1
>>> len(x2)
1

Having len(x1+x2) == 1 wouldn't be right and break all sorts
of assumptions you normally make about string concatenation.
Which is why len(x1+x2) gives 2 in both UCS2 and UCS4 builds.

The fact that u'\U00100000' can map to a length 1 Unicode string
in UCS4 builds and a length 2 string in UCS2 builds is merely
due to the fact that the unicode-escape codec (which converts
the escaped string literal to a Unicode object) does know about
surrogates and uses them to avoid exceptions.
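
The relationship between the two halves and the full code point can be sketched as follows, in Python 3 syntax. The helper name `combine_surrogates` is made up; the arithmetic is the standard UTF-16 surrogate formula.

```python
def combine_surrogates(high, low):
    """Return the code point encoded by a UTF-16 high/low surrogate pair."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


x1 = "\udbc0"
x2 = "\udc00"

# The pair from the session above corresponds to a single non-BMP code point.
code_point = combine_surrogates(ord(x1), ord(x2))

# Concatenating the two halves still yields two items; nothing is merged.
joined_length = len(x1 + x2)
```

Only a codec performs the merge in the other direction, which is why the escaped literal and the concatenation can disagree about length on a UCS4 build.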

Programmers need to be aware of this fact, that's all...
just like they need to be aware of differences between
integer and float division, different behavior of classic
and new-style classes, etc. etc.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] UCS2/UCS4 default

2008-07-03 Thread M.-A. Lemburg

On 2008-07-03 19:35, Jeroen Ruigrok van der Werven wrote:

-On [20080703 19:21], Adam Olsen ([EMAIL PROTECTED]) wrote:

On Thu, Jul 3, 2008 at 7:57 AM, M.-A. Lemburg [EMAIL PROTECTED] wrote:

Please remember that lone surrogate pair code points are perfectly
valid Unicode code points, nevertheless. Just as a lone combining
code point is valid on its own.

That is a big part of these problems.  For all practical purposes, a
surrogate is like a UTF-8 code unit, and must be handled the same way,
so why the heck do they confuse everybody by saying "oh, it's a code
point too!"?


Because surrogate code points are not Unicode scalar values, isolated UTF-16
code units in the range 0xd800-0xdfff are ill-formed. (D91 from Unicode
5.0/5.1, section 3.9)


True. They are not valid UTF-16 code units, but a code unit is
just a storage byte representation of a Unicode transformation...


Code Unit. The minimal bit combination that can represent a unit of encoded text for processing or interchange. The 
Unicode Standard uses 8-bit code units in the UTF-8 encoding form, 16-bit code units in the UTF-16 encoding form, and 
32-bit code units in the UTF-32 encoding form. (See definition D77 in  Section 3.9, Unicode Encoding Forms.)



That's not the same thing as a code point which is an assignment
of a slot in the Unicode character set...


Code Point. Any value in the Unicode codespace; that is, the range of integers from 0 to 10FFFF (base 16). (See definition D10 
in Section 3.4, Characters and Encoding.)
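
To make the distinction concrete, here is a small sketch in Python 3 that counts code units per encoding form for a made-up two-code-point sample. The helper `code_unit_count` is hypothetical and relies on the BOM-free "-le" codec variants.

```python
def code_unit_count(text, form):
    """Count the code units needed to represent text in the given encoding form."""
    unit_size = {"utf-8": 1, "utf-16": 2, "utf-32": 4}[form]
    # The "-le" variants encode without a byte order mark.
    codec = form if form == "utf-8" else form + "-le"
    return len(text.encode(codec)) // unit_size


# One BMP code point plus one non-BMP code point.
sample = "a\U00010000"
counts = {form: code_unit_count(sample, form) for form in ("utf-8", "utf-16", "utf-32")}
```

The number of code points stays at two throughout; only the number of code units depends on the encoding form.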



Reference: http://www.unicode.org/glossary/

Also see Chapter 3.4 
(http://www.unicode.org/versions/Unicode5.0.0/ch03.pdf#G2212):


Surrogate code points and noncharacters are considered assigned code points,
but not assigned characters.


--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] UCS2/UCS4 default

2008-07-03 Thread M.-A. Lemburg

On 2008-07-03 19:44, Terry Reedy wrote:
The premise of this thread seems to be that the majority should suffer 
for the benefit of a few.  That is not Python's philosophy.


In reality, most Unixes ship with UCS4 builds of Python. Windows
and Mac OS X ship with UCS2 builds. Still, anyone is free to build
their own favorite version - that's freedom of choice, which is good.

Programmers just need to be made aware of the differences in UCS2
and UCS4 builds and deal with it.

Here's a talk I've given many, many times over the years which explains
some of the details that a Python programmer needs to know when dealing
with Unicode:

http://www.egenix.com/files/python/PyConUK2007-Developing-Unicode-aware-applications-in-Python.pdf

Perhaps I should add a section on UCS2 vs. UCS4 the next time around ;-)

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] xmlrpclib.{True, False} (was Re: Assignment to None)

2008-06-16 Thread M.-A. Lemburg

On 2008-06-15 16:47, Georg Brandl wrote:

Thomas Lee schrieb:

Georg Brandl wrote:

Remember that it must still be possible to write (in 2.6)

True = 0
assert not True


Ah, of course. Looks like I should just avoid optimizations of 
Name(True) and Name(False) altogether. That's a shame!


We can of course decide to make assignment to True and False
illegal in 2.7 :)


Raising a run-time exception would be fine, but not a SyntaxError at
compile time - this would effectively make it impossible to keep
code compatible to Python 2.1.
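
The compile-time concern can be observed on Python 3, where True is a keyword. In the sketch below the compatibility shim held in `source` is a made-up example, not code from this thread: although the assignment sits behind a version check that would never run, the module fails before any of it executes.

```python
# A shim of the kind old cross-version code used to carry.
source = """
import sys
if sys.version_info < (2, 3):
    True = 1
    False = 0
"""

try:
    compile(source, "<compat-shim>", "exec")
    outcome = "compiled"
except SyntaxError:
    # The guard never gets a chance to run: the whole module is rejected.
    outcome = "SyntaxError"
```

A run-time exception would only trigger when the assignment is actually executed, which is why it leaves such shims usable.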

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] Betas today - I hope

2008-06-13 Thread M.-A. Lemburg

On 2008-06-13 11:32, Walter Dörwald wrote:

M.-A. Lemburg wrote:

On 2008-06-12 16:59, Walter Dörwald wrote:

M.-A. Lemburg wrote:

.transform() and .untransform() use the codecs to apply same-type
conversions. They do apply type checks to make sure that the
codec does indeed return the same type.

E.g. text.transform('xml-escape') or data.transform('base64').


So what would a base64 codec do with the errors argument?


It could use it to e.g. try to recover as much data as possible
from broken input data.

Currently (in Py2.x), it raises an exception if you pass in anything
but 'strict'.


I think for transformations we don't need the full codec machinery:

  ...

No need to invent another wheel :-) The codecs already exist for
Py2.x and can be used by the .encode()/.decode() methods in Py2.x
(where no type checks occur).


By using a new API we could get rid of old warts. For example: Why 
does the stateless encoder/decoder return how many input 
characters/bytes it has consumed? It must consume *all* bytes anyway!


No, it doesn't and that's the point in having those return values :-)

Even though the encoder/decoders are stateless, that doesn't mean
they have to consume all input data. The caller is responsible to
make sure that all input data was in fact consumed.

You could for example have a decoder that stops decoding after
having seen a block end indicator, e.g. a base64 line end or
XML closing element.


So how should the UTF-8 decoder know that it has to stop at a closing 
XML element?


The UTF-8 decoder doesn't support this, but you could write a codec
that applies this kind of detection, e.g. to not try to decode
partial UTF-8 byte sequences at the end of input, which would then
result in error.
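
For comparison, current CPython exposes a low-level `codecs.utf_8_decode()` function that takes a `final` flag. The sketch below assumes that function and uses a made-up input: with `final=False` the decoder leaves an incomplete trailing sequence unconsumed, and the returned count tells the caller where to resume.

```python
import codecs

# "ab" plus the first two of the three bytes that encode U+20AC.
data = b"ab\xe2\x82"

# final=False: stop before the incomplete trailing sequence instead of failing.
text, consumed = codecs.utf_8_decode(data, "strict", False)

# The caller keeps the unconsumed tail and retries once more input arrives.
pending = data[consumed:]
rest, rest_consumed = codecs.utf_8_decode(pending + b"\xac", "strict", True)
```

Without the consumed count in the return value the caller could not tell how much of `data` is still pending.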


Just because all codecs that ship with Python always try to decode
the complete input doesn't mean that the feature isn't being used.


I know of no other code that does. Do you have an example for this use?


I already gave you a few examples.


The interface was designed to allow for the above situations.


Then could we at least have a new codec method that does:

def statelessencode(self, input):
    (output, consumed) = self.encode(input)
    assert len(input) == consumed
    return output


You mean as method to the Codec class ?

Sure, we could do that, but please use a different name,
e.g. .encodeall() and .decodeall() - .encode() and .decode()
are already stateless (and so would the new methods be), so
"stateless" isn't all that meaningful in this context.

We could also add such a check to the PyCodec_Encode() and _Decode()
functions. They currently do not apply the above check.

In Python, those two functions are exposed as codecs.encode()
and codecs.decode().
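
A possible shape for such helpers, sketched as plain functions on top of the codec registry rather than as Codec methods. The names follow the suggestion above, but nothing like this exists in the standard library as far as this sketch assumes; raising ValueError on partial consumption is also just one possible choice.

```python
import codecs


def encodeall(encoding, data, errors="strict"):
    """Encode data and insist that the codec consumed all of it."""
    output, consumed = codecs.getencoder(encoding)(data, errors)
    if consumed != len(data):
        raise ValueError("encoder consumed %d of %d items" % (consumed, len(data)))
    return output


def decodeall(encoding, data, errors="strict"):
    """Decode data and insist that the codec consumed all of it."""
    output, consumed = codecs.getdecoder(encoding)(data, errors)
    if consumed != len(data):
        raise ValueError("decoder consumed %d of %d items" % (consumed, len(data)))
    return output


encoded = encodeall("utf-8", "h\u00e9llo")
decoded = decodeall("utf-8", encoded)
```

Using an explicit exception instead of `assert` keeps the check active when Python runs with optimizations enabled.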

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] Betas today - I hope

2008-06-12 Thread M.-A. Lemburg

On 2008-06-12 16:59, Walter Dörwald wrote:

M.-A. Lemburg wrote:

.transform() and .untransform() use the codecs to apply same-type
conversions. They do apply type checks to make sure that the
codec does indeed return the same type.

E.g. text.transform('xml-escape') or data.transform('base64').


So what would a base64 codec do with the errors argument?


It could use it to e.g. try to recover as much data as possible
from broken input data.

Currently (in Py2.x), it raises an exception if you pass in anything
but 'strict'.


I think for transformations we don't need the full codec machinery:

  ...

No need to invent another wheel :-) The codecs already exist for
Py2.x and can be used by the .encode()/.decode() methods in Py2.x
(where no type checks occur).


By using a new API we could get rid of old warts. For example: Why does 
the stateless encoder/decoder return how many input characters/bytes it 
has consumed? It must consume *all* bytes anyway!


No, it doesn't and that's the point in having those return values :-)

Even though the encoder/decoders are stateless, that doesn't mean
they have to consume all input data. The caller is responsible to
make sure that all input data was in fact consumed.

You could for example have a decoder that stops decoding after
having seen a block end indicator, e.g. a base64 line end or
XML closing element.

Just because all codecs that ship with Python always try to decode
the complete input doesn't mean that the feature isn't being used.
The interface was designed to allow for the above situations.
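
A sketch of the kind of decoder described here, under the assumption that a "block" is one newline-terminated base64 line. The function `decode_base64_line` is hypothetical and is not a registered codec; it only mimics the stateless decoder signature by returning an (output, consumed) pair.

```python
import base64


def decode_base64_line(data, errors="strict"):
    """Decode one base64 line and report how many input bytes were used.

    The errors argument is accepted for signature compatibility but unused here.
    """
    end = data.find(b"\n")
    consumed = len(data) if end < 0 else end + 1
    # b64decode ignores the trailing newline in its default (non-validating) mode.
    return base64.b64decode(data[:consumed]), consumed


stream = b"aGVsbG8=\nd29ybGQ=\n"
blocks = []
while stream:
    block, used = decode_base64_line(stream)
    blocks.append(block)
    # The caller is responsible for feeding the unconsumed remainder back in.
    stream = stream[used:]
```

The loop only works because the decoder reports how much input it consumed on each call.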

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] Stabilizing the C API of 2.6 and 3.0

2008-06-11 Thread M.-A. Lemburg

On 2008-06-11 05:42, Gregory P. Smith wrote:

On Mon, Jun 9, 2008 at 1:44 AM, M.-A. Lemburg [EMAIL PROTECTED] wrote:


On 2008-06-09 07:20, Gregory P. Smith wrote:


On Fri, Jun 6, 2008 at 2:19 AM, M.-A. Lemburg [EMAIL PROTECTED] wrote:

 On 2008-06-03 01:29, Gregory P. Smith wrote:

 On Mon, Jun 2, 2008 at 4:09 PM, Guido van Rossum [EMAIL PROTECTED]

wrote:

 I will freely admit that I haven't followed this thread in any detail,


but if it were up to me, I'd have the 2.6 internal code use PyString

 ...

Should we read this as a BDFL pronouncement and make it so?

All that would mean change wise is that trunk r63675 as well as possibly
r63672 and r63677 would need to be rolled back and this whole discussion
over if such a big change should have happened would turn into a moot
point.

 I would certainly welcome reverting the change.

All that's needed to support PyBytes API in 2.x is a set of #defines
that map the new APIs to the PyString names. That's a clean and
easily understandable solution.



Okay, I've reverted r63675 in trunk revision r64048.  That leaves all of the
python modules and internals using PyString_ api names instead of PyBytes_
api names as they were before.  PyBytes_ #define's exist for the appropriate
PyString methods in case anyone wants to use those.


Thanks.

 Programmers interested in the code

for a PyString API can then still look up the code in stringobject.c,
e.g. to find out how a certain special case is handled or to check
the ref counting - just like they did for years.



The files still exist with the new names.  bytesobject.c instead of
stringobject.c.  Those renames were done in the other CLs I mentioned which
have not yet been reverted.  The current state seems a bit odd because they
depend on the #defines to cause method definitions to be the PyString_ names
instead of the PyBytes_ names.


Please restore the original state, ie. PyString APIs live in
stringobject.h and stringobject.c. bytesobject.h should then have
the #defines for PyBytes APIs, pointing them to the PyString
names (basically what's currently in stringobject.h).



all done as of 64105


Thank you !

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] Betas today - I hope

2008-06-11 Thread M.-A. Lemburg

On 2008-06-11 13:35, Barry Warsaw wrote:
So I had planned to do a bunch of work last night looking at the release 
blocker issues, but nature intervened.  A bunch of severe thunderstorms 
knocked out my 'net access until this morning.


I'll try to find some time during the day to look at the RB issues.  
Hopefully we can get Guido to look at them too and Pronounce on some of 
them.  Guido please start with:


http://bugs.python.org/issue643841

My plan is to begin building the betas tonight, at around 9 or 10pm EDT 
(0100 to 0200 UTC Thursday).  If a showstopper comes up before then, 
I'll email the list.  If you think we really aren't ready for beta, then 
I would still like to get a release out today.  In that case, we'll call 
it alpha and delay the betas.


There are two things I'd like to get in to 3.0:

 * .transform()/.untransform() methods (this is mostly done, just need
   to add the methods to PyUnicode, PyBytes and PyByteArray)

 * cleanup of the PyUnicode_AsString() and PyUnicode_AsStringAndSize()
   C APIs (these APIs don't fit into the naming scheme used in the
   Unicode API and have a few other issues as well, see issue 2799;
   at the very least they should be made interpreter internal, ie.
   rename them to _PyUnicode_AsString() and _PyUnicode_AsStringAndSize()
   to prevent their use in extensions)

I did not have time in the last few days to work on these and won't
in the next few days either. Next week looks much better.

If it's ok to make the above changes after the release (whatever you
call it ;-), that would be great.

Thanks,
--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] Betas today - I hope

2008-06-11 Thread M.-A. Lemburg

On 2008-06-11 17:15, Walter Dörwald wrote:

M.-A. Lemburg wrote:

On 2008-06-11 13:35, Barry Warsaw wrote:
So I had planned to do a bunch of work last night looking at the 
release blocker issues, but nature intervened.  A bunch of severe 
thunderstorms knocked out my 'net access until this morning.


I'll try to find some time during the day to look at the RB issues.  
Hopefully we can get Guido to look at them too and Pronounce on some 
of them.  Guido please start with:


http://bugs.python.org/issue643841

My plan is to begin building the betas tonight, at around 9 or 10pm 
EDT (0100 to 0200 UTC Thursday).  If a showstopper comes up before 
then, I'll email the list.  If you think we really aren't ready for 
beta, then I would still like to get a release out today.  In that 
case, we'll call it alpha and delay the betas.


There are two things I'd like to get in to 3.0:

 * .transform()/.untransform() methods (this is mostly done, just need
   to add the methods to PyUnicode, PyBytes and PyByteArray)


What would these methods do? Use the codec machinery without any type 
checks?


As discussed in another thread some weeks ago:

.transform() and .untransform() use the codecs to apply same-type
conversions. They do apply type checks to make sure that the
codec does indeed return the same type.

E.g. text.transform('xml-escape') or data.transform('base64').


I think for transformations we don't need the full codec machinery:

 ...

No need to invent another wheel :-) The codecs already exist for
Py2.x and can be used by the .encode()/.decode() methods in Py2.x
(where no type checks occur).

In Py3.x, .encode()/.decode() only allow conversions of the type
unicode <-> bytes. .transform()/.untransform() add conversions
of the type unicode <-> unicode or bytes <-> bytes.

All other conversions in Py3.x have to go through codecs.encode() and
codecs.decode() which are the generic codec access functions from
the codec registry.
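
The generic access functions can be tried out on Python 3.4 or later as sketched below. The sample values are made up, and the codec names `base64_codec` and `rot_13` are the standard library's same-type codecs.

```python
import codecs

# bytes -> bytes conversion through the codec registry.
encoded = codecs.encode(b"hello", "base64_codec")
decoded = codecs.decode(encoded, "base64_codec")

# str -> str conversion through the codec registry.
rotated = codecs.encode("hello", "rot_13")

# The str method is restricted to text encodings and refuses same-type codecs.
try:
    "hello".encode("rot_13")
    method_allowed = True
except LookupError:
    method_allowed = False
```

This mirrors the split described above: the methods handle text/bytes conversions, while same-type conversions go through the registry functions.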

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] Stabilizing the C API of 2.6 and 3.0

2008-06-09 Thread M.-A. Lemburg

On 2008-06-09 07:20, Gregory P. Smith wrote:

On Fri, Jun 6, 2008 at 2:19 AM, M.-A. Lemburg [EMAIL PROTECTED] wrote:


On 2008-06-03 01:29, Gregory P. Smith wrote:


On Mon, Jun 2, 2008 at 4:09 PM, Guido van Rossum [EMAIL PROTECTED]
wrote:

I will freely admit that I haven't followed this thread in any detail,
but if it were up to me, I'd have the 2.6 internal code use PyString


...

Should we read this as a BDFL pronouncement and make it so?

All that would mean change wise is that trunk r63675 as well as possibly
r63672 and r63677 would need to be rolled back and this whole discussion
over if such a big change should have happened would turn into a moot
point.


I would certainly welcome reverting the change.

All that's needed to support PyBytes API in 2.x is a set of #defines
that map the new APIs to the PyString names. That's a clean and
easily understandable solution.



Okay, I've reverted r63675 in trunk revision r64048.  That leaves all of the
Python modules and internals using PyString_ API names instead of PyBytes_
API names, as they were before.  PyBytes_ #defines exist for the appropriate
PyString methods in case anyone wants to use those.


Thanks.


Programmers interested in the code
for a PyString API can then still look up the code in stringobject.c,
e.g. to find out how a certain special case is handled or to check
the ref counting - just like they did for years.



The files still exist under the new names: bytesobject.c instead of
stringobject.c.  Those renames were done in the other CLs I mentioned, which
have not yet been reverted.  The current state seems a bit odd because the files
depend on the #defines to cause method definitions to use the PyString_ names
instead of the PyBytes_ names.


Please restore the original state, ie. PyString APIs live in
stringobject.h and stringobject.c. bytesobject.h should then have
the #defines for PyBytes APIs, pointing them to the PyString
names (basically what's currently in stringobject.h).


Developers who want to start differentiating between mixed byte/text
data and bytes-only can start using PyBytes for byte data.

I would also add macros that map the PyBytes_* APIs to PyString_*, but
I would not start using these internally except in code newly written
for 2.6 and intended to be in the spirit of 3.0. IOW use PyString
for 8-bit strings containing text, and PyBytes for 8-bit strings
containing binary data. For 8-bit strings that could contain either
text or data, I'd use PyString, in the spirit of 2.x.


Thanks,
--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] Stabilizing the C API of 2.6 and 3.0

2008-06-06 Thread M.-A. Lemburg

On 2008-06-03 01:29, Gregory P. Smith wrote:

On Mon, Jun 2, 2008 at 4:09 PM, Guido van Rossum [EMAIL PROTECTED] wrote:


I will freely admit that I haven't followed this thread in any detail,
but if it were up to me, I'd have the 2.6 internal code use PyString


...

Should we read this as a BDFL pronouncement and make it so?

All that would mean change wise is that trunk r63675 as well as possibly
r63672 and r63677 would need to be rolled back and this whole discussion
over if such a big change should have happened would turn into a moot point.


I would certainly welcome reverting the change.

All that's needed to support PyBytes API in 2.x is a set of #defines
that map the new APIs to the PyString names. That's a clean and
easily understandable solution.

Programmers interested in the code
for a PyString API can then still look up the code in stringobject.c,
e.g. to find out how a certain special case is handled or to check
the ref counting - just like they did for years.

Developers who want to start differentiating between mixed byte/text
data and bytes-only can start using PyBytes for byte data.


I would also add macros that map the PyBytes_* APIs to PyString_*, but
I would not start using these internally except in code newly written
for 2.6 and intended to be in the spirit of 3.0. IOW use PyString
for 8-bit strings containing text, and PyBytes for 8-bit strings
containing binary data. For 8-bit strings that could contain either
text or data, I'd use PyString, in the spirit of 2.x.


--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Modules for 2.6 inclusion

2008-06-06 Thread M.-A. Lemburg

On 2008-06-06 13:27, Georg Brandl wrote:

Hi,

PEP 361 lists the following modules for possible inclusion in 2.6 (next
to pyprocessing, which is now accepted):

- winerror
  http://python.org/sf/1505257
  (Owner: MAL)

This patch has been marked as rejected, so I'll remove the entry from
the PEP.


Note that the idea is still valid - the implementation of the module
should be written in C and the patch only comes with a Python module.

If anyone would like to work on a (generated) C module, please have
a look at the patch.

winerror is meant to provide access to the Windows error codes which
are currently not available in Python.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] Stabilizing the C API of 2.6 and 3.0

2008-06-03 Thread M.-A. Lemburg

On 2008-06-03 01:09, Guido van Rossum wrote:

I will freely admit that I haven't followed this thread in any detail,
but if it were up to me, I'd have the 2.6 internal code use PyString
(as both what the linker sees and what the human reads in the source
code) and the 3.0 code use PyBytes for the same thing. Let the merges
be damned -- most changes to 2.6 these days seem to be blocked
explicitly from being merged anyway. I'd prefer the 2.6 code base to
stay true to 2.x, and the 3.0 code base start afresh where it makes
sense. We should reindent more of the 3.0 code base to use
4-space-indents in C code too.

I would also add macros that map the PyBytes_* APIs to PyString_*, but
I would not start using these internally except in code newly written
for 2.6 and intended to be in the spirit of 3.0. IOW use PyString
for 8-bit strings containing text, and PyBytes for 8-bit strings
containing binary data. For 8-bit strings that could contain either
text or data, I'd use PyString, in the spirit of 2.x.


+1

Let's work on better merge tools that edit the trunk code base
into shape for a 3.x checkin.

Using automated tools for this
is likely going to lower the probability of bugs introduced
due to unnoticed merge conflicts and in the end is also going
to be a benefit to everyone wanting to maintain a single code
base for both targets.

Perhaps we could revive the old Tools/scripts/fixcid.py that was
used for the 1.4 -> 1.5 renaming ?!
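
To make the idea concrete, here is a rough sketch of what such an
identifier-renaming pass could look like. It is not fixcid.py itself; the
rename table and the function name rename_identifiers() are assumptions
made up for illustration, and the pass is deliberately naive (it also
rewrites matches inside comments and string literals).

```python
import re

# Hypothetical rename table: old identifier prefix -> new identifier prefix.
PREFIX_RENAMES = {"PyString_": "PyBytes_"}

# Matches whole C identifiers only, so "MyPyString_Foo" is seen as one token.
_IDENTIFIER = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*\b")


def rename_identifiers(source, prefix_renames=PREFIX_RENAMES):
    """Return source with every identifier that starts with an old prefix renamed."""

    def fix(match):
        name = match.group(0)
        for old, new in prefix_renames.items():
            if name.startswith(old):
                return new + name[len(old):]
        return name

    return _IDENTIFIER.sub(fix, source)


c_line = "x = PyString_FromString(s); MyPyString_Foo(x);"
converted = rename_identifiers(c_line)
```

Because the match is done on whole identifiers, names that merely contain
the old prefix somewhere in the middle are left alone. A real tool would
also need to skip string literals and handle type names that do not follow
the prefix pattern.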

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] Stabilizing the C API of 2.6 and 3.0

2008-06-02 Thread M.-A. Lemburg

On 2008-06-02 01:30, Gregory P. Smith wrote:

On Fri, May 30, 2008 at 1:37 AM, M.-A. Lemburg [EMAIL PROTECTED] wrote:

Sorry, I probably wasn't clear enough:

Why can't we have both PyString *and* PyBytes exposed as C
APIs (ie. visible in code and in the linker) in 2.x, with one redirecting
to the other ?


* Why should the 2.x code base turn to hacks, just because 3.x wants
to restructure itself ?

With the better explanation from Greg of what the checked in approach
achieves (i.e. preserving exact ABI compatibility for PyString_*, while
allowing PyBytes_* to be used at the source code level), I don't see what
has been done as being any more of a hack than the possibly more common
#define oldname newname (which *would* break binary compatibility).

The only things that I think would tidy it up further would be to:
- include an explanation of the approach and its effects on API and ABI
backward and forward compatibility within 2.x and between 2.x and 3.x in
stringobject.h
- expose the PyBytes_* functions to the linker in 2.6 as well as 3.0

Which is what I was suggesting all along; sorry if I wasn't
clear enough on that.

The standard approach is that you provide #define redirects from the
old APIs to the new ones (which are then picked up by the compiler)
*and* add function wrappers to the same effect (to make linkers,
dynamic load APIs such as ctypes and debuggers happy).


Example from pythonrun.h|c:
---

/* Use macros for a bunch of old variants */
#define PyRun_String(str, s, g, l) PyRun_StringFlags(str, s, g, l, NULL)

/* Deprecated C API functions still provided for binary compatibility */

#undef PyRun_String
PyAPI_FUNC(PyObject *)
PyRun_String(const char *str, int s, PyObject *g, PyObject *l)
{
   return PyRun_StringFlags(str, s, g, l, NULL);
}



Okay, how about this?  http://codereview.appspot.com/1521

Using that patch, both PyString_ and PyBytes_ APIs are available using
function stubs similar to the above.  I opted to define the stub
functions right next to the ones they were stubbing rather than
putting them all at the end of the file or in another file but they
could be moved if someone doesn't like them that way.


Thanks. I was working on a similar patch. Looks like you beat
me to it.

The only thing I'm not sure about is having the wrappers in the
same file - this is likely to cause merge conflicts when doing
direct merging and even with an automated renaming approach,
the extra code would be easier to remove if it were e.g. at
the end of the file or even better: in a separate file.

My patch worked slightly differently: it adds wrappers PyString*
that forward calls to the PyBytes* APIs and they all live in
stringobject.c. stringobject.h then also provides aliases
so that recompiled extensions pick up the new API names.

While working on my patch I ran into an issue that I haven't
been able to resolve: the wrapper functions got optimized away
by the linker and even though they appear in the libpython2.6.a,
they don't end up in the python binary itself.

As a result, importing a Python 2.5 extension in the resulting 2.6
binary still fails with an unresolved PyString symbol.

Please check whether that's the case for your patch as well.
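
One quick way to run such a check from within the interpreter is to ask
ctypes whether the running binary exports a given symbol. This is only a
sketch under assumptions: it is written for a current Python 3 interpreter
(where the PyBytes_* names are the real implementations), is_exported() is
a hypothetical helper, and the second symbol name is invented so that the
lookup is expected to fail.

```python
import ctypes


def is_exported(symbol_name):
    """Return True if the running interpreter exports the given C symbol."""
    # ctypes.pythonapi resolves names lazily; a missing symbol raises
    # AttributeError, which hasattr() turns into False.
    return hasattr(ctypes.pythonapi, symbol_name)


for name in ("PyBytes_FromString", "Py_NoSuchSymbolForIllustration"):
    print(name, "exported" if is_exported(name) else "missing")
```

Run against a build that carries wrapper functions, the same lookup would
show whether the wrappers actually ended up in the final binary or were
dropped at link time.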


I still believe that we should *not* make ease of merging the
primary motivation for backporting changes in 3.x to 2.x. Software
design should not be guided by restrictions in the tool chain,
if not absolutely necessary.

The main argument for a backport needs to be general usefulness
to the 2.x users, IMHO... just like any other feature that
makes it into 2.x.

If merging is difficult then this needs to be addressed, but
there are more options to that than always going back to the
original 2.x trunk code. I've given a few suggestions on how
this could be approached in other emails on this thread.


I am not the one doing the merging or working on merge tools so I'll
leave this up to those that are.


I'm not sure whether there are any specific merge tools around -
apart from the 2to3.py script.

There also doesn't seem to be any documentation on the merge
process itself (at least nothing that Google can find in the
PEPs), so it's difficult to make any suggestions.

Thanks,
--
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] [Python-3000] Stabilizing the C API of 2.6 and 3.0

2008-05-29 Thread M.-A. Lemburg

On 2008-05-28 19:08, Bill Janssen wrote:

I'm beginning to wonder whether I'm the only one who cares about
the Python 2.x branch not getting cluttered up with artifacts caused
by a broken forward merge strategy.


I share your concern.  Seems to me that perhaps (not sure, but
perhaps) the rush to back-port from 3.x, and the concern about
minimizing pain of moving from 2.x to 3.x, has become the tail wagging
the dog.


Indeed.

If the need to be able to forward merge changes from the 2.x trunk
to the 3.x branch is the only reason for the current approach, then
we need to find a better procedure for getting patches to 2.x
forwarded to 3.x.

I believe that everyone is aware that 3.x breaks things and that's
fine.

However, the reason for introducing such breakage in 3.x
is that users have the option to decide whether and when to switch
to the new major version.

Being able to play with 3.x features in 2.x is nice, but I wouldn't
really consider those essential for 2.x. It certainly doesn't
warrant causing major problems in the 2.x releases.

The module renaming backport was one example (which was undone again),
the C API renaming is another. I expect more such features to be
backported from 3.x to 2.x (even though I don't really think it's
worth the trouble) and since this always means that changes have
to be applied in two worlds, we'll need a better process for getting
changes in one major release ported to the other.

Simply tweaking 2.x into shape so that the rather simple-minded
SVN merge command works isn't a good enough procedure for this.

That's why I suggested using an intermediate form or branch
for the merging - one that implements the 2.x code with all renaming
and syntax fixing applied.

This would:

 * reduce the number of merge conflicts since the renaming
   would already have happened

 * reduce the patch sizes that have to be applied to 3.x in
   order to stay in sync with 2.x

 * result in a tool chain that makes it easier for all Python
   users to port their code to 3.x

 * simplify renaming or reorg of modules, functions, methods
   and C APIs without requiring major changes on either side

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] Stabilizing the C API of 2.6 and 3.0

2008-05-29 Thread M.-A. Lemburg

On 2008-05-28 22:47, Gregory P. Smith wrote:

On Wed, May 28, 2008 at 3:12 AM, M.-A. Lemburg [EMAIL PROTECTED] wrote:

I'm beginning to wonder whether I'm the only one who cares about
the Python 2.x branch not getting cluttered up with artifacts caused
by a broken forward merge strategy.

How can it be that we allow major C API changes such as the renaming
of the PyString APIs to go into the trunk without discussion or
a PEP ?


I do not consider it a C API change.  The API and ABI have not
changed.  Old code still compiles.  Old binaries still dynamically
load and work fine.  (I just confirmed this by importing a couple
python2.4 .so files into my non-debug build of 2.6 trunk)

All of the PyString APIs are the real implementations in 2.x and are
still there.  We only switched to using their PyBytes equivalent names
within the Python trunk code base.

Are you objecting to our own code switching to use a different name
even though the actual underlying API and ABI haven't changed?  I
suppose to people reading the code and going against old reference
books it could be confusing but they've got to get used to the new
names somehow and sometime.

I strongly support changes like this one that make the life of
porting C code forwards and backwards between 2.x and 3.x easier
without breaking compatibility with earlier 2.x versions, because that
is going to be a serious pain for all of us otherwise.


Well, first of all, it is a change in the C API:
APIs have different names now, they live in different files,
the Python documentation doesn't apply anymore, books have to
be updated, programmers trained, etc. etc. That's fine for
3.x, it's not for 2.x.

Second, if you leave out the ease-of-merging argument, all of
this is not really necessary in 2.x. If you absolutely want
to have PyBytes APIs in 2.x, then you can *add* them, without
removing the PyString APIs. We have done that on a smaller
scale a couple of times in the past (turned functions into
macros or vice-versa).

And finally, the merge argument itself is not really all that
strong. It's just a matter of getting the procedure corrected.
Then you can rename and restructure as much as you want in
3.x - without affecting the stability and matureness of the
2.x branch.

I expect more of these backports to happen, so we better get
things done right now instead of putting Python's reputation
as a stable and mature programming language at risk.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] Stabilizing the C API of 2.6 and 3.0

2008-05-29 Thread M.-A. Lemburg

Christian,

so far you have not responded to any of the suggestions made on
this thread, only defended your checkin. That's not very helpful
in getting to some conclusion.

* What's so hard about going with a proper, standard solution that
doesn't involve using your preprocessor hack ?

* Why can't we have both PyString *and* PyBytes exposed in 2.x,
with one redirecting to the other ?

* Why should the 2.x code base turn to hacks, just because 3.x wants
to restructure itself ?

* Why aren't you even considering my proposed solution for this
whole renaming and reorg problem ?

BTW: Is there some PEP or wiki page explaining how you actually
implement the merging from 2.x to 3.x ? I'm still under the assumption
that you're only using svnmerge.py for this and doing straight
merging from the trunk to the branch.

Not sure how others feel about it, but if the only option you would
feel comfortable with is not having the 3.x renaming backported,
then I'd rather go with that, really. It's easy enough to add
a header file to map PyString APIs to PyBytes if you want to
port an extension to 3.x.

Thanks,
--
Marc-Andre Lemburg
eGenix.com



On 2008-05-29 17:45, Christian Heimes wrote:

M.-A. Lemburg schrieb:

Well, first of all, it is a change in the C API:
APIs have different names now, they live in different files,
the Python documentation doesn't apply anymore, books have to
be updated, programmers trained, etc. etc. That's fine for
3.x, it's not for 2.x.


No, that's not correct. The 2.x API is still the same. I've only changed
the internal code.


Second, if you leave out the ease-of-merging argument, all of
this is not really necessary in 2.x. If you absolutely want
to have PyBytes APIs in 2.x, then you can *add* them, without
removing the PyString APIs. We have done that on a smaller
scale a couple of times in the past (turned functions into
macros or vice-versa).


The PyString methods are still available and the official API for
dealing with str objects in 2.x.


And finally, the merge argument itself is not really all that
strong. It's just a matter of getting the procedure corrected.
Then you can rename and restructure as much as you want in
3.x - without affecting the stability and matureness of the
2.x branch.


I'm volunteering to revert my changes if you are volunteering to keep
the Python 2.x series in sync with the 3.x series.

Christian


Re: [Python-Dev] Stabilizing the C API of 2.6 and 3.0

2008-05-28 Thread M.-A. Lemburg

I'm beginning to wonder whether I'm the only one who cares about
the Python 2.x branch not getting cluttered up with artifacts caused
by a broken forward merge strategy.

How can it be that we allow major C API changes such as the renaming
of the PyString APIs to go into the trunk without discussion or
a PEP ?

We're having lengthy discussions about the addition of a single method
to an object, but such major changes just go in like that and nobody
seems to really care.

Puzzled,
--
Marc-Andre Lemburg
eGenix.com



On 2008-05-27 12:10, M.-A. Lemburg wrote:

On 2008-05-26 23:34, Christian Heimes wrote:

M.-A. Lemburg schrieb:

Isn't that an awfully confusing approach ?

Wouldn't it be better to keep PyString APIs and definitions in
stringobject.c|h

and only add a new bytesobject.h header file that #defines the
PyBytes APIs in terms of PyString APIs ? That maintains
backwards compatibility and allows Python internals to use the
new API names.

With your approach, you've basically backported the confusing
notion in Py3k that str() maps to PyUnicode, only that in Py2
str() will now map to PyBytes.


The last time I brought up the topic, I had a lengthy discussion with
Guido. At first I wanted to rename the API in Python 3.0 only. Guido
argued that it's going to cause too many merge conflicts. He then
suggested the approach I implemented today.


That's the same argument that came up in the module renaming
discussion.

I have a feeling that we should be looking for better merge
tools, rather than implement code changes that cause more trouble
than good, just because our existing tools aren't smart
enough.

Wouldn't it be possible to have a 2to3.py converter
take the 2.x code (including the C code), convert it and then
apply any changes to the 3.x branch ?

This wouldn't be merging in the classical sense, it would be
automated forward porting.


I find the approach less confusing than your suggestion and my initial
idea.


I disagree on that.

Renaming old APIs to use the new names by adding a header file with
#define oldname newname is standard practice.

Renaming the old APIs in the source code and undoing the renaming
with a header file is not.


The internal API names are consistent for Python 2.6 and 3.0. The
byte string C API is prefixed PyBytes and the unicode C API is prefixed
PyUnicode. A core developer just has to remember that 'str' is a byte
string in 2.x but a unicode object in 3.0.


So you've solved part of the problem for 3.x by moving the naming mixup
back to 2.x.


Extension developers don't have to worry at all. The ABI and external
API is mostly the same and still exposes the 'str' functions as PyString.


Well, yes, but only due to a preprocessor hack that turns the
names used in bytesobject.c back into names you'd normally look
for in stringobject.c.

And all this, just because Subversion can't handle merging of
symbol renaming.


You'd have to add an alias bytes -> str to the builtins to
at least reduce the confusion a bit.


Python 2.6 already has an alias bytes -> str


Yes, but please let's first discuss this some more. I don't think
that the timing was right: you started this thread just yesterday
and the patches are already checked in.


I'm sorry if I was too hasty for you. I got +1 from a couple of
developers and it's basically Guido's suggestion.


Please discuss any changes of the 2.x code base on python-dev.

Such major changes do need more discussion and possibly a PEP as well.

Thanks,




Re: [Python-Dev] [Python-3000] Stabilizing the C API of 2.6 and 3.0

2008-05-28 Thread M.-A. Lemburg

On 2008-05-28 14:29, Paul Moore wrote:

On 28/05/2008, M.-A. Lemburg [EMAIL PROTECTED] wrote:

I'm beginning to wonder whether I'm the only one who cares about
the Python 2.x branch not getting cluttered up with artifacts caused
by a broken forward merge strategy.


I care, but I struggle to understand the implications and/or what is
being proposed in many cases.


Thanks, so I'm not the only one :-)


Recent examples are the ABC backports and the current thread (string C
API). I simply don't follow the issues well enough to comment.


How can it be that we allow major C API changes such as the renaming
of the PyString APIs to go into the trunk without discussion or
a PEP ?


Christian has raised this a couple of times, but there has been little
discussion. I suspect that this is because there is not enough clarity
over the practical consequences. A PEP may help here, but I'm not sure
how much - it could spark discussion, but would anyone actually end up
any better informed?


Probably, yes.

The reason is that if you have a PEP, more people are likely to
review it and make comments.

If you start a discussion with a general subject line which then
results in lots of little sub-threads, important aspects of the
discussion are likely to go unnoticed in the noise.


We're having lengthy discussions about the addition of a single method
to an object, but such major changes just go in like that and nobody
seems to really care.


I suspect deadline pressure and burnout are involved here.

In all honesty, there's been little or no work done on the C API,
which is just as much in need of review and possible cleanup for 3.0
as the language. It's as close as makes no difference to too late now
- does that mean we've lost the chance?


Perhaps, but the C API is certainly not used by as many people
as the Python front-end and changes to the C API also have much
deeper consequences due to the API being written in C rather than
Python.

Overall, I don't think there's a lot to clean up in the C API.
Perhaps remove a few of those '...Ex()' APIs that were introduced
to extend the original APIs and maybe remove or free up a few
type slots that are no longer needed, but that's about it.

--
Marc-Andre Lemburg
eGenix.com


