Re: [Python-Dev] A smarter shutil.copytree ?
On Mon, Apr 21, 2008 at 2:25 AM, Steven Bethard
<[EMAIL PROTECTED]> wrote:
>
> On Sun, Apr 20, 2008 at 4:15 PM, Tarek Ziadé <[EMAIL PROTECTED]> wrote:
> > I have submitted a patch for review here: http://bugs.python.org/issue2663
> >
> > glob-style patterns or a callable (for complex cases) can be provided
> > to filter out files or directories.
>
> I'm not a big fan of the sequence-or-callable argument. Why not just
> make it a callable argument, and supply a utility function so that you
> can write something like::
>
> exclude_func = shutil.excluding_patterns('*.tmp', 'test_dir2')
> shutil.copytree(src_dir, dst_dir, exclude=exclude_func)
>
> ?
I made another draft based on a single callable argument to try out:
http://bugs.python.org/file10073/shutil.copytree.filtering.patch
The callable takes the src directory and its contents as a list, and
returns the entries eligible for exclusion.
That makes me wonder, like Alexander said on the bug tracker:
In the glob-style patterns callable, do we want to deal with absolute paths ?
Tarek
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] r62342 - python/branches/py3k/Objects/bytesobject.c
Neal Norwitz wrote:
> I haven't seen any action on 3to2 (although I'm very behind on email).
> Stefan, could you try to implement some of these and report back how
> it works?

No, sorry, that's too low a priority for me currently.

Stefan
Re: [Python-Dev] Encoding detection in the standard library?
On 2008-04-21 23:31, Martin v. Löwis wrote:
>> This is useful when you get a hunk of data which _should_ be some
>> sort of intelligible text from the Big Scary Internet (say, a posted
>> web form or email message), and you want to do something useful with
>> it (say, search the content).
>
> I don't think that should be part of the standard library. People
> will mistake what it tells them for certain.

+1

I also think that it's better to educate people to add (correct)
encoding information to their text data, rather than give them a
guess mechanism...

http://chardet.feedparser.org/docs/faq.html#faq.yippie

chardet is based on the Mozilla algorithm and at least in my
experience that algorithm doesn't work too well.

The Mozilla algorithm may work for Asian encodings due to the fact
that those encodings are usually also bound to a specific language
(and you can then use character and word frequency analysis), but
for encodings which can encode far more than just a single language
(e.g. UTF-8 or Latin-1), the correct detection rate is rather low.

The problem becomes even more difficult when leaving the normal text
domain or when mixing languages in the same text, e.g. when trying to
detect source code with comments using a non-ASCII encoding.

The "trick" to just pass the text through a codec and see whether it
roundtrips also doesn't necessarily help: Latin-1, for example, will
always round-trip, since Latin-1 is a subset of Unicode.

IMHO, more research has to be done into this area before a
"standard" module can be added to Python's stdlib... and
who knows, perhaps we're lucky and by then everyone is
using UTF-8 anyway :-)

--
Marc-Andre Lemburg
eGenix.com
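
To illustrate the round-trip point, here is a minimal sketch. It is written for current Python 3 rather than the 2.x of this thread, and the helper name `roundtrips` is made up for the example. It shows that UTF-8 bytes pass a Latin-1 round-trip check even though the decoded text is wrong, while a stricter codec such as UTF-8 does reject foreign bytes.

```python
def roundtrips(data, encoding):
    """Return True if data decodes and re-encodes to the same bytes.

    Hypothetical helper for this illustration only.
    """
    try:
        return data.decode(encoding).encode(encoding) == data
    except UnicodeError:
        return False


# Every possible byte value is valid Latin-1, so the check can never fail.
all_bytes = bytes(range(256))
all_ok = roundtrips(all_bytes, "latin-1")

# Bytes that were really written as UTF-8 ...
utf8_data = "Löwis".encode("utf-8")

# ... still pass the Latin-1 round trip, but the decoded text is garbage.
latin1_ok = roundtrips(utf8_data, "latin-1")
latin1_text = utf8_data.decode("latin-1")

# A stricter codec can at least say "no": these are Latin-1 bytes for "été".
utf8_rejects = roundtrips(b"\xe9t\xe9", "utf-8")

print(all_ok, latin1_ok, repr(latin1_text), utf8_rejects)
```

So a passing round trip is only evidence when the codec is able to fail on some inputs.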
[Python-Dev] pydoc works with eggs? (python-2.5.1)
pydoc blew up when I tried to view doc for pytools module, which is an egg:
pydoc -p 8082
pydoc server ready at http://localhost:8082/
Exception happened during processing of request from ('127.0.0.1', 52915)
Traceback (most recent call last):
  File "/usr/lib64/python2.5/SocketServer.py", line 222, in handle_request
    self.process_request(request, client_address)
  File "/usr/lib64/python2.5/SocketServer.py", line 241, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib64/python2.5/SocketServer.py", line 254, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib64/python2.5/SocketServer.py", line 522, in __init__
    self.handle()
  File "/usr/lib64/python2.5/BaseHTTPServer.py", line 316, in handle
    self.handle_one_request()
  File "/usr/lib64/python2.5/BaseHTTPServer.py", line 310, in handle_one_request
    method()
  File "/usr/lib64/python2.5/pydoc.py", line 1924, in do_GET
    self.send_document(describe(obj), html.document(obj, path))
  File "/usr/lib64/python2.5/pydoc.py", line 321, in document
    if inspect.ismodule(object): return self.docmodule(*args)
  File "/usr/lib64/python2.5/pydoc.py", line 672, in docmodule
    contents.append(self.document(value, key, name, fdict, cdict))
  File "/usr/lib64/python2.5/pydoc.py", line 322, in document
    if inspect.isclass(object): return self.docclass(*args)
  File "/usr/lib64/python2.5/pydoc.py", line 807, in docclass
    lambda t: t[1] == 'method')
  File "/usr/lib64/python2.5/pydoc.py", line 735, in spill
    funcs, classes, mdict, object))
  File "/usr/lib64/python2.5/pydoc.py", line 323, in document
    if inspect.isroutine(object): return self.docroutine(*args)
  File "/usr/lib64/python2.5/pydoc.py", line 891, in docroutine
    getdoc(object), self.preformat, funcs, classes, methods)
  File "/usr/lib64/python2.5/pydoc.py", line 79, in getdoc
    result = inspect.getdoc(object) or inspect.getcomments(object)
  File "/usr/lib64/python2.5/inspect.py", line 521, in getcomments
    lines, lnum = findsource(object)
  File "/usr/lib64/python2.5/inspect.py", line 510, in findsource
    if pat.match(lines[lnum]): break
IndexError: list index out of range
Re: [Python-Dev] Encoding detection in the standard library?
> IMHO, more research has to be done into this area before a
> "standard" module can be added to Python's stdlib... and
> who knows, perhaps we're lucky and by then everyone is
> using UTF-8 anyway :-)

I walked over to our computational linguistics group and asked. This
is often combined with language guessing (which uses a similar
approach, but using characters instead of bytes), and apparently can
usually be done with high confidence. Of course, they're usually
looking at clean texts, not random "stuff". I'll see if I can get
some references and report back -- most of the research on this was
done in the 90's.

Bill
Re: [Python-Dev] configure error: "rm: conftest.dSYM: is a directory"
On 5 Apr, 2008, at 21:17, [EMAIL PROTECTED] wrote:
> I just noticed this error message during configure:
>
>   checking whether gcc accepts -Olimit 1500... no
>   checking whether gcc supports ParseTuple __format__... no
>   checking whether pthreads are available without options... yes
>   checking whether g++ also accepts flags for thread support... no
>   checking for ANSI C header files... rm: conftest.dSYM: is a directory
>   rm: conftest.dSYM: is a directory
>   yes
>   checking for sys/types.h... yes
>   checking for sys/stat.h... yes
>   checking for stdlib.h... yes
>   checking for string.h... yes
>
> Note the "rm: conftest.dSYM: is a directory". This occurred a few times
> during the configure process. Didn't cause it to conk out, but is
> annoying.
>
> Skip

I've looked into this issue. It is harmless and caused by an
interaction between AC_TRY_RUN and gcc on Leopard. Gcc generates
'.dSYM' directories when linking with debugging enabled. These
directories contain detached debugging information (see man dsymutil).

AC_TRY_RUN tries to remove 'conftest.*' using rm, without the -r flag.
The end result is an error message during configure and a
'conftest.dSYM' turd.

AFAIK this is not easily fixed without changing the definition of
AC_TRY_RUN, at least not without crude hacks.

Ronald
Re: [Python-Dev] Encoding detection in the standard library?
On 22-Apr-08, at 12:30 AM, Martin v. Löwis wrote:
>> IMO, encoding estimation is something that many web programs will
>> have to deal with
>
> Can you please explain why that is? Web programs should not normally
> have the need to detect the encoding; instead, it should be specified
> always - unless you are talking about browsers specifically, which
> need to support web pages that specify the encoding incorrectly.

Two cases come immediately to mind: email and web forms.

When a web browser POSTs data, there is no standard way of
communicating which encoding it's using. There are some hints which
make it easier (accept-charset attributes, the encoding used to send
the page to the browser), but no guarantees.

Email is a smaller problem, because it usually has a helpful
content-type header, but that's no guarantee.

Now, at the moment, the only data I have to support this claim is my
experience with DrProject in non-English locations.
If I'm the only one who has had these sorts of problems, I'll go back
to "Unicode for Dummies".

>> so it might as well be built in; I would prefer the option
>> to run `text=input.encode('guess')` (or something similar) than
>> relying on an external dependency or worse yet using a hand-rolled
>> algorithm.
>
> Ok, let me try differently then. Please feel free to post a patch to
> bugs.python.org, and let other people rip it apart.
> For example, I don't think it should be a codec, as I can't imagine it
> working on streams.

As things frequently are, it seems like this is a much larger problem
than I originally believed.

I'll go back and take another look at the problem, then come back if
new revelations appear.
Re: [Python-Dev] Encoding detection in the standard library?
The 2002 paper "A language and character set determination method
based on N-gram statistics" by Izumi Suzuki, Yoshiki Mikami, Ario
Ohsato and Yoshihide Chubachi seems to me a pretty good way to go
about this. They're looking at "LSE"s, language-script-encoding
triples; a "script" is a way of using a particular character set to
write in a particular language.

Their system has these requirements:

R1. the response must be either "correct answer" or "unable to
    detect" where "unable to detect" includes "other than registered"
    [the registered set of LSEs];
R2. applicable to multi-LSE texts;
R3. never accept a wrong answer, even when the program does not have
    enough data on an LSE; and
R4. applicable to any LSE text.

So, no wrong answers.

The biggest disadvantage would seem to be that the registration data
for a particular LSE is kind of bulky; on the order of 10,000
shift-codons, each of three bytes, about 30K uncompressed.

http://portal.acm.org/ft_gateway.cfm?id=772759&type=pdf

Bill

> > IMHO, more research has to be done into this area before a
> > "standard" module can be added to Python's stdlib... and
> > who knows, perhaps we're lucky and by then everyone is
> > using UTF-8 anyway :-)
>
> I walked over to our computational linguistics group and asked. This
> is often combined with language guessing (which uses a similar
> approach, but using characters instead of bytes), and apparently can
> usually be done with high confidence. Of course, they're usually
> looking at clean texts, not random "stuff". I'll see if I can get
> some references and report back -- most of the research on this was
> done in the 90's.
>
> Bill
Re: [Python-Dev] A smarter shutil.copytree ?
On Tue, Apr 22, 2008 at 1:56 AM, Tarek Ziadé <[EMAIL PROTECTED]> wrote:
> On Mon, Apr 21, 2008 at 2:25 AM, Steven Bethard <[EMAIL PROTECTED]> wrote:
> > On Sun, Apr 20, 2008 at 4:15 PM, Tarek Ziadé <[EMAIL PROTECTED]> wrote:
> > > I have submitted a patch for review here: http://bugs.python.org/issue2663
> > >
> > > glob-style patterns or a callable (for complex cases) can be provided
> > > to filter out files or directories.
> >
> > I'm not a big fan of the sequence-or-callable argument. Why not just
> > make it a callable argument, and supply a utility function so that you
> > can write something like::
> >
> > exclude_func = shutil.excluding_patterns('*.tmp', 'test_dir2')
> > shutil.copytree(src_dir, dst_dir, exclude=exclude_func)
>
> I made another draft based on a single callable argument to try out:
> http://bugs.python.org/file10073/shutil.copytree.filtering.patch
>
> The callable takes the src directory and its contents as a list, and
> returns the entries eligible for exclusion.
FWIW, that looks better to me.
> That makes me wonder, like Alexander said on the bug tracker:
> In the glob-style patterns callable, do we want to deal with absolute paths ?
I think that it would be okay to document that
shutil.ignore_patterns() only accepts patterns matching individual
filenames (not complex paths). If someone needs to do something with
absolute paths, then they can write their own 'ignore' function,
right?
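
A rough sketch of what that could look like, assuming the interface that eventually shipped in the standard library: `shutil.copytree(src, dst, ignore=...)`, where the callable receives the directory being visited plus the list of names in it and returns the names to skip. `ignore_under` below is a hypothetical helper written for this example, not part of shutil, and the temporary directory layout is invented.

```python
import fnmatch
import os
import shutil
import tempfile


def ignore_under(abs_patterns):
    """Build an ignore callable that matches glob patterns against
    absolute paths (hypothetical helper, not part of shutil)."""
    def _ignore(directory, names):
        ignored = []
        for name in names:
            full = os.path.abspath(os.path.join(directory, name))
            if any(fnmatch.fnmatch(full, pat) for pat in abs_patterns):
                ignored.append(name)
        return ignored
    return _ignore


# Build a small throwaway tree to copy.
root = tempfile.mkdtemp()
src = os.path.join(root, "src")
os.makedirs(os.path.join(src, "test_dir2"))
os.makedirs(os.path.join(src, "keep"))
for rel in ("a.txt", "b.tmp",
            os.path.join("keep", "c.txt"),
            os.path.join("test_dir2", "d.txt")):
    with open(os.path.join(src, rel), "w") as fh:
        fh.write("x")

# Filename-only patterns: the documented use of ignore_patterns().
dst1 = os.path.join(root, "dst1")
shutil.copytree(src, dst1,
                ignore=shutil.ignore_patterns("*.tmp", "test_dir2"))

# Absolute-path filtering: write your own callable.
dst2 = os.path.join(root, "dst2")
shutil.copytree(src, dst2,
                ignore=ignore_under([os.path.join(src, "keep")]))

copied1 = sorted(os.listdir(dst1))
copied2 = sorted(os.listdir(dst2))
print(copied1, copied2)

shutil.rmtree(root)
```

The custom callable stays a few lines long, which supports keeping the built-in helper limited to plain filenames.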
Steve
--
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
--- Bucky Katt, Get Fuzzy
Re: [Python-Dev] Encoding detection in the standard library?
> When a web browser POSTs data, there is no standard way of communicating
> which encoding it's using.

That's just not true. Web browsers should and do use the encoding of the
web page that originally contained the form.

> There are some hints which make it easier
> (accept-charset attributes, the encoding used to send the page to the
> browser), but no guarantees.

Not true. The latter is guaranteed (unless you assume bugs - but if
you do, can you present a specific browser that has that bug?)

> Email is a smaller problem, because it usually has a helpful
> content-type header, but that's no guarantee.

Then assume windows-1252. Mailers who don't use MIME for non-ASCII
characters mostly died 10 years ago; those people who continue to use
them likely can accept occasional mojibake (or else they would have
switched long ago).

> Now, at the moment, the only data I have to support this claim is my
> experience with DrProject in non-English locations.
> If I'm the only one who has had these sorts of problems, I'll go back to
> "Unicode for Dummies".

For web forms, I always encode the pages in UTF-8, and that always
works.

For email, I once added encoding processing to pipermail (the
mailman archiver), and that also always works.

> I'll go back and take another look at the problem, then come back if new
> revelations appear.

Good luck!

Martin
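
The "trust the declared charset, otherwise assume windows-1252" policy can be sketched as follows. This is an illustration in current Python 3, not code from pipermail or DrProject; the function name `decode_with_fallback` is invented, and falling back when the declared charset fails to decode is an assumption added here, not something stated above.

```python
def decode_with_fallback(raw, declared=None, fallback="windows-1252"):
    """Decode raw bytes using the declared charset when there is one,
    otherwise (or if it fails) fall back to windows-1252.

    Hypothetical helper; the retry-on-failure behaviour is an assumption.
    """
    if declared:
        try:
            return raw.decode(declared)
        except (LookupError, UnicodeDecodeError):
            # Unknown charset name, or bytes that do not fit it.
            pass
    # A few byte values are undefined in windows-1252, so replace
    # rather than raise.
    return raw.decode(fallback, errors="replace")


declared_ok = decode_with_fallback("café".encode("utf-8"), "utf-8")
no_header = decode_with_fallback(b"caf\xe9")
wrong_header = decode_with_fallback(b"caf\xe9", "utf-8")
smart_quotes = decode_with_fallback(b"\x93hi\x94")

print(declared_ok, no_header, wrong_header, smart_quotes)
```

Nothing here guesses: the result is deterministic given the header, which is the contrast with a detection module.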
Re: [Python-Dev] BSDDB3
Hi Jesus,

> Martin v. Löwis wrote:
> | I think it would be helpful if you could analyze the crashes that
> | bsddb caused on Windows. Just go back a few revisions in the
> | subversion tree to reproduce the crashes.
>
> I have no MS Windows machines in my environment :-(

I remember those rampant BSDDB crashes on Windows well. I brought
this up with Martin at PyCon; I really don't think we can fault BSDDB
here -- basically, the tests weren't cleaning up their environment in
the right order, so BSDDB was getting passed completely and utterly
bogus values. I *think* I managed to persuade Martin that this was
indeed our fault, and we can't really hold BSDDB accountable. (My
argument being that if a 3rd party app says the behaviour of a method
is undefined if you pass it a null pointer, and you pass it a null
pointer, and it crashes your program, it's your fault, not theirs.)

Once this was addressed, the BSDDB tests ran more or less on Windows
32-bit without error. Windows x64 was another matter though -- I
traced the problem down to wildly conflicting compiler and linker
flags between our Python build and how we were building BSDDB (or
rather how BSDDB builds out of the box on Windows).

My solution was to drop our reliance on the
Berkeley_DB.sln/db_static.vcproj files completely, and mimic a bsddb44
vcproj in our own pcbuild.sln, which basically meant all the BSDDB
source code got built in the exact same fashion as the rest of Python.
I also took this approach with sqlite3 and it's worked really well --
there have been no issues with either module since this change.

I've also got bsddb45.vcproj and bsddb46.vcproj projects floating
around in one of my local branches somewhere. These mimic the
corresponding BSDDB projects, with the intent being that when it comes
to release time for 2.6 and 3.0, we'd make a decision about which one
to ship with, and then set the Python _bsddb module to use that. I
should probably pick that up again...

Hope this clarifies things...

Regards,

Trent.
Re: [Python-Dev] 3k checkin mails to python-checkins
> Since a few days, checkin notifications for the 3k branch seem to be sent
> to both the python-checkins and the python-3000-checkins lists. Was that a
> deliberate decision or has some bug crept into the SVN hook?

This should be fixed now. The new mailer.py had named some config
options differently from the old one.

Regards,
Martin
Re: [Python-Dev] Encoding detection in the standard library?
On 22-Apr-08, at 3:31 AM, M.-A. Lemburg wrote:
>> I don't think that should be part of the standard library. People
>> will mistake what it tells them for certain.
>
> +1
>
> I also think that it's better to educate people to add (correct)
> encoding information to their text data, rather than give them a
> guess mechanism...

That is a fallacious alternative: the programmers that need encoding
detection are not the same people who are omitting encoding
information.

I only have a small opinion on whether charset detection should
appear in the stdlib, but I am somewhat perplexed by the arguments in
this thread. I don't see how inclusion in the stdlib would make
people more inclined to think that the algorithm is always correct.
In terms of the need of this functionality:

Martin wrote:
> Can you please explain why that is? Web programs should not normally
> have the need to detect the encoding; instead, it should be specified
> always - unless you are talking about browsers specifically, which
> need to support web pages that specify the encoding incorrectly.

Any program that needs to examine the contents of
documents/feeds/whatever on the web needs to deal with
incorrectly-specified encodings (which, sadly, is rather common). The
set of programs that need this functionality is probably the same set
that needs BeautifulSoup--I think that set is larger than just
browsers.

-Mike
Re: [Python-Dev] Encoding detection in the standard library?
[CCing python-dev again]

On 2008-04-22 12:38, Greg Wilson wrote:
>> I don't think that should be part of the standard library. People
>> will mistake what it tells them for certain.
> [etc]
>
> These are all good arguments, but the fact remains that we can't
> control our inputs (e.g., we're archiving mail messages sent to lists
> managed by DrProject), and some of those inputs *don't* tell us how
> they're encoded. Under those circumstances, what would you recommend?

I haven't done much research into this, but in general, I think it's
better to:

 * first try to look at other characteristics of a text
   message, e.g. language, origin, topic, etc.,

 * then narrow down the number of encodings which could apply,

 * rank them to try to avoid ambiguities and

 * then try to see what percentage of the text you can
   decode using each of the encodings in reverse ranking
   order (ie. more specialized encodings should be tested
   first, latin-1 last).

--
Marc-Andre Lemburg
eGenix.com
Re: [Python-Dev] Known doctest bug with unicode?
-On [20080418 18:05], Adam Olsen ([EMAIL PROTECTED]) wrote:
>4. Make doctest smarter, so that it can grab the original module's encoding.
>5. Wait until 3.0, where this is hopefully fixed by making doctests
>use unicode by default?

Getting rid of the u in front of the strings as required made Python 3
indeed run the doctests as they should. So there's a difference in
behaviour between 2.x and 3.0 when it comes to this part.

I guess the better behaviour would be for doctest to honour the
encoding specified in the file/module? If other people agree I can see
what I can do to make that work.

--
Jeroen Ruigrok van der Werven / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
Confutatis maledictis, flammis acribus addictis...
Re: [Python-Dev] Encoding detection in the standard library?
>> Can you please explain why that is? Web programs should not normally
>> have the need to detect the encoding; instead, it should be specified
>> always - unless you are talking about browsers specifically, which
>> need to support web pages that specify the encoding incorrectly.
>
> Any program that needs to examine the contents of
> documents/feeds/whatever on the web needs to deal with
> incorrectly-specified encodings

That's not true. Most programs that need to examine the contents of
a web page don't need to guess the encoding. In most such programs,
the encoding can be hard-coded if the declared encoding is not
correct. Most such programs *know* what page they are webscraping,
or else they couldn't extract the information out of it that they
want to get at.

As for feeds - can you give examples of incorrectly encoded ones?
(I don't ever use feeds, so I honestly don't know whether they
are typically encoded incorrectly. I've heard they are often XML,
in which case I strongly doubt they are incorrectly encoded.)

As for "whatever" - can you give specific examples?

> (which, sadly, is rather common). The
> set of programs of programs that need this functionality is probably the
> same set that needs BeautifulSoup--I think that set is larger than just
> browsers

Again, can you give *specific* examples that are not web browsers?
Programs needing BeautifulSoup may still not need encoding guessing,
since they still might be able to hard-code the encoding of the web
page they want to process.

In any case, I'm very skeptical that a general "guess encoding"
module would do a meaningful thing when applied to incorrectly
encoded HTML pages.

Regards,
Martin
[Python-Dev] py3k: print function treats sep=None and end=None in an unintuitive way
Can anybody please explain to me why print('a', 'b', sep=None, end=None)
should produce "a b\n" instead of "ab"?
I've read http://docs.python.org/dev/3.0/library/functions.html#print, pep-3105
and some ml threads but did not find a good reason justifying such strange
behaviour.
Thanks.
-Alessandro Guido
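
For reference, the behaviour in question can be reproduced with the short sketch below (Python 3; `render` is a throwaway helper invented here to capture the output). In Python 3 as released, passing None for sep or end means "use the default", so empty strings are what suppress the separator and the newline.

```python
import io


def render(*args, **kwargs):
    """Capture what print() would write (helper for this example only)."""
    buf = io.StringIO()
    print(*args, file=buf, **kwargs)
    return buf.getvalue()


default_out = render("a", "b")
none_out = render("a", "b", sep=None, end=None)
empty_out = render("a", "b", sep="", end="")

print(repr(default_out), repr(none_out), repr(empty_out))
```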
[Python-Dev] New project : Spyke python-to-C compiler
Hello,

Your project sounds very interesting! I would be happy to send you
some test code -- perhaps you could give me some more details about
what you're looking for. (E.g. *small,* self-contained python/numpy
functions.)

Also, are you aware of Shed Skin?

"Shed Skin is an experimental Python-to-C++ compiler. It can convert
pure, but implicitly statically typed Python programs into optimized
C++ code."

http://shed-skin.blogspot.com/

Best,

James

--
James Coughlan, Ph.D., Scientist
The Smith-Kettlewell Eye Research Institute
Email: [EMAIL PROTECTED]
URL: http://www.ski.org/Rehab/Coughlan_lab/
Phone: 415-345-2146
Fax: 415-345-8455
Re: [Python-Dev] [Distutils] how to easily consume just the parts of eggs that are good for you
On Tue, 2008-04-08 at 10:01 -0700, zooko wrote:
> They both agreed that it made perfect sense. I told one of them
> about the alternate proposal to define a new database file to contain
> a list of installed packages, and he sighed and rolled his eyes and
> said "So they are planning to reinvent apt!".

When I wear my sysadmin hat, eggs become a nuisance. They are not
listed in the system packages; if zipped they won't work when the
apache user tries to import them; easy_install can produce unexpected
upgrades. The system package manager (apt or yum) is much preferred.

As a developer, eggs are great. If a python module is not already
available from my system packagers, easy_install will find it, get it,
and install it. I waste almost no time with system administration
issues while developing.

Fortunately, distutils includes tools like bdist_rpm so that python
modules can be packaged for easy processing by the system package
manager. So once I need to switch back to a sysadmin role, I can use
the system tools to install and track packages.

--
Lloyd Kvam
Venix Corp
DLSLUG/GNHLUG library
http://www.librarything.com/catalog/dlslug
http://www.librarything.com/profile/dlslug
http://www.librarything.com/rsshtml/recent/dlslug
Re: [Python-Dev] [Distutils] how to easily consume just the parts of eggs that are good for you
On Wed, Apr 09, 2008 at 11:37:07AM +1000, Ben Finney wrote:
> zooko <[EMAIL PROTECTED]> writes:
> > I am skeptical that programmers are going to be willing to use a new
> > database format. They already have a database -- their filesystem --
> > and they already have the tools to control it -- mv, rm, and
> > PYTHONPATH. Many of them already hate the existence of the
> > "easy_install.pth" database file, and I don't see why a new database
> > file would be any different.

> Moreover, many of us already have a database of *all* packages on the
> system, not just Python-language ones: the package database of our
> operating system. Adding another, parallel, database which needs
> separate maintenance, and only applies to Python packages, is not a
> step forward in such a situation.

90 % (at least) of the world does not have such a database. I, and
probably you, have such a very nice database. It works well, and we can
choose to forget the problems our users are facing. It does not solve
them though.

In addition, packaging is system-specific. I recently had to learn some
Debian packaging, because I wanted my Ubuntu and Debian users to be able
to use my projects seamlessly. What about RPMs for RHEL, Fedora,
Mandriva? ... and coronary packages? and MSIs? ... When do I find time
to do development if I have to learn all this packaging?

It would be fantastic to have an abstraction on all these packaging
systems, including, as you point out, their database. I do agree that
reusing the system packaging's database is great, and would be the best
option for system-wide install. However one of the very neat features
of setuptools and eggs is that you don't need administrator access to
install the packages, and that is great in a shared environment, like a
computation cluster. The system's database is thus unfortunately not a
complete solution to the problem.

My 2 cents,

Gaël
Re: [Python-Dev] [Distutils] how to easily consume just the parts of eggs that are good for you
On Wed, Apr 09, 2008 at 12:41:32AM -0400, Phillip J. Eby wrote:
> >The way to achieve a database for Python would be to provide tools for
> >conversion of eggs to rpms and debs,

> Such tools already exist, although the conversion takes place from
> source distributions rather than egg distributions.

What is the status of the deb backend? The only one I know of is
unofficially maintained by Andrew Straw, but my information may be
lagging behind.

By the way, if these tools work well, they are priceless!

Cheers,

Gaël
Re: [Python-Dev] [Distutils] how to easily consume just the parts of eggs that are good for you
On Wed, April 9, 2008 12:41 am, Phillip J. Eby wrote: > At 10:49 PM 4/8/2008 -0400, Stanley A. Klein wrote: >>On Tue, April 8, 2008 9:37 pm, Ben Finney >><[EMAIL PROTECTED]> wrote: >> > Date: Wed, 09 Apr 2008 11:37:07 +1000 >> > From: Ben Finney <[EMAIL PROTECTED]> >> > Subject: Re: [Distutils] how to easily consume just the parts of eggs >> > thatare good for you >> > To: [EMAIL PROTECTED] >> > >> > >> > zooko <[EMAIL PROTECTED]> writes: >> >> eyes and said "So they are planning to reinvent apt!". >> > >> > That's pretty much my reaction, too. >> >>I have the same reaction. > > I'm curious. Have any of you actually read PEP 262 in any detail? I > have seen precious little discussion so far that doesn't appear to be > based on significant misunderstandings of either the purpose of > reviving the PEP, or the mechanics of its proposed implementation. I haven't read the PEP at all. I generally don't read PEP's. > > >>I have tried in the past to use easy_install, but have run into problems >>because there is no communication between easy_install and the rpm >>database, resulting in failure of easy_install to recognize that >>dependencies have already been installed using rpms. > > This problem doesn't exist with Python 2.5, unless you're using a > platform that willfully strips out the installation information that > Python 2.5 provides for these packages. > IIRC, I have had the problem with Python 2.5 on Fedora 7. Until recently, Fedora packagers did strip out the egg information included with Python packages they packaged. I left those files in when packaging myself using bdist_rpm. However, are you implying that the installation information for Python egg packages accesses and coordinates with the rpm database? I found myself having to go into the setup.py for the relevant package(s) and delete any statements regarding dependencies. 
Otherwise, IIRC, the packaging couldn't proceed because the Python packaging tool couldn't find the dependencies that had already been installed as rpms. After installation, Python managed to find the relevant files, but the packaging tool couldn't. > >>A database focused only on Python packages is highly inappropriate for >>Linux systems, violates the Linux standards, and creates problems because >>eggs are not coordinated with the operating system package manager. > > The revamp of PEP 262 is aimed at removing .egg files and directories > from the process, by allowing system packagers to tell Python what > files belong to them and should not be messed with. And conversely, > allowing systems and installation targets *without* package managers > to safely manage their Python installations. IMHO, the main system without a package manager is Windows. A reasonable way to deal with Windows would be to create a package manager for it that could be used by Python and anyone else who wanted to use it. The package manager could establish a file hierarchy similar to the Unix FHS and install files appropriately, except for what is needed to satisfy the Windows OS. That would probably go a long way to addressing the issues being discussed here. This is primarily a Windows problem, not a Python problem. > > >> The >>way to achieve a database for Python would be to provide tools for >>conversion of eggs to rpms and debs, > > Such tools already exist, although the conversion takes place from > source distributions rather than egg distributions. > You are talking here about bdist_rpm and not about a tool that would take a Python package distributed as an egg file and convert the egg to an rpm or a deb. Unfortunately, some Python packagers are beginning to limit their focus only to egg distribution. That creates a problem for users who have native operating system package management. 
> >>to have eggs support conformance to >>the LSB and FHS, > > Applying LSB and FHS to the innards of Python packages makes as much > sense as applying them to the contents of Java .jar files -- i.e., > none. If it's unchanging data that's part of a program or library, > then it's a program or library, just like static data declared in a C > program or library. Whether the file extension is .py, .so, or even > .png is irrelevant. The FHS defines places to put specific kinds of files, such as command scripts (/bin, /usr/bin, /sbin, or /usr/sbin), documentation (/usr/share/doc/package-name), and configuration files (/etc). There are several kinds of files identified and places defined to put them. Distribution by eggs has a tendency to scoop up all of those files and put them in /usr/lib/python/site-packages, regardless of where they belong. Having eggs support conformance to FHS would mean recognizing and tagging the relevant files. A tool for converting eggs to rpms or debs would essentially reformat the egg to rpm or deb and put files where they belong. Stan Klein ___ Python-Dev mailing list [email protected] htt
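As a rough illustration of what "recognizing and tagging the relevant files" could mean, here is a minimal sketch. The tag names, the `fhs_destination` helper and the default site-packages path are assumptions made up for this example; only the FHS directories themselves come from the message above. It is not how eggs, bdist_rpm or any existing converter work.

```python
import posixpath

# Hypothetical tags mapped to the FHS locations named in the message above.
FHS_DESTINATIONS = {
    "script": "/usr/bin",
    "doc": "/usr/share/doc/{package}",
    "config": "/etc",
}

# Assumption: untagged files are runtime code or data and stay with the package.
DEFAULT_DESTINATION = "/usr/lib/python/site-packages/{package}"


def fhs_destination(package, filename, tag=None):
    """Return the install path for one file of a package, given its tag."""
    template = FHS_DESTINATIONS.get(tag, DEFAULT_DESTINATION)
    directory = template.format(package=package)
    return posixpath.join(directory, posixpath.basename(filename))


print(fhs_destination("foo", "docs/README.txt", "doc"))
print(fhs_destination("foo", "bin/foo-tool", "script"))
print(fhs_destination("foo", "foo/core.py"))
```

A converter along these lines would still need the package author (or a heuristic) to supply the tags, which is the part the thread disagrees about.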
Re: [Python-Dev] Python Leopard DLL Hell
I have learned that this is a specific behavior of OS X. I have submitted a
formal bug report to Apple about the problem. It appears that this is
documented by Apple as acceptable:
http://developer.apple.com/documentation/DeveloperTools/Reference/MachOReference/Reference/reference.html#//apple_ref/c/func/dlopen

Whereas Linux will respect the fact that you gave it a specific shared library:
http://linux.die.net/man/3/dlopen

If I am provided a workaround by Apple I will post a Python patch. It is a
little scary that someone can circumvent my application by just setting an
environment variable.

-Brian Cole

On Tue, Apr 8, 2008 at 7:52 PM, Michael Torrie <[EMAIL PROTECTED]> wrote:
> Brian Cole wrote:
> > That appears to be working correctly at first glance. The argument to
> > dlopen is the correct shared library. Unfortunately, either python or
> > OS X is lying to me here. If I inspect the python process with OS X's
> > Activity Monitor and look at the "Open Files and Ports" tab, it shows
> > that the _foo.so shared library is actually the one located inside
> > $DYLD_LIBRARY_PATH.
> >
> > So this problem may not be python's, but I place it here as a first
> > shot (maybe I'm using the imp module incorrectly).
>
> Sounds like you're going to need to learn how to use dtrace. Then you
> can more closely monitor exactly what python and the loader are doing.
> dtrace is very complicated (borrowed from Solaris) but extremely
> powerful. Worth learning anyway, but sounds like it's probably the
> debugging tool you need.
>
> Another thing you can do is check through the python source code and see
> how the os x-specific code is handling this type of situation.
Re: [Python-Dev] [Distutils] how to easily consume just the parts of eggs that are good for you
All my development is done on Linux. I use Windows very minimally (such as
for tax preparation) and, unless forced to do so for specific circumstances
(such as submittal to grants.gov), do not expose Windows to the Internet. In
the future there may possibly arise a need for us to port some
Linux-developed Python code to Windows, but we will have to cross that
bridge when we get there.

I think you raise an interesting issue: What is a package manager?

I have minimal experience installing packages on Windows over the last 5-10
years, but in my experience a Windows package comes as an executable that,
when run, installs itself. Unless a third-party program monitors the
installation, uninstalling is a nasty chore, as is finding out what files
were installed or where they went.

The rpm and deb package managers (and their yum and other higher level
dependency managers) do a lot of things:

1. They install packages and maintain databases of what packages were
installed
2. They manage dependencies
3. They support clean uninstalling of packages
4. They can query packages, both installed (via their databases) and not
yet installed (e.g., as rpm or deb files), to determine attributes, such
as files they install, dependencies, and other information defined at
packaging time.
5. They build packages and (in some cases) can rebuild packages.
6. They can verify packages for integrity and security purposes.
7. They can download package files and maintain archives of installed
package files for use as local repositories.

There may be other functions, but the above is a top-of-the-head list.

I can say that I'm not terribly happy with Python packaging that is only
minimally compatible with rpm. I haven't used Ubuntu all that much. I do
like Ubuntu's packaging and package management, and I do know that there
are programs, such as alien, that can translate from rpm to deb formats.

I don't think I ever said that Windows is broken in the area of package
management.
My own experience is that the files of Windows programs tend to be put in a directory devoted to the program, rather than put in directories with other files having similar purposes. At one time, the default location in Windows for word processing files was even in a sub-directory of the word processing program. That changed to having a form of user home directory, but it didn't change much for the program files themselves. Unix/Linux puts the files in specific areas of the file system having functional commonality. One could almost say that the Windows default approach to structuring its filesystem avoids or minimizes the need for package management. I repeat that this issue mainly arises because Windows doesn't have the same kind of filesystem structure (and therefore the need for package management) that other systems have. I don't know what Windows add/remove programs function does, but all it might do is to run the executable to install packages and record the installation (as was previously done by third party programs) to facilitate clean removal. Unless you can perform more of the other functions I listed above, I doubt I would call add/remove a package manager. Stan Klein On Wed, April 9, 2008 1:23 pm, Paul Moore wrote: > On 09/04/2008, Stanley A. Klein <[EMAIL PROTECTED]> wrote: >> IMHO, the main system without a package manager is Windows. A >> reasonable >> way to deal with Windows would be to create a package manager for it >> that >> could be used by Python and anyone else who wanted to use it. The >> package >> manager could establish a file hierarchy similar to the Unix FHS and >> install files appropriately, except for what is needed to satisfy the >> Windows OS. That would probably go a long way to addressing the issues >> being discussed here. This is primarily a Windows problem, not a >> Python >> problem. > > Windows does have a package manager - the add/remove programs > application. 
It's extremely limited, and doesn't make any attempt at > doing dependency resolution, certainly - but that's a separate issue. > > I don't know if you use Windows (as in, develop programs using Python > on Windows). If you do, then I'd be interested in your views on > bdist_wininst and bdist_msi installers, and how they fit into the > setuptools/egg environment, particularly with regard to the package > manager you are proposing. If you don't use Windows, then I don't see > how you can usefully comment. > > Personally, as I've said before, I don't have a problem with a > Python-only package manager, as long as it replaces or integrates > bdist_wininst and bdist_msi. Having two package managers is far worse > than having none - and claiming that add/remove programs "isn't a > package manager" is just ignoring reality (if it isn't, then why do > bdist_wininst and bdist_msi exist?). > > Are the Linux users happy with having a Python package manager that > ignores RPM/apt? Why should Windows users be any happier? > > Sorry - I'm f
Re: [Python-Dev] [Distutils] how to easily consume just the parts of eggs that are good for you
On Wed, Apr 09, 2008 at 02:26:31PM -0400, Stanley A. Klein wrote:
> The rpm and deb package managers (and their yum and other higher level
> dependency managers) do a lot of things:

> 1. They install packages and maintain databases of what packages were
> installed
> 2. They manage dependencies
> 3. They support clean uninstalling of packages
> 4. They can query packages, both installed (via their databases) and not
> yet installed (e.g., as rpm or deb files), to determine attributes, such
> as files they install, dependencies, and other information defined at
> packaging time.
> 5. They build packages and (in some cases) can rebuild packages.
> 6. They can verify packages for integrity and security purposes.
> 7. They can download package files and maintain archives of installed
> package files for use as local repositories.

You are collapsing three different functionalities in one:

 * Dealing with repositories and downloading: yum/apt

 * Installing + uninstalling packages, and dealing with system
   consistency (thus checking the dependencies are available): rpm/dpkg

 * Building

For me it is important that the 3 are separated:

 * I may want to download the dependencies of a package to burn to a CD
   for a computer that does not have internet access.

 * I may want to send a tarball to a build server that does the building,
   but no install (so as not to corrupt my working system).

Cheers,

Gaël
Re: [Python-Dev] [Distutils] how to easily consume just the parts of eggs that are good for you
On Wed, April 9, 2008 3:19 pm, Gael Varoquaux wrote: > On Wed, Apr 09, 2008 at 02:26:31PM -0400, Stanley A. Klein wrote: >> The rpm and deb package managers (and their yum and other higher level >> dependency managers) do a lot of things: > >> 1. They install packages and maintain databases of what packages were >> installed >> 2. They manage dependencies >> 3. They support clean uninstalling of packages >> 4. They can query packages, both installed (via their databases) and >> not >> yet installed (e.g., as rpm or deb files), to determine attributes, such >> as files they install, dependencies, and other information defined at >> packaging time. >> 5. They build packages and (in some cases) can rebuild packages. >> 6. They can verify packages for integrity and security purposes. >> 7. They can download package files and maintain archives of installed >> package files for use as local repositories. > > You are collapsing three different functionalities in one: > > * Dealing with repositories and downloading: yum/apt > > * Installing + uninstalling packages, and dealing with system >consistency (thus checking the dependencies are available): rpm/dpkg > > * Building > > For me it is important that the 3 are separated: > > * I may want to download the dependencies of a package to burn to a CD > for a computer that does not have internet access. > > * I may want to send a tarball to a build server that does the building, > but no install (so as not to corrupt my working system). > > Cheers, > > Gaël > Gael - The functionalities are combined in programs but are not necessarily required to be used all at the same time. I'm not that familiar with apt, but yum also installs, including downloading both a package and its dependencies. Yum also has a query capability (yum list, yum info). I think synaptic does the same thing yum does, and adds a GUI and search capabilities similar to yum info as well. 
The build capabilities of rpm were moved to rpmbuild, but the building remains part of the rpm system. IIRC, bdist_rpm actually calls rpmbuild as part of its processing. Also, IIRC, rpmbuild can build from a tarball if it contains an rpm spec. It does not install in the same process. That is a separate step. You would not corrupt your working system by building an rpm from a tarball on it. BTW, I would not want to do dependencies with rpm if yum is available. Doing dependencies with rpm is very difficult and it is easy to wind up in "dependency hell". Yum will find the dependencies and install them as long as they are in repositories that are registered in the yum configuration. I looked at "man yum" and couldn't find an option to download dependencies to the local repository without installing. However, if you did install a package and its dependencies, and if you have selected the option of retaining the cache and not cleaning it after installation, the rpms (e.g., for updates) are in /var/cache/yum/updates/packages/. They can be copied from there to a CD for a system without internet connectivity. Also, both Fedora and Ubuntu have software for building installable live CD's, although I don't know how they get their package files. Stan Klein ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Distutils] how to easily consume just the parts of eggs that are good for you
On Wed, April 9, 2008 3:40 pm, Phillip J. Eby wrote: > At 11:52 AM 4/9/2008 -0400, Stanley A. Klein wrote: >>However, are you implying that the installation information for Python >> egg >>packages accesses and coordinates with the rpm database? > > Yes, when the information isn't stripped out. Try a more recent Fedora. > > >>IMHO, the main system without a package manager is Windows. > > You're ignoring shared environments and development > environments. (Not to mention Mac OS.) > I don't understand what you mean by "shared environments and development environments". I also don't know much about Mac OS, except that its underlying Darwin system is a version of BSD (that I assume would follow the Unix FHS). > >> A reasonable >>way to deal with Windows would be to create a package manager for it that >>could be used by Python and anyone else who wanted to use it. > > Let us know when you've finished it, along with the one for Mac OS. :) I have enough trouble with what I'm already doing. :-) > > Of course this still won't do anything for shared environments and > development environments. > > >>You are talking here about bdist_rpm and not about a tool that would take >>a Python package distributed as an egg file and convert the egg to an rpm >>or a deb. Unfortunately, some Python packagers are beginning to limit >>their focus only to egg distribution. That creates a problem for users >>who have native operating system package management. > > That is indeed a problem -- but it's a social one, not a technical > one. It's trivial for the publisher of an egg to change their > command line from "setup.py bdist_egg upload" to "setup.py sdist > bdist_egg upload", as soon as their users (politely) request that they do > so. > I agree that we are dealing with a combination of technical and social issues here. However, I think it takes a lot more understanding for a publisher to get everything straight. 
> >> > Applying LSB and FHS to the innards of Python packages makes as much >> > sense as applying them to the contents of Java .jar files -- i.e., >> > none. If it's unchanging data that's part of a program or library, >> > then it's a program or library, just like static data declared in a C >> > program or library. Whether the file extension is .py, .so, or even >> > .png is irrelevant. >> >>The FHS defines places to put specific kinds of files, such as command >>scripts (/bin, /usr/bin, /sbin, or /usr/sbin), documentation >>(/usr/share/doc/package-name), and configuration files (/etc). There are >>several kinds of files identified and places defined to put them. >>Distribution by eggs has a tendency to scoop up all of those files and >> put >>them in /usr/lib/python/site-packages, regardless of where they belong. > > Eggs don't include documentation or configuration files, and they > install scripts in script directories, so I don't get what you're > talking about here. For any other data that a package accesses at > runtime, my earlier comments apply. > But rpms and debs do include these files, plus manual pages, localization files and a lot of other ancillary stuff. IIRC, you once mentioned that you have a CENTOS system. Do an "rpm -qa |sort|less" to get an alphabetized list of your installed packages, and then an "rpm -qil" on some of the packages, and you will see the range of different kinds of files in there. > >>Having eggs support conformance to FHS would mean recognizing and tagging >>the relevant files. A tool for converting eggs to rpms or debs would >>essentially reformat the egg to rpm or deb and put files where they >>belong. > > No, because such files as you describe don't exist. If you think > they do, then either you have misunderstood the nature of the files > in question, or the developer has incorrectly placed non-runtime > files in their installation tree. 
> Most of the Python tarballs I have downloaded have all kinds of files in their installation trees. This is a major pain in the you-know-what for someone trying to use bdist_rpm and get proper, FHS-compliant rpms. If eggs are supposed to be strictly runtime files, I think very few developers actually understand that. Better yet, how do you define what should be included in an installation? It sounds like the egg concept doesn't include several kinds of files that rpm and deb would include in an installation. I think that may be an important issue here. Stan Klein ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Distutils] how to easily consume just the parts of eggs that are good for you
On Wed, Apr 09, 2008 at 11:46:19PM +0100, Paul Moore wrote:
> I find this whole discussion hugely confusing, because a lot of people
> are stating opinions about environments which it seems they don't use,
> or know much about. I don't know how to avoid this, but it does make
> it highly unlikely that any practical progress will get made.

One thing that does not help the discussion move forward at all is that
everybody has different use cases in mind, on different platforms, and is
not interested in other people's use cases.

Hopefully I am wrong,

Cheers,

Gaël
Re: [Python-Dev] [Distutils] how to easily consume just the parts of eggs that are good for you
On Wed, Apr 09, 2008 at 11:52:08PM +0100, Paul Moore wrote:
> And I would say that Windows doesn't have a problem. Are any Windows
> users proposing building a package management system for Windows
> (Python-specific or otherwise)? It's a genuine question - is this
> something that Windows users are after, or is it just Linux users
> trying to show Windows users what they are missing?

Well, users don't phrase it that way, because they don't know what
package management (or rather automatic dependency tracking) is, but yes,
there are some use cases. It is nowadays really tedious to deploy Python
applications that make use of many packages. The scientific community is
a domain in which this problem is crucial, as we are trying to ship
desktop applications, with many dependencies outside the standard
library, to non-computer-savvy people. Enthought is working on shipping a
Python distribution with some sort of package management for this purpose
(see http://code.enthought.com/enstaller/), and is finding that it is not
an easy problem.

Cheers,

Gael
[Python-Dev] PyArg_ParseTuple and Py_BuildValue question
Hello fellow pythonistas,
I'm currently writing a simple python SCTP module in C. So far it works
sending and receiving strings from it. The C sctp function sctp_sendmsg()
has been wrapped and my function looks like this:
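static PyObject *
sendMessage(PyObject *self, PyObject *args)
{
    /* buffer, connSock and ret are module-level variables defined elsewhere
       (ret is assumed to be an int). */
    const char *msg = "";

    /* "s" yields a NUL-terminated C string, so this only handles text. */
    if (!PyArg_ParseTuple(args, "s", &msg))
        return NULL;
    /* Copy via an explicit "%s" so msg is never used as a format string. */
    snprintf(buffer, 1025, "%s", msg);
    ret = sctp_sendmsg(connSock, (void *)buffer, (size_t)strlen(buffer), NULL, 0,
                       0x0300, 0, 0, 0, 0);
    /* Return the sctp_sendmsg() result to the caller as a Python int. */
    return Py_BuildValue("i", ret);
}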
I'm going to construct an SS7 packet in Python using struct.pack(). Here's
the question: how do I pass the packet I built in Python to my
module and back? I already asked this on comp.lang.python but have had
no responses so far. I hope someone can point me in the right direction.
Thanks in advance.
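To make the question concrete, here is a small sketch of what struct.pack() produces. The header layout is invented for illustration and is not a real SS7 format. The point is that the result is a byte string of known length that can contain NUL bytes, which is why the "s" format code (it rejects embedded NULs) and strlen() are a poor fit for it.

```python
import struct

# Hypothetical header layout, for illustration only (not a real SS7 format):
# 1-byte type, 1-byte flags, 2-byte payload length, in network byte order.
payload = b"hello"
packet = struct.pack("!BBH", 1, 0, len(payload)) + payload

print(repr(packet))       # the raw bytes that would be handed to the C module
print(len(packet))        # the length must travel with the data
print(b"\x00" in packet)  # NUL bytes appear inside the packet
```

On the C side, the "s#" format code of PyArg_ParseTuple and Py_BuildValue works with a pointer plus an explicit length, so it is the usual candidate for passing data like this in both directions; whether it fits the rest of the module is not something this sketch can confirm.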
---
Alvin Delagon
Re: [Python-Dev] [Distutils] how to easily consume just the parts of eggs that are good for you
On Wed, 2008-04-09 at 18:17 -0500, Dave Peterson wrote:
> I think I can sum up any further points by simply asking: "Should it
> be safe to assume I can distribute my application via eggs /
> easy_install just because it is written in Python?"

I think that based on this discussion the bottom line answer to this
question is "No".

Stan Klein
[Python-Dev] Patch submitted for xmlrpclib
I submitted a patch a few days ago (http://bugs.python.org/issue2623) to
fix a datetime parameter formatting issue (see the issue for details). I was
wondering if this was adequate and whether it could be included in future
releases.

Thank you.

--
Leonard Clark
[Python-Dev] SetType=set in types module ?
Hi,

SetType is not available in the "types" module, so wouldn't it be
needed there (in 2.6, for example)?

I guess the change is really simple and would be backward compatible:
adding

SetType = set
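A minimal sketch of how the proposed alias would be used. The name is defined locally here because the "types" module does not provide it; `is_set` is a hypothetical helper for the example.

```python
# The proposed one-line alias, defined locally instead of inside types.
SetType = set


def is_set(obj):
    """Type check in the same style as the other names in types."""
    return isinstance(obj, SetType)


print(is_set(set([1, 2])))
print(is_set(frozenset([1, 2])))
```

Because the alias is just another name for the built-in, `isinstance(obj, set)` gives the same answer without any change to the module.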
Re: [Python-Dev] Global Python Sprint Weekends: May 10th-11th and June 21st-22nd.
Anyone in Melbourne, Australia keen for the first sprint? I'm not sure if I'll be available, but if I can it'd be great to work with some others. Failing that, it's red bull and pizza in my lounge room :) I've been working on some neat code for an AST optimizer. If I'm free that weekend, I'll probably continue my work on that. Cheers, T Trent Nelson wrote: Following on from the success of previous sprint/bugfix weekends and sprinting efforts at PyCon 2008, I'd like to propose the next two Global Python Sprint Weekends take place on the following dates: * May 10th-11th (four days after 2.6a3 and 3.0a5 are released) * June 21st-22nd (~week before 2.6b2 and 3.0b2 are released) It seems there are a few of the Python User Groups keen on meeting up in person and sprinting collaboratively, akin to PyCon, which I highly recommend. I'd like to nominate Saturday across the board as the day for PUGs to meet up in person, with Sunday geared more towards an online collaboration day via IRC, where we can take care of all the little things that got in our way of coding on Saturday (like finalising/preparing/reviewing patches, updating tracker and documentation, writing tests ;-). For User Groups that are planning on meeting up to collaborate, please reply to this thread on [email protected] and let every- one know your intentions! As is commonly the case, #python-dev on irc.freenode.net will be the place to be over the course of each sprint weekend; a large proportion of Python developers with commit access will be present, increasing the amount of eyes available to review and apply patches. For those that have an idea on areas they'd like to sprint on and want to look for other developers to rope in (or just to communicate plans in advance), please also feel free to jump on this thread via python-dev@ and indicate your intentions. 
For those that haven't the foggiest on what to work on, but would like to contribute, the bugs tracker at http://bugs.python.org is the best place to start. Register an account and start searching for issues that you'd be able to lend a hand with. All contributors that submit code patches or documentation updates will typically get listed in Misc/ACKS.txt; come September when the final release of 2.6 and 3.0 come about, you'll be able to point at the tarball or .msi and exclaim loudly ``I helped build that!'', and actually back it up with hard evidence ;-) Bring on the pizza and Red Bull! Trent. ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/krumms%40gmail.com ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] socket recv on win32 can be extremly delayed (python bug?)
hello,
I tried to implement a simple Python XMLRPC service in a win32
environment (client/server code inserted below).
The profiler on the client told me that a simple function call needs
about 200 ms (even if I run it in a loop, the time needed per call stays
the same).
After analysing the problem with Ethereal I found out that the XMLRPC
request is transmitted via two TCP packets: one containing the HTTP
header and one containing the data. But the acknowledgement of the first
TCP packet is delayed by 200 ms.
I experimented on the server side and found out that if the server reads
exactly all bytes transferred in the first TCP frame (via socket.recv()),
the next socket.recv(), even if reading only one byte, needs about 200
ms. But if I read one byte less than was transferred in the first TCP
frame and then read 2 bytes (socket.recv(2)), there is no delay, although
the same total amount of data was read.
After some googling I found the website
http://support.microsoft.com/?scid=kb%3Ben-us%3B823764&x=12&y=15, which
proposed a workaround (modifying the registry entry for the TCP/IP driver)
that did work. But modifying the clients' registry settings is not an
option for us.
Is there anybody who knows how to solve the problem? Or is it even a
problem of the Python socket implementation?
By the way: I tested Win2000 SP4 and WinXP SP2, each with Python 2.3.3
and Python 2.5.1.
CLIENT:
--
import xmlrpclib
import profile
server = xmlrpclib.ServerProxy("http://server:80")
profile.run('server.test(1,2)')
SERVER:
--
import SimpleXMLRPCServer
def test(a,b): return a+b
server = SimpleXMLRPCServer.SimpleXMLRPCServer( ('', 80) )
server.register_function(test)
server.serve_forever()
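The pattern described above (a small first segment followed by a wait of about 200 ms) resembles the well-known interaction between Nagle's algorithm and delayed ACKs. That diagnosis is an assumption and is not confirmed anywhere in this message. Under that assumption, a minimal sketch of the two usual client-side countermeasures, written against the plain socket module rather than xmlrpclib, could look like this. The request bytes and the `send_request` helper are placeholders for illustration.

```python
import socket


def send_request(sock, header, body):
    """Hand header and body to the TCP stack in a single write."""
    # Disable Nagle's algorithm so small writes are not held back.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    sock.sendall(header + body)


# Loopback demonstration with a throwaway listener on an OS-chosen port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

# Placeholder request, not a valid XML-RPC call.
header = b"POST /RPC2 HTTP/1.0\r\nContent-Length: 5\r\n\r\n"
body = b"hello"
send_request(client, header, body)
nodelay = client.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
client.close()

# Read until the client side has closed the connection.
received = b""
while True:
    chunk = server_side.recv(4096)
    if not chunk:
        break
    received += chunk
server_side.close()
listener.close()
```

Neither measure requires a registry change, but whether they remove the delay in the setup described here would have to be measured.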
--
Mit freundlichen Grüßen,
Best Regards,
Robert Hölzl
BALTECH AG
Firmensitz: Lilienthalstrasse 27, D-85399 Hallbergmoos
Registergericht: Amtsgericht München, HRB 115215
Vorstand: Jürgen Rösch (Vorsitzender), Martina M. Schuster
Aufsichtsratsvorsitzende: Eva Zeising
[Python-Dev] Security Advisory for unicode repr() bug?
Re: [Python-Dev] Known doctest bug with unicode?
> So there's a difference in behaviour between 2.x and 3.0 when it comes to
> this part. I guess the better behaviour would be for doctest to honour the
> encoding specified in the file/module? If other people agree I can see what
> I can do to make that work.

I'm fairly skeptical that you can make that work, whether or not it's a
good idea.

Regards,
Martin
Re: [Python-Dev] Encoding detection in the standard library?
On 2008-04-22 18:33, Bill Janssen wrote:
> The 2002 paper "A language and character set determination method
> based on N-gram statistics" by Izumi Suzuki and Yoshiki Mikami and
> Ario Ohsato and Yoshihide Chubachi seems to me a pretty good way to
> go about this.

Thanks for the reference.

Looks like the existing research on this just hasn't made it into the
mainstream yet.

Here's their current project:
http://www.language-observatory.org/
Looks like they are focusing more on language detection.

Another interesting paper using n-grams:
"Language Identification in Web Pages" by Bruno Martins and Mário J. Silva
http://xldb.fc.ul.pt/data/Publications_attach/ngram-article.pdf

And one using compression:
"Text Categorization Using Compression Models" by Eibe Frank, Chang Chui,
Ian H. Witten
http://portal.acm.org/citation.cfm?id=789742

> They're looking at "LSE"s, language-script-encoding triples; a
> "script" is a way of using a particular character set to write in a
> particular language.
>
> Their system has these requirements:
>
> R1. the response must be either "correct answer" or "unable to
>     detect" where "unable to detect" includes "other than registered"
>     [the registered set of LSEs];
> R2. Applicable to multi-LSE texts;
> R3. never accept a wrong answer, even when the program does not have
>     enough data on an LSE; and
> R4. applicable to any LSE text.
>
> So, no wrong answers.
>
> The biggest disadvantage would seem to be that the registration data
> for a particular LSE is kind of bulky; on the order of 10,000
> shift-codons, each of three bytes, about 30K uncompressed.
>
> http://portal.acm.org/ft_gateway.cfm?id=772759&type=pdf

For a server based application that doesn't sound too large.

Unless you're using a very broad scope, I don't think that you'd need
more than a few hundred LSEs for a typical application - nothing you'd
want to put in the Python stdlib, though.

> Bill
>
>>> IMHO, more research has to be done into this area before a
>>> "standard" module can be added to the Python's stdlib... and who
>>> knows, perhaps we're lucky and by the time everyone is using UTF-8
>>> anyway :-)
>>
>> I walked over to our computational linguistics group and asked. This
>> is often combined with language guessing (which uses a similar
>> approach, but using characters instead of bytes), and apparently can
>> usually be done with high confidence. Of course, they're usually
>> looking at clean texts, not random "stuff". I'll see if I can get
>> some references and report back -- most of the research on this was
>> done in the 90's.
>>
>> Bill

--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Apr 22 2008)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free !
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Encoding detection in the standard library?
> > When a web browser POSTs data, there is no standard way of communicating
> > which encoding it's using.
>
> That's just not true. Web browser should and do use the encoding of the
> web page that originally contained the form.

Since the site that receives the POST doesn't necessarily have access to
the Web page that originally contained the form, that's not really
helpful. However, POSTs can use the MIME type "multipart/form-data" for
non-Latin-1 content, and should. That contains facilities for indicating
the encoding and other things as well.

Bill
Re: [Python-Dev] PyArg_ParseTuple and Py_BuildValue question
On Wed, Apr 9, 2008 at 8:23 PM, Alvin Delagon <[EMAIL PROTECTED]> wrote:
> I'm going to construct an SS7 packet in python using struct.pack(). Here's
> the question, how am I going to pass the packet I wrote in python to my
> module and back? I already asked this question in comp.lang.python but so
> far no responses yet. I hope anyone can point me to the right direction.
> Thanks in advance.

What exactly is your problem?

--
Cheers,
Benjamin Peterson
Re: [Python-Dev] SetType=set in types module ?
On Wed, Apr 16, 2008 at 8:08 AM, iks hefem <[EMAIL PROTECTED]> wrote:
> Hi,
>
> the SetType is not available in the "types" module, so wouldn't it be
> needed here ? (in 2.6 by example)

Nothing new is currently being added to the types module because we
are trying to decide whether to remove it or not. Please file a bug
report, though, to remind us if we decide to keep it.

--
Cheers,
Benjamin Peterson
Re: [Python-Dev] Encoding detection in the standard library?
> Unless you're using a very broad scope, I don't think that
> you'd need more than a few hundred LSEs for a typical
> application - nothing you'd want to put in the Python stdlib,
> though.

I tend to agree with this (and I'm generally in favor of putting
everything in the standard library!). For those of us doing
document-processing applications (Martin, it's not just about Web
browsers), this would be a very useful package to have up on PyPI.

Bill
Re: [Python-Dev] Encoding detection in the standard library?
On 22-Apr-08, at 2:16 PM, Martin v. Löwis wrote:
>> Any program that needs to examine the contents of
>> documents/feeds/whatever on the web needs to deal with
>> incorrectly-specified encodings
>
> That's not true. Most programs that need to examine the contents of
> a web page don't need to guess the encoding. In most such programs,
> the encoding can be hard-coded if the declared encoding is not
> correct. Most such programs *know* what page they are webscraping,
> or else they couldn't extract the information out of it that they
> want to get at.

I certainly agree that if the target set of documents is small enough
it is possible to hand-code the encoding. There are many
applications, however, that need to examine the content of an
arbitrary, or at least non-small set of web documents. To name a few
such applications:

- web search engines
- translation software
- document/bookmark management systems
- other kinds of document analysis (market research, seo, etc.)

> As for feeds - can you give examples of incorrectly encoded one
> (I don't ever use feeds, so I honestly don't know whether they
> are typically encoded incorrectly. I've heard they are often XML,
> in which case I strongly doubt they are incorrectly encoded)

I also don't have much experience with feeds. My statement is based
on the fact that chardet, the tool that has been cited most in this
thread, was written specifically for use with the author's feed
parsing package.

> As for "whatever" - can you give specific examples?

Not that I can substantiate. Documents & feeds covers a lot of what
is on the web--I was only trying to make the point that on the web,
whenever an encoding can be specified, it will be specified
incorrectly for a significant chunk of exemplars.

>> (which, sadly, is rather common). The set of programs that need
>> this functionality is probably the same set that needs
>> BeautifulSoup--I think that set is larger than just browsers
>
> Again, can you give *specific* examples that are not web browsers?
> Programs needing BeautifulSoup may still not need encoding guessing,
> since they still might be able to hard-code the encoding of the web
> page they want to process.

Indeed, if it is only one site it is pretty easy to work around. My
main use of python is processing and analyzing hundreds of millions of
web documents, so it is pretty easy to see applications (which I have
listed above). I think that libraries like Mark Pilgrim's FeedParser
and BeautifulSoup are possible consumers of guessing as well.

> In any case, I'm very skeptical that a general "guess encoding"
> module would do a meaningful thing when applied to incorrectly
> encoded HTML pages.

Well, it does. I wish I could easily provide data on how often it is
necessary over the whole web, but that would be somewhat difficult to
generate. I can say that it is much more important to be able to
parse all the different kinds of encoding _specification_ on the web
(Content-Type/Content-Encoding/malformed cases of these).

I can also think of good arguments for excluding encoding detection
for maintenance reasons: is every case of the algorithm guessing wrong
a bug that needs to be fixed in the stdlib? That is an unbounded
commitment.

-Mike
Re: [Python-Dev] socket recv on win32 can be extremly delayed (python bug?)
Hi,
This is not a python-specific problem. See
http://en.wikipedia.org/wiki/Nagle's_algorithm
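Delays of this kind are commonly attributed to Nagle's algorithm on the sending side interacting with delayed ACKs on the receiving side. A minimal sketch of the usual workaround follows, assuming you control the socket that sends the request; the helper name make_nodelay_socket is hypothetical, and how to hook this into xmlrpclib's transport is not shown here.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled.

    With TCP_NODELAY set, small writes are sent immediately instead of
    being held back until the previous segment has been acknowledged.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# Read the option back to confirm that it was applied.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

Sending the HTTP header and body in a single send call would be another way to avoid the two-packet pattern described in the quoted message below.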
-Mike
On 17-Apr-08, at 3:08 AM, Robert Hölzl wrote:
hello,
I tried to implement a simple python XMLRPC service on a win32
environment (client/server code inserted below).
The profiler of the client told me, that a simple function call
needs about 200ms (even if I run it in a loop, the time needed per
call stays the same).
After analysing the problem with Ethereal I found out that the
XMLRPC request is transmitted via two TCP packets: one containing
the HTTP header and one containing the data. But the acknowledgement
of the first TCP packet is delayed by 200ms.
I experimented on the server side and found out that if the server
reads exactly all bytes transferred in the first TCP frame (via
socket.recv()), the next socket.recv(), even if reading only one
byte, needs about 200 ms. But if I read one byte less than was
transferred in the first TCP frame and then read 2 bytes
(socket.recv(2)) there is no delay, although the same total amount
of data was read.
After some googling I found the page
http://support.microsoft.com/?scid=kb%3Ben-us%3B823764&x=12&y=15
which proposed a workaround (modifying the registry entry for the
tcp/ip driver) that did work. But modifying the clients' registry
settings is no option for us.
Is there anybody who knows how to solve the problem? Or is it even a
problem of the python socket implementation?
By the way: I tested Win2000 SP4 and WinXP SP2 with Python 2.3.3 and
Python 2.5.1 each.
CLIENT:
--
import xmlrpclib
import profile
server = xmlrpclib.ServerProxy("http://server:80")
profile.run('server.test(1,2)')
SERVER:
--
import SimpleXMLRPCServer
def test(a,b): return a+b
server = SimpleXMLRPCServer.SimpleXMLRPCServer( ('', 80) )
server.register_function(test)
server.serve_forever()
--
Best Regards,
Robert Hölzl
BALTECH AG
Registered office: Lilienthalstrasse 27, D-85399 Hallbergmoos
Commercial register: Amtsgericht München, HRB 115215
Executive board: Jürgen Rösch (Chairman), Martina M. Schuster
Chair of the supervisory board: Eva Zeising
Re: [Python-Dev] python hangs when parsing a bad-formed email
Hello,
Alberto Casado Martín wrote:
> Hi all,
> First of all, sorry if this isn't the list where I have to post this.
> And sorry for my English.
>
> As the subject says, I'm having problems with the attached email. When
> I try to get an email object by reading the attached file, the python
> process hangs and uses all the CPU.
>
> I have debugged my code to find where it happens, and I found that it is
> the _parsegen method of the FeedParser class. I know that the email format
> is wrong but I don't know why python hangs.
>
> Below I paste the code showing where it hangs.
[snip]
> bash-3.00$ python
> Python 2.5.1 (r251:54863, Feb 28 2008, 07:48:25)
> [GCC 3.4.6] on sunos5
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import email
> >>> fp = open('raro.txt')
> >>> mail = email.message_from_file(fp)
> (never returns)
When you think you found a problem with python, please submit an issue
in the python issue tracker:
http://bugs.python.org/
In your case, I suspect some regular expression trying to match all
the empty lines of the message, one character at a time.
--
Amaury Forgeot d'Arc
[Python-Dev] GSoC student introduction and sandbox commit privileges request
Hi there,

I've just been accepted into this year's Google Summer of Code, to work for
the Python Software Foundation on 2to3. My project is to give 2to3 fixers
the ability to rank how confident they are on each fix, and let users choose
to intervene manually whenever that confidence level is below a certain
threshold. Among other things, this might allow fixers for situations where
the code translation is not always guaranteed to be correct (like % string
formatting, which came up recently in another thread). The full proposal is
at http://isnomore.net/2to3 .

Collin Winter will be my mentor, and I'd like to thank him and Christian
Heimes for all the help they gave me in designing the project. I'd also like
to thank Martin Löwis, for discussing a project with me which ended up not
turning into a proposal, but helped me write the 2to3 one.

Finally, I'd like to request commit privileges to work on a sandbox branch,
during the Summer of Code.

If you have any further questions, please feel free to contact me. I'm
really looking forward to working on this project!

Cheers,
rbp
--
Rodrigo Bernardo Pimentel <[EMAIL PROTECTED]> | GPG: <0x0DB14978>
Re: [Python-Dev] python hangs when parsing a bad-formed email
"Amaury Forgeot d'Arc" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
| When you think you found a problem with python, please submit an issue
| in the python issue tracker:
| http://bugs.python.org/

Or post to comp.lang.python / python mailing list /
gmane.comp.python.general
Re: [Python-Dev] Encoding detection in the standard library?
> Yup, but DrProject (the target application) also serves as a relay and
> archive for email. We have no control over the agent used for
> composition, and AFAIK there's no standard way to include encoding
> information.

Greg,

Internet-compliant email actually has well-specified mechanisms for
including encoding information; see RFCs 2047 and 2231. There's no
need to guess; you can just look.

Bill
Re: [Python-Dev] GSoC student introduction and sandbox commit privileges request
On Tue, Apr 22, 2008 at 4:35 PM, Rodrigo Bernardo Pimentel
<[EMAIL PROTECTED]> wrote:
> Hi there,
>
> I've just been accepted into this year's Google Summer of Code, to work for
> the Python Software Foundation on 2to3. My project is to give 2to3 fixers
> the ability to rank how confident they are on each fix, and let users choose
> to intervene manually whenever that confidence level is below a certain
> threshold. Among other things, this might allow fixers for situations where
> the code translation is not always guaranteed to be correct (like % string
> formatting, which came up recently in another thread). The full proposal is
> at http://isnomore.net/2to3 .
>
> Collin Winter will be my mentor, and I'd like to thank him and Christian
> Heimes for all the help they gave me in designing the project. I'd also like
> to thank Martin Löwis, for discussing a project with me which ended up not
> turning into a proposal, but helped me write the 2to3 one.
>
> Finally, I'd like to request commit privileges to work on a sandbox branch,
> during the Summer of Code.

Isn't this a chance for bzr to shine? With lib2to3 in the 3.0 bzr
branch, can't Rodrigo and the other students who don't have some funky
requirement just use bzr?

> If you have any further questions, please feel free to contact me. I'm
> really looking forward to working on this project!

Thanks for contributing!

-Brett
Re: [Python-Dev] GSoC student introduction and sandbox commit privileges request
On Tue, Apr 22 2008 at 09:02:49PM BRT, Brett Cannon <[EMAIL PROTECTED]> wrote:
> On Tue, Apr 22, 2008 at 4:35 PM, Rodrigo Bernardo Pimentel
> <[EMAIL PROTECTED]> wrote:
> > I've just been accepted into this year's Google Summer of Code
(...)
> > Finally, I'd like to request commit privileges to work on a sandbox
> > branch, during the Summer of Code.
>
> Isn't this a chance for bzr to shine? With lib2to3 in the 3.0 bzr
> branch, can't Rodrigo and the other students who don't have some funky
> requirement just use bzr?

FWIW, +1 from me, I'm perfectly comfortable with bzr.

> > If you have any further questions, please feel free to contact me. I'm
> > really looking forward to working on this project!
>
> Thanks for contributing!

My pleasure :)

Cheers,
rbp
--
Rodrigo Bernardo Pimentel <[EMAIL PROTECTED]> | GPG: <0x0DB14978>
Re: [Python-Dev] SetType=set in types module ?
Benjamin Peterson wrote:
> On Wed, Apr 16, 2008 at 8:08 AM, iks hefem <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> the SetType is not available in the "types" module, so wouldn't it be
>> needed here ? (in 2.6 by example)
>
> Nothing new is currently being added to the types module because we
> are trying to decide whether to remove it or not. Please file a bug
> report, though, to remind us if we decide to keep it.

Eventually the types module will go away or at least be stripped down
in Python 3.0. New types like the set type were deliberately not added
to types. Please don't file a bug report.

Christian
Re: [Python-Dev] PyArg_ParseTuple and Py_BuildValue question
Alvin Delagon wrote:
> I'm going to construct an SS7 packet in python using struct.pack(). Here's
> the question, how am I going to pass the packet I wrote in python to my
> module and back? I already asked this question in comp.lang.python but so
> far no responses yet. I hope anyone can point me to the right direction.
> Thanks in advance.

The Python developer list is meant for the development OF Python, not
WITH Python. Please use the general Python user list to get help.

Christian
[Python-Dev] GSoC Student Introduction
Hello,

My name is Nick Edds. I am going to be working on the 2to3 tool with Collin
Winter as my mentor. More specifically, I will be working on improving the
performance of the 2to3 tool in general, and its use of patterns in
particular.

I would like to request commit privileges to work in a sandbox branch and
although I don't have any familiarity with bzr, I would be comfortable using
it.

Regards,
Nick
Re: [Python-Dev] GSoC student introduction and sandbox commit privileges request
On Tue, Apr 22, 2008 at 5:18 PM, Rodrigo Bernardo Pimentel
<[EMAIL PROTECTED]> wrote:
> On Tue, Apr 22 2008 at 09:02:49PM BRT, Brett Cannon <[EMAIL PROTECTED]> wrote:
> > On Tue, Apr 22, 2008 at 4:35 PM, Rodrigo Bernardo Pimentel
> > <[EMAIL PROTECTED]> wrote:
> > > I've just been accepted into this year's Google Summer of Code
> (...)
> > > Finally, I'd like to request commit privileges to work on a sandbox
> > > branch, during the Summer of Code.
> >
> > Isn't this a chance for bzr to shine? With lib2to3 in the 3.0 bzr
> > branch, can't Rodrigo and the other students who don't have some funky
> > requirement just use bzr?
>
> FWIW, +1 from me, I'm perfectly comfortable with bzr.

Fine by me; I don't care one way or the other.

Collin

> > > If you have any further questions, please feel free to contact me. I'm
> > > really looking forward to working on this project!
> >
> > Thanks for contributing!
>
> My pleasure :)
>
> Cheers,
> rbp
> --
> Rodrigo Bernardo Pimentel <[EMAIL PROTECTED]> | GPG: <0x0DB14978>
Re: [Python-Dev] GSoC Student Introduction
On Tue, Apr 22, 2008 at 7:42 PM, Nick Edds <[EMAIL PROTECTED]> wrote:
> Hello,
>
> My name is Nick Edds. I am going to be working on the 2to3 tool with Collin
> Winter as my mentor. More specifically, I will be working on improving the
> performance of the 2to3 tool in general, and its use of patterns in
> particular.
> I would like to request commit privileges to work in a sandbox branch and
> although I don't have any familiarity with bzr, I would be comfortable using
> it.

Luckily, Bazaar is really easy.

Thanks for contributing!

--
Cheers,
Benjamin Peterson
Re: [Python-Dev] BSDDB3
Trent Nelson wrote:
| I remember those rampant BSDDB crashes on Windows well.
[...]
| basically, the tests weren't cleaning up their environment in
| the right order, so BSDDB was getting passed completely and
| utterly bogus values.

Next week I will (if nothing goes wrong) publish pybsddb 4.6.4. This
release supports distributed transactions and replication, the
testsuite is way faster and has been rewritten to be able to launch
tests from multiple threads/processes if you wish, and there is
setuptools/pypi support, etc.

I think this release would be appropriate to integrate in Python. I
think most demands are solved and the new features are interesting
(replication, distributed transactions, do not crash when closing
objects in the wrong order...).

Also, I completed the documentation, with the full supported API, and
ported it to the Python 2.6 documentation system. The result:

http://www.jcea.es/programacion/pybsddb.htm#bsddb3-4.6.4
http://www.jcea.es/programacion/pybsddb_doc/preview/

I'm very interested in integrating this release in Python 2.6 for the
new features, the full documentation, and to get feedback from Buildbot
and the python-dev community. Also, I would like to avoid integrating
pybsddb late in the Python 2.6 release cycle; I hope to be away from my
computer in August! :).

I'm a bit nervous about syncing, because I have the feeling that
python-dev is committing changes to Python's private branch of pybsddb.
I would prefer that patches be sent to me, and that "canonical" pybsddb
releases be integrated in Python frequently.

Somebody suggested posting patches in the tracker, but I think this is
not going to work. The diff between the current python bsddb and the
official version is so huge that nobody could follow it. A more
sensible approach, I think, is to "diff" current python pybsddb against
the version I used as my root (January?), integrate the changes in
current "canonical" pybsddb and, then, drop the entire updated package
into python. Then, commits to python pybsddb should be avoided; patches
should be sent to me. I think this is the only way when integrating a
project outside python SVN.

Suggestions?

PS: I can't comment on Win64. It is an alien world to me :).

--
Jesus Cea Avion
[EMAIL PROTECTED] - http://www.jcea.es/
jabber / xmpp:[EMAIL PROTECTED]
"Things are not so easy"
"My name is Dump, Core Dump"
"Love is placing your happiness in the happiness of another" - Leibniz
Re: [Python-Dev] GSoC Student Introduction
On Tue, Apr 22, 2008 at 7:38 PM, Benjamin Peterson
<[EMAIL PROTECTED]> wrote:
> On Tue, Apr 22, 2008 at 7:42 PM, Nick Edds <[EMAIL PROTECTED]> wrote:
> > Hello,
> >
> > My name is Nick Edds. I am going to be working on the 2to3 tool with Collin
> > Winter as my mentor. More specifically, I will be working on improving the
> > performance of the 2to3 tool in general, and its use of patterns in
> > particular.
> > I would like to request commit privileges to work in a sandbox branch and
> > although I don't have any familiarity with bzr, I would be comfortable
> > using it.
>
> Luckily, Bazaar is really easy.

See http://python.org/dev/bazaar/ for info. And if you have any other
issues feel free to ask, Nick.

-Brett
Re: [Python-Dev] Encoding detection in the standard library?
Bill Janssen writes:
> Internet-compliant email actually has well-specified mechanisms for
> including encoding information; see RFCs 2047 and 2231. There's no
> need to guess; you can just look.

You must be very special to get only compliant email.

About half my colleagues use RFC 2047 to encode Japanese file names in
MIME attachments (a MUST NOT behavior according to RFC 2047), and a
significant fraction of the rest end up with binary Shift JIS or EUC
or MacRoman in there.

And those are just the most widespread violations I can think of off
the top of my head.

Not to mention that I find this:

=?X-UNKNOWN?Q?Martin_v=2E_L=F6wis?= <[EMAIL PROTECTED]>,

in the header I got from you. (I'm not ragging on you, I get Martin's
name wrong a significant portion of the time myself. :-( )
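The X-UNKNOWN header quoted above can be used to illustrate the point. The following is a minimal sketch using the standard library's email.header.decode_header; the helper name decode_mime_words and the Latin-1 fallback are assumptions made for illustration, not something proposed in the thread. The encoded word parses fine, but its charset label names no real codec, so the program still has to guess.

```python
import codecs
from email.header import decode_header


def decode_mime_words(raw_header, fallback="latin-1"):
    """Decode RFC 2047 encoded words, guessing when the label is unusable.

    `fallback` is an arbitrary choice for this sketch: it is the guess
    applied whenever the declared charset is not a codec Python knows.
    """
    parts = []
    for chunk, charset in decode_header(raw_header):
        if isinstance(chunk, str):
            # Headers without any encoded word come back as plain str.
            parts.append(chunk)
            continue
        try:
            codecs.lookup(charset or "ascii")
        except LookupError:
            charset = fallback  # the declared label tells us nothing
        parts.append(chunk.decode(charset or "ascii", errors="replace"))
    return "".join(parts)


header = "=?X-UNKNOWN?Q?Martin_v=2E_L=F6wis?="
raw_bytes, declared = decode_header(header)[0]
name = decode_mime_words(header)
```

Here `declared` is the string 'x-unknown', and the readable name only appears because the fallback happens to be the encoding the sender actually used.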
Re: [Python-Dev] Encoding detection in the standard library?
"Martin v. Löwis" writes:
> In any case, I'm very skeptical that a general "guess encoding"
> module would do a meaningful thing when applied to incorrectly
> encoded HTML pages.

That depends on whether you can get meaningful information about the
language from the fact that you're looking at the page. In the
browser context, for one, 99.44% of users are monolingual, so you only
have to distinguish among the encodings for their language.

In this context a two stage process of determining a category of
encoding (eg, ISO 8859, ISO 2022 7-bit, ISO 2022 8-bit multibyte,
UTF-8, etc), and then picking an encoding from the category according
to a user-specified configuration has served Emacs/MULE users very
well for about 20 years.

It does *not* work in a context where multiple encodings from the same
category are in use (eg, the email folder of a Polish Gastarbeiter in
Berlin). Nonetheless it is pretty useful for user agents like mail
clients, web browsers, and editors.
Re: [Python-Dev] Encoding detection in the standard library?
Guido van Rossum writes:
> To the contrary, an encoding-guessing module is often needed, and
> guessing can be done with a pretty high success rate. Other Unicode
> libraries (e.g. ICU) contain guessing modules. I suppose the API could
> return two values: the guessed encoding and a confidence indicator.
> Note that the locale settings might figure in the guess.

Not locale settings, but user configuration. A Bayesian detector
(CodeBayes? hi, Skip!) might be a good way to go for servers, while a
simple language preference might really up the probability for user
agents.
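To make the shape of such an API concrete, here is a toy sketch of a function returning the two values mentioned above, with a user-supplied preference list standing in for "user configuration". The name guess_encoding, the candidate list, and the confidence numbers are all invented for illustration; this is neither chardet nor ICU, and strict trial decoding is a far weaker technique than the statistical methods discussed in this thread.

```python
def guess_encoding(data, preferred=("ascii", "utf-8")):
    """Return a (encoding, confidence) pair for a byte string.

    The confidence values are hand-picked placeholders, not
    probabilities. Candidates in `preferred` are tried in order with
    strict decoding; the first one that succeeds wins.
    """
    for name in preferred:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name, 0.9
    # Latin-1 maps every byte to a character, so it never fails to
    # decode; success here carries no information, hence the low score.
    return "latin-1", 0.1


ascii_guess = guess_encoding(b"plain text")
utf8_guess = guess_encoding("L\u00f6wis".encode("utf-8"))
latin1_guess = guess_encoding("L\u00f6wis".encode("latin-1"))
```

A caller can then refuse to proceed below some confidence threshold instead of silently trusting the answer.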
Re: [Python-Dev] Encoding detection in the standard library?
> Yup, but DrProject (the target application) also serves as a relay and
> archive for email. We have no control over the agent used for
> composition, and AFAIK there's no standard way to include encoding
> information.

That's not at all the case. MIME defines that in full detail, since
1993.

Regards,
Martin
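For a message that does follow MIME, reading the declared encoding takes only a few lines with the standard email package. A minimal sketch, using made-up sample messages:

```python
from email import message_from_string

# A sample message that declares its charset in Content-Type.
labelled = message_from_string(
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="ISO-8859-1"\n'
    "\n"
    "body\n"
)
# A sample message with no Content-Type header at all.
unlabelled = message_from_string("Subject: no content type\n\nbody\n")

declared = labelled.get_content_charset()   # lower-cased charset name
missing = unlabelled.get_content_charset()  # None when nothing is declared
```

The second case, where nothing is declared, is where a relay or archive would have to fall back on some default or a guess.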
Re: [Python-Dev] Encoding detection in the standard library?
> I certainly agree that if the target set of documents is small enough it
> is possible to hand-code the encoding. There are many applications,
> however, that need to examine the content of an arbitrary, or at least
> non-small set of web documents. To name a few such applications:
>
> - web search engines
> - translation software

I'll question whether these are "many" programs. Web search engines
and translation software have many more challenges to master, and they
are fairly special-cased, so I would expect they need to find their
own answer to character set detection, anyway (see Bill Janssen's
answer on machine translation, also).

> - document/bookmark management systems
> - other kinds of document analysis (market research, seo, etc.)

Not sure what specifically you have in mind, however, I expect that
these also have their own challenges. For example, I would expect
that MS-Word documents are frequent. You don't need character set
detection there (Word is all Unicode), but you need an API to look
into the structure of .doc files.

> Not that I can substantiate. Documents & feeds covers a lot of what is
> on the web--I was only trying to make the point that on the web,
> whenever an encoding can be specified, it will be specified incorrectly
> for a significant chunk of exemplars.

I firmly believe this assumption is false. If the encoding comes out
of software (which it often does), it will be correct most of the
time. It's incorrect only if the content editor has to type it.

> Indeed, if it is only one site it is pretty easy to work around. My
> main use of python is processing and analyzing hundreds of millions of
> web documents, so it is pretty easy to see applications (which I have
> listed above).

Ok. What advantage would you (or somebody working on a similar project)
gain if chardet was part of the standard library? What if it was not
chardet, but some other algorithm?

> I can also think of good arguments for excluding encoding detection for
> maintenance reasons: is every case of the algorithm guessing wrong a bug
> that needs to be fixed in the stdlib? That is an unbounded commitment.

Indeed, that's what I meant with my initial remark. People will expect
that it works correctly - both with the consequence of unknowingly
proceeding with the incorrect response, and then complaining when they
find out that it did produce an incorrect answer.

For chardet specifically, my usual standard-library remark applies:
it can't become part of the standard library unless the original
author contributes it, anyway. I would then hope that he or a group
of people would volunteer to maintain it, with the threat of removing
it from the stdlib again if these volunteers go away and too many
problems show up.

Regards,
Martin
Re: [Python-Dev] Encoding detection in the standard library?
""Martin v. Löwis"" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] |> I certainly agree that if the target set of documents is small enough it | | Ok. What advantage would you (or somebody working on a similar project) | gain if chardet was part of the standard library? What if it was not | chardet, but some other algorithm? It seems to me that since there is not a 'correct' algorithm but only competing heuristics, encoding detection modules should be made available via PyPI and only be considered for stdlib after a best of breed emerges with community support. ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
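To illustrate why there is no single 'correct' answer here, the following
is a minimal sketch, assuming Python 3. It is not chardet and not anything
proposed in this thread: the helper name `plausible_encodings` and the
candidate list are made up for the example. It only shows that a plain
trial decode cannot single out one encoding, because several 8-bit
encodings accept the same bytes.

```python
def plausible_encodings(data, candidates=("utf-8", "latin-1", "cp1252", "koi8-r")):
    """Return the candidate encodings under which `data` decodes without error.

    Hypothetical helper for illustration only: a successful decode says the
    bytes are *valid* in that encoding, not that the text is *meant* to be
    in that encoding.
    """
    accepted = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        accepted.append(name)
    return accepted


if __name__ == "__main__":
    sample = "café".encode("latin-1")  # the bytes b'caf\xe9'
    # More than one encoding accepts these bytes, so something beyond a
    # trial decode (for example frequency statistics) has to pick a winner.
    print(plausible_encodings(sample))
```

Any scheme that goes further than this has to rank the surviving
candidates by some heuristic, which is where competing detectors differ.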
Re: [Python-Dev] socket recv on win32 can be extremly delayed (python bug?)
> Is there anybody who knows how to solve the problem?

If it's really the problem described in the MSKB article, the article
also suggests a solution: use non-blocking sockets.

Regards,
Martin
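For reference, a minimal sketch of the non-blocking approach, assuming
Python 3 and the standard `socket` and `select` modules. The helper name
`recv_with_timeout` is hypothetical, and this has not been checked against
the specific problem described in the MSKB article; it only shows the
general pattern of waiting with select() so that recv() itself never blocks.

```python
import select
import socket


def recv_with_timeout(sock, nbytes, timeout=1.0):
    """Return up to nbytes from sock, or None if nothing arrives in time.

    Hypothetical helper: switches the socket to non-blocking mode and
    waits for readability with select() instead of blocking in recv().
    """
    sock.setblocking(False)
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None  # nothing became readable before the timeout
    try:
        return sock.recv(nbytes)
    except BlockingIOError:
        # select() reported readiness but no data could be read after all;
        # treat this the same as a timeout.
        return None


if __name__ == "__main__":
    # A connected pair of sockets stands in for a real network peer.
    left, right = socket.socketpair()
    left.sendall(b"ping")
    print(recv_with_timeout(right, 16))
    left.close()
    right.close()
```

The caller decides what to do when None comes back (retry, give up, do
other work), which is the main practical difference from a blocking recv().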
Re: [Python-Dev] GSoC Student Introduction
> See http://python.org/dev/bazaar/ for info. And if you have any other
> issues feel free to ask, Nick.

I certainly can't speak for the respective mentors, but I feel that
"use bazaar" really isn't the right answer to "can I get commit access?"

One motivation for GSoC is also community bonding, and having the
mentor (but not *only* the mentor) comment on the proposed changes,
and monitor the progress of the project. That the development branch
sits on the student's laptop doesn't really help in that process.
Instead, the student would have to push the branch somewhere to
a web-visible location. Now I question whether it's the student's
obligation to find a server himself, or whether the mentoring org
should provide the infrastructure (or, failing that, Google (*)).

So I think an answer to the question above involving bazaar might
be "yes, but please don't commit to subversion, but only to the
bazaar repository".

Regards,
Martin

(*) FWIW, Google does provide the infrastructure; students are encouraged
(required?) to commit their work to code.google.com.
Re: [Python-Dev] BSDDB3
> I'm a bit nervous about syncing, because I have the feeling that
> python-dev is committing changes to python private branch of pybsddb. I
> would rather prefer patches sent to me and integrate "canonical" pybsddb
> releases in Python frequently.

My understanding was that the pybsddb copy in Python *is* the official
code base. I think it's unfortunate that the pybsddb project was
revived, even though it was already dead.

Perhaps we should remove bsddb from Python again, and refer people
to pybsddb instead?

> Somebody suggested to post patches in the tracker, but I think this is
> not going to work. The diff from current python bsddb and the official
> version is so huge that nobody could follow it. A more sensible
> approach, I think, is to "diff" current python pybsddb against the
> version I used as my root (January?), integrate the changes in current
> "canonical" pybsddb and, then, drop the entire updated package into python.

-1. Who is going to do the 3k porting?

> I think this is the only way when integrating a project outside python
> SVN. Suggestions?

The usual solution is to not integrate then, at all. Python doesn't
really ship with any libraries that also have an active life outside
of Python.

Regards,
Martin
