Re: The RAISE_VARARGS opcode in Python 3
Arnaud Delobelle wrote:

> Here is an extract from the dis module doc [1]
>
> RAISE_VARARGS(argc)
> Raises an exception. argc indicates the number of parameters to the raise statement, ranging from 0 to 3. The handler will find the traceback as TOS2, the parameter as TOS1, and the exception as TOS.
>
> OTOH, looking at PEP 3109:
>
> In Python 3, the grammar for raise statements will change from [2]
>
>     raise_stmt: 'raise' [test [',' test [',' test]]]
>
> to
>
>     raise_stmt: 'raise' [test]
>
> So it seems that RAISE_VARARGS's argument can only be 0 or 1 in Python 3, whereas it could be up to 3 in Python 2.

It can be up to 2 in Python 3,

    >>> help("raise")
    The ``raise`` statement
    ***

       raise_stmt ::= "raise" [expression ["from" expression]]
    ...

confirmed by a quick look into ceval.c:

    TARGET(RAISE_VARARGS)
        v = w = NULL;
        switch (oparg) {
        case 2:
            v = POP(); /* cause */
        case 1:
            w = POP(); /* exc */
        case 0: /* Fallthrough */
            why = do_raise(w, v);
            break;
        default:
            PyErr_SetString(PyExc_SystemError,
                            "bad RAISE_VARARGS oparg");
            why = WHY_EXCEPTION;
            break;
        }
        break;

> Can anyone confirm that this is the case? In this case, I guess the dis docs need to be updated.

Indeed.

-- http://mail.python.org/mailman/listinfo/python-list
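The claim above can be checked from Python itself rather than by reading ceval.c. The following is only a sketch: the three small functions and the helper raise_opargs are made up for illustration, and opcode details are specific to CPython and may differ between releases.

```python
import dis

# Three hypothetical functions, one per form of the raise statement.
def bare():
    raise                              # no operands

def with_exc():
    raise ValueError("boom")           # exception only

def with_cause(err):
    raise ValueError("boom") from err  # exception plus cause

def raise_opargs(func):
    """Return the oparg of every RAISE_VARARGS instruction in func."""
    return [ins.arg for ins in dis.get_instructions(func)
            if ins.opname == "RAISE_VARARGS"]

for f in (bare, with_exc, with_cause):
    print(f.__name__, raise_opargs(f))
```

On the CPython 3 versions this was written against, the reported arguments never exceed 2, which agrees with the switch statement quoted from ceval.c.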
Re: The RAISE_VARARGS opcode in Python 3
On 27 August 2011 07:49, Peter Otten <__pete...@web.de> wrote:
> Arnaud Delobelle wrote:
> [...]
> It can be up to 2 in Python 3,
> [...]
> confirmed by a quick look into ceval.c:
> [...]

Thanks again, Peter! I'm out of Python practice, and I've forgotten some things like help(keyword). Also PEP 3109 does indeed mention the raise .. from .. syntax in an example at the end.

--
Arnaud
how to format long if conditions
Hi all,

I'm wondering what advice you have about formatting if statements with long conditions (I always format my code to 80 columns).

Here's an example taken from something I'm writing at the moment and how I've formatted it:

        if (isinstance(left, PyCompare) and isinstance(right, PyCompare)
                and left.complist[-1] is right.complist[0]):
            py_and = PyCompare(left.complist + right.complist[1:])
        else:
            py_and = PyBooleanAnd(left, right)

What would you do?

--
Arnaud
Re: how to format long if conditions
Arnaud Delobelle wrote:

> Hi all,
>
> I'm wondering what advice you have about formatting if statements with long conditions (I always format my code to 80 columns)
>
> Here's an example taken from something I'm writing at the moment and how I've formatted it:
>
>         if (isinstance(left, PyCompare) and isinstance(right, PyCompare)
>                 and left.complist[-1] is right.complist[0]):
>             py_and = PyCompare(left.complist + right.complist[1:])
>         else:
>             py_and = PyBooleanAnd(left, right)
>
> What would you do?

I believe that PEP 8 now suggests something like this:

    if (
            isinstance(left, PyCompare) and isinstance(right, PyCompare)
            and left.complist[-1] is right.complist[0]):
            )
        py_and = PyCompare(left.complist + right.complist[1:]
    else:
        py_and = PyBooleanAnd(left, right)

I consider that hideous and would prefer to write this:

    if (isinstance(left, PyCompare) and isinstance(right, PyCompare)
        and left.complist[-1] is right.complist[0]):
        py_and = PyCompare(left.complist + right.complist[1:]
    else:
        py_and = PyBooleanAnd(left, right)

Or even this:

    tmp = (
        isinstance(left, PyCompare) and isinstance(right, PyCompare)
        and left.complist[-1] is right.complist[0])
        )
    if tmp:
        py_and = PyCompare(left.complist + right.complist[1:]
    else:
        py_and = PyBooleanAnd(left, right)

But perhaps the best solution is to define a helper function:

    def is_next(left, right):
        """Returns True if right is the next PyCompare to left."""
        return (isinstance(left, PyCompare) and isinstance(right, PyCompare)
            and left.complist[-1] is right.complist[0])
        # PEP 8 version left as an exercise.

    # later...
    if is_next(left, right):
        py_and = PyCompare(left.complist + right.complist[1:]
    else:
        py_and = PyBooleanAnd(left, right)

--
Steven
Re: how to format long if conditions
On 27/08/11 09:08:20, Arnaud Delobelle wrote:

> I'm wondering what advice you have about formatting if statements with long conditions (I always format my code to <80 columns)
>
> Here's an example taken from something I'm writing at the moment and how I've formatted it:
>
>         if (isinstance(left, PyCompare) and isinstance(right, PyCompare)
>                 and left.complist[-1] is right.complist[0]):
>             py_and = PyCompare(left.complist + right.complist[1:])
>         else:
>             py_and = PyBooleanAnd(left, right)
>
> What would you do?

I would break after the '(' and indent the condition once and put the '):' bit on a separate line, aligned with the 'if':

        if (
            isinstance(left, PyCompare) and isinstance(right, PyCompare)
            and left.complist[-1] is right.complist[0]
        ):
            py_and = PyCompare(left.complist + right.complist[1:])
        else:
            py_and = PyBooleanAnd(left, right)

It may look ugly, but it's very clear where the condition part ends and the 'then' part begins.

-- HansM
Re: Python IDE/Eclipse
On Aug 26, 5:18 pm, Dave Boland <dbola...@fastmail.fm> wrote:

> I'm looking for a good IDE -- easy to setup, easy to use -- for Python. Any suggestions?
>
> I use Eclipse for other projects and have no problem with using it for Python, except that I can't get PyDev to install. It takes forever, then produces an error that makes no sense.
>
> An error occurred while installing the items
> session context was:(profile=PlatformProfile, phase=org.eclipse.equinox.internal.provisional.p2.engine.phases.Install, operand=null --> [R]org.eclipse.cvs 1.0.400.v201002111343, action=org.eclipse.equinox.internal.p2.touchpoint.eclipse.actions.InstallBundleAction).
> Cannot connect to keystore.
> This trust engine is read only.
> The artifact file for osgi.bundle,org.eclipse.cvs,1.0.400.v201002111343 was not found.
>
> Any suggestions on getting this to work?
>
> Thanks,
> Dave

I use Aptana Studio 3; it's pretty good and it's based on Eclipse.
Re: how to format long if conditions
Hans Mulder wrote:

[...]
> It may look ugly, but it's very clear where the condition part ends and the 'then' part begins.

Immediately after the colon, surely?

--
Steven
Re: Catch and name an exception in Python 2.5 +
On 27/08/11 05:45, Steven D'Aprano wrote:
> Thomas Jollans wrote:
>
>> On 26/08/11 21:56, Steven D'Aprano wrote:
>>> Is there any way to catch an exception and bind it to a name which will work across all Python versions from 2.5 onwards?
>>>
>>> I'm pretty sure there isn't, but I thought I'd ask just in case.
>>
>> It's not elegant, and I haven't actually tested this, but this should work:
>>
>> try:
>>     ...
>> except (ValueError, KeyError):
>>     error = sys.exc_info()[2]
>
> Great! Thanks for that, except I think you want to use [1], not [2].

Ah, yes. Of course.
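For reference, the corrected form of the trick might look like the sketch below. sys.exc_info() returns a (type, value, traceback) tuple, so index 1 is the exception instance; the function parse_int is a made-up example, not something from the thread.

```python
import sys

def parse_int(text):
    """Return int(text), or the exception instance if conversion fails."""
    try:
        return int(text)
    except (ValueError, TypeError):
        # Works on 2.5 through 3.x: no "except X, e" or "except X as e".
        error = sys.exc_info()[1]
        return error

print(repr(parse_int("42")))
print(repr(parse_int("forty-two")))
```

Index 2 would give the traceback object instead, which is the mix-up pointed out above.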
Re: how to format long if conditions
On 27 August 2011 08:24, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:
> Arnaud Delobelle wrote:
> [...]
> But perhaps the best solution is to define a helper function:
> [...]

Thanks Steven and Hans for your suggestions. For this particular instance I've decided to go for a hybrid approach:

* Add two methods to PyCompare:

    class PyCompare(PyExpr):
        ...
        def extends(self, other):
            if not isinstance(other, PyCompare):
                return False
            else:
                return self.complist[0] == other.complist[-1]
        def chain(self, other):
            return PyCompare(self.complist + other.complist[1:])

* Rewrite the if as:

    if isinstance(right, PyCompare) and right.extends(left):
        py_and = left.chain(right)
    else:
        py_and = PyBooleanAnd(left, right)

The additional benefit is to hide the implementation details of PyCompare (I suppose this could illustrate the thread on when to create functions).

--
Arnaud
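To try the hybrid approach in isolation, here is a self-contained sketch. PyExpr, PyBooleanAnd, the constructor signatures and the combine function are hypothetical stand-ins, since the real classes are not shown in the thread; only extends and chain follow the post.

```python
class PyExpr(object):
    """Hypothetical base class; the real one is not shown in the thread."""

class PyCompare(PyExpr):
    def __init__(self, complist):
        # Assumed layout: operands and operators alternate, e.g. [a, "<", b].
        self.complist = complist

    def extends(self, other):
        """True if this comparison starts where `other` ends."""
        if not isinstance(other, PyCompare):
            return False
        return self.complist[0] == other.complist[-1]

    def chain(self, other):
        """Merge two comparisons that share a middle operand."""
        return PyCompare(self.complist + other.complist[1:])

class PyBooleanAnd(PyExpr):
    def __init__(self, left, right):
        self.left, self.right = left, right

def combine(left, right):
    # The rewritten `if` from the post, wrapped in a function for testing.
    if isinstance(right, PyCompare) and right.extends(left):
        return left.chain(right)
    return PyBooleanAnd(left, right)

merged = combine(PyCompare(["a", "<", "b"]), PyCompare(["b", "<", "c"]))
print(merged.complist)
```

With this layout, "a < b" and "b < c" collapse into one chained comparison, while unrelated operands fall through to PyBooleanAnd.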
Re: how to format long if conditions
On 27/08/11 11:05:25, Steven D'Aprano wrote:
> Hans Mulder wrote:
>
> [...]
>> It may look ugly, but it's very clear where the condition part ends and the 'then' part begins.
>
> Immediately after the colon, surely?

On the next line, actually :-)

The point is that this layout makes it very clear that the colon isn't in its usual position (at the end of the line that starts with 'if') and it is clearly visible.

With the layout Arnaud originally proposed, finding the colon takes longer.

(Arnaud has since posted a better approach, in which the colon is back in its usual position.)

-- HansM
Re: Run time default arguments
On Thursday, August 25, 2011 1:54:35 PM UTC-7, ti...@thsu.org wrote:
> On Aug 25, 10:35 am, Arnaud Delobelle <arn...@gmail.com> wrote:
>> You're close to the usual idiom:
>>
>> def doSomething(debug=None):
>>     if debug is None:
>>         debug = defaults['debug']
>>     ...
>>
>> Note the use of 'is' rather than '=='
>> HTH
>
> Hmm, from what you are saying, it seems like there's no elegant way to handle run time defaults for function arguments, meaning that I should probably write a SQL-esque coalesce function to keep my code cleaner. I take it that most people who run into this situation do this?

I don't; it seems kind of superfluous when

    if arg is None: arg = whatever

is just as easy to type and more straightforward to read.

I could see a function like coalesce being helpful if you have a list of several options to check, though. Also, SQL doesn't give you a lot of flexibility, so coalesce is a lot more needed there.

But for simple arguments in Python, I'd recommend sticking with

    if arg is None: arg = whatever

Carl Banks
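Both options discussed in this thread fit in a few lines. In the sketch below the defaults dict, do_something and coalesce are illustrative names (coalesce mirrors SQL's COALESCE and is not a built-in).

```python
defaults = {"debug": False}   # hypothetical run-time configuration

def do_something(debug=None):
    # None is a sentinel meaning "look the default up now, at call time".
    if debug is None:
        debug = defaults["debug"]
    return debug

def coalesce(*values):
    """Return the first argument that is not None, or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None

print(do_something())        # uses the current default
defaults["debug"] = True
print(do_something())        # picks up the changed default
print(coalesce(None, 0, 5))  # 0 is kept because only None is skipped
```

Testing with `is None` rather than truthiness matters here: falsy values such as 0 or an empty string remain valid explicit arguments.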
Re: how to format long if conditions
Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> writes:

> I believe that PEP 8 now

Specifically the “Indentation” section contains::

    When using a hanging indent the following considerations should be applied; there should be no arguments on the first line and further indentation should be used to clearly distinguish itself as a continuation line.

> suggests something like this:
>
>     if (
>             isinstance(left, PyCompare) and isinstance(right, PyCompare)
>             and left.complist[-1] is right.complist[0]):
>             )
>         py_and = PyCompare(left.complist + right.complist[1:]
>     else:
>         py_and = PyBooleanAnd(left, right)

That gives a SyntaxError. I think you mean one of these possible PEP 8 compliant forms::

    if (
            isinstance(left, PyCompare) and isinstance(right, PyCompare)
            and left.complist[-1] is right.complist[0]):
        py_and = PyCompare(left.complist + right.complist[1:])
    else:
        py_and = PyBooleanAnd(left, right)

or maybe::

    if (
            isinstance(left, PyCompare) and isinstance(right, PyCompare)
            and left.complist[-1] is right.complist[0]
            ):
        py_and = PyCompare(left.complist + right.complist[1:])
    else:
        py_and = PyBooleanAnd(left, right)

> I consider that hideous

I think both of those (once modified to conform to both the Python syntax and the PEP 8 guidelines) look clear and readable.

I mildly prefer the first for being a little more elegant, but the second is slightly better for maintainability and reducing diff noise. Either one makes me happy.

> and would prefer to write this:
>
>     if (isinstance(left, PyCompare) and isinstance(right, PyCompare)
>         and left.complist[-1] is right.complist[0]):
>         py_and = PyCompare(left.complist + right.complist[1:]
>     else:
>         py_and = PyBooleanAnd(left, right)

That one keeps tripping me up because the indentation doesn't make clear where subordinate clauses begin and end. The (current) PEP 8 rules are much better for readability in my eyes.

Having said that, I'm only a recent convert to the current PEP 8 style for indentation of condition clauses. It took several heated arguments with colleagues before I was able to admit the superiority of clear indentation :-)

--
 \     “I am too firm in my consciousness of the marvelous to be ever |
  `\       fascinated by the mere supernatural …” —Joseph Conrad, _The |
_o__)                                                     Shadow-Line_ |
Ben Finney
[ANN] Oktest 0.9.0 released - a new-style testing library
Hi,

I released Oktest 0.9.0.
http://pypi.python.org/pypi/Oktest/
http://packages.python.org/Oktest/

Oktest is a new-style testing library for Python.
::

    from oktest import ok, NG
    ok (x) > 0                 # same as assert_(x > 0)
    ok (s) == 'foo'            # same as assertEqual(s, 'foo')
    ok (s) != 'foo'            # same as assertNotEqual(s, 'foo')
    ok (f).raises(ValueError)  # same as assertRaises(ValueError, f)
    ok (u'foo').is_a(unicode)  # same as assert_(isinstance(u'foo', unicode))
    NG (u'foo').is_a(int)      # same as assert_(not isinstance(u'foo', int))
    ok ('A.txt').is_file()     # same as assert_(os.path.isfile('A.txt'))
    NG ('A.txt').is_dir()      # same as assert_(not os.path.isdir('A.txt'))

See http://packages.python.org/Oktest/ for details.

NOTICE!! Oktest is a young project and specification may change in the future.


Main Enhancements
-----------------

* New '@test' decorator provided. It is simple but very powerful.
  Using @test decorator, you can write test description in free text instead of test method.
  ex::

    class FooTest(unittest.TestCase):

        def test_1_plus_1_should_be_2(self):  # not cool...
            self.assertEqual(2, 1+1)

        @test("1 + 1 should be 2")    # cool! easy to read & write!
        def _(self):
            self.assertEqual(2, 1+1)

* Fixture injection support by '@test' decorator.
  Arguments of test method are regarded as fixture names and they are injected by @test decorator automatically.
  Instance methods or global functions which name is 'provide_' are regarded as fixture provider (or builder) for fixture ''.
  ex::

    class SosTest(unittest.TestCase):

        ##
        ## fixture providers.
        ##
        def provide_member1(self):
            return {"name": "Haruhi"}

        def provide_member2(self):
            return {"name": "Kyon"}

        ##
        ## fixture releaser (optional)
        ##
        def release_member1(self, value):
            assert value == {"name": "Haruhi"}

        ##
        ## testcase which requires 'member1' and 'member2' fixtures.
        ##
        @test("validate member's names")
        def _(self, member1, member2):
            assert member1["name"] == "Haruhi"
            assert member2["name"] == "Kyon"

  Dependencies between fixtures are resolved automatically.
  ex::

    class BarTest(unittest.TestCase):

        ##
        ## for example:
        ## - Fixture 'a' depends on 'b' and 'c'.
        ## - Fixture 'c' depends on 'd'.
        ##
        def provide_a(b, c):  return b + c + ["A"]
        def provide_b():      return ["B"]
        def provide_c(d):     return d + ["C"]
        def provide_d():      return ["D"]

        ##
        ## Dependencies between fixtures are solved automatically.
        ##
        @test("dependency test")
        def _(self, a):
            assert a == ["B", "D", "C", "A"]

  If loop exists in dependency then @test reports error.

  If you want to integrate with other fixture library, see the following example::

    class MyFixtureManager(object):
        def __init__(self):
            self.values = { "x": 100, "y": 200 }
        def provide(self, name):
            return self.values[name]
        def release(self, name, value):
            pass

    oktest.fixture_manager = MyFixtureManager()


Other Enhancements and Changes
------------------------------

* Supports command-line interface to execute test scripts.
* Reporting style is changed.
* New assertion method ``ok(x).attr(name, value)`` to check attribute.
* New assertion method ``ok(x).length(n)``.
* New feature ``ok().should`` helps you to check boolean method.
* 'ok(str1) == str2' displays diff if text1 != text2, even when using with unittest module.
* Assertion ``raises()`` supports regular expression to check error message.
* Helper functions in oktest.dummy module are now available as decorator.
* 'AssertionObject.expected' is renamed to 'AssertionObject.boolean'.
* ``oktest.run()`` is changed to return number of failures and errors of tests.
* ``before_each()`` and ``after_each()`` are now non-supported.
* (Experimental) New function ``NOT()`` provided which is same as ``NG()``.
* (Experimental) ``skip()`` and ``@skip.when()`` are provided to skip tests::

See CHANGES.txt for details.
http://packages.python.org/Oktest/CHANGES.txt


Have a nice testing life!

--
regards,
makoto kuwata
typing question
Hello everyone,

This is probably a basic question with an obvious answer, but I don't quite get why the type(foo).__name__ works differently for some class instances and not for others. If I have an underived class, any instance of that class is simply of type instance. If I include an explicit base class, then its type __name__ is the name of the class.

$ python
Python 2.7.2 (default, Aug 26 2011, 22:35:24)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class MyClass:
...     pass
...
>>> foo = MyClass()
>>> type(foo)
<type 'instance'>
>>> type(foo).__name__
'instance'
>>> class MyClass1():
...     pass
...
>>> bar = MyClass1()
>>> type(bar)
<type 'instance'>
>>> type(bar).__name__
'instance'
>>> class MyClass2(object):
...     pass
...
>>> foobar = MyClass2()
>>> type(foobar)
<class '__main__.MyClass2'>
>>> type(foobar).__name__
'MyClass2'

I can't explain this behavior (since doesn't every class inherit from object by default? And if so, there should be no difference between any of my class definitions). I would prefer that every approach give me the name of the class (rather than the first 2 just returning 'instance'). Why is this not the case? Also, is there any way to access the name of the class type of foo or bar in the above example?

Thanks!
Jason

P.S. I'll note that my preferred behavior is how python3.2 actually operates

$ python3.2
Python 3.2.1 (default, Aug 26 2011, 23:20:19)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class MyClass:
...     pass
...
>>> foo = MyClass()
>>> type(foo).__name__
'MyClass'

--
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
Re: typing question
On Sat, Aug 27, 2011 at 11:42 PM, Jason Swails <jason.swa...@gmail.com> wrote:
> I can't explain this behavior (since doesn't every class inherit from object by default? And if so, there should be no difference between any of my class definitions).

That is true in Python 3, but not in Python 2. That's why your example works perfectly in version 3.2. Be explicit about deriving from object and your code should work fine in both versions.

Chris Angelico
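A short sketch of the advice above, using the class name from the question. For the second part of the question: on Python 2, an instance of an old-style class still exposes its class through the `__class__` attribute, so `foo.__class__.__name__` returns the class name there as well.

```python
class MyClass(object):   # explicit base: new-style on 2.x, redundant on 3.x
    pass

foo = MyClass()

# With an explicit object base both spellings agree on 2.x and 3.x.
print(type(foo).__name__)
print(foo.__class__.__name__)
```

Of the two spellings, `foo.__class__.__name__` is the one that also works for the `MyClass` and `MyClass1` definitions from the original session.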
UnicodeEncodeError -- 'character maps to undefined'
Hi there,

I'm attempting to print a dictionary entry of some twitter data to screen but every now and then I get the following error:

(<type 'exceptions.UnicodeEncodeError'>, UnicodeEncodeError('charmap', u'RT @ciaraluvsjb26: BIEBER FEVER \u2665', 32, 33, 'character maps to <undefined>'), <traceback object at 0x01B323C8>)

I have googled this but haven't really found any way to overcome the error. Any ideas?

J
Re: UnicodeEncodeError -- 'character maps to undefined'
J wrote:

> Hi there,
>
> I'm attempting to print a dictionary entry of some twitter data to screen but every now and then I get the following error:
>
> (<type 'exceptions.UnicodeEncodeError'>, UnicodeEncodeError('charmap', u'RT @ciaraluvsjb26: BIEBER FEVER \u2665', 32, 33, 'character maps to <undefined>'), <traceback object at 0x01B323C8>)

Showing the actual traceback will help far more than a raw exception tuple.

> I have googled this but haven't really found any way to overcome the error. Any ideas?

I can only try to guess what you are doing, since you haven't shown either any code or a traceback, but I can imagine that you're probably trying to encode a Unicode string into bytes, but using the wrong encoding.

I can almost replicate the error: the exception is the same, the message is not, although it is similar.

>>> s = u'BIEBER FEVER \u2665'
>>> print s  # Printing Unicode is fine.
BIEBER FEVER ♥
>>> s.encode()  # but encoding defaults to ASCII
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2665' in position 13: ordinal not in range(128)

The right way is to specify an encoding that includes all the characters you need. Unless you have some reason to choose another encoding, the best thing to do is to just use UTF-8.

>>> s.encode('utf-8')
'BIEBER FEVER \xe2\x99\xa5'

--
Steven
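The same experiment as a script, for anyone who wants to rerun it. This is a sketch written to run on Python 3 (where `str.encode()` defaults to UTF-8 rather than ASCII, so the ASCII codec is named explicitly); the "replace" fallback is an addition that is not part of the advice above.

```python
text = u"BIEBER FEVER \u2665"

# UTF-8 can represent every Unicode character, so this always succeeds.
utf8_bytes = text.encode("utf-8")
print(repr(utf8_bytes))

# A codec without the heart character fails, much like the 'charmap' error.
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    print("cannot encode character at position", exc.start)

# If the output device really is limited, substitute instead of failing.
safe_bytes = text.encode("ascii", "replace")
print(repr(safe_bytes))
```

The "replace" error handler swaps each unencodable character for "?", which loses information but never raises.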
Re: how to format long if conditions
On 27-Aug-11 03:50 AM, Hans Mulder wrote:
> On 27/08/11 09:08:20, Arnaud Delobelle wrote:
> [...]
> I would break after the '(' and indent the condition once and put the '):' bit on a separate line, aligned with the 'if':
> [...]
> It may look ugly, but it's very clear where the condition part ends and the 'then' part begins.
>
> -- HansM

What about:

    cond= isinstance(left, PyCompare)
          and isinstance(right, PyCompare)
          and left.complist[-1] is right.complist[0]
    py_and= PyCompare(left.complist + right.complist[1:])if cond
          else: py_and = PyBooleanAnd(left, right)

Colin W.
Re: how to format long if conditions
On 27/08/11 17:16:51, Colin J. Williams wrote:

> What about:
>
>     cond= isinstance(left, PyCompare)
>           and isinstance(right, PyCompare)
>           and left.complist[-1] is right.complist[0]
>     py_and= PyCompare(left.complist + right.complist[1:])if cond
>           else: py_and = PyBooleanAnd(left, right)
>
> Colin W.

That's a syntax error. You need to add parentheses.

How about:

    cond = (
        isinstance(left, PyCompare)
        and isinstance(right, PyCompare)
        and left.complist[-1] is right.complist[0]
    )
    py_and = (
        PyCompare(left.complist + right.complist[1:])
        if cond
        else PyBooleanAnd(left, right)
    )

-- HansM
Re: how to format long if conditions
In article <mailman.457.1314428909.27778.python-l...@python.org>, Arnaud Delobelle <arno...@gmail.com> wrote:

> Hi all,
>
> I'm wondering what advice you have about formatting if statements with long conditions (I always format my code to 80 columns)
> [...]
>         if (isinstance(left, PyCompare) and isinstance(right, PyCompare)
>                 and left.complist[-1] is right.complist[0]):
>             py_and = PyCompare(left.complist + right.complist[1:])
>         else:
>             py_and = PyBooleanAnd(left, right)

To tie this into the ongoing "When should I write a new function?" discussion, maybe the right thing here is to refactor all of that mess into its own function, so the code looks like:

    if _needs_compare(left, right):
        py_and = PyCompare(left.complist + right.complist[1:])
    else:
        py_and = PyBooleanAnd(left, right)

and then

    def _needs_compare(left, right):
        "Decide if we need to call PyCompare"
        return isinstance(left, PyCompare) and \
               isinstance(right, PyCompare) and \
               left.complist[-1] is right.complist[0]

This seems easier to read/understand than what you've got now. It's an even bigger win if this will get called from multiple places.
Re: is there any principle when writing python function
Chris Angelico <ros...@gmail.com> wrote:

> the important considerations are not "will it take two extra nanoseconds to execute" but "can my successor understand what the code's doing" and "will he, if he edits my code, have a reasonable expectation that he's not breaking stuff". These are always important.

Forget about your successor. Will *you* be able to figure out what you did 6 months from now? I can't tell you how many times I've looked at some piece of code, muttered, "Who wrote this crap?" and called up the checkin history only to discover that *I* wrote it :-)
Understanding .pth files
I am developing a library for Python 2.7. I'm on Windows XP. I am also learning the proper way to do this (per PyPI) but not in a linear fashion: I've built a prototype for the library, created my setup script, and run the install to make sure I had that bit working properly.

Now I'm continuing to develop the library alongside my examples and applications that use this library.

The source is at c:\Dev\XmlDB.
The installed package is in c:\Python27\lib\site-packages\xmldb\

According to the docs, I should be able to put a file in the site-packages directory called xmldb.pth pointing anywhere else on my drive to include the package. I'd like to use this to direct Python to include the version in the dev folder and not the site-packages folder. (Otherwise I have my dev folder, but end up doing actual library development in the site-packages folder.)

So my C:\Python27\lib\site-packages\xmldb.pth file has one line:

c:\dev\XmlDB\xmldb

(I've tried the slashes the other way, too, but it doesn't seem to work.)

Is the only solution to delete the installed library and add the dev folder to my site.py file?

Josh
Re: is there any principle when writing python function
On Sun, Aug 28, 2011 at 2:41 AM, Roy Smith <r...@panix.com> wrote:
> Forget about your successor. Will *you* be able to figure out what you did 6 months from now? I can't tell you how many times I've looked at some piece of code, muttered, "Who wrote this crap?" and called up the checkin history only to discover that *I* wrote it :-)

Heh. In that case, you were your own successor :) I always word it as a different person to dodge the "But I'll remember!" excuse, but you are absolutely right, and I've had that exact same experience myself.

Fred comes up to me and says, "How do I use FooMatic?"
Me: "I dunno, ask Joe."
Fred: "But didn't you write it?"
Me: "Yeah, that was years ago, I've forgotten. Ask Joe, he still uses the program."

ChrisA
Re: Record seperator
On 2011-08-26, D'Arcy J.M. Cain <da...@druid.net> wrote:
> On 26 Aug 2011 18:39:07 GMT
> greymaus <greyma...@mail.com> wrote:
>>
>> Is there an equivalent for the AWK RS in Python?
>>
>> as in RS='\n\n'
>> will separate a file at two blank line intervals
>
> open("file.txt").read().split("\n\n")

Ta!.. bit awkard. :))

--
maus
 .
  .
...
NO CARRIER
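If the separator is not always exactly one empty line, a slightly more forgiving variant of the one-liner can help. The sketch below is an illustration only: read_records is a made-up helper and the sample text is invented.

```python
import re

def read_records(text):
    """Split text into records separated by one or more blank lines."""
    # "\n\s*\n" also matches runs of blank lines and blank lines that
    # contain stray spaces, which a plain split("\n\n") would not absorb.
    return [rec for rec in re.split(r"\n\s*\n", text.strip()) if rec]

sample = "a 1\nb 2\n\nc 3\n\n\nd 4\n"
for record in read_records(sample):
    print(repr(record))
```

For a real file, the text would come from `open("file.txt").read()` as in the reply quoted above.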
Understanding .pth in site-packages
(This may be a shortened double post)

I have a development version of a library in c:\dev\XmlDB\xmldb

After testing the setup script I also have c:\python27\lib\site-packages\xmldb

Now I'm continuing to develop it and simultaneously building an application with it.

I thought I could plug into my site-packages directory a file called xmldb.pth with:

c:\dev\XmlDB\xmldb

which should redirect import statements to the development version of the library.

This doesn't seem to work.

Is there a better way to redirect import statements without messing with the system path or the PYTHONPATH variable?

Josh
Arrange files according to a text file
Hello,

What would be the best way to accomplish this task?

I have many files in separate directories; each file name contains a person's name, but never in the same spot. I need to find that name, which is listed in a large text file in the following format: last name, comma and first name. The last name could be duplicated.

Adler, Jack
Smith, John
Smith, Sally
Stone, Mark
etc.

The file names don't necessarily follow any standard format.

Smith, John - 02-15-75 - business files.doc
Random Data - Adler Jack - expenses.xls
More Data Mark Stone files list.doc
etc

I need some way to pull the name from the file name, find it in the text list and then create a directory based on the name on the list (e.g. "Smith, John") and move all files named with the client's name into that directory.
Re: Understanding .pth in site-packages
On Aug 27, 2011, at 12:56 PM, Josh English wrote:

> (This may be a shortened double post)
>
> I have a development version of a library in c:\dev\XmlDB\xmldb
>
> After testing the setup script I also have c:\python27\lib\site-packages\xmldb
>
> Now I'm continuing to develop it and simultaneously building an application with it.
>
> I thought I could plug into my site-packages directory a file called xmldb.pth with:
>
> c:\dev\XmlDB\xmldb
>
> which should redirect import statements to the development version of the library.
>
> This doesn't seem to work.

xmldb.pth should contain the directory that contains xmldb:

c:\dev\XmlDB

Examining sys.path at runtime probably would have helped you to debug the effect of your .pth file.

On another note, I don't know if the behavior of 'import xmldb' is defined when xmldb is present both as a directory in site-packages and also as a .pth file. You're essentially giving Python two choices from where to import xmldb, and I don't know which Python will choose. It may be arbitrary. I've looked for some sort of statement on this topic in the documentation, but haven't come across it yet.

> Is there a better way to redirect import statements without messing with the system path or the PYTHONPATH variable?

Personally I have never used PYTHONPATH.

Hope this helps
Philip
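The rule that a .pth line must name the directory containing the package, not the package directory itself, can be tried without touching a real site-packages. The sketch below builds a throwaway layout in a temp directory; the package name xmldb_demo and all paths are hypothetical, and site.addsitedir() stands in for what Python does with site-packages at startup.

```python
import os
import site
import tempfile

# Hypothetical layout standing in for c:\dev\XmlDB and site-packages.
root = tempfile.mkdtemp()
dev_dir = os.path.join(root, "dev", "XmlDB")
site_dir = os.path.join(root, "site-packages")
pkg_dir = os.path.join(dev_dir, "xmldb_demo")
os.makedirs(pkg_dir)
os.makedirs(site_dir)

with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("WHERE = 'dev'\n")

# The .pth line names the directory that CONTAINS the package.
with open(os.path.join(site_dir, "xmldb_demo.pth"), "w") as f:
    f.write(dev_dir + "\n")

# Process *.pth files in site_dir, as happens for site-packages at startup.
site.addsitedir(site_dir)

import xmldb_demo
print(xmldb_demo.WHERE)
print(os.path.dirname(xmldb_demo.__file__))
```

Writing pkg_dir into the .pth file instead would put the package's own directory on sys.path, and `import xmldb_demo` would then fail to find it, which matches the symptom described in the question.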
Re: Arrange files according to a text file
On 27/08/2011 18:03, r...@rdo.python.org wrote: Hello, What would be the best way to accomplish this task? I have many files in separate directories, each file name contain a persons name but never in the same spot. I need to find that name which is listed in a large text file in the following format. Last name, comma and First name. The last name could be duplicate. Adler, Jack Smith, John Smith, Sally Stone, Mark etc. The file names don't necessary follow any standard format. Smith, John - 02-15-75 - business files.doc Random Data - Adler Jack - expenses.xls More Data Mark Stone files list.doc etc I need some way to pull the name from the file name, find it in the text list and then create a directory based on the name on the list Smith, John and move all files named with the clients name into that directory. I would get a name from the text file, eg. Adler, Jack, and then identify all the files which contain Adler, Jack or Adler Jack or Jack Adler in the filename, also checking the surrounding characters to ensure that I don't split a name, eg. that John isn't part of Johnson. -- http://mail.python.org/mailman/listinfo/python-list
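The matching step suggested above could be sketched roughly as follows. This is only an illustration, not tested against real data: name_pattern and find_owner are hypothetical helper names, and the regular expression assumes every list entry has the form "Last, First".

```python
import re

def name_pattern(entry):
    """Build a regex for one 'Last, First' list entry (hypothetical helper)."""
    last, first = [part.strip() for part in entry.split(",", 1)]
    last, first = re.escape(last), re.escape(first)
    # Accept "Last, First", "Last First" or "First Last"; the \b anchors
    # keep "John" from matching inside "Johnson".
    return re.compile(
        r"\b(?:%s,?\s+%s|%s\s+%s)\b" % (last, first, first, last),
        re.IGNORECASE,
    )

def find_owner(filename, names):
    """Return the first list entry whose name occurs in filename, else None."""
    for entry in names:
        if name_pattern(entry).search(filename):
            return entry
    return None

names = ["Adler, Jack", "Smith, John", "Smith, Sally", "Stone, Mark"]
files = [
    "Smith, John - 02-15-75 - business files.doc",
    "Random Data - Adler Jack - expenses.xls",
    "More Data Mark Stone files list.doc",
    "Smith, Johnson - notes.txt",
]
owners = {f: find_owner(f, names) for f in files}
for f in files:
    print(f, ":", owners[f])
```

Once an owner is known, os.makedirs and shutil.move from the standard library could create the directory and move the file. For a list of 25k names it would probably be worth compiling each pattern once rather than on every lookup.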
Re: is there any principle when writing python function
On 8/27/2011 9:41 AM Roy Smith said... Chris Angelicoros...@gmail.com wrote: the important considerations are not will it take two extra nanoseconds to execute but can my successor understand what the code's doing and will he, if he edits my code, have a reasonable expectation that he's not breaking stuff. These are always important. Forget about your successor. Will *you* be able to figure out what you did 6 months from now? I can't tell you how many times I've looked at some piece of code, muttered, Who wrote this crap? and called up the checkin history only to discover that *I* wrote it :-) When you consider that you're looking at the code six months later it's likely for one of three reasons: you have to fix a bug; you need to add features; or the code's only now getting used. So you then take the extra 20-30 minutes, tease the code apart, refactor as needed and end up with better more readable debugged code. I consider that the right time to do this type of cleanup. For all the crap I write that works well for six months before needing to be cleaned up, there's a whole lot more crap that never gets looked at again that I didn't clean up and never spent the extra 20-30 minutes considering how my future self might view what I wrote. I'm not suggesting that you shouldn't develop good coding habits that adhere to established standards and result in well structured readable code, only that if that ugly piece of code works that you move on. You can bullet proof it after you uncover the vulnerabilities. Code is first and foremost written to be executed. Emile -- http://mail.python.org/mailman/listinfo/python-list
Re: Record seperator
greymaus wrote:

On 2011-08-26, D'Arcy J.M. Cain da...@druid.net wrote: On 26 Aug 2011 18:39:07 GMT greymaus greyma...@mail.com wrote: Is there an equivalent for the AWK RS in Python? as in RS='\n\n' will separate a file at two blank line intervals

    open("file.txt").read().split("\n\n")

Ta!.. bit awkard. :))

Er, is that meant to be a pun? Awk[w]ard, as in awk-ward? In any case, no, the Python line might be a handful of characters longer than the AWK equivalent, but it isn't awkward. It is logical and easy to understand. It's embarrassingly easy to describe what it does:

    open("file.txt")   # opens the file
    .read()            # reads the contents of the file
    .split("\n\n")     # splits the text on double-newlines.

The only tricky part is knowing that \n means newline, but anyone familiar with C, Perl, AWK etc. should know that. The Python code might be long (but only by the standards of AWK, which can be painfully concise), but it is simple, obvious and readable. A few extra characters is the price you pay for making your language readable. At the cost of a few extra key presses, you get something that you will be able to understand in 10 years time.

AWK is a specialist text processing language. Python is a general scripting and programming language. They have different values: AWK values short, concise code, Python is willing to pay a little more in source code.

-- Steven
Re: is there any principle when writing python function
On Sun, Aug 28, 2011 at 3:27 AM, Emile van Sebille em...@fenx.com wrote: Code is first and foremost written to be executed.

+1 QOTW. Yes, it'll be read, and most likely read several times, by humans, but ultimately its purpose is to be executed. And in the case of some code, the programmer needs the same treatment, but that's a different issue...

ChrisA
Re: Understanding .pth in site-packages
Josh English wrote: I have a development version of a library in c:\dev\XmlDB\xmldb After testing the setup script I also have c:\python27\lib\site-packages\xmldb Now I'm continuing to develop it and simultaneously building an application with it. I thought I could plug into my site-packages directory a file called xmldb.pth with: c:\dev\XmlDB\xmldb which should redirect import statements to the development version of the library. This doesn't seem to work. You have to put the directory containing the package into the pth-file. That's probably c:\dev\XmlDB in your case. Also, Python will stop at the first matching module or package; if you keep c:\python27\lib\site-packages\xmldb that will shadow c:\dev\XmlDB\xmldb. %APPDATA%/Python/Python26/site-packages may be a good place for the pth-file (I'm not on Windows and too lazy to figure out where %APPDATA% actually is. The PEP http://www.python.org/dev/peps/pep-0370/ may help) -- http://mail.python.org/mailman/listinfo/python-list
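The "first match wins" behaviour described above can be checked with a small experiment. The sketch below is only a demonstration: the module name shadow_demo and the two temporary directories are made up for the purpose, and it is written for Python 3.

```python
import importlib
import os
import sys
import tempfile

# Two throwaway directories, each holding a module with the same
# (hypothetical) name but a different marker value.
first_dir = tempfile.mkdtemp()
second_dir = tempfile.mkdtemp()
for directory, marker in ((first_dir, "first"), (second_dir, "second")):
    with open(os.path.join(directory, "shadow_demo.py"), "w") as fh:
        fh.write("WHERE = %r\n" % marker)

# Put both directories at the front of the search path, first_dir first.
sys.path[:0] = [first_dir, second_dir]
importlib.invalidate_caches()

import shadow_demo

# The copy in the earlier sys.path entry is the one that gets imported.
found_in = shadow_demo.WHERE
print(found_in)
```

The same reasoning applies to the xmldb case: whichever directory containing an xmldb package comes first in sys.path is the one the import uses.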
Re: typing question
On Sat, Aug 27, 2011 at 6:42 AM, Jason Swails jason.swa...@gmail.com wrote:

Hello everyone, This is probably a basic question with an obvious answer, but I don't quite get why the type(foo).__name__ works differently for some class instances and not for others. If I have an underived class, any instance of that class is simply of type instance. If I include an explicit base class, then its type __name__ is the name of the class.

    $ python
    Python 2.7.2 (default, Aug 26 2011, 22:35:24)
    [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> class MyClass:
    ...     pass
    ...
    >>> foo = MyClass()
    >>> type(foo)
    <type 'instance'>
    >>> type(foo).__name__
    'instance'
    >>> class MyClass1():
    ...     pass
    ...
    >>> bar = MyClass1()
    >>> type(bar)
    <type 'instance'>
    >>> type(bar).__name__
    'instance'
    >>> class MyClass2(object):
    ...     pass
    ...
    >>> foobar = MyClass2()
    >>> type(foobar)
    <class '__main__.MyClass2'>
    >>> type(foobar).__name__
    'MyClass2'

I can't explain this behavior (since doesn't every class inherit from object by default?

That's only true in Python 3.x.

    Python 2.7.2 (default, Jul 27 2011, 04:14:23)
    >>> class Foo:
    ...     pass
    ...
    >>> Foo.__bases__
    ()
    >>> class Bar(object):
    ...     pass
    ...
    >>> Bar.__bases__
    (<type 'object'>,)

Jason Swails wrote: And if so, there should be no difference between any of my class definitions). I would prefer that every approach give me the name of the class (rather than the first 2 just return 'instance'). Why is this not the case?

Classes directly or indirectly inheriting from `object` are new-style; classes which don't are old-style. The two kinds of classes have different semantics (including whether they have a .__name__, but that's minor relative to the other changes). Old-style classes are deprecated and were removed in Python 3. See http://docs.python.org/reference/datamodel.html#new-style-and-classic-classes

Cheers, Chris
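For comparison, here is a minimal check of the Python 3 behaviour mentioned above, where every class is new-style whether or not object is named as a base. The class names are simply reused from the question.

```python
class MyClass:            # no explicit base class
    pass

class MyClass2(object):   # explicit base class
    pass

# Under Python 3 both spellings produce a new-style class, so the
# instance's type is the class itself in both cases.
names = [type(MyClass()).__name__, type(MyClass2()).__name__]
bases = [MyClass.__bases__, MyClass2.__bases__]
print(names)
print(bases)
```

Under Python 2 the usual way to get the same result was to inherit from object explicitly, as MyClass2 does.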
Re: Record seperator
In article 4e592852$0$29965$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:

    open("file.txt")   # opens the file
    .read()            # reads the contents of the file
    .split("\n\n")     # splits the text on double-newlines.

The biggest problem with this code is that read() slurps the entire file into a string. That's fine for moderately sized files, but will fail (or at least be grossly inefficient) for very large files.

It's always annoyed me a little that while it's easy to iterate over the lines of a file, it's more complicated to iterate over a file character by character. You could write your own generator to do that:

    for c in getchar(open("file.txt")):
        whatever

    def getchar(f):
        for line in f:
            for c in line:
                yield c

but that's annoyingly verbose (and probably not hugely efficient).

Of course, the next problem for the specific problem at hand is that even with an iterator over the characters of a file, split() only works on strings. It would be nice to have a version of split which took an iterable and returned an iterator over the split components. Maybe there is such a thing and I'm just missing it?
Re: Understanding .pth in site-packages
Philip,

Yes, the proper path should be c:\dev\XmlDB, which has the setup.py, xmldb subfolder, the docs subfolder, and example subfolder, and the other text files prescribed by the package development folder.

I could only get it to work, though, by renaming the xmldb folder in the site-packages directory, and deleting the egg file created in the site-packages directory. Why the egg file, which doesn't list any paths, would interfere I do not know. But with those changes, the xmldb.pth file is being read.

So I think the preferred search order is:

1. a folder in the site-packages directory
2. an Egg file (still unsure why)
3. A .pth file

It's a strange juju that I haven't figured out yet. Thanks for the hint.

Josh
Re: Arrange files according to a text file
On 8/27/2011 10:03 AM r...@rdo.python.org said...

Hello, What would be the best way to accomplish this task?

I'd do something like:

    usernames = """Adler, Jack
    Smith, John
    Smith, Sally
    Stone, Mark""".split('\n')

    filenames = """Smith, John - 02-15-75 - business files.doc
    Random Data - Adler Jack - expenses.xls
    More Data Mark Stone files list.doc""".split('\n')

    from difflib import SequenceMatcher as SM

    def ignore(x):
        return x in ' ,.'

    for filename in filenames:
        ratios = [SM(ignore,filename,username).ratio() for username in usernames]
        best = max(ratios)
        owner = usernames[ratios.index(best)]
        print filename,":",owner

Emile
Re: Understanding .pth in site-packages
On Aug 27, 2011, at 1:57 PM, Josh English wrote:

Philip, Yes, the proper path should be c:\dev\XmlDB, which has the setup.py, xmldb subfolder, the docs subfolder, and example subfolder, and the other text files proscribed by the package development folder. I could only get it to work, though, by renaming the xmldb folder in the site-packages directory, and deleting the egg file created in the site-packages directory. Why the egg file, which doesn't list any paths, would interfere I do not know. But with those changes, the xmldb.pth file is being read. So I think the preferred search order is: 1. a folder in the site-packages directory 2. an Egg file (still unsure why) 3. A .pth file

That might be implementation-dependent, or it might even come down to something as simple as the order in which the operating system returns files/directories when asked for a listing. In other words, unless you can find something in the documentation (or Python's import implementation) that confirms your preferred search order observation, I would not count on it working the same way with all systems, all Pythons, or even all directory names.

Good luck
Philip
Re: Record seperator
On Aug 27, 10:45 am, Roy Smith r...@panix.com wrote:

In article 4e592852$0$29965$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: open("file.txt") # opens the file .read() # reads the contents of the file .split("\n\n") # splits the text on double-newlines.

The biggest problem with this code is that read() slurps the entire file into a string. That's fine for moderately sized files, but will fail (or at least be grossly inefficient) for very large files. It's always annoyed me a little that while it's easy to iterate over the lines of a file, it's more complicated to iterate over a file character by character. You could write your own generator to do that: for c in getchar(open("file.txt")): whatever def getchar(f): for line in f: for c in line: yield c but that's annoyingly verbose (and probably not hugely efficient).

read() takes an optional size parameter; so f.read(1) is another option...

Roy Smith wrote: Of course, the next problem for the specific problem at hand is that even with an iterator over the characters of a file, split() only works on strings. It would be nice to have a version of split which took an iterable and returned an iterator over the split components. Maybe there is such a thing and I'm just missing it?

I don't know if there is such a thing; but for the OP's problem you could read the file in chunks, e.g.:

    def readgroup(f, delim, buffsize=8192):
        tail=''
        while True:
            s = f.read(buffsize)
            if not s:
                yield tail
                break
            groups = (tail + s).split(delim)
            tail = groups[-1]
            for group in groups[:-1]:
                yield group

    for group in readgroup(open('file.txt'), '\n\n'):
        # do something

Cheers - Chas
Re: Run time default arguments
On 8/25/11 1:54 PM, t...@thsu.org wrote:

On Aug 25, 10:35 am, Arnaud Delobelle arno...@gmail.com wrote: You're close to the usual idiom:

    def doSomething(debug=None):
        if debug is None:
            debug = defaults['debug']
        ...

Note the use of 'is' rather than '==' HTH

Hmm, from what you are saying, it seems like there's no elegant way to handle run time defaults for function arguments,

Well, elegance is in the eye of the beholder: and the above idiom is generally considered elegant in Python, more or less. (The global nature of 'defaults' being a question)

t...@thsu.org wrote: meaning that I should probably write a SQL-esque coalesce function to keep my code cleaner. I take it that most people who run into this situation do this?

    def coalesce(*args):
        for a in args:
            if a is not None:
                return a
        return None

    def doSomething(debug=None):
        debug = coalesce(debug,defaults['debug'])
        # blah blah blah

Er, I'd say that most people don't do that, no. I'd guess that most do something more along the lines of if debug is None: debug = default as Arnaud said. It's very common Pythonic code. In fact, I'm not quite sure what you think you're getting out of that coalesce function. Return the first argument that is not None, or return None? That's a kind of odd thing to do, I think. In Python at least.

Why not just: debug = defaults.get(debug, None) (Strictly speaking, providing None to get is not needed, but I always feel odd leaving it off.) That's generally how I spell it when I need to do run time defaults.

-- Stephen Hansen ... Also: Ixokai ... Mail: me+list/python (AT) ixokai (DOT) io ... Blog: http://meh.ixokai.io/
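A minimal sketch of the None-sentinel idiom discussed above, showing that the default really is looked up at call time. The defaults dictionary and the name do_something are stand-ins chosen for this example, not code from the thread.

```python
defaults = {"debug": False}   # hypothetical module-level settings

def do_something(debug=None):
    # None is the sentinel for "the caller did not choose"; the lookup
    # happens on every call, so later changes to `defaults` are seen.
    if debug is None:
        debug = defaults["debug"]
    return debug

before = do_something()               # uses the current default
defaults["debug"] = True
after = do_something()                # picks up the changed default
explicit = do_something(debug=False)  # an explicit False is respected
print(before, after, explicit)
```

The test is written with `is None` rather than truthiness so that an explicit False is not mistaken for "not given". Note also that dict.get takes the key as its first argument, so defaults.get(debug, None) would look up the value of debug as a key rather than fall back to the stored default.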
How can I solve an equation containing expressions like sqrt(log(x) - 1) = 2 and exp((log(x) - 1.5)**2 - 3) = 5
Hi,

I am trying to solve an equation containing exp, log and erfc, and they may be nested inside each other. But sympy cannot handle this, as shown below:

    >>> from sympy import solve, exp, log, pi
    >>> from sympy.mpmath import *
    >>> from sympy import Symbol
    >>> x=Symbol('x')
    >>> sigma = 4
    >>> mu = 1.5
    >>> solve(x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) - mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * sqrt(2))) - 1, x)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/functions/functions.py", line 287, in log
        return ctx.ln(x)
      File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/ctx_mp_python.py", line 984, in f
        x = ctx.convert(x)
      File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/ctx_mp_python.py", line 662, in convert
        return ctx._convert_fallback(x, strings)
      File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/ctx_mp.py", line 556, in _convert_fallback
        raise TypeError("cannot create mpf from " + repr(x))
    TypeError: cannot create mpf from x

But sqrt on its own is OK, as shown below:

    >>> solve((1.0 / sqrt(2 * pi) * x * sigma) - 1, x)
    [0.626657068657750]

So, how can I solve an equation containing expressions like sqrt(log(x) - 1)=0 or exp((log(x) - mu)**2 - 3) = 0? If there are any other methods without Sympy, it is still OK.

Thanks
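Judging by the traceback, the log being called is mpmath's numeric one (the star import rebinds log, exp and sqrt after the sympy import), and it cannot take a Symbol. Since a method without Sympy is acceptable, here is one possible sketch using plain bisection from the standard library. It assumes you can bracket the root yourself; bisect_root and the interval endpoints are choices made for this example, applied to the two equations from the subject line.

```python
import math

def bisect_root(f, lo, hi, tol=1e-12, max_iter=200):
    """Find x in [lo, hi] with f(x) == 0, assuming f(lo) and f(hi)
    have opposite signs (hypothetical helper, plain bisection)."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if f_mid == 0 or (hi - lo) / 2.0 < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                  # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid     # root lies in the upper half
    return (lo + hi) / 2.0

# sqrt(log(x) - 1) = 2; the left side needs log(x) >= 1, so start at 3.
root1 = bisect_root(lambda x: math.sqrt(math.log(x) - 1) - 2, 3.0, 1000.0)

# exp((log(x) - 1.5)**2 - 3) = 5, on the branch x > exp(1.5).
root2 = bisect_root(lambda x: math.exp((math.log(x) - 1.5) ** 2 - 3) - 5,
                    math.exp(1.5), 1000.0)

print(root1, root2)
```

Both equations can also be solved by hand, which gives a check on the numbers: the first has x = exp(5), and the second has x = exp(1.5 + sqrt(3 + log(5))) on this branch, with a second root at exp(1.5 - sqrt(3 + log(5))). The same function should work for the erfc equation using math.erfc, provided a sign change can be bracketed.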
Re: Understanding .pth in site-packages
Josh English wrote: Philip, Yes, the proper path should be c:\dev\XmlDB, which has the setup.py, xmldb subfolder, the docs subfolder, and example subfolder, and the other text files proscribed by the package development folder. I could only get it to work, though, by renaming the xmldb folder in the site-packages directory, and deleting the egg file created in the site-packages directory. Why the egg file, which doesn't list any paths, would interfere I do not know. But with those changes, the xmldb.pth file is being read. So I think the preferred search order is: 1. a folder in the site-packages directory 2. an Egg file (still unsure why) 3. A .pth file You say that the egg file was created by the setup script for the library. Are you sure that this script did not also create or modify a .pth file of its own, adding the egg to the path? .pth files do not redirect imports from site-packages; they add EXTRA directories to sys.path. Also note that this means the .pth file itself is not part of the search path; it's not like you shadow a package xyz by creating a .pth file xyz.pth instead. A single .pth file can list multiple directories, and it's those directories that are added to the path. I'm not sure how your package is set up, but easy_install, for instance, creates an easy_install.pth file in site-packages. This file contains references to egg files (or, at least in my case, .egg directories created by unpacking the eggs) for each package installed with easy_install. As far as I'm aware, Python doesn't have special rules for putting egg files in the search path, so my guess is that it's something like that: the setup script is creating a .pth file (or modifying an existing .pth file) to add the egg to the path. Read http://docs.python.org/library/site.html for the description of how .PTH files work. I don't think there is a general way to globally shadow a package that exists in site-packages. 
However, according to the docs the .pth files are added in alphabetical order, so if it is indeed easy_install.pth that is adding your egg, you could hack around it by making a file with an alphabetically earlier name (e.g., a_xmldb.pth). -- --OKB (not okblacke) Brendan Barnwell Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail. --author unknown -- http://mail.python.org/mailman/listinfo/python-list
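To see what a .pth file actually does, the effect can be reproduced in a scratch directory with site.addsitedir, which processes .pth files the same way the site module does at startup. This is a sketch under assumptions: both directories are temporary stand-ins for site-packages and for c:\dev\XmlDB, and the file name a_xmldb.pth is the hypothetical one suggested above.

```python
import os
import site
import sys
import tempfile

# A throwaway stand-in for site-packages plus a stand-in for the
# development tree (c:\dev\XmlDB in the thread); both are hypothetical.
fake_site = tempfile.mkdtemp()
dev_root = tempfile.mkdtemp()

# The .pth file names the directory that CONTAINS the package,
# one path per line.
with open(os.path.join(fake_site, "a_xmldb.pth"), "w") as fh:
    fh.write(dev_root + "\n")

# addsitedir() appends fake_site to sys.path and then processes the
# .pth files found in it, in sorted filename order.
site.addsitedir(fake_site)

added = [p for p in sys.path if p in (fake_site, dev_root)]
print(added)
```

Because the directories named in a .pth file are appended to sys.path, they come after the site directory itself, which fits the observation that the copy left in site-packages kept winning until it was renamed.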
Re: Record seperator
On 8/27/2011 1:45 PM, Roy Smith wrote:

In article 4e592852$0$29965$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: open("file.txt") # opens the file .read() # reads the contents of the file .split("\n\n") # splits the text on double-newlines.

The biggest problem with this code is that read() slurps the entire file into a string. That's fine for moderately sized files, but will fail (or at least be grossly inefficient) for very large files.

I read the above as separating the file into paragraphs, as indicated by blank lines.

    def paragraphs(file):
        para = []
        for line in file:
            if line:
                para.append(line)
            else:
                yield para  # or ''.join(para), as desired
                para = []

-- Terry Jan Reedy
Re: Record seperator
On Sun, Aug 28, 2011 at 6:03 AM, Terry Reedy tjre...@udel.edu wrote: yield para # or ''.join(para), as desired

Or possibly '\n'.join(para) if you want to keep the line breaks inside paragraphs.

ChrisA
Re: typing question
On 8/27/2011 9:42 AM, Jason Swails wrote: P.S. I'll note that my preferred behavior is how python3.2 actually operates

Python core developers agree. This is one of the reasons for breaking a bit from 2.x to make Python 3.

-- Terry Jan Reedy
Re: Arrange files according to a text file
Hello Emile , Thank you for the code below as I have not encountered SequenceMatcher before and would have to take a look at it closer. My question would it work for a text file list of names about 25k lines and a directory with say 100 files inside? Thank you once again. On Sat, 27 Aug 2011 11:06:22 -0700, Emile van Sebille em...@fenx.com wrote: On 8/27/2011 10:03 AM r...@rdo.python.org said... Hello, What would be the best way to accomplish this task? I'd do something like: usernames = Adler, Jack Smith, John Smith, Sally Stone, Mark.split('\n') filenames = Smith, John - 02-15-75 - business files.doc Random Data - Adler Jack - expenses.xls More Data Mark Stone files list.doc.split('\n') from difflib import SequenceMatcher as SM def ignore(x): return x in ' ,.' for filename in filenames: ratios = [SM(ignore,filename,username).ratio() for username in usernames] best = max(ratios) owner = usernames[ratios.index(best)] print filename,:,owner Emile I have many files in separate directories, each file name contain a persons name but never in the same spot. I need to find that name which is listed in a large text file in the following format. Last name, comma and First name. The last name could be duplicate. Adler, Jack Smith, John Smith, Sally Stone, Mark etc. The file names don't necessary follow any standard format. Smith, John - 02-15-75 - business files.doc Random Data - Adler Jack - expenses.xls More Data Mark Stone files list.doc etc I need some way to pull the name from the file name, find it in the text list and then create a directory based on the name on the list Smith, John and move all files named with the clients name into that directory. -- http://mail.python.org/mailman/listinfo/python-list
Re: Understanding .pth in site-packages
On 8/27/2011 2:07 PM, Philip Semanchuk wrote: On Aug 27, 2011, at 1:57 PM, Josh English wrote: Philip, Yes, the proper path should be c:\dev\XmlDB, which has the setup.py, xmldb subfolder, the docs subfolder, and example subfolder, and the other text files proscribed by the package development folder. I could only get it to work, though, by renaming the xmldb folder in the site-packages directory, and deleting the egg file created in the site-packages directory. Why the egg file, which doesn't list any paths, would interfere I do not know. But with those changes, the xmldb.pth file is being read. So I think the preferred search order is: 1. a folder in the site-packages directory 2. an Egg file (still unsure why) 3. A .pth file That might be implementation-dependent or it might even come down to something as simple as the in which order the operating system returns files/directories when asked for a listing. Doc says first match, and I presume that includes first match within a directory. -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
Re: Understanding .pth in site-packages
On Aug 27, 2011, at 4:14 PM, Terry Reedy wrote: On 8/27/2011 2:07 PM, Philip Semanchuk wrote: On Aug 27, 2011, at 1:57 PM, Josh English wrote: Philip, Yes, the proper path should be c:\dev\XmlDB, which has the setup.py, xmldb subfolder, the docs subfolder, and example subfolder, and the other text files proscribed by the package development folder. I could only get it to work, though, by renaming the xmldb folder in the site-packages directory, and deleting the egg file created in the site-packages directory. Why the egg file, which doesn't list any paths, would interfere I do not know. But with those changes, the xmldb.pth file is being read. So I think the preferred search order is: 1. a folder in the site-packages directory 2. an Egg file (still unsure why) 3. A .pth file That might be implementation-dependent or it might even come down to something as simple as the in which order the operating system returns files/directories when asked for a listing. Doc says first match, and I presume that includes first match within a directory. First match using which ordering? Do the docs clarify that? Thanks Philip -- http://mail.python.org/mailman/listinfo/python-list
Re: is there any principle when writing python function
Chris Angelico wrote: On Sun, Aug 28, 2011 at 3:27 AM, Emile van Sebille em...@fenx.com wrote: Code is first and foremost written to be executed. +1 QOTW. Yes, it'll be read, and most likely read several times, by humans, but ultimately its purpose is to be executed. You've never noticed the masses of code written in text books, blogs, web pages, discussion forums like this one, etc.? Real world code for production is usually messy and complicated and filled with data validation and error checking code. There's a lot of code without that, because it was written explicitly to be read by humans, and the fact that it may be executed as well is incidental. Some code is even written in pseudo-code that *cannot* be executed. It's clear to me that a non-trivial amount of code is specifically written to be consumed by other humans, not by machines. It seems to me that, broadly speaking, there are languages designed with execution of code as the primary purpose: Fortran, C, Lisp, Java, PL/I, APL, Forth, ... and there are languages designed with *writing* of code as the primary purpose: Perl, AWK, sed, bash, ... and then there are languages where *reading* is the primary purpose: Python, Ruby, Hypertalk, Inform 7, Pascal, AppleScript, ... and then there are languages where the torment of the damned is the primary purpose: INTERCAL, Oook, Brainf*ck, Whitespace, Malbolge, ... and then there are languages with few, or no, design principles to speak of, or as compromise languages that (deliberately or accidentally) straddle the other categories. It all depends on the motivation and values of the language designer, and the trade-offs the language makes. Which category any specific language may fall into may be a matter of degree, or a matter of opinion, or both. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: is there any principle when writing python function
On Sun, Aug 28, 2011 at 6:27 AM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: You've never noticed the masses of code written in text books, blogs, web pages, discussion forums like this one, etc.? Real world code for production is usually messy and complicated and filled with data validation and error checking code. There's a lot of code without that, because it was written explicitly to be read by humans, and the fact that it may be executed as well is incidental. Some code is even written in pseudo-code that *cannot* be executed. It's clear to me that a non-trivial amount of code is specifically written to be consumed by other humans, not by machines. Yes, I'm aware of the quantities of code that are primarily for human consumption. But in the original context, which was of editing code six months down the track, I still believe that such code is primarily for the machine. In that situation, there are times when it's not worth the hassle of writing beautiful code; you'd do better to just get that code generated and in operation. Same goes for lint tools and debuggers - sometimes, it's easier to just put the code into a live situation (or a perfect copy of) and see where it breaks, than to use a simulation/test harness. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: Arrange files according to a text file
On 8/27/2011 1:15 PM r...@rdo.python.org said... Hello Emile , Thank you for the code below as I have not encountered SequenceMatcher before and would have to take a look at it closer. My question would it work for a text file list of names about 25k lines and a directory with say 100 files inside? Sure. Emile Thank you once again. On Sat, 27 Aug 2011 11:06:22 -0700, Emile van Sebilleem...@fenx.com wrote: On 8/27/2011 10:03 AM r...@rdo.python.org said... Hello, What would be the best way to accomplish this task? I'd do something like: usernames = Adler, Jack Smith, John Smith, Sally Stone, Mark.split('\n') filenames = Smith, John - 02-15-75 - business files.doc Random Data - Adler Jack - expenses.xls More Data Mark Stone files list.doc.split('\n') from difflib import SequenceMatcher as SM def ignore(x): return x in ' ,.' for filename in filenames: ratios = [SM(ignore,filename,username).ratio() for username in usernames] best = max(ratios) owner = usernames[ratios.index(best)] print filename,:,owner Emile I have many files in separate directories, each file name contain a persons name but never in the same spot. I need to find that name which is listed in a large text file in the following format. Last name, comma and First name. The last name could be duplicate. Adler, Jack Smith, John Smith, Sally Stone, Mark etc. The file names don't necessary follow any standard format. Smith, John - 02-15-75 - business files.doc Random Data - Adler Jack - expenses.xls More Data Mark Stone files list.doc etc I need some way to pull the name from the file name, find it in the text list and then create a directory based on the name on the list Smith, John and move all files named with the clients name into that directory. -- http://mail.python.org/mailman/listinfo/python-list
Re: Record seperator
In article mailman.477.1314475482.27778.python-l...@python.org, Terry Reedy tjre...@udel.edu wrote:

On 8/27/2011 1:45 PM, Roy Smith wrote: In article 4e592852$0$29965$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: open("file.txt") # opens the file .read() # reads the contents of the file .split("\n\n") # splits the text on double-newlines. The biggest problem with this code is that read() slurps the entire file into a string. That's fine for moderately sized files, but will fail (or at least be grossly inefficient) for very large files.

I read the above as separating the file into paragraphs, as indicated by blank lines.

    def paragraphs(file):
        para = []
        for line in file:
            if line:
                para.append(line)
            else:
                yield para  # or ''.join(para), as desired
                para = []

Plus or minus the last paragraph in the file :-)
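One way the generator above might be adjusted so that the final paragraph is not lost is sketched below. It also strips the newline before testing for a blank line, since a line read from a file still carries its trailing "\n" and is therefore never empty. This is a variation written for illustration, not Terry's code.

```python
import io

def paragraphs(lines):
    """Yield each blank-line-separated paragraph as a list of lines."""
    para = []
    for line in lines:
        if line.strip():          # a line with text on it
            para.append(line.rstrip("\n"))
        elif para:                # a blank line ends the current paragraph
            yield para
            para = []
    if para:                      # do not lose the last paragraph
        yield para

sample = io.StringIO("one\ntwo\n\nthree\n\n\nfour\n")
result = list(paragraphs(sample))
print(result)
```

Unlike split("\n\n"), this version treats a run of several blank lines as a single separator and never yields an empty paragraph. It reads one line at a time, so it also avoids loading the whole file into memory.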
Re: is there any principle when writing python function
In article 4e595334$0$3$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: and then there are languages with few, or no, design principles to speak of

Oh, like PHP?
Re: how to format long if conditions
On 27-Aug-11 11:53 AM, Hans Mulder wrote:

On 27/08/11 17:16:51, Colin J. Williams wrote: What about:

    cond= isinstance(left, PyCompare) and isinstance(right, PyCompare)
          and left.complist[-1] is right.complist[0]
    py_and= PyCompare(left.complist + right.complist[1:])if cond
    else: py_and = PyBooleanAnd(left, right)

Colin W.

That's a syntax error. You need to add parentheses. How about:

    cond = (
        isinstance(left, PyCompare) and isinstance(right, PyCompare)
        and left.complist[-1] is right.complist[0]
    }
    py_and = (
        PyCompare(left.complist + right.complist[1:])
        if cond else
        PyBooleanAnd(left, right)
    )

-- HansM

I like your 11:53 message but suggest indenting the if cond as below to make it clearer that it, with the preceding line, is all one statement.

Colin W.

    #!/usr/bin/env python
    z= 1
    class PyCompare:
        complist = [True, False]
        def __init__(self):
            pass
    left= PyCompare
    right= PyCompare
    def isinstance(a, b):
        return True
    def PyBooleanAnd(a, b):
        return True
    def PyCompare(a):
        return False
    z=2

    def try1():
        '''Hans Mulder suggestion 03:50 '''
        if (
            isinstance(left, PyCompare) and isinstance(right, PyCompare)
            and left.complist[-1] is right.complist[0]
        ):
            py_and = PyCompare(left.complist + right.complist[1:])
        else:
            py_and = PyBooleanAnd(left, right)

    def try2():
        '''cjw response - corrected 11:56 '''
        cond= (isinstance(left, PyCompare) and isinstance(right, PyCompare)
               and left.complist[-1] is right.complist[0])
        py_and= (PyCompare(left.complist + right.complist[1:]) if cond
                 else PyBooleanAnd(left, right))

    def try3():
        ''' Hans Mulder 11:53 '''
        cond = (
            isinstance(left, PyCompare) and isinstance(right, PyCompare)
            and left.complist[-1] is right.complist[0]
        )  # not }
        py_and = (
            PyCompare(left.complist + right.complist[1:])
                if cond else
            PyBooleanAnd(left, right)
        )

    def main():
        try1()
        try2()
        try3()

    if __name__ == '__main__':
        main()
        pass
Re: UnicodeEncodeError -- 'character maps to undefined'
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info writes: s = u'BIEBER FEVER \u2665' print s # Printing Unicode is fine. BIEBER FEVER ♥ You're a cruel man. Why do you hate me? -- \ “If nature has made any one thing less susceptible than all | `\others of exclusive property, it is the action of the thinking | _o__) power called an idea” —Thomas Jefferson, 1813-08-13 | Ben Finney -- http://mail.python.org/mailman/listinfo/python-list
Re: is there any principle when writing python function
Emile van Sebille em...@fenx.com writes: Code is first and foremost written to be executed. −1 QotW. I disagree, and have a counter-aphorism: “Programs must be written for people to read, and only incidentally for machines to execute.” —Abelson Sussman, _Structure and Interpretation of Computer Programs_ Yes, the primary *function* of the code you write is for it to eventually execute. But the primary *audience* of the text you type into your buffer is not the computer, but the humans who will read it. That's what must be foremost in your mind while writing that text. -- \ “If you can't beat them, arrange to have them beaten.” —George | `\Carlin | _o__) | Ben Finney -- http://mail.python.org/mailman/listinfo/python-list
Re: is there any principle when writing python function
On 8/27/2011 2:57 PM Ben Finney said... Emile van Sebille em...@fenx.com writes: Code is first and foremost written to be executed. “Programs must be written for people to read, and only incidentally for machines to execute.” —Abelson & Sussman, _Structure and Interpretation of Computer Programs_ That's certainly self-fulfilling -- code that doesn't execute will need to be read to be understood, and to be fixed so that it does run. Nobody cares about code not intended to be executed. Pretty it up as much as you have free time to do so to enlighten your intended audience. Code that runs from the outset may not ever again need to be read, so the only audience will ever be the processor. I find it much too easy to waste enormous amounts of time prettying up code that works. Pretty it up when it doesn't -- that's the code that needs the attention. Emile Yes, the primary *function* of the code you write is for it to eventually execute. But the primary *audience* of the text you type into your buffer is not the computer, but the humans who will read it. That's what must be foremost in your mind while writing that text. -- http://mail.python.org/mailman/listinfo/python-list
Re: Understanding .pth in site-packages
I have .egg files in my system path. The Egg file created by my setup script doesn't include anything but the introductory text. If I open other eggs I see the zipped data, but not for my own files. Is having a zipped egg file any faster than a regular package? or does it just prevent people from seeing the code? Josh -- http://mail.python.org/mailman/listinfo/python-list
Re: Understanding .pth in site-packages
When I run: os.listdir('c:\Python27\lib\site-packages') I get the contents in order, so the folders come before .pth files (as nothing comes before something.) I would guess Python is using os.listdir. Why wouldn't it? -- http://mail.python.org/mailman/listinfo/python-list
Re: Understanding .pth in site-packages
OKB, The setup.py script created the egg, but not the .pth file. I created that myself. Thank you for clarifying about how .pth works. I know redirect imports was the wrong phrase, but it worked in my head at the time. It appears, at least on my system, that Python will find site-packages/foo before it finds and reads site-packages/foo.pth. At least this solution gives me a way to develop my libraries outside of site-packages. Josh -- http://mail.python.org/mailman/listinfo/python-list
Re: is there any principle when writing python function
On Aug 27, 5:21 pm, Emile van Sebille em...@fenx.com wrote: On 8/27/2011 2:57 PM Ben Finney said... Emile van Sebille em...@fenx.com writes: Code is first and foremost written to be executed. “Programs must be written for people to read, and only incidentally for machines to execute.” —Abelson & Sussman, _Structure and Interpretation of Computer Programs_ That's certainly self-fulfilling -- code that doesn't execute will need to be read to be understood, and to be fixed so that it does run. Nobody cares about code not intended to be executed. Pretty it up as much as you have free time to do so to enlighten your intended audience. Code that runs from the outset may not ever again need to be read, so the only audience will ever be the processor. WRONG! Code may need to be extended someday no matter HOW well it executes today. Also, code needs to be readable so the readers can learn from it. -- http://mail.python.org/mailman/listinfo/python-list
Re: is there any principle when writing python function
In article mailman.489.1314483681.27778.python-l...@python.org, Emile van Sebille em...@fenx.com wrote: code that doesn't execute will need to be read to be understood, and to be fixed so that it does run. That is certainly true, but it's not the whole story. Even code that works perfectly today will need to be modified in the future. Business requirements change. Your code will need to be ported to a new OS. You'll need to make it work for 64-bit. Or i18n. Or y2k (well, don't need to worry about that one any more). Or with a different run-time library. A new compiler. A different database. Regulatory changes will impose new requirements. Or, your company will get bought and you'll need to interface with a whole new system. Code is never done. At least not until the project is dead. -- http://mail.python.org/mailman/listinfo/python-list
Re: Arrange files according to a text file
Thank you so much. The code worked perfectly. This is what I tried using Emile code. The only time when it picked wrong name from the list was when the file was named like this. Data Mark Stone.doc How can I fix this? Hope I am not asking too much? import os from difflib import SequenceMatcher as SM path = r'D:\Files ' txt_names = [] with open(r'D:/python/log1.txt') as f: for txt_name in f.readlines(): txt_names.append(txt_name.strip()) def ignore(x): return x in ' ,.' for filename in os.listdir(path): ratios = [SM(ignore,filename,txt_name).ratio() for txt_name in txt_names] best = max(ratios) owner = txt_names[ratios.index(best)] print filename,:,owner On Sat, 27 Aug 2011 14:08:17 -0700, Emile van Sebille em...@fenx.com wrote: On 8/27/2011 1:15 PM r...@rdo.python.org said... Hello Emile , Thank you for the code below as I have not encountered SequenceMatcher before and would have to take a look at it closer. My question would it work for a text file list of names about 25k lines and a directory with say 100 files inside? Sure. Emile Thank you once again. On Sat, 27 Aug 2011 11:06:22 -0700, Emile van Sebilleem...@fenx.com wrote: On 8/27/2011 10:03 AM r...@rdo.python.org said... Hello, What would be the best way to accomplish this task? I'd do something like: usernames = Adler, Jack Smith, John Smith, Sally Stone, Mark.split('\n') filenames = Smith, John - 02-15-75 - business files.doc Random Data - Adler Jack - expenses.xls More Data Mark Stone files list.doc.split('\n') from difflib import SequenceMatcher as SM def ignore(x): return x in ' ,.' for filename in filenames: ratios = [SM(ignore,filename,username).ratio() for username in usernames] best = max(ratios) owner = usernames[ratios.index(best)] print filename,:,owner Emile I have many files in separate directories, each file name contain a persons name but never in the same spot. I need to find that name which is listed in a large text file in the following format. Last name, comma and First name. 
The last name could be duplicate. Adler, Jack Smith, John Smith, Sally Stone, Mark etc. The file names don't necessary follow any standard format. Smith, John - 02-15-75 - business files.doc Random Data - Adler Jack - expenses.xls More Data Mark Stone files list.doc etc I need some way to pull the name from the file name, find it in the text list and then create a directory based on the name on the list Smith, John and move all files named with the clients name into that directory. -- http://mail.python.org/mailman/listinfo/python-list
Re: is there any principle when writing python function
On 8/27/11 3:21 PM, Emile van Sebille wrote: On 8/27/2011 2:57 PM Ben Finney said... Emile van Sebilleem...@fenx.com writes: Code is first and foremost written to be executed. “Programs must be written for people to read, and only incidentally for machines to execute.” —Abelson Sussman, _Structure and Interpretation of Computer Programs_ That's certainly self-fulfilling -- code that doesn't execute will need to be read to be understood, and to be fixed so that it does run. Nobody cares about code not intended to be executed. Pretty it up as much as you have free time to do so to enlighten your intended audience. Er, you're interpreting the quote... way overboard. No one's talking about code that isn't intended to be executed, I don't think; the quote includes, and only incidentally for machines to execute. That's still the there, and its still important. It should just not be the prime concern while actually writing the code. The code has to actually do something. If not, obviously you'll have to change it. The Pythonic emphasis on doing readable, pretty code isn't JUST about making code that just looks good; its not merely an aesthetic that the community endorses. And although people often tout the very valid reason why readability counts-- that code is often read more then written, and that coming back to a chunk of code 6 months later and being able to understand fully what its doing is very important... that's not the only reason readability counts. Readable, pretty, elegantly crafted code is also far more likely to be *correct* code. However, this: Code that runs from the offset may not ever again need to be read, so the only audience will ever be the processor. I find it much to easy to waste enormous amounts of time prettying up code that works. Pretty it up when it doesn't -- that's the code that needs the attention. ... seems to me to be a rather significant self-fulfilling prophecy in its own right. 
The chances that the code does what its supposed to do, accurately, and without any bugs, goes down in my experience quite significantly the farther away from pretty it is. If you code some crazy, overly clever, poorly organized, messy chunk of something that /works/ -- that's fine and dandy. But unless you have some /seriously/ comprehensive test coverage then the chances that you can eyeball it and be sure it doesn't have some subtle bugs that will call you back to fix it later, is pretty low. In my experience. Its not that pretty code is bug-free, but code which is easily read and understood is vastly more likely to be functioning correctly and reliably. Also... it just does not take that much time to make pretty code. It really doesn't. The entire idea that its hard, time-consuming, effort-draining or difficult to make code clean and pretty from the get-go is just wrong. You don't need to do a major prettying up stage after the fact. Sure, sometimes refactoring would greatly help a body of code as it evolves, but you can do that as it becomes beneficial for maintenance reasons and not just for pretty's sake. -- Stephen Hansen ... Also: Ixokai ... Mail: me+list/python (AT) ixokai (DOT) io ... Blog: http://meh.ixokai.io/ signature.asc Description: OpenPGP digital signature -- http://mail.python.org/mailman/listinfo/python-list
Re: Arrange files according to a text file
On 8/27/11 11:06 AM, Emile van Sebille wrote:

    from difflib import SequenceMatcher as SM

    def ignore(x):
        return x in ' ,.'

    for filename in filenames:
        ratios = [SM(ignore,filename,username).ratio() for username in usernames]
        best = max(ratios)
        owner = usernames[ratios.index(best)]
        print filename,":",owner

It amazes me that I can still find a surprising new tool in the stdlib after all these years. Neat. /pinboards

-- Stephen Hansen ... Also: Ixokai ... Mail: me+list/python (AT) ixokai (DOT) io ... Blog: http://meh.ixokai.io/ signature.asc Description: OpenPGP digital signature -- http://mail.python.org/mailman/listinfo/python-list
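For readers on Python 3, the same idea can be sketched as a small function. This is only an illustration of the approach quoted above, not a tested solution for the 25k-name list; best_owner is a made-up name, and the sample data is the handful of names and file names given in this thread.

```python
from difflib import SequenceMatcher


def ignore(ch):
    # Treat spaces, commas and periods as junk, as in the sketch above.
    return ch in ' ,.'


def best_owner(filename, usernames):
    """Return the username with the highest SequenceMatcher ratio against filename."""
    return max(
        usernames,
        key=lambda name: SequenceMatcher(ignore, filename, name).ratio(),
    )


usernames = ["Adler, Jack", "Smith, John", "Smith, Sally", "Stone, Mark"]
print(best_owner("Smith, John - 02-15-75 - business files.doc", usernames))
print(best_owner("Random Data - Adler Jack - expenses.xls", usernames))
```

With a very large name list, near-ties become more likely, so it may be worth printing the ratio next to the chosen name and checking the low scores by hand.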
Re: Understanding .pth in site-packages
On 8/27/11 3:41 PM, Josh English wrote: I have .egg files in my system path. The Egg file created by my setup script doesn't include anything but the introductory text. If I open other eggs I see the zipped data, but not for my own files. Sounds like your setup.py isn't actually including your source. Is having a zipped egg file any faster than a regular package? or does it just prevent people from seeing the code? IIUC, its nominally very slightly faster to use an egg, because it can skip a lot of filesystem calls. But I've only heard that and can't completely confirm it (internal testing at my day job did not conclusively support this, but our environments are uniquely weird). But that speed boost (if even true) isn't really the point of eggs-as-files -- eggs are just easy to deal with as files is all. They don't prevent people from seeing the code*, they're just regular zip files and can be unzipped fine. I almost always install unzip my eggs on a developer machine, because I inevitably want to go poke inside and see what's actually going on. -- Stephen Hansen ... Also: Ixokai ... Mail: me+list/python (AT) ixokai (DOT) io ... Blog: http://meh.ixokai.io/ * Although you can make an egg and then go and remove all the .PY files from it, and leave just the compiled .PYC files, and Python will load it fine. At the day job, that's what we do. But, you have to be aware that this ties the egg to a specific version of Python, and its not difficult for someone industrious to disassemble and/or decompile the PYC back to effectively equivalent PY files to edit away if they want. signature.asc Description: OpenPGP digital signature -- http://mail.python.org/mailman/listinfo/python-list
Re: Arrange files according to a text file
On 28/08/2011 00:18, r...@rdo.python.org wrote: Thank you so much. The code worked perfectly. This is what I tried using Emile code. The only time when it picked wrong name from the list was when the file was named like this. Data Mark Stone.doc How can I fix this? Hope I am not asking too much? Have you tried the alternative word orders, Mark Stone as well as Stone, Mark, picking whichever name has the best ratio for either? import os from difflib import SequenceMatcher as SM path = r'D:\Files ' txt_names = [] with open(r'D:/python/log1.txt') as f: for txt_name in f.readlines(): txt_names.append(txt_name.strip()) def ignore(x): return x in ' ,.' for filename in os.listdir(path): ratios = [SM(ignore,filename,txt_name).ratio() for txt_name in txt_names] best = max(ratios) owner = txt_names[ratios.index(best)] print filename,:,owner On Sat, 27 Aug 2011 14:08:17 -0700, Emile van Sebilleem...@fenx.com wrote: On 8/27/2011 1:15 PM r...@rdo.python.org said... Hello Emile , Thank you for the code below as I have not encountered SequenceMatcher before and would have to take a look at it closer. My question would it work for a text file list of names about 25k lines and a directory with say 100 files inside? Sure. Emile Thank you once again. On Sat, 27 Aug 2011 11:06:22 -0700, Emile van Sebilleem...@fenx.com wrote: On 8/27/2011 10:03 AM r...@rdo.python.org said... Hello, What would be the best way to accomplish this task? I'd do something like: usernames = Adler, Jack Smith, John Smith, Sally Stone, Mark.split('\n') filenames = Smith, John - 02-15-75 - business files.doc Random Data - Adler Jack - expenses.xls More Data Mark Stone files list.doc.split('\n') from difflib import SequenceMatcher as SM def ignore(x): return x in ' ,.' 
for filename in filenames: ratios = [SM(ignore,filename,username).ratio() for username in usernames] best = max(ratios) owner = usernames[ratios.index(best)] print filename,:,owner Emile I have many files in separate directories, each file name contain a persons name but never in the same spot. I need to find that name which is listed in a large text file in the following format. Last name, comma and First name. The last name could be duplicate. Adler, Jack Smith, John Smith, Sally Stone, Mark etc. The file names don't necessary follow any standard format. Smith, John - 02-15-75 - business files.doc Random Data - Adler Jack - expenses.xls More Data Mark Stone files list.doc etc I need some way to pull the name from the file name, find it in the text list and then create a directory based on the name on the list Smith, John and move all files named with the clients name into that directory. -- http://mail.python.org/mailman/listinfo/python-list
Re: Arrange files according to a text file
On Sun, 28 Aug 2011 00:48:20 +0100, MRAB pyt...@mrabarnett.plus.com wrote: On 28/08/2011 00:18, r...@rdo.python.org wrote: Thank you so much. The code worked perfectly. This is what I tried using Emile code. The only time when it picked wrong name from the list was when the file was named like this. Data Mark Stone.doc How can I fix this? Hope I am not asking too much? Have you tried the alternative word orders, Mark Stone as well as Stone, Mark, picking whichever name has the best ratio for either? Yes I tried and the result was the same. I will try to work out something. thank you. import os from difflib import SequenceMatcher as SM path = r'D:\Files ' txt_names = [] with open(r'D:/python/log1.txt') as f: for txt_name in f.readlines(): txt_names.append(txt_name.strip()) def ignore(x): return x in ' ,.' for filename in os.listdir(path): ratios = [SM(ignore,filename,txt_name).ratio() for txt_name in txt_names] best = max(ratios) owner = txt_names[ratios.index(best)] print filename,:,owner On Sat, 27 Aug 2011 14:08:17 -0700, Emile van Sebilleem...@fenx.com wrote: On 8/27/2011 1:15 PM r...@rdo.python.org said... Hello Emile , Thank you for the code below as I have not encountered SequenceMatcher before and would have to take a look at it closer. My question would it work for a text file list of names about 25k lines and a directory with say 100 files inside? Sure. Emile Thank you once again. On Sat, 27 Aug 2011 11:06:22 -0700, Emile van Sebilleem...@fenx.com wrote: On 8/27/2011 10:03 AM r...@rdo.python.org said... Hello, What would be the best way to accomplish this task? I'd do something like: usernames = Adler, Jack Smith, John Smith, Sally Stone, Mark.split('\n') filenames = Smith, John - 02-15-75 - business files.doc Random Data - Adler Jack - expenses.xls More Data Mark Stone files list.doc.split('\n') from difflib import SequenceMatcher as SM def ignore(x): return x in ' ,.' 
for filename in filenames: ratios = [SM(ignore,filename,username).ratio() for username in usernames] best = max(ratios) owner = usernames[ratios.index(best)] print filename,:,owner Emile I have many files in separate directories, each file name contain a persons name but never in the same spot. I need to find that name which is listed in a large text file in the following format. Last name, comma and First name. The last name could be duplicate. Adler, Jack Smith, John Smith, Sally Stone, Mark etc. The file names don't necessary follow any standard format. Smith, John - 02-15-75 - business files.doc Random Data - Adler Jack - expenses.xls More Data Mark Stone files list.doc etc I need some way to pull the name from the file name, find it in the text list and then create a directory based on the name on the list Smith, John and move all files named with the clients name into that directory. -- http://mail.python.org/mailman/listinfo/python-list
packaging a python application
Hi I created a python application which consists of multiple python files and a configuration file. I am not sure, how can I distribute it. I read distutils2 documentation and a few blogs on python packaging. But I still have the following questions. 1. My package has a configuration file which has to be edited by the user. How do we achieve that? 2. Should the user directly edit the configuration file, or there would be an interface for doing it...?(I remember my sendmail installations in Debian/Ubuntu. It would ask a bunch of questions and the cfg file would be ready) I am just confused how to go about... thanks suresh -- http://mail.python.org/mailman/listinfo/python-list
Re: Record seperator
On 8/27/2011 5:07 PM, Roy Smith wrote:
In article mailman.477.1314475482.27778.python-l...@python.org, Terry Reedy tjre...@udel.edu wrote:
On 8/27/2011 1:45 PM, Roy Smith wrote:
In article 4e592852$0$29965$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:

    open("file.txt")   # opens the file
    .read()            # reads the contents of the file
    .split("\n\n")     # splits the text on double-newlines.

The biggest problem with this code is that read() slurps the entire file into a string. That's fine for moderately sized files, but will fail (or at least be grossly inefficient) for very large files.

I read the above as separating the file into paragraphs, as indicated by blank lines.

    def paragraphs(file):
        para = []
        for line in file:
            if line:
                para.append(line)
            else:
                yield para  # or ''.join(para), as desired
                para = []

Plus or minus the last paragraph in the file :-)

Oh right, I forgot the last line, which is a repeat of the yield after the for loop finishes.

-- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
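Putting the two remarks together, a version with the final yield might look like the sketch below. Two details are my own assumptions rather than anything stated in the thread: lines read from a file keep their trailing newline, so the blank-line test uses line.strip(), and runs of several blank lines are treated as a single separator.

```python
import io


def paragraphs(lines):
    """Yield each blank-line-separated paragraph as a list of lines."""
    para = []
    for line in lines:
        if line.strip():      # non-blank line: part of the current paragraph
            para.append(line)
        elif para:            # blank line closes a paragraph; repeats are skipped
            yield para
            para = []
    if para:                  # the last paragraph has no blank line after it
        yield para


sample = io.StringIO("first\nstill first\n\n\nsecond\n")
for para in paragraphs(sample):
    print(''.join(para), end='---\n')
```

Because it iterates over the file object line by line, it never holds more than one paragraph in memory.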
Re: Arrange files according to a text file
On 8/27/2011 4:18 PM r...@rdo.python.org said... Thank you so much. The code worked perfectly. This is what I tried using Emile code. The only time when it picked wrong name from the list was when the file was named like this. Data Mark Stone.doc How can I fix this? Hope I am not asking too much? What name did it pick? I imagine if you're picking a name from a list of 25000 names that some subset of combinations may yield like ratios. But, if you double up on the file name side you may get closer: for filename in filenames: ratios = [SM(ignore,filename+filename,username).ratio() for username in usernames] best = max(ratios) owner = usernames[ratios.index(best)] print filename,:,owner ... on the other hand, if you've only got a 100 files to sort out, you should already be done. :) Emile -- http://mail.python.org/mailman/listinfo/python-list
Re: Record seperator
http://stromberg.dnsalias.org/svn/bufsock/trunk does it.

$ cat double-file
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
root:x:0:0:root:/root:/bin/bash
lp:x:7:7:lp:/var/spool/lpd:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
uucp:x:10:10:uucp:/var/spool/uucp:/bin/sh
proxy:x:13:13:proxy:/bin:/bin/sh
benchbox-dstromberg:~/src/home-svn/bufsock/trunk i686-pc-linux-gnu 8830 - above cmd done 2011 Sat Aug 27 06:19 PM

$ python
Python 2.6.6 (r266:84292, Sep 15 2010, 15:52:39)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import bufsock
>>> file_ = open('double-file', 'rb')
>>> bs = bufsock.bufsock(file_)
>>> bs.readto('oo')
'daemon:x:1:1:daemon:/usr/sbin:/bin/sh\nbin:x:2:2:bin:/bin:/bin/sh\nsys:x:3:3:sys:/dev:/bin/sh\nsync:x:4:65534:sync:/bin:/bin/sync\ngames:x:5:60:games:/usr/games:/bin/sh\nman:x:6:12:man:/var/cache/man:/bin/sh\nroo'
>>> bs.close()

Don't let the name fool you - it's not just for sockets anymore.

On Fri, Aug 26, 2011 at 11:39 AM, greymaus greyma...@mail.com wrote:
Is there an equivalent for the AWK RS in Python? as in RS='\n\n' will separate a file at two blank line intervals
-- maus . . ... NO CARRIER

-- http://mail.python.org/mailman/listinfo/python-list
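If pulling in an extra module is not an option, the AWK RS behaviour can be approximated with the standard library alone. The sketch below is not bufsock and does not use its API; records, sep and chunk_size are names chosen here for illustration. It reads fixed-size chunks and keeps the unfinished tail in a buffer, so a separator that straddles two chunks is still found.

```python
import io


def records(stream, sep="\n\n", chunk_size=4096):
    """Yield sep-delimited records from a text stream, reading it in chunks."""
    buf = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        parts = buf.split(sep)
        buf = parts.pop()     # the last piece may be incomplete; keep it for later
        for part in parts:
            yield part
    if buf:                   # whatever follows the final separator
        yield buf


for rec in records(io.StringIO("a\nb\n\nc\n\nd"), chunk_size=3):
    print(repr(rec))
```

The tiny chunk_size in the usage lines is only there to exercise the chunk-boundary case; a real file would use the default or something larger.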
Custom dict to prevent keys from being overridden
Hi, With a simple dict, the following happens: d = { ... 'a': 1, ... 'b': 2, ... 'a': 3 ... } d {'a': 3, 'b': 2} ... i.e. the value for the 'a' key gets overridden. What I'd like to achieve is: d = { ... 'a': 1, ... 'b': 2, ... 'a': 3 ... } Error: The key 'a' already exists. Is that possible, and if so, how? Many thanks! Kind regards, Julien -- http://mail.python.org/mailman/listinfo/python-list
Re: Custom dict to prevent keys from being overridden
Julien wrote:

What I'd like to achieve is:

    >>> d = {
    ...     'a': 1,
    ...     'b': 2,
    ...     'a': 3
    ... }
    Error: The key 'a' already exists.

Is that possible, and if so, how?

Not if the requirements include using built-in dicts { }. But if you are happy enough to use a custom class, like this:

    d = StrictDict([('a', 1), ('b', 2), ('a', 3)])

then yes. Just subclass dict and have it validate items as they are added. Something like:

    # Untested
    class StrictDict(dict):
        def __init__(self, items):
            for key, value in items:
                self[key] = value
        def __setitem__(self, key, value):
            if key in self:
                raise KeyError('key %r already exists' % key)
            super(StrictDict, self).__setitem__(key, value)

should more or less do it.

-- Steven -- http://mail.python.org/mailman/listinfo/python-list
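A slightly filled-out version of that sketch, written for Python 3, could look like this. The default items=() argument and the (key,) tuple in the error message are my additions so that empty construction and tuple keys behave; treat it as one possible reading of the suggestion, not the definitive class.

```python
class StrictDict(dict):
    """A dict that refuses to rebind an existing key."""

    def __init__(self, items=()):
        super().__init__()
        for key, value in items:
            self[key] = value       # goes through the checked __setitem__

    def __setitem__(self, key, value):
        if key in self:
            raise KeyError('key %r already exists' % (key,))
        super().__setitem__(key, value)


d = StrictDict([('a', 1), ('b', 2)])
try:
    d['a'] = 3
except KeyError as exc:
    print('rejected:', exc)
```

One caveat: dict.update() on a subclass does not call the overridden __setitem__, so d.update(a=3) would still overwrite silently unless update() is overridden as well.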
Re: Understanding .pth in site-packages
On Aug 27, 2011, at 6:49 PM, Josh English wrote: When I run: os.listdir('c:\Python27\lib\site-packages') I get the contents in order, so the folders come before .pth files (as nothing comes before something.) That's one definition of in order. =) I would guess Python is using os.listdir. Why wouldn't it? If you mean that Python uses os.listdir() during import resolution, then yes I agree that's probable. And os.listdir() doesn't guarantee any consistent order. In fact, the documentation explicitly states that the list is returned in arbitrary order. Like a lot of things in Python, os.listdir() probably relies on the underlying C library which varies from system to system. (Case in point -- on my Mac, os.listdir() returns things in the same order as the 'ls' command, which is case-sensitive alphabetical, files and directories mixed -- different from Windows.) So if import relies on os.listdir(), then you're relying on arbitrary resolution when you have a .pth file that shadows a site-packages directory. Those rules will probably work consistently on your particular system, but you're developing a habit around what is essentially an implementation quirk. Cheers Philip -- http://mail.python.org/mailman/listinfo/python-list
Why do closures do this?
Somewhat apropos of the recent function principle thread, I was recently surprised by this:

    funcs=[]
    for n in range(3):
        def f():
            return n
        funcs.append(f)

    [i() for i in funcs]

The last expression, IMO surprisingly, is [2,2,2], not [0,1,2]. Google tells me I'm not the only one surprised, but explains that it's because n in the function f refers to whatever n is currently bound to, not what it was bound to at definition time (if I've got that right), and that there are at least two ways around it: either make a factory function:

    def mkfnc(n):
        def fnc():
            return n
        return fnc

    funcs=[]
    for n in range(3):
        funcs.append(mkfnc(n))

which seems roundabout, or take advantage of the default values set at definition time behaviour:

    funcs=[]
    for n in range(3):
        def f(n=n):
            return n
        funcs.append(f)

which seems obscure, and a side-effect.

My question is, is this an inescapable consequence of using closures, or is it by design, and if so, what are some examples of where this would be the preferred behaviour?

Regards, John -- http://mail.python.org/mailman/listinfo/python-list
Re: Why do closures do this?
On 8/27/2011 11:45 PM, John O'Hagan wrote: Somewhat apropos of the recent function principle thread, I was recently surprised by this: funcs=[] for n in range(3): def f(): return n funcs.append(f) The last expression, IMO surprisingly, is [2,2,2], not [0,1,2]. Google tells me I'm not the only one surprised, but explains that it's because n in the function f refers to whatever n is currently bound to, not what it was bound to at definition time (if I've got that right), and that there are at least two ways around it: either make a factory function: def f(): return n is a CONSTANT value. It is not a closure. Your code above is the same as def f(): return n funcs = [f,f,f] n = 2 [i() for i in funcs] def mkfnc(n): def fnc(): return n return fnc fnc is a closure and n in a nonlocal name. Since you only read it, no nonlocal declaration is needed. funcs=[] for n in range(3): funcs.append(mkfnc(n)) which seems roundabout, or take advantage of the default values set at definition time behaviour: funcs=[] for n in range(3): def f(n=n): return n funcs.append(f) which seems obscure, and a side-effect. It was the standard idiom until nested functions were upgraded to enclose or capture the values of nonlocals. My question is, is this an inescapable consequence of using closures, I cannot answer since I am not sure what you mean by 'this'. Closures are nested functions that access the locals of enclosing functions. To ensure that the access remains possible even after the enclosing function returns, the last value of such accessed names is preserved even after the enclosing function returns. (That is the tricky part.) -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
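To see the three variants from this thread side by side, here is a small runnable comparison. The wrapper function names (late_binding, with_factory, with_default) are made up for the demonstration; the bodies follow the snippets quoted above.

```python
def late_binding():
    funcs = []
    for n in range(3):
        def f():
            return n          # n is looked up when f is called, not when defined
        funcs.append(f)
    return [g() for g in funcs]


def with_factory():
    def mkfnc(n):
        def fnc():
            return n          # each call to mkfnc has its own n
        return fnc
    return [g() for g in [mkfnc(n) for n in range(3)]]


def with_default():
    funcs = []
    for n in range(3):
        def f(n=n):           # the default is evaluated once, at definition time
            return n
        funcs.append(f)
    return [g() for g in funcs]


print(late_binding(), with_factory(), with_default())
```

The first variant returns [2, 2, 2] because all three functions share one variable n, which holds 2 once the loop has finished; the other two capture a separate value per function.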
Re: is there any principle when writing python function
smith jack wrote: i have heard that function invocation in python is expensive, but make lots of functions are a good design habit in many other languages, so is there any principle when writing python function? for example, how many lines should form a function? Once Abraham Lincoln was asked how long a man's legs should be. (Well, he was a tall man and had exceptionally long legs... his bed had to be specially made.) Old Abe said, A man's legs ought to be long enough to reach from his body to the floor. One time the Austrian Emperor decided that one of Wolfgang Amadeus Mozart's masterpieces contained too many notes... when asked how many notes a masterpiece ought to contain it is reported that Mozart retorted, I use precisely as many notes as the piece requires, not one note more, and not one note less. After starting the python interpreter import this: import this ... study carefully. If you're not Dutch, don't worry if some of it confuses you. ... apply liberally to your function praxis. kind regards, -- m harris FSF ...free as in freedom/ http://webpages.charter.net/harrismh777/gnulinux/gnulinux.htm -- http://mail.python.org/mailman/listinfo/python-list
Re: Why do closures do this?
On Sun, 28 Aug 2011 00:19:07 -0400 Terry Reedy tjre...@udel.edu wrote: On 8/27/2011 11:45 PM, John O'Hagan wrote: Somewhat apropos of the recent function principle thread, I was recently surprised by this: funcs=[] for n in range(3): def f(): return n funcs.append(f) The last expression, IMO surprisingly, is [2,2,2], not [0,1,2]. [...] def f(): return n is a CONSTANT value. It is not a closure. Quite right: I originally encountered this inside a function, but removed the enclosing function to show the issue in minimal form. Your code above is the same as def f(): return n funcs = [f,f,f] n = 2 [i() for i in funcs] Also right, but I still find this surprising. [...] My question is, is this an inescapable consequence of using closures, I cannot answer since I am not sure what you mean by 'this'. Ah, but you are and you have: Closures are nested functions that access the locals of enclosing functions. To ensure that the access remains possible even after the enclosing function returns, the last value of such accessed names is preserved even after the enclosing function returns. (That is the tricky part.) Thanks, John -- http://mail.python.org/mailman/listinfo/python-list
Re: Understanding .pth in site-packages
Josh English wrote:

> OKB,
>
> The setup.py script created the egg, but not the .pth file. I created that myself.
>
> Thank you for clarifying about how .pth works. I know "redirect imports" was the wrong phrase, but it worked in my head at the time.
>
> It appears, at least on my system, that Python will find site-packages/foo before it finds and reads site-packages/foo.pth.
>
> At least this solution gives me a way to develop my libraries outside of site-packages.

Well, I'm still not totally sure what your setup is, but assuming site-packages/foo is a directory containing an __init__.py (that is, it is a package), then yes, it will be found before an alternative package in a directory named with a .pth file.

Note that I don't say it will be found before the .pth file, because, again, the finding of the package (when you do import foo) happens much later than the processing of the .pth file. So it doesn't find site-packages/foo before it reads foo.pth; it just finds site-packages/foo before it finds the other foo that foo.pth was trying to point to.

Let's say your .pth file specifies the directory /elsewhere. The .pth file is processed by site.py when the interpreter starts up, and at that time /elsewhere will be appended to sys.path. Later, when you do the import, it searches sys.path in order. site-packages itself will be earlier in sys.path than /elsewhere, so a package site-packages/foo will be found before /elsewhere/foo. The key here is that the .pth file is processed at interpreter-start time, but the search for foo doesn't take place until you actually execute import foo.

If you want to make your /elsewhere jump the line and go to the front, look at easy_install.pth, which seems to have some magic code at the end that moves its eggs ahead of site-packages in sys.path. I'm not sure how this works, though, and it seems like a risky proposition.

--
--OKB (not okblacke)
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail."
        --author unknown
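The search-order point in the reply above can be tried out in isolation. What follows is only a sketch under stated assumptions: the two directories are throwaway temp directories standing in for site-packages and /elsewhere, make_package is a hypothetical helper, and importlib's PathFinder is asked directly rather than going through a real .pth file or a real interpreter start-up.

```python
import os
import tempfile
from importlib.machinery import PathFinder


def make_package(parent, name):
    """Hypothetical helper: create parent/name/__init__.py so name is a package."""
    pkg = os.path.join(parent, name)
    os.makedirs(pkg)
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        f.write("# empty package for the demonstration\n")
    return pkg


# Stand-ins for site-packages and the directory a .pth file would append.
root = tempfile.mkdtemp()
site_packages = os.path.join(root, "site-packages")
elsewhere = os.path.join(root, "elsewhere")
make_package(site_packages, "foo")
make_package(elsewhere, "foo")

# The search walks the list in order, so the earlier entry wins.
spec = PathFinder.find_spec("foo", [site_packages, elsewhere])
found_in = os.path.dirname(os.path.dirname(spec.origin))

# Putting the other directory first ("jumping the line") flips the result.
front_spec = PathFinder.find_spec("foo", [elsewhere, site_packages])
front_found_in = os.path.dirname(os.path.dirname(front_spec.origin))

print("default order finds foo under:", found_in)
print("reordered path finds foo under:", front_found_in)
```

Nothing here touches the real sys.path; it only illustrates that position in the list, not the time at which the .pth file was read, decides which foo is imported.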
Re: Arrange files according to a text file
No, it turned out to be my mistake. Your code was correct and I appreciate it very much. Thank you again.

On Sat, 27 Aug 2011 18:10:07 -0700, Emile van Sebille em...@fenx.com wrote:

> On 8/27/2011 4:18 PM r...@rdo.python.org said...
>> Thank you so much. The code worked perfectly. This is what I tried using Emile's code. The only time it picked the wrong name from the list was when the file was named like this:
>>
>> Data Mark Stone.doc
>>
>> How can I fix this? Hope I am not asking too much?
>
> What name did it pick? I imagine if you're picking a name from a list of 25000 names that some subset of combinations may yield like ratios. But, if you double up on the file name side you may get closer:
>
>     for filename in filenames:
>         ratios = [SM(ignore,filename+filename,username).ratio() for username in usernames]
>         best = max(ratios)
>         owner = usernames[ratios.index(best)]
>         print filename,":",owner
>
> ... on the other hand, if you've only got a 100 files to sort out, you should already be done. :)
>
> Emile
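For readers who want to try the matching idea quoted above, here is a self-contained sketch in Python 3. It assumes that SM in the quoted snippet is difflib.SequenceMatcher and passes None where the quoted code passes ignore (whose definition is not shown in this message); best_owner and the sample names are hypothetical.

```python
from difflib import SequenceMatcher


def best_owner(filename, usernames):
    """Return the username whose similarity ratio against the doubled filename is highest."""
    doubled = filename + filename  # the "double up on the file name side" idea
    ratios = [SequenceMatcher(None, doubled, username).ratio()
              for username in usernames]
    best = max(ratios)
    return usernames[ratios.index(best)]


# Hypothetical sample data, not taken from the thread.
usernames = ["Anna Lee", "Mark Stone"]
owner = best_owner("Data Mark Stone.doc", usernames)
print("Data Mark Stone.doc", ":", owner)
```

With a list of 25000 names, ties and near-ties are likely, so a real script would probably want to inspect the runner-up ratios as well.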
[issue12768] docstrings for the threading module
Graeme Cross gjcr...@gmail.com added the comment: I will check that the patch works with 3.2; if not, I'll redo the patch for 3.2. I will also incorporate the review changes from Ezio and Eric. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12768 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12833] raw_input misbehaves when readline is imported
Nadeem Vawda nadeem.va...@gmail.com added the comment:

Reproduced on 3.3 head. Looking at the documentation of the C readline library, it needs to know the length of the prompt in order to display properly, so this seems to be an acknowledged limitation of the underlying library rather than a bug on our side.

Still, this behavior is surprising and undesirable. I would suggest adding a note to the docs for the readline module, directing users to write:

    input("foo ")

instead of:

    sys.stdout.write("foo ")
    input()

--
nosy: +nadeem.vawda
Python tracker rep...@bugs.python.org http://bugs.python.org/issue12833
[issue12833] raw_input misbehaves when readline is imported
Idan Kamara idank...@gmail.com added the comment:

You're right, as this little C program verifies:

    #include <stdio.h>
    #include <stdlib.h>
    #include <readline/readline.h>

    int main()
    {
        printf("foo ");
        char* buf = readline("");
        free(buf);
        return 0;
    }

Passing ' ' seems to be a suitable workaround for those who can't pass the text directly to raw_input though (such is the case where you have special classes that handle output).

--
Python tracker rep...@bugs.python.org http://bugs.python.org/issue12833
[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL
Amaury Forgeot d'Arc amaur...@gmail.com added the comment:

Unfortunately, it won't work. _dosmaperr() is not exported by msvcrt.dll, it is only available when you link against the static version of the C runtime.

--
Python tracker rep...@bugs.python.org http://bugs.python.org/issue12802
[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug
Tom Christiansen tchr...@perl.com added the comment:

Guido van Rossum rep...@bugs.python.org wrote on Sat, 27 Aug 2011 03:26:21 -:

> To me, making (default) iteration deviate from indexing is anathema.

So long as there's a way to iterate through a string some other way than by code unit, that's fine. However, the Java way of 16-bit code units is so annoying because there often aren't code point APIs, and so you get a lot of niggling errors creeping in. This is part of why I strongly prefer wide builds, so that code point and code unit are the same thing again.

> However, there is nothing wrong with providing a library function that takes a string and returns an iterator that iterates over code points, joining surrogate pairs as needed. You could even have one that iterates over characters (I think Tom calls them graphemes), if that is well-defined and useful.

"Character" can sometimes be a confusing term when it means something different to us programmers than it does to users. "Code point" to mean the integer is a lot clearer to us but to no one else. At work I often just give in and go along with the crowd and say "character" for the number that sits in a char or wchar_t or Character variable, even though of course that's a code point. I only rebel when they start calling code units characters, which (inexperienced) Java people tend to do, because that leads to surrogate splitting and related errors.

By grapheme I mean something the user perceives as a single character. In full Unicodese, this is an extended grapheme cluster. These are code point sequences that start with a grapheme base and have zero or more grapheme extenders following it. For our purposes, that's *mostly* like saying you have a non-Mark followed by any number of Mark code points, the main exception being that a CR followed by a LF also counts as a single grapheme in Unicode.

If you are in an editor and wanted to swap two characters, the one under the user's cursor and the one next to it, you have to deal with graphemes not individual code points, or else you'd get the wrong answer. Imagine swapping the last two characters of the first string below, or the first two characters of the second one:

    contrôlée    contro\x{302}le\x{301}e
    élève        e\x{301}le\x{300}ve

While you can sometimes fake a correct answer by considering things in NFC not NFD, that doesn't work in the general case, as there are only a few compatibility glyphs for round-tripping for legacy encodings (like ISO 8859-1) compared with infinitely many combinations of combining marks. Particularly in mathematics and in phonetics, you often end up using marks on characters for which no pre-combined variant glyph exists. Here's the IPA for a couple of Spanish words with their tight (phonetic, not phonemic) transcriptions:

    anécdota    [a̠ˈne̞ɣ̞ð̞o̞t̪a̠]
    rincón      [rĩŋˈkõ̞n]

    NFD:
    ane\x{301}cdota    [a\x{320}\x{2C8}ne\x{31E}\x{263}\x{31E}\x{F0}\x{31E}o\x{31E}t\x{32A}a\x{320}]
    rinco\x{301}n      [ri\x{303}\x{14B}\x{2C8}ko\x{31E}\x{303}n]

    NFC:
    an\x{E9}cdota      [a\x{320}\x{2C8}ne\x{31E}\x{263}\x{31E}\x{F0}\x{31E}o\x{31E}t\x{32A}a\x{320}]
    rinc\x{F3}n        [r\x{129}\x{14B}\x{2C8}k\x{F5}\x{31E}n]

So combining marks don't just go away in NFC, and you really do have to deal with them. Notice that to get the tabs right (your favorite subject :), you have to deal with print widths, which is another place that you get into trouble if you only count code points.

BTW, did you know that the stress mark used in the phonetics above is actually a (modifier) letter in Unicode, not punctuation?

# uniprops -a 2c8 U+02C8 ‹ˈ› \N{MODIFIER LETTER VERTICAL LINE} \w \pL \p{L_} \p{Lm} All Any Alnum Alpha Alphabetic Assigned InSpacingModifierLetters Case_Ignorable CI Common Zyyy Dia Diacritic L Lm Gr_Base Grapheme_Base Graph GrBase ID_Continue IDC ID_Start IDS Letter L_ Modifier_Letter Print Spacing_Modifier_Letters Word XID_Continue XIDC XID_Start XIDS X_POSIX_Alnum X_POSIX_Alpha X_POSIX_Graph X_POSIX_Print X_POSIX_Word Age=1.1 Bidi_Class=ON Bidi_Class=Other_Neutral BC=ON Block=Spacing_Modifier_Letters Canonical_Combining_Class=0 Canonical_Combining_Class=Not_Reordered CCC=NR Canonical_Combining_Class=NR Script=Common Decomposition_Type=None DT=None East_Asian_Width=Neutral Grapheme_Cluster_Break=Other GCB=XX Grapheme_Cluster_Break=XX Hangul_Syllable_Type=NA Hangul_Syllable_Type=Not_Applicable HST=NA Joining_Group=No_Joining_Group JG=NoJoiningGroup Joining_Type=Non_Joining JT=U Joining_Type=U Line_Break=BB Line_Break=Break_Before LB=BB Numeric_Type=None NT=None Numeric_Value=NaN NV=NaN Present_In=1.1 IN=1.1 Present_In=2.0 IN=2.0 Present_In=2.1 IN=2.1 Present_In=3.0 IN=3.0 Present_In=3.1 IN=3.1 Present_In=3.2 IN=3.2 Present_In=4.0 IN=4.0 Present_In=4.1 IN=4.1 Present_In=5.0 IN=5.0 Present_In=5.1 IN=5.1 Present_In=5.2 IN=5.2 Present_In=6.0 IN=6.0 SC=Zyyy
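
The "non-Mark followed by any number of Marks, plus CR LF" approximation described in the message can be sketched with the standard unicodedata module. This is only that approximation, not the full extended grapheme cluster rules of the Unicode standard, and clusters is a hypothetical helper name.

```python
import unicodedata


def clusters(text):
    """Approximate graphemes: a non-mark plus any following marks; CR LF stays together."""
    out = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        is_crlf = bool(out) and ch == "\n" and out[-1] == "\r"
        if out and (is_mark or is_crlf):
            out[-1] += ch  # extend the current cluster
        else:
            out.append(ch)  # start a new cluster
    return out


word = "e\u0301le\u0300ve"  # "élève" in NFD, 7 code points

# Swapping the first two code points strands the accent at the front.
naive = word[1] + word[0] + word[2:]

# Swapping the first two clusters keeps the accent on its base letter.
parts = clusters(word)
swapped = parts[1] + parts[0] + "".join(parts[2:])

# NFC does not remove every mark: o + U+031E + U+0303 composes only partly.
nfc = unicodedata.normalize("NFC", "o\u031e\u0303")

print(len(word), len(parts), len(nfc), len(clusters(nfc)))
```

A production implementation would follow the segmentation rules in the Unicode standard rather than this shortcut.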
[issue12847] crash with negative PUT in pickle
New submission from Antoine Pitrou pit...@free.fr:

This doesn't happen on 2.x cPickle, where PUT keys are simply treated as strings.

    >>> import pickle, pickletools
    >>> s = b'Va\np-1\n.'
    >>> pickletools.dis(s)
        0: V    UNICODE    'a'
        3: p    PUT        -1
        7: .    STOP
    highest protocol among opcodes = 0
    >>> pickle.loads(s)
    Segmentation fault

--
messages: 143062
nosy: pitrou
priority: normal
severity: normal
status: open
title: crash with negative PUT in pickle
type: crash
versions: Python 3.2, Python 3.3
Python tracker rep...@bugs.python.org http://bugs.python.org/issue12847
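The byte string from the report can be inspected without unpickling it. The sketch below uses pickletools.genops, which only parses opcodes, and then attempts the load inside a try block. The assumption, not stated in the report itself, is that an interpreter with this issue fixed raises an exception at that point; on an affected 3.2 build the load would crash the process instead, so run it only on a current interpreter.

```python
import pickle
import pickletools

data = b"Va\np-1\n."  # the payload from the report: UNICODE 'a', PUT -1, STOP

# genops parses the opcode stream without executing it.
ops = [(op.name, arg) for op, arg, pos in pickletools.genops(data)]
print(ops)

try:
    pickle.loads(data)
    outcome = "loaded"
except Exception as exc:  # assumed behaviour once the negative key is rejected
    outcome = type(exc).__name__

print("pickle.loads outcome:", outcome)
```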
[issue12847] crash with negative PUT in pickle
Antoine Pitrou pit...@free.fr added the comment:

Same with LONG_BINPUT on a 32-bit build:

    >>> s = b'\x80\x03X\x01\x00\x00\x00ar\xff\xff\xff\xff.'
    >>> pickletools.dis(s)
        0: \x80 PROTO       3
        2: X    BINUNICODE  'a'
        8: r    LONG_BINPUT -1
       13: .    STOP
    highest protocol among opcodes = 2
    >>> pickle.loads(s)
    Segmentation fault

--
Python tracker rep...@bugs.python.org http://bugs.python.org/issue12847
[issue11564] pickle not 64-bit ready
Antoine Pitrou pit...@free.fr added the comment:

Here is a new patch against 3.2. I can't say it works for sure, but it should be much better. It also adds a couple more tests.

There seems to be a separate issue where pure-Python pickle.py considers 32-bit lengths signed where the C impl considers them unsigned...

--
Added file: http://bugs.python.org/file23052/pickle64-4.patch
Python tracker rep...@bugs.python.org http://bugs.python.org/issue11564
[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned
New submission from Antoine Pitrou pit...@free.fr:

In several opcodes (BINBYTES, BINUNICODE... what else?), _pickle.c happily accepts 32-bit lengths of more than 2**31, while pickle.py uses marshal's "i" typecode which means "signed"... and therefore fails reading the data. Apparently, pickle.py uses marshal for speed reasons, but marshal doesn't support unsigned types.

(seen from http://bugs.python.org/issue11564)

--
components: Library (Lib)
messages: 143065
nosy: alexandre.vassalotti, pitrou
priority: normal
severity: normal
status: open
title: pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned
type: behavior
versions: Python 3.2, Python 3.3
Python tracker rep...@bugs.python.org http://bugs.python.org/issue12848
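The signed-versus-unsigned disagreement is easy to see on four bytes. The sketch below uses the struct module as a stand-in, since it offers both interpretations side by side; it is not the code path pickle.py actually uses (the report says that is marshal).

```python
import struct

raw = b"\x00\x00\x00\x80"  # little-endian encoding of 2**31

signed = struct.unpack("<i", raw)[0]    # signed 32-bit reading
unsigned = struct.unpack("<I", raw)[0]  # unsigned 32-bit reading

print("signed:  ", signed)
print("unsigned:", unsigned)
```

A reader that takes the signed value as a length ends up with a negative number, which is why the two implementations disagree on data with lengths of 2**31 or more.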
[issue12835] Missing SSLSocket.sendmsg() wrapper allows programs to send unencrypted data by mistake
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset b06f011a3529 by Nick Coghlan in branch 'default':
Fix #12835: prevent use of the unencrypted sendmsg/recvmsg APIs on SSL wrapped sockets (Patch by David Watson)
http://hg.python.org/cpython/rev/b06f011a3529

--
nosy: +python-dev
Python tracker rep...@bugs.python.org http://bugs.python.org/issue12835
[issue12835] Missing SSLSocket.sendmsg() wrapper allows programs to send unencrypted data by mistake
Changes by Nick Coghlan ncogh...@gmail.com:

--
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed
Python tracker rep...@bugs.python.org http://bugs.python.org/issue12835
[issue9923] mailcap module may not work on non-POSIX platforms if MAILCAPS env variable is set
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 7b83d2c1aad9 by Nick Coghlan in branch 'default':
Fix #9923: mailcap now uses the OS path separator for the MAILCAP envvar. Not backported, since it could break cases where people worked around the old POSIX-specific behaviour on non-POSIX platforms.
http://hg.python.org/cpython/rev/7b83d2c1aad9

--
nosy: +python-dev
Python tracker rep...@bugs.python.org http://bugs.python.org/issue9923
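Why the separator matters can be shown without the mailcap module itself. The sketch below is an illustration only: split_mailcaps is a hypothetical helper, not the function the changeset touched, and the paths are made up.

```python
import os


def split_mailcaps(value, sep=os.pathsep):
    """Hypothetical helper: split a MAILCAPS-style value on the given separator."""
    return [part for part in value.split(sep) if part]


# POSIX-style value: ':' separates entries.
posix_entries = split_mailcaps("/etc/mailcap:/usr/etc/mailcap", sep=":")

# Windows-style value: ';' separates entries, and ':' belongs to the drive letter.
windows_value = r"C:\etc\mailcap;D:\mailcap"
windows_entries = split_mailcaps(windows_value, sep=";")

# Splitting the Windows value on ':' cuts the drive letters off the paths.
broken_entries = split_mailcaps(windows_value, sep=":")

print(posix_entries)
print(windows_entries)
print(broken_entries)
```

Leaving sep at its default picks up os.pathsep for whatever platform the code runs on.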
[issue12174] Multiprocessing logging levels unclear
Vinay Sajip vinay_sa...@yahoo.co.uk added the comment:

Although the reference docs don't list the numeric values of logging levels, this happened during reorganising of the docs. The table has moved to the HOWTO:

http://docs.python.org/howto/logging.html#logging-levels

That said, I don't understand the need for special logging levels in the multiprocessing package. From the section following the one linked to above:

"Defining your own levels is possible, but should not be necessary, as the existing levels have been chosen on the basis of practical experience. However, if you are convinced that you need custom levels, great care should be exercised when doing this, and it is possibly *a very bad idea to define custom levels if you are developing a library*. That’s because if multiple library authors all define their own custom levels, there is a chance that the logging output from such multiple libraries used together will be difficult for the using developer to control and/or interpret, because a given numeric value might mean different things for different libraries."

--
nosy: +vinay.sajip
Python tracker rep...@bugs.python.org http://bugs.python.org/issue12174
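The collision the quoted HOWTO passage warns about can be reproduced in a few lines. In this sketch the level number 25 and both level names are made up for illustration; they do not come from the issue.

```python
import logging

# Numeric values of the standard levels.
standard = {name: getattr(logging, name)
            for name in ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")}
print(standard)

# Two hypothetical libraries each register their own name for the same number.
logging.addLevelName(25, "LIBA_NOTICE")
logging.addLevelName(25, "LIBB_VERBOSE")

# Only the most recent registration is reported for that number.
name_for_25 = logging.getLevelName(25)
print(name_for_25)
```

Records logged at level 25 by the first library would now be labelled with the second library's name, which is the kind of confusion the documentation describes.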
[issue9923] mailcap module may not work on non-POSIX platforms if MAILCAPS env variable is set
Nick Coghlan ncogh...@gmail.com added the comment:

As noted in the commit message, I didn't backport this, since it didn't seem worth risking breaking even the unlikely case that someone actually *was* using the MAILCAP environment variable on Windows.

--
resolution: -> fixed
stage: patch review -> committed/rejected
status: open -> closed
versions: -Python 2.7, Python 3.2
Python tracker rep...@bugs.python.org http://bugs.python.org/issue9923
[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL
Vlad Riscutia riscutiav...@gmail.com added the comment:

Oh, got it. Interesting. Then should I just add a comment somewhere or should we resolve this as Won't Fix?

--
Python tracker rep...@bugs.python.org http://bugs.python.org/issue12802
[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL
Antoine Pitrou pit...@free.fr added the comment:

We could add a special case to generrmap.c (but how can I compile and execute this file? it doesn't seem to be part of the project files).

--
Python tracker rep...@bugs.python.org http://bugs.python.org/issue12802
[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation
Tom Christiansen tchr...@perl.com added the comment: Guido van Rossum rep...@bugs.python.org wrote on Fri, 26 Aug 2011 21:11:24 -: Would this also affect .islower() and friends? SHORT VERSION: (7 lines) I don't believe so, but the relationship between lower() and islower() is not as clear to me as I would have thought, and more importantly, the code and the documentation for Python's islower() etc currently seem to disagree. For future releases, I recommend fixing the code, but if compatibility is an issue, then perhaps for previous releases still in maintenance mode fixing only the documentation would possibly be good enough--your call. === MEDIUM VERSION: (87 lines) I was initially confused with Python's islower() family because of the way they are defined to operate on full strings. They don't check that everything is lowercase even though they say they do. http://docs.python.org/py3k/library/stdtypes.html#sequence-types-str-bytes-bytearray-list-tuple-range str.lower() Return a copy of the string with all the cased characters [4] converted to lowercase. str.islower() Return true if all cased characters [4] in the string are lowercase and there is at least one cased character, false otherwise. [4] (1, 2, 3, 4) Cased characters are those with general category property being one of “Lu” (Letter, uppercase), “Ll” (Letter, lowercase), or “Lt” (Letter, titlecase). This is strange in several ways. Of lesser importance is that strings can be considered lowercase even if they don't match ^\p{lowercase}+$ Another is that the result of calling str.lower() may not be .islower(). I'm not sure what these are particularly for, since I myself would just use a regex to get finer-grained control. (I suppose that's because re doesn't give access to the Unicode properties needed that this approach never gained any traction in the Python community.) 

However, the worst of this is that the documentation defines both cased characters and lowercase characters *differently* from how Unicode defines those very same terms. This was quite confusing.

Unicode distinguishes Cased code points from Cased_*Letter* code points. Python is using the Cased_Letter property but calling it Cased. Cased is a proper superset of Cased_Letter. From the DerivedCoreProperties file in the Unicode Character Database:

    # Derived Property: Cased (Cased)
    # As defined by Unicode Standard Definition D120
    # C has the Lowercase or Uppercase property or has a General_Category value of Titlecase_Letter.

In the same way, the Lowercase and Uppercase properties are not the same as the Lowercase_*Letter* and Uppercase_*Letter* properties. Rather, the former are respectively proper supersets of the latter.

    # Derived Property: Lowercase
    # Generated from: Ll + Other_Lowercase
    [...]
    # Derived Property: Uppercase
    # Generated from: Lu + Other_Uppercase

In all these, you almost always want the superset versions not the restricted subset versions you are using. If it were in the regex engine, the user could select either.

Java used to miss all these, too. But in 1.7, they updated their character methods to use the properties that they'd all along said they were using:

http://download.oracle.com/javase/7/docs/api/java/lang/Character.html#isLowerCase(char)

    public static boolean isLowerCase(char ch)

    Determines if the specified character is a lowercase character. A character is lowercase if its general category type, provided by Character.getType(ch), is LOWERCASE_LETTER, or it has contributory property Other_Lowercase as defined by the Unicode Standard.

    Note: This method cannot handle supplementary characters. To support all Unicode characters, including supplementary characters, use the isLowerCase(int) method.

(And yes, that's where Java uses character to mean code unit not code point, alas.
No wonder people get confused) I'm pretty sure that Python needs to either update its documentation to match its code, update its code to match its documentation, or both. Java chose to update the code to match the documentation, and this is the course I would recommend if at all possible. If you say you are checking for cased code points, then you should use the Unicode definition of cased code points not your own, and if you say you are checking for lowercase code points, then you should use the Unicode definition not your own. Both of these require access to contributory properties from the UCD and not just general categories alone. --tom === LONG VERSION: (222 lines) Essential tools I use for inspecting Unicode code points and their properties include
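
The gap between general category and the Lowercase property can be poked at from Python. This is a sketch, not a statement about any particular release: U+02B0 MODIFIER LETTER SMALL H is chosen here as an example of a code point whose general category is Lm, and whether islower() reports it as lowercase depends on how the interpreter in use defines the method, so no result is claimed for that call.

```python
import unicodedata

ch = "\u02b0"  # MODIFIER LETTER SMALL H

# General category: a modifier letter, so not Ll.
category = unicodedata.category(ch)
print(unicodedata.name(ch), category)

# What this interpreter's islower() says about it (version-dependent).
print("islower():", ch.islower())

# lower() does not guarantee islower(): a string with no cased characters fails the test.
lower_is_lower = "123".lower().islower()
print("'123'.lower().islower():", lower_is_lower)
```

The last line illustrates the point made earlier in the message that the result of calling str.lower() may not be .islower().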
[issue10015] Creating a multiprocess.pool.ThreadPool from a child thread blows up.
Changes by Vinay Sajip vinay_sa...@yahoo.co.uk:

--
title: Creating a multiproccess.pool.ThreadPool from a child thread blows up. -> Creating a multiprocess.pool.ThreadPool from a child thread blows up.
Python tracker rep...@bugs.python.org http://bugs.python.org/issue10015
[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL
Antoine Pitrou pit...@free.fr added the comment:

Ok, apparently I can use errmap.mak, except that I get the following error:

    Z:\default\PC>nmake errmap.mak

    Microsoft (R) Program Maintenance Utility Version 9.00.21022.08
    Copyright (C) Microsoft Corporation. All rights reserved.

            cl generrmap.c
    Microsoft (R) C/C++ Optimizing Compiler Version 15.00.21022.08 for x64
    Copyright (C) Microsoft Corporation. All rights reserved.

    generrmap.c
    generrmap.c(1) : fatal error C1034: stdio.h: no include path set
    NMAKE : fatal error U1077: 'C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin\amd64\cl.EXE' : return code '0x2'
    Stop.

--
Python tracker rep...@bugs.python.org http://bugs.python.org/issue12802