date:20110827

Arnaud Delobelle wrote:

 Hi all,
 
 I'm wondering what advice you have about formatting if statements with
 long conditions (I always format my code to 80 columns)
 
 Here's an example taken from something I'm writing at the moment and
 how I've formatted it:
 
 
     if (isinstance(left, PyCompare) and isinstance(right, PyCompare)
             and left.complist[-1] is right.complist[0]):
         py_and = PyCompare(left.complist + right.complist[1:])
     else:
         py_and = PyBooleanAnd(left, right)
 
 What would you do?

I believe that PEP 8 now suggests something like this:

    if (
            isinstance(left, PyCompare) and isinstance(right, PyCompare)
            and left.complist[-1] is right.complist[0]):
        )
        py_and = PyCompare(left.complist + right.complist[1:])
    else:
        py_and = PyBooleanAnd(left, right)


I consider that hideous and would prefer to write this:


    if (isinstance(left, PyCompare) and isinstance(right, PyCompare)
        and left.complist[-1] is right.complist[0]):
        py_and = PyCompare(left.complist + right.complist[1:])
    else:
        py_and = PyBooleanAnd(left, right)


Or even this:

    tmp = (
        isinstance(left, PyCompare) and isinstance(right, PyCompare)
        and left.complist[-1] is right.complist[0]
        )
    if tmp:
        py_and = PyCompare(left.complist + right.complist[1:])
    else:
        py_and = PyBooleanAnd(left, right)


But perhaps the best solution is to define a helper function:

def is_next(left, right):
    """Returns True if right is the next PyCompare to left."""
    return (isinstance(left, PyCompare) and isinstance(right, PyCompare)
        and left.complist[-1] is right.complist[0])
    # PEP 8 version left as an exercise.


# later...
    if is_next(left, right):
        py_and = PyCompare(left.complist + right.complist[1:])
    else:
        py_and = PyBooleanAnd(left, right)

-- 
Steven

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: how to format long if conditions

2011-08-27 Thread Hans Mulder


On 27/08/11 09:08:20, Arnaud Delobelle wrote:

I'm wondering what advice you have about formatting if statements with
long conditions (I always format my code to 80 columns)

Here's an example taken from something I'm writing at the moment and
how I've formatted it:


    if (isinstance(left, PyCompare) and isinstance(right, PyCompare)
            and left.complist[-1] is right.complist[0]):
        py_and = PyCompare(left.complist + right.complist[1:])
    else:
        py_and = PyBooleanAnd(left, right)

What would you do?


I would break after the '(' and indent the condition once and
put the '):' bit on a separate line, aligned with the 'if':


    if (
        isinstance(left, PyCompare)
        and isinstance(right, PyCompare)
        and left.complist[-1] is right.complist[0]
    ):
        py_and = PyCompare(left.complist + right.complist[1:])
    else:
        py_and = PyBooleanAnd(left, right)

It may look ugly, but it's very clear where the condition part ends
and the 'then' part begins.

-- HansM

Re: Python IDE/Eclipse

2011-08-27 Thread UncleLaz

On Aug 26, 5:18 pm, Dave Boland dbola...@fastmail.fm wrote:
 I'm looking for a good IDE -- easy to setup, easy to use -- for Python.
   Any suggestions?

 I use Eclipse for other projects and have no problem with using it for
 Python, except that I can't get PyDev to install.  It takes forever,
 then produces an error that makes no sense.

 An error occurred while installing the items
    session context was:(profile=PlatformProfile,
 phase=org.eclipse.equinox.internal.provisional.p2.engine.phases.Install,
 operand=null -- [R]org.eclipse.cvs 1.0.400.v201002111343,
 action=org.eclipse.equinox.internal.p2.touchpoint.eclipse.actions.InstallBundleAction).
    Cannot connect to keystore.
    This trust engine is read only.
    The artifact file for
 osgi.bundle,org.eclipse.cvs,1.0.400.v201002111343 was not found.

 Any suggestions on getting this to work?

 Thanks,
 Dave

I use Aptana Studio 3; it's pretty good, and it's based on Eclipse.

Re: how to format long if conditions

Hans Mulder wrote:

[...]
 It may look ugly, but it's very clear where the condition part ends
 and the 'then' part begins.

Immediately after the colon, surely?



-- 
Steven


Re: Catch and name an exception in Python 2.5 +

2011-08-27 Thread Thomas Jollans

On 27/08/11 05:45, Steven D'Aprano wrote:
 Thomas Jollans wrote:
 
 On 26/08/11 21:56, Steven D'Aprano wrote:
 
 Is there any way to catch an exception and bind it to a name which will
 work across all Python versions from 2.5 onwards?

 I'm pretty sure there isn't, but I thought I'd ask just in case.

 It's not elegant, and I haven't actually tested this, but this should
 work:

     try:
         ...
     except (ValueError, KeyError):
         error = sys.exc_info()[2]
 
 Great! Thanks for that, except I think you want to use [1], not [2].

Ah, yes. Of course.
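
For the archive, here is a minimal sketch of the corrected idiom. The
function name parse_port is made up for illustration; the only point is
that sys.exc_info()[1] is the exception instance, and that this spelling
needs neither "except E, name" nor "except E as name".

```python
import sys

def parse_port(text):
    """Return (value, None) on success or (None, exception) on failure."""
    try:
        return int(text), None
    except (ValueError, TypeError):
        # sys.exc_info() returns (type, value, traceback); index 1 is the
        # exception instance, index 2 is the traceback object.
        error = sys.exc_info()[1]
        return None, error

value, error = parse_port("eighty")
print("%r %s" % (value, type(error).__name__))
```

Holding on to the traceback (index 2) can keep frames alive longer than
expected, which is one more reason to bind only the instance.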

Re: how to format long if conditions

2011-08-27 Thread Arnaud Delobelle

On 27 August 2011 08:24, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 Arnaud Delobelle wrote:

 Hi all,

 I'm wondering what advice you have about formatting if statements with
 long conditions (I always format my code to 80 columns)

 Here's an example taken from something I'm writing at the moment and
 how I've formatted it:


         if (isinstance(left, PyCompare) and isinstance(right, PyCompare)
                 and left.complist[-1] is right.complist[0]):
             py_and = PyCompare(left.complist + right.complist[1:])
         else:
             py_and = PyBooleanAnd(left, right)

 What would you do?

 I believe that PEP 8 now suggests something like this:

        if (
                isinstance(left, PyCompare) and isinstance(right, PyCompare)
                and left.complist[-1] is right.complist[0]):
            )
            py_and = PyCompare(left.complist + right.complist[1:])
        else:
            py_and = PyBooleanAnd(left, right)


 I consider that hideous and would prefer to write this:


        if (isinstance(left, PyCompare) and isinstance(right, PyCompare)
            and left.complist[-1] is right.complist[0]):
            py_and = PyCompare(left.complist + right.complist[1:])
        else:
            py_and = PyBooleanAnd(left, right)


 Or even this:

        tmp = (
            isinstance(left, PyCompare) and isinstance(right, PyCompare)
            and left.complist[-1] is right.complist[0]
            )
        if tmp:
            py_and = PyCompare(left.complist + right.complist[1:])
        else:
            py_and = PyBooleanAnd(left, right)


 But perhaps the best solution is to define a helper function:

 def is_next(left, right):
    """Returns True if right is the next PyCompare to left."""
    return (isinstance(left, PyCompare) and isinstance(right, PyCompare)
        and left.complist[-1] is right.complist[0])
    # PEP 8 version left as an exercise.


 # later...
        if is_next(left, right):
            py_and = PyCompare(left.complist + right.complist[1:])
        else:
            py_and = PyBooleanAnd(left, right)


Thanks Steven and Hans for your suggestions.  For this particular
instance I've decided to go for a hybrid approach:

* Add two methods to PyCompare:

    class PyCompare(PyExpr):
        ...
        def extends(self, other):
            if not isinstance(other, PyCompare):
                return False
            else:
                return self.complist[0] == other.complist[-1]
        def chain(self, other):
            return PyCompare(self.complist + other.complist[1:])

* Rewrite the if as:

    if isinstance(right, PyCompare) and right.extends(left):
        py_and = left.chain(right)
    else:
        py_and = PyBooleanAnd(left, right)


The additional benefit is to hide the implementation details of
PyCompare (I suppose this could illustrate the thread on when to
create functions).
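
In case a runnable version is useful, here is a rough sketch of that
hybrid approach. PyExpr and PyBooleanAnd are stand-ins, since the real
classes are not shown in this thread; representing complist as a flat
list of operands and operators is an assumption, and make_and is a
hypothetical wrapper around the rewritten if statement.

```python
class PyExpr(object):
    """Stand-in base class; the real PyExpr is not shown in the thread."""

class PyBooleanAnd(PyExpr):
    def __init__(self, left, right):
        self.left = left
        self.right = right

class PyCompare(PyExpr):
    def __init__(self, complist):
        self.complist = list(complist)

    def extends(self, other):
        """True if self starts where the comparison chain `other` ends."""
        return (isinstance(other, PyCompare)
                and self.complist[0] == other.complist[-1])

    def chain(self, other):
        """Join two chains, dropping the operand they share."""
        return PyCompare(self.complist + other.complist[1:])

def make_and(left, right):
    # Hypothetical helper wrapping the short condition from the post.
    if isinstance(right, PyCompare) and right.extends(left):
        return left.chain(right)
    return PyBooleanAnd(left, right)

# "a < b" and "b < c" share the operand "b", so they chain into "a < b < c".
merged = make_and(PyCompare(["a", "<", "b"]), PyCompare(["b", "<", "c"]))
print(merged.complist)
```

Note that this version compares the shared operand with == where the
original condition used "is"; whether that matters depends on how the
operands are represented.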

-- 
Arnaud

Re: how to format long if conditions

2011-08-27 Thread Hans Mulder


On 27/08/11 11:05:25, Steven D'Aprano wrote:

Hans Mulder wrote:

[...]

It may look ugly, but it's very clear where the condition part ends
and the 'then' part begins.


Immediately after the colon, surely?


On the next line, actually :-)

The point is that this layout makes it very clear that the colon
isn't in its usual position (at the end of the line that starts
with 'if') and it is clearly visible.

With the layout Arnaud originally proposed, finding the colon takes
longer.  (Arnaud has since posted a better approach, in which the
colon is back in its usual position.)

-- HansM


Re: Run time default arguments

2011-08-27 Thread Carl Banks

On Thursday, August 25, 2011 1:54:35 PM UTC-7, ti...@thsu.org wrote:
 On Aug 25, 10:35 am, Arnaud Delobelle arn...@gmail.com wrote:
  You're close to the usual idiom:
 
  def doSomething(debug=None):
      if debug is None:
          debug = defaults['debug']
      ...
 
  Note the use of 'is' rather than '=='
  HTH
 
 Hmm, from what you are saying, it seems like there's no elegant way to
 handle run time defaults for function arguments, meaning that I should
 probably write a SQL-esque coalesce function to keep my code cleaner. I
 take it that most people who run into this situation do this?

I don't; it seems kind of superfluous when "if arg is None: arg = whatever"
is just as easy to type and more straightforward to read.

I could see a function like coalesce being helpful if you have a list of
several options to check, though.  Also, SQL doesn't give you a lot of
flexibility, so coalesce is a lot more needed there.

But for simple arguments in Python, I'd recommend sticking with "if arg is
None: arg = whatever".
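
To make the comparison concrete, here is a small sketch of both
spellings. The defaults dict mirrors the earlier example in this thread;
coalesce and the two function names are illustrative, not from any
library.

```python
defaults = {"debug": False}

def coalesce(*values):
    """Return the first argument that is not None (None if all are)."""
    for value in values:
        if value is not None:
            return value
    return None

def do_something(debug=None):
    # The default is looked up when the function is called, not when it
    # is defined, so later changes to `defaults` are picked up.
    if debug is None:
        debug = defaults["debug"]
    return debug

def do_something_else(debug=None, env_debug=None):
    # With several fallbacks to check in order, coalesce reads better.
    return coalesce(debug, env_debug, defaults["debug"])
```

Testing with "is None" rather than truthiness matters here: a caller who
passes 0 or False explicitly should get that value back, not the default.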


Carl Banks

Re: how to format long if conditions

2011-08-27 Thread Ben Finney

Steven D'Aprano steve+comp.lang.pyt...@pearwood.info writes:

 I believe that PEP 8 now

Specifically the “Indentation” section contains::

When using a hanging indent the following considerations should be
applied; there should be no arguments on the first line and further
indentation should be used to clearly distinguish itself as a
continuation line.

 suggests something like this:

     if (
             isinstance(left, PyCompare) and isinstance(right, PyCompare)
             and left.complist[-1] is right.complist[0]):
         )
         py_and = PyCompare(left.complist + right.complist[1:])
     else:
         py_and = PyBooleanAnd(left, right)

That gives a SyntaxError. I think you mean one of these possible PEP 8
compliant forms::

    if (
            isinstance(left, PyCompare) and isinstance(right, PyCompare)
            and left.complist[-1] is right.complist[0]):
        py_and = PyCompare(left.complist + right.complist[1:])
    else:
        py_and = PyBooleanAnd(left, right)

or maybe::

    if (
            isinstance(left, PyCompare) and isinstance(right, PyCompare)
            and left.complist[-1] is right.complist[0]
            ):
        py_and = PyCompare(left.complist + right.complist[1:])
    else:
        py_and = PyBooleanAnd(left, right)

 I consider that hideous

I think both of those (once modified to conform to both the Python
syntax and the PEP 8 guidelines) look clear and readable.

I mildly prefer the first for being a little more elegant, but the second
is slightly better for maintainability and reducing diff noise. Either
one makes me happy.

 and would prefer to write this:

     if (isinstance(left, PyCompare) and isinstance(right, PyCompare)
         and left.complist[-1] is right.complist[0]):
         py_and = PyCompare(left.complist + right.complist[1:])
     else:
         py_and = PyBooleanAnd(left, right)

That one keeps tripping me up because the indentation doesn't make clear
where subordinate clauses begin and end. The (current) PEP 8 rules are
much better for readability in my eyes.


Having said that, I'm only a recent convert to the current PEP 8 style
for indentation of condition clauses. It took several heated arguments
with colleagues before I was able to admit the superiority of clear
indentation :-)

-- 
 \  “I am too firm in my consciousness of the marvelous to be ever |
  `\   fascinated by the mere supernatural …” —Joseph Conrad, _The |
_o__) Shadow-Line_ |
Ben Finney

[ANN] Oktest 0.9.0 released - a new-style testing library

2011-08-27 Thread Makoto Kuwata

Hi,

I released Oktest 0.9.0.
http://pypi.python.org/pypi/Oktest/
http://packages.python.org/Oktest/

Oktest is a new-style testing library for Python.
::

    from oktest import ok, NG
    ok (x) > 0                 # same as assert_(x > 0)
    ok (s) == 'foo'            # same as assertEqual(s, 'foo')
    ok (s) != 'foo'            # same as assertNotEqual(s, 'foo')
    ok (f).raises(ValueError)  # same as assertRaises(ValueError, f)
    ok (u'foo').is_a(unicode)  # same as assert_(isinstance(u'foo', unicode))
    NG (u'foo').is_a(int)      # same as assert_(not isinstance(u'foo', int))
    ok ('A.txt').is_file()     # same as assert_(os.path.isfile('A.txt'))
    NG ('A.txt').is_dir()      # same as assert_(not os.path.isdir('A.txt'))

See http://packages.python.org/Oktest/ for details.

NOTICE!! Oktest is a young project and specification may change in the future.


Main Enhancements
-----------------

* New '@test' decorator provided. It is simple but very powerful.
  Using @test decorator, you can write test description in free text
  instead of test method.
  ex::

      class FooTest(unittest.TestCase):

          def test_1_plus_1_should_be_2(self):  # not cool...
              self.assertEqual(2, 1+1)

          @test("1 + 1 should be 2")    # cool! easy to read & write!
          def _(self):
              self.assertEqual(2, 1+1)

* Fixture injection support by '@test' decorator.
  Arguments of test method are regarded as fixture names and
  they are injected by @test decorator automatically.
  Instance methods or global functions whose names start with 'provide_'
  are regarded as fixture providers (or builders) for the fixture named by
  the rest of the name (e.g. 'provide_member1' provides 'member1').
  ex::

      class SosTest(unittest.TestCase):

          ##
          ## fixture providers.
          ##
          def provide_member1(self):
              return {"name": "Haruhi"}

          def provide_member2(self):
              return {"name": "Kyon"}

          ##
          ## fixture releaser (optional)
          ##
          def release_member1(self, value):
              assert value == {"name": "Haruhi"}

          ##
          ## testcase which requires 'member1' and 'member2' fixtures.
          ##
          @test("validate member's names")
          def _(self, member1, member2):
              assert member1["name"] == "Haruhi"
              assert member2["name"] == "Kyon"

  Dependencies between fixtures are resolved automatically.
  ex::

      class BarTest(unittest.TestCase):

          ##
          ## for example:
          ## - Fixture 'a' depends on 'b' and 'c'.
          ## - Fixture 'c' depends on 'd'.
          ##
          def provide_a(b, c):  return b + c + ["A"]
          def provide_b():      return ["B"]
          def provide_c(d):     return d + ["C"]
          def provide_d():      return ["D"]

          ##
          ## Dependencies between fixtures are solved automatically.
          ##
          @test("dependency test")
          def _(self, a):
              assert a == ["B", "D", "C", "A"]

  If loop exists in dependency then @test reports error.

  If you want to integrate with other fixture library, see the following
  example::

      class MyFixtureManager(object):
          def __init__(self):
              self.values = { "x": 100, "y": 200 }
          def provide(self, name):
              return self.values[name]
          def release(self, name, value):
              pass

      oktest.fixture_manager = MyFixtureManager()



Other Enhancements and Changes
------------------------------

* Supports command-line interface to execute test scripts.
* Reporting style is changed.
* New assertion method ``ok(x).attr(name, value)`` to check attribute.
* New assertion method ``ok(x).length(n)``.
* New feature ``ok().should`` helps you to check boolean method.
* 'ok(str1) == str2' displays diff if str1 != str2, even when using
  with unittest module.
* Assertion ``raises()`` supports regular expression to check error message.
* Helper functions in oktest.dummy module are now available as decorator.
* 'AssertionObject.expected' is renamed to 'AssertionObject.boolean'.
* ``oktest.run()`` is changed to return number of failures and errors of tests.
* ``before_each()`` and ``after_each()`` are now non-supported.
* (Experimental) New function ``NOT()`` provided which is same as ``NG()``.
* (Experimental) ``skip()`` and ``@skip.when()`` are provided to skip tests.

See CHANGES.txt for details.
http://packages.python.org/Oktest/CHANGES.txt


Have a nice testing life!

--
regards,
makoto kuwata

typing question

2011-08-27 Thread Jason Swails

Hello everyone,

This is probably a basic question with an obvious answer, but I don't quite
get why the type(foo).__name__ works differently for some class instances
and not for others.  If I have an underived class, any instance of that
class is simply of type 'instance'.  If I include an explicit base class,
then its type __name__ is the name of the class.

$ python
Python 2.7.2 (default, Aug 26 2011, 22:35:24)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class MyClass:
...     pass
...
>>> foo = MyClass()
>>> type(foo)
<type 'instance'>
>>> type(foo).__name__
'instance'
>>> class MyClass1():
...     pass
...
>>> bar = MyClass1()
>>> type(bar)
<type 'instance'>
>>> type(bar).__name__
'instance'
>>> class MyClass2(object):
...     pass
...
>>> foobar = MyClass2()
>>> type(foobar)
<class '__main__.MyClass2'>
>>> type(foobar).__name__
'MyClass2'

I can't explain this behavior (since doesn't every class inherit from object
by default? And if so, there should be no difference between any of my class
definitions).  I would prefer that every approach give me the name of the
class (rather than the first 2 just return 'instance').  Why is this not the
case?  Also, is there any way to access the name of the class of
foo or bar in the above example?

Thanks!
Jason

P.S.  I'll note that my preferred behavior is how python3.2 actually
operates

$ python3.2
Python 3.2.1 (default, Aug 26 2011, 23:20:19)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class MyClass:
...     pass
...
>>> foo = MyClass()
>>> type(foo).__name__
'MyClass'


-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032

Re: typing question

On Sat, Aug 27, 2011 at 11:42 PM, Jason Swails jason.swa...@gmail.com wrote:
 I can't explain this behavior (since doesn't every class inherit from object
 by default? And if so, there should be no difference between any of my class
 definitions).

That is true in Python 3, but not in Python 2. That's why your example
works perfectly in version 3.2. Be explicit about deriving from object
and your code should work fine in both versions.
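
As a small sketch of that advice (the class name is only an example):
deriving from object explicitly gives a new-style class on Python 2 and
is harmless on Python 3. For the other half of the question, looking the
name up through the instance's __class__ attribute also works for
old-style instances on Python 2, where type() only reports 'instance'.

```python
class MyClass(object):      # explicit base: new-style on 2.x, fine on 3.x
    pass

foo = MyClass()

# Both spellings agree for new-style classes.
print(type(foo).__name__)
print(foo.__class__.__name__)
```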

Chris Angelico

UnicodeEncodeError -- 'character maps to undefined'

2011-08-27 Thread J

Hi there,

I'm attempting to print a dictionary entry of some twitter data to screen but 
every now and then I get the following error:

(<type 'exceptions.UnicodeEncodeError'>, UnicodeEncodeError('charmap', u'RT 
@ciaraluvsjb26: BIEBER FEVER \u2665', 32, 33, 'character maps to <undefined>'), 
<traceback object at 0x01B323C8>)

I have googled this but haven't really found any way to overcome the error. Any 
ideas?

J

Re: UnicodeEncodeError -- 'character maps to undefined'

J wrote:

 Hi there,
 
 I'm attempting to print a dictionary entry of some twitter data to screen
 but every now and then I get the following error:
 
 (<type 'exceptions.UnicodeEncodeError'>, UnicodeEncodeError('charmap',
 u'RT @ciaraluvsjb26: BIEBER FEVER \u2665', 32, 33, 'character maps to
 <undefined>'), <traceback object at 0x01B323C8>)

Showing the actual traceback will help far more than a raw exception tuple.


 I have googled this but haven't really found any way to overcome the
 error. Any ideas?

I can only try to guess what you are doing, since you haven't shown either
any code or a traceback, but I can imagine that you're probably trying to
encode a Unicode string into bytes, but using the wrong encoding.

I can almost replicate the error: the exception is the same, the message is
not, although it is similar.

>>> s = u'BIEBER FEVER \u2665'
>>> print s  # Printing Unicode is fine.
BIEBER FEVER ♥
>>> s.encode()  # but encoding defaults to ASCII
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2665' in
position 13: ordinal not in range(128)



The right way is to specify an encoding that includes all the characters you
need. Unless you have some reason to choose another encoding, the best
thing to do is to just use UTF-8.

>>> s.encode('utf-8')
'BIEBER FEVER \xe2\x99\xa5'
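
Here is a short sketch of the difference, written so it should also run
unchanged on Python 3. That the original poster's output encoding was a
Windows code page such as cp1252 is only a guess on my part; it is used
here because it is a charmap-based codec with no heart symbol.

```python
text = u"BIEBER FEVER \u2665"

# UTF-8 can represent every Unicode character.
utf8_bytes = text.encode("utf-8")

# cp1252 has no mapping for U+2665, so a strict encode raises
# UnicodeEncodeError, much like the error in the original post.
try:
    text.encode("cp1252")
    failed = False
except UnicodeEncodeError:
    failed = True

# Passing "replace" substitutes "?" for unencodable characters
# instead of raising.
fallback = text.encode("cp1252", "replace")
```

The "replace" handler loses information, so it only makes sense for
display; for storage, stick with UTF-8.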



-- 
Steven


Re: how to format long if conditions

2011-08-27 Thread Colin J. Williams


On 27-Aug-11 03:50 AM, Hans Mulder wrote:

On 27/08/11 09:08:20, Arnaud Delobelle wrote:

I'm wondering what advice you have about formatting if statements with
long conditions (I always format my code to 80 columns)

Here's an example taken from something I'm writing at the moment and
how I've formatted it:


    if (isinstance(left, PyCompare) and isinstance(right, PyCompare)
            and left.complist[-1] is right.complist[0]):
        py_and = PyCompare(left.complist + right.complist[1:])
    else:
        py_and = PyBooleanAnd(left, right)

What would you do?


I would break after the '(' and indent the condition once and
put the '):' bit on a separate line, aligned with the 'if':


    if (
        isinstance(left, PyCompare)
        and isinstance(right, PyCompare)
        and left.complist[-1] is right.complist[0]
    ):
        py_and = PyCompare(left.complist + right.complist[1:])
    else:
        py_and = PyBooleanAnd(left, right)

It may look ugly, but it's very clear where the condition part ends
and the 'then' part begins.

-- HansM


What about:
  cond=  isinstance(left, PyCompare)
 and isinstance(right, PyCompare)
 and left.complist[-1] is right.complist[0]
  py_and= PyCompare(left.complist + right.complist[1:])if cond
  else: py_and = PyBooleanAnd(left, right)
Colin W.


Re: how to format long if conditions

2011-08-27 Thread Hans Mulder


On 27/08/11 17:16:51, Colin J. Williams wrote:


What about:
cond= isinstance(left, PyCompare)
  and isinstance(right, PyCompare)
  and left.complist[-1] is right.complist[0]
py_and= PyCompare(left.complist + right.complist[1:])if cond
  else: py_and = PyBooleanAnd(left, right)
Colin W.


That's a syntax error.  You need to add parentheses.

How about:

cond = (
    isinstance(left, PyCompare)
    and isinstance(right, PyCompare)
    and left.complist[-1] is right.complist[0]
)
py_and = (
         PyCompare(left.complist + right.complist[1:])
    if   cond
    else PyBooleanAnd(left, right)
)

-- HansM

Re: how to format long if conditions

In article mailman.457.1314428909.27778.python-l...@python.org,
 Arnaud Delobelle arno...@gmail.com wrote:

 Hi all,
 
 I'm wondering what advice you have about formatting if statements with
 long conditions (I always format my code to 80 columns)
 [...]
     if (isinstance(left, PyCompare) and isinstance(right, PyCompare)
             and left.complist[-1] is right.complist[0]):
         py_and = PyCompare(left.complist + right.complist[1:])
     else:
         py_and = PyBooleanAnd(left, right)

To tie this into the ongoing "When should I write a new function?"
discussion, maybe the right thing here is to refactor all of that mess
into its own function, so the code looks like:

   if _needs_compare(left, right):
       py_and = PyCompare(left.complist + right.complist[1:])
   else:
       py_and = PyBooleanAnd(left, right)

and then

def _needs_compare(left, right):
   """Decide if we need to call PyCompare"""
   return isinstance(left, PyCompare) and \
          isinstance(right, PyCompare) and \
          left.complist[-1] is right.complist[0]

This seems easier to read/understand than what you've got now.  It's an 
even bigger win if this will get called from multiple places.

Re: is there any principle when writing python function

Chris Angelico ros...@gmail.com wrote:

 the important
 considerations are not "will it take two extra nanoseconds to execute"
 but "can my successor understand what the code's doing" and "will he,
 if he edits my code, have a reasonable expectation that he's not
 breaking stuff". These are always important.

Forget about your successor.  Will *you* be able to figure out what you 
did 6 months from now?  I can't tell you how many times I've looked at 
some piece of code, muttered, "Who wrote this crap?" and called up the 
checkin history only to discover that *I* wrote it :-)

Understanding .pth files

I am developing a library for Python 2.7. I'm on Windows XP. I am also learning 
the proper way to do this (per PyPI) but not in a linear fashion: I've built 
a prototype for the library, created my setup script, and run the install to 
make sure I had that bit working properly.

Now I'm continuing to develop the library alongside my examples and 
applications that use this library.

The source is at c:\Dev\XmlDB.
The installed package is in c:\Python27\lib\site-packages\xmldb\

According to the docs, I should be able to put a file in the site-packages 
directory called xmldb.pth pointing anywhere else on my drive to include the 
package. I'd like to use this to direct Python to include the version in the 
dev folder and not the site-packages folder.

(Otherwise I have my dev folder, but end up doing actual library development in 
the site-packages folder)

So my C:\Python27\lib\site-packages\xmldb.pth file has one line:

c:\dev\XmlDB\xmldb

(I've tried the slashes the other way, too, but it doesn't seem to work).

Is the only solution to delete the installed library and add the dev folder to 
my site.py file?

Josh

Re: is there any principle when writing python function

On Sun, Aug 28, 2011 at 2:41 AM, Roy Smith r...@panix.com wrote:
 Forget about your successor.  Will *you* be able to figure out what you
 did 6 months from now?  I can't tell you how many times I've looked at
 some piece of code, muttered, "Who wrote this crap?" and called up the
 checkin history only to discover that *I* wrote it :-)

Heh. In that case, you were your own successor :) I always word it as
a different person to dodge the "But I'll remember!" excuse, but you
are absolutely right, and I've had that exact same experience myself.

Fred comes up to me and says, "How do I use FooMatic?" Me: "I dunno,
ask Joe." Fred: "But didn't you write it?" Me: "Yeah, that was years
ago, I've forgotten. Ask Joe, he still uses the program."

ChrisA

Re: Record seperator

2011-08-27 Thread greymaus

On 2011-08-26, D'Arcy J.M. Cain da...@druid.net wrote:
 On 26 Aug 2011 18:39:07 GMT
 greymaus greyma...@mail.com wrote:
 
 Is there an equivalent for the AWK RS in Python?
 
 
 as in RS='\n\n'
 will separate a file at two blank line intervals

 open("file.txt").read().split("\n\n")



Ta!.. bit awkard. :))


-- 
maus
 .
  .
...   NO CARRIER

Understanding .pth in site-packages

(This may be a shortened double post)

I have a development version of a library in c:\dev\XmlDB\xmldb

After testing the setup script I also have c:\python27\lib\site-packages\xmldb

Now I'm continuing to develop it and simultaneously building an application 
with it.

I thought I could plug into my site-packages directory a file called xmldb.pth 
with:

c:\dev\XmlDB\xmldb

which should redirect import statements to the development version of the 
library.

This doesn't seem to work.

Is there a better way to redirect import statements without messing with the 
system path or the PYTHONPATH variable?

Josh

Arrange files according to a text file

Hello,

What would be the best way to accomplish this task?
I have many files in separate directories; each file name
contains a person's name, but never in the same spot.
I need to find that name, which is listed in a large
text file in the following format: last name, comma,
and first name. Last names may be duplicated.

Adler, Jack
Smith, John
Smith, Sally
Stone, Mark
etc.


The file names don't necessarily follow any standard
format.

Smith, John - 02-15-75 - business files.doc
Random Data - Adler Jack - expenses.xls
More Data Mark Stone files list.doc
etc

I need some way to pull the name from the file name, find it in the
text list, and then create a directory based on the name on the list
(e.g. "Smith, John") and move all files named with the client's name into
that directory.

Re: Understanding .pth in site-packages


On Aug 27, 2011, at 12:56 PM, Josh English wrote:

 (This may be a shortened double post)
 
 I have a development version of a library in c:\dev\XmlDB\xmldb
 
 After testing the setup script I also have c:\python27\lib\site-packages\xmldb
 
 Now I'm continuing to develop it and simultaneously building an application 
 with it.
 
 I thought I could plug into my site-packages directory a file called 
 xmldb.pth with:
 
 c:\dev\XmlDB\xmldb
 
 which should redirect import statements to the development version of the 
 library.
 
 This doesn't seem to work.


xmldb.pth should contain the directory that contains xmldb:
c:\dev\XmlDB

Examining sys.path at runtime probably would have helped you to debug the 
effect of your .pth file.
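
If it helps, here is a self-contained sketch of that debugging step. It
uses temporary directories rather than a real site-packages, and
xmldb_demo is a made-up package name; site.addsitedir processes the .pth
files in the directory it is given, much as Python does for
site-packages at startup.

```python
import importlib
import os
import site
import sys
import tempfile

# Hypothetical layout: dev_root/xmldb_demo/__init__.py plays the role of
# the development package, and fake_site plays the role of site-packages.
dev_root = tempfile.mkdtemp()
fake_site = tempfile.mkdtemp()

package_dir = os.path.join(dev_root, "xmldb_demo")
os.mkdir(package_dir)
with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
    handle.write("WHERE = 'dev copy'\n")

# The .pth file names the directory that CONTAINS the package,
# not the package directory itself.
with open(os.path.join(fake_site, "xmldb_demo.pth"), "w") as handle:
    handle.write(dev_root + "\n")

# Process fake_site's .pth files; each listed directory that exists
# is appended to sys.path.
site.addsitedir(fake_site)
importlib.invalidate_caches()

print([p for p in sys.path if p in (dev_root, fake_site)])
xmldb_demo = importlib.import_module("xmldb_demo")
print(xmldb_demo.WHERE)
```

Had the .pth file listed package_dir instead of dev_root, the import
would fail, which is the situation described in the original post.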

On another note, I don't know if the behavior of 'import xmldb' is defined when 
xmldb is present both as a directory in site-packages and also as a .pth file. 
You're essentially giving Python two choices from where to import xmldb, and I 
don't know which Python will choose. It may be arbitrary. I've looked for some 
sort of statement on this topic in the documentation, but haven't come across 
it yet. 


 Is there a better way to redirect import statements without messing with the 
 system path or the PYTHONPATH variable?

Personally I have never used PYTHONPATH.


Hope this helps
Philip



Re: Arrange files according to a text file

2011-08-27 Thread MRAB


On 27/08/2011 18:03, r...@rdo.python.org wrote:

Hello,

What would be the best way to accomplish this task?
I have many files in separate directories; each file name
contains a person's name, but never in the same spot.
I need to find that name, which is listed in a large
text file in the following format: last name, comma,
and first name. Last names may be duplicated.

Adler, Jack
Smith, John
Smith, Sally
Stone, Mark
etc.


The file names don't necessarily follow any standard
format.

Smith, John - 02-15-75 - business files.doc
Random Data - Adler Jack - expenses.xls
More Data Mark Stone files list.doc
etc

I need some way to pull the name from the file name, find it in the
text list, and then create a directory based on the name on the list
(e.g. "Smith, John") and move all files named with the client's name into
that directory.


I would get a name from the text file, e.g. "Adler, Jack", and then
identify all the files which contain "Adler, Jack" or "Adler Jack" or
"Jack Adler" in the filename, also checking the surrounding characters
to ensure that I don't split a name, e.g. that "John" isn't part of
"Johnson".
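
A rough sketch of that matching step, using regular-expression word
boundaries for the surrounding-character check. The function names are
made up for illustration, and the sample data is taken from the original
post (plus one extra file name to exercise the "Johnson" case).

```python
import re

def build_pattern(last, first):
    """Match 'Last, First', 'Last First' or 'First Last' as whole words."""
    last, first = re.escape(last), re.escape(first)
    return re.compile(
        r"\b%s,?\s+%s\b|\b%s\s+%s\b" % (last, first, first, last),
        re.IGNORECASE,
    )

def find_client(filename, clients):
    """Return the 'Last, First' entry found in filename, or None."""
    for entry in clients:
        last, first = [part.strip() for part in entry.split(",", 1)]
        if build_pattern(last, first).search(filename):
            return entry
    return None

clients = ["Adler, Jack", "Smith, John", "Smith, Sally", "Stone, Mark"]
for name in ["Smith, John - 02-15-75 - business files.doc",
             "Random Data - Adler Jack - expenses.xls",
             "More Data Mark Stone files list.doc",
             "Johnson Smith - notes.doc"]:
    print("%s -> %s" % (name, find_client(name, clients)))
```

Once a file is matched, creating the directory and moving the file is a
job for os.makedirs and shutil.move; files that match nothing are best
left alone and reported for manual sorting.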

Re: is there any principle when writing python function


On 8/27/2011 9:41 AM Roy Smith said...

Chris Angelicoros...@gmail.com  wrote:


the important
considerations are not "will it take two extra nanoseconds to execute"
but "can my successor understand what the code's doing" and "will he,
if he edits my code, have a reasonable expectation that he's not
breaking stuff". These are always important.


Forget about your successor.  Will *you* be able to figure out what you
did 6 months from now?  I can't tell you how many times I've looked at
some piece of code, muttered, "Who wrote this crap?" and called up the
checkin history only to discover that *I* wrote it :-)


When you consider that you're looking at the code six months later it's 
likely for one of three reasons: you have to fix a bug; you need to add 
features; or the code's only now getting used.


So you then take the extra 20-30 minutes, tease the code apart, refactor 
as needed, and end up with better, more readable, debugged code.


I consider that the right time to do this type of cleanup.

For all the crap I write that works well for six months before needing 
to be cleaned up, there's a whole lot more crap that never gets looked 
at again that I didn't clean up and never spent the extra 20-30 minutes 
considering how my future self might view what I wrote.


I'm not suggesting that you shouldn't develop good coding habits that 
adhere to established standards and result in well-structured, readable 
code, only that if that ugly piece of code works, you move on.  You 
can bulletproof it after you uncover the vulnerabilities.


Code is first and foremost written to be executed.

Emile




Re: Record seperator

greymaus wrote:

 On 2011-08-26, D'Arcy J.M. Cain da...@druid.net wrote:
 On 26 Aug 2011 18:39:07 GMT
 greymaus greyma...@mail.com wrote:
 
 Is there an equivalent for the AWK RS in Python?
 
 
 as in RS='\n\n'
 will separate a file at two blank line intervals

 open("file.txt").read().split("\n\n")

 
 
 Ta!.. bit awkard. :))

Er, is that meant to be a pun? Awk[w]ard, as in awk-ward?

In any case, no, the Python line might be a handful of characters longer
than the AWK equivalent, but it isn't awkward. It is logical and easy to
understand. It's embarrassingly easy to describe what it does:

open("file.txt")   # opens the file
 .read()           # reads the contents of the file
 .split("\n\n")    # splits the text on double-newlines.

The only tricky part is knowing that \n means newline, but anyone familiar
with C, Perl, AWK etc. should know that.

The Python code might be long (but only by the standards of AWK, which can
be painfully concise), but it is simple, obvious and readable. A few extra
characters is the price you pay for making your language readable. At the
cost of a few extra key presses, you get something that you will be able to
understand in 10 years' time.

AWK is a specialist text processing language. Python is a general scripting
and programming language. They have different values: AWK values short,
concise code, Python is willing to pay a little more in source code.


-- 
Steven


Re: is there any principle when writing python function

On Sun, Aug 28, 2011 at 3:27 AM, Emile van Sebille em...@fenx.com wrote:
 Code is first and foremost written to be executed.


+1 QOTW. Yes, it'll be read, and most likely read several times, by
humans, but ultimately its purpose is to be executed.

And in the case of some code, the programmer needs the same treatment,
but that's a different issue...

ChrisA

Re: Understanding .pth in site-packages

2011-08-27 Thread Peter Otten

Josh English wrote:

 I have a development version of a library in c:\dev\XmlDB\xmldb
 
 After testing the setup script I also have
 c:\python27\lib\site-packages\xmldb
 
 Now I'm continuing to develop it and simultaneously building an
 application with it.
 
 I thought I could plug into my site-packages directory a file called
 xmldb.pth with:
 
 c:\dev\XmlDB\xmldb
 
 which should redirect import statements to the development version of the
 library.
 
 This doesn't seem to work.

You have to put the directory containing the package into the pth-file. 
That's probably

c:\dev\XmlDB

in your case. Also, Python will stop at the first matching module or 
package; if you keep c:\python27\lib\site-packages\xmldb that will shadow 
c:\dev\XmlDB\xmldb.
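
A small sketch of that "first match wins" behaviour, written for Python 3 and using throwaway temporary directories. The package name shadow_demo and the WHERE marker are invented for the demonstration; "installed" stands in for site-packages and "dev" for the development tree:

```python
import importlib
import os
import sys
import tempfile

def make_package(root, name, marker):
    """Create a one-file package that records where it was loaded from."""
    pkg_dir = os.path.join(root, name)
    os.mkdir(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("WHERE = %r\n" % marker)

installed = tempfile.mkdtemp()  # plays the role of site-packages
dev = tempfile.mkdtemp()        # plays the role of c:\dev\XmlDB
make_package(installed, "shadow_demo", "installed")
make_package(dev, "shadow_demo", "dev")

# Directories listed in a .pth file are appended, so they come later on sys.path.
sys.path.append(installed)
sys.path.append(dev)
importlib.invalidate_caches()

import shadow_demo
print(shadow_demo.WHERE)  # the earlier sys.path entry is the one imported
```

With both copies present, the copy in the earlier directory is imported, which is why the installed package has to be removed or renamed before the development copy becomes visible.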

%APPDATA%/Python/Python26/site-packages

may be a good place for the pth-file (I'm not on Windows and too lazy to 
figure out where %APPDATA% actually is. The PEP 
http://www.python.org/dev/peps/pep-0370/ may help)


Re: typing question

2011-08-27 Thread Chris Rebert

On Sat, Aug 27, 2011 at 6:42 AM, Jason Swails jason.swa...@gmail.com wrote:
 Hello everyone,

 This is probably a basic question with an obvious answer, but I don't quite
 get why the type(foo).__name__ works differently for some class instances
 and not for others.  If I have an underived class, any instance of that
 class is simply of type instance.  If I include an explicit base class,
 then its type __name__ is the name of the class.

 $ python
 Python 2.7.2 (default, Aug 26 2011, 22:35:24)
 [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
 Type "help", "copyright", "credits" or "license" for more information.
 >>> class MyClass:
 ...     pass
 ...
 >>> foo = MyClass()
 >>> type(foo)
 <type 'instance'>
 >>> type(foo).__name__
 'instance'
 >>> class MyClass1():
 ...     pass
 ...
 >>> bar = MyClass1()
 >>> type(bar)
 <type 'instance'>
 >>> type(bar).__name__
 'instance'
 >>> class MyClass2(object):
 ...     pass
 ...
 >>> foobar = MyClass2()
 >>> type(foobar)
 <class '__main__.MyClass2'>
 >>> type(foobar).__name__
 'MyClass2'

 I can't explain this behavior (since doesn't every class inherit from object
 by default?

That's only true in Python 3.x.

Python 2.7.2 (default, Jul 27 2011, 04:14:23)
>>> class Foo:
...     pass
...
>>> Foo.__bases__
()
>>> class Bar(object):
...     pass
...
>>> Bar.__bases__
(<type 'object'>,)

 And if so, there should be no difference between any of my class
 definitions).  I would prefer that every approach give me the name of the
 class (rather than the first 2 just return 'instance').  Why is this not the
 case?

Classes directly or indirectly inheriting from `object` are
new-style; classes which don't are old-style. The two kinds of
classes have different semantics (including whether they have a
.__name__, but that's minor relative to the other changes). Old-style
classes are deprecated and were removed in Python 3.
See 
http://docs.python.org/reference/datamodel.html#new-style-and-classic-classes
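
For comparison, a short sketch of the same experiment under Python 3, where the distinction is gone and both spellings create new-style classes (the class names are just those from the question):

```python
class MyClass:            # no explicit base
    pass

class MyClass2(object):   # explicit base
    pass

for cls in (MyClass, MyClass2):
    instance = cls()
    # In Python 3 both classes inherit from object, so the type name
    # is the class name in both cases.
    print(type(instance).__name__, cls.__bases__)
```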

Cheers,
Chris

Re: Record seperator

In article 4e592852$0$29965$c3e8da3$54964...@news.astraweb.com,
 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:

 open("file.txt")   # opens the file
  .read()           # reads the contents of the file
  .split("\n\n")    # splits the text on double-newlines.

The biggest problem with this code is that read() slurps the entire file 
into a string.  That's fine for moderately sized files, but will fail 
(or at least be grossly inefficient) for very large files.

It's always annoyed me a little that while it's easy to iterate over the 
lines of a file, it's more complicated to iterate over a file character 
by character.  You could write your own generator to do that:

for c in getchar(open("file.txt")):
    whatever

def getchar(f):
    for line in f:
        for c in line:
            yield c

but that's annoyingly verbose (and probably not hugely efficient).

Of course, the next problem for the specific problem at hand is that 
even with an iterator over the characters of a file, split() only works 
on strings.  It would be nice to have a version of split which took an 
iterable and returned an iterator over the split components.  Maybe 
there is such a thing and I'm just missing it?

Re: Understanding .pth in site-packages

Philip,

Yes, the proper path should be c:\dev\XmlDB, which has the setup.py, xmldb 
subfolder, the docs subfolder, and example subfolder, and the other text files 
prescribed by the package development folder.

I could only get it to work, though, by renaming the xmldb folder in the 
site-packages directory, and deleting the egg file created in the site-packages 
directory. 

Why the egg file, which doesn't list any paths, would interfere I do not know.

But with those changes, the xmldb.pth file is being read.

So I think the preferred search order is:

1. a folder in the site-packages directory
2. an Egg file (still unsure why)
3. A .pth file

It's a strange juju that I haven't figured out yet. 

Thanks for the hint.

Josh

Re: Arrange files according to a text file


On 8/27/2011 10:03 AM r...@rdo.python.org said...

Hello,

What would be the best way to accomplish this task?


I'd do something like:


usernames = """Adler, Jack
Smith, John
Smith, Sally
Stone, Mark""".split('\n')

filenames = """Smith, John - 02-15-75 - business files.doc
Random Data - Adler Jack - expenses.xls
More Data Mark Stone files list.doc""".split('\n')

from difflib import SequenceMatcher as SM


def ignore(x):
    return x in ' ,.'


for filename in filenames:
    ratios = [SM(ignore,filename,username).ratio() for username in usernames]

    best = max(ratios)
    owner = usernames[ratios.index(best)]
    print filename,":",owner


Emile




I have many files in separate directories, each file name
contain a persons name but never in the same spot.
I need to find that name which is listed in a large
text file in the following format. Last name, comma
and First name. The last name could be duplicate.

Adler, Jack
Smith, John
Smith, Sally
Stone, Mark
etc.


The file names don't necessary follow any standard
format.

Smith, John - 02-15-75 - business files.doc
Random Data - Adler Jack - expenses.xls
More Data Mark Stone files list.doc
etc

I need some way to pull the name from the file name, find it in the
text list and then create a directory based on the name on the list
Smith, John and move all files named with the clients name into that
directory.




Re: Understanding .pth in site-packages


On Aug 27, 2011, at 1:57 PM, Josh English wrote:

 Philip,
 
 Yes, the proper path should be c:\dev\XmlDB, which has the setup.py, xmldb 
 subfolder, the docs subfolder, and example subfolder, and the other text 
 files prescribed by the package development folder.
 
 I could only get it to work, though, by renaming the xmldb folder in the 
 site-packages directory, and deleting the egg file created in the 
 site-packages directory. 
 
 Why the egg file, which doesn't list any paths, would interfere I do not know.
 
 But with those changes, the xmldb.pth file is being read.
 
 So I think the preferred search order is:
 
 1. a folder in the site-packages directory
 2. an Egg file (still unsure why)
 3. A .pth file


That might be implementation-dependent or it might even come down to something 
as simple as the order in which the operating system returns files/directories 
when asked for a listing. In other words, unless you can find something in the 
documentation (or Python's import implementation) that confirms your preferred 
search order observation, I would not count on it working the same way with all 
systems, all Pythons, or even all directory names.




Good luck
Philip

Re: Record seperator

2011-08-27 Thread ChasBrown

On Aug 27, 10:45 am, Roy Smith r...@panix.com wrote:
 In article 4e592852$0$29965$c3e8da3$54964...@news.astraweb.com,
  Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:

  open("file.txt")   # opens the file
   .read()           # reads the contents of the file
   .split("\n\n")    # splits the text on double-newlines.

 The biggest problem with this code is that read() slurps the entire file
 into a string.  That's fine for moderately sized files, but will fail
 (or at least be grossly inefficient) for very large files.

 It's always annoyed me a little that while it's easy to iterate over the
 lines of a file, it's more complicated to iterate over a file character
 by character.  You could write your own generator to do that:

 for c in getchar(open("file.txt")):
    whatever

 def getchar(f):
    for line in f:
       for c in line:
          yield c

 but that's annoyingly verbose (and probably not hugely efficient).

read() takes an optional size parameter; so f.read(1) is another
option...
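
A sketch of that idea using the two-argument form of iter(), which keeps calling a function until it returns a sentinel (here the empty string that read() returns at end of file). An in-memory io.StringIO stands in for the real file so the snippet runs anywhere:

```python
import io
from functools import partial

f = io.StringIO("ab\n\ncd")  # stand-in for open("file.txt")

# iter(callable, sentinel) calls f.read(1) repeatedly and stops at "".
chars = list(iter(partial(f.read, 1), ""))
print(chars)  # ['a', 'b', '\n', '\n', 'c', 'd']
```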


 Of course, the next problem for the specific problem at hand is that
 even with an iterator over the characters of a file, split() only works
 on strings.  It would be nice to have a version of split which took an
 iterable and returned an iterator over the split components.  Maybe
 there is such a thing and I'm just missing it?

I don't know if there is such a thing; but for the OP's problem you
could read the file in chunks, e.g.:

def readgroup(f, delim, buffsize=8192):
    tail = ''
    while True:
        s = f.read(buffsize)
        if not s:
            yield tail
            break
        groups = (tail + s).split(delim)
        tail = groups[-1]
        for group in groups[:-1]:
            yield group

for group in readgroup(open('file.txt'), '\n\n'):
    pass  # do something with each group

Cheers - Chas

Re: Run time default arguments

On 8/25/11 1:54 PM, t...@thsu.org wrote:
 On Aug 25, 10:35 am, Arnaud Delobelle arno...@gmail.com wrote:
 You're close to the usual idiom:

 def doSomething(debug=None):
     if debug is None:
         debug = defaults['debug']
     ...

 Note the use of 'is' rather than '=='
 HTH
 
 Hmm, from what you are saying, it seems like there's no elegant way to
 handle run time defaults for function arguments,

Well, elegance is in the eye of the beholder: and the above idiom is
generally considered elegant in Python, more or less. (The global nature
of 'defaults' being a question)

 meaning that I should
 probably write a SQL-esque coalesce function to keep my code cleaner. I
 take it that most people who run into this situation do this?
 
 def coalesce(*args):
     for a in args:
         if a is not None:
             return a
     return None
 
 def doSomething(debug=None):
     debug = coalesce(debug,defaults['debug'])
     # blah blah blah

Er, I'd say that most people don't do that, no. I'd guess that most do
something more along the lines of "if debug is None: debug = default" as
Arnaud said. It's very common Pythonic code.
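
Spelled out as a runnable sketch (the defaults dict and the function name do_something are placeholders, not anyone's real code), the idiom looks up the default at call time and still respects an explicit False:

```python
defaults = {"debug": False}  # hypothetical module-level settings

def do_something(debug=None):
    # "is None" rather than "not debug", so an explicit False is kept.
    if debug is None:
        debug = defaults["debug"]
    return debug

print(do_something())           # False, taken from defaults
defaults["debug"] = True        # change the default at run time
print(do_something())           # True, picked up without redefining the function
print(do_something(False))      # False, the explicit argument wins
```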

In fact, I'm not quite sure what you think you're getting out of that
coalesce function. Return the first argument that is not None, or
return None? That's a kind of odd thing to do, I think. In Python at
least.

Why not just:

debug = defaults.get(debug, None)

(Strictly speaking, providing None to get is not needed, but I always
feel odd leaving it off.)

That's generally how I spell it when I need to do run time defaults.

-- 

   Stephen Hansen
   ... Also: Ixokai
   ... Mail: me+list/python (AT) ixokai (DOT) io
   ... Blog: http://meh.ixokai.io/




How can I solve an equation containing expressions like sqrt(log(x) - 1) = 2 and exp((log(x) - 1.5)**2 - 3) = 5

2011-08-27 Thread Xiong Deng

Hi,

I am trying to solve an equation containing exp, log, and erfc, which
may be nested inside each other. But sympy cannot handle this, as
shown below:

 from sympy import solve, exp, log, pi
from sympy.mpmath import *
from sympy import Symbol
x=Symbol('x')
sigma = 4
mu = 1.5
solve(x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) - mu)**2
/ sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * sqrt(2))) - 1, x)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/functions/functions.py", line 287, in log
    return ctx.ln(x)
  File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/ctx_mp_python.py", line 984, in f
    x = ctx.convert(x)
  File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/ctx_mp_python.py", line 662, in convert
    return ctx._convert_fallback(x, strings)
  File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/ctx_mp.py", line 556, in _convert_fallback
    raise TypeError("cannot create mpf from " + repr(x))
TypeError: cannot create mpf from x

But sqrt, log, and exp on their own are OK, as shown below:

 solve((1.0 / sqrt(2 * pi) * x * sigma) - 1, x)
[0.626657068657750]

So, how can I solve an equation containing expressions like sqrt(log(x) -
1)=0 or exp((log(x) - mu)**2 - 3) = 0? If there are any other methods
without Sympy, that is also OK.
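
For the two simple forms in the subject line, a sketch without Sympy using only the standard math module. It takes the right-hand sides 2 and 5 from the subject (with a right-hand side of 0, the exp form has no real solution), and the bisect helper is a hand-written illustration, not a library function:

```python
import math

# sqrt(log(x) - 1) = 2   =>   log(x) = 1 + 2**2   =>   x = e**5
x1 = math.exp(1 + 2 ** 2)

# exp((log(x) - mu)**2 - 3) = 5   =>   (log(x) - mu)**2 = 3 + log(5)
mu = 1.5
root = math.sqrt(3 + math.log(5))
x2_low, x2_high = math.exp(mu - root), math.exp(mu + root)

def bisect(f, lo, hi, tol=1e-12):
    """Find a root of f in [lo, hi]; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2.0

# Numeric cross-check of the first closed form on a bracket chosen by hand.
x1_numeric = bisect(lambda x: math.sqrt(math.log(x) - 1) - 2, 3.0, 1000.0)
print(x1, x1_numeric, x2_low, x2_high)
```

The same bisection approach should carry over to the longer erfc expression (math.erfc is in the standard library), provided a bracket with a sign change is found first.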

Thanks

Re: Understanding .pth in site-packages

2011-08-27 Thread OKB (not okblacke)

Josh English wrote:

 Philip,
 
 Yes, the proper path should be c:\dev\XmlDB, which has the
 setup.py, xmldb subfolder, the docs subfolder, and example
 subfolder, and the other text files prescribed by the package
 development folder. 
 
 I could only get it to work, though, by renaming the xmldb folder
 in the site-packages directory, and deleting the egg file created
 in the site-packages directory. 
 
 Why the egg file, which doesn't list any paths, would interfere I
 do not know. 
 
 But with those changes, the xmldb.pth file is being read.
 
 So I think the preferred search order is:
 
 1. a folder in the site-packages directory
 2. an Egg file (still unsure why)
 3. A .pth file

You say that the egg file was created by the setup script for the 
library.  Are you sure that this script did not also create or modify a 
.pth file of its own, adding the egg to the path?

.pth files do not redirect imports from site-packages; they add 
EXTRA directories to sys.path.  Also note that this means the .pth file 
itself is not part of the search path; it's not like you shadow a 
package xyz by creating a .pth file xyz.pth instead.  A single .pth 
file can list multiple directories, and it's those directories that are 
added to the path.
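
A sketch of that mechanism for Python 3, using site.addsitedir on a scratch directory instead of the real site-packages. The names demo.pth and pth_demo_pkg are invented; the point is that the directory named inside the .pth file lands on sys.path, not the .pth file itself:

```python
import importlib
import os
import site
import sys
import tempfile

fake_site = tempfile.mkdtemp()                   # stands in for site-packages
dev_root = os.path.abspath(tempfile.mkdtemp())   # stands in for c:\dev\XmlDB

# A tiny package living in the development directory.
os.mkdir(os.path.join(dev_root, "pth_demo_pkg"))
with open(os.path.join(dev_root, "pth_demo_pkg", "__init__.py"), "w") as f:
    f.write("WHERE = 'dev'\n")

# The .pth file lists the directory that CONTAINS the package, one per line.
with open(os.path.join(fake_site, "demo.pth"), "w") as f:
    f.write(dev_root + "\n")

site.addsitedir(fake_site)   # processes every *.pth file in fake_site
importlib.invalidate_caches()

import pth_demo_pkg
print(dev_root in sys.path, pth_demo_pkg.WHERE)
```

Note that the listed directory is only added if it actually exists, and that it is appended after the entries already on sys.path.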

I'm not sure how your package is set up, but easy_install, for 
instance, creates an easy_install.pth file in site-packages.  This file 
contains references to egg files (or, at least in my case, .egg 
directories created by unpacking the eggs) for each package installed 
with easy_install.  As far as I'm aware, Python doesn't have special 
rules for putting egg files in the search path, so my guess is that 
it's something like that: the setup script is creating a .pth file (or 
modifying an existing .pth file) to add the egg to the path.

Read http://docs.python.org/library/site.html for the description 
of how .pth files work.  I don't think there is a general way to 
globally shadow a package that exists in site-packages.  However, 
according to the docs the .pth files are added in alphabetical order, so 
if it is indeed easy_install.pth that is adding your egg, you could hack 
around it by making a file with an alphabetically earlier name (e.g., 
a_xmldb.pth).

-- 
--OKB (not okblacke)
Brendan Barnwell
Do not follow where the path may lead.  Go, instead, where there is
no path, and leave a trail.
--author unknown

Re: Record seperator


On 8/27/2011 1:45 PM, Roy Smith wrote:

In article 4e592852$0$29965$c3e8da3$54964...@news.astraweb.com,
  Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:


open("file.txt")   # opens the file
  .read()           # reads the contents of the file
  .split("\n\n")    # splits the text on double-newlines.


The biggest problem with this code is that read() slurps the entire file
into a string.  That's fine for moderately sized files, but will fail
(or at least be grossly inefficient) for very large files.


I read the above as separating the file into paragraphs, as indicated by 
blank lines.


def paragraphs(file):
    para = []
    for line in file:
        if line:
            para.append(line)
        else:
            yield para # or ''.join(para), as desired
            para = []

--
Terry Jan Reedy


Re: Record seperator

On Sun, Aug 28, 2011 at 6:03 AM, Terry Reedy tjre...@udel.edu wrote:
      yield para # or ''.join(para), as desired


Or possibly '\n'.join(para) if you want to keep the line breaks inside
paragraphs.

ChrisA

Re: typing question


On 8/27/2011 9:42 AM, Jason Swails wrote:


P.S.  I'll note that my preferred behavior is how python3.2 actually
operates


Python core developers agree. This is one of the reasons for breaking a 
bit from 2.x to make Python 3.



--
Terry Jan Reedy


Re: Arrange files according to a text file


Hello Emile ,

Thank you for the code below. I have not encountered SequenceMatcher
before and will have to take a closer look at it.

My question: would it work for a text file listing about 25k names,
one per line, and a directory with, say, 100 files inside?

Thank you once again. 


On Sat, 27 Aug 2011 11:06:22 -0700, Emile van Sebille em...@fenx.com
wrote:

On 8/27/2011 10:03 AM r...@rdo.python.org said...
 Hello,

 What would be the best way to accomplish this task?

I'd do something like:


usernames = """Adler, Jack
Smith, John
Smith, Sally
Stone, Mark""".split('\n')

filenames = """Smith, John - 02-15-75 - business files.doc
Random Data - Adler Jack - expenses.xls
More Data Mark Stone files list.doc""".split('\n')

from difflib import SequenceMatcher as SM


def ignore(x):
    return x in ' ,.'


for filename in filenames:
    ratios = [SM(ignore,filename,username).ratio() for username in usernames]
    best = max(ratios)
    owner = usernames[ratios.index(best)]
    print filename,":",owner


Emile



 I have many files in separate directories, each file name
 contain a persons name but never in the same spot.
 I need to find that name which is listed in a large
 text file in the following format. Last name, comma
 and First name. The last name could be duplicate.

 Adler, Jack
 Smith, John
 Smith, Sally
 Stone, Mark
 etc.


 The file names don't necessary follow any standard
 format.

 Smith, John - 02-15-75 - business files.doc
 Random Data - Adler Jack - expenses.xls
 More Data Mark Stone files list.doc
 etc

 I need some way to pull the name from the file name, find it in the
 text list and then create a directory based on the name on the list
 Smith, John and move all files named with the clients name into that
 directory.


Re: Understanding .pth in site-packages


On 8/27/2011 2:07 PM, Philip Semanchuk wrote:


On Aug 27, 2011, at 1:57 PM, Josh English wrote:


Philip,

Yes, the proper path should be c:\dev\XmlDB, which has the
setup.py, xmldb subfolder, the docs subfolder, and example
subfolder, and the other text files prescribed by the package
development folder.

I could only get it to work, though, by renaming the xmldb folder
in the site-packages directory, and deleting the egg file created
in the site-packages directory.

Why the egg file, which doesn't list any paths, would interfere I
do not know.

But with those changes, the xmldb.pth file is being read.

So I think the preferred search order is:

1. a folder in the site-packages directory 2. an Egg file (still
unsure why) 3. A .pth file



That might be implementation-dependent or it might even come down to
something as simple as the order in which the operating system
returns files/directories when asked for a listing.


Doc says first match, and I presume that includes first match within a 
directory.


--
Terry Jan Reedy


Re: Understanding .pth in site-packages


On Aug 27, 2011, at 4:14 PM, Terry Reedy wrote:

 On 8/27/2011 2:07 PM, Philip Semanchuk wrote:
 
 On Aug 27, 2011, at 1:57 PM, Josh English wrote:
 
 Philip,
 
 Yes, the proper path should be c:\dev\XmlDB, which has the
 setup.py, xmldb subfolder, the docs subfolder, and example
 subfolder, and the other text files prescribed by the package
 development folder.
 
 I could only get it to work, though, by renaming the xmldb folder
 in the site-packages directory, and deleting the egg file created
 in the site-packages directory.
 
 Why the egg file, which doesn't list any paths, would interfere I
 do not know.
 
 But with those changes, the xmldb.pth file is being read.
 
 So I think the preferred search order is:
 
 1. a folder in the site-packages directory 2. an Egg file (still
 unsure why) 3. A .pth file
 
 
 That might be implementation-dependent or it might even come down to
 something as simple as the order in which the operating system
 returns files/directories when asked for a listing.
 
 Doc says first match, and I presume that includes first match within a 
 directory.

First match using which ordering? Do the docs clarify that?


Thanks
Philip





Re: is there any principle when writing python function

Chris Angelico wrote:

 On Sun, Aug 28, 2011 at 3:27 AM, Emile van Sebille em...@fenx.com wrote:
 Code is first and foremost written to be executed.

 
 +1 QOTW. Yes, it'll be read, and most likely read several times, by
 humans, but ultimately its purpose is to be executed.

You've never noticed the masses of code written in text books, blogs, web
pages, discussion forums like this one, etc.?

Real world code for production is usually messy and complicated and filled
with data validation and error checking code. There's a lot of code without
that, because it was written explicitly to be read by humans, and the fact
that it may be executed as well is incidental. Some code is even written in
pseudo-code that *cannot* be executed. It's clear to me that a non-trivial
amount of code is specifically written to be consumed by other humans, not
by machines.

It seems to me that, broadly speaking, there are languages designed with
execution of code as the primary purpose:

Fortran, C, Lisp, Java, PL/I, APL, Forth, ...

and there are languages designed with *writing* of code as the primary
purpose:

Perl, AWK, sed, bash, ...

and then there are languages where *reading* is the primary purpose:

Python, Ruby, Hypertalk, Inform 7, Pascal, AppleScript, ...

and then there are languages where the torment of the damned is the primary
purpose:

INTERCAL, Oook, Brainf*ck, Whitespace, Malbolge, ...

and then there are languages with few, or no, design principles to speak of,
or that are compromise languages which (deliberately or accidentally) straddle
the other categories. It all depends on the motivation and values of the
language designer, and the trade-offs the language makes. Which category
any specific language may fall into may be a matter of degree, or a matter
of opinion, or both.



-- 
Steven


Re: is there any principle when writing python function

On Sun, Aug 28, 2011 at 6:27 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 You've never noticed the masses of code written in text books, blogs, web
 pages, discussion forums like this one, etc.?

 Real world code for production is usually messy and complicated and filled
 with data validation and error checking code. There's a lot of code without
 that, because it was written explicitly to be read by humans, and the fact
 that it may be executed as well is incidental. Some code is even written in
 pseudo-code that *cannot* be executed. It's clear to me that a non-trivial
 amount of code is specifically written to be consumed by other humans, not
 by machines.

Yes, I'm aware of the quantities of code that are primarily for human
consumption. But in the original context, which was of editing code
six months down the track, I still believe that such code is primarily
for the machine. In that situation, there are times when it's not
worth the hassle of writing beautiful code; you'd do better to just
get that code generated and in operation.

Same goes for lint tools and debuggers - sometimes, it's easier to
just put the code into a live situation (or a perfect copy of) and see
where it breaks, than to use a simulation/test harness.

ChrisA

Re: Arrange files according to a text file


On 8/27/2011 1:15 PM r...@rdo.python.org said...


Hello Emile ,

Thank you for the code below as I have not encountered SequenceMatcher
before and would have to take a look at it closer.

My question would it work for a text file list of names about 25k
lines and a directory with say 100 files inside?


Sure.

Emile




Thank you once again.


On Sat, 27 Aug 2011 11:06:22 -0700, Emile van Sebille em...@fenx.com
wrote:


On 8/27/2011 10:03 AM r...@rdo.python.org said...

Hello,

What would be the best way to accomplish this task?


I'd do something like:


usernames = """Adler, Jack
Smith, John
Smith, Sally
Stone, Mark""".split('\n')

filenames = """Smith, John - 02-15-75 - business files.doc
Random Data - Adler Jack - expenses.xls
More Data Mark Stone files list.doc""".split('\n')

from difflib import SequenceMatcher as SM


def ignore(x):
    return x in ' ,.'


for filename in filenames:
    ratios = [SM(ignore,filename,username).ratio() for username in usernames]
    best = max(ratios)
    owner = usernames[ratios.index(best)]
    print filename,":",owner


Emile




I have many files in separate directories, each file name
contain a persons name but never in the same spot.
I need to find that name which is listed in a large
text file in the following format. Last name, comma
and First name. The last name could be duplicate.

Adler, Jack
Smith, John
Smith, Sally
Stone, Mark
etc.


The file names don't necessary follow any standard
format.

Smith, John - 02-15-75 - business files.doc
Random Data - Adler Jack - expenses.xls
More Data Mark Stone files list.doc
etc

I need some way to pull the name from the file name, find it in the
text list and then create a directory based on the name on the list
Smith, John and move all files named with the clients name into that
directory.






Re: Record seperator

In article mailman.477.1314475482.27778.python-l...@python.org,
 Terry Reedy tjre...@udel.edu wrote:

 On 8/27/2011 1:45 PM, Roy Smith wrote:
  In article 4e592852$0$29965$c3e8da3$54964...@news.astraweb.com,
    Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:
 
  open("file.txt")   # opens the file
    .read()           # reads the contents of the file
    .split("\n\n")    # splits the text on double-newlines.
 
  The biggest problem with this code is that read() slurps the entire file
  into a string.  That's fine for moderately sized files, but will fail
  (or at least be grossly inefficient) for very large files.
 
 I read the above as separating the file into paragraphs, as indicated by 
 blank lines.
 
 def paragraphs(file):
     para = []
     for line in file:
         if line:
             para.append(line)
         else:
             yield para # or ''.join(para), as desired
             para = []

Plus or minus the last paragraph in the file :-)
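
To spell out the gaps: lines read from a file keep their trailing newline, so a blank line is "\n" and still truthy, and the final paragraph is never yielded unless the file ends with a blank line. A sketch of one way to handle both (a variation for illustration, not Terry's original), with an in-memory file standing in for the real one:

```python
import io

def paragraphs(lines):
    """Yield blank-line-separated paragraphs, including the last one."""
    para = []
    for line in lines:
        if line.strip():          # a "blank" line still contains "\n"
            para.append(line)
        elif para:                # skip runs of consecutive blank lines
            yield "".join(para)
            para = []
    if para:                      # flush the final paragraph
        yield "".join(para)

sample = io.StringIO("a\nb\n\nc\n")
print(list(paragraphs(sample)))   # ['a\nb\n', 'c\n']
```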

Re: is there any principle when writing python function

In article 4e595334$0$3$c3e8da3$54964...@news.astraweb.com,
 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:

 and then there are languages with few, or no, design principles to speak of

Oh, like PHP?

Re: how to format long if conditions

2011-08-27 Thread Colin J. Williams


On 27-Aug-11 11:53 AM, Hans Mulder wrote:

On 27/08/11 17:16:51, Colin J. Williams wrote:


What about:
cond= isinstance(left, PyCompare)
and isinstance(right, PyCompare)
and left.complist[-1] is right.complist[0]
py_and= PyCompare(left.complist + right.complist[1:])if cond
else: py_and = PyBooleanAnd(left, right)
Colin W.


That's a syntax error. You need to add parentheses.

How about:

cond = (
    isinstance(left, PyCompare)
    and isinstance(right, PyCompare)
    and left.complist[-1] is right.complist[0]
}
py_and = (
    PyCompare(left.complist + right.complist[1:])
    if cond
    else PyBooleanAnd(left, right)
)

-- HansM


I like your 11:53 message but suggest indenting the if cond as below to 
make it clearer that it, with the preceding line, is all one statement.


Colin W.

#!/usr/bin/env python
z= 1
class PyCompare:
    complist = [True, False]
    def __init__(self):
        pass
left= PyCompare
right= PyCompare
def isinstance(a, b):
    return True
def PyBooleanAnd(a, b):
    return True
def PyCompare(a):
    return False
z=2

def try1():

  '''Hans Mulder suggestion  03:50  '''
  if (
  isinstance(left, PyCompare)
  and isinstance(right, PyCompare)
  and left.complist[-1] is right.complist[0]
  ):
  py_and = PyCompare(left.complist + right.complist[1:])
  else:
  py_and = PyBooleanAnd(left, right)

def try2():
  '''cjw response - corrected  11:56  '''
  cond=  (isinstance(left, PyCompare)
 and isinstance(right, PyCompare)
 and left.complist[-1] is right.complist[0])
  py_and= (PyCompare(left.complist + right.complist[1:]) if cond
  else PyBooleanAnd(left, right))

def try3():
'''   Hans Mulder 11:53   '''
cond = (
isinstance(left, PyCompare)
and isinstance(right, PyCompare)
and left.complist[-1] is right.complist[0]
)  # not }
py_and = (
 PyCompare(left.complist + right.complist[1:])
 if   cond
 else PyBooleanAnd(left, right)
)
def main():
try1()
try2()
try3()
if __name__ == '__main__':
main()
pass

Re: UnicodeEncodeError -- 'character maps to undefined'

2011-08-27 Thread Ben Finney

Steven D'Aprano steve+comp.lang.pyt...@pearwood.info writes:

  s = u'BIEBER FEVER \u2665'
  print s  # Printing Unicode is fine.
 BIEBER FEVER ♥

You're a cruel man. Why do you hate me?

-- 
 \ “If nature has made any one thing less susceptible than all |
  `\others of exclusive property, it is the action of the thinking |
_o__)  power called an idea” —Thomas Jefferson, 1813-08-13 |
Ben Finney

Re: is there any principle when writing python function

2011-08-27 Thread Ben Finney

Emile van Sebille em...@fenx.com writes:

 Code is first and foremost written to be executed.

−1 QotW. I disagree, and have a counter-aphorism:

“Programs must be written for people to read, and only incidentally for
machines to execute.”
—Abelson & Sussman, _Structure and Interpretation of Computer Programs_

Yes, the primary *function* of the code you write is for it to
eventually execute. But the primary *audience* of the text you type into
your buffer is not the computer, but the humans who will read it. That's
what must be foremost in your mind while writing that text.

-- 
 \  “If you can't beat them, arrange to have them beaten.” —George |
  `\Carlin |
_o__)  |
Ben Finney

Re: is there any principle when writing python function


On 8/27/2011 2:57 PM Ben Finney said...

Emile van Sebilleem...@fenx.com  writes:


Code is first and foremost written to be executed.






 “Programs must be written for people to read, and only incidentally for
 machines to execute.”
 —Abelson & Sussman, _Structure and Interpretation of Computer Programs_



That's certainly self-fulfilling -- code that doesn't execute will need 
to be read to be understood, and to be fixed so that it does run. 
Nobody cares about code not intended to be executed.  Pretty it up as 
much as you have free time to do so to enlighten your intended audience.


Code that runs from the offset may not ever again need to be read, so 
the only audience will ever be the processor.


I find it much too easy to waste enormous amounts of time prettying up 
code that works.  Pretty it up when it doesn't -- that's the code that 
needs the attention.


Emile




Yes, the primary *function* of the code you write is for it to
eventually execute. But the primary *audience* of the text you type into
your buffer is not the computer, but the humans who will read it. That's
what must be foremost in your mind while writing that text.





Re: Understanding .pth in site-packages

I have .egg files in my system path. The Egg file created by my setup script 
doesn't include anything but the introductory text. If I open other eggs I see 
the zipped data, but not for my own files.

Is having a zipped egg file any faster than a regular package? or does it just 
prevent people from seeing the code?

Josh

Re: Understanding .pth in site-packages

When I run: os.listdir('c:\Python27\lib\site-packages') I get the contents in 
order, so the folders come before .pth files (as nothing comes before 
something.) I would guess Python is using os.listdir. Why wouldn't it?

Re: Understanding .pth in site-packages

OKB,

The setup.py script created the egg, but not the .pth file. I created that 
myself.

Thank you for clarifying about how .pth works. I know "redirect imports" was 
the wrong phrase, but it worked in my head at the time. It appears, at least on 
my system, that Python will find site-packages/foo before it finds and reads 
site-packages/foo.pth.

At least this solution gives me a way to develop my libraries outside of 
site-packages.

Josh

Re: is there any principle when writing python function

2011-08-27 Thread rantingrick

On Aug 27, 5:21 pm, Emile van Sebille em...@fenx.com wrote:
 On 8/27/2011 2:57 PM Ben Finney said...

  Emile van Sebilleem...@fenx.com  writes:

  Code is first and foremost written to be executed.

       “Programs must be written for people to read, and only incidentally for
       machines to execute.”
       —Abelson & Sussman, _Structure and Interpretation of Computer Programs_

 That's certainly self-fulfilling -- code that doesn't execute will need
 to be read to be understood, and to be fixed so that it does run.
 Nobody cares about code not intended to be executed.  Pretty it up as
 much as you have free time to do so to enlighten your intended audience.

 Code that runs from the offset may not ever again need to be read, so
 the only audience will ever be the processor.

WRONG!

Code may need to be extended someday no matter HOW well it executes
today. Also, code needs to be readable so the readers can learn from
it.


Re: is there any principle when writing python function

In article mailman.489.1314483681.27778.python-l...@python.org,
 Emile van Sebille em...@fenx.com wrote:

 code that doesn't execute will need to be read to be understood, and 
 to be fixed so that it does run.

That is certainly true, but it's not the whole story.  Even code that 
works perfectly today will need to be modified in the future.  Business 
requirements change.  Your code will need to be ported to a new OS.  
You'll need to make it work for 64-bit.  Or i18n.  Or y2k (well, don't 
need to worry about that one any more).  Or with a different run-time 
library.  A new compiler.  A different database.  Regulatory changes 
will impose new requirements.  Or, your company will get bought and 
you'll need to interface with a whole new system.

Code is never done.  At least not until the project is dead.

Re: Arrange files according to a text file

Thank you so much. The code worked perfectly. 

This is what I tried using Emile's code. The only time it picked the
wrong name from the list was when the file was named like this:

Data Mark Stone.doc

How can I fix this? Hope I am not asking too much?


import os
from difflib import SequenceMatcher as SM

path = r'D:\Files '
txt_names = []


with open(r'D:/python/log1.txt') as f:
    for txt_name in f.readlines():
        txt_names.append(txt_name.strip())

def ignore(x):
    return x in ' ,.'

for filename in os.listdir(path):
    ratios = [SM(ignore,filename,txt_name).ratio() for txt_name in txt_names]
    best = max(ratios)
    owner = txt_names[ratios.index(best)]
    print filename,":",owner





On Sat, 27 Aug 2011 14:08:17 -0700, Emile van Sebille em...@fenx.com
wrote:

On 8/27/2011 1:15 PM r...@rdo.python.org said...

 Hello Emile ,

 Thank you for the code below as I have not encountered SequenceMatcher
 before and would have to take a look at it closer.

 My question would it work for a text file list of names about 25k
 lines and a directory with say 100 files inside?

Sure.

Emile



 Thank you once again.


 On Sat, 27 Aug 2011 11:06:22 -0700, Emile van Sebilleem...@fenx.com
 wrote:

 On 8/27/2011 10:03 AM r...@rdo.python.org said...
 Hello,

 What would be the best way to accomplish this task?

 I'd do something like:


 usernames = """Adler, Jack
 Smith, John
 Smith, Sally
 Stone, Mark""".split('\n')

 filenames = """Smith, John - 02-15-75 - business files.doc
 Random Data - Adler Jack - expenses.xls
 More Data Mark Stone files list.doc""".split('\n')

 from difflib import SequenceMatcher as SM


 def ignore(x):
     return x in ' ,.'


 for filename in filenames:
     ratios = [SM(ignore,filename,username).ratio() for username in usernames]
     best = max(ratios)
     owner = usernames[ratios.index(best)]
     print filename,":",owner


 Emile



 I have many files in separate directories; each file name
 contains a person's name, but never in the same spot.
 I need to find that name which is listed in a large
 text file in the following format. Last name, comma
 and First name. The last name could be duplicate.

 Adler, Jack
 Smith, John
 Smith, Sally
 Stone, Mark
 etc.


 The file names don't necessarily follow any standard
 format.

 Smith, John - 02-15-75 - business files.doc
 Random Data - Adler Jack - expenses.xls
 More Data Mark Stone files list.doc
 etc

 I need some way to pull the name from the file name, find it in the
 text list and then create a directory based on the name on the list
 ("Smith, John") and move all files named with the client's name into that
 directory.



Re: is there any principle when writing python function

On 8/27/11 3:21 PM, Emile van Sebille wrote:
 On 8/27/2011 2:57 PM Ben Finney said...
 Emile van Sebilleem...@fenx.com  writes:

 Code is first and foremost written to be executed.

 
 
  “Programs must be written for people to read, and only
 incidentally for
  machines to execute.”
  —Abelson & Sussman, _Structure and Interpretation of Computer
 Programs_

 
 That's certainly self-fulfilling -- code that doesn't execute will need
 to be read to be understood, and to be fixed so that it does run. Nobody
 cares about code not intended to be executed.  Pretty it up as much as
 you have free time to do so to enlighten your intended audience.

Er, you're interpreting the quote... way overboard. No one's talking
about code that isn't intended to be executed, I don't think; the quote
includes, "and only incidentally for machines to execute." That's still
there, and it's still important. It should just not be the prime
concern while actually writing the code.

The code has to actually do something. If not, obviously you'll have to
change it.

The Pythonic emphasis on doing readable, pretty code isn't JUST about
making code that just looks good; it's not merely an aesthetic that the
community endorses.

And although people often tout the very valid reason why readability
counts -- that code is often read more than written, and that coming back
to a chunk of code 6 months later and being able to understand fully
what it's doing is very important... that's not the only reason
readability counts.

Readable, pretty, elegantly crafted code is also far more likely to be
*correct* code.

However, this:

 Code that runs from the offset may not ever again need to be read, so
 the only audience will ever be the processor.
 
 I find it much too easy to waste enormous amounts of time prettying up
 code that works.  Pretty it up when it doesn't -- that's the code that
 needs the attention.

... seems to me to be a rather significant self-fulfilling prophecy in
its own right. The chances that the code does what it's supposed to do,
accurately, and without any bugs, go down in my experience quite
significantly the farther away from pretty it is.

If you code some crazy, overly clever, poorly organized, messy chunk of
something that /works/ -- that's fine and dandy. But unless you have
some /seriously/ comprehensive test coverage then the chances that you
can eyeball it and be sure it doesn't have some subtle bugs that will
call you back to fix it later, is pretty low. In my experience.

It's not that pretty code is bug-free, but code which is easily read and
understood is vastly more likely to be functioning correctly and reliably.

Also... it just does not take that much time to make pretty code. It
really doesn't.

The entire idea that it's hard, time-consuming, effort-draining or
difficult to make code clean and pretty from the get-go is just wrong.

You don't need to do a major prettying up stage after the fact. Sure,
sometimes refactoring would greatly help a body of code as it evolves,
but you can do that as it becomes beneficial for maintenance reasons and
not just for pretty's sake.

-- 

   Stephen Hansen
   ... Also: Ixokai
   ... Mail: me+list/python (AT) ixokai (DOT) io
   ... Blog: http://meh.ixokai.io/




Re: Arrange files according to a text file

On 8/27/11 11:06 AM, Emile van Sebille wrote:
 from difflib import SequenceMatcher as SM
 
 def ignore(x):
     return x in ' ,.'
 
 for filename in filenames:
     ratios = [SM(ignore,filename,username).ratio() for username in usernames]
     best = max(ratios)
     owner = usernames[ratios.index(best)]
     print filename,":",owner

It amazes me that I can still find a surprising new tool in the stdlib
after all these years.

Neat.

/pinboards

-- 

   Stephen Hansen
   ... Also: Ixokai
   ... Mail: me+list/python (AT) ixokai (DOT) io
   ... Blog: http://meh.ixokai.io/




Re: Understanding .pth in site-packages

On 8/27/11 3:41 PM, Josh English wrote:
 I have .egg files in my system path. The Egg file created by my setup script 
 doesn't include anything but the introductory text. If I open other eggs I 
 see the zipped data, but not for my own files.

Sounds like your setup.py isn't actually including your source.
 
 Is having a zipped egg file any faster than a regular package? or does it 
 just prevent people from seeing the code?

IIUC, it's nominally very slightly faster to use an egg, because it can
skip a lot of filesystem calls. But I've only heard that and can't
completely confirm it (internal testing at my day job did not
conclusively support this, but our environments are uniquely weird).

But that speed boost (if even true) isn't really the point of
eggs-as-files -- eggs are just easy to deal with as files is all. They
don't prevent people from seeing the code*, they're just regular zip
files and can be unzipped fine.

I almost always install my eggs unzipped on a developer machine, because I
inevitably want to go poke inside and see what's actually going on.

-- 

   Stephen Hansen
   ... Also: Ixokai
   ... Mail: me+list/python (AT) ixokai (DOT) io
   ... Blog: http://meh.ixokai.io/

* Although you can make an egg and then go and remove all the .PY files
from it, and leave just the compiled .PYC files, and Python will load it
fine. At the day job, that's what we do. But, you have to be aware that
this ties the egg to a specific version of Python, and it's not difficult
for someone industrious to disassemble and/or decompile the PYC back to
effectively equivalent PY files to edit away if they want.




Re: Arrange files according to a text file

2011-08-27 Thread MRAB


On 28/08/2011 00:18, r...@rdo.python.org wrote:

Thank you so much. The code worked perfectly.

This is what I tried using Emile code. The only time when it picked
wrong name from the list was when the file was named like this.

Data Mark Stone.doc

How can I fix this? Hope I am not asking too much?


Have you tried the alternative word orders, "Mark Stone" as well as
"Stone, Mark", picking whichever name has the best ratio for either?
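
A minimal sketch of that idea, assuming every listed name has the "Last, First" form. The helper names name_variants and best_owner are made up for this illustration, and the junk filter from the original script is left out for simplicity:

```python
from difflib import SequenceMatcher


def name_variants(listed_name):
    """Return the listed 'Last, First' name plus reordered spellings of it."""
    last, first = [part.strip() for part in listed_name.split(",", 1)]
    return [listed_name, "%s %s" % (first, last), "%s %s" % (last, first)]


def best_owner(filename, listed_names):
    """Pick the listed name whose best-scoring variant is closest to filename."""
    def score(listed_name):
        return max(
            SequenceMatcher(None, filename.lower(), variant.lower()).ratio()
            for variant in name_variants(listed_name)
        )
    return max(listed_names, key=score)


names = ["Adler, Jack", "Smith, John", "Smith, Sally", "Stone, Mark"]
print(best_owner("Data Mark Stone.doc", names))
```

With a list of 25k names there may still be ties or near misses, so it is probably worth printing the winning ratio too and checking low scores by hand.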



Re: Arrange files according to a text file

On Sun, 28 Aug 2011 00:48:20 +0100, MRAB pyt...@mrabarnett.plus.com
wrote:

On 28/08/2011 00:18, r...@rdo.python.org wrote:
 Thank you so much. The code worked perfectly.

 This is what I tried using Emile code. The only time when it picked
 wrong name from the list was when the file was named like this.

 Data Mark Stone.doc

 How can I fix this? Hope I am not asking too much?

Have you tried the alternative word orders, Mark Stone as well as
Stone, Mark, picking whichever name has the best ratio for either?


Yes, I tried, and the result was the same. I will try to work out
something. Thank you.
 

packaging a python application

2011-08-27 Thread suresh

Hi
I created a Python application which consists of multiple Python files and a 
configuration file. I am not sure how to distribute it. 

I read the distutils2 documentation and a few blogs on Python packaging, but I 
still have the following questions.

1. My package has a configuration file which has to be edited by the user. How 
do we achieve that? 

2. Should the user edit the configuration file directly, or should there be an 
interface for doing it? (I remember my sendmail installations on 
Debian/Ubuntu: the installer would ask a bunch of questions and the cfg file 
would be ready.)

I am just confused about how to go about it...

thanks
suresh

Re: Record seperator


On 8/27/2011 5:07 PM, Roy Smith wrote:

In article mailman.477.1314475482.27778.python-l...@python.org,
  Terry Reedy tjre...@udel.edu wrote:


On 8/27/2011 1:45 PM, Roy Smith wrote:

In article 4e592852$0$29965$c3e8da3$54964...@news.astraweb.com,
   Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:


open("file.txt")   # opens the file
   .read()   # reads the contents of the file
   .split("\n\n")  # splits the text on double-newlines.


The biggest problem with this code is that read() slurps the entire file
into a string.  That's fine for moderately sized files, but will fail
(or at least be grossly inefficient) for very large files.


I read the above as separating the file into paragraphs, as indicated by
blank lines.

def paragraphs(file):
    para = []
    for line in file:
        if line:
            para.append(line)
        else:
            yield para # or ''.join(para), as desired
            para = []


Plus or minus the last paragraph in the file :-)


Oh right, I forgot the last line, which is a repeat of the yield after 
the for loop finishes.
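
Putting the pieces together, a sketch of the generator with the trailing yield might look like this. One change that goes beyond what was posted: lines read from a file keep their newline, so a blank line is the non-empty string "\n" and a plain "if line:" test never fails. The sketch tests line.strip() instead, and it skips runs of blank lines rather than yielding empty paragraphs.

```python
import io


def paragraphs(lines):
    """Yield each blank-line-separated paragraph as a single string."""
    para = []
    for line in lines:
        if line.strip():      # a "blank" line from a file is "\n", not ""
            para.append(line)
        elif para:            # a blank line closes a non-empty paragraph
            yield "".join(para)
            para = []
    if para:                  # the repeat of the yield after the loop
        yield "".join(para)


sample = io.StringIO("first\nparagraph\n\n\nsecond\n")
for para in paragraphs(sample):
    print(repr(para))
```

Because it only ever holds one paragraph in memory, this avoids the read() problem raised earlier in the thread.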


--
Terry Jan Reedy


Re: Arrange files according to a text file


On 8/27/2011 4:18 PM r...@rdo.python.org said...

Thank you so much. The code worked perfectly.

This is what I tried using Emile code. The only time when it picked
wrong name from the list was when the file was named like this.

Data Mark Stone.doc

How can I fix this? Hope I am not asking too much?


What name did it pick?  I imagine if you're picking a name from a list 
of 25000 names that some subset of combinations may yield like ratios.


But, if you double up on the file name side you may get closer:

for filename in filenames:
    ratios = [SM(ignore,filename+filename,username).ratio() for username in usernames]

    best = max(ratios)
    owner = usernames[ratios.index(best)]
    print filename,":",owner

... on the other hand, if you've only got 100 files to sort out, you 
should already be done.


:)

Emile


Re: Record seperator

2011-08-27 Thread Dan Stromberg

http://stromberg.dnsalias.org/svn/bufsock/trunk does it.

$ cat double-file
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
root:x:0:0:root:/root:/bin/bash
lp:x:7:7:lp:/var/spool/lpd:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
uucp:x:10:10:uucp:/var/spool/uucp:/bin/sh
proxy:x:13:13:proxy:/bin:/bin/sh
benchbox-dstromberg:~/src/home-svn/bufsock/trunk i686-pc-linux-gnu 8830 -
above cmd done 2011 Sat Aug 27 06:19 PM

$ python
Python 2.6.6 (r266:84292, Sep 15 2010, 15:52:39)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import bufsock
>>> file_ = open('double-file', 'rb')
>>> bs = bufsock.bufsock(file_)
>>> bs.readto('oo')
'daemon:x:1:1:daemon:/usr/sbin:/bin/sh\nbin:x:2:2:bin:/bin:/bin/sh\nsys:x:3:3:sys:/dev:/bin/sh\nsync:x:4:65534:sync:/bin:/bin/sync\ngames:x:5:60:games:/usr/games:/bin/sh\nman:x:6:12:man:/var/cache/man:/bin/sh\nroo'
>>> bs.close()


Don't let the name fool you - it's not just for sockets anymore.
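
For comparison, here is a standard-library-only sketch of the same kind of thing: reading records delimited by an arbitrary separator without slurping the whole file. It is not how bufsock works internally; the function name records and the chunk_size parameter are invented for this example.

```python
import io


def records(stream, sep, chunk_size=4096):
    """Yield the pieces of a text stream that are delimited by sep."""
    buf = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        parts = buf.split(sep)
        buf = parts.pop()     # the last piece may be incomplete; keep it
        for part in parts:
            yield part
    if buf:                   # whatever follows the final separator
        yield buf


for rec in records(io.StringIO("a\nb\n\nc\n\nd"), "\n\n", chunk_size=3):
    print(repr(rec))
```

Keeping the unfinished tail in buf is what lets a separator that straddles two chunks still be found on the next pass.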

On Fri, Aug 26, 2011 at 11:39 AM, greymaus greyma...@mail.com wrote:


 Is there an equivalent for the AWK RS in Python?


 as in RS='\n\n'
 will separate a file at two blank line intervals


 --
 maus
  .
  .
 ...   NO CARRIER

Custom dict to prevent keys from being overridden

2011-08-27 Thread Julien

Hi,

With a simple dict, the following happens:

>>> d = {
...   'a': 1,
...   'b': 2,
...   'a': 3
... }
>>> d
{'a': 3, 'b': 2}

... i.e. the value for the 'a' key gets overridden.

What I'd like to achieve is:

>>> d = {
...   'a': 1,
...   'b': 2,
...   'a': 3
... }
Error: The key 'a' already exists.

Is that possible, and if so, how?

Many thanks!

Kind regards,

Julien

Re: Custom dict to prevent keys from being overridden

Julien wrote:

 What I'd like to achieve is:
 
 >>> d = {
 ...   'a': 1,
 ...   'b': 2,
 ...   'a': 3
 ... }
 Error: The key 'a' already exists.
 
 Is that possible, and if so, how?

Not if the requirements include using built-in dicts { }.

But if you are happy enough to use a custom class, like this:


d = StrictDict([('a', 1), ('b', 2), ('a', 3)])

then yes. Just subclass dict and have it validate items as they are added.
Something like:

# Untested
class StrictDict(dict):
    def __init__(self, items):
        for key, value in items:
            self[key] = value
    def __setitem__(self, key, value):
        if key in self:
            raise KeyError('key %r already exists' % key)
        super(StrictDict, self).__setitem__(key, value)

should more or less do it.
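
A slightly fuller sketch along the same lines, written for Python 3. The update() override is an addition that was not in the post: in CPython, dict.update() does not go through an overridden __setitem__, so without it duplicates could still slip in that way.

```python
class StrictDict(dict):
    """A dict that refuses to rebind a key that is already present."""

    def __init__(self, items=()):
        super().__init__()
        for key, value in items:
            self[key] = value

    def __setitem__(self, key, value):
        if key in self:
            raise KeyError("key %r already exists" % (key,))
        super().__setitem__(key, value)

    def update(self, *args, **kwargs):
        # Route updates through __setitem__ so the duplicate check applies.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


d = StrictDict([("a", 1), ("b", 2)])
try:
    d["a"] = 3
except KeyError as exc:
    print("rejected:", exc)
```

Note that this still cannot catch the duplicate in a literal such as {'a': 1, 'a': 3}, because the built-in dict has already collapsed it before StrictDict sees anything.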



-- 
Steven


Re: Understanding .pth in site-packages


On Aug 27, 2011, at 6:49 PM, Josh English wrote:

 When I run: os.listdir('c:\Python27\lib\site-packages') I get the contents in 
 order, so the folders come before .pth files (as nothing comes before 
 something.)

That's one definition of "in order". =)


 I would guess Python is using os.listdir. Why wouldn't it?

If you mean that Python uses os.listdir() during import resolution, then yes I 
agree that's probable. And os.listdir() doesn't guarantee any consistent order. 
In fact, the documentation explicitly states that the list is returned in 
arbitrary order. Like a lot of things in Python, os.listdir() probably relies 
on the underlying C library which varies from system to system. (Case in point 
-- on my Mac, os.listdir() returns things in the same order as the 'ls' 
command, which is case-sensitive alphabetical, files & directories mixed -- 
different from Windows.)

So if import relies on os.listdir(), then you're relying on arbitrary 
resolution when you have a .pth file that shadows a site-packages directory. 
Those rules will probably work consistently on your particular system, but 
you're developing a habit around what is essentially an implementation quirk.
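
If your own code ever needs a predictable listing, a small sketch like the following sidesteps the problem by sorting the result itself. The helper name stable_listing is invented here, and this says nothing about what the import machinery does internally.

```python
import os
import tempfile


def stable_listing(path):
    """Return directory entries in a fixed, platform-independent order."""
    return sorted(os.listdir(path))


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "foo"))
    for name in ("foo.pth", "Bar.egg"):
        open(os.path.join(tmp, name), "w").close()
    print(stable_listing(tmp))
```

sorted() compares strings by code point, so uppercase names sort before lowercase ones on every platform.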

Cheers
Philip

Why do closures do this?

2011-08-27 Thread John O'Hagan

Somewhat apropos of the recent function principle thread, I was recently 
surprised by this:

funcs=[]
for n in range(3):
    def f():
        return n
    funcs.append(f)

[i() for i in funcs]

The last expression, IMO surprisingly, is [2,2,2], not [0,1,2]. Google tells me 
I'm not the only one surprised, but explains that it's because "n" in the 
function "f" refers to whatever "n" is currently bound to, not what it was 
bound to at definition time (if I've got that right), and that there are at 
least two ways around it: either make a factory function:

def mkfnc(n):
    def fnc():
        return n
    return fnc

funcs=[]
for n in range(3):
    funcs.append(mkfnc(n))

which seems roundabout, or take advantage of the "default values set at 
definition time" behaviour:

funcs=[]
for n in range(3):
    def f(n=n):
        return n
    funcs.append(f)

which seems obscure, and a side-effect.
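
For side-by-side comparison, here is a sketch that wraps each variant in a function so the results can be checked together. The wrapper function names are invented for this example, and functools.partial is a third option that was not mentioned above.

```python
from functools import partial


def late_bound():
    funcs = []
    for n in range(3):
        def f():
            return n          # n is looked up when f is called
        funcs.append(f)
    return [f() for f in funcs]


def with_factory():
    def make(n):
        def f():
            return n          # each call to make() has its own n
        return f
    return [f() for f in [make(n) for n in range(3)]]


def with_default():
    funcs = []
    for n in range(3):
        def f(n=n):           # default evaluated once, at definition time
            return n
        funcs.append(f)
    return [f() for f in funcs]


def with_partial():
    def show(n):
        return n
    return [f() for f in [partial(show, n) for n in range(3)]]


print(late_bound(), with_factory(), with_default(), with_partial())
```

Only the first variant yields [2, 2, 2]; the other three each freeze the value of n at the moment the function object is created.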

My question is, is this an inescapable consequence of using closures, or is it 
by design, and if so, what are some examples of where this would be the 
preferred behaviour?

Regards,

John 

Re: Why do closures do this?


On 8/27/2011 11:45 PM, John O'Hagan wrote:

Somewhat apropos of the recent function principle thread, I was recently 
surprised by this:

funcs=[]
for n in range(3):
    def f():
        return n
    funcs.append(f)



The last expression, IMO surprisingly, is [2,2,2], not [0,1,2]. Google tells me I'm not the only one 
surprised, but explains that it's because n in the function f refers to whatever 
n is currently bound to, not what it was bound to at definition time (if I've got that right), 
and that there are at least two ways around it: either make a factory function:


def f(): return n
is a CONSTANT value. It is not a closure.

Your code above is the same as
def f(): return n
funcs = [f,f,f]
n = 2
[i() for i in funcs]


def mkfnc(n):
    def fnc():
        return n
    return fnc


fnc is a closure and n is a nonlocal name. Since you only read it, no 
nonlocal declaration is needed.



funcs=[]
for n in range(3):
    funcs.append(mkfnc(n))

which seems roundabout, or take advantage of the default values set at definition 
time behaviour:

funcs=[]
for n in range(3):
    def f(n=n):
        return n
    funcs.append(f)

which seems obscure, and a side-effect.


It was the standard idiom until nested functions were upgraded to 
enclose or capture the values of nonlocals.



My question is, is this an inescapable consequence of using closures,


I cannot answer since I am not sure what you mean by 'this'.
Closures are nested functions that access the locals of enclosing 
functions. To ensure that the access remains possible even after the 
enclosing function returns, the last value of such accessed names is 
preserved even after the enclosing function returns. (That is the tricky 
part.)
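
A small sketch of that "tricky part", assuming CPython, where the preserved variable can be inspected through the function's __closure__ attribute. The name make_funcs is made up for this example.

```python
def make_funcs():
    funcs = []
    for n in range(3):
        def f():
            return n
        funcs.append(f)
    return funcs              # n outlives this call inside a closure cell


funcs = make_funcs()

# All three functions refer to the same cell, which holds n's last value.
shared_cells = {id(f.__closure__[0]) for f in funcs}
print(len(shared_cells), funcs[0].__closure__[0].cell_contents)
print([f() for f in funcs])
```

Since the three functions share one variable rather than three snapshots, they all report whatever it was last bound to.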


--
Terry Jan Reedy


Re: is there any principle when writing python function

2011-08-27 Thread harrismh777


smith jack wrote:

i have heard that function invocation in python is expensive, but make
lots of functions are a good design habit in many other languages, so
is there any principle when writing python function?
for example, how many lines should form a function?


Once Abraham Lincoln was asked how long a man's legs should be. (Well, 
he was a tall man and had exceptionally long legs... his bed had to be 
specially made.)


Old Abe said, "A man's legs ought to be long enough to reach from his 
body to the floor."



One time the Austrian Emperor decided that one of Wolfgang Amadeus 
Mozart's masterpieces contained too many notes...  when asked how many 
notes a masterpiece ought to contain it is reported that Mozart 
retorted, "I use precisely as many notes as the piece requires, not one 
note more, and not one note less."



After starting the python interpreter import this:

   import this


... study carefully.   If you're not Dutch, don't worry if some of it 
confuses you. ... apply liberally to your function praxis.



kind regards,




--
m harris

FSF  ...free as in freedom/
http://webpages.charter.net/harrismh777/gnulinux/gnulinux.htm

Re: Why do closures do this?

2011-08-27 Thread John O'Hagan

On Sun, 28 Aug 2011 00:19:07 -0400
Terry Reedy tjre...@udel.edu wrote:

 On 8/27/2011 11:45 PM, John O'Hagan wrote:
  Somewhat apropos of the recent function principle thread, I was recently 
  surprised by this:
 
  funcs=[]
  for n in range(3):
      def f():
          return n
      funcs.append(f)
 
 
 
  The last expression, IMO surprisingly, is [2,2,2], not [0,1,2]. 

[...]

 
 def f(): return n
 is a CONSTANT value. It is not a closure.
 
Quite right: I originally encountered this inside a function, but removed the 
enclosing function to show the issue in minimal form.
 
 Your code above is the same as
 def f(): return n
 funcs = [f,f,f]
 n = 2
 [i() for i in funcs]
 

Also right, but I still find this surprising.

[...]

 
  My question is, is this an inescapable consequence of using closures,
 
 I cannot answer since I am not sure what you mean by 'this'.

Ah, but you are and you have:

 Closures are nested functions that access the locals of enclosing 
 functions. To ensure that the access remains possible even after the 
 enclosing function returns, the last value of such accessed names is 
 preserved even after the enclosing function returns. (That is the tricky 
 part.)
 

Thanks,

John

Re: Understanding .pth in site-packages

2011-08-27 Thread OKB (not okblacke)

Josh English wrote:

 OKB,
 
 The setup.py script created the egg, but not the .pth file. I
 created that myself. 
 
 Thank you for clarifying about how .pth works. I know redirect
 imports was the wrong phrase, but it worked in my head at the
 time. It appears, at least on my system, that Python will find
 site-packages/foo before it finds and reads site-packages/foo.pth. 
 
 At least this solution gives me a way to develop my libraries
 outside of site-packages. 

Well, I'm still not totally sure what your setup is, but assuming 
site-packages/foo is a directory containing an __init__.py (that is, it 
is a package), then yes, it will be found before an alternative package 
in a directory named with a .pth file.  Note that I don't say it will be 
found before the .pth file, because, again, the finding of the package 
(when you do import foo) happens much later than the processing of the 
.pth file.  So it doesn't find site-packages/foo before it reads 
foo.pth; it just finds site-packages/foo before it finds the other foo 
that foo.pth was trying to point to.

Let's say your .pth file specifies the directory /elsewhere.  The 
.pth file is processed by site.py when the interpreter starts up, and at 
that time /elsewhere  will be appended to sys.path.  Later, when you do 
the import, it searches sys.path in order.  site-packages itself will be 
earlier in sys.path than /elsewhere, so a package site-packages/foo will 
be found before /elsewhere/foo.  The key here is that the .pth file is 
processed at interpreter-start time, but the search for foo doesn't take 
place until you actually execute import foo.
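
A rough sketch of that ordering, using site.addsitedir() on a throwaway directory instead of the real site-packages. The directory names and foo.pth are stand-ins invented for the example; the point is only that the directory holding the .pth file ends up on sys.path ahead of the directory the .pth file names.

```python
import os
import site
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    fake_site = os.path.join(root, "site-packages")  # stand-in, not the real one
    elsewhere = os.path.join(root, "elsewhere")      # target named in the .pth
    os.makedirs(fake_site)
    os.makedirs(elsewhere)

    # A .pth file is just a list of directories, one per line.
    with open(os.path.join(fake_site, "foo.pth"), "w") as fh:
        fh.write(elsewhere + "\n")

    before = set(sys.path)
    site.addsitedir(fake_site)  # appends fake_site, then processes its .pth files
    added = [p for p in sys.path if p not in before]
    print(added)  # fake_site first, elsewhere after it

    # Leave sys.path the way we found it.
    for p in added:
        sys.path.remove(p)
```

Because import walks sys.path in order, a package in the first directory would shadow one of the same name in the second.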

If you want to make your /elsewhere jump the line and go to the 
front, look at easy_install.pth, which seems to have some magic code at 
the end that moves its eggs ahead of site-packages in sys.path.  I'm not 
sure how this works, though, and it seems like a risky proposition.


-- 
--OKB (not okblacke)
Brendan Barnwell
Do not follow where the path may lead.  Go, instead, where there is
no path, and leave a trail.
--author unknown

Re: Arrange files according to a text file

No, it turned out to be my mistake. Your code was correct and I
appreciate it very much.

Thank you again 

On Sat, 27 Aug 2011 18:10:07 -0700, Emile van Sebille em...@fenx.com
wrote:

On 8/27/2011 4:18 PM r...@rdo.python.org said...
 Thank you so much. The code worked perfectly.

 This is what I tried using Emile code. The only time when it picked
 wrong name from the list was when the file was named like this.

 Data Mark Stone.doc

 How can I fix this? Hope I am not asking too much?

What name did it pick?  I imagine if you're picking a name from a list 
of 25000 names that some subset of combinations may yield like ratios.

But, if you double up on the file name side you may get closer:

for filename in filenames:
    ratios = [SM(ignore, filename+filename, username).ratio()
              for username in usernames]
    best = max(ratios)
    owner = usernames[ratios.index(best)]
    print filename, ":", owner

... on the other hand, if you've only got a 100 files to sort out, you 
should already be done.
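
For anyone following along on Python 3, here is a self-contained sketch of the same idea. It assumes SM above is difflib.SequenceMatcher under a short name and passes None where the thread's ignore (the junk filter) went; best_owner and the sample names are invented for illustration.

```python
from difflib import SequenceMatcher


def best_owner(filename, usernames):
    """Return the username most similar to the (doubled) file name."""
    doubled = filename + filename
    return max(usernames,
               key=lambda user: SequenceMatcher(None, doubled, user).ratio())


usernames = ["Mark Stone", "John Smith", "Mary Jones"]
print(best_owner("Data Mark Stone.doc", usernames))  # Mark Stone
```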

:)

Emile

[issue12768] docstrings for the threading module

2011-08-27 Thread Graeme Cross


Graeme Cross gjcr...@gmail.com added the comment:

I will check that the patch works with 3.2; if not, I'll redo the patch for 3.2.
I will also incorporate the review changes from Ezio and Eric.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12768
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue12833] raw_input misbehaves when readline is imported

2011-08-27 Thread Nadeem Vawda


Nadeem Vawda nadeem.va...@gmail.com added the comment:

Reproduced on 3.3 head. Looking at the documentation of the C readline
library, it needs to know the length of the prompt in order to display
properly, so this seems to be an acknowledged limitation of the underlying
library rather than a bug on our side.

Still, this behavior is surprising and undesirable. I would suggest adding
a note to the docs for the readline module, directing users to write:

input("foo ")

instead of:

sys.stdout.write("foo ")
input()

--
nosy: +nadeem.vawda

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12833
___

[issue12833] raw_input misbehaves when readline is imported

2011-08-27 Thread Idan Kamara


Idan Kamara idank...@gmail.com added the comment:

You're right, as this little C program verifies:

#include <stdio.h>
#include <stdlib.h>
#include <readline/readline.h>

int main() {
   /* Print the prompt ourselves, so readline never learns its length. */
   printf("foo ");
   char* buf = readline("");
   free(buf);

   return 0;
}

Passing ' ' seems to be a suitable workaround for those who can't pass the text 
directly to raw_input though (such is the case where you have special classes 
that handle output).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12833
___

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-27 Thread Amaury Forgeot d'Arc


Amaury Forgeot d'Arc amaur...@gmail.com added the comment:

Unfortunately, it won't work. _dosmaperr() is not exported by msvcrt.dll, it is 
only available when you link against the static version of the C runtime.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12802
___

[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-27 Thread Tom Christiansen


Tom Christiansen tchr...@perl.com added the comment:

Guido van Rossum rep...@bugs.python.org wrote
   on Sat, 27 Aug 2011 03:26:21 -: 

 To me, making (default) iteration deviate from indexing is anathema.

So long as there's a way to iterate through a string some other way
than by code unit, that's fine.  However, the Java way of 16-bit code
units is so annoying because there often aren't code point APIs, and 
so you get a lot of niggling errors creeping in.  This is part of why
I strongly prefer wide builds, so that code point and code unit are the
same thing again.

 However, there is nothing wrong with providing a library function that
 takes a string and returns an iterator that iterates over code points,
 joining surrogate pairs as needed. You could even have one that
 iterates over characters (I think Tom calls them graphemes), if that
 is well-defined and useful.

Character can sometimes be a confusing term when it means something
different to us programmers than it does to users.  Code point to mean the
integer is a lot clearer to us but to no one else.  At work I often just
give in and go along with the crowd and say character for the number that
sits in a char or wchar_t or Character variable, even though of course
that's a code point.  I only rebel when they start calling code units 
characters, which (inexperienced) Java people tend to do, because that
leads to surrogate splitting and related errors.

By grapheme I mean something the user perceives as a single character.  In
full Unicodese, this is an extended grapheme cluster.  These are code point
sequences that start with a grapheme base and have zero or more grapheme
extenders following it.  For our purposes, that's *mostly* like saying you
have a non-Mark followed by any number of Mark code points, the main
exception being that a CR followed by a LF also counts as a single grapheme
in Unicode.

If you are in an editor and wanted to swap two characters, the one 
under the user's cursor and the one next to it, you have to deal with
graphemes not individual code points, or else you'd get the wrong answer.
Imagine swapping the last two characters of the first string below,
or the first two characters of second one:

contrôlée    contro\x{302}le\x{301}e
élève        e\x{301}le\x{300}ve
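
A rough sketch of the difference, using the second string. The graphemes() helper below is only the approximation described above (a non-Mark followed by any number of Marks, plus CR LF); it is not the full extended grapheme cluster algorithm, and the function names are invented for this example.

```python
import unicodedata


def graphemes(text):
    """Yield approximate grapheme clusters: a base plus any following marks."""
    cluster = ""
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        joins_crlf = cluster == "\r" and ch == "\n"
        if cluster and (is_mark or joins_crlf):
            cluster += ch
        else:
            if cluster:
                yield cluster
            cluster = ch
    if cluster:
        yield cluster


def swap_first_two(text):
    """Swap the first two approximate graphemes of text."""
    parts = list(graphemes(text))
    if len(parts) >= 2:
        parts[0], parts[1] = parts[1], parts[0]
    return "".join(parts)


word = "e\u0301le\u0300ve"  # "élève" in decomposed form

# Swapping code points strands the accent at the front of the string.
by_code_point = word[1] + word[0] + word[2:]
# Swapping graphemes keeps the accent attached to its base letter.
by_grapheme = swap_first_two(word)

print(ascii(by_code_point))  # '\u0301ele\u0300ve'
print(ascii(by_grapheme))    # 'le\u0301e\u0300ve'
```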

While you can sometimes fake a correct answer by considering things
in NFC not NFD, that doesn't work in the general case, as there
are only a few compatibility glyphs for round-tripping for legacy
encodings (like ISO 8859-1) compared with infinitely many combinations
of combining marks.  Particularly in mathematics and in phonetics, 
you often end up using marks on characters for which no pre-combined
variant glyph exists.  Here's the IPA for a couple of Spanish words
with their tight (phonetic, not phonemic) transcriptions:

anécdota  [a̠ˈne̞ɣ̞ð̞o̞t̪a̠]
rincón  [rĩŋˈkõ̞n]

NFD:
ane\x{301}cdota
[a\x{320}\x{2C8}ne\x{31E}\x{263}\x{31E}\x{F0}\x{31E}o\x{31E}t\x{32A}a\x{320}]
rinco\x{301}n  [ri\x{303}\x{14B}\x{2C8}ko\x{31E}\x{303}n]

NFC:
an\x{E9}cdota
[a\x{320}\x{2C8}ne\x{31E}\x{263}\x{31E}\x{F0}\x{31E}o\x{31E}t\x{32A}a\x{320}]
rinc\x{F3}n  [r\x{129}\x{14B}\x{2C8}k\x{F5}\x{31E}n]

So combining marks don't just go away in NFC, and you really do have to
deal with them.  Notice that to get the tabs right (your favorite subject :),
you have to deal with print widths, which is another place that you get
into trouble if you only count code points.
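
The same point can be checked with Python's unicodedata module. This is a small sketch on one syllable from the transcription above: NFC folds the tilde into a precomposed letter, but the other mark has nothing to compose with and stays behind.

```python
import unicodedata

# k, o, COMBINING DOWN TACK BELOW, COMBINING TILDE, n
nfd = "ko\u031e\u0303n"
nfc = unicodedata.normalize("NFC", nfd)

print(ascii(nfd), len(nfd))  # 'ko\u031e\u0303n' 5
print(ascii(nfc), len(nfc))  # 'k\xf5\u031en' 4

# Code points with a nonzero combining class that survive NFC.
leftover = [ch for ch in nfc if unicodedata.combining(ch)]
print([unicodedata.name(ch) for ch in leftover])
```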

BTW, did you know that the stress mark used in the phonetics above
is actually a (modifier) letter in Unicode, not punctuation?

# uniprops -a 2c8
U+02C8 ‹ˈ› \N{MODIFIER LETTER VERTICAL LINE}
\w \pL \p{L_} \p{Lm}
All Any Alnum Alpha Alphabetic Assigned InSpacingModifierLetters 
Case_Ignorable CI Common Zyyy Dia Diacritic L Lm Gr_Base Grapheme_Base Graph 
GrBase ID_Continue IDC ID_Start IDS Letter L_ Modifier_Letter Print 
Spacing_Modifier_Letters Word XID_Continue XIDC XID_Start XIDS X_POSIX_Alnum 
X_POSIX_Alpha X_POSIX_Graph X_POSIX_Print X_POSIX_Word
Age=1.1 Bidi_Class=ON Bidi_Class=Other_Neutral BC=ON 
Block=Spacing_Modifier_Letters Canonical_Combining_Class=0 
Canonical_Combining_Class=Not_Reordered CCC=NR Canonical_Combining_Class=NR 
Script=Common Decomposition_Type=None DT=None East_Asian_Width=Neutral 
Grapheme_Cluster_Break=Other GCB=XX Grapheme_Cluster_Break=XX 
Hangul_Syllable_Type=NA Hangul_Syllable_Type=Not_Applicable HST=NA 
Joining_Group=No_Joining_Group JG=NoJoiningGroup Joining_Type=Non_Joining JT=U 
Joining_Type=U Line_Break=BB Line_Break=Break_Before LB=BB Numeric_Type=None 
NT=None Numeric_Value=NaN NV=NaN Present_In=1.1 IN=1.1 Present_In=2.0 IN=2.0 
Present_In=2.1 IN=2.1 Present_In=3.0 IN=3.0 Present_In=3.1 IN=3.1 
Present_In=3.2 IN=3.2 Present_In=4.0 IN=4.0 Present_In=4.1 IN=4.1 
Present_In=5.0 IN=5.0 Present_In=5.1 IN=5.1 Present_In=5.2 IN=5.2 
Present_In=6.0 IN=6.0 SC=Zyyy
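
The headline facts from that listing can be cross-checked from Python with the standard unicodedata module (a quick sketch; it exposes far fewer properties than uniprops does):

```python
import unicodedata

ch = "\u02c8"  # the stress mark used in the transcriptions

print(unicodedata.name(ch))       # MODIFIER LETTER VERTICAL LINE
print(unicodedata.category(ch))   # Lm, a modifier letter, not punctuation
print(unicodedata.combining(ch))  # 0, so it is not a combining mark
print(ch.isalpha())               # True
```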

[issue12847] crash with negative PUT in pickle


New submission from Antoine Pitrou pit...@free.fr:

This doesn't happen on 2.x cPickle, where PUT keys are simply treated as 
strings.

>>> import pickle, pickletools
>>> s = b'Va\np-1\n.'
>>> pickletools.dis(s)
    0: V    UNICODE    'a'
    3: p    PUT        -1
    7: .    STOP
highest protocol among opcodes = 0
>>> pickle.loads(s)
Segmentation fault
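
As a sketch of how such input can be spotted without unpickling it, pickletools.genops() walks the opcode stream and reports each argument. The helper name negative_memo_keys is invented here, and this is only an illustration, not the fix applied to the unpickler.

```python
import pickletools


def negative_memo_keys(data):
    """Return (opcode name, argument, offset) for memo writes with a negative key."""
    found = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in ("PUT", "BINPUT", "LONG_BINPUT") and arg < 0:
            found.append((opcode.name, arg, pos))
    return found


print(negative_memo_keys(b"Va\np-1\n."))  # [('PUT', -1, 3)]
print(negative_memo_keys(b"Va\np0\n."))   # []
```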

--
messages: 143062
nosy: pitrou
priority: normal
severity: normal
status: open
title: crash with negative PUT in pickle
type: crash
versions: Python 3.2, Python 3.3

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12847
___

[issue12847] crash with negative PUT in pickle


Antoine Pitrou pit...@free.fr added the comment:

Same with LONG_BINPUT on a 32-bit build:

>>> s = b'\x80\x03X\x01\x00\x00\x00ar\xff\xff\xff\xff.'
>>> pickletools.dis(s)
    0: \x80 PROTO      3
    2: X    BINUNICODE 'a'
    8: r    LONG_BINPUT -1
   13: .    STOP
highest protocol among opcodes = 2
>>> pickle.loads(s)
Segmentation fault

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12847
___

[issue11564] pickle not 64-bit ready


Antoine Pitrou pit...@free.fr added the comment:

Here is a new patch against 3.2. I can't say it works for sure, but it should 
be much better. It also adds a couple more tests.
There seems to be a separate issue where pure-Python pickle.py considers 32-bit 
lengths signed where the C impl considers them unsigned...

--
Added file: http://bugs.python.org/file23052/pickle64-4.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11564
___

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned


New submission from Antoine Pitrou pit...@free.fr:

In several opcodes (BINBYTES, BINUNICODE... what else?), _pickle.c happily 
accepts 32-bit lengths of more than 2**31, while pickle.py uses marshal's i 
typecode which means signed... and therefore fails reading the data.
Apparently, pickle.py uses marshal for speed reasons, but marshal doesn't 
support unsigned types.
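
To illustrate the mismatch, here is a small sketch using the struct module rather than marshal (a substitution made for clarity; pickle.py itself is described above as using marshal). The same four little-endian bytes decode to very different lengths depending on signedness.

```python
import struct

raw = b"\x00\x00\x00\x80"  # the 32-bit little-endian encoding of 2**31

(as_signed,) = struct.unpack("<i", raw)    # signed reading
(as_unsigned,) = struct.unpack("<I", raw)  # unsigned reading

print(as_signed)    # -2147483648
print(as_unsigned)  # 2147483648
```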

(seen from http://bugs.python.org/issue11564)

--
components: Library (Lib)
messages: 143065
nosy: alexandre.vassalotti, pitrou
priority: normal
severity: normal
status: open
title: pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned
type: behavior
versions: Python 3.2, Python 3.3

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12848
___

[issue12835] Missing SSLSocket.sendmsg() wrapper allows programs to send unencrypted data by mistake

2011-08-27 Thread Roundup Robot


Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset b06f011a3529 by Nick Coghlan in branch 'default':
Fix #12835: prevent use of the unencrypted sendmsg/recvmsg APIs on SSL wrapped 
sockets (Patch by David Watson)
http://hg.python.org/cpython/rev/b06f011a3529

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12835
___

[issue12835] Missing SSLSocket.sendmsg() wrapper allows programs to send unencrypted data by mistake

2011-08-27 Thread Nick Coghlan


Changes by Nick Coghlan ncogh...@gmail.com:


--
resolution:  -> fixed
stage:  -> committed/rejected
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12835
___

[issue9923] mailcap module may not work on non-POSIX platforms if MAILCAPS env variable is set

2011-08-27 Thread Roundup Robot


Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 7b83d2c1aad9 by Nick Coghlan in branch 'default':
Fix #9923: mailcap now uses the OS path separator for the MAILCAP envvar. Not 
backported, since it could break cases where people worked around the old 
POSIX-specific behaviour on non-POSIX platforms.
http://hg.python.org/cpython/rev/7b83d2c1aad9
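
A sketch of why the separator matters. This is not the mailcap module's code; split_search_path is a hypothetical helper, and the paths are made up. Splitting a Windows-style list on the POSIX separator breaks it apart at the drive letters.

```python
import os


def split_search_path(value, sep=os.pathsep):
    """Split a search-path style variable on the given separator."""
    return [part for part in value.split(sep) if part]


windows_value = r"C:\etc\mailcap;D:\mailcap"

print(split_search_path(windows_value, sep=";"))  # two intact paths
print(split_search_path(windows_value, sep=":"))  # drive letters split off
```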

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9923
___

[issue12174] Multiprocessing logging levels unclear

2011-08-27 Thread Vinay Sajip


Vinay Sajip vinay_sa...@yahoo.co.uk added the comment:

Although the reference docs don't list the numeric values of logging levels, 
this happened during reorganising of the docs. The table has moved to the HOWTO:

http://docs.python.org/howto/logging.html#logging-levels

That said, I don't understand the need for special logging levels in the 
multiprocessing package. From the section following the one linked to above:

Defining your own levels is possible, but should not be necessary, as the 
existing levels have been chosen on the basis of practical experience. However, 
if you are convinced that you need custom levels, great care should be 
exercised when doing this, and it is possibly *a very bad idea to define custom 
levels if you are developing a library*. That’s because if multiple library 
authors all define their own custom levels, there is a chance that the logging 
output from such multiple libraries used together will be difficult for the 
using developer to control and/or interpret, because a given numeric value 
might mean different things for different libraries.
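
For reference, a short sketch that prints the numeric values in question. The standard levels come from the logging module; SUBDEBUG and SUBWARNING are the extra constants in multiprocessing.util, which sit between the standard ones.

```python
import logging
from multiprocessing import util

# Standard levels, as listed in the logging HOWTO.
for name in ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"):
    print(name, getattr(logging, name))

# Extra levels defined by the multiprocessing package.
print("SUBDEBUG", util.SUBDEBUG)
print("SUBWARNING", util.SUBWARNING)
```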

--
nosy: +vinay.sajip

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12174
___

[issue9923] mailcap module may not work on non-POSIX platforms if MAILCAPS env variable is set

2011-08-27 Thread Nick Coghlan


Nick Coghlan ncogh...@gmail.com added the comment:

As noted in the commit message, I didn't backport this, since it didn't seem 
worth risking breaking even the unlikely case that someone actually *was* using 
the MAILCAP environment variable on Windows.

--
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed
versions:  -Python 2.7, Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9923
___

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-27 Thread Vlad Riscutia


Vlad Riscutia riscutiav...@gmail.com added the comment:

Oh, got it. Interesting. Then should I just add a comment somewhere or should 
we resolve this as Won't Fix?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12802
___

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL


Antoine Pitrou pit...@free.fr added the comment:

We could add a special case to generrmap.c (but how can I compile and execute 
this file? it doesn't seem to be part of the project files).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12802
___

[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-27 Thread Tom Christiansen

Tom Christiansen tchr...@perl.com added the comment:

Guido van Rossum rep...@bugs.python.org wrote
on Fri, 26 Aug 2011 21:11:24 -:

Would this also affect .islower() and friends?

SHORT VERSION: (7 lines)

I don't believe so, but the relationship between lower() and islower()
is not as clear to me as I would have thought, and more importantly,
the code and the documentation for Python's islower() etc currently seem
to disagree. For future releases, I recommend fixing the code, but if
compatibility is an issue, then perhaps for previous releases still in
maintenance mode fixing only the documentation would possibly be good
enough--your call.

===

MEDIUM VERSION: (87 lines)

I was initially confused with Python's islower() family because of the way
they are defined to operate on full strings. They don't check that
everything is lowercase even though they say they do.

http://docs.python.org/py3k/library/stdtypes.html#sequence-types-str-bytes-bytearray-list-tuple-range

str.lower()

Return a copy of the string with all the cased characters [4]
converted to lowercase.

str.islower()

Return true if all cased characters [4] in the string are lowercase
and there is at least one cased character, false otherwise.

[4] (1, 2, 3, 4) Cased characters are those with general category
property being one of “Lu” (Letter, uppercase), “Ll” (Letter,
lowercase), or “Lt” (Letter, titlecase).

This is strange in several ways. Of lesser importance is that
strings can be considered lowercase even if they don't match

^\p{lowercase}+$

Another is that the result of calling str.lower() may not be .islower().
I'm not sure what these are particularly for, since I myself would just use
a regex to get finer-grained control. (I suppose that's because re doesn't
give access to the Unicode properties needed that this approach never
gained any traction in the Python community.)

However, the worst of this is that the documentation defines both cased
characters and lowercase characters *differently* from how Unicode
defines those very same terms. This was quite confusing.

Unicode distinguishes Cased code points from Cased_*Letter* code points.
Python is using the Cased_Letter property but calling it Cased. Cased is
a proper superset of Cased_Letter. From the DerivedCoreProperties file in
the Unicode Character Database:

# Derived Property: Cased (Cased)
# As defined by Unicode Standard Definition D120
# C has the Lowercase or Uppercase property or has a General_Category
value of Titlecase_Letter.

In the same way, the Lowercase and Uppercase properties are not the same as
the Lowercase_*Letter* and Uppercase_*Letter* properties. Rather, the former
are respectively proper supersets of the latter.

# Derived Property: Lowercase
# Generated from: Ll + Other_Lowercase

[...]

# Derived Property: Uppercase
# Generated from: Lu + Other_Uppercase

In all these, you almost always want the superset versions not the
restricted subset versions you are using. If it were in the regex engine,
the user could select either.

Java used to miss all these, too. But in 1.7, they updated their character
methods to use the properties that they'd all along said they were using:

http://download.oracle.com/javase/7/docs/api/java/lang/Character.html#isLowerCase(char)

public static boolean isLowerCase(char ch)
Determines if the specified character is a lowercase character.

A character is lowercase if its general category type, provided by
Character.getType(ch), is LOWERCASE_LETTER, or it has contributory
- property Other_Lowercase as defined by the Unicode Standard.

Note: This method cannot handle supplementary characters. To
support all Unicode characters, including supplementary
characters, use the isLowerCase(int) method.

(And yes, that's where Java uses character to mean code unit
not code point, alas. No wonder people get confused)

I'm pretty sure that Python needs to either update its documentation to
match its code, update its code to match its documentation, or both. Java
chose to update the code to match the documentation, and this is the course
I would recommend if at all possible. If you say you are checking for
cased code points, then you should use the Unicode definition of cased code
points not your own, and if you say you are checking for lowercase code
points, then you should use the Unicode definition not your own. Both of
these require access to contributory properties from the UCD and not
just general categories alone.
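
A small sketch of both observations, using only the general categories that the unicodedata module exposes. U+02B0 and U+2170 are chosen as examples of code points that Unicode lists under Other_Lowercase although their general category is not Ll; what islower() returns for them depends on the Python version, so it is not shown.

```python
import unicodedata

# General category alone: only the first of these is Ll.
for ch in ("a", "\u02b0", "\u2170"):
    print("U+%04X" % ord(ch), unicodedata.name(ch), unicodedata.category(ch))

# The result of lower() need not satisfy islower(): with no cased
# characters at all, islower() is false.
digits = "123"
print(digits.lower() == digits)   # True
print(digits.lower().islower())   # False
```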

--tom

===

LONG VERSION: (222 lines)

Essential tools I use for inspecting Unicode code points and their
properties include

[issue10015] Creating a multiprocess.pool.ThreadPool from a child thread blows up.

2011-08-27 Thread Vinay Sajip


Changes by Vinay Sajip vinay_sa...@yahoo.co.uk:


--
title: Creating a multiproccess.pool.ThreadPool from a child thread blows up. 
-> Creating a multiprocess.pool.ThreadPool from a child thread blows up.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10015
___

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL