Re: python3 raw strings and \u escapes

2012-05-31 Thread ru...@yahoo.com
On 05/30/2012 09:07 AM, ru...@yahoo.com wrote:
 On 05/30/2012 05:54 AM, Thomas Rachel wrote:
 Am 30.05.2012 08:52 schrieb ru...@yahoo.com:

 This breaks a lot of my code because in python 2
re.split (ur'[\u3000]', u'A\u3000A') ==  [u'A', u'A']
 but in python 3 (the result of running 2to3),
re.split (r'[\u3000]', 'A\u3000A' ) ==  ['A\u3000A']

 I can remove the r prefix from the regex string but then
 if I have other regex backslash symbols in it, I have to
 double all the other backslashes -- the very thing that
 the r-prefix was invented to avoid.

 Or I can leave the r prefix and replace something like
 r'[ \u3000]' with r'[  ]'.  But that is confusing because
 one can't distinguish between the space character and
 the ideographic space character.  It is also a problem if a
 reader of the code doesn't have a font that can display
 the character.

 Was there a reason for dropping the lexical processing of
 \u escapes in strings in python3 (other than to add another
 annoyance in a long list of python3 annoyances?)

 Probably it is more consistent. Alas, it makes the whole thing
 incompatible with Py2.

 But if you think about it: why allow \u if \r, \n etc. are
 disallowed?

 Maybe the blame is elsewhere then...  If the re module
 interprets (in a regex string) the 2-character string
 consisting of r'\' followed by 'n' as a single newline
 character, then why wasn't re changed for Python 3 to
 interpret the 6-character string, r'\u3000' as a single
 unicode character to correspond with Python's lexer no
 longer doing that (as it did in Python 2)?

 And is there no choice for me but to choose between the two
 poor choices I mention above to deal with this problem?

 There is a 3rd one: use   r'[ ' + '\u3000' + ']'. Not very nice to read,
 but should do the trick...

 I guess the +s could be left out allowing something
 like,

   '[ \u3000]' r'\w+ \d{3}'

 but I'll have to try it a little; maybe just doubling
 backslashes won't be much worse.  I did that for years
 in Perl and lived through it.

Just for some closure, there are many places in my code
that I had/have to track down and change.  But the biggest
problem so far is a lexer module that is structured as many
dozens of little functions, each with a docstring that is
a regex string.

The only way I found to change these and maintain sanity was
to go through them and remove the r prefix from any strings
that contain \u literals, and then double any other
backslashes in the string.

Since these are docstrings, creating them with executable
code was awkward, and using adjacent string concatenation
led to a very confusing mix of string styles.  Strings that
used concatenation often had a single logical regex structure
(eg a character set [...]) split between two strings.
The extra quote characters were as visually confusing as
doubled backslashes in many cases.

Strings with doubled backslashes, although harder to read,
were much easier to edit reliably and, in their way,
more regular.  It does make this module look very Perlish
though... :-)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python3 raw strings and \u escapes

2012-05-31 Thread ru...@yahoo.com
On 05/31/2012 03:10 PM, Chris Angelico wrote:
 On Fri, Jun 1, 2012 at 6:28 AM, ru...@yahoo.com ru...@yahoo.com wrote:
 ... a lexer module that is structured as many
 dozens of little functions, each with a docstring that is
 a regex string.

 This may be a good opportunity to take a step back and ask yourself:
 Why so many functions, each with a regular expression in its
 docstring?

Because that's the way David Beazley designed Ply?
 http://dabeaz.com/ply/

Personally, I think it's an abuse of docstrings but
he never asked me for my opinion...
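
(For context, a Ply token rule looks roughly like the sketch below;
the token names and regexes are invented, not taken from the lexer
under discussion, and it assumes the ply package is installed.  The
docstring of each function *is* the token's regex, which is what
makes the r-prefix/\u problem above so awkward.)

  import ply.lex as lex

  tokens = ('NUMBER', 'WORD')

  def t_NUMBER(t):
      r'\d+'
      t.value = int(t.value)   # docstring above is the token's regex
      return t

  def t_WORD(t):
      r'[a-zA-Z]+'
      return t

  t_ignore = ' '               # characters to skip between tokens

  def t_error(t):
      t.lexer.skip(1)

  lexer = lex.lex()
  lexer.input('abc 123')
  print([tok.value for tok in iter(lexer.token, None)])   # ['abc', 123]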
-- 
http://mail.python.org/mailman/listinfo/python-list


python3 raw strings and \u escapes

2012-05-30 Thread ru...@yahoo.com
In python2, \u escapes are processed in raw unicode
strings.  That is, ur'\u3000' is a string of length 1
consisting of the IDEOGRAPHIC SPACE unicode character.

In python3, \u escapes are not processed in raw strings.
r'\u3000' is a string of length 6 consisting of a backslash,
'u', '3' and three '0' characters.

This breaks a lot of my code because in python 2
  re.split (ur'[\u3000]', u'A\u3000A') == [u'A', u'A']
but in python 3 (the result of running 2to3),
  re.split (r'[\u3000]', 'A\u3000A' ) == ['A\u3000A']

I can remove the r prefix from the regex string but then
if I have other regex backslash symbols in it, I have to
double all the other backslashes -- the very thing that
the r-prefix was invented to avoid.

Or I can leave the r prefix and replace something like
r'[ \u3000]' with r'[  ]'.  But that is confusing because
one can't distinguish between the space character and
the ideographic space character.  It is also a problem if a
reader of the code doesn't have a font that can display
the character.

Was there a reason for dropping the lexical processing of
\u escapes in strings in python3 (other than to add another
annoyance in a long list of python3 annoyances?)

And is there no choice for me but to choose between the two
poor choices I mention above to deal with this problem?

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python3 raw strings and \u escapes

2012-05-30 Thread ru...@yahoo.com
On 05/30/2012 05:54 AM, Thomas Rachel wrote:
 Am 30.05.2012 08:52 schrieb ru...@yahoo.com:

 This breaks a lot of my code because in python 2
re.split (ur'[\u3000]', u'A\u3000A') ==  [u'A', u'A']
 but in python 3 (the result of running 2to3),
re.split (r'[\u3000]', 'A\u3000A' ) ==  ['A\u3000A']

 I can remove the r prefix from the regex string but then
 if I have other regex backslash symbols in it, I have to
 double all the other backslashes -- the very thing that
 the r-prefix was invented to avoid.

 Or I can leave the r prefix and replace something like
 r'[ \u3000]' with r'[  ]'.  But that is confusing because
 one can't distinguish between the space character and
 the ideographic space character.  It is also a problem if a
 reader of the code doesn't have a font that can display
 the character.

 Was there a reason for dropping the lexical processing of
 \u escapes in strings in python3 (other than to add another
 annoyance in a long list of python3 annoyances?)

 Probably it is more consistent. Alas, it makes the whole thing
 incompatible with Py2.

 But if you think about it: why allow \u if \r, \n etc. are
 disallowed?

Maybe the blame is elsewhere then...  If the re module
interprets (in a regex string) the 2-character string
consisting of r'\' followed by 'n' as a single newline
character, then why wasn't re changed for Python 3 to
interpret the 6-character string, r'\u3000' as a single
unicode character to correspond with Python's lexer no
longer doing that (as it did in Python 2)?

 And is there no choice for me but to choose between the two
 poor choices I mention above to deal with this problem?

 There is a 3rd one: use   r'[ ' + '\u3000' + ']'. Not very nice to read,
 but should do the trick...

I guess the +s could be left out allowing something
like,

  '[ \u3000]' r'\w+ \d{3}'

but I'll have to try it a little; maybe just doubling
backslashes won't be much worse.  I did that for years
in Perl and lived through it.
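
For anyone curious, here is a minimal sketch of that implicit
concatenation idea (Python 3; the sample pattern and data are
invented for illustration):

  import re

  # Adjacent literals are joined at compile time: the plain part gets
  # its \u escape processed, the raw part keeps regex backslashes as-is.
  pattern = '[ \u3000]' r'\w+'
  print (re.findall (pattern, 'A\u3000foo bar'))   # ['\u3000foo', ' bar']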

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python3 raw strings and \u escapes

2012-05-30 Thread ru...@yahoo.com

On 05/30/2012 10:46 AM, Terry Reedy wrote:
 On 5/30/2012 2:52 AM, ru...@yahoo.com wrote:
 In python2, \u escapes are processed in raw unicode
 strings.  That is, ur'\u3000' is a string of length 1
 consisting of the IDEOGRAPHIC SPACE unicode character.

 That surprised me until I rechecked the fine manual and found:

 When an 'r' or 'R' prefix is present, a character following a backslash
 is included in the string without change, and all backslashes are left
 in the string.

 When an 'r' or 'R' prefix is used in conjunction with a 'u' or 'U'
 prefix, then the \u and \U escape sequences are processed
 while all other backslashes are left in the string.

 When 'u' was removed in Python 3, a choice had to be made and the first
 must have seemed to be the obvious one, or perhaps the automatic one.

 In 3.3, 'u' is being restored. I have inquired on pydev list whether the
 difference above should also be restored, and mentioned this thread.

As mentioned in a different message, another option might
be to leave raw strings as is (more consistent since all
backslashes are treated the same) and have the re module
un-escape \u (and similar) literals in regex strings
(also more consistent since that's what it does with '\\n',
'\\t', etc.)

I do realize though that this may have back-compatibility
problems that make it impossible to do.



-- 
http://mail.python.org/mailman/listinfo/python-list


2to3 inscrutable output

2012-05-28 Thread ru...@yahoo.com
What is this output from 2to3 supposed to mean?
  $ cat mysub.py
  isinstance (3, (int,float))
  $ 2to3 -f isinstance mysub.py
  RefactoringTool: No changes to mysub.py
  RefactoringTool: Files that need to be modified:
  RefactoringTool: mysub.py
Why does mysub.py need to be modified, and how?

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: 2to3 for 2.7

2012-05-27 Thread ru...@yahoo.com
On 05/27/2012 07:53 AM, Steven D'Aprano wrote:
  On Sat, 26 May 2012 19:37:33 -0700, ru...@yahoo.com wrote:
  Is there a list of fixers I can tell 2to3 to use that will limit changes
  to things that will continue to run under python-2.7?
 
  So you want a 2to2?

Yes.  :-)

  I suggest you read the Fine Manual and choose the fixers you want to
  apply yourself:
 
  http://docs.python.org/library/2to3.html
 
  That, plus a bit of trial-and-error at the interactive prompt, will soon
  tell you what works and what doesn't. But read on for my suggestions.

That, and the 2.6 and 2.7 What's New's and the docs for
the 3.x backported features mentioned therein...  I've
started to do just that but if someone else has already
distilled all this information...

  I want to start the 2-3 trip by making my code as py3 compatible (under
  py2) as possible before going the rest of the way to py3, and having
  2to3 help with this seems like a good idea.
  Your project, your decision, but it doesn't sound like a good idea to me,
  unless your project is quite small or efficiency is not high on your list
  of priorities. You risk making your 2.7 version significantly slower and
  less efficient than your 2.6 version, but without actually gaining 3.x
  compatibility.

I can't really migrate my project until wxPython does.
But I've read a number of conversion experiences
ranging from ran 2to3 and everything was golden to
needing to make some serious design decisions (usually
in the bytes/str area) to months of effort to get all
the little glitches wrung out.  So I have no idea what
is in store for me.  By doing some of the conversion
now I can hopefully get a better sense of what is in
store and get some of the work done earlier rather
than later.  There is also the generally useful
heuristic of dividing a larger task into two smaller
independent tasks...

And finally, there is the question about maintaining
2/3 compatibility in a single codebase.  I don't have
a hard requirement for this but if it is doable without
too much effort, I would prefer to do so.  ISTM that
looking at what remains to be done after the 2.7 code
has been 3-ified as much as possible will allow me to
make a better judgment about that.

  (For what it's worth, I try to aim at 3.x compatibility as the priority,
  and if that means my code is a bit slower under 2.5-2.7, that's a price
  I'm willing to pay.)
 
  The problem is that many of the idioms that work well in Python 3 will be
  less efficient, and therefore slower, in Python 2.7. For example,
  consider this Python 2.x loop, iterating lazily over a dict efficiently:

I did not spend much time optimizing for performance
when writing the code, so it probably doesn't make
sense to worry about it now, unless a really large
performance difference is likely (which seems to me
unlikely given that I don't have any really large
in-memory data).  Thanks for the tip though; it is
something I'll remain alert for.

 [...]
  For what it's worth, I'd try these fixers:
 
  apply
  except
  exec
  execfile
  exitfunc
  has_key
  idioms
  ne
  next
  paren
  print
  raise
  repr
  tuple_params
  ws_comma
  xreadlines
 
  plus from __future__ import print_function, and see what breaks :)
 
  Also, don't forget future_builtins:
  http://docs.python.org/library/future_builtins.html
 
  Good luck, and if you do go ahead with this, please consider posting an
  update here, or writing a blog post with details of how successful it was.

Thanks for that list.  Sans anything more definitive
it is a good starting point.
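
In case it helps anyone else following along, individual fixers
can be selected and written back with something like (the package
path is made up):

  $ 2to3 -f print -f has_key -f ws_comma -w mypackage/

where -f selects one fixer and -w actually writes the changes
instead of just printing diffs.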

-- 
http://mail.python.org/mailman/listinfo/python-list


2to3 for 2.7

2012-05-26 Thread ru...@yahoo.com
Is there a list of fixers I can tell 2to3 to use that will
limit changes to things that will continue to run under
python-2.7?

I want to start the 2-3 trip by making my code
as py3 compatible (under py2) as possible before
going the rest of the way to py3, and having 2to3
help with this seems like a good idea.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Create directories and modify files with Python

2012-05-01 Thread ru...@yahoo.com
On 04/30/2012 05:24 PM, deltaquat...@gmail.com wrote:
 Hi,

 I would like to automate some simple tasks I'm doing by hand. Given a text
 file foobar.fo:

 073 1.819
 085 2.132
 100 2.456
 115 2.789

 I need to create the directories 073, 085, 100, 115, and copy in each 
 directory a modified version of the text file input.in:

 .
 .
 .
 foo = 1.5 ! edit this value
 .
 .
 .
 bar = 1.5 ! this one, too
 .
 .
 .

 The modification consists of substituting the number in the above lines with
 the value associated with the directory in the file foobar.fo. Thus, the
 input.in file in the directory 100 will be:

 .
 .
 .
 foo = 2.456 ! edit this value
 .
 .
 .
 bar = 2.456 ! this one, too
 .
 .
 .

 At first, I tried to write a bash script to do this. However, if and when the
 script works, I'll probably want to add more features to automate some
 other tasks. So I thought about using some other language, to have more
 flexible and maintainable code. I've been told that both Python and Perl are
 well suited for such tasks, but unfortunately I know neither of them. Can you
 show me how to write the script in Python? Thanks,

Perhaps something like this will get you started?  To
keep things simple (since this is illustrative code)
there is little parameterization and no error handling.
Apologies if Google screws up the formatting too badly.


from __future__ import print_function   #1
import os

def main():
    listf = open ('foobar.fo')
    for line in listf:
        dirname, param = line.strip().split()   #7
        make_directory (dirname, param)

def make_directory (dirname, param):
    os.mkdir (dirname)   #11
    tmplf = open ('input.in')
    newf = open (dirname + '/' + 'input.in', 'w')   #13
    for line in tmplf:
        if line.startswith ('foo = ') or line.startswith ('bar = '):   #15
            line = line.replace (' 1.5 ', ' '+param+' ')   #16
        print (line, file=newf, end='')   #17

if __name__ == '__main__': main()   #19


#1: Not sure whether you're using Python 2 or 3.  I ran
 this on Python 2.7 and think it will run on Python 3 if
 you remove this line.

#7: The strip() method removes the '\n' characters from
 the end of the lines as well as any other extraneous
 leading or trailing whitespace.  The split() method
 here breaks the line into two pieces on the whitespace
 in the middle.  See
  http://docs.python.org/library/stdtypes.html#string-methods

#11: This will create subdirectory 'dirname' relative
 to the current directory of course.  See
  http://docs.python.org/library/os.html#os.mkdir

#13: Usually, it is more portable to use os.path.join() to
 concatenate path components but since you stated you are
 on Linux (and / works on Windows too), creating the path
 with / is easier to follow in this example.  For open()
 see
  http://docs.python.org/library/functions.html#open

#15: Depending on your data, you might want to use the re
 (regular expression) module here if the simple string
 substitution is not sufficient.

#16: For simplicity I just blindly replaced the " 1.5 "
 text in the string.  Depending on your files, you might
 want to parameterize this or do something more robust or
 sophisticated.

#17: Since we did not strip the trailing '\n' of the lines
 we read from input.in, we use end='' to prevent
 print from adding an additional '\n'.  See
  http://docs.python.org/library/functions.html#print

#19: This line is required to actually get your python
 file to do anything. :-)

Hope this gets you started.  I think you will find doing
this kind of thing in Python is much easier in the long
run than with bash scripts.

A decent resource for learning the basics of Python is
the standard Python tutorial:
  http://docs.python.org/tutorial/index.html

-- 
http://mail.python.org/mailman/listinfo/python-list


argparse missing optparse capabilities?

2012-01-05 Thread ru...@yahoo.com
I have optparse code that parses a command line containing
intermixed positional and optional arguments, where the optional
arguments set the context for the following positional arguments.
For example,

  myprogram.py arg1 -c33 arg2 arg3 -c44 arg4

'arg1' is processed in a default context, 'args2' and 'arg3' in
context '33', and 'arg4' in context '44'.

I am trying to do the same using argparse but it appears to be
not doable in a documented way.

Here is the working optparse code (which took 30 minutes to write
using just the optparse docs):

  import optparse
  def append_with_pos (option, opt_str, value, parser):
      if getattr (parser.values, option.dest, None) is None:
          setattr (parser.values, option.dest, [])
      getattr (parser.values, option.dest).append ((value, len (parser.largs)))
  def opt_parse():
      p = optparse.OptionParser()
      p.add_option ('-c', type=int,
          action='callback', callback=append_with_pos)
      opts, args = p.parse_args()
      return args, opts
  if __name__ == '__main__':
      args, opts = opt_parse()
      print args, opts

Output from the command line above:
  ['arg1', 'arg2', 'arg3', 'arg4'] {'c': [(33, 1), (44, 3)]}
The -c values are stored as (value, arglist_position) tuples.

Here is an attempt to convert to argparse using the guidelines
in the argparse docs:

  import argparse
  class AppendWithPos (argparse.Action):
      def __call__ (self, parser, namespace, values,
              option_string=None):
          if getattr (namespace, self.dest, None) is None:
              setattr (namespace, self.dest, [])
          getattr (namespace, self.dest).extend ((values, len (parser.largs)))
  def arg_parse():
      p = argparse.ArgumentParser (description='description')
      p.add_argument ('src', nargs='*')
      p.add_argument ('-c', type=int, action=AppendWithPos)
      opts = p.parse_args()
      return opts
  if __name__ == '__main__':
      opts = arg_parse()
      print opts

This fails with,
  AttributeError: 'ArgumentParser' object has no attribute 'largs'
and of course, the argparse.parser is not documented beyond how
to instantiate it.  Even were that not a problem, argparse complains
about unrecognised arguments for any positional arguments that
occur after an optional one.  I've been farting with this code for
a day now.

Any suggestions on how I can convince argparse to do what optparse
does easily will be very welcome.  (I tried parse_known_args() but
that breaks help and requires me to detect truly unknown arguments.)

(Python 2.7.1 if it matters and apologies if Google mangles
the formatting of this post.)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: argparse missing optparse capabilities?

2012-01-05 Thread ru...@yahoo.com
On Jan 5, 1:05 am, ru...@yahoo.com ru...@yahoo.com wrote:
   class AppendWithPos (argparse.Action):
       def __call__ (self, parser, namespace, values,
               option_string=None):
           if getattr (namespace, self.dest, None) is None:
               setattr (namespace, self.dest, [])
           getattr (namespace, self.dest).extend ((values, len (parser.largs)))

I realized right after posting that the above line should
be, I think,

  getattr (namespace, self.dest).extend ((values, len (namespace.src)))

but that still doesn't help with the unrecognised arguments
problem.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: argparse missing optparse capabilities?

2012-01-05 Thread ru...@yahoo.com
On 01/05/2012 02:19 AM, Ulrich Eckhardt wrote:
 Am 05.01.2012 09:05, schrieb ru...@yahoo.com:
 I have optparse code that parses a command line containing
 intermixed positional and optional arguments, where the optional
 arguments set the context for the following positional arguments.
 For example,

myprogram.py arg1 -c33 arg2 arg3 -c44 arg4

 'arg1' is processed in a default context, 'args2' and 'arg3' in
 context '33', and 'arg4' in context '44'.

 Question: How would you e.g. pass the string "-c33" as first argument,
 i.e. to be parsed in the default context?

There will not be a need for that.

 The point is that you separate the parameters in a way that makes it
 possible to parse them in a way that works 100%, not just a way that
 works in 99% of all cases.

I agree that one should strive for a syntax that works
100% but in this case, the simplicity and intuitiveness
of the existing command syntax outweigh by far the need
for having it work in very improbable corner cases.
(And I'm sure I've seen this syntax used in other unix
command line tools in the past though I don't have time
to look for examples now.)

If argparse does not handle this syntax for some such
purity reason (as opposed to, for example, it being hard
to do in argparse's current design) then argparse is
mistakenly putting purity before practicality.

 For that reason, many commandline tools
 accept "--" as separator, so that "cp -- -r -x" will copy the file "-r"
 to the folder "-x". In that light, I would consider restructuring your
 commandline.

In my case that's not possible since I am replacing an
existing tool with a Python application and changing the
command line syntax is not an option.

 I am trying to do the same using argparse but it appears to be
 not doable in a documented way.

 As already hinted at, I don't think this is possible and that that is so
 by design.

Thanks for the confirmation.  I guess that shows that
optparse has a reason to exist beyond backwards compatibility.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: argparse missing optparse capabilities?

2012-01-05 Thread ru...@yahoo.com
On 01/05/2012 11:46 AM, Ian Kelly wrote:
 On Thu, Jan 5, 2012 at 11:14 AM, Ian Kelly ian.g.ke...@gmail.com wrote:
 On Thu, Jan 5, 2012 at 1:05 AM, ru...@yahoo.com ru...@yahoo.com wrote:
 I have optparse code that parses a command line containing
 intermixed positional and optional arguments, where the optional
 arguments set the context for the following positional arguments.
 For example,

  myprogram.py arg1 -c33 arg2 arg3 -c44 arg4

 'arg1' is processed in a default context, 'args2' and 'arg3' in
 context '33', and 'arg4' in context '44'.

 I am trying to do the same using argparse but it appears to be
 not doable in a documented way.
[...]

 Sorry, I missed the second part of that.  You seem to be right, as far
 as I can tell from tinkering with it, all the positional arguments
 have to be in a single group.  If you have some positional arguments
 followed by an option followed by more positional arguments, and any
 of the arguments have a loose nargs quantifier ('?' or '*' or '+'),
 then you get an error.

OK, thanks for the second confirmation.  I was hoping there
was something I missed or some undocumented option to allow
intermixed optional and positional arguments with Argparse
but it appears not.

I notice that Optparse seems to intentionally provide this
capability since it offers a disable_interspersed_args()
method.  It is unfortunate that Argparse chose not to
provide backward compatibility for this, thus forcing some
users to continue using a deprecated module.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Fixing the XML batteries

2011-12-13 Thread ru...@yahoo.com
On Dec 13, 5:32 am, Stefan Behnel stefan...@behnel.de wrote:
...
 In Python 2.7/3.2, ElementTree has support for C14N serialisation, just
 pass the option method="c14n".

Where in the Python docs can one find information about this?

[previous post disappeared, sorry if I double posted or replied to
author inadvertently.]
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Fixing the XML batteries

2011-12-13 Thread ru...@yahoo.com
On Dec 13, 5:32 am, Stefan Behnel stefan...@behnel.de wrote:
...
 In Python 2.7/3.2, ElementTree has support for C14N serialisation, just
 pass the option method="c14n".

Where does one find information in the Python documentation about
this?

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Fixing the XML batteries

2011-12-13 Thread ru...@yahoo.com
On Dec 13, 1:21 pm, Stefan Behnel stefan...@behnel.de wrote:
 ru...@yahoo.com, 13.12.2011 20:37:

  On Dec 13, 5:32 am, Stefan Behnel wrote:
  In Python 2.7/3.2, ElementTree has support for C14N serialisation, just
  pass the option method="c14n".

  Where does one find information in the Python documentation about
  this?

 Hmm, interesting. I thought it had, but now when I click on the stdlib doc
 link to read the module source (hint, hint),

I realize the source is available (having had to use it way too
many times in the past, not just with ET), but that does not justify
omission from the docs.  However the point is moot since (as you
say) it seems the Python-distributed ET doesn't contain the c14n
feature.

 I can see that it only has the
 hooks. The C14N support module of ET 1.3 was not integrated into the
 stdlib. Sorry for not verifying this earlier.

 So you actually need the external package for C14N support. See here:

 http://effbot.org/zone/elementtree-13-intro.htm

 http://hg.effbot.org/et-2009-provolone/src/tip/elementtree/elementtre...

 Just to emphasize this once again: it's not more than a single module that
 you can copy into your own code as a fallback import, or deploy in your
 local installations.

Right, but many times I try to avoid external dependencies when
feasible.
Thanks for the clarifications.
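
(For anyone doing the same, a fallback import along the lines Stefan
suggests might look like the sketch below; the ElementC14N module
name is from effbot's ET 1.3 package and keeping a local copy of
that module next to one's own code is assumed.)

  try:
      from elementtree import ElementC14N   # external ET 1.3 package
  except ImportError:
      import ElementC14N                    # local copy bundled with our code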
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to generate error when argument are not supplied and there is no explicit defults (in optparse)?

2011-10-15 Thread ru...@yahoo.com
On 10/14/2011 03:29 PM, Peng Yu wrote:
 Hi,

 The following code doesn't give me an error, even if I don't specify
 the value of filename from the command line arguments. filename gets
 'None'. I checked the manual, but I don't see a way to make
 OptionParser fail if an argument's value (which has no default
 explicitly specified) is not specified. I may have missed something in
 the manual. Could any expert let me know if there is a way to do so?
 Thanks!

 #!/usr/bin/env python

 from optparse import OptionParser

 usage = 'usage: %prog [options] arg1 arg2'
 parser = OptionParser(usage=usage)
 parser.set_defaults(verbose=True)
 parser.add_option('-f', '--filename')

 #(options, args) = parser.parse_args(['-f', 'file.txt'])
 (options, args) = parser.parse_args()

 print options.filename

You can check it yourself.
I find I use a pretty standard pattern with optparse:

 def main (args, opts):
     ...

 def parse_cmdline ():
     p = OptionParser()
     p.add_option ('-f', '--filename')
     options, args = p.parse_args()

     if not options.filename:
         p.error ("-f option required")
     if len (args) != 2:
         p.error ("Expected exactly 2 arguments")
     # Other checks can obviously be done here too.

     return args, options

 if __name__ == '__main__':
     args, opts = parse_cmdline()
     main (args, opts)

While one can probably subclass OptionParser or use callbacks
to achieve the same end, I find the above approach simple and
easy to follow.

I also presume you know that you can have optparse produce a
usage message by adding 'help' arguments to the add_option()
calls?
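
For example, a minimal sketch (the help text is invented):

  from optparse import OptionParser

  p = OptionParser (usage='usage: %prog [options] arg1 arg2')
  p.add_option ('-f', '--filename',
      help='name of the file to process (required)')
  options, args = p.parse_args()
  # Running the program with --help now prints the usage
  # line and the option help, then exits.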

And as was mentioned in another post, argparse in Python 2.7
(or in earlier Pythons by downloading/installing it yourself)
can do the checking you want.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Help with regular expression in python

2011-08-19 Thread ru...@yahoo.com
On 08/19/2011 11:33 AM, Matt Funk wrote:
 On Friday, August 19, 2011, Alain Ketterlin wrote:
 Matt Funk matze...@gmail.com writes:
  thanks for the suggestion. I guess i had found another way around the
  problem as well. But i really wanted to match the line exactly and i
  wanted to know why it doesn't work. That is less for the purpose of
   getting the thing to work but more because it greatly annoys me that
   i can't figure out why it doesn't work. I.e. why the expression is not
   matched {32} times. I just don't get it.

 Because a line is not 32 times a number, it is a number followed by 31
 times a space followed by a number. Using Jason's regexp, you can
 build the regexp step by step:

 number = r"\d\.\d+e\+\d+"
 numbersequence = r"%s( %s){31}" % (number, number)
 That didn't work either. Using the modified expression (where the (.+)
 matches the end of the line) as:

 number = r"\d\.\d+e\+\d+"
 numbersequence = r"%s( %s){31}(.+)" % (number, number)
 instance_linetype_pattern = re.compile(numbersequence)

 The results obtained are:
 results:
 [(' 2.199000e+01', ' : (instance: 0)\t:\tsome description')]
 so this matches the last number plus the string at the end of the line, but
 does not retain the previous numbers.

The secret is buried very unobtrusively in the re docs,
where it has caught me out in the past.  Specifically
in the docs for match.group():

  If a group is contained in a part of the pattern that
  matched multiple times, the last match is returned.

In addition to the findall solution someone else
posted, another thing you could do is to explicitly
express the groups in your re:

  number = r"\d\.\d+e\+\d+"
  groups = (r"( %s)" % number) * 31
  numbersequence = r"%s%s(.+)" % (number, groups)
  ...
  results = match_object.group(*range(1, 33))

Or (what I would probably do), simply match the
whole string of numbers and pull it apart later:

  number = r"\d\.\d+e\+\d+"
  numbersequence = r"(%s(?: %s){31})(.+)" % (number, number)
  results = (match_object.group(1)).split()

[none of this code is tested but should be close
enough to convey the general idea.]
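
Here, though, is a runnable cut-down version of the second approach
(three numbers instead of 32, with a made-up input line):

  import re

  number = r"\d\.\d+e\+\d+"
  # capture the whole run of numbers as one group, split it afterwards
  numbersequence = r"(%s(?: %s){2})(.+)" % (number, number)
  line = "1.100000e+00 2.200000e+00 2.199000e+01 : some description"
  m = re.match (numbersequence, line)
  if m:
      print (m.group(1).split())
      # ['1.100000e+00', '2.200000e+00', '2.199000e+01']
      print (m.group(2))   # ' : some description'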
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to avoid leading white spaces

2011-06-08 Thread ru...@yahoo.com
On 06/08/2011 03:01 AM, Duncan Booth wrote:
 ru...@yahoo.com ru...@yahoo.com wrote:
 On 06/06/2011 09:29 AM, Steven D'Aprano wrote:
 Yes, but you have to pay the cost of loading the re engine, even if
 it is a one off cost, it's still a cost,
[...]
 At least part of the reason that there's no difference there is that the
 're' module was imported in both cases:

Quite right.  I should have thought of that.

[...]
 Steven is right to assert that there's a cost to loading it, but unless you
 jump through hoops it's not a cost you can avoid paying and still use
 Python.

I would say that it is effectively zero cost then.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to avoid leading white spaces

2011-06-08 Thread ru...@yahoo.com
On 06/07/2011 06:30 PM, Roy Smith wrote:
 On 06/06/2011 08:33 AM, rusi wrote:
 Evidently for syntactic, implementation and cultural reasons, Perl
 programmers are likely to get (and then overuse) regexes faster than
 python programmers.

 ru...@yahoo.com ru...@yahoo.com wrote:
 I don't see how the different Perl and Python cultures themselves
 would make learning regexes harder for Python programmers.

 Oh, that part's obvious.  People don't learn things in a vacuum.  They
 read about something, try it, fail, and ask for help.  If, in one
 community, the response they get is, "I see what's wrong with your
 regex, you need to ...", and in another they get, "You shouldn't be
 using a regex there, you should use this string method instead...", it
 should not be a surprise that it's easier to learn about regexes in the
 first community.

I think we are just using different definitions of harder.

I said, immediately after the sentence you quoted,

 At
 most I can see the Perl culture encouraging their use and
 the Python culture discouraging it, but that doesn't change
 the ease or difficulty of learning.

Constantly being told not to use regexes certainly discourages
one from learning them, but I don't think that's the same as
being *harder* to learn in Python.  The syntax of regexes is,
at least at the basic level, pretty universal, and it is in
learning to understand that syntax that most of any difficulty
lies.  Whether to express a regex as /code (blue)|(red)/i in
Perl or (r'code (blue)|(red)', re.I) in Python is a superficial
difference, as is, say, using match results: $alert = $1 vs
alert = m.group(1).

A Google search for "python regular expression tutorial" produces
lots of results including the Python docs HOWTO.  And because
the syntax is pretty universal, leaving the "python" off that
search string will yield many, many more that are applicable.
Although one does get some "don't do that" responses to regex
questions on this list (and some are good advice), there are
also usually answers too.

So I think of it as more of a Python culture thing, rather
than being actually harder to learn to use regexes in Python,
although I see how one can view it your way too.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to avoid leading white spaces

2011-06-07 Thread ru...@yahoo.com
On 06/06/2011 09:29 AM, Steven D'Aprano wrote:
 On Sun, 05 Jun 2011 23:03:39 -0700, ru...@yahoo.com wrote:
[...]
 I would argue that the first, non-regex solution is superior, as it
 clearly distinguishes the multiple steps of the solution:

 * filter lines that start with CUSTOMER
 * extract fields in that line
 * validate fields (not shown in your code snippet)

 while the regex tries to do all of these in a single command. This makes
 the regex an all or nothing solution: it matches *everything* or
 *nothing*. This means that your opportunity for giving meaningful error
 messages is much reduced. E.g. I'd like to give an error message like:

 "found digit in customer name (field 2)"

 but with your regex, if it fails to match, I have no idea why it failed,
 so can't give any more meaningful error than:

 "invalid customer line"

 and leave it to the caller to determine what makes it invalid. (Did I
 misspell CUSTOMER? Put a dot after the initial? Forget the code? Use
 two spaces between fields instead of one?)

I agree that is a legitimate criticism.  Its importance depends
greatly on the purpose and consumers of the code.  While such
detailed error messages might be appropriate in a fully polished
product, in my case, I often have to process files personally
to extract information, or to provide code to others (who typically
have at least some degree of technical sophistication) to do the
same.

In this case, being able to code something quickly, and adapt it
quickly to changes is more important than providing highly detailed
error messages.  The format is simple enough that "invalid customer
line" and the line number is perfectly adequate.  YMMV.

As I said, regexes are a tool, like any tool, to be used
appropriately.

[...]
 In addition to being wrong (loading is done once, compilation is
 typically done once or a few times, while the regex is used many times
 inside a loop so the overhead cost is usually trivial compared with the
 cost of starting Python or reading a file), this is another
 micro-optimization argument.

 Yes, but you have to pay the cost of loading the re engine, even if it is
 a one off cost, it's still a cost,

~$ time python -c 'pass'
real    0m0.015s
user    0m0.011s
sys     0m0.003s

~$ time python -c 'import re'
real    0m0.015s
user    0m0.011s
sys     0m0.003s

Or do you mean something else by "loading the re engine"?

 and sometimes (not always!) it can be
 significant. It's quite hard to write fast, tiny Python scripts, because
 the initialization costs of the Python environment are so high. (Not as
 high as for, say, VB or Java, but much higher than, say, shell scripts.)
 In a tiny script, you may be better off avoiding regexes because it takes
 longer to load the engine than to run the rest of your script!

Do you have an example?  I am having a hard time imagining that.
Perhaps you are thinking of the time required to compile a RE?

~$ time python -c 'import re; re.compile(r"^[^()]*(\([^()]*\)[^()]*)*$")'
real    0m0.017s
user    0m0.014s
sys     0m0.003s

Hard to imagine a case where 15 ms is fast enough but
17 ms is too slow.  And that's without the diluting effect
of actually doing some real work in the script.  Of course
a more complex regex would likely take longer.

(The times vary greatly on my machine, I am quoting the most
common lowest but not absolutely lowest results.)

 (Note that "Apocalypse" is referring to a series of Perl design
 documents and has nothing to do with regexes in particular.)

 But Apocalypse 5 specifically has everything to do with regexes. That's
 why I linked to that, and not (say) Apocalypse 2.

 Where did I suggest that you should have linked to Apocalypse 2? I wrote
 what I wrote to point out that the Apocalypse title was not a
 pejorative comment on regexes.  I don't see how I could have been
 clearer.

 Possibly by saying what you just said here?

 I never suggested, or implied, or thought, that Apocalypse was a
 pejorative comment on *regexes*. The fact that I referenced Apocalypse
 FIVE suggests strongly that there are at least four others, presumably
 not about regexes.

Nor did I ever suggest you did.  Don't forget that you are
not the only person reading this list.  The comment was for
the benefit of others.  Perhaps you are being overly sensitive?

 [...]
 If regexes were more readable, as proposed by Wall, that would go a
 long way to reducing my suspicion of them.

 I am delighted to read that you find the new syntax more acceptable.

 Perhaps I wasn't as clear as I could have been. I don't know what the new
 syntax is. I was referring to the design principle of improving the
 readability of regexes. Whether Wall's new syntax actually does improve
 readability and ease of maintenance is a separate issue, one on which I
 don't have an opinion on. I applaud his *intention* to reform regex
 syntax, without necessarily agreeing that he has done so.

Thanks for clarifying.  But since you earlier wrote in response
to MRAB,
http

Re: how to avoid leading white spaces

2011-06-07 Thread ru...@yahoo.com
On 06/06/2011 08:33 AM, rusi wrote:
 For any significant language feature (take recursion for example)
 there are these issues:

 1. Ease of reading/skimming (other's) code
 2. Ease of writing/designing one's own
 3. Learning curve
 4. Costs/payoffs (eg efficiency, succinctness) of use
 5. Debug-ability

 I'll start with 3.
 When someone of Kernighan's calibre (thanks for the link BTW) says
 that he found recursion difficult it could mean either that Kernighan
 is a stupid guy -- unlikely considering his other achievements. Or
 that C is not optimal (as compared to lisp say) for learning
 recursion.

Just as a side comment, I didn't see anything in the link
Chris Torek posted (repeated here since it got snipped:
http://www.princeton.edu/~hos/frs122/precis/kernighan.htm)
that said Kernighan found recursion difficult, just that it
was perceived as expensive.  Nor that the expense had anything
to do with programming language but rather was due to hardware
constraints of the time.
But maybe you are referring to some other source?

 Evidently for syntactic, implementation and cultural reasons, Perl
 programmers are likely to get (and then overuse) regexes faster than
 python programmers.

If by "get", you mean "understand", then I'm not sure why
the reasons you give should make a big difference.  Regex
syntax is pretty similar in both Python and Perl, and
virtually identical in terms of learning their basics.
There are some differences in the how regexes are used
between Perl and Python that I mentioned in
http://groups.google.com/group/comp.lang.python/msg/39fca0d4589f4720?,
but as I said there, that wouldn't, particularly in light
of Python culture where one-liners and terseness are not
highly valued, seem very important.  And I don't see how
the different Perl and Python cultures themselves would
make learning regexes harder for Python programmers.  At
most I can see the Perl culture encouraging their use and
the Python culture discouraging it, but that doesn't change
the ease or difficulty of learning.

And why do you say "overuse regexes"?  Why isn't it the case
that Perl programmers use regexes appropriately in Perl?  Are
you not arbitrarily applying a Python-centric standard to a
different culture?  What if a Perl programmer says that Python
programmers under-use regexes?

 1 is related but not the same as 3.  Someone with courses in automata,
 compilers etc -- standard CS stuff -- is unlikely to find regexes a
 problem.  Conversely an intelligent programmer without a CS background
 may find them more forbidding.

I'm not sure of that.  (Not sure it should be that way,
perhaps it may be that way in practice.)  I suspect that
a good theoretical understanding of automata theory would
be essential in writing a regex compiler but I'm not sure
it is necessary to use regexes.

It does I'm sure give one a solid understanding of the
limitations of regexes but a practical understanding of
those can be achieved without the full course I think.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to avoid leading white spaces

2011-06-06 Thread ru...@yahoo.com
On 06/03/2011 08:05 PM, Steven D'Aprano wrote:
 On Fri, 03 Jun 2011 12:29:52 -0700, ru...@yahoo.com wrote:

 I often find myself changing, for example, a startswith() to a RE when
 I realize that the input can contain mixed case

 Why wouldn't you just normalise the case?

 Because some of the text may be case-sensitive.

 Perhaps you misunderstood me. You don't have to throw away the
 unnormalised text, merely use the normalized text in the expression you
 need.

 Of course, if you include both case-sensitive and insensitive tests in
 the same calculation, that's a good candidate for a regex... or at least
 it would be if regexes supported that :)

I did not choose a good example to illustrate what I find often
motivates my use of regexes.

You are right that for a simple .startwith() using a regex just
in case is not a good choice, and in fact I would not do that.

The process that I find often occurs is that I write (or am about
to write) a string method solution and when I think more about the
input data (which is seldom well-specified), I realize that using
a regex I can get better error checking, do more of the parsing
in one place, and adapt to changes in input format better than I
could with a .startswith and a couple other such methods.

Thus what starts as

  if line.startswith ('CUSTOMER '):
      try: kw, first_initial, last_name, code, rest = line.split(None, 4)
      ...

often turns into (sometimes before it is written) something like

  m = re.match (r'CUSTOMER (\w+) (\w+) ([A-Z]\d{3})', line)
  if m: first_initial, last_name, code = m.group(...)
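
(As a runnable sketch, with an invented sample line:)

  import re

  line = 'CUSTOMER J Smith A123 some more data'
  m = re.match (r'CUSTOMER (\w+) (\w+) ([A-Z]\d{3})', line)
  if m:
      first_initial, last_name, code = m.group(1, 2, 3)
      print (first_initial, last_name, code)   # J Smith A123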

[...]
 or that I have
 to treat commas as well as spaces as delimiters.

  source.replace(",", " ").split(" ")

 Uhgg. create a whole new string just so you can split it on one rather
 than two characters?

 You say that like it's expensive.

No, I said it like it was ugly.  Doing things unrelated to the
task at hand is ugly.  And not very adaptable -- see my reply
to Chris Torek's post.  I understand it is a common idiom and
I use it myself, but in this case there is a cleaner alternative
with re.split that expresses exactly what one is doing.

 And how do you know what the regex engine is doing under the hood? For all you
 know, it could be making hundreds of temporary copies and throwing them
 away. Or something. It's a black box.

That's a silly argument.
And how do you know what replace is doing under the hood?
I would expect any regex processor to compile the regex into
an FSM.  As usual, I would expect to pay a small performance
price for the generality, but that is reasonable tradeoff in
many cases.  If it were a potential problem, I would test it.
What I wouldn't do is throw away a useful tool because, golly,
I don't know, maybe it'll be slow -- that's just a form of
cargo cult programming.

 The fact that creating a whole new string to split on is faster than
 *running* the regex (never mind compiling it, loading the regex engine,
 and anything else that needs to be done) should tell you which does more
 work. Copying is cheap. Parsing is expensive.

In addition to being wrong (loading is done once, compilation is
typically done once or a few times, while the regex is used many
times inside a loop so the overhead cost is usually trivial compared
with the cost of starting Python or reading a file), this is another
micro-optimization argument.

I'm not sure why you've suddenly developed this obsession with
wringing every last nanosecond out of your code.  Usually it
is not necessary.  Have you thought of buying a faster computer?
Or using C?  *wink*

 Sorry, but I find

 re.split ('[ ,]', source)

 states much more clearly exactly what is being done with no obfuscation.

 That's because you know regex syntax. And I'd hardly call the version
 with replace obfuscated.

 Certainly the regex is shorter, and I suppose it's reasonable to expect
 any reader to know at least enough regex to read that, so I'll grant you
 that this is a small win for clarity. A micro-optimization for
 readability, at the expense of performance.


 Obviously this is a simple enough case that the difference is minor but
 when the pattern gets only a little more complex, the clarity difference
 becomes greater.

 Perhaps. But complicated tasks require complicated regexes, which are
 anything but clear.

Complicated tasks require complicated code as well.

As another post pointed out, there are ways to improve the
clarity of a regex such as the re.VERBOSE flag.
There is no doubt that a regex encapsulates information much more
densely than python string manipulation code.  One should not
be surprised that it might take as much time and effort to understand
a one-line regex as a dozen (or whatever) lines Python code that
do the same thing.  In most cases I'll bet, given equal fluency
in regexes and Python, the regex will take less.

 [...]
 After doing this a
 number of times, one starts to use an RE right from the get go unless
 one is VERY sure that there will be no requirements creep.

 YAGNI

Re: how to avoid leading white spaces

2011-06-05 Thread ru...@yahoo.com
On 06/03/2011 02:49 PM, Neil Cerutti wrote:
  On 2011-06-03, ru...@yahoo.com ru...@yahoo.com wrote:
  or that I have to treat commas as well as spaces as
  delimiters.
 
   source.replace(",", " ").split(" ")
 
  Uhgg. create a whole new string just so you can split it on one
  rather than two characters?  Sorry, but I find
 
  re.split ('[ ,]', source)
 
  It's quibbling to complain about creating one more string in an
  operation that already creates N strings.

It's not the time it takes to create the string, it's the doing
of things that aren't really needed to accomplish the task:
The re.split says directly and with no extraneous actions,
split 'source' on either spaces or commas.  This of course
is a trivial example but used thoughtfully, REs allow you to
be very precise about what you are doing, versus using tricks
like substituting individual characters first so you can split
on a single character afterwards.

  Here's another alternative:
 
   list(itertools.chain.from_iterable(elem.split(" ")
     for elem in source.split(",")))

You seriously find that clearer than re.split('[ ,]') above?
I have no further comment. :-)

  It's weird looking, but delimiting text with two different
  delimiters is weird.

Perhaps, but real-world input data is often very weird.
Try parsing a text database of a circa 1980 telephone
company phone directory sometime. :-)

 [...]
   - they are another language to learn, a very cryptic and terse
   language;
 
  Chinese is cryptic too but there are a few billion people who
  don't seem to be bothered by that.
 
  Chinese *would* be a problem if you proposed it as the solution
  to a problem that could be solved by using a persons native
  tongue instead.

My point was that cryptic is in large part an inverse function
of knowledge.  If I always go out of my way to avoid regexes, then
likely I will never become comfortable with them and they will
always seem cryptic.  To someone who uses them more often, they
will seem less cryptic.  They may never have the clarity of Python
but neither is Python code a very clear way to describe text
patterns.

As for needing to learn them (S D'A comment), shrug.  Programmers
are expected to learn new things all the time, many even do so
for fun.  REs (practical use that is) in the grand scheme of things
are not that hard.

They are I think a lot easier to learn than SQL, yet it is common
here to see recommendations to use sqlite rather than an ad-hoc
concoction of Python dicts.

 [...]
  - and thanks in part to Perl's over-reliance on them, there's
  a tendency among many coders (especially those coming from
  Perl) to abuse and/or misuse regexes; people react to that
  misuse by treating any use of regexes with suspicion.
 
  So you claim.  I have seen more postings in here where
  REs were not used when they would have simplified the code,
  then I have seen regexes used when a string method or two
  would have done the same thing.
 
  Can you find an example or invent one? I simply don't remember
  such problems coming up, but I admit it's possible.

Sure, the response to the OP of this thread.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to avoid leading white spaces

2011-06-05 Thread ru...@yahoo.com
On 06/03/2011 03:45 PM, Chris Torek wrote:
On 2011-06-03, ru...@yahoo.com ru...@yahoo.com wrote:
 [prefers]
 re.split ('[ ,]', source)

 This is probably not what you want in dealing with
 human-created text:

  re.split('[ ,]', 'foo bar, spam,maps')
 ['foo', '', 'bar', '', 'spam', 'maps']

 Instead, you probably want a comma followed by zero or
 more spaces; or, one or more spaces:

  re.split(r',\s*|\s+', 'foo bar, spam,maps')
 ['foo', 'bar', 'spam', 'maps']

 or perhaps (depending on how you want to treat multiple
 adjacent commas) even this:

  re.split(r',+\s*|\s+', 'foo bar, spam,maps,, eggs')
 ['foo', 'bar', 'spam', 'maps', 'eggs']

Which to me, illustrates nicely the power of a regex to concisely
localize the specification of an input format and adapt easily
to changes in that specification.

 although eventually you might want to just give in and use the
 csv module. :-)  (Especially if you want to be able to quote
 commas, for instance.)

Which internally uses regexes, at least for the sniffer function.
(The main parser is in C, presumably for speed, this being a
library module and all.)

 ...  With regexes the code is likely to be less brittle than a
 dozen or more lines of mixed string functions, indexes, and
 conditionals.

 In article 94svm4fe7...@mid.individual.net
 Neil Cerutti  ne...@norwich.edu wrote:
 [lots of snippage]
That is the opposite of my experience, but YMMV.

 I suspect it depends on how familiar the user is with regular
 expressions, their abilities, and their limitations.

I suspect so too at least in part.

 People relatively new to REs always seem to want to use them
 to count (to balance parentheses, for instance).  People who
 have gone through the compiler course know better. :-)

But also, a thing I think sometimes gets forgotten is that if the
max nesting depth is finite, parens can be balanced with a
regex.  This is nice for the particularly common case of a
nest depth of 1 (balanced but non-nested parens.)
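
(A sketch of the depth-1 case, using the same parenthesis pattern
that appears in the timing example elsewhere in this thread:)

  import re

  balanced = re.compile (r'^[^()]*(\([^()]*\)[^()]*)*$')
  print (bool (balanced.match ('a (b) c (d) e')))   # True
  print (bool (balanced.match ('a (b c')))          # False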
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to avoid leading white spaces

2011-06-03 Thread ru...@yahoo.com
On 06/02/2011 07:21 AM, Neil Cerutti wrote:
  On 2011-06-01, ru...@yahoo.com ru...@yahoo.com wrote:
  For some odd reason (perhaps because they are used a lot in
  Perl), this groups seems to have a great aversion to regular
  expressions. Too bad because this is a typical problem where
  their use is the best solution.
 
  Python's str methods, when they're sufficent, are usually more
  efficient.

Unfortunately, except for the very simplest cases, they are often
not sufficient.  I often find myself changing, for example, a
startswith() to a RE when I realize that the input can contain mixed
case or that I have to treat commas as well as spaces as delimiters.
After doing this a number of times, one starts to use an RE right
from the get go unless one is VERY sure that there will be no
requirements creep.

And to regurgitate the mantra frequently used to defend Python when
it is criticized for being slow, the real question should be, "are
REs fast enough?"  The answer almost always is yes.

  Perl integrated regular expressions, while Python relegated them
  to a library.

Which means that one needs one extra "import re" line that is
not required in Perl.

Since RE strings are compiled and cached, one often need not compile
them explicitly.  Using match results often requires more lines
than in Perl:
   m = re.match (...)
   if m: do something with m
rather than Perl's
   if m/.../ {do something with capture group globals}
Any true Python fan should not find this a problem, the stock
response being, "what's the matter, your Enter key broken?"
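
(A small sketch of the caching point: the module-level re functions
compile a given pattern string once and reuse it thereafter, so an
explicit re.compile() is often unnecessary.  Sample data invented.)

  import re

  for line in ('CUSTOMER one', 'OTHER two'):
      m = re.match (r'CUSTOMER (\w+)', line)   # compiled on first call, cached after
      if m:
          print (m.group(1))                   # prints: one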

  There are thus a large class of problems that are best solve with
  regular expressions in Perl, but str methods in Python.

Guess that depends on what one's definition of "large" is.

There are a few simple things, admittedly common, that Python
provides functions for that Perl uses REs for: replace(), for
example.  But so what?  I don't know if Perl does it or not but
there is no reason why functions called with string arguments or
REs with no magic characters can't be optimized to something
about as efficient as a corresponding Python function.  Such uses
are likely to be naively counted as using an RE in Perl.

I would agree though that the selection of string manipulation
functions in Perl are not as nice or orthogonal as in Python, and
that this contributes to a tendency to use REs in Perl when one
doesn't need to.  But that is a programmer tradeoff (as in Python)
between fast-coding/slow-execution and slow-coding/fast-execution.
I for one would use Perl's index() and substr() to identify and
manipulate fixed patterns when performance was an issue.
One runs into the same tradeoff in Python pretty quickly too
so I'm not sure I'd call that space between the two languages
large.

The other tradeoff, applying both to Perl and Python is with
maintenance.  As mentioned above, even when today's requirements
can be solved with some code involving several string functions,
indexes, and conditionals, when those requirements change, it is
usually a lot harder to modify that code than a RE.

In short, although your observations are true to some extent, they
are not sufficient to justify the anti-RE attitude often seen here.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to avoid leading white spaces

2011-06-03 Thread ru...@yahoo.com
On 06/03/2011 07:17 AM, Neil Cerutti wrote:
 On 2011-06-03, ru...@yahoo.com ru...@yahoo.com wrote:
 The other tradeoff, applying both to Perl and Python is with
 maintenance.  As mentioned above, even when today's
 requirements can be solved with some code involving several
 string functions, indexes, and conditionals, when those
 requirements change, it is usually a lot harder to modify that
 code than a RE.

 In short, although your observations are true to some extent,
 they are not sufficient to justify the anti-RE attitude often
 seen here.

 Very good article. Thanks. I mostly wanted to combat the notion
 that that the alleged anti-RE attitude here might be caused by an
 opposition to Perl culture.

 I contend that the anti-RE attitude sometimes seen here is caused
 by dissatisfaction with regexes in general combined with an
 aversion to the re module. I agree that it's not that bad, but
 it's clunky enough that it does contribute to making it my last
 resort.

But I questioned the reasons given (not as efficient, not built
in, not often needed) for dissatisfaction with REs.[*]  If those
reasons are not strong, then is not their Perl-smell still a leading
candidate for explaining the anti-RE attitude here?

Of course the whole question, lacking some serious group-psychological
investigation, is pure speculation anyway.


[*] A reason for not using REs not mentioned yet is that REs take
some time to learn.  Thus, although most people will know how to use
Python string methods, only a subset of those will be familiar with
REs.  But that doesn't seem like a reason for RE bashing either
since REs are easier to learn than SQL and one frequently sees
recommendations here to use sqlite.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to avoid leading white spaces

2011-06-03 Thread ru...@yahoo.com
On 06/03/2011 08:25 AM, Steven D'Aprano wrote:
 On Fri, 03 Jun 2011 05:51:18 -0700, ru...@yahoo.com wrote:

 On 06/02/2011 07:21 AM, Neil Cerutti wrote:

  Python's str methods, when they're sufficient, are usually more
  efficient.

 Unfortunately, except for the very simplest cases, they are often not
 sufficient.

 Maybe so, but the very simplest cases occur very frequently.

Right, and I stated that.

 I often find myself changing, for example, a startwith() to
 a RE when I realize that the input can contain mixed case

 Why wouldn't you just normalise the case?

Because some of the text may be case-sensitive.
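
(An invented illustration of the kind of change I mean; only the
keyword may vary in case, so normalizing the whole line is not an
option:)

    import re

    line = "Select name FROM Users"

    # startswith() misses "Select" and "SELECT"; lowercasing the
    # whole line would also mangle the case-sensitive identifiers.
    is_select = line.startswith("select")

    # An anchored, case-insensitive match touches only the keyword:
    is_select = re.match(r"select\b", line, re.IGNORECASE) is not None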

[...]
 or that I have
 to treat commas as well as spaces as delimiters.

 source.replace(",", " ").split()

Ugh.  Create a whole new string just so you can split it on one
character rather than two?  Sorry, but I find

re.split ('[ ,]', source)

states much more clearly exactly what is being done with no
obfuscation.  Obviously this is a simple enough case that the
difference is minor but when the pattern gets only a little
more complex, the clarity difference becomes greater.
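
For example (my sketch, with invented data): once runs of mixed
delimiters should collapse into one separator, the regex grows by a
single quantifier while the replace/split version needs another pass
and a filter.

    import re

    source = "a, b;c  d,, e"

    # Chained replacements: one pass per delimiter, plus a filter
    # to drop the empty strings produced by adjacent delimiters.
    parts = [p for p in
             source.replace(',', ' ').replace(';', ' ').split(' ') if p]

    # The regex states the rule directly: split on any run of
    # commas, semicolons, or whitespace.
    parts = re.split(r'[,;\s]+', source)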

[...]
 re.split is about four times slower than the simple solution.

If this processing is a bottleneck, by all means use a more
complex hard-coded replacement for a regex.  In most cases
that won't be necessary.

 After doing this a
 number of times, one starts to use an RE right from the get go unless
 one is VERY sure that there will be no requirements creep.

 YAGNI.

IAHNI. (I actually have needed it.)

 There's no need to use a regex just because you think that you *might*,
 someday, possibly need a regex. That's just silly. If and when
 requirements change, then use a regex. Until then, write the simplest
 code that will solve the problem you have to solve now, not the problem
 you think you might have to solve later.

I would not recommend you use a regex instead of a string method
solely because you might need a regex later.  But when you have
to spend 10 minutes writing a half-dozen lines of python versus
1 minute writing a regex, your evaluation of the possibility of
requirements changing should factor into your decision.

 [...]
 In short, although your observations are true to some extent, they
 are not sufficient to justify the anti-RE attitude often seen here.

 I don't think that there's really an *anti* RE attitude here. It's
 more a skeptical, cautious attitude to them, as a reaction to the
 Perl "when all you have is a hammer, everything looks like a nail"
 love affair with regexes.

Yes, as I said, the regex attitude here seems in large part to
be a reaction to their frequent use in Perl.  It seems anti- to
me in that I often see cautions about their use but seldom see
anyone pointing out that they are often a better solution than
a mass of twisty little string methods and associated plumbing.

 There are a few problems with regexes:

 - they are another language to learn, a very cryptic and terse language;

Chinese is cryptic too but there are a few billion people who
don't seem to be bothered by that.

 - hence code using many regexes tends to be obfuscated and brittle;

No.  With regexes the code is likely to be less brittle than
a dozen or more lines of mixed string functions, indexes, and
conditionals.

 - they're over-kill for many simple tasks;
 - and underpowered for complex jobs, and even some simple ones;

Right, like all tools (including Python itself) they are suited
best for a specific range of problems.  That range is quite wide.

 - debugging regexes is a nightmare;

Very complex ones, perhaps.  Nightmare seems an overstatement.

 - they're relatively slow;

So is Python.  In both cases, if it is a bottleneck then
choosing another tool is appropriate.

 - and thanks in part to Perl's over-reliance on them, there's a tendency
 among many coders (especially those coming from Perl) to abuse and/or
 misuse regexes; people react to that misuse by treating any use of
 regexes with suspicion.

So you claim.  I have seen more postings in here where
REs were not used when they would have simplified the code,
than I have seen regexes used when a string method or two
would have done the same thing.

 But they have their role to play as a tool in the programmers toolbox.

We agree.

 Regarding their syntax, I'd like to point out that even Larry Wall is
 dissatisfied with regex culture in the Perl community:

 http://www.perl.com/pub/2002/06/04/apo5.html

You did see the very first sentence in this, right?

  Editor's Note: this Apocalypse is out of date and remains here
  for historic reasons. See Synopsis 05 for the latest information.

(Note that Apocalypse is referring to a series of Perl design
documents and has nothing to do with regexes in particular.)

Synopsis 05 is (AFAICT with a quick scan) a proposal for revising
regex syntax.  I didn't see anything about de-emphasizing them in
Perl.  (But I have no idea what is going on for Perl 6 so I could
be wrong about that.)

As for the original

Re: how to avoid leading white spaces

2011-06-01 Thread ru...@yahoo.com
On Jun 1, 11:11 am, Chris Rebert c...@rebertia.com wrote:
 On Wed, Jun 1, 2011 at 12:31 AM, rakesh kumar
  Hi
 
  i have a file which contains data
 
  //ACCDJ EXEC DB2UNLDC,DFLID=DFLID,PARMLIB=PARMLIB,
  // UNLDSYST=UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCDJ   '
  //ACCT  EXEC DB2UNLDC,DFLID=DFLID,PARMLIB=PARMLIB,
  // UNLDSYST=UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCT    '
  //ACCUM EXEC DB2UNLDC,DFLID=DFLID,PARMLIB=PARMLIB,
  // UNLDSYST=UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCUM   '
  //ACCUM1    EXEC DB2UNLDC,DFLID=DFLID,PARMLIB=PARMLIB,
  // UNLDSYST=UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCUM1  '
 
  i want to cut the white spaces which are in between single quotes after 
  TABLE=.
 
  for example :
     'ACCT[spaces] '
     'ACCUM   '
     'ACCUM1 '
  the above is the output of another python script but its having a leading 
  spaces.

 Er, you mean trailing spaces. Since this is easy enough to be
 homework, I will only give an outline:

 1. Use str.index() and str.rindex() to find the positions of the
 starting and ending single-quotes in the line.
 2. Use slicing to extract the inside of the quoted string.
 3. Use str.rstrip() to remove the trailing spaces from the extracted string.
 4. Use slicing and concatenation to join together the rest of the line
 with the now-stripped inner string.

 Relevant docs:http://docs.python.org/library/stdtypes.html#string-methods

For some odd reason (perhaps because they are used a lot in Perl),
this group seems to have a great aversion to regular expressions.
Too bad, because this is a typical problem where their use is the
best solution.

import re
f = open ("your file")
for line in f:
    fixed = re.sub (r"(TABLE='\S+)\s+'$", r"\1'", line)
    print fixed,

(The above is for Python-2, adjust as needed for Python-3)
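
(And for completeness, a sketch of the Python-3 spelling -- the
file name is invented:)

    import re

    with open("your_file") as f:
        for line in f:
            # Same substitution; print() with end='' replaces the
            # Python-2 trailing comma.
            fixed = re.sub(r"(TABLE='\S+)\s+'$", r"\1'", line)
            print(fixed, end='')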
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: checking if a list is empty

2011-05-12 Thread ru...@yahoo.com
On 05/12/2011 12:13 AM, Steven D'Aprano wrote:
[snip]
 http://www.codinghorror.com/blog/2006/07/separating-programming-sheep-from-non-programming-goats.html

 Shorter version: it seems that programming aptitude is a bimodal
 distribution, with very little migration from the can't program hump
 into the can program hump. There does seem to be a simple predictor for
 which hump you fall into: those who intuitively develop a consistent
 model of assignment (right or wrong, it doesn't matter, so long as it is
 consistent) can learn to program. Those who don't, can't.

A later paper by the same authors...
(http://www.eis.mdx.ac.uk/research/PhDArea/saeed/paper3.pdf)

Abstract:
[...] Despite a great deal of research into teaching methods
and student responses, there have been to date no strong
predictors of success in learning to program.  Two years ago
we appeared to have discovered an exciting and enigmatic new
predictor of success in a first programming course. We now
report that after six experiments, involving more than 500
students at six institutions in three countries, the predictive
effect of our test has failed to live up to that early promise.
We discuss the strength of the effects that have been observed
and the reasons for some apparent failures of prediction.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: opinion: comp lang docs style

2011-01-05 Thread ru...@yahoo.com
On 01/04/2011 11:29 PM, Steven D'Aprano wrote:
 On Tue, 04 Jan 2011 15:17:37 -0800, ru...@yahoo.com wrote:

 If one wants to critique the 'Python Docs', especially as regards to
 usefulness to beginners, one must start with the Tutorial; and if one
 wants to use if statements as an example, one must start with the
 above.

 No.  The language reference (LR) and standard library reference (SLR)
 must stand on their own merits.  It is nice to have a good tutorial for
 those who like that style of learning.  But it should be possible for a
 programmer with a basic understanding of computers and some other
 programming languages to understand how to program in python without
 referring to tutorials, explanatory websites, commercially published
 books, the source code, etc.

 No it shouldn't. That's what the tutorial is for. The language reference
 and standard library reference are there to be reference manuals, not to
 teach beginners Python.

Yes it should.  That's not what the tutorial is for.  The
(any) tutorial is for people new to python, often new to
programming, who have the time and a learning style suitable
for sitting down and going through a slow step-by-step
exposition, much as one would get in a classroom.  That is
a perfectly valid way for someone in that target audience
to learn python.

Your (and Terry's) mistake is to presume that it is
appropriate for everyone, perhaps because it worked for you
personally.  There is a large class of potential python
users for whom a tutorial is highly suboptimal -- people
who have some significant programming experience, who don't
have the time or patience required to go through it getting
information serially bit by bit, or whose learning style is,
"don't spoon feed me, just tell me concisely what python
does," who fill in gaps on a need-to-know basis rather than
linearly.  I (and many others) don't need or want an
explanation of how to use lists as a stack!

A language reference manual should completely and accurately
describe the language it documents.  (That seems fairly obvious
to me although there will be differing opinions of how precise
one needs to be, etc.)  Once it meets that minimum standard,
its quality is defined by how effectively it transfers that
information to its target audience.  A good reference manual
meets the learning needs of the target audience above admirably.

I learned Perl (reputedly more difficult to learn than Python)
from the Perl manpages and used it for many many years before
I ever bought a Perl book.  I learned C mostly from Harbison
and Steele's "C: A Reference Manual".  Despite several attempts at
python using its reference docs, I never got a handle on
it until I forked out money for Beazley's book.  There is
obviously nothing inherently difficult about python -- it's
just that python's reference docs are written for people who
already know python.  Since limiting their scope that narrowly
is not necessary, as other languages show, it is fair to say
that python's reference docs are poorer.

 In any case, your assumption that any one documentation work should stand
 on its own merits is nonsense -- *nothing* stands alone. Everything
 builds on something else. Technical documentation is no different: it
 *must* assume some level of knowledge of its readers -- should it be
 aimed at Python experts, or average Python coders, or beginners, or
 beginners to programming, or at the very least is it allowed to assume
 that the reader already knows how to read?

 You can't satisfy all of these groups with one document, because their
 needs are different and in conflict. This is why you have different
 documentation -- tutorials and reference manuals and literate source code
 and help text are all aimed at different audiences. Expecting one
 document to be useful for all readers' needs is like expecting one data
 type to be useful for all programming tasks.

I defined (roughly) the target audience I was talking about
when I wrote for a programmer with a basic understanding of
computers and some other programming languages.

Let's dispense with the 6th-grade arguments about people who
don't know how to read, etc.

 Reasonable people might disagree on what a particular documentation work
 should target, and the best way to target it, but not on the need for
 different documentation for different targets.

As I hope I clarified above, that was exactly my point too.
There is a significant, unsatisfied gap between the audience
that a tutorial aims at, and the audience that the reference
docs as currently written seem to be aimed at.  Since other
language manuals incorporate this gap audience more or less
successfully in their reference manuals, python's failure to
do so is justification for calling them poor.
(Of course they are poor in lots of other ways too but my
original response was prompted by the erroneous claim that
good (in my sense above) reference manuals were unnecessary
because a tutorial exists.)
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: opinion: comp lang docs style

2011-01-05 Thread ru...@yahoo.com
On 01/05/2011 12:23 AM, Alice Bevan–McGregor wrote:
  On 2011-01-04 22:29:31 -0800, Steven D'Aprano said:
 
  In any case, your assumption that any one documentation work should stand
  on its own merits is nonsense -- *nothing* stands alone.
 
  +1

I responded more fully in my response to Steven, but you, like
he, are taking "stand on its own merits" out of context.  The
context I gave was someone who wants a complete and accurate
description of python and who understands programming with
other languages but not python.

  How many RFCs still in use today don't start with:
 
  The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT,
  SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this
  document are to be interpreted as described in RFC 2119

RFC 2119 is incorporated in the others by reference.  It is purely
a matter of technical convenience that those definitions, which are
common to hundreds of RFCs, are factored out to a single common
location.  RFC 2119 is not a tutorial.

  I posted a response on the article itself, rather than pollute a
  mailing list with replies to a troll.  The name calling was a rather
  large hint as to the intention of the opinion, either that or whoever
  translated the article (man or machine) was really angry at the time.
  ;)

I can hint to my neighbor that his stereo is too loud by
throwing a brick through his window.  Neither that nor calling
people arrogant ignoramus is acceptable in polite society.
I am not naive, nor shocked, that c.l.p is not always polite,
and normally would not have even commented on it except that
1) Terry Reedy is usually more polite and thoughtful,
and 2) Xah Lee's post was not a troll -- it was a legitimate
comment on free software documentation (including specifically
python's) and while I don't agree with some of his particulars,
the Python docs would be improved if some of his comments
were considered rather than dismissed with mindless epithets
like troll and arrogant ignoramus.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: opinion: comp lang docs style

2011-01-04 Thread ru...@yahoo.com
On 01/04/2011 01:34 PM, Terry Reedy wrote:
 On 1/4/2011 1:24 PM, an Arrogant Ignoramus wrote:

 what he called
 an opinion piece.

 I normally do not respond to trolls, but while expressing his opinions,
 AI made statements that are factually wrong at least as regards Python
 and its practitioners.

Given that most trolls include factually false statements,
the above is inconsistent.  And speaking of arrogant, it
is just that to go around screaming "troll" about a posting
relevant to the newsgroup it was posted in because you don't
happen to agree with its content.  In doing so you lower
your own credibility.  (Which is also not helped by your
Arrogant Ignoramus name-calling.)

 [...]
 2. AI also claims that this notation is 'incomprehensible'.

Since incomprehensibility is clearly subjective, your claim
that it is a factual error is every bit as hyperbolic as his.

 [...]
 3. AI's complaint is deceptive and deficient in omitting any mention of the
 part of the docs *intended* to teach beginners: the Tutorial. The main
 doc pages list the Tutorial first, as what one should start with. That
 [...]
 If one wants to critique the 'Python Docs', especially as regards to
 usefulness to beginners, one must start with the Tutorial; and if one
 wants to use if statements as an example, one must start with the above.

No.  The language reference (LR) and standard library reference
(SLR) must stand on their own merits.  It is nice to have a good
tutorial for those who like that style of learning.  But it should
be possible for a programmer with a basic understanding of computers
and some other programming languages to understand how to program
in python without referring to tutorials, explanatory websites,
commercially published books, the source code, etc.

The difficulty of doing that is a measure of the failure of the
python docs to achieve a level of quality commensurate with the
language itself.

FWIW, I think the BNF in the LR is perfectly reasonable given the
target audience I gave above.  The failure of the LR has more to
do with missing or excessively terse material -- it concentrates
too exclusively on syntax and insufficiently on semantics.  Much
of the relevant semantics information is currently mislocated in
the SLR.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Performance: sets vs dicts.

2010-09-03 Thread ru...@yahoo.com
On 09/02/2010 02:47 PM, Terry Reedy wrote:
 On 9/1/2010 10:57 PM, ru...@yahoo.com wrote:

 So while you may think most people rarely read
 the docs for basic language features and objects
 (I presume you don't mean to restrict your statement
 to only sets), I and most people I know *do* read
 them.  And when I read them I expect them, as any good
 reference documentation does, to completely and
 accurately describe the behavior of the item I am
 reading about.  If big-O performance is deemed an
 intrinsic behavior of an (operation of) an object,
 it should be described in the documentation for
 that object.

 However, big-O performance is intentionally NOT so deemed.

The discussion, as I understood it, was about whether
or not it *should* be so deemed.

 And I have
 and would continue to argue that it should not be, for multiple reasons.

Yes, you have.  And others have argued the opposite.
Personally, I did not find your arguments very convincing,
particularly that it would be misleading or that the
limits necessarily imposed by a real implementation
somehow invalidates the usefulness of O() documentation.
But I acknowledged that there was not universal agreement
that O() behavior should be documented in the the reference
docs by qualifying my statement with the word if.

But mostly my comments were directed towards some of the
side comments in Raymond's post I thought should not pass
unchallenged.  I think that some of the attitudes expressed
(and shared by others) are likely the direct cause of many
of the faults I find in the current documentation.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Performance: sets vs dicts.

2010-09-01 Thread ru...@yahoo.com
On 09/01/2010 04:51 PM, Raymond Hettinger wrote:
 On Aug 30, 6:03 am, a...@pythoncraft.com (Aahz) wrote:
 That reminds me: one co-worker (who really should have known better ;-)
 had the impression that sets were O(N) rather than O(1).  Although
 writing that off as a brain-fart seems appropriate, it's also the case
 that the docs don't really make that clear, it's implied from requiring
 elements to be hashable.  Do you agree that there should be a comment?

 There probably ought to be a HOWTO or FAQ entry on algorithmic
 complexity that covers classes and functions where the algorithms
 are interesting.  That will concentrate the knowledge in one place
 where performance is a main theme and where the various alternatives
 can be compared and contrasted.


 I think most users of sets rarely read the docs for sets.  The few lines
 in the tutorial are enough so that most folks just get it and don't read
 more detail unless they're attempting something exotic.

I think that attitude is very dangerous.  There is
a long history in this world of one group of people
presuming what another group of people does or does
not do or think.  This seems to be a characteristic
of human beings and is often used to promote one's
own ideology.  And even if you have hard evidence
for what you say, why should 60% of people who don't
read docs justify providing poor quality docs to
the 40% that do?

So while you may think most people rarely read
the docs for basic language features and objects
(I presume you don't mean to restrict your statement
to only sets), I and most people I know *do* read
them.  And when I read them I expect them, as any good
reference documentation does, to completely and
accurately describe the behavior of the item I am
reading about.  If big-O performance is deemed an
intrinsic behavior of an (operation of) an object,
it should be described in the documentation for
that object.

Your use of the word exotic is also suspect.
I learned long ago to always click the advanced
options box on dialogs because most developers/designers
really don't have a clue about what
users need access to.

 Our docs have gotten
 somewhat voluminous,

No they haven't (relative to what they attempt to
describe).  The biggest problem with the docs is
that they are too terse.  They often appear to have
been written by people playing a game of who can
describe X in the minimum number of words that can
still be defended as correct.  While that may be
fun, good docs are produced by considering how to
describe something to the reader, completely and
accurately, as effectively as possible.  The test
is not how few words were used, but how quickly
the reader can understand the object or find the
information being sought about the object.

 so it's unlikely that adding that particular
 needle to the haystack would have cured your colleague's brain-fart
 unless he had been focused on a single document talking about the
 performance characteristics of various data structures.

I don't know the colleague any more than you, so I
feel comfortable saying that having it very likely
*would* have cured that brain-fart.  That is, he
or she very likely would have needed to check some
behavior of sets at some point and would have either
noted the big-O characteristics in passing, or would
have noted that such information was available, and
would have returned to the documentation when the
need for that information arose.  The reference
description of sets is the *one* canonical place to
look for information about sets.

There are people who don't read documentation, but
one has to be very careful not to use the existence
of such people as an excuse to justify sub-standard
documentation.

So I think relegating algorithmic complexity information
to some remote document far from the description of the
object it pertains to, is exactly the wrong approach.
This is not to say that a performance HOWTO or FAQ
in addition to the reference manual would not be good.
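
(For what it's worth, the behavior in question is easy to
demonstrate; a sketch, with timings that will of course vary by
machine:)

    import timeit

    setup = "data = list(range(100000)); s = set(data)"

    # List membership scans element by element: O(N).
    t_list = timeit.timeit("99999 in data", setup=setup, number=1000)

    # Set membership is a hash lookup: O(1) on average, no matter
    # how large the set grows.
    t_set = timeit.timeit("99999 in s", setup=setup, number=1000)

    # Expect t_list to exceed t_set by several orders of magnitude.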

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to convert (unicode) text to image?

2010-08-30 Thread ru...@yahoo.com
On 08/30/2010 04:50 AM, Thomas Jollans wrote:
 On Monday 30 August 2010, it occurred to ru...@yahoo.com to exclaim:
 Face the facts dude.  The Python docs have some major problems.
 They were pretty good when Python was a new, cool, project used
 by a handful of geeks.  They are good relative to the average
 (whatever that is) open source project -- but that bar is so low
 as to be a string lying on the ground.

 Actually, the Python standard library reference manual is excellent. At least
 that's my opinion.

 Granted, it's not necessarily the best in the world. It could probably be
 better. But that goes for just about every documentation effort there is.

 What exactly are you comparing the Python docs to, I wonder? Obviously not
 something like Vala, but that goes without saying. kj said that the Perl
 docs were better. I can't comment on that. I also won't comment on the sorry
 mess that the language Perl is, either.
 There are a few documentation efforts that I recognize are actually better
 than the Python docs: Firstly, the MSDN Library docs for the .Net framework.
 Not that I refer to it much, but it is excellent, and it probably was a pretty
 darn expensive project too. Secondly, the libc development manual pages on
 Linux and the BSDs. Provided you know your way around the C library, they are
 really a top-notch reference.

The Postgresql docs have always seemed pretty good to me.  And I'll
second kj's nomination of Perl.  The Perl docs have plenty of faults
but many years ago I was able to learn Perl with nothing more than
those docs.  It was well over five years later that I ever got
around to buying a commercial Perl book.  In contrast, I made
several honest efforts to learn Python the same way but found it
impossible and never got a handle on it until I bought Lutz's and
Beazley's books.  (Of which Beazley's was by far the most useful;
Lutz became a doorstop pretty quickly.  And yes, I knew about but
didn't use the tutorial -- tutorials are one way of presenting
information that isn't appropriate for everyone or in every
situation, and the existence of one in no way excuses inadequate
reference material.)

If one is comparing the Python docs to others, comparing them to
Beazley's book is informative.  Most of the faults I find with the
book are in the places he took material from the Python docs nearly
verbatim.  The material he interprets and explains (usually quite
tersely) is much clearer than similar material (if it even exists)
in the Python docs.

Finally, it is not really necessary to compare the Python docs to
others to make a judgment -- simply looking at the hours taken to
solve some problem that could have been avoided with a couple more
sentences in the docs, the number of hours spent trying to figure
out some behavior by poring over the standard lib code, the number
of times one decides how to write code by trying it, with fingers
crossed that one isn't relying on some accidental effect that will
change with the next version or platform -- these can give a pretty
good indication of the magnitude of the doc problems.

I think one reason for the frequent "Python docs are great" opinions
here is that eventually one figures out the hard way how things
work, and tends to rely less on the docs as documentation and more
as a mnemonic.  And for that the existing docs are adequate.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to convert (unicode) text to image?

2010-08-30 Thread ru...@yahoo.com
On 08/30/2010 01:14 PM, Terry Reedy wrote:
 On 8/30/2010 12:23 AM, ru...@yahoo.com wrote:
 The Python docs have some major problems.

 And I have no idea what you think they are.

I have written about a few of them here in the past.  I'm sure
Google will turn up something.

 I have participated in 71 doc improvement issues on the tracker. Most of
 those I either initiated or provided suggestions. How many have you
 helped with?

Certainly not 71.  But there is, for example, 
http://bugs.python.org/issue1397474
Please note the date on it.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to convert (unicode) text to image?

2010-08-29 Thread ru...@yahoo.com
On 08/29/2010 08:21 PM, alex23 wrote:
 kj no.em...@please.post wrote:
snip
 Sorry for the outburst, but unfortunately, PIL is not alone in
 this.  Python is awash in poor documentation. [...]
 I have to conclude that the problem with Python docs
 is somehow systemic...

 Yes, if everyone else disagrees with you, the problem is obviously
 systemic.

No, not everyone disagrees with him.  There are many people who
absolutely agree with him.

 What helps are concrete suggestions to the package maintainers about
 how these improvements could be made, rather than huge sprawling
 attacks on the state of Python documentation (and trying to tie it
 into the state of Python itself) as a whole.

Nothing you quoted of what he wrote attempted to tie it into the
state of Python itself.

 Instead, what we get are
 huge pointless rants like yours whenever someone finds that something
 isn't spelled out for them in exactly the way that they want.

He never complained about spelling choices.

 These people are _volunteering_ their effort and their code,

Yes, we all know that.

 all
 you're providing is an over-excess of hyperbole

It is hardly convincing when one criticizes hyperbole with hyperbole.

 and punctuation. What
 is frustrating to me is seeing people like yourself spend far more
 time slamming these projects than actually contributing useful changes
 back.

Face the facts dude.  The Python docs have some major problems.
They were pretty good when Python was a new, cool, project used
by a handful of geeks.  They are good relative to the average
(whatever that is) open source project -- but that bar is so low
as to be a string lying on the ground.

Your overly defensive and oppressive response does not help.
All it (combined with similar knee-jerk responses) does is
act to suppress any criticism, leaving the impression that
the Python docs are really great, an assertion commonly made
here and often left unchallenged.  Responses like yours
create a force that works to maintain the status quo.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A question about the possibility of raise-yield in Python

2010-06-30 Thread ru...@yahoo.com
On Jun 30, 10:48 am, John Nagle na...@animats.com wrote:
 On 6/30/2010 12:13 AM, Дамјан Георгиевски wrote:

  A 'raise-yield' expression would break the flow of a program just like
  an exception, going up the call stack until it would be handled, but
  also like yield it would be possible to continue the flow of the
  program from where it was raise-yield-ed.

      Bad idea.  Continuing after an exception is generally troublesome.
 This was discussed during the design phase of Ada, and rejected.
 Since then, it's been accepted that continuing after an exception
 is a terrible idea.  The stack has already been unwound, for example.

      What you want, in the situation you describe, is an optional
 callback, to be called in case of a fixable problem. Then the
 caller gets control, but without stack unwinding.

Strangely, I was just thinking about something similar
(non-stack-unwinding exceptions) the other day.  Something like:

def caller():
  try: callee()
  except SomeError, exc: ...
  else exclist: ...

def callee():
  if error: raise SomeError()
  else raise2: SomeWarning()

raise2 would create an exception object but, unlike
raise, would save it in a list somewhere, and when
callee() returned normally, the list would be made
available to caller, possibly in a parameter to
the try/except else clause as shown above.  Obviously
raise2 is a placeholder for some way to signal that
this is a non-stack-unwinding exception.

The use case addressed is to note exceptional conditions
in a function that aren't exceptional enough to be fatal
but which the caller may or may not care about.
Similar to the warnings module but without the brokenness
of doing I/O.  It integrates well with the existing way of
handling fatal exceptions.

No idea if something like this is even remotely feasible.
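
(The closest approximation I can see with today's Python -- a sketch
with invented names, not the proposal itself -- is to accumulate the
non-fatal conditions and return them alongside the normal result:)

    def callee():
        notes = []                   # non-fatal conditions collect here
        result = 42
        if result > 40:
            # Noted, not raised: normal flow continues.
            notes.append(ValueError("result suspiciously large"))
        return result, notes

    def caller():
        result, notes = callee()
        for exc in notes:            # caller decides whether to care
            print("warning: %s" % exc)
        return result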
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: I strongly dislike Python 3

2010-06-30 Thread ru...@yahoo.com
On Jun 30, 9:42 am, Michele Simionato michele.simion...@gmail.com
wrote:

 Actually when debugging I use pdb which uses p (no parens) for
 printing, so having
 print or print() would not make any difference for me.

Perhaps you don't use CJK strings much?
 p u'\u30d1\u30a4\u30c8\u30f3' gives quite a different
result than
 print u'\u30d1\u30a4\u30c8\u30f3'
at least in python2.  Is this different in python3?
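
(A sketch of the difference, as a Python-2 session; the exact output
shapes are from memory and assume a terminal font with the glyphs:)

    # Python 2:
    s = u'\u30d1\u30a4\u30c8\u30f3'
    print repr(s)   # u'\u30d1\u30a4\u30c8\u30f3' -- what pdb's "p s" shows
    print s         # パイトン                    -- the actual characters

    # In Python 3, repr() no longer escapes printable non-ASCII
    # characters, so even pdb's "p" shows the glyphs.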
-- 
http://mail.python.org/mailman/listinfo/python-list