Re: extracting string.Template substitution placeholders

2014-01-16 Thread gmflanagan
On Sunday, January 12, 2014 3:08:31 PM UTC, Eric S. Johansson wrote:
> As part of speech recognition accessibility tools that I'm building, I'm
> using string.Template. In order to construct on-the-fly grammar, I need
> to know all of the identifiers before the template is filled in. what is
> the best way to do this?

Try this:

import string
cmplxstr="""a simple $string a longer $string a $last line ${another} one"""

def finditer(s):
    for match in string.Template.pattern.finditer(s):
        arg = match.group('braced') or match.group('named')
        if arg:
            yield arg


if __name__ == '__main__':
    print set(finditer(cmplxstr))
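
For anyone on Python 3, here is a minimal sketch of the same idea that keeps the identifiers in order of first appearance instead of returning a set. The function name `placeholders` and the `$$5` tail added to the sample text are my own, chosen only to show that an escaped dollar sign is skipped; treat it as an illustration rather than the one right way.

```python
import string

def placeholders(template_text):
    """Return placeholder names in order of first appearance, without duplicates."""
    names = []
    for match in string.Template.pattern.finditer(template_text):
        # 'named' matches $name, 'braced' matches ${name}; '$$' matches neither.
        name = match.group("named") or match.group("braced")
        if name and name not in names:
            names.append(name)
    return names

sample = "a simple $string a longer $string a $last line ${another} one for $$5"
names = placeholders(sample)
print(names)

# Fill every placeholder with a dummy value to confirm none was missed.
filled = string.Template(sample).substitute({name: name.upper() for name in names})
print(filled)
```

If `substitute` raises no KeyError, every identifier in the template was found.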

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Converting folders of jpegs to single pdf per folder

2014-01-16 Thread vasishtha . spier
On Thursday, January 16, 2014 12:07:59 PM UTC-8, Tim Golden wrote:
>
> Here's a quick example. 
> This should walk down the Python directory, creating a text file for 
> each directory. The textfile will contain the names of all the files in 
>  the directory. (NB this might create a lot of text files so run it 
>  inside some temp directory).

> 
> import os
> root = "c:/temp"
> for dirpath, dirnames, filenames in os.walk(root):
>     print("Looking at", dirpath)
>     txt_filename = os.path.basename(dirpath) + ".txt"
>     with open(txt_filename, "w") as f:
>         f.write("\n".join(filenames))
> 
> TJG

Thanks Tim.  It worked like a charm and saved me weeks of work using a drag and 
drop utility. About 250 pdf files created of 50 to 100 pages each.  Here's the 
code in case anyone else can use it.  

import os
from reportlab.pdfgen import canvas
from reportlab.lib.utils import ImageReader

root = "C:\\Users\\Harry\\"

try:
    n = 0
    for dirpath, dirnames, filenames in os.walk(root):
        PdfOutputFileName = os.path.basename(dirpath) + ".pdf"
        c = canvas.Canvas(PdfOutputFileName)
        if n > 0:  # skip the root directory itself
            for filename in filenames:
                LowerCaseFileName = filename.lower()
                if LowerCaseFileName.endswith(".jpg"):
                    print(filename)
                    filepath = os.path.join(dirpath, filename)
                    print(filepath)
                    im = ImageReader(filepath)
                    imagesize = im.getSize()
                    c.setPageSize(imagesize)  # one page per image, sized to the image
                    c.drawImage(filepath, 0, 0)
                    c.showPage()
            c.save()  # write the PDF once all images in the directory are added
        n = n + 1
        print("PDF of Image directory created: " + PdfOutputFileName)

except Exception:
    print("Failed creating PDF")


Re: Is it possible to protect python source code by compiling it to .pyc or .pyo?

2014-01-16 Thread Steven D'Aprano
On Thu, 16 Jan 2014 16:58:48 -0800, Sam wrote:

> I would like to protect my python source code. It need not be foolproof
> as long as it adds inconvenience to pirates.

What makes you think that "pirates" will be the least bit interested in 
your code? No offence intended, I'm sure you worked really, really hard 
to write it, but the internet has hundreds of gigabytes of free and open 
source software which is easily and legally available, not to mention 
easily available (legally or not) non-free software at a relatively cheap 
price. Chances are that your biggest problem will not be piracy, but 
getting anyone to care or even notice that your program exists.


> Is it possible to protect python source code by compiling it to .pyc or
> .pyo? Does .pyo offer better protection?

Compiling to .pyc or .pyo will not give any protection from software 
piracy, since they can just copy the .pyc or .pyo file. It will give a 
tiny bit of protection from people reading your code, but any competent 
Python programmer ought to be able to use the dis module to read the byte 
code.
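
To illustrate how little the byte code hides, here is a minimal Python 3 sketch. The function `secret` and its constants are invented for the example; compiling from a string stands in for loading a .pyc, and the exact opcode names vary between Python versions.

```python
import dis
import io

# Hypothetical "proprietary" function, compiled from source much as a .pyc would hold it.
source = "def secret(x):\n    return x * 42 + 7\n"
code = compile(source, "<example>", "exec")

namespace = {}
exec(code, namespace)

# Disassemble the function's bytecode into a string instead of stdout.
buffer = io.StringIO()
dis.dis(namespace["secret"], file=buffer)
listing = buffer.getvalue()
print(listing)
```

The constants and the multiply/add operations are plainly visible in the listing, so the logic can be reconstructed without the source file.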

Perhaps if you explain what your program is, and why you think it needs 
protection, we can give you some concrete advice.



-- 
Steven


Re: Guessing the encoding from a BOM

2014-01-16 Thread Rustom Mody
On Friday, January 17, 2014 7:10:05 AM UTC+5:30, Tim Chase wrote:
> On 2014-01-17 11:14, Chris Angelico wrote:
> > UTF-8 specifies the byte order
> > as part of the protocol, so you don't need to mark it.

> You don't need to mark it when writing, but some idiots use it
> anyway.  If you're sniffing a file for purposes of reading, you need
> to look for it and remove it from the actual data that gets returned
> from the file--otherwise, your data can see it as corruption.  I end
> up with lots of CSV files from customers who have polluted it with
> Notepad or had Excel insert some UTF-8 BOM when exporting.  This
> means my first column-name gets the BOM prefixed onto it when the
> file is passed to csv.DictReader, grr.

And it's part of the standard:
Table 2.4 here
http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf


Re: Compiling main script into .pyc

2014-01-16 Thread Terry Reedy

On 1/16/2014 10:19 PM, MRAB wrote:
> On 2014-01-17 02:56, bob gailer wrote:
>> On 1/16/2014 8:01 PM, Sam wrote:
>>> One thing I observe about python byte-code compiling is that the main
>>> script does not gets compiled into .pyc. Only imported modules are
>>> compiled into .pyc.
>>>
>>> May I know how can I compile the main script into .pyc?
>>
>> Duh? Just import it!
>
> What if you want to just compile it? Importing will run it!


Write the main script as

def main(): ...
if __name__ == '__main__':  main()

The difference between merely compiling 'def main' and executing it is 
trivial.


Or don't bother compiling and write main.py as one line:
  from realmain import start; start()

--
Terry Jan Reedy



Re: Compiling main script into .pyc

2014-01-16 Thread Dave Angel
 MRAB  Wrote in message:
> On 2014-01-17 02:56, bob gailer wrote:
>> On 1/16/2014 8:01 PM, Sam wrote:
>>> One thing I observe about python byte-code compiling is that the main 
>>> script does not gets compiled into .pyc. Only imported modules are compiled 
>>> into .pyc.
>>>
>>> May I know how can I compile the main script into .pyc?
>> Duh? Just import it!
>>
> What if you want to just compile it? Importing will run it!
> 
> 

Importing will only run the portion of the code not protected by

if __name__ == "__main__":


-- 
DaveA



Re: Compiling main script into .pyc

2014-01-16 Thread MRAB

On 2014-01-17 02:56, bob gailer wrote:
> On 1/16/2014 8:01 PM, Sam wrote:
>> One thing I observe about python byte-code compiling is that the main script
>> does not gets compiled into .pyc. Only imported modules are compiled into .pyc.
>>
>> May I know how can I compile the main script into .pyc?
>
> Duh? Just import it!

What if you want to just compile it? Importing will run it!



Re: Compiling main script into .pyc

2014-01-16 Thread Ned Batchelder

On 1/16/14 8:01 PM, Sam wrote:
> One thing I observe about python byte-code compiling is that the main script
> does not gets compiled into .pyc. Only imported modules are compiled into .pyc.
>
> May I know how can I compile the main script into .pyc? It is to inconvenience
> potential copy-cats.



The standard library has the compileall module that can be used to 
create .pyc files from .py files, but as we've been discussing in 
another thread, you may not want .pyc files.
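
As a rough Python 3 illustration, the sketch below compiles a single script without importing or running it. It uses py_compile, the single-file counterpart of compileall; the file name `main_script.py` and the temporary directory are placeholders of mine, not anything from the original question.

```python
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "main_script.py")  # hypothetical main script
    with open(script, "w") as f:
        f.write("print('this line must not run during compilation')\n")

    # Compile without importing or executing the script.
    target = os.path.join(tmp, "main_script.pyc")
    pyc_path = py_compile.compile(script, cfile=target, doraise=True)
    compiled_ok = os.path.isfile(pyc_path)
    print(pyc_path, compiled_ok)
```

Nothing is printed by the script itself, which shows that compiling and executing are separate steps.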


--
Ned Batchelder, http://nedbatchelder.com



Re: Compiling main script into .pyc

2014-01-16 Thread bob gailer

On 1/16/2014 8:01 PM, Sam wrote:
> One thing I observe about python byte-code compiling is that the main script
> does not gets compiled into .pyc. Only imported modules are compiled into .pyc.
>
> May I know how can I compile the main script into .pyc?

Duh? Just import it!


Re: Process datafeed in one MySql table and output to another MySql table

2014-01-16 Thread Jason Friedman
>
> I have a datafeed which is constantly sent to a MySql table. The table
> grows constantly as the data feeds in. I would like to write a python
> script which process the data in this table and output the processed data
> to another table in another MySql database in real-time.
>
> Which are the python libraries which are suitable for this purpose? Are
> there any useful sample code or project on the web that I can use as
> reference? Thank you.
>

Is there a reason you do not want to move these rows with a mysql command?
drop table if exists temp;
create table temp as select * from source where ...;
insert into target select * from temp;
delete from source where id in (select id from temp);
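
If some processing in Python is needed between the read and the write, the same copy-then-delete pattern might look like the sketch below. Big caveats: sqlite3 from the standard library stands in for MySQL so the example is self-contained, both tables live in one in-memory database rather than two servers, and the table names, column names and the upper-casing "processing" step are all invented.

```python
import sqlite3

# sqlite3 stands in for MySQL here; the table and column names are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO source (payload) VALUES (?)", [("a",), ("b",), ("c",)])
conn.commit()

def move_batch(conn):
    """Copy all pending rows to target (processed), then delete them from source."""
    with conn:  # commits on success, rolls back on error
        rows = conn.execute("SELECT id, payload FROM source ORDER BY id").fetchall()
        processed = [(row_id, payload.upper()) for row_id, payload in rows]
        conn.executemany("INSERT INTO target (id, payload) VALUES (?, ?)", processed)
        conn.executemany("DELETE FROM source WHERE id = ?",
                         [(row_id,) for row_id, _ in rows])
    return len(rows)

moved = move_batch(conn)
remaining = conn.execute("SELECT COUNT(*) FROM source").fetchone()[0]
target_rows = conn.execute("SELECT payload FROM target ORDER BY id").fetchall()
print(moved, remaining, target_rows)
```

Calling `move_batch` in a loop with a short sleep would approximate "real-time"; with two separate databases the single transaction shown here no longer covers both sides, so that part would need more thought.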


Re: Process datafeed in one MySql table and output to another MySql table

2014-01-16 Thread Denis McMahon
On Thu, 16 Jan 2014 17:03:24 -0800, Sam wrote:

> I have a datafeed which is constantly sent to a MySql table ...

> Which are the python libraries which are suitable for this purpose? Are
> there any useful sample code or project on the web that I can use as
> reference?

Did you search for mysql on the python docs website, or perhaps try 
googling "python mysql"?

-- 
Denis McMahon, denismfmcma...@gmail.com


Re: Python solve problem with string operation

2014-01-16 Thread Asaf Las
inpu = "3443331123377"
tstr = inpu[0]
for k in range(1, len(inpu)):
    if inpu[k] != inpu[k-1]:
        tstr = tstr + inpu[k]

print(tstr)



Re: Guessing the encoding from a BOM

2014-01-16 Thread Tim Chase
On 2014-01-17 11:14, Chris Angelico wrote:
> UTF-8 specifies the byte order
> as part of the protocol, so you don't need to mark it.

You don't need to mark it when writing, but some idiots use it
anyway.  If you're sniffing a file for purposes of reading, you need
to look for it and remove it from the actual data that gets returned
from the file--otherwise, your data can see it as corruption.  I end
up with lots of CSV files from customers who have polluted it with
Notepad or had Excel insert some UTF-8 BOM when exporting.  This
means my first column-name gets the BOM prefixed onto it when the
file is passed to csv.DictReader, grr.
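
For the reading side, a minimal Python 3 sketch of the problem and one possible workaround: decoding with the 'utf-8-sig' codec, which drops the signature when it is present. The sample bytes and column names are made up.

```python
import csv
import io

# Bytes as a BOM-prefixed export might look; the column names are invented.
raw = b"\xef\xbb\xbfname,qty\r\nwidget,3\r\n"

# Decoding as plain UTF-8 keeps the signature as U+FEFF in the first header.
polluted = csv.DictReader(io.StringIO(raw.decode("utf-8"), newline=""))
polluted_fields = polluted.fieldnames

# The 'utf-8-sig' codec drops the signature if present, and is harmless if not.
clean = csv.DictReader(io.StringIO(raw.decode("utf-8-sig"), newline=""))
clean_fields = clean.fieldnames
print(polluted_fields, clean_fields)
```

When opening a file directly, passing `encoding="utf-8-sig"` to `open()` should have the same effect.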

-tkc





Re: Is it possible to protect python source code by compiling it to .pyc or .pyo?

2014-01-16 Thread Ethan Furman

On 01/16/2014 05:09 PM, Chris Angelico wrote:
> On Fri, Jan 17, 2014 at 11:58 AM, Sam  wrote:
>> I would like to protect my python source code. It need not be foolproof as long
>> as it adds inconvenience to pirates.
>>
>> Is it possible to protect python source code by compiling it to .pyc or .pyo?
>> Does .pyo offer better protection?

No and no.

> Distribute your code with a copyright notice, accept that a few people
> will rip you off, and have done with it.


Yes.  One of the nice things about Python is being able to fix bugs myself [1].

--
~Ethan~


[1] Yes, I file upstream bug reports.  :)


Re: Guessing the encoding from a BOM

2014-01-16 Thread Steven D'Aprano
On Thu, 16 Jan 2014 11:37:29 -0800, Albert-Jan Roskam wrote:

>  On Thu, 1/16/14, Chris Angelico  wrote:
>
>  Subject: Re: Guessing the encoding from a BOM
>  To:
>  Cc: "python-list@python.org"
>  Date: Thursday, January 16, 2014, 7:06 PM
>
>  On Fri, Jan 17, 2014 at 5:01 AM, Björn Lindqvist  wrote:
>  > 2014/1/16 Steven D'Aprano :
>  >> def guess_encoding_from_bom(filename, default):
>  >>     with open(filename, 'rb') as f:
>  >>         sig = f.read(4)
>  >>     if sig.startswith((b'\xFE\xFF', b'\xFF\xFE')):
>  >>         return 'utf_16'
>  >>     elif sig.startswith((b'\x00\x00\xFE\xFF', b'\xFF\xFE\x00\x00')):
>  >>         return 'utf_32'
>  >>     else:
>  >>         return default
>  >
>  > You might want to add the utf8 bom too: '\xEF\xBB\xBF'.
>
>  I'd actually rather not. It would tempt people to pollute UTF-8 files
>  with a BOM, which is not necessary unless you are MS Notepad.
>
>
>  ===> Can you elaborate on that? Unless your utf-8 files will only
>  contain ascii characters I do not understand why you would not want a
>  bom utf-8.

Because the UTF-8 signature -- it's not actually a Byte Order Mark -- is 
not really necessary. Unlike UTF-16 and UTF-32, there is no platform 
dependent ambiguity between Big Endian and Little Endian systems, so the 
UTF-8 stream of bytes is identical no matter what platform you are on.

If the UTF-8 signature was just unnecessary, it wouldn't be too bad, but 
it's actually harmful. Pure-ASCII text encoded as UTF-8 is still pure 
ASCII, and so backwards compatible with old software that assumes ASCII. 
But the same pure-ASCII text encoded as UTF-8 with a signature looks like 
a binary file.


> Btw, isn't "read_encoding_from_bom" a better function name than
> "guess_encoding_from_bom"? I thought the point of BOMs was that there
> would be no more need to guess?

Of course it's a guess. If you see a file that starts with 0000FFFE, is 
that a UTF-32 text file, or a binary file that happens to start with two 
nulls followed by FFFE?
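
For reference, one way the guessing function could be arranged, as a sketch only. It differs from the quoted version in three ways that are my own choices: it takes the already-read bytes instead of a filename, it uses the BOM constants from the codecs module, and it also recognises the UTF-8 signature (which not everyone in this thread wants). The longer UTF-32 signatures are tested first because FF FE 00 00 also begins with the UTF-16 LE mark FF FE, so a UTF-16 test placed first would claim it.

```python
import codecs

# Longest signatures first: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
_SIGNATURES = [
    (codecs.BOM_UTF32_BE, "utf_32"),
    (codecs.BOM_UTF32_LE, "utf_32"),
    (codecs.BOM_UTF8, "utf_8_sig"),
    (codecs.BOM_UTF16_BE, "utf_16"),
    (codecs.BOM_UTF16_LE, "utf_16"),
]

def guess_encoding_from_signature(data, default):
    """Guess a codec name from the first bytes of a file; still only a guess."""
    for signature, encoding in _SIGNATURES:
        if data.startswith(signature):
            return encoding
    return default

print(guess_encoding_from_signature(b"\xff\xfe\x00\x00h\x00\x00\x00", "latin_1"))
```

It remains a guess: a UTF-16 LE file whose first character is NUL starts with the same four bytes as a UTF-32 LE file.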

-- 
Steven


Re: Python solve problem with string operation

2014-01-16 Thread Rhodri James

On Thu, 16 Jan 2014 22:24:40 -, Nac Temha  wrote:


> Hi everyone,
>
> I want to do operation with chars in the given string. Actually I want to
> grouping the same chars.
>
> For example;
>
> input : "3443331123377"
> operation-> (3)(44)()(333)(11)(2)(33)(77)
> output: "34131237"
>
> How can I do without list, regular expression. just using string
> operations. Using an effective methods of python for this problem.


I almost convinced myself this was homework, you know.  A hint as to why  
you might want such a thing would look a lot less suspicious :-)


The simplest way to do this is probably using groupby:


from itertools import groupby

input = "3443331123377"
output = "".join(k for k, _ in groupby(input))
print output



--
Rhodri James *-* Wildebeest Herder to the Masses


Re: Is it possible to protect python source code by compiling it to .pyc or .pyo?

2014-01-16 Thread Chris Angelico
On Fri, Jan 17, 2014 at 11:58 AM, Sam  wrote:
> I would like to protect my python source code. It need not be foolproof as 
> long as it adds inconvenience to pirates.
>
> Is it possible to protect python source code by compiling it to .pyc or .pyo? 
> Does .pyo offer better protection?
>

The only difference between pyo and pyc is that the former is with
optimization done. And neither of them offers any real security.

Even if you compiled it down to machine code, you wouldn't do much to
deter pirates. All you'd do is make it so they have to take your code
as a whole instead of piece-meal.

Fighting against piracy using technology is pretty much guaranteed to
be a losing battle. How much time and effort can you put in, versus
the whole rest of the world? And how much harassment will you permit
on your legitimate users in order to slow down a few who want to rip
you off? I've seen some programs - usually games - that put lots and
lots of checks in (checksumming the program periodically and crashing
if it's wrong, "calling home" and making sure the cryptographic hash
of the binary matches what's on the server, etc, etc)... and they
still get cracked within the first day. And then legitimate purchasers
like me have to deal with the stupidities (single-player games calling
home??), to the extent that it's actually more convenient to buy the
game and then install a cracked version from a torrent, than to
install the version you bought. And there's one particular game where
I've done exactly that. It's just way too much fiddliness to try to
make the legit version work.

Distribute your code with a copyright notice, accept that a few people
will rip you off, and have done with it.

ChrisA


Re: Is it possible to protect python source code by compiling it to .pyc or .pyo?

2014-01-16 Thread Ben Finney
Sam  writes:

> I would like to protect my python source code.

Protect it from what? If there's some specific activity you want to
prevent or restrict, please say what it is, since “protect” is a rather
loaded term.

> It need not be foolproof as long as it adds inconvenience to pirates.

I doubt your software will be at risk from pirates, which are raiders on
the high seas.

If you mean something more specific, please explain, because “pirate” is
an even more loaded term that doesn't explain.

-- 
 \  “Instead of a trap door, what about a trap window? The guy |
  `\  looks out it, and if he leans too far, he falls out. Wait. I |
_o__)guess that's like a regular window.” —Jack Handey |
Ben Finney



Process datafeed in one MySql table and output to another MySql table

2014-01-16 Thread Sam
I have a datafeed which is constantly sent to a MySql table. The table grows 
constantly as the data feeds in. I would like to write a python script which 
processes the data in this table and outputs the processed data to another table 
in another MySql database in real-time.

Which python libraries are suitable for this purpose? Is there any useful 
sample code or project on the web that I can use as reference? Thank 
you.


Re: Is it possible to protect python source code by compiling it to .pyc or .pyo?

2014-01-16 Thread Ned Batchelder

On 1/16/14 7:58 PM, Sam wrote:
> I would like to protect my python source code. It need not be foolproof as long
> as it adds inconvenience to pirates.
>
> Is it possible to protect python source code by compiling it to .pyc or .pyo?
> Does .pyo offer better protection?



First, .pyc and .pyo are nearly identical: they are bytecode.  The only 
difference is that .pyo has been "optimized", which in this case simply 
means that the docstrings and asserts are gone.  It is not difficult to 
see what a Python program does by looking at the bytecode, and the 
standard library includes the dis module for disassembling it.


How to protect your code depends an awful lot on what kinds of secrets 
are in the code, and how valuable those secrets are, and therefore how 
hard someone will work to get at them.


--
Ned Batchelder, http://nedbatchelder.com



Compiling main script into .pyc

2014-01-16 Thread Sam
One thing I observe about python byte-code compiling is that the main script 
does not get compiled into .pyc. Only imported modules are compiled into .pyc. 

May I know how I can compile the main script into .pyc? It is to inconvenience 
potential copy-cats.


Re: interactive help on the base object

2014-01-16 Thread Terry Reedy

On 12/6/2013 8:35 PM, Terry Reedy wrote:

On 12/6/2013 12:03 PM, Mark Lawrence wrote:

Is it just me, or is this basically useless?

 >>> help(object)
Help on class object in module builtins:

class object
  |  The most base type


Given that this can be interpreted as 'least desirable', it could
definitely be improved.


Surely a few more words,


How about something like.

'''The default top superclass for all Python classes.

Its methods are inherited by all classes unless overridden.
'''

When you have 1 or more concrete suggestions for the docstring, open a
tracker issue.


At Mark's invitation, I have done so.
http://bugs.python.org/issue20285

--
Terry Jan Reedy



Is it possible to protect python source code by compiling it to .pyc or .pyo?

2014-01-16 Thread Sam
I would like to protect my python source code. It need not be foolproof as long 
as it adds inconvenience to pirates.

Is it possible to protect python source code by compiling it to .pyc or .pyo? 
Does .pyo offer better protection?



Re: Python solve problem with string operation

2014-01-16 Thread giacomo boffi
giacomo boffi  writes:

> % python a.py
> 34131237

% cat a.py 
i="3443331123377";n=0
while n+1!=len(i):i,n=(i[:n]+i[n+1:],n) if i[n+1]==i[n] else (i,n+1)
print i
% python a.py 
34131237
%
-- 
for Nikos


Re: Python solve problem with string operation

2014-01-16 Thread Denis McMahon
On Fri, 17 Jan 2014 00:24:40 +0200, Nac Temha wrote:

> Hi everyone,
> 
> I want to do operation with chars in the given string. Actually I want
> to grouping the same chars.
> 
> For example;
> 
> input : "3443331123377"
> operation-> (3)(44)()(333)(11)(2)(33)(77)
> output: "34131237"

> How can I do without list, regular expression. just using string
> operations. Using an effective methods of python for this problem.

You can do it on one line, but it looks really messy:

output = ''.join([{x:input[x]for x in range(len(input))}[x]for x in range
(len({x:input[x]for x in range(len(input))}))if(x==0 or {x:input[x]for x 
in range(len(input))}[x-1]!={x:input[x]for x in range(len(input))}[x])])

It looks much better if you do it in steps:

a = {x:input[x]for x in range(len(input))}
b = [a[n]for n in range(len(a))if(n==0 or a[n-1]!=a[n])]
output = ''.join(b)

If you really want to do it using just 'string' ops:

for i in range(len(input)):
    if (i==0):
        output=input[0]
    elif input[i]!=input[i-1]:
        output+=input[i]
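
The string-only loop can also be wrapped in a function, sketched here for Python 3. The name `collapse_runs` is mine. One observation: the input exactly as printed in this thread, "3443331123377", collapses to "3431237", not the quoted "34131237". So the sample below assumes the intended input had a 1 after the 44, which is only my guess.

```python
from itertools import groupby

def collapse_runs(text):
    """Keep one character from each run of identical adjacent characters."""
    result = ""
    for ch in text:
        # result[-1:] is "" for an empty string, so the first char is always kept.
        if result[-1:] != ch:
            result += ch
    return result

sample = "34413331123377"  # assumed input; see the note above
collapsed = collapse_runs(sample)
print(collapsed)

# Cross-check against the itertools.groupby approach.
assert collapsed == "".join(key for key, _ in groupby(sample))
```

Slicing with `result[-1:]` avoids a special case for the first character and also makes the function safe for an empty string.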

-- 
Denis McMahon, denismfmcma...@gmail.com


Re: Python solve problem with string operation

2014-01-16 Thread giacomo boffi
Nac Temha  writes:

> Hi everyone,
>
> I want to do operation with chars in the given string. Actually I want to
> grouping the same chars.
>
> For example;
>
> input : "3443331123377"
> operation-> (3)(44)()(333)(11)(2)(33)(77)
> output: "34131237"
>
>
>
> How can I do without list, regular expression. just using string operations. 
> Using an effective methods of python for this problem.


% cat a.py
def f(s,n):
    if s[n+1] == s[n]:
        return s[:n]+s[n+1:], n
    return s, n+1

i = "3443331123377"
n = 0

while n+1 != len(i):
    i, n = f(i, n)

print i
% python a.py
34131237
% 

-- 
your instructor is a mean person


Re: Guessing the encoding from a BOM

2014-01-16 Thread Chris Angelico
On Fri, Jan 17, 2014 at 6:37 AM, Albert-Jan Roskam  wrote:
> Can you elaborate on that? Unless your utf-8 files will only contain ascii 
> characters I do not understand why you would not want a bom utf-8.

It's completely unnecessary, and could cause problems (the BOM is
actually whitespace, albeit zero-width, so it could effectively indent
the first line of your source code).  UTF-8 specifies the byte order
as part of the protocol, so you don't need to mark it.

ChrisA


Re: Unicode strings as arguments to exceptions

2014-01-16 Thread Terry Reedy

On 1/16/2014 9:16 AM, Steven D'Aprano wrote:
> On Thu, 16 Jan 2014 13:34:08 +0100, Ernest Adrogué wrote:
>
>> Hi,
>>
>> There seems to be some inconsistency in the way exceptions handle
>> Unicode strings.
>
> Yes. I believe the problem lies in the __str__ method. For example,
> KeyError manages to handle Unicode, although in an ugly way:
>
> py> str(KeyError(u'ä'))
> "u'\\xe4'"
>
> Hence:
>
> py> raise KeyError(u'ä')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> KeyError: u'\xe4'
>
>
> While ValueError assumes ASCII and fails:
>
> py> str(ValueError(u'ä'))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in
> position 0: ordinal not in range(128)
>
> When displaying the traceback, the error is suppressed, hence:
>
> py> raise ValueError(u'ä')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError
>
> I believe this might be accepted as a bug report on ValueError.


Or a change might be rejected as a feature change or as a bugfix that 
might break existing code. We do change exception messages in new 
versions but do not normally do so in bugfix releases.


http://bugs.python.org/issue1012952 is related but different. The issue 
there was that unicode(ValueError(u'ä')) gave the same 
UnicodeEncodeError as str(ValueError(u'ä')). That was fixed by giving 
exceptions a __unicode__ method, but that did not fix the traceback 
display issue above.


http://bugs.python.org/issue6108
unicode(exception) and str(exception) should return the same message
also seems related. The issue was raised what str should do if the 
unicode message had non-ascii chars. I did not read enough to find an 
answer. The same question would arise here.


--
Terry Jan Reedy




Re: Unicode strings as arguments to exceptions

2014-01-16 Thread Terry Reedy

On 1/16/2014 7:34 AM, Ernest Adrogué wrote:
> Hi,
>
> There seems to be some inconsistency in the way exceptions handle Unicode
> strings.  For instance, KeyError seems to not have a problem with them
>
> >>> raise KeyError('a')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> KeyError: 'a'
> >>> raise KeyError(u'ä')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> KeyError: u'\xe4'
>
> On the other hand ValueError doesn't print anything.
>
> >>> raise ValueError('a')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: a
> >>> raise ValueError(u'ä')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError
>
> I'm using Python 2.7.6 on a Unix machine.


Fixed at some point in 3.x. In 3.4b2:
>>> ValueError(b'a')
ValueError(b'a',)
>>> ValueError('a')
ValueError('a',)
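
A small Python 3 check with a non-ASCII message, as a sketch (the variable names are arbitrary): since str is Unicode there, the message survives in both exception types.

```python
# Python 3: str is Unicode, so the message survives in both cases.
value_error_text = str(ValueError("ä"))
key_error_text = str(KeyError("ä"))  # KeyError shows the repr of its argument
print(value_error_text, key_error_text)
```

The only remaining difference is that KeyError wraps its argument in quotes because it displays the repr.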



--
Terry Jan Reedy




Re: Python solve problem with string operation

2014-01-16 Thread John Gordon
In  Mark Lawrence 
 writes:

> > input = "3443331123377"
> > output = []
> > previous_ch = None
> > for ch in input:
> >     if ch != previous_ch:
> >         output.append(ch)
> >         previous_ch = ch
> > print ''.join(output)
> >

> Cheat, you've used a list :)

Ack!  I missed that the OP doesn't want to use lists.

Well, let's try this instead:

import sys

input = "3443331123377"
previous_ch = None
for ch in input:
    if ch != previous_ch:
        sys.stdout.write(ch)
        previous_ch = ch
sys.stdout.write('\n')

-- 
John Gordon                   Imagine what it must be like for a real medical doctor to
gor...@panix.com              watch 'House', or a real serial killer to watch 'Dexter'.



Re: Python solve problem with string operation

2014-01-16 Thread Mark Lawrence

On 16/01/2014 22:30, John Gordon wrote:
> In  Nac Temha  writes:
>
>> Hi everyone,
>>
>> I want to do operation with chars in the given string. Actually I want to
>> grouping the same chars.
>>
>> For example;
>>
>> input : "3443331123377"
>> operation-> (3)(44)()(333)(11)(2)(33)(77)
>> output: "34131237"
>
> input = "3443331123377"
> output = []
> previous_ch = None
> for ch in input:
>     if ch != previous_ch:
>         output.append(ch)
>         previous_ch = ch
> print ''.join(output)

Cheat, you've used a list :)

--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence



Re: Python solve problem with string operation

2014-01-16 Thread Tim Chase
On 2014-01-17 00:24, Nac Temha wrote:
> Hi everyone,
> 
> I want to do operation with chars in the given string. Actually I
> want to grouping the same chars.
> 
> For example;
> 
> input : "3443331123377"
> operation-> (3)(44)()(333)(11)(2)(33)(77)
> output: "34131237"
> 
> How can I do without list, regular expression. just using string
> operations. Using an effective methods of python for this problem.

I'm not sure what constitutes "just using string operations", but
it's quite simple with stdlib tools:

  >>> from itertools import groupby
  >>> ''.join(k for k,v in groupby("3443331123377"))
  '34131237'

-tkc




Re: Python solve problem with string operation

2014-01-16 Thread John Gordon
In  Nac Temha 
 writes:

> Hi everyone,

> I want to do operation with chars in the given string. Actually I want to
> grouping the same chars.

> For example;

> input : "3443331123377"
> operation-> (3)(44)()(333)(11)(2)(33)(77)
> output: "34131237"

input = "3443331123377"
output = []
previous_ch = None
for ch in input:
    if ch != previous_ch:
        output.append(ch)
        previous_ch = ch
print ''.join(output)

-- 
John Gordon                   Imagine what it must be like for a real medical doctor to
gor...@panix.com              watch 'House', or a real serial killer to watch 'Dexter'.



Python solve problem with string operation

2014-01-16 Thread Nac Temha
Hi everyone,

I want to operate on the chars in a given string. Specifically, I want to
group identical adjacent chars.

For example;

input : "3443331123377"
operation-> (3)(44)()(333)(11)(2)(33)(77)
output: "34131237"



How can I do this without lists or regular expressions, using only string
operations? I am looking for an effective Python method for this problem.


Thanks,
Best regards.


Re: 'Straße' ('Strasse') and Python 2

2014-01-16 Thread Travis Griggs

On Jan 16, 2014, at 2:51 AM, Robin Becker  wrote:

> I assure you that I fully understand my ignorance of ...

Robin, don’t take this personally, I totally got what you meant.

At the same time, I got a real chuckle out of this line. That beats “army 
intelligence” any day.


Re: Converting folders of jpegs to single pdf per folder

2014-01-16 Thread vasishtha . spier
On Thursday, January 16, 2014 12:12:01 PM UTC-8, Tim Golden wrote:
> On 16/01/2014 20:07, Tim Golden wrote:
> 
> > This should walk down the Python directory,
> s/the Python directory/some directory/
> (Sorry, I initially had it walking os.path.dirname(sys.executable))
> TJG

Thanks Tim, that's very helpful. Sorry about the double lines.  For some reason I 
wasn't getting the posts directly in my email and was using Google Groups.  
I've changed my subscription parameters and hopefully I'll get the replies 
directly.
Cheers,
Harry



Re: Converting folders of jpegs to single pdf per folder

2014-01-16 Thread Tim Golden

On 16/01/2014 20:07, Tim Golden wrote:
> This should walk down the Python directory,

s/the Python directory/some directory/

(Sorry, I initially had it walking os.path.dirname(sys.executable))

TJG



Re: Converting folders of jpegs to single pdf per folder

2014-01-16 Thread Tim Golden

On 16/01/2014 19:50, vasishtha.sp...@gmail.com wrote:
> On Thursday, January 16, 2014 11:41:04 AM UTC-8, Tim Golden wrote:
>> The usual go-to library for PDF generation is ReportLab. I haven't used
>> it for a long while but I'm quite certain it would have no problem
>> including images.
>>
>> Do I take it that it's the PDF-generation side of things you're asking
>> about? Or do you need help iterating over hundreds of directories and files?
>>
>> TJG
>
> Its mostly the PDF generating side I need but I haven't yet used the Python
> directory and file traversing functions so an example of this would also be
> useful especially showing how I could capture the directory name and use that
> as the name of the pdf file I'm creating from the directory contents.
>
> Thanks again,
> Harry


Here's a quick example. (And, by the way, please try to avoid the sort 
of double-spacing above, especially if you're coming from Google Groups 
which tends to produce such effects).


This should walk down the Python directory, creating a text file for 
each directory. The textfile will contain the names of all the files in 
the directory. (NB this might create a lot of text files so run it 
inside some temp directory).



import os

root = "c:/temp"

for dirpath, dirnames, filenames in os.walk(root):
    print("Looking at", dirpath)
    txt_filename = os.path.basename(dirpath) + ".txt"
    with open(txt_filename, "w") as f:
        f.write("\n".join(filenames))
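
Adapting the same walk to the original question, here is a rough sketch that only groups the JPEGs by folder and derives one PDF name per folder; the actual PDF writing (for example with ReportLab) is left out. The function name `jpegs_by_folder` is made up for illustration, and two folders with the same base name would collide in this simple version.

```python
import os

def jpegs_by_folder(root):
    """Map '<folder name>.pdf' to the sorted JPEG paths found in that folder."""
    result = {}
    for dirpath, dirnames, filenames in os.walk(root):
        jpegs = sorted(
            os.path.join(dirpath, name)
            for name in filenames
            if name.lower().endswith((".jpg", ".jpeg"))
        )
        if jpegs:  # skip folders without any JPEGs
            pdf_name = os.path.basename(dirpath) + ".pdf"
            result[pdf_name] = jpegs
    return result
```

Each value is the page order for one PDF, so whatever does the PDF generation can simply loop over the dictionary.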




TJG
--
https://mail.python.org/mailman/listinfo/python-list


Re: Converting folders of jpegs to single pdf per folder

2014-01-16 Thread Mark Lawrence

On 16/01/2014 19:50, vasishtha.sp...@gmail.com wrote:

On Thursday, January 16, 2014 11:41:04 AM UTC-8, Tim Golden wrote:

On 16/01/2014 19:11, Harry Spier wrote:






Dear list members,







I have a directory that contains about a hundred subdirectories named



J0001,J0002,J0003 . . . etc.



Each of these subdirectories contains about a hundred JPEGs named



P001.jpg, P002.jpg, P003.jpg etc.







I need to write a python script that will cycle thru each directory and



convert ALL JPEGs in each directory into a single PDF file and save



these PDF files (one per directory) to an output file.







Any pointers on how to do this with a Python script would be



appreciated. Reading on the internet it appears that using ImageMagick



wouldn't work because of using too much memory. Can this be done using



the Python Image Library or some other library? Any sample code would



also be appreciated.




The usual go-to library for PDF generation is ReportLab. I haven't used

it for a long while but I'm quite certain it would have no problem

including images.



Do I take it that it's the PDF-generation side of things you're asking

about? Or do you need help iterating over hundreds of directories and files?



TJG


It's mostly the PDF generating side I need but I haven't yet used the Python 
directory and file traversing functions so an example of this would also be 
useful especially showing how I could capture the directory name and use that 
as the name of the pdf file I'm creating from the directory contents.

Thanks again,
Harry



I'm sorry that I can't help with your problem, but would you please read 
and action this https://wiki.python.org/moin/GoogleGroupsPython to 
prevent us seeing the double line spacing above, thanks.


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: Converting folders of jpegs to single pdf per folder

2014-01-16 Thread vasishtha . spier
On Thursday, January 16, 2014 11:41:04 AM UTC-8, Tim Golden wrote:
> On 16/01/2014 19:11, Harry Spier wrote:
> 
> >
> 
> > Dear list members,
> 
> >
> 
> > I have a directory that contains about a hundred subdirectories named
> 
> > J0001,J0002,J0003 . . . etc.
> 
> > Each of these subdirectories contains about a hundred JPEGs named
> 
> > P001.jpg, P002.jpg, P003.jpg etc.
> 
> >
> 
> > I need to write a python script that will cycle thru each directory and
> 
> > convert ALL JPEGs in each directory into a single PDF file and save
> 
> > these PDF files (one per directory) to an output file.
> 
> >
> 
> > Any pointers on how to do this with a Python script would be
> 
> > appreciated. Reading on the internet it appears that using ImageMagick
> 
> > wouldn't work because of using too much memory. Can this be done using
> 
> > the Python Image Library or some other library? Any sample code would
> 
> > also be appreciated.
> 
> 
> 
> The usual go-to library for PDF generation is ReportLab. I haven't used 
> 
> it for a long while but I'm quite certain it would have no problem 
> 
> including images.
> 
> 
> 
> Do I take it that it's the PDF-generation side of things you're asking 
> 
> about? Or do you need help iterating over hundreds of directories and files?
> 
> 
> 
> TJG

It's mostly the PDF generating side I need but I haven't yet used the Python 
directory and file traversing functions so an example of this would also be 
useful especially showing how I could capture the directory name and use that 
as the name of the pdf file I'm creating from the directory contents.

Thanks again,
Harry
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Guessing the encoding from a BOM

2014-01-16 Thread Albert-Jan Roskam

On Thu, 1/16/14, Chris Angelico  wrote:

 Subject: Re: Guessing the encoding from a BOM
 To: 
 Cc: "python-list@python.org" 
 Date: Thursday, January 16, 2014, 7:06 PM
 
 On Fri, Jan 17, 2014 at 5:01 AM, Björn Lindqvist wrote:
 > 2014/1/16 Steven D'Aprano :
 >> def guess_encoding_from_bom(filename, default):
 >>     with open(filename, 'rb') as f:
 >>         sig = f.read(4)
 >>     if sig.startswith((b'\xFE\xFF', b'\xFF\xFE')):
 >>         return 'utf_16'
 >>     elif sig.startswith((b'\x00\x00\xFE\xFF', b'\xFF\xFE\x00\x00')):
 >>         return 'utf_32'
 >>     else:
 >>         return default
 >
 > You might want to add the utf8 bom too: '\xEF\xBB\xBF'.
 
 I'd actually rather not. It would tempt people to pollute UTF-8 files
 with a BOM, which is not necessary unless you are MS Notepad.
 

 ===> Can you elaborate on that? Unless your UTF-8 files will only contain 
ASCII characters, I do not understand why you would not want a BOM in them.

Btw, isn't "read_encoding_from_bom" a better function name than 
"guess_encoding_from_bom"? I thought the point of BOMs was that there would be 
no more need to guess?

Thanks!

Albert-Jan
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Converting folders of jpegs to single pdf per folder

2014-01-16 Thread Tim Golden

On 16/01/2014 19:11, Harry Spier wrote:


Dear list members,

I have a directory that contains about a hundred subdirectories named
J0001,J0002,J0003 . . . etc.
Each of these subdirectories contains about a hundred JPEGs named
P001.jpg, P002.jpg, P003.jpg etc.

I need to write a python script that will cycle thru each directory and
convert ALL JPEGs in each directory into a single PDF file and save
these PDF files (one per directory) to an output file.

Any pointers on how to do this with a Python script would be
appreciated. Reading on the internet it appears that using ImageMagick
wouldn't work because of using too much memory. Can this be done using
the Python Image Library or some other library? Any sample code would
also be appreciated.


The usual go-to library for PDF generation is ReportLab. I haven't used 
it for a long while but I'm quite certain it would have no problem 
including images.


Do I take it that it's the PDF-generation side of things you're asking 
about? Or do you need help iterating over hundreds of directories and files?


TJG

--
https://mail.python.org/mailman/listinfo/python-list


Converting folders of jpegs to single pdf per folder

2014-01-16 Thread Harry Spier
Dear list members,

I have a directory that contains about a hundred subdirectories named
J0001,J0002,J0003 . . . etc.
Each of these subdirectories contains about a hundred JPEGs named P001.jpg,
P002.jpg, P003.jpg etc.

I need to write a python script that will cycle thru each directory and
convert ALL JPEGs in each directory into a single PDF file and save these
PDF files (one per directory) to an output file.

Any pointers on how to do this with a Python script would be appreciated.
Reading on the internet it appears that using ImageMagick wouldn't work
because of using too much memory. Can this be done using the Python Image
Library or some other library? Any sample code would also be appreciated.

Thanks,
Harry Spier
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Guessing the encoding from a BOM

2014-01-16 Thread Tim Chase
On 2014-01-17 05:06, Chris Angelico wrote:
> > You might want to add the utf8 bom too: '\xEF\xBB\xBF'.  
> 
> I'd actually rather not. It would tempt people to pollute UTF-8
> files with a BOM, which is not necessary unless you are MS Notepad.

If the intent is to just sniff and parse the file accordingly, I get
enough of these junk UTF-8 BOMs at $DAY_JOB that I've had to create
utility-openers much like Steven is doing here.  It's particularly
problematic for me in combination with csv.DictReader, where I go
looking for $COLUMN_NAME and get KeyError exceptions because it wants
me to ask for $UTF_BOM+$COLUMN_NAME for the first column.
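
For what it's worth, on Python 3 the standard `utf-8-sig` codec strips a leading UTF-8 BOM when one is present and otherwise decodes as plain UTF-8, which avoids the mangled first column name. A small sketch, with a made-up helper name (`read_rows`) and made-up sample data:

```python
import csv
import io

def read_rows(raw_bytes):
    """Decode CSV bytes, dropping a UTF-8 BOM if one is present."""
    text = io.StringIO(raw_bytes.decode("utf-8-sig"), newline="")
    return list(csv.DictReader(text))

data = b"\xef\xbb\xbfname,city\r\nAda,London\r\n"
rows = read_rows(data)
print(rows[0]["name"])  # the key is "name", not "\ufeffname"
```

The same codec name can be passed as `encoding="utf-8-sig"` to `open()` when reading from a file instead of from bytes.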

-tkc



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python glob and raw string

2014-01-16 Thread Neil Cerutti
On 2014-01-16, Chris Angelico  wrote:
>> Hmmm... I might be doing too much in __init__. ;)
>
> Hmm, why is it even a class? :) I guess you elided all the
> stuff that makes it impractical to just use a non-class
> function.

I didn't remove anything that makes it obviously class-worthy,
just timestamp checking, and several dicts and sets to store
data.

The original version of that code is just a set of three
functions, but the return result of that version was a single
dict. Once the return value got complicated enough to require
building up a class instance, it became a convenient place to
hang the functions.

-- 
Neil Cerutti

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python glob and raw string

2014-01-16 Thread Chris Angelico
On Fri, Jan 17, 2014 at 5:14 AM, Neil Cerutti  wrote:
> class Miner:
>     def __init__(self, archive):
>         # setup goes here; prepare to acquire the data
>         self.descend(os.path.join(archive, '*'))
>
>     def descend(self, path):
>         for fname in glob.glob(os.path.join(path, '*')):
>             if os.path.isdir(fname):
>                 self.descend(fname)
>             else:
>                 self.process(fname)
>
>     def process(self, path):
>         # Do what I want done with an actual file path.
>         # This is where I add to the data.
>
> In your case you might not want to process unless the path also
> looks like an xml file.
>
> mine = Miner('myxmldir')
>
> Hmmm... I might be doing too much in __init__. ;)

Hmm, why is it even a class? :) I guess you elided all the stuff that
makes it impractical to just use a non-class function.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python glob and raw string

2014-01-16 Thread Neil Cerutti
On 2014-01-16, Xaxa Urtiz  wrote:
> Hello everybody, i've got a little problem, i've made a script
> which look after some files in some directory, typically my
> folder are organized like this :
>
> [share]
> folder1
> ->20131201
> -->file1.xml
> -->file2.txt
> ->20131202
> -->file9696009.tmp
> -->file421378932.xml
> etc
> so basically in the share i've got some folder
> (=folder1,folder2.) and inside these folder i've got these
> folder whose name is the date (20131201,20131202,20131203
> etc...) and inside them i want to find all the xml files.
> So, what i've done is to iterate over all the folder1/2/3 that
> i want and look, for each one, the xml file with that:
>
> for f in glob.glob(dir +r"\20140115\*.xml"):
> ->yield f
>
> dir is the folder1/2/3 everything is ok but i want to do
> something like that :
>
> for i in range(10,16):
> ->for f in glob.glob(dir +r"\201401{0}\*.xml".format(i)):
> -->yield f
>
> but the glob does not find any file (and of course there is
> some xml and the old way found them...) 
> Any help would be appreciate :) 

I've done this two different ways. The simple way is very similar
to what you are now doing. It sucks because I have to manually
maintain the list of subdirectories to traverse every time I
create a new subdir.

Here's the other way, using glob and isdir from os.path, adapted
from actual production code.

import glob
import os

class Miner:
    def __init__(self, archive):
        # setup goes here; prepare to acquire the data
        self.descend(os.path.join(archive, '*'))

    def descend(self, path):
        for fname in glob.glob(os.path.join(path, '*')):
            if os.path.isdir(fname):
                self.descend(fname)
            else:
                self.process(fname)

    def process(self, path):
        # Do what I want done with an actual file path.
        # This is where I add to the data.
        pass

In your case you might not want to process unless the path also
looks like an xml file.

mine = Miner('myxmldir')

Hmmm... I might be doing too much in __init__. ;)
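
If all that is needed is the list of XML files, a plain generator built on `os.walk` might be enough. This is a sketch rather than a drop-in replacement for the class above; `find_xml` is a name invented here.

```python
import fnmatch
import os

def find_xml(archive):
    """Yield the path of every *.xml file anywhere below archive."""
    for dirpath, dirnames, filenames in os.walk(archive):
        for name in fnmatch.filter(filenames, '*.xml'):
            yield os.path.join(dirpath, name)
```

Because `os.walk` discovers the subdirectories itself, there is no list of date folders to maintain by hand.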

-- 
Neil Cerutti

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python glob and raw string

2014-01-16 Thread Xaxa Urtiz
Le jeudi 16 janvier 2014 17:49:57 UTC+1, Xaxa Urtiz a écrit :
> Hello everybody, i've got a little problem, i've made a script which look 
> after some files in some directory, typically my folder are organized like 
> this :
> 
> 
> 
> [share]
> 
> folder1
> 
> ->20131201
> 
> -->file1.xml
> 
> -->file2.txt
> 
> ->20131202
> 
> -->file9696009.tmp
> 
> -->file421378932.xml
> 
> etc
> 
> so basically in the share i've got some folder (=folder1,folder2.) and 
> inside these folder i've got these folder whose name is the date 
> (20131201,20131202,20131203 etc...) and inside them i want to find all the 
> xml files.
> 
> So, what i've done is to iterate over all the folder1/2/3 that i want and 
> look, for each one, the xml file with that:
> 
> 
> 
> 
> 
> for f in glob.glob(dir +r"\20140115\*.xml"):
> 
> ->yield f
> 
> 
> 
> dir is the folder1/2/3 everything is ok but i want to do something like that :
> 
> 
> 
> 
> 
> for i in range(10,16):
> 
> ->for f in glob.glob(dir +r"\201401{0}\*.xml".format(i)):
> 
> -->yield f
> 
> 
> 
> but the glob does not find any file (and of course there is some xml and 
> the old way found them...) 
> 
> Any help would be appreciate :)

I feel stupid, my mistake, it works :

for i in range(1,16):
->for f in glob.glob(dir +r"\201401{0:02}\*.xml".format(i)):
-->yield f
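
The difference is only in the format spec: `{0}` renders 5 as `5`, while `{0:02}` pads it to two digits, which is what the date-named folders need. A quick check (the `pattern_for_day` helper is hypothetical):

```python
def pattern_for_day(folder, day):
    """Build the glob pattern for one day of January 2014."""
    return folder + r"\201401{0:02}\*.xml".format(day)

print(pattern_for_day("folder1", 5))
print(pattern_for_day("folder1", 15))
```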
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Guessing the encoding from a BOM

2014-01-16 Thread Chris Angelico
On Fri, Jan 17, 2014 at 5:01 AM, Björn Lindqvist  wrote:
> 2014/1/16 Steven D'Aprano :
>> def guess_encoding_from_bom(filename, default):
>>     with open(filename, 'rb') as f:
>>         sig = f.read(4)
>>     if sig.startswith((b'\xFE\xFF', b'\xFF\xFE')):
>>         return 'utf_16'
>>     elif sig.startswith((b'\x00\x00\xFE\xFF', b'\xFF\xFE\x00\x00')):
>>         return 'utf_32'
>>     else:
>>         return default
>
> You might want to add the utf8 bom too: '\xEF\xBB\xBF'.

I'd actually rather not. It would tempt people to pollute UTF-8 files
with a BOM, which is not necessary unless you are MS Notepad.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Guessing the encoding from a BOM

2014-01-16 Thread Björn Lindqvist
2014/1/16 Steven D'Aprano :
> def guess_encoding_from_bom(filename, default):
>     with open(filename, 'rb') as f:
>         sig = f.read(4)
>     if sig.startswith((b'\xFE\xFF', b'\xFF\xFE')):
>         return 'utf_16'
>     elif sig.startswith((b'\x00\x00\xFE\xFF', b'\xFF\xFE\x00\x00')):
>         return 'utf_32'
>     else:
>         return default

You might want to add the utf8 bom too: '\xEF\xBB\xBF'.

> (4) Don't return anything, but raise an exception. (But
> which exception?)

I like this option the most because it is the most "fail fast". If you
return 'undefined' the error might happen hours later or not at all in
some cases.
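
A sketch of that fail-fast variant, with the UTF-8 BOM added. Two assumptions here that are not in the original: the BOM constants are taken from the standard `codecs` module, and `ValueError` is just one possible answer to the "which exception?" question. The UTF-32 check comes first because the UTF-32 little-endian BOM begins with the same two bytes as the UTF-16 little-endian one.

```python
import codecs

def encoding_from_bom(filename):
    """Return a codec name based on the file's BOM, or raise ValueError."""
    with open(filename, 'rb') as f:
        sig = f.read(4)
    # Test the longer UTF-32 signatures before the UTF-16 ones.
    if sig.startswith((codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE)):
        return 'utf_32'
    if sig.startswith(codecs.BOM_UTF8):
        return 'utf_8_sig'
    if sig.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return 'utf_16'
    raise ValueError('no BOM found in %r' % (filename,))
```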


-- 
mvh/best regards Björn Lindqvist
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-16 Thread Robert Kern

On 2014-01-16 04:05, Roy Smith wrote:

Rita  writes:

I know its frowned upon to do work in the __init__() method and only
declarations should be there.



In article ,
  Ben Finney  wrote:


Who says it's frowned on to do work in the initialiser? Where are they
saying it? That seems over-broad, I'd like to read the context of that
advice.


Weird, I was just having this conversation at work earlier this week.

There are some people who advocate that C++ constructors should not do a
lot of work and/or should be incapable of throwing exceptions.  The pros
and cons of that argument are largely C++ specific.  Here's a Stack
Overflow thread which covers most of the usual arguments on both sides:

http://stackoverflow.com/questions/293967/how-much-work-should-be-done-in-a-constructor

But, Python is not C++.  I suspect the people who argue for __init__()
not doing much are extrapolating a C++ pattern to other languages
without fully understanding the reason why.


I'm one of those people who tends to argue this, but my limited experience with 
C++ does not inform my opinion one way or the other.


I prefer to keep my __init__() methods as dumb as possible to retain the 
flexibility to construct my objects in different ways. Sure, it's convenient to, 
say, pass a filename and have the __init__() open() it for me. But then I'm 
stuck with only being able to create this object with a true, named file on 
disk. I can't create it with a StringIO for testing, or by opening a file and 
seeking to a specific spot where the relevant data starts, etc. I can keep the 
flexibility and convenience by keeping __init__() dumb and relegating various 
smarter and more convenient ways to instantiate the object to classmethods.


Which isn't to say that "smart" or "heavy" __init__()s don't have their place 
for some kinds of objects. I just think that dumb __init__()s should be the default.


That said, what the OP asks about, validating data in the __init__() is 
perfectly fine, IMO. My beef isn't so much with the raw *amount* of stuff done 
but how much you can code yourself into a corner by making limiting assumptions. 
So from one of the "do nothing in your __init__()" crowd, I say "well, I didn't 
really mean *nothing*"



That being said, I've been on a tear lately, trying to get our unit test
suite to run faster.  I came across one slow test which had an
interesting twist.  The class being tested had an __init__() method
which read over 900,000 records from a database and took something like
5-10 seconds to run.  Man, talk about heavy-weight constructors :-)


Indeed.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco

--
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-16 Thread Robert Kern

On 2014-01-16 16:18, Roy Smith wrote:

On Thursday, January 16, 2014 10:46:10 AM UTC-5, Robert Kern wrote:


I prefer to keep my __init__() methods as dumb as possible to retain the
flexibility to construct my objects in different ways. Sure, it's convenient to,
say, pass a filename and have the __init__() open() it for me. But then I'm
stuck with only being able to create this object with a true, named file on
disk. I can't create it with a StringIO for testing, or by opening a file and
seeking to a specific spot where the relevant data starts, etc. I can keep the
flexibility and convenience by keeping __init__() dumb and relegating various
smarter and more convenient ways to instantiate the object to classmethods.


There's two distinct things being discussed here.

The idea of passing a file-like object vs. a filename gives you flexibility, 
that's for sure.  But, that's orthogonal to how much work should be done in the 
constructor.  Consider this class:


Where the two get conflated is that both lead to advice that looks the same (or 
at least can get interpreted the same by newbies who are trying to learn and 
don't have the experience to pick out the subtleties): "do nothing in __init__". 
That's why I am trying to clarify where this advice might be coming from and why 
at least one version of it may be valid.



class DataSlurper:
    def __init__(self):
        self.slurpee = None

    def attach_slurpee(self, slurpee):
        self.slurpee = slurpee

    def slurp(self):
        for line in self.slurpee:
            # whatever

This exhibits the nice behavior you describe; you can pass it any iterable, not just a 
file, so you have a lot more flexibility.  But, it's also exhibiting what many people 
call the "two-phase constructor" anti-pattern.  When you construct an instance 
of this class, it's not usable until you call attach_slurpee(), so why not just do that 
in the constructor?


That's where my recommendation of classmethods come in. The result of __init__() 
should always be usable. It's just that its arguments may not be as convenient 
as you like because you pass in objects that are closer to the internal 
representation than you normally want to deal with (e.g. file objects instead of 
filenames). You make additional constructors (initializers, whatever) as 
classmethods to restore convenience.



import urllib  # Python 2; on Python 3 the call is urllib.request.urlopen

class DataSlurper:
    def __init__(self, slurpee):
        self.slurpee = slurpee

    @classmethod
    def fromfile(cls, filename):
        slurpee = open(filename)
        return cls(slurpee)

    @classmethod
    def fromurl(cls, url):
        slurpee = urllib.urlopen(url)
        return cls(slurpee)
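
One payoff of this layout is in testing: because `__init__` takes any iterable of lines, a test can hand it an in-memory `io.StringIO` instead of a real file. A self-contained sketch, where the `slurp` body (collecting stripped lines) is invented for the example:

```python
import io

class DataSlurper:
    def __init__(self, slurpee):
        # slurpee is any iterable of lines: a file, a StringIO, a list...
        self.slurpee = slurpee

    @classmethod
    def fromfile(cls, filename):
        # Convenience constructor for the common "I have a path" case.
        return cls(open(filename))

    def slurp(self):
        return [line.strip() for line in self.slurpee]

fake_file = io.StringIO("alpha\nbeta\n")
print(DataSlurper(fake_file).slurp())
```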

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco

--
https://mail.python.org/mailman/listinfo/python-list


Re: Building and accessing an array of dictionaries

2014-01-16 Thread Mark Lawrence

On 16/01/2014 09:48, Chris Angelico wrote:

On Thu, Jan 16, 2014 at 8:41 PM, Sam  wrote:

I would like to build an array of dictionaries. Most of the dictionary example 
on the net are for single dictionary.

dict = {'a':'a','b':'b','c':'c'}
dict2 = {'a':'a','b':'b','c':'c'}
dict3 = {'a':'a','b':'b','c':'c'}

arr = (dict,dict2,dict3)

What is the syntax to access the value of dict3->'a'?


Technically, that's a tuple of dictionaries


For the benefit of lurkers, newbies or whatever it's the commas that 
make the tuple, not the brackets.
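
A short illustration of both points, using names other than `dict` so the built-in is not shadowed (the values are made up):

```python
d1 = {'a': 'a1', 'b': 'b1'}
d2 = {'a': 'a2', 'b': 'b2'}
d3 = {'a': 'a3', 'b': 'b3'}

arr = d1, d2, d3          # no brackets needed: the commas build the tuple
print(type(arr).__name__)
print(arr[2]['a'])        # third dictionary, key 'a'

not_a_tuple = (d1)        # brackets alone just group an expression
one_tuple = d1,           # a trailing comma makes a one-element tuple
```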


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Python glob and raw string

2014-01-16 Thread Xaxa Urtiz
Hello everybody, i've got a little problem, i've made a script which look after 
some files in some directory, typically my folder are organized like this :

[share]
folder1
->20131201
-->file1.xml
-->file2.txt
->20131202
-->file9696009.tmp
-->file421378932.xml
etc
so basically in the share i've got some folder (=folder1,folder2.) and 
inside these folder i've got these folder whose name is the date 
(20131201,20131202,20131203 etc...) and inside them i want to find all the xml 
files.
So, what i've done is to iterate over all the folder1/2/3 that i want and look, 
for each one, the xml file with that:


for f in glob.glob(dir +r"\20140115\*.xml"):
->yield f

dir is the folder1/2/3 everything is ok but i want to do something like that :


for i in range(10,16):
->for f in glob.glob(dir +r"\201401{0}\*.xml".format(i)):
-->yield f

but the glob does not find any file (and of course there is some xml and 
the old way found them...) 
Any help would be appreciate :) 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-16 Thread Skip Montanaro
I suspect when best to validate inputs depends on when they
come in, and what the cost is of having objects with invalid
state. If the input is something that is passed along when
the object is instantiated, you kind of have to validate in
__init__ or __new__, right?

Let's create a stupid example:

class Point(object):
    def __init__(self, coordinates):
        self.x, self.y, self.z = coordinates

That's kind of self-validating. If you pass something that
doesn't quack like a three-element sequence, the program
will crash. OTOH, Point("abc") will appear to work until you
expect x, y and z to act like numbers. If you can't tolerate
that, then a __new__ method allows you to validate arguments
before creating a new Point instance. You might also allow
one- or two-element tuples:

    def __new__(cls, coordinates):
        ... convert to tuple, then ...
        ... verify that all elements are numbers, then ...

        if len(coordinates) > 3:
            raise ValueError("Expect 1-, 2-, or 3-element tuple")
        if len(coordinates) < 2:
            coordinates += (0.0,)
        if len(coordinates) < 3:
            coordinates += (0.0,)
        return cls(coordinates)

Validating in __new__ will allow you to catch problems
sooner, and give you the option of returning some sort of
sentinel instead of just raising an exception, though that
is probably not generally good practice. This will catch
Point("abc") quickly, with a stack trace pointing to the
offending code.
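
A runnable sketch of the same checks, placed in `__init__` instead (an assumption of this sketch, since calling `cls(coordinates)` from inside `__new__` would invoke `__new__` again). The `numbers.Real` test is one possible way to "verify that all elements are numbers".

```python
import numbers

class Point(object):
    def __init__(self, coordinates):
        coordinates = tuple(coordinates)
        if not 1 <= len(coordinates) <= 3:
            raise ValueError("Expect 1-, 2-, or 3-element tuple")
        if not all(isinstance(c, numbers.Real) for c in coordinates):
            raise TypeError("coordinates must be numbers")
        # Pad missing trailing coordinates with zeros.
        coordinates += (0.0,) * (3 - len(coordinates))
        self.x, self.y, self.z = coordinates
```

With this version `Point("abc")` fails at construction time with a `TypeError` rather than later, when the coordinates are first used as numbers.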

Of course, you might need to validate any other inputs which
appear after instantiation:

    def move(self, xdelta=0.0, ydelta=0.0, zdelta=0.0):
        self.x += xdelta
        self.y += ydelta
        self.z += zdelta

Whether you need more feedback than an exception might give
you here is debatable, but there are likely plenty of
situations where you need to explicitly validate user input
before using it (think of accepting string data from the net
which you plan to feed to your SQL database...). Those sorts
of validation steps are beyond the scope of this thread, and
probably much better handled by platforms further up the
software stack (like Django or SQLAlchemy).

Skip
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-16 Thread Roy Smith
On Thursday, January 16, 2014 10:46:10 AM UTC-5, Robert Kern wrote:

> I prefer to keep my __init__() methods as dumb as possible to retain the 
> flexibility to construct my objects in different ways. Sure, it's convenient 
> to, 
> say, pass a filename and have the __init__() open() it for me. But then I'm 
> stuck with only being able to create this object with a true, named file on 
> disk. I can't create it with a StringIO for testing, or by opening a file and 
> seeking to a specific spot where the relevant data starts, etc. I can keep 
> the 
> flexibility and convenience by keeping __init__() dumb and relegating various 
> smarter and more convenient ways to instantiate the object to classmethods.

There's two distinct things being discussed here.

The idea of passing a file-like object vs. a filename gives you flexibility, 
that's for sure.  But, that's orthogonal to how much work should be done in the 
constructor.  Consider this class:

class DataSlurper:
    def __init__(self):
        self.slurpee = None

    def attach_slurpee(self, slurpee):
        self.slurpee = slurpee

    def slurp(self):
        for line in self.slurpee:
            # whatever
            pass

This exhibits the nice behavior you describe; you can pass it any iterable, not 
just a file, so you have a lot more flexibility.  But, it's also exhibiting 
what many people call the "two-phase constructor" anti-pattern.  When you 
construct an instance of this class, it's not usable until you call 
attach_slurpee(), so why not just do that in the constructor?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Is it possible to get string from function?

2014-01-16 Thread Roy Smith
On Thursday, January 16, 2014 5:59:42 AM UTC-5, Albert-Jan Roskam wrote:

> what would be wrong with the following approach:
> 
> import unittest
> class Test(unittest.TestCase):
> 
>     receipts = {}
>
>     def unique_value(self, k, v):
>         assert Test.receipts.get(k) is None, "Duplicate: %s" % v
>         Test.receipts[k] = v
> 
>     def test_a(self):
>         self.unique_value("large_value", "foo")
> 
>     def test_b(self):
>         self.unique_value("large_value", "bar")  # oh no, a duplicate! 
> 
>     def test_c(self):
>         self.unique_value("another_large_value", "blah")

Although I didn't state it in my original post, we run these tests under nose 
in multi-process mode.  Each process would have its own copy of the receipts 
dictionary.

Yes, I know I can force all the tests in a class to be in the same process, but 
these are some of the slower tests in our suite, so we *want* them to run in 
parallel.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-16 Thread Tim Chase
On 2014-01-16 14:07, Steven D'Aprano wrote:
> The unicode type in Python 2.x is less-good because:
> 
> - it is missing some functionality, e.g. casefold;

Just for the record, str.casefold() wasn't added until 3.3, so
earlier 3.x versions (such as the 3.2.3 that is the default python3
on Debian Stable) don't have it either.
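
On a Python that has it (3.3 or later), the difference from `lower()` shows up with exactly the word in the subject line:

```python
word = 'Straße'

print(word.lower())      # ß is already lowercase, so it is kept
print(word.casefold())   # casefolding expands ß to ss

# Caseless comparison of the two spellings:
print(word.casefold() == 'STRASSE'.casefold())
```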

-tkc



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Is it possible to get string from function?

2014-01-16 Thread Peter Otten
Albert-Jan Roskam wrote:

> On Thu, 1/16/14, Peter Otten <__pete...@web.de> wrote:

>>  class Foo(unittest.TestCase):
>>      @unique_receipt("foo")
>>      def test_t1(self, RECEIPT):
>>          pass

>  Very cool approach. Question, though: what would be wrong
>  with the following approach:
> 
> 
> import unittest
> 
> class Test(unittest.TestCase):
> 
>     receipts = {}
> 
>     def unique_value(self, k, v):
>         assert Test.receipts.get(k) is None, "Duplicate: %s" % v
>         Test.receipts[k] = v
> 
>     def test_a(self):
>         self.unique_value("large_value", "foo")

Nothing.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode strings as arguments to exceptions

2014-01-16 Thread Roy Smith
In article <52d7e9a0$0$2$c3e8da3$54964...@news.astraweb.com>,
 Steven D'Aprano  wrote:

> On Thu, 16 Jan 2014 13:34:08 +0100, Ernest Adrogué wrote:
> 
> > Hi,
> > 
> > There seems to be some inconsistency in the way exceptions handle
> > Unicode strings.
> 
> Yes. I believe the problem lies in the __str__ method. For example, 
> KeyError manages to handle Unicode, although in an ugly way:
> 
> py> str(KeyError(u'ä'))
> "u'\\xe4'"
> 
> Hence:
> 
> py> raise KeyError(u'ä')
> Traceback (most recent call last):
>   File "", line 1, in 
> KeyError: u'\xe4'
> 
> 
> While ValueError assumes ASCII and fails:
> 
> py> str(ValueError(u'ä'))
> Traceback (most recent call last):
>   File "", line 1, in 
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in 
> position 0: ordinal not in range(128)
> 
> When displaying the traceback, the error is suppressed, hence:
> 
> py> raise ValueError(u'ä')
> Traceback (most recent call last):
>   File "", line 1, in 
> ValueError
> 
> 
> I believe this might be accepted as a bug report on ValueError.

If you try to construct an instance of ValueError with an argument it 
can't handle, the obvious thing for it to do is raise ValueError :-)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Is it possible to get string from function?

2014-01-16 Thread Roy Smith
In article <52d7874d$0$6599$c3e8da3$54964...@news.astraweb.com>,
 Steven D'Aprano  wrote:

> Is the mapping of receipt string to test fixed? That is, is it important 
> that test_t1 *always* runs with "some string", test_t2 "some other 
> string", and so forth?

Yes.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode strings as arguments to exceptions

2014-01-16 Thread Steven D'Aprano
On Thu, 16 Jan 2014 13:34:08 +0100, Ernest Adrogué wrote:

> Hi,
> 
> There seems to be some inconsistency in the way exceptions handle
> Unicode strings.

Yes. I believe the problem lies in the __str__ method. For example, 
KeyError manages to handle Unicode, although in an ugly way:

py> str(KeyError(u'ä'))
"u'\\xe4'"

Hence:

py> raise KeyError(u'ä')
Traceback (most recent call last):
  File "", line 1, in 
KeyError: u'\xe4'


While ValueError assumes ASCII and fails:

py> str(ValueError(u'ä'))
Traceback (most recent call last):
  File "", line 1, in 
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in 
position 0: ordinal not in range(128)

When displaying the traceback, the error is suppressed, hence:

py> raise ValueError(u'ä')
Traceback (most recent call last):
  File "", line 1, in 
ValueError


I believe this might be accepted as a bug report on ValueError.
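
For comparison, a sketch of the same calls on Python 3, where `str` is Unicode throughout and the inconsistency does not arise (this is about Python 3 only, not the 2.x behaviour shown above):

```python
# Python 3: exception arguments are already Unicode strings.
print(str(ValueError('ä')))   # the message comes back unchanged
print(str(KeyError('ä')))     # KeyError still shows the repr of its key

try:
    raise ValueError('ä')
except ValueError as exc:
    message = str(exc)
```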


-- 
Steven
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-16 Thread Steven D'Aprano
On Thu, 16 Jan 2014 10:51:42 +, Robin Becker wrote:

> On 16/01/2014 00:32, Steven D'Aprano wrote:
>>> >Or are you saying that www.unicode.org is wrong about the definitions
>>> >of Unicode terms?
>> No, I think he is saying that he doesn't know Unicode anywhere near as
>> well as he thinks he does. The question is, will he cherish his
>> ignorance, or learn from this thread?
> 
> I assure you that I fully understand my ignorance of unicode.

Robin, while I'm very happy to see that you have a good grasp of what you 
don't know, I'm afraid that you're misrepresenting me. You deleted the 
part of my post that made it clear that I was referring to our resident 
Unicode crank, JMF.


> Until
> recently I didn't even know that the unicode in python 2.x is considered
> broken and that str in python 3.x is considered 'better'.

No need for scare quotes.

The unicode type in Python 2.x is less-good because:

- it is not the default string type (you have to prefix the string 
  with a u to get Unicode);

- it is missing some functionality, e.g. casefold;

- there are two distinct implementations, narrow builds and wide builds;

- wide builds take up to four times more memory per string than needed;

- narrow builds take up to two times more memory per string than needed;

- worse, narrow builds have very naive (possibly even "broken") 
  handling of code points outside the Basic Multilingual Plane, such as 
  those in the Supplementary Multilingual Plane.

The unicode string type in Python 3 is better because:

- it is the default string type;

- it includes more functionality;

- starting in Python 3.3, it gets rid of the distinction between 
  narrow and wide builds;

- which reduces the memory overhead of strings by up to a factor 
  of four in many cases;

- and fixes the issue of SMP code points.
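
A few of these points can be checked directly. This is only a sketch, 
assuming CPython 3.3 or later; the variable names are made up for 
illustration and the exact sizes reported by sys.getsizeof vary between 
versions and platforms:

```python
import sys

street = u"Stra\xdfe"            # the 'Strasse' of the subject line, with sharp s
folded = street.casefold()       # str.casefold() exists only in Python 3.3+

astral = u"\U0001F40D"           # one code point outside the BMP
astral_length = len(astral)      # 1 here; a narrow 2.x build would report 2

# Since 3.3, CPython stores each string with 1, 2 or 4 bytes per code
# point, depending on the widest code point the string contains.
ascii_size = sys.getsizeof(u"a" * 100)
astral_size = sys.getsizeof(astral * 100)
print(folded, astral_length, ascii_size, astral_size)
```

The ASCII-only string should come out markedly smaller than the astral 
one, which is the memory saving referred to above.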


> I can say that having made a lot of reportlab work in both 2.7 & 3.3 I
> don't understand why the latter seems slower especially since we try to
> convert early to unicode/str as a desirable internal form. 

*shrug*

Who knows? Is it slower or does it only *seem* slower? Is the performance 
regression platform specific? Have you traded correctness for speed, that 
is, does 2.7 version break when given astral characters on a narrow build?

Earlier in January, you commented in another thread that 

"I'm not sure if we have any non-bmp characters in the tests."

If you don't, you should have some.

There's all sorts of reasons why your code might be slower under 3.3, 
including the possibility of a non-trivial performance regression. If you 
can demonstrate a test case with a significant slowdown for real-world 
code, I'm sure that a bug report will be treated seriously.


> Probably I
> have some horrible error going on(eg one of the C extensions is working
> in 2.7 and not in 3.3).

Well that might explain a slowdown.

But really, one should expect that moving from single byte strings to up 
to four-byte strings will have *some* cost. It's exchanging functionality 
for time. The same thing happened years ago, people used to be extremely 
opposed to using floating point doubles instead of singles because of 
performance. And, I suppose it is true that back when 64K was considered 
a lot of memory, using eight whole bytes per floating point number (let 
alone ten like the IEEE Extended format) might have seemed the height of 
extravagance. But today we use doubles by default, and if singles would 
be a tiny bit faster, who wants to go back to the bad old days of single 
precision?

I believe the same applies to Unicode versus single-byte strings.



-- 
Steven
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-16 Thread Robin Becker

On 16/01/2014 12:06, Frank Millman wrote:
..
>> I assure you that I fully understand my ignorance of unicode. Until
>> recently I didn't even know that the unicode in python 2.x is considered
>> broken and that str in python 3.x is considered 'better'.
>
> Hi Robin
>
> I am pretty sure that Steven was referring to the original post from
> jmfauth, not to anything that you wrote.

unfortunately my ignorance remains even in the absence of criticism

> May I say that I am delighted that you are putting in the effort to port
> ReportLab to python3, and I trust that you will get plenty of support from
> the gurus here in achieving this.

I have had a lot of support from the gurus thanks to all of them :)
--
Robin Becker

--
https://mail.python.org/mailman/listinfo/python-list


Re: Python 3.x adoption

2014-01-16 Thread Piet van Oostrum
Travis Griggs  writes:

> Personally, I wish they’d start python4, sure would take the heat out of
> the 3 vs 2 debates. And maybe there’d be a program called twentyfour as
> a result.

twelve would be sufficient, I would think.
-- 
Piet van Oostrum 
WWW: http://pietvanoostrum.com/
PGP key: [8DAE142BE17999C4]
-- 
https://mail.python.org/mailman/listinfo/python-list


Unicode strings as arguments to exceptions

2014-01-16 Thread Ernest Adrogué
Hi,

There seems to be some inconsistency in the way exceptions handle Unicode
strings.  For instance, KeyError seems to not have a problem with them

>>> raise KeyError('a')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'a'
>>> raise KeyError(u'ä')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: u'\xe4'

On the other hand ValueError doesn't print anything.

>>> raise ValueError('a')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: a
>>> raise ValueError(u'ä')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError

I'm using Python 2.7.6 on a Unix machine.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-16 Thread Frank Millman

"Robin Becker"  wrote in message 
news:52d7b9be.9020...@chamonix.reportlab.co.uk...
> On 16/01/2014 00:32, Steven D'Aprano wrote:
>>> >Or are you saying that www.unicode.org is wrong about the definitions
>>> >of Unicode terms?
>> No, I think he is saying that he doesn't know Unicode anywhere near as
>> well as he thinks he does. The question is, will he cherish his
>> ignorance, or learn from this thread?
>
> I assure you that I fully understand my ignorance of unicode. Until 
> recently I didn't even know that the unicode in python 2.x is considered 
> broken and that str in python 3.x is considered 'better'.
>

Hi Robin

I am pretty sure that Steven was referring to the original post from 
jmfauth, not to anything that you wrote.

May I say that I am delighted that you are putting in the effort to port 
ReportLab to python3, and I trust that you will get plenty of support from 
the gurus here in achieving this.

Frank Millman



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-16 Thread Rita
Thanks everyone for the replies.




On Thu, Jan 16, 2014 at 1:36 AM, Cameron Simpson  wrote:

> On 16Jan2014 15:53, Ben Finney  wrote:
> > Roy Smith  writes:
> > >  Ben Finney  wrote:
> > > > Who says it's frowned on to do work in the initialiser? Where are they
> > > > saying it? That seems over-broad, I'd like to read the context of that
> > > > advice.
> > >
> > > There are some people who advocate that C++ constructors should not do
> > > a lot of work and/or should be incapable of throwing exceptions. The
> > > pros and cons of that argument are largely C++ specific. […]
> >
> > Even simpler: They are mistaken in what the constructor is named, in
> > Python.
> > Python classes have the constructor, ‘__new__’. I would agree with
> > advice not to do anything but allocate the resources for a new instance
> > in the constructor. [...]
> >
> > Python instances have an initialiser, ‘__init__’. That function is for
> > setting up the specific instance for later use. This is commonly
> > over-ridden and many classes define a custom initialiser, which normally
> > does some amount of work.
> >
> > I don't think ‘__init__’ is subject to the conventions of a constructor,
> > because *‘__init__’ is not a constructor*.
>
> 99% of the time this distinction is moot. When I call ClassName(blah,...),
> both the constructor and initialiser are called.
>
> Informally, there's a rule of thumb that making an object (allocate,
> construct and initialise) shouldn't be needlessly expensive. Beyond
> that, what happens depends on the use patterns.
>
> This rule of thumb will be what Rita's encountered, perhaps stated
> without any qualification regarding what's appropriate.
>
> Cheers,
> --
> Cameron Simpson 
>
> The problem with keeping an open mind is that my ideas all tend to fall
> out...
> - Bill Garrett 
> --
> https://mail.python.org/mailman/listinfo/python-list
>
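
The constructor/initialiser split discussed in the quoted messages can be 
seen in a short sketch. The class name Account and its events list are 
invented for illustration; the point is only that ClassName(...) calls 
__new__ first and __init__ second, and that validating arguments in 
__init__ is ordinary Python:

```python
class Account(object):
    events = []  # class-level log, shared by all instances

    def __new__(cls, *args, **kwargs):
        # The constructor: only allocates the new instance.
        cls.events.append("__new__")
        return super(Account, cls).__new__(cls)

    def __init__(self, balance):
        # The initialiser: validates its arguments and sets up state.
        self.events.append("__init__")
        if balance < 0:
            raise ValueError("balance must not be negative")
        self.balance = balance


account = Account(10)
try:
    Account(-1)
except ValueError as exc:
    rejected = str(exc)
```

Raising from __init__ simply means the caller never receives the 
half-built object, which is usually what one wants from validation.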



-- 
--- Get your facts first, then you can distort them as you please.--
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python program distribution - a source of constant friction

2014-01-16 Thread Nicholas Cole
On Tue, Jan 7, 2014 at 12:09 AM, Nicholas Cole  wrote:

[SNIP]

> Even so, things like that are harder to create than they
> could be, or less prominently documented than one might have expected.
>
> Case in point: I have an application a friend/colleague of mine would like
> to look at.  I've no idea if he is running Debian or Redhat or FreeBSD or a
> Mac.  Assuming I've not used any C extensions, it is *possible* to create
> something that will run on all of the above without any fuss at his end.  It
> just isn't nearly as easy as it could be, which must be a shame.
>
> Nicholas.

In a spirit of trying to not only highlight problems, but start to solve them:

https://pypi.python.org/pypi/ncdistribute/

Feedback is very welcome.  Version 1 is a naive approach - it doesn't
filter the included files at all, and will include all detected
dependencies that are not part of the standard library.

Best wishes,

Nicholas
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Is it possible to get string from function?

2014-01-16 Thread Albert-Jan Roskam



On Thu, 1/16/14, Peter Otten <__pete...@web.de> wrote:

 Subject: Re: Is it possible to get string from function?
 To: python-list@python.org
 Date: Thursday, January 16, 2014, 9:52 AM
 
 Roy Smith wrote:
 
 > I realize the subject line is kind of meaningless, so let me explain :-)
 > 
 > I've got some unit tests that look like:
 > 
 > class Foo(TestCase):
 >   def test_t1(self):
 >     RECEIPT = "some string"
 > 
 >   def test_t2(self):
 >     RECEIPT = "some other string"
 > 
 >   def test_t3(self):
 >     RECEIPT = "yet a third string"
 > 
 > and so on.  It's important that the strings be mutually unique.  In the
 > example above, it's trivial to look at them and observe that they're all
 > different, but in real life, the strings are about 2500 characters long,
 > hex-encoded.  It even turns out that a couple of the strings are
 > identical in the first 1000 or so characters, so it's not trivial to do
 > by visual inspection.
 > 
 > So, I figured I would write a meta-test, which used introspection to
 > find all the methods in the class, extract the strings from them (they
 > are all assigned to a variable named RECEIPT), and check to make sure
 > they're all different.
 > 
 > Is it possible to do that?  It is straight-forward using the inspect
 > module to discover the methods, but I don't see any way to find what
 > strings are assigned to a variable with a given name.  Of course, that
 > assignment doesn't even happen until the function is executed, so
 > perhaps what I want just isn't possible?
 > 
 > It turns out, I solved the problem with more mundane tools:
 > 
 > grep 'RECEIPT = ' test.py | sort | uniq -c
 > 
 > and I could have also solved the problem by putting all the strings in a
 > dict and having the functions pull them out of there.  But, I'm still
 > interested in exploring if there is any way to do this with
 > introspection, as an academic exercise.
 
 Instead of using introspection you could make it explicit with a decorator:
 
 $ cat unique_receipt.py 
 import functools
 import sys
 import unittest
 
 _receipts = {}
 def unique_receipt(receipt):
     def deco(f):
         if receipt in _receipts:
             raise ValueError(
                 "Duplicate receipt {!r} in \n    {} and \n    {}".format(
                     receipt, _receipts[receipt], f))
         _receipts[receipt] = f
         @functools.wraps(f)
         def g(self):
             return f(self, receipt)
         return g
     return deco
 
 class Foo(unittest.TestCase):
     @unique_receipt("foo")
     def test_t1(self, RECEIPT):
         pass
 
     @unique_receipt("bar")
     def test_t2(self, RECEIPT):
         pass
 
     @unique_receipt("foo")
     def test_t3(self, RECEIPT):
         pass
 
 if __name__ == "__main__":
     unittest.main()
 $ python unique_receipt.py 
 Traceback (most recent call last):
   File "unique_receipt.py", line 19, in <module>
     class Foo(unittest.TestCase):
   File "unique_receipt.py", line 28, in Foo
     @unique_receipt("foo")
   File "unique_receipt.py", line 11, in deco
     receipt, _receipts[receipt], f))
 ValueError: Duplicate receipt 'foo' in 
      and
 
     
 

Very cool approach. Question, though: what would be wrong with
the following approach:


import unittest

class Test(unittest.TestCase):

    receipts = {}

    def unique_value(self, k, v):
        assert Test.receipts.get(k) is None, "Duplicate: %s" % v
        Test.receipts[k] = v

    def test_a(self):
        self.unique_value("large_value", "foo")

    def test_b(self):
        self.unique_value("large_value", "bar")  # oh no, a duplicate!

    def test_c(self):
        self.unique_value("another_large_value", "blah")

unittest.main()



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-16 Thread Chris Angelico
On Thu, Jan 16, 2014 at 9:51 PM, Robin Becker  wrote:
> On 16/01/2014 00:32, Steven D'Aprano wrote:
>>>
>>> >Or are you saying that www.unicode.org is wrong about the definitions of
>>> >Unicode terms?
>>
>> No, I think he is saying that he doesn't know Unicode anywhere near as
>> well as he thinks he does. The question is, will he cherish his
>> ignorance, or learn from this thread?
>
>
> I assure you that I fully understand my ignorance of unicode. Until recently
> I didn't even know that the unicode in python 2.x is considered broken and
> that str in python 3.x is considered 'better'.

Your wisdom, if I may paraphrase Master Foo, is that you know you are a fool.

http://catb.org/esr/writings/unix-koans/zealot.html

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-16 Thread Robin Becker

On 16/01/2014 00:32, Steven D'Aprano wrote:

>> Or are you saying that www.unicode.org is wrong about the definitions of
>> Unicode terms?
>
> No, I think he is saying that he doesn't know Unicode anywhere near as
> well as he thinks he does. The question is, will he cherish his
> ignorance, or learn from this thread?


I assure you that I fully understand my ignorance of unicode. Until recently I 
didn't even know that the unicode in python 2.x is considered broken and that 
str in python 3.x is considered 'better'.


I can say that having made a lot of reportlab work in both 2.7 & 3.3 I don't 
understand why the latter seems slower especially since we try to convert early 
to unicode/str as a desirable internal form. Probably I have some horrible error 
going on(eg one of the C extensions is working in 2.7 and not in 3.3).

-stupidly yrs-
Robin Becker

--
https://mail.python.org/mailman/listinfo/python-list


Re: Building and accessing an array of dictionaries

2014-01-16 Thread Jean-Michel Pichavant
- Original Message -
> I would like to build an array of dictionaries. Most of the
> dictionary example on the net are for single dictionary.
> 
> dict = {'a':'a','b':'b','c':'c'}
> dict2 = {'a':'a','b':'b','c':'c'}
> dict3 = {'a':'a','b':'b','c':'c'}
> 
> arr = (dict,dict2,dict3)
> 
> What is the syntax to access the value of dict3->'a'?
> 
> Thank you.
> 
> --
> https://mail.python.org/mailman/listinfo/python-list
> 

Hi,

arr = (dict,dict2,dict3) 
builds a tuple.

If you want to build an (ordered) list, which is the closest built-in type 
to an array (Python has no built-in array type), you may write

myList = [dict, dict2, dict3]

you can access 'a' by writing 

myList[2]['a']
Additionally:
myList[0] -> 1st element
myList[-1] -> last element
myList[3:] -> list of elements of myList from the 4th element to the last


Accessing a list element or a dictionary value is done through the same 
operator []. That can be confusing at the very beginning, but you'll get used to it 
eventually.
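
A short sketch of the indexing described above, with invented values so 
that each lookup is easy to tell apart (the names first, second, third 
and my_list are only for illustration):

```python
first = {'a': 'a1', 'b': 'b1'}
second = {'a': 'a2', 'b': 'b2'}
third = {'a': 'a3', 'b': 'b3'}

my_list = [first, second, third]

value = my_list[2]['a']      # list index first, then dictionary key
last = my_list[-1]           # negative indices count from the end
tail = my_list[1:]           # a slice is a new list: [second, third]

my_list.append({'a': 'a4'})  # a list can grow; a tuple cannot
print(value, last is third, len(tail), len(my_list))
```

The first [] picks an element by position and the second picks a value by 
key, even though both use the same bracket syntax.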

JM


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building and accessing an array of dictionaries

2014-01-16 Thread Jussi Piitulainen
Sam writes:

> I would like to build an array of dictionaries. Most of the
> dictionary example on the net are for single dictionary.
> 
> dict = {'a':'a','b':'b','c':'c'}
> dict2 = {'a':'a','b':'b','c':'c'}
> dict3 = {'a':'a','b':'b','c':'c'}
> 
> arr = (dict,dict2,dict3)
> 
> What is the syntax to access the value of dict3->'a'?

This isn't a special case.

arr[2] to get the dictionary
arr[2]['a'] to get the value in the dictionary

'a' in arr[2] to find if there is such a key

arr[2].get('a') to get the value or None if the key isn't there
arr[2].get('a', 'd') to get a value even if the key isn't there

help(dict.get)

for key in arr[2]:
   # to iterate over the keys

The exact same mechanisms are used no matter where you get the
dictionary from.
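
Put together as something runnable, the lookups above might look like 
this. It is only a sketch: the dictionary contents are made up so that 
one of them lacks the key 'a':

```python
arr = ({'a': 'x'}, {'b': 'y'}, {'a': 'z', 'c': 'w'})

has_a = 'a' in arr[1]            # membership test on the keys
missing = arr[1].get('a')        # None, because the key is absent
fallback = arr[1].get('a', 'd')  # the supplied default instead of None

keys = sorted(arr[2])            # iterating a dict yields its keys
values_of_a = [d['a'] for d in arr if 'a' in d]
print(has_a, missing, fallback, keys, values_of_a)
```

Using get() or an 'in' test avoids the KeyError that arr[1]['a'] would 
raise here.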
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building and accessing an array of dictionaries

2014-01-16 Thread Chris Angelico
On Thu, Jan 16, 2014 at 8:41 PM, Sam  wrote:
> I would like to build an array of dictionaries. Most of the dictionary 
> example on the net are for single dictionary.
>
> dict = {'a':'a','b':'b','c':'c'}
> dict2 = {'a':'a','b':'b','c':'c'}
> dict3 = {'a':'a','b':'b','c':'c'}
>
> arr = (dict,dict2,dict3)
>
> What is the syntax to access the value of dict3->'a'?

Technically, that's a tuple of dictionaries, and you may want to use a
list instead:

lst = [dict, dict2, dict3]

Like any other list or tuple, you can reference them by their indices:

lst[2] is dict3

lst[2]['a'] is dict3['a']

Hope that helps!

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Building and accessing an array of dictionaries

2014-01-16 Thread Sam
I would like to build an array of dictionaries. Most of the dictionary example 
on the net are for single dictionary.

dict = {'a':'a','b':'b','c':'c'}
dict2 = {'a':'a','b':'b','c':'c'}
dict3 = {'a':'a','b':'b','c':'c'}

arr = (dict,dict2,dict3)

What is the syntax to access the value of dict3->'a'?

Thank you.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Is it possible to get string from function?

2014-01-16 Thread Peter Otten
Roy Smith wrote:

> I realize the subject line is kind of meaningless, so let me explain :-)
> 
> I've got some unit tests that look like:
> 
> class Foo(TestCase):
>   def test_t1(self):
>     RECEIPT = "some string"
> 
>   def test_t2(self):
>     RECEIPT = "some other string"
> 
>   def test_t3(self):
>     RECEIPT = "yet a third string"
> 
> and so on.  It's important that the strings be mutually unique.  In the
> example above, it's trivial to look at them and observe that they're all
> different, but in real life, the strings are about 2500 characters long,
> hex-encoded.  It even turns out that a couple of the strings are
> identical in the first 1000 or so characters, so it's not trivial to do
> by visual inspection.
> 
> So, I figured I would write a meta-test, which used introspection to
> find all the methods in the class, extract the strings from them (they
> are all assigned to a variable named RECEIPT), and check to make sure
> they're all different.
> 
> Is it possible to do that?  It is straight-forward using the inspect
> module to discover the methods, but I don't see any way to find what
> strings are assigned to a variable with a given name.  Of course, that
> assignment doesn't even happen until the function is executed, so
> perhaps what I want just isn't possible?
> 
> It turns out, I solved the problem with more mundane tools:
> 
> grep 'RECEIPT = ' test.py | sort | uniq -c
> 
> and I could have also solved the problem by putting all the strings in a
> dict and having the functions pull them out of there.  But, I'm still
> interested in exploring if there is any way to do this with
> introspection, as an academic exercise.

Instead of using introspection you could make it explicit with a decorator:

$ cat unique_receipt.py 
import functools
import sys
import unittest

_receipts = {}
def unique_receipt(receipt):
    def deco(f):
        if receipt in _receipts:
            raise ValueError(
                "Duplicate receipt {!r} in \n{} and \n{}".format(
                    receipt, _receipts[receipt], f))
        _receipts[receipt] = f
        @functools.wraps(f)
        def g(self):
            return f(self, receipt)
        return g
    return deco

class Foo(unittest.TestCase):
    @unique_receipt("foo")
    def test_t1(self, RECEIPT):
        pass

    @unique_receipt("bar")
    def test_t2(self, RECEIPT):
        pass

    @unique_receipt("foo")
    def test_t3(self, RECEIPT):
        pass

if __name__ == "__main__":
    unittest.main()
$ python unique_receipt.py 
Traceback (most recent call last):
  File "unique_receipt.py", line 19, in <module>
    class Foo(unittest.TestCase):
  File "unique_receipt.py", line 28, in Foo
    @unique_receipt("foo")
  File "unique_receipt.py", line 11, in deco
    receipt, _receipts[receipt], f))
ValueError: Duplicate receipt 'foo' in 
 and 
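
As for the academic exercise: one way to get at the strings without 
running the tests might be to parse the source with the ast module rather 
than inspect the function objects. The following is only a sketch under 
that assumption; find_receipts and the SOURCE sample are invented for 
illustration, and it only sees plain string literals assigned directly to 
RECEIPT:

```python
import ast
import collections

SOURCE = '''
class Foo(object):
    def test_t1(self):
        RECEIPT = "some string"

    def test_t2(self):
        RECEIPT = "some other string"

    def test_t3(self):
        RECEIPT = "some string"
'''


def find_receipts(source, name="RECEIPT"):
    """Map each string literal assigned to `name` to the functions using it."""
    found = collections.defaultdict(list)
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            if not isinstance(node, ast.Assign):
                continue
            if not any(isinstance(target, ast.Name) and target.id == name
                       for target in node.targets):
                continue
            try:
                value = ast.literal_eval(node.value)
            except (ValueError, TypeError):
                continue  # right-hand side is not a plain literal
            if isinstance(value, str):
                found[value].append(func.name)
    return found


duplicates = {value: names
              for value, names in find_receipts(SOURCE).items()
              if len(names) > 1}
print(duplicates)
```

In a real meta-test the source could come from inspect.getsource() on the 
test class; the grep pipeline is still less work.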



-- 
https://mail.python.org/mailman/listinfo/python-list