Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Rui Maciel
lcrocker wrote:

 I understand that for something like a server distribution, but Ubuntu
 is a user-focused desktop distribution. It has a GUI, always. 

Irrelevant.  


 The
 purpose of a distro like that is to give users a good experience. If I
 install Python on Windows, I get to use Python. On Ubuntu, I don't,
 and I think that will confuse some users. 

Nonsense.  No one is keeping anyone off tkinter.  If you want it, install 
it.  There are official packages in the repositories such as python-tk and 
python3-tk.  If someone else doesn't want them then they aren't forced to 
pack their Ubuntu systems with more cruft.  There's nothing worse than being 
forced to install piles of irrelevant and useless stuff as a dependency to a 
fundamental package.


 I recently recommended
 Python to a friend who wants to start learning programming. Hurdles
 like this don't help someone like him.

If your friend believes that having to do an extra pair of clicks or typing 
sudo apt-get install python-tk is an unbeatable hurdle then your friend's 
computer skills are awfully lacking and he won't have much success learning 
how to write programs.


Rui Maciel
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Rui Maciel
Steven D'Aprano wrote:

 It's only easy to install a package on Ubuntu if you know that you have
 to, and can somehow work out the name of the package.

No one actually has to install tkinter.  That's the whole point of providing 
it as a separate package: only those who want to use it have to install it. 
The rest of us don't.


Rui Maciel


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Steven D'Aprano
On Mon, 22 Apr 2013 07:36:47 +0100, Rui Maciel wrote:

 Steven D'Aprano wrote:
 
 It's only easy to install a package on Ubuntu if you know that you have
 to, and can somehow work out the name of the package.
 
 No one actually has to install tkinter.  That's the whole point of
 providing it as a separate package: only those who want to use it have
 to install it. The rest of us don't.

I think that if you are worrying about the overhead of the tkinter 
bindings for Python, you're guilty of premature optimization. The tkinter 
package in Python 3.3 is trivially small, under 2 MB.

Besides, how far do we go? Do we expect people to install (say):

python3-copy

so that those who don't need the copy module don't have to install it?

sudo apt-get install python3 python3-copy python3-dis python3-doctest \
 python3-csv python3-logging python3-shutil ...


There are advantages to having the *standard library* actually be, you 
know, *standard*.

In my perfect world, the tk/tcl bindings and the tkinter package would be 
installed with any Python installation. Naturally they won't work if you 
don't install Tcl, but to make them work, all you need is:

sudo apt-get install python3 tcl

Don't want Tcl? Fine, don't install it, and "import tkinter" will fail at 
import time, preferably with a sensible error message like "Tcl not 
installed".

Naturally I'm just talking about the standard CPython implementation on 
Linux systems where Tcl is standard. If you have an embedded system, 
where tkinter's 2MB is *not* trivially small, or a platform where Tcl 
does not exist, then that's a different story. But in a standard Linux 
desktop install of Python, tkinter should Just Work once you install Tcl.

In my perfect world.



-- 
Steven


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Andrew Berg
On 2013.04.22 02:17, Steven D'Aprano wrote:
 I think that if you are worrying about the overhead of the tkinter 
 bindings for Python, you're guilty of premature optimization. The tkinter 
 package in Python 3.3 is trivially small, under 2 MB.
 
 Besides, how far do we go? Do we expect people to install (say):
 
 python3-copy
 
 so that those who don't need the copy module don't have to install it?

Much of the stdlib doesn't rely on anything but the core interpreter. tkinter
by itself is not the issue. As you said, the bindings are tiny. However, in
order to be usable, it requires quite a few things - most notably X. On desktop
Linux, this is already installed, but on server systems, it generally is not
(or at least shouldn't be in most cases). Going back to my example of a web
server using a Python-based framework, I'll repeat that there is no reason such
a system should have X even installed in order to serve web pages. Even on a
lean, mean server machine, CPython requires only a few extra libraries. Add
tkinter, and suddenly you have to install a LOT of things. If you plan to
actually use tkinter, this is fine. If not, you've just added a lot of stuff
that you don't need. This adds unnecessary overhead in several places (like
your package system's database).
-- 
CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1


pip does not find packages

2013-04-22 Thread Olive
I am using virtualenv and pip (from Arch Linux). What I have done:
virtualenv was installed by my distribution. I created a virtual environment
and activated it; it installed pip. So far so good.

Now I am trying to install a package in the virtual environment:

pip install Impacket
Downloading/unpacking Impacket
  Could not find any downloads that satisfy the requirement Impacket
No distributions at all found for Impacket

but Impacket is found by 
pip search Impacket
Impacket  - Network protocols Constructors and Dissectors

Exactly the same happens with pcapy. With PyGTK, the pip command just hangs
when trying to download it. What is going on? Maybe a misconfigured server? Is
there anything that I can do?

Olive



Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Steven D'Aprano
On Mon, 22 Apr 2013 03:08:24 -0500, Andrew Berg wrote:

 Much of the stdlib doesn't rely on anything but the core interpreter.
 tkinter by itself is not the issue. As you said, the bindings are tiny.
 However, in order to be usable, it requires quite a few things - most
 notably X. On desktop Linux, this is already installed, but on server
 systems, it generally is not (or at least shouldn't be in most cases).
 Going back to my example of a web server using a Python-based framework,
 I'll repeat that there is no reason such a system should have X even
 installed in order to serve web pages. Even on a lean, mean server
 machine, CPython requires only a few extra libraries. Add tkinter, and
 suddenly you have to install a LOT of things. If you plan to actually
 use tkinter, this is fine. If not, you've just added a lot of stuff that
 you don't need. This adds unnecessary overhead in several places (like
 your package system's database).


I can't disagree with any of this, except to say that none of this 
justifies having a separate package for Tkinter. Naturally if you don't 
have X, Tcl won't work, and if Tcl won't work, Tkinter won't work and 
should give an import error. But that doesn't imply that X must be a 
dependency for Python. It's a dependency for having Tkinter *work*, but 
not for *installing* Tkinter as part of the standard library.

Hell, even if you have X installed, and Tcl, and the Tkinter packages, 
importing tkinter can still fail, if Python wasn't built with the right 
magic incantations for it to recognise that Tcl is installed.


-- 
Steven


Serial Port Issue

2013-04-22 Thread chandan kumar
Hi,
I'm new to Python and trying to learn serial communication with it. In the
process I'm facing serial port issues. Please find the attached COMPorttest.py
file and correct me if anything is wrong in the code. My code always ends up in
the exception handler. I noted down the COM port number from the Windows Device
Manager list.

Operating system: XP
Python Ver: 2.5
Pyserial: 2.5

I even tried passing the commands below from the Python shell:

import serial
ser=serial.Serial(port=21,baudrate=9600)

I observe the error below in the Python shell:

File "C:\Python25\lib\serial\serialwin32.py", line 55, in open
    raise SerialException("could not open port: %s" % msg)
SerialException: could not open port: (2, 'CreateFile', 'The system cannot find the file specified.')

Thanks in advance.
Best Regards,
Chandan.


import serial


def checkPort():
    """
    This function reads the serial port and writes it.
    """
    try:
        ser = serial.Serial(
            port=21,
            baudrate=9600,
            bytesize=serial.EIGHTBITS,
            parity=serial.PARITY_NONE,
            stopbits=serial.STOPBITS_ONE,
            timeout=10
        )

        if ser.isOpen():
            print("Port Open")
        else:
            ser.close()
            print("port closed")

    except serial.serialutil.SerialException:
        print("Failed to open port")


checkPort()

attachment: DeviceManager.PNG
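
One thing that may be worth checking, offered as a hedged sketch rather than a
diagnosis: in pySerial 2.x an integer port is a zero-based index, so port=21
asks for COM22 rather than COM21. The helper names below are hypothetical, and
the port number 21 is assumed to be the one shown in Device Manager.

```python
def com_name(number):
    """Windows device name for a COM port number as shown in Device Manager."""
    return "COM%d" % number


def pyserial2_index(number):
    """Zero-based index that pySerial 2.x uses for the same port."""
    return number - 1


def open_port(number, baudrate=9600):
    # Imported lazily so the helpers above work without pySerial installed.
    import serial
    # Passing the device name sidesteps the zero-based numbering question.
    return serial.Serial(port=com_name(number), baudrate=baudrate, timeout=10)
```

With these helpers, open_port(21) would try to open COM21. On some older
setups the name of a port above COM9 may need to be spelled with the
"\\.\" device prefix; that detail is an assumption to verify locally.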


Error in Import gv module

2013-04-22 Thread Megha Agrawal
https://code.google.com/p/python-graph/wiki/Example

When I try to run the graph-drawing code given at the above link, I get the
following error:

ImportError: No module named gv

What could be the reason?



Thank you!


Re: Error in Import gv module

2013-04-22 Thread Andreas Perstinger

On 22.04.2013 12:13, Megha Agrawal wrote:

https://code.google.com/p/python-graph/wiki/Example

When I am trying to run the code to draw a graph, given on above link, I am
getting following error:

ImportError: No module named gv

What can be the reasons?


Which OS?

It looks like you are missing graphviz or you need to adapt your paths:
https://code.google.com/p/python-graph/issues/detail?id=15

Bye, Andreas


Re: Error in Import gv module

2013-04-22 Thread Dave Angel

On 04/22/2013 06:13 AM, Megha Agrawal wrote:

https://code.google.com/p/python-graph/wiki/Example

When I am trying to run the code to draw a graph, given on above link, I am
getting following error:

ImportError: No module named gv

What can be the reasons?




Simplest is that you haven't installed python-graph


https://code.google.com/p/python-graph/downloads/list

or, more directly, https://code.google.com/p/python-graph/


--
DaveA


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread lcrocker
On Apr 21, 11:36 pm, Rui Maciel rui.mac...@gmail.com wrote:
 Steven D'Aprano wrote:
  It's only easy to install a package on Ubuntu if you know that you have
  to, and can somehow work out the name of the package.

 No one actually has to install tkinter.  That's the whole point of providing
 it as a separate package: only those who want to use it have to install it.
 The rest of us don't.

I'm a programmer, I installed Tkinter, and use it. I'd like to deploy programs
written with it to others.  **Those** people know nothing about it, and
**shouldn't have to**. I've given them a program in Python, they have Python,
but it doesn't run, and doesn't give them a helpful error. They'll probably
just blame me and move on.  Not every Python user is a programmer.  If I write
a program in Java, any user with Java installed can run it.  As it stands,
that's not true for Python.  That's not good PR for the cause.
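
A deployed script could soften this failure itself. The following is only a
sketch under assumptions: the helper names and the wording of the hint are
hypothetical, and the package name python3-tk is the one mentioned earlier in
this thread.

```python
import importlib
import sys


def try_import(name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def missing_tk_hint():
    # Hypothetical message; python3-tk is the Ubuntu package named in this thread.
    return ("This program needs tkinter, which is not installed.\n"
            "On Ubuntu, try: sudo apt-get install python3-tk")


def main():
    tkinter = try_import("tkinter")
    if tkinter is None:
        sys.stderr.write(missing_tk_hint() + "\n")
        return 1
    # ... build the GUI with tkinter here ...
    return 0
```

Calling sys.exit(main()) at the bottom of the script would turn the bare
traceback into a one-line instruction for a non-programmer.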


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread rusi
On Apr 22, 11:35 am, Rui Maciel rui.mac...@gmail.com wrote:
 lcrocker wrote:
 
  I recently recommended
  Python to a friend who wants to start learning programming. Hurdles
  like this don't help someone like him.

 If your friend believes that having to do an extra pair of clicks or typing
 sudo apt-get install python-tk is an unbeatable hurdle then your friend's
 computer skills are awfully lacking and he won't have much success learning
 how to write programs.

There are two worldviews here and they are as far apart as can be. It's
good to see them before arguing.
1. python as a standalone language
2. python as part of an (OS-related) ecosystem

In windows python may or may not exist.  And if it exists and I go
inside the python directories and start messing around -- deleting
some files, modifying others etc -- what will happen? Nothing much. My
python programs will stop working.
Presumably if I reinstall, it will be fine thereafter.

What about linux?
As an experiment I just tried
$ aptitude purge python
#Noobs BEWARE of that command
and aptitude was too confused to give me a coherent report

Tried then
$ aptitude purge python2.7
The list of packages that it would purge was in the hundreds. Here's a
small sample of what would go:
Firstly there are all the python-* packages.  This is obvious. Not so
obvious that some like python-csound were probably installed by me.
Others like python-debian are needed for the basic health and
functioning of the system.

And besides these there are a pile of others that have no relation to
python.  A sample:
asciidoc, bzr, dia, eog, gcj-*, gdb(!!), gimp, gnome-* (about 20 of
these) printconf…

So python is completely optional in windows.
It is a part of the infrastructure on linux.
Messing with it is almost like saying: I don't see what that vmlinuz
file is doing in /boot. So I removed it.

Coming to the OP question:
a. The python that PSF provides is suitable for learning python
b. The python that linux distros provide is part of the wireframe on
which the system rests.

b may be derived from a but they are hardly the same.  They may look
very similar but their intents are quite different.

So when you say

 If your friend believes that having to do an extra pair of clicks or typing
 sudo apt-get install python-tk is an unbeatable hurdle then your friend's
 computer skills are awfully lacking and he won't have much success learning
 how to write programs.

It's all correct what you say.  You won't have too many people learning
from you if that's how you say it.
Remember that the difference between an expert and a noob is rarely a
question of intelligence or diligence.
It's just some boring trivial mountain of data that the expert has
picked up over time.


List Count

2013-04-22 Thread Blind Anagram
I would be grateful for any advice people can offer on the fastest way
to count items in a sub-sequence of a large list.

I have a list of boolean values that can contain many hundreds of
millions of elements for which I want to count the number of True values
in a sub-sequence, one from the start up to some value (say hi).

I am currently using:

   sieve[:hi].count(True)

but I believe this may be costly because it copies a possibly large part
of the sieve.

Ideally I would like to be able to use:

   sieve.count(True, hi)

where 'hi' sets the end of the count but this function is, sadly, not
available for lists.

The use of a bytearray with a memoryview object instead of a list solves
this particular problem but it is not a solution for me as it creates
more problems than it solves in other aspects of the program.

Can I assume that one possible solution would be to sub-class list and
create a C based extension to provide list.count(value, limit)?

Are there any other solutions that will avoid copying a large part of
the list?
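
For reference, the bytearray variant mentioned above can be sketched as
follows. It relies on the fact that count() on bytes-like objects accepts
optional start and end positions. The sample data is hypothetical, and this
does not address the other problems a bytearray causes in the program.

```python
# Hypothetical small sieve; 1 stands for True, 0 for False.
sieve = bytearray([1, 0, 1, 1, 0, 1, 0, 0])
hi = 5

# count() takes optional start and end positions, so the first `hi`
# entries are scanned in place without making a slice copy.
count = sieve.count(b"\x01", 0, hi)
print(count)  # prints 3
```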


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread rusi
On Apr 22, 4:18 pm, lcrocker leedanielcroc...@gmail.com wrote:
 On Apr 21, 11:36 pm, Rui Maciel rui.mac...@gmail.com wrote:

  Steven D'Aprano wrote:
   It's only easy to install a package on Ubuntu if you know that you have
   to, and can somehow work out the name of the package.

  No one actually has to install tkinter.  That's the whole point of providing
  it as a separate package: only those who want to use it have to install it.
  The rest of us don't.

 I'm a programmer, I installed Tkinter, and use it. I'd like to deploy programs
 written with it to others.  **Those** people know nothing about it, and
 **shouldn't have to**. I've given them a program in Python, they have Python,
 but it doesn't run, and doesn't give them a helpful error. They'll probably
 just blame me and move on.  Not every Python user is a programmer.  If I write
 a program in Java, any user with Java installed can run it.  As it stands,
 that's not true for Python.  That's not good PR for the cause.

On the whole agree -- except for the java part -- maybe you've not
heard of 'jar hell'?
On the whole easy-deployability without losing easy-programmability is
a major research issue.

See this for someone choosing C++ over Lisp
http://comments.gmane.org/gmane.comp.finance.ledger.general/1955


Selenium Webdriver + Python (How to get started ??)

2013-04-22 Thread arif7d . auto
Note: I have some experience using Selenium IDE and WebDriver (Java),
but no prior experience with Python.

Now there is a project for which I will need to work with WebDriver + Python.

So far I have done the following steps:

Installed the JDK
Set up Eclipse
Downloaded & installed Python v3.3.1
Downloaded & installed PyDev (for Eclipse) and configured it
Downloaded & installed (Distribute + pip)
http://www.lfd.uci.edu/~gohlke/pythonlibs/#pip
Installed Selenium using the command prompt

Running the following commands from the Windows 7 command prompt successfully
opens the Firefox browser:

python
from selenium import webdriver
webdriver.Firefox()

--

The ISSUE is that I do not know the exact steps for creating a Python
WebDriver test project.

I created a new PyDev project with a src folder and used sample Python code
from the internet, but the selenium classes are not recognized. I have tried
various approaches to importing the libraries, but none seems to work. Can
anyone guide me, step by step, on what I need to do to successfully run a
simple test via Python WebDriver (Eclipse PyDev)?

Thank you.


Re: Error in Import gv module

2013-04-22 Thread Andreas Perstinger

Please avoid top posting and answer to the list.

On 22.04.2013 12:38, Megha Agrawal wrote:

Windows 7, and I have the pygraphviz library in the python27 -> lib ->
site-package folder.


Sorry don't know much about Windows.

Have you read through all the issues involving import gv errors?:
https://code.google.com/p/python-graph/issues/list?can=1&q=import+gv&colspec=ID+Type+Status+Priority+Milestone+Owner+Summary&cells=tiles

Bye, Andreas


Re: How to set my gui?

2013-04-22 Thread Gene Heskett
On Friday 19 April 2013 22:16:18 Chris Angelico did opine:

 On Sat, Apr 20, 2013 at 9:10 AM, Dennis Lee Bieber
 
 wlfr...@ix.netcom.com wrote:
  On Fri, 19 Apr 2013 09:24:36 +1000, Chris Angelico ros...@gmail.com
  
  declaimed the following in gmane.comp.python.general:
  On Fri, Apr 19, 2013 at 8:57 AM, Walter Hurry 
walterhu...@lavabit.com wrote:
   On Fri, 19 Apr 2013 08:00:11 +1000, Chris Angelico wrote:
   But 1 Corinthians 13:11
   
   You are grown up now, I surmise.
  :
  :) Born in 1984, so that'll give you some idea where I was in the
  :1990s.
  :
  A puppy to be taught by greymuzzles (unfortunately, /this/
  
  greymuzzle [1958] has reached the point of being an old dog that only
  learns new tricks with extreme difficulty G)
 
 Yep, taught by my Dad, who has often told the story of how he once
 held a whole kilobyte of memory in his hands (something like a cubic
 meter in size). He introduced me to programming, to fiddling with the
 system configs (actually he forbade that, for ages - because he had to
 clean up the mess if the system wouldn't boot), and to the joys of
 networking. So in a large way he's why I'm a geek... and actually he
 started that even earlier, when I was given the name Chris at birth.
 That on its own probably is the biggest cause of my geekery, I think!
 
 ChrisA

Buncha spring chickens, the whole lot of you.  Born in '34, I was a geek 
before the word was invented.  But like some of you claim, I am now that 
old dog that doesn't learn new tricks easily.

Cheers, Gene
-- 
There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order.
-Ed Howdershelt (Author)
My web page: http://coyoteden.dyndns-free.com:85/gene is up!
My views 
http://www.armchairpatriot.com/What%20Has%20America%20Become.shtml
Mandrell: You know what I think?
Doctor:   Ah, ah that's a catch question. With a brain your size you
  don't think, right?
-- Dr. Who
A pen in the hand of this president is far more
dangerous than a gun in the hands of 200 million
  law-abiding citizens.


Confusing Algorithm

2013-04-22 Thread RBotha
I'm facing the following problem:


In a city of towerblocks, Spiderman can 
“cover” all the towers by connecting the 
first tower with a spider-thread to the top 
of a later tower and then to a next tower 
and then to yet another tower until he 
reaches the end of the city. Threads are 
straight lines and cannot intersect towers. 
Your task is to write a program that finds 
the minimal number of threads to cover all 
the towers. The list of towers is given as a 
list of single digits indicating their height.

-Example:
List of towers: 1 5 3 7 2 5 2
Output: 4


I'm not sure how a 'towerblock' could be defined. How square does a shape have 
to be to qualify as a towerblock? Any help on solving this problem?

Thank you.


Re: List Count

2013-04-22 Thread Dave Angel

On 04/22/2013 07:58 AM, Blind Anagram wrote:

I would be grateful for any advice people can offer on the fastest way
to count items in a sub-sequence of a large list.

I have a list of boolean values that can contain many hundreds of
millions of elements for which I want to count the number of True values
in a sub-sequence, one from the start up to some value (say hi).

I am currently using:

sieve[:hi].count(True)

but I believe this may be costly because it copies a possibly large part
of the sieve.

Ideally I would like to be able to use:

sieve.count(True, hi)

where 'hi' sets the end of the count but this function is, sadly, not
available for lists.

The use of a bytearray with a memoryview object instead of a list solves
this particular problem but it is not a solution for me as it creates
more problems than it solves in other aspects of the program.

Can I assume that one possible solution would be to sub-class list and
create a C based extension to provide list.count(value, limit)?

Are there any other solutions that will avoid copying a large part of
the list?



Instead of using the default slice notation, why not use 
itertools.islice() ?


Something like  (untested):

import itertools

it = itertools.islice(sieve, 0, hi)
sum(itertools.imap(bool, it))

I only broke it into two lines for clarity.  It could also be:

sum(itertools.imap(bool, itertools.islice(sieve, 0, hi)))

If you're using Python 3.x, say so, and I'm sure somebody can simplify 
these, since in Python 3, many functions already produce iterators 
instead of lists.



--
DaveA
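
For Python 3, a possible simplification of the snippet above: the built-in
map() is already lazy there, so itertools.imap is not needed. The sample data
is hypothetical.

```python
from itertools import islice

sieve = [True, False, True, True, False, True]  # hypothetical sample data
hi = 4

# islice() walks the first `hi` items without copying them into a new list.
count = sum(map(bool, islice(sieve, 0, hi)))
print(count)  # prints 3
```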


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Antoon Pardon
Op 22-04-13 11:18, Steven D'Aprano schreef:
 On Mon, 22 Apr 2013 03:08:24 -0500, Andrew Berg wrote:

 Much of the stdlib doesn't rely on anything but the core interpreter.
 tkinter by itself is not the issue. As you said, the bindings are tiny.
 However, in order to be usable, it requires quite a few things - most
 notably X. On desktop Linux, this is already installed, but on server
 systems, it generally is not (or at least shouldn't be in most cases).
 Going back to my example of a web server using a Python-based framework,
 I'll repeat that there is no reason such a system should have X even
 installed in order to serve web pages. Even on a lean, mean server
 machine, CPython requires only a few extra libraries. Add tkinter, and
 suddenly you have to install a LOT of things. If you plan to actually
 use tkinter, this is fine. If not, you've just added a lot of stuff that
 you don't need. This adds unnecessary overhead in several places (like
 your package system's database).
 I can't disagree with any of this, except to say that none of this 
 justifies having a separate package for Tkinter. Naturally if you don't 
 have X, Tcl won't work, and if Tcl won't work, Tkinter won't work and 
 should give an import error. But that doesn't imply that X must be a 
 dependency for Python. It's a dependency for having Tkinter *work*, but 
 not for *installing* Tkinter as part of the standard library.

 Hell, even if you have X installed, and Tcl, and the Tkinter packages, 
 importing tkinter can still fail, if Python wasn't built with the right 
 magic incantations for it to recognise that Tcl is installed.

Then don't use a package system. The job of a package system is that, if
you install something, it installs all the dependencies needed to make
it work. And if, as the OP thinks, Python working means tkinter working,
then not installing Tcl and not installing X is not an option.

Your solution doesn't make sense in view of your earlier response, where
you argued that tkinter should be installed because it is part of the standard
library, citing the advantage of having a standard library. But IMO a part
of that standard library not working is just as harmful as part of that
standard library not being installed. From a user/programmer's point of
view the result is the same: it is unusable.



Re: Confusing Algorithm

2013-04-22 Thread Chris Angelico
On Mon, Apr 22, 2013 at 10:39 PM, RBotha r...@ymond.co.za wrote:
 I'm facing the following problem:

 
 In a city of towerblocks, Spiderman can
 “cover” all the towers by connecting the
 first tower with a spider-thread to the top
 of a later tower and then to a next tower
 and then to yet another tower until he
 reaches the end of the city. Threads are
 straight lines and cannot intersect towers.
 Your task is to write a program that finds
 the minimal number of threads to cover all
 the towers. The list of towers is given as a
 list of single digits indicating their height.

 -Example:
 List of towers: 1 5 3 7 2 5 2
 Output: 4
 

 I'm not sure how a 'towerblock' could be defined. How square does a shape 
 have to be to qualify as a towerblock? Any help on solving this problem?

First start by clarifying the problem. My reading of this is that
Spiderman iterates over the towers, connecting his thread from one to
the next, but only so long as the towers get shorter:

First thread
1
New thread
5-3
New thread
7-2
New thread
5-2

There are other possible readings of the problem. Once you fully
understand the problem, write it out in pseudo-code - something like
this:

Iterate over towers sequentially
If next tower is taller than current tower, new thread
Report number of new threads

And then turn it into code, and start running it and playing with
it... and debugging it. (There's a small error in the pseudo-code I
just posted; can you spot it?) Once you're at that point, if you get
stuck, post your code and we can try to help.

But fundamentally, I think you don't _need_ to define a towerblock. :)

ChrisA
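
Under the reading given above (a thread continues only while the towers get
shorter), one possible translation of the pseudo-code into Python is sketched
below. The function name is hypothetical, other readings of the problem would
need different code, and note that the first thread has to be counted as well.

```python
def count_threads(heights):
    """Count threads under the 'continue while towers get shorter' reading."""
    if not heights:
        return 0
    threads = 1  # the first tower always starts a thread
    for current, following in zip(heights, heights[1:]):
        if following > current:
            # A taller tower cannot be reached by the current thread.
            threads += 1
    return threads


print(count_threads([1, 5, 3, 7, 2, 5, 2]))  # prints 4
```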


Re: List Count

2013-04-22 Thread Blind Anagram
On 22/04/2013 13:51, Dave Angel wrote:
 On 04/22/2013 07:58 AM, Blind Anagram wrote:
 I would be grateful for any advice people can offer on the fastest way
 to count items in a sub-sequence of a large list.

 I have a list of boolean values that can contain many hundreds of
 millions of elements for which I want to count the number of True values
 in a sub-sequence, one from the start up to some value (say hi).

 I am currently using:

 sieve[:hi].count(True)

 but I believe this may be costly because it copies a possibly large part
 of the sieve.

 Ideally I would like to be able to use:

 sieve.count(True, hi)

 where 'hi' sets the end of the count but this function is, sadly, not
 available for lists.

 The use of a bytearray with a memoryview object instead of a list solves
 this particular problem but it is not a solution for me as it creates
 more problems than it solves in other aspects of the program.

 Can I assume that one possible solution would be to sub-class list and
 create a C based extension to provide list.count(value, limit)?

 Are there any other solutions that will avoid copying a large part of
 the list?

 
 Instead of using the default slice notation, why not use
 itertools.islice() ?
 
 Something like  (untested):
 
 import itertools
 
 it = itertools.islice(sieve, 0, hi)
 sum(itertools.imap(bool, it))
 
 I only broke it into two lines for clarity.  It could also be:
 
 sum(itertools.imap(bool, itertools.islice(sieve, 0, hi)))
 
 If you're using Python 3.x, say so, and I'm sure somebody can simplify
 these, since in Python 3, many functions already produce iterators
 instead of lists.

Thanks, I'll look at these ideas.  And, yes, my interest is mainly in
Python 3.



Re: Error in Import gv module

2013-04-22 Thread Megha Agrawal
Yes, I did. They said the gv module doesn't exist for Windows.


On Mon, Apr 22, 2013 at 5:56 PM, Andreas Perstinger andiper...@gmail.com wrote:

 Please avoid top posting and answer to the list.


 On 22.04.2013 12:38, Megha Agrawal wrote:

 Windows 7, and I have the pygraphviz library in the python27 -> lib ->
 site-package folder.


 Sorry don't know much about Windows.

 Have you read through all the issues involving import gv errors?:
 https://code.google.com/p/python-graph/issues/list?can=1&q=import+gv&colspec=ID+Type+Status+Priority+Milestone+Owner+Summary&cells=tiles


 Bye, Andreas


Re: List Count

2013-04-22 Thread Steven D'Aprano
On Mon, 22 Apr 2013 12:58:20 +0100, Blind Anagram wrote:

 I would be grateful for any advice people can offer on the fastest way
 to count items in a sub-sequence of a large list.
 
 I have a list of boolean values that can contain many hundreds of
 millions of elements for which I want to count the number of True values
 in a sub-sequence, one from the start up to some value (say hi).
 
 I am currently using:
 
sieve[:hi].count(True)
 
 but I believe this may be costly because it copies a possibly large part
 of the sieve.

Have you timed it? Because Python is a high-level language, it is rarely 
obvious what code will be fast. Yes, sieve[:hi] will copy the first hi 
entries, but that's likely to be fast, basically just a memcpy, unless 
sieve is huge and memory is short. In other words, unless your sieve is 
so huge that the operating system cannot find enough memory for it, 
making a copy is likely to be relatively insignificant.

I've just tried seven different techniques to optimize this, and the 
simplest, most obvious technique is by far the fastest. Here are the 
seven different code snippets I measured, with results:


sieve[:hi].count(True)
sum(sieve[:hi])
sum(islice(sieve, hi))
sum(x for x in islice(sieve, hi) if x)
sum(x for x in islice(sieve, hi) if x is True)
sum(1 for x in islice(sieve, hi) if x is True)
len(list(filter(None, islice(sieve, hi))))


Here's the code I used to time them. Just copy and paste into an 
interactive interpreter:

=== cut ===

import random
sieve = [random.random() < 0.5 for i in range(10**7)]

from timeit import Timer
setup = """from __main__ import sieve
from itertools import islice
hi = 7*10**6
"""

t1 = Timer("sieve[:hi].count(True)", setup)
t2 = Timer("sum(sieve[:hi])", setup)
t3 = Timer("sum(islice(sieve, hi))", setup)
t4 = Timer("sum(x for x in islice(sieve, hi) if x)", setup)
t5 = Timer("sum(x for x in islice(sieve, hi) if x is True)", setup)
t6 = Timer("sum(1 for x in islice(sieve, hi) if x is True)", setup)
t7 = Timer("len(list(filter(None, islice(sieve, hi))))", setup)

for t in (t1, t2, t3, t4, t5, t6, t7):
    print( min(t.repeat(number=10)) )

=== cut ===


On my computer, using Python 3.3, here are the timing results I get:

2.3714727330952883
7.96061935601756
7.230580328032374
10.080201900098473
11.544118068180978
9.216834562830627
3.499635103158653


Times shown are in seconds, and are for the best of three trials, each 
trial having 10 repetitions of the code being tested.

As you can see, clever tricks using sum are horrible pessimisations; the 
only thing that comes close to the obvious solution is the one using 
filter.

Although I have only tested a list with ten million items, not hundreds 
of millions, I don't expect that the results will be significantly 
different if you use a larger list, unless you are very short of memory.

[...]
 Can I assume that one possible solution would be to sub-class list and
 create a C based extension to provide list.count(value, limit)?

Of course. But don't optimize this until you know that you *need* to 
optimize it. Is it really a bottleneck in your code? There's no point in 
saving the 0.1 second it takes to copy the list if it takes 2 seconds to 
count the items regardless.


 Are there any other solutions that will avoid copying a large part of
 the list?

Yes, but they're slower.

Perhaps a better solution might be to avoid counting anything. If you can 
keep a counter, and each time you add a value to the list you update the 
counter, then getting the number of True values will be instantaneous.



-- 
Steven


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Chris Angelico
On Mon, Apr 22, 2013 at 9:18 PM, lcrocker leedanielcroc...@gmail.com wrote:
 On Apr 21, 11:36 pm, Rui Maciel rui.mac...@gmail.com wrote:
 Steven D'Aprano wrote:
  It's only easy to install a package on Ubuntu if you know that you have
  to, and can somehow work out the name of the package.

 No one actually has to install tkinter.  That's the whole point of providing
 it as a separate package: only those who want to use it have to install it.
 The rest of us don't.

 I'm a programmer, I installed Tkinter, and use it. I'd like to deploy
 programs written with it to others.  **Those** people know nothing
 about it, and **shouldn't have to**. I've given them a program in
 Python, they have Python, but it doesn't run, and doesn't give them a
 helpful error. They'll probably just blame me and move on.  Not every
 Python user is a programmer.  If I write a program in Java, any user
 with Java installed can run it.  As it stands, that's not true for
 Python.  That's not good PR for the cause.

If you're deploying only to Debian-based Linuxes (such as the Ubuntu
you mentioned originally), then it may be worth distributing your
program as a .deb file and declaring all the appropriate dependencies
(which would then include python3-tk). Alternatively, just put an
apt-get install python3-tk into your install script (which is what I
do for internal deployments - if you need package XYZ for program Foo,
inst-foo will install XYZ), or simply tell people they need to install
it. How do you make sure they even have a Python 3.x? Whatever you do
to ensure that, just add python3-tk to it.

ChrisA


Re: itertools.groupby

2013-04-22 Thread Wolfgang Maier
Jason Friedman jsf80238 at gmail.com writes:

 
 Thank you for the responses!  Not sure yet which one I will pick.
 

Hi again,
I was working a bit on my own solution and on the one from Steven/Joshua,
and maybe this helps you decide:

from itertools import groupby

def separate_on(iterable, separator):
    # based on groupby
    sep_len = len(separator)
    for is_header, item in groupby(iterable,
                                   lambda line: line[:sep_len] == separator):
        if is_header:
            header_tails = [h[sep_len:].strip() for h in item]
            for naked_header in header_tails[:-1]:
                yield (naked_header, [])
            header_tail = header_tails[-1]
        else:
            try:
                yield (header_tail, [s.strip() for s in item])
            except UnboundLocalError:
                yield (None, [s.strip() for s in item])


def group(iterable, separator):
    # Steven's/Joshua's rewritten
    sep_len = len(separator)
    accum = None
    header = None
    for item in iterable:
        item = item.strip()
        if item[:sep_len] == separator:
            if accum is not None:
                # Don't bother if there are no accumulated lines.
                yield (header, accum)
            header = item[sep_len:]
            accum = []
        else:
            try:
                accum.append(item)
            except AttributeError:
                accum = [item]

    # Don't forget the last group of lines.
    yield (header, accum)

Both versions behave as follows:
- any line that *starts* with the separator is treated as a header line. The
tail of that line is returned as the group's title in a tuple with the
group's content, i.e. (header, [body]). If there's only the separator, the
title is ''. I find this a more useful behaviour as it allows things like:

##Group1
elem1
elem2
elem3
##Group2
a
b
c
...

- if there are headers without body, they are reported as (header, []).
- if the first body has no header, that's reported as (None, [body]).

Advantages & Disadvantages of either form:
Steven's/Joshua's: simple and fast
it's more readable I'd say, and
for small groups the groupby implementation is about 1.5x slower than this
one. The groupby version catches up with increasing group sizes (because it
uses comprehensions instead of list.append I think), but it only wins with
groups of ~1000 elements.

the groupby implementation: more flexible
its yield statement deliberately returns a list of the elements, but before
that you just have an iterator, which you could just as well turn into a
tuple, set, string or anything without constructing the list in memory.
So in terms of code recycling this might be preferable.

Cheers,
Wolfgang



Re: List Count

2013-04-22 Thread Peter Otten
Blind Anagram wrote:

 I would be grateful for any advice people can offer on the fastest way
 to count items in a sub-sequence of a large list.
 
 I have a list of boolean values that can contain many hundreds of
 millions of elements for which I want to count the number of True values
 in a sub-sequence, one from the start up to some value (say hi).
 
 I am currently using:
 
sieve[:hi].count(True)
 
 but I believe this may be costly because it copies a possibly large part
 of the sieve.
 
 Ideally I would like to be able to use:
 
sieve.count(True, hi)
 
 where 'hi' sets the end of the count but this function is, sadly, not
 available for lists.
 
 The use of a bytearray with a memoryview object instead of a list solves
 this particular problem but it is not a solution for me as it creates
 more problems than it solves in other aspects of the program.
 
 Can I assume that one possible solution would be to sub-class list and
 create a C based extension to provide list.count(value, limit)?
 
 Are there any other solutions that will avoid copying a large part of
 the list?

If the list doesn't change often you can convert it to a string

>>> items = [True, False, False] * 10
>>> sitems = "".join("FT"[i] for i in items)
>>> sitems
'TFFTFFTFFTFFTFFTFFTFFTFFTFFTFF'
>>> sitems.count("T", 3, 10)
3
>>> sitems.count("F", 3, 10)
4

Or you use a[3:10].sum() on a boolean numpy array. Its slices are views 
rather than copies:

>>> import numpy
>>> a = numpy.array([True, False, False]*10)
>>> a[3:10].sum()
3
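
The string trick works because str.count accepts optional start and end
positions and scans in place. bytearray.count takes the same arguments, so
a one-off conversion to one byte per flag gives the same effect. A minimal
sketch, assuming the data changes rarely enough to make the conversion
worthwhile (count_true and the variable names are made up for this example):

```python
# One byte per flag: True -> 1, False -> 0.
items = [True, False, False] * 10
flags = bytearray(items)


def count_true(flags, lo, hi):
    """Count the set flags in flags[lo:hi] without slicing off a copy."""
    return flags.count(b"\x01", lo, hi)


first_ten = count_true(flags, 0, 10)
middle = count_true(flags, 3, 10)
```

The conversion itself still walks the whole list once, so this only pays off
if several counts are taken between changes.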




kbhit/getch python equivalent

2013-04-22 Thread alb
Hi everyone,

I'm looking for a kbhit/getch equivalent in python in order to be able
to stop my inner loop in a controlled way (communication with external
hardware is involved and breaking it abruptly may cause unwanted errors
on the protocol).

I'm programming on *nix systems, no need to be portable on Windows. I've
seen the msvcrt module, but it looks like is for Windows only.

Any ideas/suggestions?

Al

-- 
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


Re: List Count

2013-04-22 Thread 88888 Dihedral
On Monday, April 22, 2013 at 7:58:20 PM UTC+8, Blind Anagram wrote:
 I would be grateful for any advice people can offer on the fastest way
 to count items in a sub-sequence of a large list.
 
 I have a list of boolean values that can contain many hundreds of
 millions of elements for which I want to count the number of True values
 in a sub-sequence, one from the start up to some value (say hi).
 
 I am currently using:
 
sieve[:hi].count(True)
 
 but I believe this may be costly because it copies a possibly large part
 of the sieve.
 
 Ideally I would like to be able to use:
 
sieve.count(True, hi)
 
 where 'hi' sets the end of the count but this function is, sadly, not
 available for lists.
 
 The use of a bytearray with a memoryview object instead of a list solves
 this particular problem but it is not a solution for me as it creates
 more problems than it solves in other aspects of the program.
 
 Can I assume that one possible solution would be to sub-class list and
 create a C based extension to provide list.count(value, limit)?
 
 Are there any other solutions that will avoid copying a large part of
 the list?

For problems involving a homogeneous list of numbers, please check whether
numpy arrays fit your needs in practice.


Sometimes I work with numbers in widely varying ranges; then Python's lists
and long integers are really handy.







HTTPServer again

2013-04-22 Thread Tom P

Hi,
 a few weeks back I posed a question about passing static data to a 
request server, and thanks to some useful suggestions, got it working. I 
see yesterday there is a suggestion to use a framework like Tornado 
rather than base classes. However I can't figure out how to achieve the same effect 
using Tornado (BTW this is all python 2.7). The key point is how to 
access the server class from within do_GET, and from the server class 
instance, to access its get and set methods.


Here are some code fragments that work with HTTPServer:

class MyHandler(BaseHTTPRequestHandler):

    def do_GET(self):
        ss = self.server
        tracks = ss.tracks
. . .
class MyWebServer(object):
    def get_params(self):
        return self.global_params
    def set_params(self, params):
        self.global_params = params
    def get_tracks(self):
        return self.tracks
    def __init__(self):
        self.global_params = ""
        self.tracks = setup_()
        myServer = HTTPServer
        myServer.tracks = self.get_tracks()
        myServer.params = self.get_params()
        self.server = myServer(('', 7878), MyHandler)
        print 'started httpserver on port 7878...'
. . . .
def main():
    aServer = MyWebServer()
    aServer.runIt()

if __name__ == '__main__':
    main()







Re: List Count

2013-04-22 Thread Skip Montanaro
Numpy is a big improvement here.  In Py 2.7 I get this output if I run
Steven's benchmark:

2.10364603996
3.68471002579
4.01849389076
7.41974878311
10.4202470779
9.16782712936
3.36137390137

(confirming his results).  If I then run the numpy idiom for this:


import random
from timeit import Timer

import numpy

sieve = numpy.array([random.random() < 0.5 for i in range(10**7)],
                    dtype=bool)

setup = """from __main__ import sieve
from itertools import islice
hi = 7*10**6
"""

t1 = Timer("(True == sieve[:hi]).sum()", setup)

print(min(t1.repeat(number=10)))
###

I get :

0.344316959381

It likely consumes less space as well, since it doesn't store Python
objects in the array.

Skip


Re: kbhit/getch python equivalent

2013-04-22 Thread Peter Otten
alb wrote:

 I'm looking for a kbhit/getch equivalent in python in order to be able
 to stop my inner loop in a controlled way (communication with external
 hardware is involved and breaking it abruptly may cause unwanted errors
 on the protocol).
 
 I'm programming on *nix systems, no need to be portable on Windows. I've
 seen the msvcrt module, but it looks like is for Windows only.
 
 Any ideas/suggestions?

Curses?

http://docs.python.org/dev/library/curses.html



Re: kbhit/getch python equivalent

2013-04-22 Thread Grant Edwards
On 2013-04-22, alb alessandro.bas...@cern.ch wrote:

 I'm looking for a kbhit/getch equivalent in python in order to be able
 to stop my inner loop in a controlled way (communication with external
 hardware is involved and breaking it abruptly may cause unwanted errors
 on the protocol).

 I'm programming on *nix systems, no need to be portable on Windows. I've
 seen the msvcrt module, but it looks like is for Windows only.

 Any ideas/suggestions?

Signals, ncurses, termios.
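
As a rough illustration of the termios route, the sketch below puts the
terminal into cbreak mode and polls it with select between iterations. The
function names and the choice of 'q' as the stop key are made up for this
example, and it assumes a POSIX terminal on stdin:

```python
import contextlib
import os
import select
import sys
import termios
import tty


@contextlib.contextmanager
def cbreak(fd):
    """Put the terminal on fd into cbreak mode and restore it afterwards."""
    saved = termios.tcgetattr(fd)
    try:
        tty.setcbreak(fd)
        yield
    finally:
        termios.tcsetattr(fd, termios.TCSADRAIN, saved)


def key_pressed(fd, timeout=0.0):
    """Return True if at least one byte can be read from fd without blocking."""
    readable, _, _ = select.select([fd], [], [], timeout)
    return bool(readable)


def run(step, fd=None):
    """Call step() repeatedly until 'q' is typed on the terminal."""
    if fd is None:
        fd = sys.stdin.fileno()
    with cbreak(fd):
        while True:
            step()
            if key_pressed(fd) and os.read(fd, 1) == b"q":
                break
```

Because the key is only checked between calls to step(), each exchange with
the hardware would run to completion before the loop stops.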

-- 
Grant Edwards               grant.b.edwards        Yow! ANN JILLIAN'S HAIR
                                  at               makes LONI ANDERSON'S
                              gmail.com            HAIR look like RICARDO
                                                   MONTALBAN'S HAIR!


Re: List Count

2013-04-22 Thread Blind Anagram
On 22/04/2013 14:13, Steven D'Aprano wrote:
 On Mon, 22 Apr 2013 12:58:20 +0100, Blind Anagram wrote:
 
 I would be grateful for any advice people can offer on the fastest way
 to count items in a sub-sequence of a large list.

 I have a list of boolean values that can contain many hundreds of
 millions of elements for which I want to count the number of True values
 in a sub-sequence, one from the start up to some value (say hi).

 I am currently using:

sieve[:hi].count(True)

 but I believe this may be costly because it copies a possibly large part
 of the sieve.
 
 Have you timed it? Because Python is a high-level language, it is rarely 
 obvious what code will be fast. Yes, sieve[:hi] will copy the first hi 
 entries, but that's likely to be fast, basically just a memcopy, unless 
 sieve is huge and memory is short. In other words, unless your sieve is 
 so huge that the operating system cannot find enough memory for it, 
 making a copy is likely to be relatively insignificant.
 
 I've just tried seven different techniques to optimize this, and the 
 simplest, most obvious technique is by far the fastest. Here are the 
 seven different code snippets I measured, with results:
 
 
 sieve[:hi].count(True)
 sum(sieve[:hi])
 sum(islice(sieve, hi))
 sum(x for x in islice(sieve, hi) if x)
 sum(x for x in islice(sieve, hi) if x is True)
 sum(1 for x in islice(sieve, hi) if x is True)
 len(list(filter(None, islice(sieve, hi))))

Yes, I did time it and I agree with your results (where my tests overlap
with yours).

But when using a sub-sequence, I do suffer a significant reduction in
speed for a count when compared with count on the full list.  When the
list is small enough not to cause memory allocation issues this is about
30% on 100,000,000 items.  But when the list is 1,000,000,000 items, OS
memory allocation becomes an issue and the cost on my system rises to
over 600%.

I agree that this is not a big issue but it seems to me a high price to
pay for the lack of a sieve.count(value, limit), which I feel is a
useful function (given that memoryview operations are not available for
lists).
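
One way to cap the size of the temporary copy while staying with a plain list
is to count in fixed-size chunks. This is only a sketch: count_prefix and the
default chunk size are illustrative, hi is assumed to be non-negative, and
whether it beats a single slice would need timing on the real data.

```python
def count_prefix(seq, value, hi, chunk=1 << 20):
    """Count value in seq[:hi], copying at most chunk items at a time."""
    hi = min(hi, len(seq))
    total = 0
    for start in range(0, hi, chunk):
        # Each slice is a short-lived copy of at most chunk items.
        total += seq[start:min(start + chunk, hi)].count(value)
    return total


sieve = [n % 3 == 0 for n in range(100)]
result = count_prefix(sieve, True, 50, chunk=16)
```

The counting itself still runs in C inside list.count; only the peak size of
the copy changes.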

 Of course. But don't optimize this until you know that you *need* to 
 optimize it. Is it really a bottleneck in your code? There's no point in 
 saving the 0.1 second it takes to copy the list if it takes 2 seconds to 
 count the items regardless.
 
 Are there any other solutions that will avoid copying a large part of
 the list?
 
 Yes, but they're slower.
 
 Perhaps a better solution might be to avoid counting anything. If you can 
 keep a counter, and each time you add a value to the list you update the 
 counter, then getting the number of True values will be instantaneous.

Creating the sieve is currently very fast as it is not done by adding
single items but by adding a large number of items at the same time
using a slice operation.  I could count the items in each slice as it is
added but this would add complexity that I would prefer to avoid because
the creation of the sieve is quite tricky to get right and I would
prefer not to fiddle with this.

Thank you (and others) for advice on this.



Re: Preparing sqlite, dl and tkinter for Python installation (no admin rights)

2013-04-22 Thread Serhiy Storchaka

On 21.04.13 23:31, James Jong wrote:

I see, just to be clear, do you mean that Python 2.7.4 (stable) is
incompatible with Tk 8.6 (stable)?


Yes.




Re: itertools.groupby

2013-04-22 Thread Neil Cerutti
On 2013-04-20, Jason Friedman jsf80...@gmail.com wrote:
 I have a file such as:

 $ cat my_data
 Starting a new group
 a
 b
 c
 Starting a new group
 1
 2
 3
 4
 Starting a new group
 X
 Y
 Z
 Starting a new group

 I am wanting a list of lists:
 ['a', 'b', 'c']
 ['1', '2', '3', '4']
 ['X', 'Y', 'Z']
 []

Hrmmm, hoomm. Nobody cares for slicing any more.

def headered_groups(lst, header):
    b = lst.index(header) + 1
    while True:
        try:
            e = lst.index(header, b)
        except ValueError:
            yield lst[b:]
            break
        yield lst[b:e]
        b = e+1

for group in headered_groups([line.strip() for line in open('data.txt')],
                             "Starting a new group"):
    print(group)

-- 
Neil Cerutti


Re: Weird behaviour?

2013-04-22 Thread nn
On Apr 21, 9:19 pm, Steven D'Aprano steve
+comp.lang.pyt...@pearwood.info wrote:
 On Mon, 22 Apr 2013 10:56:11 +1000, Chris Angelico wrote:
  You're running this under Windows. The convention on Windows is for
  end-of-line to be signalled with \r\n, but the convention inside Python
  is to use just \n. With the normal use of buffered and parsed input,
  this is all handled for you; with unbuffered input, that translation
  also seems to be disabled, so your string actually contains '120\r', as
  will be revealed by its repr().

 If that's actually the case, then I would call that a bug in raw_input.

 Actually, raw_input doesn't seem to cope well with embedded newlines even
 without the -u option. On Linux, I can embed a control character by
 typing Ctrl-V followed by Ctrl-char. E.g. Ctrl-V Ctrl-M to embed a
 carriage return, Ctrl-V Ctrl-J to embed a newline. So watch:

 [steve@ando ~]$ python2.7 -c "x = raw_input('Hello? '); print repr(x)"
 Hello? 120^M^Jabc
 '120\r'

 Everything after the newline is lost.

 --
 Steven

Maybe it is related to this bug?

http://bugs.python.org/issue11272




Re: There must be a better way

2013-04-22 Thread Oscar Benjamin
On 21 April 2013 14:15, Colin J. Williams c...@ncf.ca wrote:
 In the end, I used:

 inData= csv.reader(inFile)

 def main():

     if ver == '2':
         headerLine= inData.next()
     else:
         headerLine= inData.__next__()
     ...
     for item in inData:
         assert len(dataStore) == len(item)
         j= findCardinal(item[10])
         ...

This may not be relevant for what you're doing but if you use
csv.DictReader there's no need to retrieve the top line separately:

$ cat tmp.csv
a,b,c
1,2,3
4,5,6
$ python
Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import csv
>>> with open('tmp.csv', 'rb') as csvfile:
...   for row in csv.DictReader(csvfile):
...     print(row)
...
{'a': '1', 'c': '3', 'b': '2'}
{'a': '4', 'c': '6', 'b': '5'}


Oscar


Re: There must be a better way

2013-04-22 Thread Neil Cerutti
On 2013-04-21, Colin J. Williams c...@ncf.ca wrote:
 On 20/04/2013 9:07 PM, Terry Jan Reedy wrote:
 On 4/20/2013 8:34 PM, Tim Chase wrote:
 In 2.x, the csv.reader() class (and csv.DictReader() class) offered
 a .next() method that is absent in 3.x

 In Py 3, .next was renamed to .__next__ for *all* iterators. The
 intention is that one iterate with "for item in iterable" or use builtin
 functions iter() and next().


 Thanks to Chris, Tim and Terry for their helpful comments.

 I was seeking some code that would be acceptable to both Python 2.7 and 3.3.

 In the end, I used:

 inData= csv.reader(inFile)

 def main():
     if ver == '2':
         headerLine= inData.next()
     else:
         headerLine= inData.__next__()
     ...
     for item in inData:
         assert len(dataStore) == len(item)
         j= findCardinal(item[10])
         ...

 This is acceptable to both versions.

 It is not usual to have a name with preceding and following 
 underscores in user code.

 Presumably, there is a rationale for the change from csv.reader.next
 to csv.reader.__next__.

 If next is not acceptable for the version 3 csv.reader, perhaps __next__ 
 could be added to the version 2 csv.reader, so that the same code can be 
 used in the two versions.

 This would avoid the kluge I used above.

Would using csv.DictReader instead of csv.reader be an option?
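
The builtin next() that Terry mentions is available in Python 2.6 and later
as well as in 3.x, so it would also avoid the version check. A small sketch,
shown for Python 3 and using an in-memory file as a stand-in for inFile:

```python
import csv
import io

# Stand-in for the opened CSV file from the original post.
inFile = io.StringIO("a,b,c\n1,2,3\n4,5,6\n")
inData = csv.reader(inFile)

# next() calls .next() on 2.x iterators and .__next__() on 3.x ones.
headerLine = next(inData)

# The remaining rows are consumed as usual.
rows = [item for item in inData]
```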

-- 
Neil Cerutti


Re: itertools.groupby

2013-04-22 Thread Oscar Benjamin
On 22 April 2013 15:24, Neil Cerutti ne...@norwich.edu wrote:

 Hrmmm, hoomm. Nobody cares for slicing any more.

 def headered_groups(lst, header):
     b = lst.index(header) + 1
     while True:
         try:
             e = lst.index(header, b)
         except ValueError:
             yield lst[b:]
             break
         yield lst[b:e]
         b = e+1

This requires the whole file to be read into memory. Iterators are
typically preferred over list slicing for sequential text file access
since you can avoid loading the whole file at once. This means that
you can process a large file while only using a constant amount of
memory.


 for group in headered_groups([line.strip() for line in open('data.txt')],
                              "Starting a new group"):
     print(group)

The list comprehension above loads the entire file into memory.
Assuming that .strip() is just being used to remove the newline at the
end it would be simpler to use open('data.txt').read().splitlines(),
since that also loads everything into memory and removes the newlines
(readlines() would keep them). To remove them without reading everything
you can use map (or imap in Python 2), although headered_groups would
then have to consume an iterator rather than index and slice a list:

with open('data.txt') as inputfile:
    for group in headered_groups(map(str.strip, inputfile),
                                 "Starting a new group"):
        print(group)
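
For illustration, a variant of headered_groups that accepts any iterable of
lines might look like the sketch below. It is not Neil's code: it accumulates
instead of slicing, and unlike the index-based version it yields nothing
(rather than raising ValueError) when the header never appears.

```python
def headered_groups(lines, header):
    """Yield one list per group from any iterable of lines."""
    group = None  # stays None until the first header is seen
    for line in lines:
        if line == header:
            if group is not None:
                yield group
            group = []
        elif group is not None:
            group.append(line)
    if group is not None:
        # The last group has no header after it to trigger a yield.
        yield group


sample = ["Starting a new group", "a", "b", "c",
          "Starting a new group", "1", "2",
          "Starting a new group"]
groups = list(headered_groups(iter(sample), "Starting a new group"))
```

Only the current group is held in memory, so the file can be streamed.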


Oscar


error importing modules

2013-04-22 Thread Rodrick Brown
I'm using the fabric api (fabfile.org)

 I’m executing my fab script like the following:

$ fab -H server set_nic_buffers  -f set_nic_buffers.py
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/fabric/main.py", line 739, in main
    *args, **kwargs
  File "/usr/lib/python2.7/site-packages/fabric/tasks.py", line 316, in execute
    multiprocessing
  File "/usr/lib/python2.7/site-packages/fabric/tasks.py", line 213, in _execute
    return task.run(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/fabric/tasks.py", line 123, in run
    return self.wrapped(*args, **kwargs)
  File "/home/rbrown/repos/unix-tools/tools/fabfiles/nb.py", line 5, in set_nic_buffers
    f_exec = modules.Fabexec('set_nic_buffers', '/var/tmp/unix-tools/tools/set_nic_buffers.sh')
TypeError: 'module' object is not callable

 My paths all seem to be fine; I'm not sure what’s going on.

$ python -c 'import modules.Fabexec; print (modules.Fabexec)'
<module 'modules.Fabexec' from 'modules/Fabexec.pyc'>

Fabfiles

|-- modules
|   |-- Fabexec.py
|   |-- Fabexec.pyc
|   |-- __init__.py
|   `-- __init__.pyc
|-- systune.py
|-- systune.pyc
`-- set_nic_buffers.py

--- set_nic_buffers.py ---
import modules
from modules import Fabexec

def set_nic_buffers():
    f_exec = modules.Fabexec('set_nic_buffers',
                             '/var/tmp/unix-tools/tools/set_nic_buffers.sh')
    f_exec.run()

--- Fabexec.py ---
from fabric.api import run, cd, sudo, env
from fabric.contrib import files
from fabric.colors import green

class Fabexec(object):

    repobase='/var/tmp/unix-tools'

    def __init__(self,script_name,install_script):
        self.script_name = script_name
        self.install_script = install_script

    def run(self):
        if files.exists(self.install_script):
            with cd(self.repobase):
                result = sudo(self.install_script + ' %s ' % env.host)
                if result.return_code != 0:
                    print(red('Error occured executing %s' %
                              self.install_script))
                else:
                    print(green('%s executed successfully'))
        else:
            print(red('Error no such dir %s try running repo deploy script '
                      'to host %s' % (self.repobase, env.host)))
            raise SystemExit()
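
The traceback and the python -c output above are consistent with
modules.Fabexec naming the module Fabexec.py rather than the class of the
same name defined inside it. The self-contained sketch below reproduces that
situation with a stand-in module object; nothing in it touches fabric, and
the script path is a placeholder.

```python
import types


# Stand-in for the layout in the post: a module named Fabexec that
# contains a class which is also named Fabexec.
class Fabexec(object):
    def __init__(self, script_name, install_script):
        self.script_name = script_name
        self.install_script = install_script


fabexec_module = types.ModuleType("Fabexec")
fabexec_module.Fabexec = Fabexec

# Calling the module itself reproduces the error from the traceback.
try:
    fabexec_module("set_nic_buffers", "/tmp/example.sh")
except TypeError as exc:
    error_message = str(exc)

# Calling the class inside the module works.
f_exec = fabexec_module.Fabexec("set_nic_buffers", "/tmp/example.sh")
```

In the fabfile this would presumably correspond to calling
modules.Fabexec.Fabexec(...), or to importing the class directly with
"from modules.Fabexec import Fabexec" and calling Fabexec(...).
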


Re: Error in Import gv module

2013-04-22 Thread Andreas Perstinger

You are still top posting.

On 22.04.2013 14:43, Megha Agrawal wrote:

yes, I did. They said, gv module doesn't exist for windows.


Then I'm afraid you are out of luck.

Two possible alternatives:

1) Save your graph to a file and use the command line tools:
http://stackoverflow.com/a/12698636

2) Try some other Graphviz bindings. A quick search on PyPi gave me:
https://pypi.python.org/pypi/pygraphviz/1.1
https://pypi.python.org/pypi/pydot/1.0.28
https://pypi.python.org/pypi/yapgvb/1.2.0

Bye, Andreas


Re: Confusing Algorithm

2013-04-22 Thread Chris Angelico
On Tue, Apr 23, 2013 at 12:57 AM, Oscar Benjamin
oscar.j.benja...@gmail.com wrote:
 On 22 April 2013 13:56, Chris Angelico ros...@gmail.com wrote:
 There are other possible readings of the problem.

 I read it differently. I thought the threads would go 1-5-7-5-2.

I hadn't thought of that one, but agreed, that's also plausible, and
it results in an answer of 4. It's a stronger contender than the one I
posited, because the wording implies that there are multiple ways to
do it and you have to pick/find the best. Seems to me the problem's a
little under-specified, tbh.

ChrisA


Re: kbhit/getch python equivalent

2013-04-22 Thread Chris Angelico
On Mon, Apr 22, 2013 at 11:34 PM, alb alessandro.bas...@cern.ch wrote:
 I'm looking for a kbhit/getch equivalent in python in order to be able
 to stop my inner loop in a controlled way (communication with external
 hardware is involved and breaking it abruptly may cause unwanted errors
 on the protocol).

Catch KeyboardInterrupt and hit Ctrl-C.
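
A minimal sketch of that approach, with made-up names (step stands for one
exchange with the hardware, cleanup for whatever shuts the link down):

```python
def run_until_interrupted(step, cleanup):
    """Call step() until Ctrl-C arrives, then shut down in an orderly way."""
    completed = 0
    try:
        while True:
            step()
            completed += 1
    except KeyboardInterrupt:
        # Ctrl-C: fall through to the cleanup below instead of a traceback.
        pass
    finally:
        cleanup()
    return completed
```

One caveat: the interrupt can arrive in the middle of step(), so if a
half-finished exchange is a problem for the protocol, a signal handler that
only sets a flag checked between steps may be the safer variant.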

ChrisA


Re: Confusing Algorithm

2013-04-22 Thread Oscar Benjamin
On 22 April 2013 13:56, Chris Angelico ros...@gmail.com wrote:
 On Mon, Apr 22, 2013 at 10:39 PM, RBotha r...@ymond.co.za wrote:
 I'm facing the following problem:

 
 In a city of towerblocks, Spiderman can
 “cover” all the towers by connecting the
 first tower with a spider-thread to the top
 of a later tower and then to a next tower
 and then to yet another tower until he
 reaches the end of the city. Threads are
 straight lines and cannot intersect towers.
 Your task is to write a program that finds
 the minimal number of threads to cover all
 the towers. The list of towers is given as a
 list of single digits indicating their height.

 -Example:
 List of towers: 1 5 3 7 2 5 2
 Output: 4
 

 I'm not sure how a 'towerblock' could be defined. How square does a shape 
 have to be to qualify as a towerblock? Any help on solving this problem?

 First start by clarifying the problem. My reading of this is that
 Spiderman iterates over the towers, connecting his thread from one to
 the next, but only so long as the towers get shorter:

-Example:
List of towers: 1 5 3 7 2 5 2
Output: 4

 First thread
 1
 New thread
 5-3
 New thread
 7-2
 New thread
 5-2

 There are other possible readings of the problem.

I read it differently. I thought the threads would go 1-5-7-5-2.
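
Under that reading the task becomes a shortest-path search: a thread may run
from the top of tower i to the top of a later tower j if no tower in between
rises above the straight line joining them. The sketch below rests on several
assumptions the problem statement does not settle: towers have zero width and
sit at equal spacing, a thread may touch a tower top without "intersecting"
it, and the goal is simply to get from the first tower to the last.

```python
from collections import deque


def visible(heights, i, j):
    """True if a straight thread from top i to top j clears the towers between."""
    span = j - i
    for k in range(i + 1, j):
        # Thread height at position k, scaled by span to stay in integers.
        thread = heights[i] * span + (heights[j] - heights[i]) * (k - i)
        if thread < heights[k] * span:
            return False
    return True


def min_threads(heights):
    """Fewest threads from the first tower to the last (breadth-first search)."""
    last = len(heights) - 1
    dist = {0: 0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        if i == last:
            return dist[i]
        for j in range(i + 1, last + 1):
            if j not in dist and visible(heights, i, j):
                dist[j] = dist[i] + 1
                queue.append(j)
    return None  # only reached for an empty list


answer = min_threads([1, 5, 3, 7, 2, 5, 2])
```

For the example list this finds the route 1-5-7-5-2, which uses four threads.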


Oscar


Re: itertools.groupby

2013-04-22 Thread Neil Cerutti
On 2013-04-22, Oscar Benjamin oscar.j.benja...@gmail.com wrote:
 On 22 April 2013 15:24, Neil Cerutti ne...@norwich.edu wrote:

 Hrmmm, hoomm. Nobody cares for slicing any more.

 def headered_groups(lst, header):
     b = lst.index(header) + 1
     while True:
         try:
             e = lst.index(header, b)
         except ValueError:
             yield lst[b:]
             break
         yield lst[b:e]
         b = e+1

 This requires the whole file to be read into memory. Iterators
 are typically preferred over list slicing for sequential text
 file access since you can avoid loading the whole file at once.
 This means that you can process a large file while only using a
 constant amount of memory.

I agree, but since this application processes unknown-sized slices,
you have to build lists anyhow. I find slicing much more
convenient than accumulating in this case, but it's possibly a
tradeoff.

 with open('data.txt') as inputfile:
     for group in headered_groups(map(str.strip, inputfile),
                                  "Starting a new group"):
         print(group)

Thanks, that's a nice improvement.

-- 
Neil Cerutti


Re: itertools.groupby

2013-04-22 Thread Chris Angelico
On Tue, Apr 23, 2013 at 12:49 AM, Oscar Benjamin
oscar.j.benja...@gmail.com wrote:
 Iterators are
 typically preferred over list slicing for sequential text file access
 since you can avoid loading the whole file at once. This means that
 you can process a large file while only using a constant amount of
 memory.

And, perhaps even more importantly, allows you to pipe text in and
out. Obviously some operations (eg grep) lend themselves better to
this than do others (eg sort), but with this it ought at least to
output each group as it comes.

ChrisA


Re: Serial Port Issue

2013-04-22 Thread Phil Birkelbach
Have you tried 'port=20'?

The documentation says that the port numbering starts at zero.  I don't use 
Windows so I can't test it for you.

You could also try port='COM21'
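
If the zero-based numbering is what is going on here, the port that Device
Manager shows as COM21 would be index 20. A tiny helper that makes the
off-by-one explicit (com_name_to_index is a made-up name for this example,
not part of pyserial):

```python
def com_name_to_index(name):
    """Map a Windows port name such as 'COM21' to a zero-based index (20)."""
    upper = name.upper()
    if not upper.startswith("COM") or not upper[3:].isdigit():
        raise ValueError("not a COM port name: %r" % (name,))
    number = int(upper[3:])
    if number < 1:
        raise ValueError("COM port numbers start at 1: %r" % (name,))
    return number - 1


index = com_name_to_index("COM21")
```

The result could then be passed as port=index, assuming the installed
pyserial version accepts integer port numbers.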

Phil

On Apr 22, 2013, at 4:34 AM, chandan kumar wrote:

 Hi,
 
 I'm new to Python and trying to learn serial communication with it. In 
 the process I'm running into serial port issues. Please find the attached 
 COMPorttest.py file, and correct me if anything is wrong in the code. My code 
 always ends up in the exception handler. I noted down the COM port number from 
 the Windows Device Manager list. 
 
 Operating system: XP
 Python Ver: 2.5
 Pyserial: 2.5
 
 Even i tried from python shell passing below commands
 
 import serial
 ser=ser=serial.Serial(port=21,baudrate=9600)
 
 I observe below error on python shell
 
   File "C:\Python25\lib\serial\serialwin32.py", line 55, in open
     raise SerialException("could not open port: %s" % msg)
 SerialException: could not open port: (2, 'CreateFile', 'The system cannot 
 find the file specified.')
 
 Thanks in advance.
 
 Best Regards,
 Chandan.
 
 
 
 COMPortTest.py DeviceManager.PNG
 -- 
 http://mail.python.org/mailman/listinfo/python-list

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: List Count

2013-04-22 Thread Oscar Benjamin
On 22 April 2013 15:15, Blind Anagram blindanag...@nowhere.org wrote:
 On 22/04/2013 14:13, Steven D'Aprano wrote:
 On Mon, 22 Apr 2013 12:58:20 +0100, Blind Anagram wrote:

 I would be grateful for any advice people can offer on the fastest way
 to count items in a sub-sequence of a large list.

 I have a list of boolean values that can contain many hundreds of
 millions of elements for which I want to count the number of True values
 in a sub-sequence, one from the start up to some value (say hi).

 I am currently using:

sieve[:hi].count(True)

 but I believe this may be costly because it copies a possibly large part
 of the sieve.
[snip]

 But when using a sub-sequence, I do suffer a significant reduction in
 speed for a count when compared with count on the full list.  When the
 list is small enough not to cause memory allocation issues this is about
 30% on 100,000,000 items.  But when the list is 1,000,000,000 items, OS
 memory allocation becomes an issue and the cost on my system rises to
 over 600%.

Have you tried using numpy? I find that it reduces the memory required
to store a list of bools by a factor of 4 on my 32 bit system. I would
expect that to be a factor of 8 on a 64 bit system:

>>> import sys
>>> a = [True] * 100
>>> sys.getsizeof(a)
436
>>> import numpy
>>> a = numpy.ndarray(100, bool)
>>> sys.getsizeof(a)  # This does not include the data buffer
40
>>> a.nbytes
100

The numpy array also has the advantage that slicing does not actually
copy the data (as has already been mentioned). On this system slicing
a numpy array has a 40 byte overhead regardless of the size of the
slice.

 I agree that this is not a big issue but it seems to me a high price to
 pay for the lack of a sieve.count(value, limit), which I feel is a
 useful function (given that memoryview operations are not available for
 lists).

It would be very easy to subclass list and add this functionality in
cython if you decide that you do need a builtin method.


Oscar
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: List Count

2013-04-22 Thread Blind Anagram
On 22/04/2013 16:14, Oscar Benjamin wrote:

 On 22 April 2013 15:15, Blind Anagram blindanag...@nowhere.org wrote:
 On 22/04/2013 14:13, Steven D'Aprano wrote:
 On Mon, 22 Apr 2013 12:58:20 +0100, Blind Anagram wrote:

 I would be grateful for any advice people can offer on the fastest way
 to count items in a sub-sequence of a large list.

 I have a list of boolean values that can contain many hundreds of
 millions of elements for which I want to count the number of True values
 in a sub-sequence, one from the start up to some value (say hi).

 I am currently using:

sieve[:hi].count(True)

 but I believe this may be costly because it copies a possibly large part
 of the sieve.
 [snip]

 But when using a sub-sequence, I do suffer a significant reduction in
 speed for a count when compared with count on the full list.  When the
 list is small enough not to cause memory allocation issues this is about
 30% on 100,000,000 items.  But when the list is 1,000,000,000 items, OS
 memory allocation becomes an issue and the cost on my system rises to
 over 600%.
 
 Have you tried using numpy? I find that it reduces the memory required
 to store a list of bools by a factor of 4 on my 32 bit system. I would
 expect that to be a factor of 8 on a 64 bit system:
 
 import sys
 a = [True] * 100
 sys.getsizeof(a)
 436
 import numpy
 a = numpy.ndarray(100, bool)
 sys.getsizeof(a)  # This does not include the data buffer
 40
 a.nbytes
 100
 
 The numpy array also has the advantage that slicing does not actually
 copy the data (as has already been mentioned). On this system slicing
 a numpy array has a 40 byte overhead regardless of the size of the
 slice.
 
 I agree that this is not a big issue but it seems to me a high price to
 pay for the lack of a sieve.count(value, limit), which I feel is a
 useful function (given that memoryview operations are not available for
 lists).
 
 It would be very easy to subclass list and add this functionality in
 cython if you decide that you do need a builtin method.

Thanks Oscar, I'll take a look at this.

But I was really wondering if there was a simple solution that worked
without people having to add libraries to their basic Python installations.

As I have never tried building an extension with cython, I am inclined
to try this as a learning exercise if nothing else.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: List Count

2013-04-22 Thread Oscar Benjamin
On 22 April 2013 16:50, Blind Anagram blindanag...@nowhere.org wrote:

 It would be very easy to subclass list and add this functionality in
 cython if you decide that you do need a builtin method.
[snip]

 But I was really wondering if there was a simple solution that worked
 without people having to add libraries to their basic Python installations.

There are simple solutions and some have already been listed. You are
attempting to push your program to the limit of your hardware
capabilities and it's natural that in a high-level language you'll
often want special libraries for that.
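
One stdlib-only way to count a prefix without the copy made by
sieve[:hi] is itertools.islice. This is only a minimal sketch, assuming
the sieve is a plain list of bools; the helper name count_true_prefix
and the tiny example sieve are made up for illustration. The per-item
iteration may well be slower than list.count on a copy, so it is worth
timing both on real data.

```python
from itertools import islice

def count_true_prefix(sieve, hi):
    """Count the True values in sieve[0:hi] without building a slice copy."""
    # islice walks the first `hi` items lazily; each True counts as 1 in sum().
    return sum(islice(sieve, hi))

# Illustrative data only, not a real prime sieve.
sieve = [False, False, True, True, False, True, False, True, False, False]
count = count_true_prefix(sieve, 6)
```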

I don't know what your application is but I would say that my first
port of call here would be to consider a different algorithmic
approach. An obvious question would be about the sparsity of this data
structure. How frequent are the values that you are trying to count?
Would it make more sense to store a list of their indices?

If the problem needs to be solved the way that you are currently doing
it and the available methods are not fast enough then you will need to
consider additional libraries.


 As I have never tried building an extension with cython, I am inclined
 to try this as a learning exercise if nothing else.

I definitely recommend this over writing a C extension directly.


Oscar
-- 
http://mail.python.org/mailman/listinfo/python-list


How to connect to a website?

2013-04-22 Thread webmaster
Hi,
i just try to connect to a website, read that page and display the rules get 
from it.
Then i get this error message:

  File "D:/Python/Py projects/socket test/sockettest.py", line 21, in <module>
    fileobj.write("GET "+filename+" HTTP/1.0\n\n")
io.UnsupportedOperation: not writable

My code:

# import sys for handling command line argument 
# import socket for network communications 
import sys, socket 
 
# hard-wire the port number for safety's sake 
# then take the names of the host and file from the command line 
port = 80 
host = 'www..nl' 
filename = 'index.php' 

# create a socket object called 'c' 
c = socket.socket(socket.AF_INET, socket.SOCK_STREAM) 

# connect to the socket 
c.connect((host, port)) 

# create a file-like object to read 
fileobj = c.makefile('r', 1024) 

# Ask the server for the file 
fileobj.write("GET "+filename+" HTTP/1.0\n\n") 

# read the lines of the file object into a buffer, buff 
buff = fileobj.readlines() 

# step through the buffer, printing each line 
for line in buff: 
    print (line)


I started with invent games with python (book 1 & 2).
Now I want to write a multiplayer game, which connects to a website, where all 
players and game data will be stored/controlled.
Players need to subscribe and to log in via the game software (an executable made 
from the python script).
Sending game data preferably in JSON, because that keeps the traffic low.
I have no idea how the authentication process should be done.

I made many searches with Google, but got confused about my first steps. 
I am new to python, but have coded for many years in php/mysql. I spent most of that time on an 
online chess game project.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: error importing modules

2013-04-22 Thread MRAB

On 22/04/2013 15:54, Rodrick Brown wrote:

I'm using the fabric api (fabfile.org http://fabfile.org)

  I’m executing my fab script like the following:

$ fab -H server set_nic_buffers  -f set_nic_buffers.py
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/fabric/main.py", line 739, in main
    *args, **kwargs
  File "/usr/lib/python2.7/site-packages/fabric/tasks.py", line 316, in execute
    multiprocessing
  File "/usr/lib/python2.7/site-packages/fabric/tasks.py", line 213, in _execute
    return task.run(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/fabric/tasks.py", line 123, in run
    return self.wrapped(*args, **kwargs)
  File "/home/rbrown/repos/unix-tools/tools/fabfiles/nb.py", line 5, in set_nic_buffers
    f_exec = modules.Fabexec('set_nic_buffers', '/var/tmp/unix-tools/tools/set_nic_buffers.sh')
TypeError: 'module' object is not callable

  My paths all seem to be fine not sure what’s going on

$ python -c 'import modules.Fabexec; print (modules.Fabexec)'
<module 'modules.Fabexec' from 'modules/Fabexec.pyc'>

Fabfiles

|-- modules
|   |-- Fabexec.py
|   |-- Fabexec.pyc
|   |-- __init__.py
|   `-- __init__.pyc
|-- systune.py
|-- systune.pyc
`-- set_nic_buffers.py

--- set_nic_buffers.py ---
import modules
from modules import Fabexec
def set_nic_buffers():
    f_exec = modules.Fabexec('set_nic_buffers',
        '/var/tmp/unix-tools/tools/set_nic_buffers.sh')
    f_exec.run()

'modules.Fabexec' is the module/script 'Fabexec'. What you want is the 
'Fabexec' class within the 'Fabexec' module.
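
To illustrate the difference, here is a minimal sketch that runs without
fabric. The module object is built by hand with types.ModuleType as a
stand-in for the real modules/Fabexec.py, the class is cut down to its
constructor, and the script path is a placeholder, so all of that is an
assumption made for the example.

```python
import types

class Fabexec(object):
    """Cut-down stand-in for the class defined inside Fabexec.py."""
    def __init__(self, script_name, install_script):
        self.script_name = script_name
        self.install_script = install_script

# Hypothetical stand-in for the module object that the name
# modules.Fabexec refers to: a module holding a class of the same name.
fabexec_module = types.ModuleType("Fabexec")
fabexec_module.Fabexec = Fabexec

# Calling the module itself reproduces the error from the traceback.
try:
    fabexec_module('set_nic_buffers', '/var/tmp/set_nic_buffers.sh')
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Calling the class inside the module works.
f_exec = fabexec_module.Fabexec('set_nic_buffers', '/var/tmp/set_nic_buffers.sh')
```

In the fabfile itself that would presumably mean either
modules.Fabexec.Fabexec(...) or "from modules.Fabexec import Fabexec"
followed by Fabexec(...).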



--- Fabexec.py ---
from fabric.api import run, cd, sudo, env
from fabric.contrib import files
from fabric.colors import green, red

class Fabexec(object):

    repobase='/var/tmp/unix-tools'

    def __init__(self,script_name,install_script):
        self.script_name = script_name
        self.install_script = install_script

    def run(self):
        if files.exists(self.install_script):
            with cd(self.repobase):
                result = sudo(self.install_script + ' %s ' % env.host)
                if result.return_code != 0:
                    print(red('Error occurred executing %s' %
                              self.install_script))
                else:
                    print(green('%s executed successfully' %
                                self.install_script))
        else:
            print(red('Error no such dir %s try running repo deploy '
                      'script to host %s' % (self.repobase, env.host)))
            raise SystemExit()




--
http://mail.python.org/mailman/listinfo/python-list


Re: List Count

2013-04-22 Thread Blind Anagram
On 22/04/2013 17:06, Oscar Benjamin wrote:
 On 22 April 2013 16:50, Blind Anagram blindanag...@nowhere.org wrote:

 It would be very easy to subclass list and add this functionality in
 cython if you decide that you do need a builtin method.
 [snip]

 But I was really wondering if there was a simple solution that worked
 without people having to add libraries to their basic Python installations.
 
 There are simple solutions and some have already been listed. You are
 attempting to push your program to the limit of your hardware
 capabilities and it's natural that in a high-level language you'll
 often want special libraries for that.

Hi Oscar

Yes, but it is a tribute to Python that I can do this quite fast for
huge lists provided that I only count on the full list.

And, unless I have completely misunderstood Python internals, it would
probably be just as fast on a sub-sequence if I had a list.count(value,
limit) function (however, I admit that I could be wrong here since the
fact that count on lists does not offer this may mean that it is not as
easy to implement as it might seem).
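
For what it is worth, bytes and bytearray objects do have a count
method that takes start and end arguments, so one stdlib-only option
might be to keep the sieve in a bytearray rather than a list. The
following is just a sketch under that assumption; make_sieve and the
limit of 100 are made up for illustration, and whether it is fast
enough at a billion entries would need measuring.

```python
def make_sieve(n):
    """Return a bytearray where sieve[i] == 1 exactly when i is prime (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"          # 0 and 1 are not prime
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Zero out the multiples of i, starting at i*i.
            multiples = len(range(i * i, n + 1, i))
            sieve[i * i::i] = bytearray(multiples)
    return sieve

sieve = make_sieve(100)
hi = 20
# Counts the 1 bytes in sieve[0:hi] without creating a slice copy.
primes_below_hi = sieve.count(1, 0, hi)
```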

 I don't know what your application is but I would say that my first
 port of call here would be to consider a different algorithmic
 approach. An obvious question would be about the sparsity of this data
 structure. How frequent are the values that you are trying to count?
 Would it make more sense to store a list of their indices?

Actually it is no more than a simple prime sieve implemented as a Python
class (and, yes, I realize that there are plenty of these around).

 If the problem needs to be solved the way that you are currently doing
 it and the available methods are not fast enough then you will need to
 consider additional libraries.

 As I have never tried building an extension with cython, I am inclined
 to try this as a learning exercise if nothing else.
 
 I definitely recommend this over writing a C extension directly.

Thanks again - I will definitely look at this.

   Brian

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to connect to a website?

2013-04-22 Thread John Gordon
In 566767a8-35cc-47f2-9f75-032ce5629...@googlegroups.com 
webmas...@terradon.nl writes:

 Hi,
 i just try to connect to a website, read that page and display the rules get 
 from it.
 Then i get this error message:

   File "D:/Python/Py projects/socket test/sockettest.py", line 21, in <module>
     fileobj.write("GET "+filename+" HTTP/1.0\n\n")
 io.UnsupportedOperation: not writable

I haven't worked with the socket library, but I think this error is because
you specified a mode of 'r' when calling makefile().  fileobj is read-only,
and you're trying to write to it.
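
A small sketch of that diagnosis, for Python 3. It uses
socket.socketpair() as a local stand-in for the connection to the web
server, so nothing here touches the network, and the request path
/index.php is only a placeholder. Opening the file object with mode
'rw' is one possible fix; it is not necessarily what the original
script should end up doing.

```python
import io
import socket

# Two already-connected local sockets; 'server' plays the web server's side.
client, server = socket.socketpair()

# A read-only file object, as in the original script: writing to it fails.
reader = client.makefile('r')
try:
    reader.write("GET /index.php HTTP/1.0\r\n\r\n")
    read_only_error = None
except io.UnsupportedOperation as exc:
    read_only_error = str(exc)

# Opened with 'rw', the file object accepts writes as well as reads.
# newline='' keeps the line endings exactly as written.
stream = client.makefile('rw', newline='')
stream.write("GET /index.php HTTP/1.0\r\n\r\n")
stream.flush()  # push the buffered request out to the socket

request_seen_by_server = server.recv(1024).decode('ascii')

reader.close()
stream.close()
client.close()
server.close()
```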

If you just want to connect to a website, try using the urllib2 module
instead of socket.  It's higher-level and handles a lot of details for
you.  Here's an example:

import urllib2

request = urllib2.Request('http://www.voidspace.org.uk')
response = urllib2.urlopen(request)
content = response.readlines()

-- 
John Gordon   A is for Amy, who fell down the stairs
gor...@panix.com  B is for Basil, assaulted by bears
-- Edward Gorey, The Gashlycrumb Tinies

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to connect to a website?

2013-04-22 Thread MRAB

On 22/04/2013 17:16, webmas...@terradon.nl wrote:

Hi,
i just try to connect to a website, read that page and display the rules get 
from it.
Then i get this error message:

  File "D:/Python/Py projects/socket test/sockettest.py", line 21, in <module>
    fileobj.write("GET "+filename+" HTTP/1.0\n\n")
io.UnsupportedOperation: not writable

My code:

# import sys for handling command line argument
# import socket for network communications
import sys, socket

# hard-wire the port number for safety's sake
# then take the names of the host and file from the command line
port = 80
host = 'www..nl'
filename = 'index.php'

# create a socket object called 'c'
c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# connect to the socket
c.connect((host, port))

# create a file-like object to read
fileobj = c.makefile('r', 1024)


You're creating a file-like object for reading...


# Ask the server for the file
fileobj.write("GET "+filename+" HTTP/1.0\n\n")


...and then trying to write to it.


# read the lines of the file object into a buffer, buff
buff = fileobj.readlines()

# step through the buffer, printing each line
for line in buff:
    print (line)


[snip]

--
http://mail.python.org/mailman/listinfo/python-list


Python Developer Needed in Ottawa

2013-04-22 Thread alika . resumes
Python Programmer needed in Ottawa, Ontario Canada.

Must be eligible to work in Canada and preferably already in Ottawa with a 
security clearance in place.

Phone Al at (613) 425-1634
 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: There must be a better way

2013-04-22 Thread Colin J. Williams

On 22/04/2013 10:42 AM, Neil Cerutti wrote:

On 2013-04-21, Colin J. Williams c...@ncf.ca wrote:

On 20/04/2013 9:07 PM, Terry Jan Reedy wrote:

On 4/20/2013 8:34 PM, Tim Chase wrote:

In 2.x, the csv.reader() class (and csv.DictReader() class) offered
a .next() method that is absent in 3.x


In Py 3, .next was renamed to .__next__ for *all* iterators. The
intention is that one iterate with for item in iterable or use builtin
functions iter() and next().
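
As a minimal sketch of that advice: the builtin next() exists in both
Python 2.6+ and 3.x, so the header line can be read the same way in
either version. The sample rows are made up, and a list of strings
stands in for an open file.

```python
import csv

# Any iterable of strings works as input for csv.reader.
lines = ["name,qty", "apple,3", "pear,5"]

reader = csv.reader(lines)
header = next(reader)   # same spelling in Python 2.6+ and 3.x
rows = list(reader)     # the remaining rows, header already consumed
```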



Thanks to Chris, Tim and Terry for their helpful comments.

I was seeking some code that would be acceptable to both Python 2.7 and 3.3.

In the end, I used:

inData= csv.reader(inFile)

def main():
    if ver == '2':
        headerLine= inData.next()
    else:
        headerLine= inData.__next__()
    ...
    for item in inData:
        assert len(dataStore) == len(item)
        j= findCardinal(item[10])
        ...

This is acceptable to both versions.

It is not usual to have a name with preceding and following
underscores in user code.

Presumably, there is a rationale for the change from csv.reader.next
to csv.reader.__next__.

If next is not acceptable for the version 3 csv.reader, perhaps __next__
could be added to the version 2 csv.reader, so that the same code can be
used in the two versions.

This would avoid the kluge I used above.


Would using csv.DictReader instead of csv.reader be an option?

Since I'm only interested in one or two columns, the simpler approach is 
probably better.


Colin W.
--
http://mail.python.org/mailman/listinfo/python-list


Re: List Count

2013-04-22 Thread Skip Montanaro
 But I was really wondering if there was a simple solution that worked
 without people having to add libraries to their basic Python installations.

I think installing numpy is approximately

pip install numpy

assuming you have write access to your site-packages directory.  If
not, install using --prefix and set PYTHONPATH accordingly.

I forgot that Python also has an array module.  With numpy available,
mature, and well-supported, I imagine it doesn't get much love these
days though.  Still, I gave it a whirl:

###
import random
import array
from timeit import Timer

import numpy

stuff = [random.random() < 0.5 for i in range(10**7)]
sieve1 = numpy.array(stuff, dtype=bool)
sieve2 = array.array('B', stuff)

setup = """from __main__ import sieve1, sieve2
from itertools import islice
hi = 7*10**6
"""

t1 = Timer("(True == sieve1[:hi]).sum()", setup)
t2 = Timer("sieve2[:hi].count(True)", setup)
# t3 = Timer("sum(islice(sieve, hi))", setup)
# t4 = Timer("sum(x for x in islice(sieve, hi) if x)", setup)
# t5 = Timer("sum(x for x in islice(sieve, hi) if x is True)", setup)
# t6 = Timer("sum(1 for x in islice(sieve, hi) if x is True)", setup)
# t7 = Timer("len(list(filter(None, islice(sieve, hi))))", setup)

print(min(t1.repeat(number=10)))
print(min(t2.repeat(number=10)))
# for t in (t1, t2, t3, t4, t5, t6, t7):
#     print( min(t.repeat(number=10)) )
###

Performance was not all that impressive:

0.340315103531
5.42102503777

Still, you might fiddle around with it a bit.  Perhaps unsigned ints
instead of unsigned bytes will provide more efficient counting...

Skip
-- 
http://mail.python.org/mailman/listinfo/python-list


Lists and arrays

2013-04-22 Thread Ana Dionísio
Hello!

I need your help!

I have an array and I need pick some data from that array and put it in a list, 
for example:

array= [a,b,c,1,2,3]

list=array[0]+ array[3]+ array[4]

list: [a,1,2]

When I do it like this: list=array[0]+ array[3]+ array[4] I get an error:

TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 
'numpy.ndarray'

Can you help me?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Lists and arrays

2013-04-22 Thread Dave Angel

On 04/22/2013 02:13 PM, Ana Dionísio wrote:

Hello!

I need your help!

I have an array


I think you mean you have a numpy array, which is very different than a 
python array.array




and I need pick some data from that array and put it in a list, for example:

array= [a,b,c,1,2,3]


That's a list.



list=array[0]+ array[3]+ array[4]


Nothing wrong with that, other than that you just hid the name of the 
list type, making it tricky to later convert things to lists.




list: [a,1,2]


You'll never get that.  When you assign an object to a list, the object 
itself is referenced in that list, not the name that it happened to have 
before.  So if a was an object of type float and value 41.5, then you 
presumably want:

   mylist: [41.5, 1, 2]




When I do it like this: list=array[0]+ array[3]+ array[4] I get an error:

TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 
'numpy.ndarray'



Apparently you did not use the line
array= [a,b,c,1,2,3]

as you said above, but some other assignment, perhaps using a numpy 
method or six.  Worse, apparently the elements of that collection aren't 
simple numbers but some kind of numpy thingies as well.


If you show what you actually did, probably someone here can help, 
though the more numpy you use, the less likely that it'll be me.



If you really had a list, you wouldn't have gotten an error, but neither 
would you have gotten anything like you're asking.  array[3] + array[4] 
== 1+2 == 3.  If you're trying to make a list using + from a subscripted 
list, you'd have to enclose each integer in square brackets.


mylist = [array[0]] + [array[3]] + [array[4]]

Alternatively, you could just do

mylist = [ array[0], array[3], array[4] ]

--
DaveA
--
http://mail.python.org/mailman/listinfo/python-list


Re: Encoding NaN in JSON

2013-04-22 Thread Wayne Werner

On Sat, 20 Apr 2013, Chris “Kwpolska” Warrick wrote:


On Fri, Apr 19, 2013 at 9:42 PM, Grant Edwards invalid@invalid.invalid wrote:

The OP asked for a string, and I thought you were proposing the string
'null'.  If one is to use a string, then 'NaN' makes the most sense,
since it can be converted back into a floating point NaN object.

I infer that you were proposing a JSON null value and not the string
'null'?


Not me, Wayne Werner proposed to use the JSON null value.  I parsed
the backticks (`) used by him as a way to delimit it from text and not
as a string.


That was, in fact, my intention. Though it seems to me that you'll have to 
suffer between some sort of ambiguity - in Chrome, at least, 
`Number(null)` evaluates to `0` instead of NaN. But `Number('Whatever')` 
evaluates to NaN. However, a JSON parser obviously wouldn't be able to 
make the semantic distinction, so I think you'll be left with whichever 
API makes the most sense to you:


NaN maps to null

   or

NaN maps to NaN (or any other string, really)


Obviously you're not limited to these particular choices, but they're 
probably the easiest to implement and communicate.
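
For the first option, a minimal Python sketch might look like this.
The helper nan_to_none and the sample data are made up for
illustration. Note that the stdlib json module by default writes a bare
NaN token, which is not valid JSON, so some conversion like this is
needed if strict parsers are on the other end.

```python
import json
import math

def nan_to_none(value):
    """Return a copy of value with every float NaN replaced by None."""
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, dict):
        return {key: nan_to_none(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [nan_to_none(item) for item in value]
    return value

data = {"temp": float("nan"), "readings": [1.5, float("nan")]}

default_output = json.dumps(float("nan"))          # the non-standard token
encoded = json.dumps(nan_to_none(data), sort_keys=True)
decoded = json.loads(encoded)                      # null comes back as None
```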


HTH,
-W
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: List Count

2013-04-22 Thread Blind Anagram
On 22/04/2013 18:48, Skip Montanaro wrote:
 But I was really wondering if there was a simple solution that worked
 without people having to add libraries to their basic Python installations.
 
 I think installing numpy is approximately
 
 pip install numpy
 
 assuming you have write access to your site-packages directory.  If
 not, install using --prefix and set PYTHONPATH accordingly.
 
 I forgot that Python also has an array module.  With numpy available,
 mature, and well-supported, I imagine it doesn't get much love these
 days though.  Still, I gave it a whirl:
 
 ###
 import random
 import array
 from timeit import Timer
 
 import numpy
 
 stuff = [random.random() < 0.5 for i in range(10**7)]
 sieve1 = numpy.array(stuff, dtype=bool)
 sieve2 = array.array('B', stuff)
 
 setup = """from __main__ import sieve1, sieve2
 from itertools import islice
 hi = 7*10**6
 """
 
 t1 = Timer("(True == sieve1[:hi]).sum()", setup)
 t2 = Timer("sieve2[:hi].count(True)", setup)
 # t3 = Timer("sum(islice(sieve, hi))", setup)
 # t4 = Timer("sum(x for x in islice(sieve, hi) if x)", setup)
 # t5 = Timer("sum(x for x in islice(sieve, hi) if x is True)", setup)
 # t6 = Timer("sum(1 for x in islice(sieve, hi) if x is True)", setup)
 # t7 = Timer("len(list(filter(None, islice(sieve, hi))))", setup)
 
 print(min(t1.repeat(number=10)))
 print(min(t2.repeat(number=10)))
 # for t in (t1, t2, t3, t4, t5, t6, t7):
 #     print( min(t.repeat(number=10)) )
 ###
 
 Performance was not all that impressive:
 
 0.340315103531
 5.42102503777
 
 Still, you might fiddle around with it a bit.  Perhaps unsigned ints
 instead of unsigned bytes will provide more efficient counting...

I spent a lot of time comparing python arrays and lists but found that
lists were always much faster in this application.

I do have numpy installed but I remember that when I did this (some time
ago) it was far from easy with Python 3.x running natively on Windows x64.

  Brian

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to connect to a website?

2013-04-22 Thread webmaster
thanks!
solved with:

import urllib.request
import urllib.parse

user = 'user'
pw = 'password'

login_url = 'http://www.riskopoly.nl/test/index.php'

data = urllib.parse.urlencode({'user': user, 'pw': pw})
data = data.encode('utf-8')
# adding charset parameter to the Content-Type header.
request = urllib.request.Request(login_url)
request.add_header("Content-Type", "application/x-www-form-urlencoded;charset=utf-8")

f = urllib.request.urlopen(request, data)
print(f.read().decode('utf-8'))

And then i get next answer:
<pre>Array
(
    [pw] => password
    [user] => user
)
</pre>

Solved and thanks again:)
-- 
http://mail.python.org/mailman/listinfo/python-list


How to get JSON values and how to trace sessions??

2013-04-22 Thread webmaster
Hi all, 
from python I post data to a webpage using urllib and can print that content.
See code below.
 
But now i am wondering how to trace sessions? It is needed for a multiplayer 
game, connected to a webserver. How do i trace a PHP-session? I suppose i have 
to save a cookie with the sessionID from the webserver? Is this possible with 
Python? Are there other ways to keep control over which player sends the 
gamedata?

Secondly, can i handle JSON values? I know how to create them serverside, but 
how do i handle that response in python?

Thank you very much for any answer!


Code:
import urllib.request
import urllib.parse

user = 'user'
pw = 'password'

login_url = 'http://www..nl/test/index.php'

data = urllib.parse.urlencode({'user': user, 'pw': pw})
data = data.encode('utf-8')
# adding charset parameter to the Content-Type header.
request = urllib.request.Request(login_url)
request.add_header("Content-Type", "application/x-www-form-urlencoded;charset=utf-8")

f = urllib.request.urlopen(request, data)
print(f.read().decode('utf-8'))



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: List Count

2013-04-22 Thread Oscar Benjamin
On 22 April 2013 17:38, Blind Anagram blindanag...@nowhere.org wrote:
 On 22/04/2013 17:06, Oscar Benjamin wrote:

 I don't know what your application is but I would say that my first
 port of call here would be to consider a different algorithmic
 approach. An obvious question would be about the sparsity of this data
 structure. How frequent are the values that you are trying to count?
 Would it make more sense to store a list of their indices?

 Actually it is no more than a simple prime sieve implemented as a Python
 class (and, yes, I realize that there are plenty of these around).

If I understand correctly, you have a list of roughly a billion
True/False values indicating which integers are prime and which are
not. You would like to discover how many prime numbers there are
between two numbers a and b. You currently do this by counting the
number of True values in your list between the indices a and b.

If my description is correct then I would definitely consider using a
different algorithmic approach. The density of primes from 1 to 1
billion is about 5%. Storing the prime numbers themselves in a sorted
list would save memory and allow a potentially more efficient way of
counting the number of primes within some interval.

To see how it saves memory (on a 64 bit system):

$ python
Python 2.7.3 (default, Sep 26 2012, 21:51:14)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> a = ([True] + [False]*19) * 50000
>>> len(a)
1000000
>>> sys.getsizeof(a)
8000072
>>> a = list(range(50000))
>>> sys.getsizeof(a)
450120
>>> sum(sys.getsizeof(x) for x in a)
1200000

So you're using about 1/5th of the memory with a list of primes
compared to a list of True/False values. Further savings would be
possible if you used an array to store the primes as 64 bit integers.
In this case it would take about 400MB to store all the primes up to 1
billion.

The more efficient way of counting the primes would then be to use the
bisect module. This gives you a way of counting the primes between a
and b with a cost that is logarithmic in the total number of primes
stored rather than linear in the size of the range (e.g. b-a). For
large enough primes/ranges this is certain to be faster. Whether it
actually works that way for your numbers I can't say.
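
A minimal sketch of the bisect idea, on a small scale. The helper
names primes_up_to and count_primes_between are made up, the sieve is
a deliberately simple one, and the limit of 100 is just for
illustration. The count treats both ends of the interval as inclusive.

```python
from bisect import bisect_left, bisect_right

def primes_up_to(n):
    """Return a sorted list of the primes <= n (simple sieve, n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def count_primes_between(primes, a, b):
    """Count the primes p with a <= p <= b using two binary searches."""
    return bisect_right(primes, b) - bisect_left(primes, a)

primes = primes_up_to(100)
between_10_and_20 = count_primes_between(primes, 10, 20)   # 11, 13, 17, 19
```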


Oscar
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: kbhit/getch python equivalent

2013-04-22 Thread woooee
 I'm looking for a kbhit/getch equivalent in python in order to be able to 
 stop my inner loop in a controlled way (communication with external hardware 
 is involved and breaking it abruptly may cause unwanted errors

A curses example

import curses

stdscr = curses.initscr()
curses.cbreak()
stdscr.keypad(1)

stdscr.addstr(0,10,"Hit 'q' to quit ")
stdscr.refresh()

key = ''
while key != ord('q'):
    key = stdscr.getch()
    stdscr.addch(20,25,key)
    stdscr.refresh()

curses.endwin()
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Confusing Algorithm

2013-04-22 Thread Christian Gollwitzer

Am 22.04.13 16:57, schrieb Oscar Benjamin:

On 22 April 2013 13:56, Chris Angelico ros...@gmail.com wrote:

On Mon, Apr 22, 2013 at 10:39 PM, RBotha r...@ymond.co.za wrote:



Threads are
straight lines and cannot intersect towers.
Your task is to write a program that finds
the minimal number of threads to cover all
the towers.



-Example:
List of towers: 1 5 3 7 2 5 2
Output: 4




I read it differently. I thought the threads would go 1-5-7-5-2.




I'd agree with your interpretation. Threads are straight lines and 
cannot intersect towers - I read it such that the answer is the convex 
hull of the set of points given by the tower height. The convex hull 
can be computed for this 1D problem by initializing with
 line segments between every point and repeatedly pulling up every 
non-convex piece, if I'm not mistaken.
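
Under that reading, here is a rough sketch. It assumes the towers
stand at equally spaced positions 0, 1, 2, ... and that a thread may
touch the top of a tower it passes over; neither assumption is
confirmed by the part of the problem statement quoted here. Instead of
the repeated pulling-up described above it uses a single left-to-right
scan with a stack (the upper half of the usual monotone-chain hull),
which should give the same hull. The function names are made up.

```python
def cross(o, a, b):
    """Cross product of OA and OB; positive means a left (upward) turn at a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def count_threads(heights):
    """Number of straight segments in the upper hull over the tower tops."""
    hull = []
    for point in enumerate(heights):      # (position, height) pairs
        # Drop the last hull point while it lies on or below the line
        # from the point before it to the new point.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], point) >= 0:
            hull.pop()
        hull.append(point)
    return len(hull) - 1

threads = count_threads([1, 5, 3, 7, 2, 5, 2])
```

For the example list this keeps the tops 1-5-7-5-2, which matches the
quoted output of 4.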


Christian

--
http://mail.python.org/mailman/listinfo/python-list


Re: How to get JSON values and how to trace sessions??

2013-04-22 Thread Chris Angelico
On Tue, Apr 23, 2013 at 6:09 AM,  webmas...@terradon.nl wrote:
 But now i am wondering how to trace sessions? it is needed for a multiplayer 
 game, connected to a webserver. How do i trace a PHP-session? I suppose i 
 have to save a cookie with the sessionID from the webserver? Is this possible 
 with Python? Are their other ways to keep control over which players sends 
 the gamedata?

 Secondly, can i handle JSON values? I know how to create them serverside, but 
 how do i handle that response in python?

Python has a JSON module that should do what you want:
http://docs.python.org/3.3/library/json.html

I don't know the details of cookie handling in Python, but this looks
to be what you want:

http://docs.python.org/3.3/library/http.cookiejar.html#http.cookiejar.CookieJar
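
A minimal sketch of how the two might fit together. The JSON text
below is a made-up stand-in for what the server would send back, and
no request is actually made here; the opener is only constructed. The
assumption is that the PHP session ID arrives as an ordinary cookie,
in which case an opener built with HTTPCookieProcessor should store it
in the jar and send it back on later requests made through the same
opener.

```python
import json
import http.cookiejar
import urllib.request

# Hypothetical response body; a real one would come from f.read().decode('utf-8').
response_text = '{"player": "user", "score": 42, "moves": ["e4", "e5"]}'
game_state = json.loads(response_text)     # JSON object -> Python dict

# Cookie handling: keep one jar and one opener for the whole game session.
cookie_jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(cookie_jar))

# Later requests would go through opener.open(request, data) instead of
# urllib.request.urlopen(request, data); that call is left out here
# because it needs a live server.
cookies_so_far = len(cookie_jar)           # nothing stored before any request
```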

Tip: The Python docs can be searched very efficiently with a web
search (eg Google, Bing, DuckDuckGo, etc). Just type python and
whatever it is you want - chances are you'll get straight there.

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: List Count

2013-04-22 Thread Oscar Benjamin
On 22 April 2013 21:18, Oscar Benjamin oscar.j.benja...@gmail.com wrote:
 On 22 April 2013 17:38, Blind Anagram blindanag...@nowhere.org wrote:
 On 22/04/2013 17:06, Oscar Benjamin wrote:

 I don't know what your application is but I would say that my first
 port of call here would be to consider a different algorithmic
 approach. An obvious question would be about the sparsity of this data
 structure. How frequent are the values that you are trying to count?
 Would it make more sense to store a list of their indices?

 Actually it is no more than a simple prime sieve implemented as a Python
 class (and, yes, I realize that there are plenty of these around).

 If I understand correctly, you have a list of roughly a billion
 True/False values indicating which integers are prime and which are
 not. You would like to discover how many prime numbers there are
 between two numbers a and b. You currently do this by counting the
 number of True values in your list between the indices a and b.

 If my description is correct then I would definitely consider using a
 different algorithmic approach. The density of primes from 1 to 1
 billion is about 5%. Storing the prime numbers themselves in a sorted
 list would save memory and allow a potentially more efficient way of
 counting the number of primes within some interval.

In fact it is probably quicker if you don't mind using all that memory
to just store the cumulative sum of your prime True/False indicator
list. This would be the prime counting function pi(n). You can then
count the primes between a and b in constant time with pi[b] - pi[a].
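
A small sketch of the cumulative-sum idea, assuming Python 3 and a sieve small enough to demonstrate. The helper names are made up for illustration. Note that pi[b] - pi[a] counts the primes p with a < p <= b.

```python
from itertools import accumulate

def make_sieve(n):
    """Return a list where sieve[i] is True exactly when i is prime (n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Strike out the multiples of i, starting at i*i.
            sieve[i * i::i] = [False] * len(range(i * i, n + 1, i))
    return sieve

sieve = make_sieve(100)

# pi[n] is the number of primes <= n (a running total of the indicator list).
pi = list(accumulate(int(flag) for flag in sieve))

def primes_between(a, b):
    """Count primes p with a < p <= b in constant time."""
    return pi[b] - pi[a]
```

The price is one integer per sieve entry, so this trades memory for query speed.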


Oscar


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Rui Maciel
lcrocker wrote:

 I'm a programmer, I installed Tkinter, and use it. I'd like to deploy
 programs written with it to others.  **Those** people know nothing 
 about it, and **shouldn't have to**.

They don't need to.  The only person that needs to know what he is doing is 
you.  You want to distribute a software package?  Package it.  Learn the 
very basics and set python3-tk (or python-tk for Python 2) as a dependency.

http://wiki.debian.org/Packaging


Rui Maciel



Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Rui Maciel
Steven D'Aprano wrote:

 I think that if you are worrying about the overhead of the tkinter
 bindings for Python, you're guilty of premature optimization.

I'm not worried about that.  No one should be forced to install crap that 
they don't use and will never need, no matter how great the average HD 
capacity is nowadays.


Rui Maciel


Re: Lists and arrays

2013-04-22 Thread BartC



Ana Dionísio anadionisio...@gmail.com wrote in message 
news:de1cc79e-cbf7-4b0b-ae8e-18841a1ef...@googlegroups.com...

Hello!

I need your help!

I have an array and I need pick some data from that array and put it in a 
list, for example:


array= [a,b,c,1,2,3]

list=array[0]+ array[3]+ array[4]

list: [a,1,2]

When I do it like this: list=array[0]+ array[3]+ array[4] I get an error:

TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 
'numpy.ndarray'


You're calculating a+1+2. Probably a isn't something that can be added to 
1+2.
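
A minimal sketch of what was probably intended, using a plain Python list (the values are made up to match the question). Building a new list from selected elements means putting them inside brackets rather than adding them together. With a one-dimensional NumPy array, indexing with a list of positions such as arr[[0, 3, 4]] selects the same elements.

```python
array = ["a", "b", "c", 1, 2, 3]

# Collect the chosen elements into a new list instead of adding them.
picked = [array[0], array[3], array[4]]

# The same selection driven by a list of indices.
indices = [0, 3, 4]
picked_again = [array[i] for i in indices]
```

Naming the result something other than list also avoids shadowing the built-in type.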


--
Bartc 




Re: List Count

2013-04-22 Thread Blind Anagram
On 22/04/2013 21:18, Oscar Benjamin wrote:
 On 22 April 2013 17:38, Blind Anagram blindanag...@nowhere.org wrote:

[snip]

 If my description is correct then I would definitely consider using a
 different algorithmic approach. The density of primes from 1 to 1
 billion is about 5%. Storing the prime numbers themselves in a sorted
 list would save memory and allow a potentially more efficient way of
 counting the number of primes within some interval.

That is correct but I need to say that the lengths I have been
describing are limiting cases - almost all of the time the sieve length
will be quite small.

But I was still interested to see if I could push the limit without
changing the essential simplicity of the sieve.

And here the cost of creating the slice (which I have measured) set me
wondering why a list.count(value, limit) function did not exist.

I also wondered whether I had missed any obvious way of avoiding the
slicing cost (intellectually it seemed wrong to me to have to copy the
list in order to count items within it).
[snip]

 
 So you're using about 1/5th of the memory with a list of primes
 compared to a list of True/False values. Further savings would be
 possible if you used an array to store the primes as 64 bit integers.
 In this case it would take about 400MB to store all the primes up to 1
 billion.

I have looked at solutions based on listing primes and here I have found
that they are very much slower than my existing solution when the sieve
is not large (which is the majority use case).

I have also tried counting using a loop such as:

  while i < limit:
      i = sieve.index(1, i) + 1
      cnt += 1

but this is slower than count even on huge lists.
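
One way to count within a range without copying the list is itertools.islice. This is only a sketch with a made-up helper name: it avoids the memory cost of the slice, but the loop runs in the interpreter and islice still steps over the skipped leading elements, so it may well be slower than slicing followed by count and should be measured.

```python
from itertools import islice

def count_true(flags, start, stop):
    """Count truthy entries in flags[start:stop] without building a slice copy."""
    return sum(1 for flag in islice(flags, start, stop) if flag)

# Indicator list for 0..10: True at the primes 2, 3, 5 and 7.
sieve = [False, False, True, True, False, True, False, True, False, False, False]

primes_below_8 = count_true(sieve, 0, 8)
```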

Thank you again for your advice.

   Brian





Re: List Count

2013-04-22 Thread Blind Anagram
On 22/04/2013 22:03, Oscar Benjamin wrote:
 On 22 April 2013 21:18, Oscar Benjamin oscar.j.benja...@gmail.com wrote:
 On 22 April 2013 17:38, Blind Anagram blindanag...@nowhere.org wrote:
 On 22/04/2013 17:06, Oscar Benjamin wrote:

 I don't know what your application is but I would say that my first
 port of call here would be to consider a different algorithmic
 approach. An obvious question would be about the sparsity of this data
 structure. How frequent are the values that you are trying to count?
 Would it make more sense to store a list of their indices?

 Actually it is no more than a simple prime sieve implemented as a Python
 class (and, yes, I realize that there are plenty of these around).

 If I understand correctly, you have a list of roughly a billion
 True/False values indicating which integers are prime and which are
 not. You would like to discover how many prime numbers there are
 between two numbers a and b. You currently do this by counting the
 number of True values in your list between the indices a and b.

 If my description is correct then I would definitely consider using a
 different algorithmic approach. The density of primes from 1 to 1
 billion is about 5%. Storing the prime numbers themselves in a sorted
 list would save memory and allow a potentially more efficient way of
 counting the number of primes within some interval.
 
 In fact it is probably quicker if you don't mind using all that memory
 to just store the cumulative sum of your prime True/False indicator
 list. This would be the prime counting function pi(n). You can then
 count the primes between a and b in constant time with pi[b] - pi[a].

I did wonder whether, after creating the sieve, I should simply go
through the list and replace the True values with a count.  This would
certainly speed up the prime count function, which is where the issue
arises.  I will try this and see what sort of performance trade-offs
this involves.

  Brian



Re: Confusing Algorithm

2013-04-22 Thread DJC

On 22/04/13 13:39, RBotha wrote:

I'm facing the following problem:


In a city of towerblocks, Spiderman can
“cover” all the towers by connecting the
first tower with a spider-thread to the top
of a later tower and then to a next tower
and then to yet another tower until he
reaches the end of the city. Threads are
straight lines and cannot intersect towers.
Your task is to write a program that finds
the minimal number of threads to cover all
the towers. The list of towers is given as a
list of single digits indicating their height.

-Example:
List of towers: 1 5 3 7 2 5 2
Output: 4


I'm not sure how a 'towerblock' could be defined. How square does a shape have 
to be to qualify as a towerblock? Any help on solving this problem?


It's not the algorithm that's confusing, it's the problem. First clarify 
the problem.
This appears to be a variation of the travelling-salesman problem. 
Except the position of the towers is not defined, only their height.
So either the necessary information is missing or whoever set the 
problem intended something else.




Re: Confusing Algorithm

2013-04-22 Thread Ian Kelly
On Mon, Apr 22, 2013 at 2:33 PM, Christian Gollwitzer aurio...@gmx.de wrote:
 I'd agree with your interpretation. "Threads are straight lines and cannot
 intersect towers" - I read it such that the answer is the convex hull of
 the set of points given by the tower height. The convex hull can be computed
 for this 1D problem by initializing with line segments between every point
 and repeatedly pulling up every non-convex piece, if I'm not mistaken.

I agree that seems the likely intention.  One also must assume that
the towers are evenly spaced and have point width, neither of which
are stated in the problem.
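
Under that interpretation (evenly spaced towers of point width, and additionally assuming a thread may touch a tower top without counting as intersecting it), the answer is the number of segments in the upper convex hull of the points (index, height). A minimal sketch; the function names are made up for illustration.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def min_threads(heights):
    """Number of segments in the upper convex hull of (i, heights[i])."""
    hull = []
    for point in enumerate(heights):
        # Drop the previous hull point while it lies on or below the line
        # from the point before it to the new point.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], point) >= 0:
            hull.pop()
        hull.append(point)
    return len(hull) - 1

threads = min_threads([1, 5, 3, 7, 2, 5, 2])
```

For the example in the problem statement this gives 4, matching the expected output.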


Re: List Count

2013-04-22 Thread Steven D'Aprano
On Mon, 22 Apr 2013 15:15:19 +0100, Blind Anagram wrote:

 But when using a sub-sequence, I do suffer a significant reduction in
 speed for a count when compared with count on the full list.  When the
 list is small enough not to cause memory allocation issues this is about
 30% on 100,000,000 items.  But when the list is 1,000,000,000 items, OS
 memory allocation becomes an issue and the cost on my system rises to
 over 600%.

Buy more memory :-)


 I agree that this is not a big issue but it seems to me a high price to
 pay for the lack of a sieve.count(value, limit), which I feel is a
 useful function (given that memoryview operations are not available for
 lists).

There is no need to complicate the count method for such a specialised 
use-case. A more general solution would be to provide list views. 


Another solution might be to use arrays rather than lists. Since your 
sieve list is homogeneous, you could possibly use an array of 1 or 0 
bytes rather than a list of True or False bools. That would reduce the 
memory overhead by a factor of four, and similarly reduce the overhead of 
any copying:


py> from array import array
py> from sys import getsizeof
py> L = [True, False, False, True]*1000
py> A = array('b', L)
py> getsizeof(L)
16032
py> getsizeof(A)
4032
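
As a sketch of the byte-per-flag idea, assuming Python 3: a bytearray also stores one byte per entry, and its count method accepts optional start and end positions, so a range can be counted without slicing. The data below is made up for illustration.

```python
# 1 marks "prime", 0 marks "not prime"; one byte per entry.
flags = bytearray([1, 0, 0, 1] * 1000)

# Count over the whole sequence.
total = flags.count(b"\x01")

# Count only within positions 100..199, with no intermediate copy.
in_window = flags.count(b"\x01", 100, 200)
```

This covers the sieve.count(value, limit) use case that plain lists lack.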



-- 
Steven


Re: List Count

2013-04-22 Thread Oscar Benjamin
On 22 April 2013 22:25, Blind Anagram blindanag...@nowhere.org wrote:
 On 22/04/2013 21:18, Oscar Benjamin wrote:
 On 22 April 2013 17:38, Blind Anagram blindanag...@nowhere.org wrote:

 I also wondered whether I had missed any obvious way of avoiding the
 slicing cost (intellectually it seemed wrong to me to have to copy the
 list in order to count items within it).
 [snip]

 I have looked at solutions based on listing primes and here I have found
 that they are very much slower than my existing solution when the sieve
 is not large (which is the majority use case).

What matters is not so much the size of the sieve but the size of the
interval you want to query. You say that slicing cost is somehow
significant which suggests to me that it's not a small interval. An
approach using a sorted list of primes and bisect would have a cost
that is independent of the size of the interval (and depends only
logarithmically on the size of the sieve).
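
A minimal sketch of the sorted-list-plus-bisect approach, with a short made-up list of primes standing in for the real one. Two binary searches give the count, whatever the width of the interval.

```python
from bisect import bisect_left, bisect_right

# Sorted list of primes; in practice this would be extracted from the sieve.
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

def count_primes(sorted_primes, a, b):
    """Count primes p with a <= p <= b using two binary searches."""
    return bisect_right(sorted_primes, b) - bisect_left(sorted_primes, a)

between_10_and_20 = count_primes(primes, 10, 20)
```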


Oscar


Re: Weird behaviour?

2013-04-22 Thread jussij
On Tuesday, April 23, 2013 12:29:57 AM UTC+10, nn wrote:

 Maybe it is related to this bug?
 
 http://bugs.python.org/issue11272

I'm running Python 2.7.2 (on Windows) and that version doesn't appear to have 
that bug:

  Python 2.7.2 (default, Apr 23 2013, 11:49:52) [MSC v.1500 32 bit (Intel)] on win32
  Type "help", "copyright", "credits" or "license" for more information.
  >>> print(repr(input()))
  "testing"
  'testing'
  >>>

Cheers Jussi



Re: List Count

2013-04-22 Thread Steven D'Aprano
On Mon, 22 Apr 2013 22:25:50 +0100, Blind Anagram wrote:

 I have looked at solutions based on listing primes and here I have found
 that they are very much slower than my existing solution when the sieve
 is not large (which is the majority use case).

Yes. This is hardly surprising. Algorithms suitable for dealing with the 
first million primes are not suitable for dealing with the first trillion 
primes, and vice versa. We like to pretend that computer programming is 
an abstraction, and for small enough data we often can get away with 
that, but like all abstractions eventually it breaks and the cost of 
dealing with real hardware becomes significant.

But I must ask, given that the primes are so widely distributed, why are 
you storing them in a list instead of a sparse array (i.e. a dict)? There 
are 50,847,534 primes less than or equal to 1,000,000,000, so you are 
storing roughly 18 False values for every True value. That ratio will 
only get bigger. With a billion entries, you are using 18 times more 
memory than necessary.



-- 
Steven


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Steven D'Aprano
On Mon, 22 Apr 2013 22:09:14 +0100, Rui Maciel wrote:

 Steven D'Aprano wrote:
 
 I think that if you are worrying about the overhead of the tkinter
 bindings for Python, you're guilty of premature optimization.
 
 I'm not worried about that.  No one should be forced to install crap
 that they don't use or will ever need, no matter how great the average
 HD capacity is nowadays.

Nobody forces you to do anything. Python is open source, and the source 
code is freely available. Feel free to hand-optimize your Python 
installation, selecting carefully each and every module, class, and 
function in the standard library so that only the ones you absolutely 
know you will need to use are installed, using your godlike powers of 
precognition to foresee exactly what you need in seventeen months from 
now and what is crap that you will never need.

Good luck with that. I look forward to hearing about the results.



-- 
Steven


Re: Weird behaviour?

2013-04-22 Thread Chris Angelico
On Tue, Apr 23, 2013 at 9:06 AM,  jus...@zeusedit.com wrote:
 On Tuesday, April 23, 2013 12:29:57 AM UTC+10, nn wrote:

 Maybe it is related to this bug?

 http://bugs.python.org/issue11272

 I'm running Python 2.7.2 (on Windows) and that version doesn't appear to have 
 that bug:

   Python 2.7.2 (default, Apr 23 2013, 11:49:52) [MSC v.1500 32 bit (Intel)] on win32
   Type "help", "copyright", "credits" or "license" for more information.
   >>> print(repr(input()))
   "testing"
   'testing'
   >>>

Careful there; go with raw_input() on Py2. And then it does happen.

ChrisA


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Steven D'Aprano
On Mon, 22 Apr 2013 14:52:39 +0200, Antoon Pardon wrote:

 Op 22-04-13 11:18, Steven D'Aprano schreef:
 On Mon, 22 Apr 2013 03:08:24 -0500, Andrew Berg wrote:

 Much of the stdlib doesn't rely on anything but the core interpreter.
 tkinter by itself is not the issue. As you said, the bindings are
 tiny. However, in order to be usable, it requires quite a few things -
 most notably X. On desktop Linux, this is already installed, but on
 server systems, it generally is not (or at least shouldn't be in most
 cases). Going back to my example of a web server using a Python-based
 framework, I'll repeat that there is no reason such a system should
 have X even installed in order to serve web pages. Even on a lean,
 mean server machine, CPython requires only a few extra libraries. Add
 tkinter, and suddenly you have to install a LOT of things. If you plan
 to actually use tkinter, this is fine. If not, you've just added a lot
 of stuff that you don't need. This adds unnecessary overhead in
 several places (like your package system's database).
 I can't disagree with any of this, except to say that none of this
 justifies having a separate package for Tkinter. Naturally if you don't
 have X, Tcl won't work, and if Tcl won't work, Tkinter won't work and
 should give an import error. But that doesn't imply that X must be a
 dependency for Python. It's a dependency for having Tkinter *work*, but
 not for *installing* Tkinter as part of the standard library.

 Hell, even if you have X installed, and Tcl, and the Tkinter packages,
 importing tkinter can still fail, if Python wasn't built with the right
 magic incantations for it to recognise that Tcl is installed.

 Then don't use a package system. The job of a package system is, that if
 you install something, it install all dependencies that are needed to
 make it work.

No, the job of the package system is to manage dependencies. It makes no 
guarantee about whether or not something will work.

$ sudo apt-get install rule_world
$ rule_world --start-from Australia
Error: cannot connect to US nuclear arsenal from here, you cannot rule 
the world


A joke example, of course, but a serious point. Successful installation 
doesn't necessarily mean the program will run successfully, or work in 
any meaningful way.

We're also glossing over what it means to be a dependency. This is not 
obvious, and in fact I would argue that X is NOT a dependency for 
tkinter, even though tkinter will not work without it, for some 
definition of "work". I can quite happily import tkinter on a remote 
machine over ssh:

py> from tkinter.messagebox import showinfo

or do the same thing on a local machine from a non-X terminal. I haven't 
tried it, but quite possibly even on a headless machine without X 
installed at all. And why not? Tkinter is a big module, there are all 
sorts of things that I might want to access that don't actually require 
an X display. If nothing else, I can do this:

py> help(showinfo)


and read the docs. Tkinter does not actually require X to work. It 
merely requires X in order to *display an X window*.

It's only when I actually try to do something that requires an X display 
that it will fail. I won't show the entire traceback, because it is long 
and not particularly enlightening, but the final error message explains 
exactly why it isn't working:

_tkinter.TclError: no display name and no $DISPLAY environment variable
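
The distinction between importing tkinter and actually opening a window can be sketched like this, assuming Python 3 (the module is named Tkinter on Python 2). The function name and the status strings are made up for illustration.

```python
def tkinter_status():
    """Report whether tkinter can be imported and whether a window can be opened."""
    try:
        import tkinter
    except ImportError:
        # The bindings (or the underlying _tkinter extension) are missing.
        return "not installed"
    try:
        root = tkinter.Tk()   # this is the step that needs a display
    except tkinter.TclError:
        return "installed, no display"
    root.destroy()
    return "usable"

status = tkinter_status()
```

A program that only needs tkinter for an optional GUI could use a check like this to fall back to a text interface.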



 Your solution doesn't make sense in view of your earlier response where
 you argue tkinter should be installed because it is part of the
 standard library, combined with the advantage of having a standard library. But
 IMO a part of that standard library not working is just as harmful as
 part of that standard library not being installed. From a
 user/programmer's point of view the result is the same: it is unusable.

Not at all. As I said earlier, I would expect that trying to import 
tkinter on such a system should give a meaningful error message. 
Actually, it need not even fail at import time. As I show above, I can 
happily import tkinter without an X display. I haven't tried it, but I 
expect that I can probably import tkinter without Tcl either.

Let me put this another way:

It should not matter whether I install Tcl before Python, or after 
Python, the end result should be that once both are installed, tkinter 
will be usable (provided you have an X display). To put it in Ubuntu 
terms, if I do this:

apt-get install tcl
apt-get install python

or this:

apt-get install python
apt-get install tcl

on a machine with X, tkinter should Just Work. And if I don't install 
tcl, tkinter should still import, it just won't be able to, you know, 
interface to tcl.

What we're arguing here is merely the design of the dependency graph, and 
that's a matter of taste. My design would be different from that of the 
Ubuntu folks. That's fine. If we all agreed about everything, we'd have 
nothing to argue about *wink*

But I think we can all agree that something like this is pretty crappy:



Re: Weird behaviour?

2013-04-22 Thread Steven D'Aprano
On Mon, 22 Apr 2013 07:29:57 -0700, nn wrote:

 On Apr 21, 9:19 pm, Steven D'Aprano steve
 +comp.lang.pyt...@pearwood.info wrote:
 On Mon, 22 Apr 2013 10:56:11 +1000, Chris Angelico wrote:
  You're running this under Windows. The convention on Windows is for
  end-of-line to be signalled with \r\n, but the convention inside
  Python is to use just \n. With the normal use of buffered and parsed
  input, this is all handled for you; with unbuffered input, that
  translation also seems to be disabled, so your string actually
  contains '120\r', as will be revealed by its repr().

 If that's actually the case, then I would call that a bug in raw_input.

 Actually, raw_input doesn't seem to cope well with embedded newlines
 even without the -u option. On Linux, I can embed a control character
 by typing Ctrl-V followed by Ctrl-char. E.g. Ctrl-V Ctrl-M to embed a
 carriage return, Ctrl-V Ctrl-J to embed a newline. So watch:

 [steve@ando ~]$ python2.7 -c "x = raw_input('Hello? '); print repr(x)"
 Hello? 120^M^Jabc
 '120\r'

 Everything after the newline is lost.

 --
 Steven
 
 Maybe it is related to this bug?
 
 http://bugs.python.org/issue11272



I doubt it, I'm not using Windows and that bug is specific to Windows.


Here's the behaviour on Python 3.3:

py> result = input("Type something with control chars: ")
Type something with control chars: something ^T^M else
and a second line
py> print(repr(result))
'something \x14\r else \nand a second line'


Much better!



-- 
Steven



Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Chris Angelico
On Tue, Apr 23, 2013 at 10:22 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 It's only when I actually try to do something that requires an X display
 that it will fail. I won't show the entire traceback, because it is long
 and not particularly enlightening, but the final error message explains
 exactly why it isn't working:

 _tkinter.TclError: no display name and no $DISPLAY environment variable

You presumably have a system to test this on. Can you try using ssh -X
to get to it, and then retry that action? It looks like you actually
have everything you need, just no display... which is exactly what
you'd get if you ssh to something that has a real GUI. Not a
dependency problem.

ChrisA


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Andrew Berg
On 2013.04.22 19:22, Steven D'Aprano wrote:
 It's only when I actually try to do something that requires an X display 
 that it will fail. I won't show the entire traceback, because it is long 
 and not particularly enlightening, but the final error message explains 
 exactly why it isn't working:
 
 _tkinter.TclError: no display name and no $DISPLAY environment variable
So you want to go from "this won't work because it's not installed" to "this
won't work, and there could be a hundred different reasons why"? tkinter's main
function is to display something on a display. To say that displaying something
is an optional feature is absurd.
"You can install this, but your package manager won't pull in any dependencies
because a few minor things will work without them. If you want it to actually do
what it was made for, you need to install them yourself." Much bigger problem
than the OP's, no?

-- 
CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1


optomizations

2013-04-22 Thread Rodrick Brown
I would like some feedback on possible solutions to make this script run
faster.
The system is pegged at 100% CPU and it takes a long time to complete.


#!/usr/bin/env python

import gzip
import re
import os
import sys
from datetime import datetime
import argparse

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-f', dest='inputfile', type=str,
                        help='data file to parse')
    parser.add_argument('-o', dest='outputdir', type=str,
                        default=os.getcwd(), help='Output directory')
    args = parser.parse_args()

    if len(sys.argv[1:]) < 1:
        parser.print_usage()
        sys.exit(-1)

    print(args)
    if args.inputfile and os.path.exists(args.inputfile):
        try:
            with gzip.open(args.inputfile) as datafile:
                for line in datafile:
                    line = line.replace('mediacdn.xxx.com', 'media.xxx.com')
                    line = line.replace('staticcdn.xxx.co.uk', 'static.xxx.co.uk')
                    line = line.replace('cdn.xxx', 'www.xxx')
                    line = line.replace('cdn.xxx', 'www.xxx')
                    line = line.replace('cdn.xx', 'www.xx')
                    siteurl = line.split()[6].split('/')[2]
                    line = re.sub(r'\bhttps?://%s\b' % siteurl, "", line, 1)

                    (day, month, year, hour, minute, second) = \
                        (line.split()[3]).replace('[','').replace(':','/').split('/')
                    datelog = '{} {} {}'.format(month, day, year)
                    dateobj = datetime.strptime(datelog, '%b %d %Y')

                    outfile = '{}{}{}_combined.log'.format(dateobj.year,
                                                           dateobj.month, dateobj.day)
                    outdir = (args.outputdir + os.sep + siteurl)

                    if not os.path.exists(outdir):
                        os.makedirs(outdir)

                    with open(outdir + os.sep + outfile, 'w+') as outf:
                        outf.write(line)

        except IOError, err:
            sys.stderr.write("Error unable to read or extract inputfile: {} {}\n".format(args.inputfile, err))
            sys.exit(-1)


Re: optomizations

2013-04-22 Thread Chris Angelico
On Tue, Apr 23, 2013 at 11:19 AM, Rodrick Brown rodrick.br...@gmail.com wrote:
             with gzip.open(args.inputfile) as datafile:
                 for line in datafile:
                     outfile = '{}{}{}_combined.log'.format(dateobj.year,
                                                            dateobj.month, dateobj.day)
                     outdir = (args.outputdir + os.sep + siteurl)

                     with open(outdir + os.sep + outfile, 'w+') as outf:
                         outf.write(line)

You're opening files and closing them again for every line. This
wouldn't cause you to spin the CPU (more likely it'd thrash the hard
disk - unless you have an SSD), but it is certainly an optimization
target.

Can you know in advance what files you need? If not, I'd try something
like this:

outf = {} # Might want a better name though

.
   outfile = ...
   if outfile not in outf:
   os.makedirs(...)
   outf[outfile] = open(...)
   outf[outfile].write(line)

for f in outf.values():
  f.close()

Open files only as needed, close 'em all at the end.
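
A runnable sketch of that pattern, assuming Python 3.2 or later (for exist_ok). The function name, the record format and the file names are made up for illustration; files are opened in append mode here so that later lines do not overwrite earlier ones.

```python
import os
import tempfile

def write_grouped(records, base_dir):
    """Write each (relative_path, line) record, opening every file only once."""
    handles = {}
    try:
        for rel_path, line in records:
            if rel_path not in handles:
                full_path = os.path.join(base_dir, rel_path)
                os.makedirs(os.path.dirname(full_path), exist_ok=True)
                handles[rel_path] = open(full_path, "a")
            handles[rel_path].write(line)
    finally:
        # Close everything at the end, even if a write failed part-way.
        for handle in handles.values():
            handle.close()
    return len(handles)

records = [
    ("site-a/2013422_combined.log", "first\n"),
    ("site-b/2013422_combined.log", "other\n"),
    ("site-a/2013422_combined.log", "second\n"),
]

with tempfile.TemporaryDirectory() as tmp:
    opened = write_grouped(records, tmp)
    with open(os.path.join(tmp, "site-a", "2013422_combined.log")) as f:
        contents = f.read()
```

If the number of distinct output files could be very large, the per-process limit on open files becomes a concern and some handles would need to be closed early.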

ChrisA


Re: List Count

2013-04-22 Thread Dave Angel

On 04/22/2013 05:32 PM, Blind Anagram wrote:

On 22/04/2013 22:03, Oscar Benjamin wrote:

On 22 April 2013 21:18, Oscar Benjamin oscar.j.benja...@gmail.com wrote:

On 22 April 2013 17:38, Blind Anagram blindanag...@nowhere.org wrote:

On 22/04/2013 17:06, Oscar Benjamin wrote:


I don't know what your application is but I would say that my first
port of call here would be to consider a different algorithmic
approach. An obvious question would be about the sparsity of this data
structure. How frequent are the values that you are trying to count?
Would it make more sense to store a list of their indices?


Actually it is no more than a simple prime sieve implemented as a Python
class (and, yes, I realize that there are plenty of these around).


If I understand correctly, you have a list of roughly a billion
True/False values indicating which integers are prime and which are
not. You would like to discover how many prime numbers there are
between two numbers a and b. You currently do this by counting the
number of True values in your list between the indices a and b.

If my description is correct then I would definitely consider using a
different algorithmic approach. The density of primes from 1 to 1
billion is about 5%. Storing the prime numbers themselves in a sorted
list would save memory and allow a potentially more efficient way of
counting the number of primes within some interval.


In fact it is probably quicker if you don't mind using all that memory
to just store the cumulative sum of your prime True/False indicator
list. This would be the prime counting function pi(n). You can then
count the primes between a and b in constant time with pi[b] - pi[a].


I did wonder whether, after creating the sieve, I should simply go
through the list and replace the True values with a count.  This would
certainly speed up the prime count function, which is where the issue
arises.  I will try this and see what sort of performance trade-offs
this involves.



By doing that replacement, you'd increase memory usage manyfold (maybe 
3:1, I don't know).  As long as you're only using bools in the list, you 
only have the list overhead to consider, because all the objects 
involved are already cached (True and False exist only once each).  If 
you have integers, you'll need a new object for each nonzero count.




--
DaveA


Re: optomizations

2013-04-22 Thread Roy Smith
In article mailman.944.1366680414.3114.python-l...@python.org,
 Rodrick Brown rodrick.br...@gmail.com wrote:

 I would like some feedback on possible solutions to make this script run
 faster.

If I had to guess, I would think this stuff:

 line = line.replace('mediacdn.xxx.com', 'media.xxx.com')
 line = line.replace('staticcdn.xxx.co.uk', 'static.xxx.co.uk')
 line = line.replace('cdn.xxx', 'www.xxx')
 line = line.replace('cdn.xxx', 'www.xxx')
 line = line.replace('cdn.xx', 'www.xx')
 siteurl = line.split()[6].split('/')[2]
 line = re.sub(r'\bhttps?://%s\b' % siteurl, "", line, 1)

You make 6 copies of every line.  That's slow.  But I'm also going to 
quote something I wrote here a couple of months back:

 I've been doing some log analysis.  It's been taking a grovelingly long 
 time, so I decided to fire up the profiler and see what's taking so 
 long.  I had a pretty good idea of where the ONLY TWO POSSIBLE hotspots 
 might be (looking up IP addresses in the geolocation database, or 
 producing some pretty pictures using matplotlib).  It was just a matter 
 of figuring out which it was. 
 
 As with most attempts to out-guess the profiler, I was totally, 
 absolutely, and embarrassingly wrong. 

So, my real advice to you is to fire up the profiler and see what it 
says.
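
For a whole script, the standard-library profiler can be run from the command line as python -m cProfile -s cumulative script.py. The sketch below, assuming Python 3, shows the same thing from inside a program; normalise and process are made-up stand-ins for the per-line work in the posted script.

```python
import cProfile
import io
import pstats

def normalise(line):
    """Stand-in for the per-line work; the replacements mirror the posted script."""
    line = line.replace('mediacdn.xxx.com', 'media.xxx.com')
    line = line.replace('cdn.xxx', 'www.xxx')
    return line

def process(lines):
    return [normalise(line) for line in lines]

# Profile only the call we care about.
profiler = cProfile.Profile()
profiler.enable()
result = process(['GET http://cdn.xxx.com/a\n'] * 10000)
profiler.disable()

# Format the statistics, sorted by cumulative time, into a string.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats('cumulative').print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists call counts and time per function, which is usually enough to see whether string handling, date parsing or file I/O dominates.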


Re: optomizations

2013-04-22 Thread MRAB

On 23/04/2013 02:19, Rodrick Brown wrote:

I would like some feedback on possible solutions to make this script run
faster.
The system is pegged at 100% CPU and it takes a long time to complete.


#!/usr/bin/env python

import gzip
import re
import os
import sys
from datetime import datetime
import argparse

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-f', dest='inputfile', type=str,
                        help='data file to parse')
    parser.add_argument('-o', dest='outputdir', type=str,
                        default=os.getcwd(), help='Output directory')
    args = parser.parse_args()

    if len(sys.argv[1:]) < 1:
        parser.print_usage()
        sys.exit(-1)

    print(args)
    if args.inputfile and os.path.exists(args.inputfile):
        try:
            with gzip.open(args.inputfile) as datafile:
                for line in datafile:
                    line = line.replace('mediacdn.xxx.com', 'media.xxx.com')
                    line = line.replace('staticcdn.xxx.co.uk', 'static.xxx.co.uk')


These next 2 lines are duplicates; the second will have no effect (I
think!).


                    line = line.replace('cdn.xxx', 'www.xxx')
                    line = line.replace('cdn.xxx', 'www.xxx')


Won't the next line also do the work of the preceding 2 lines?


                    line = line.replace('cdn.xx', 'www.xx')
                    siteurl = line.split()[6].split('/')[2]
                    line = re.sub(r'\bhttps?://%s\b' % siteurl, "", line, 1)

                    (day, month, year, hour, minute, second) = \
                        (line.split()[3]).replace('[','').replace(':','/').split('/')
                    datelog = '{} {} {}'.format(month, day, year)
                    dateobj = datetime.strptime(datelog, '%b %d %Y')

                    outfile = '{}{}{}_combined.log'.format(dateobj.year,
                                                           dateobj.month, dateobj.day)
                    outdir = (args.outputdir + os.sep + siteurl)

                    if not os.path.exists(outdir):
                        os.makedirs(outdir)

                    with open(outdir + os.sep + outfile, 'w+') as outf:
                        outf.write(line)

        except IOError, err:
            sys.stderr.write("Error unable to read or extract inputfile: {} {}\n".format(args.inputfile, err))
            sys.exit(-1)


I wonder whether it'll make a difference if you read a chunk at a time
(datafile.read(chunk_size) + datafile.readline() to ensure you have
complete lines), perform the replacements on it (so that you're working
on several lines in one go), and then split it into lines for further
processing.

Another thing you could try is caching the result of parsing the date,
using (month, day, year) as the key and outfile as the value in a dict.


A third thing you could try is not writing a file for every line
(doesn't the 'w+' mode truncate the file?), but saving the output for
each chunk (see first suggestion) and then writing the files afterwards,
at the end of the chunk.
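
As a sketch of that third idea: collect the lines per output file and
open each file once per batch. The function name write_grouped and the
(siteurl, outfile, line) record layout are assumptions made for the
example, not taken from the script:

```python
import os
from collections import defaultdict


def write_grouped(records, outputdir):
    """Write (siteurl, outfile, line) records, opening each file once."""
    pending = defaultdict(list)
    for siteurl, outfile, line in records:
        pending[(siteurl, outfile)].append(line)

    for (siteurl, outfile), lines in pending.items():
        outdir = os.path.join(outputdir, siteurl)
        if not os.path.isdir(outdir):
            os.makedirs(outdir)
        # 'a' appends; 'w' or 'w+' would truncate whatever an earlier
        # batch had already written to the same file.
        with open(os.path.join(outdir, outfile), 'a') as outf:
            outf.writelines(lines)
```

Append mode means output left over from a previous run has to be
removed first, otherwise the new lines are added after the old ones.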



Re: optomizations

2013-04-22 Thread Dan Stromberg
On Mon, Apr 22, 2013 at 6:53 PM, Roy Smith r...@panix.com wrote:


 So, my real advice to you is to fire up the profiler and see what it
 says.


I agree.

Fire up a line-oriented profiler and only then start trying to improve the
hot spots.
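
As a starting point, the standard library ships cProfile and pstats
(documented under "The Python Profilers"). Note that cProfile reports
per function, not per line; line-level profiling needs a third-party
tool. The wrapper name profile_call below is made up for this sketch:

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time and limit the report to ten entries.
    stats.sort_stats('cumulative').print_stats(10)
    return result, out.getvalue()
```

For a whole script, "python -m cProfile -s cumulative yourscript.py"
(with yourscript.py standing in for the real file name) prints a
similar report without any code changes.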


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread rusi
On Apr 23, 5:22 am, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:
 We're also glossing over what it means to be a dependency. This is not
 obvious, and in fact I would argue that X is NOT a dependency for
 tkinter, even though tkinter will not work without it, for some
 definition of work. I can quite happily import tkinter on a remote
 machine over ssh:

Yes, the crux of the matter is what it means 'to work' and therefore
'to not work'.

Let's say my car is 'not working'.
On further investigation it's found that the petrol tank is empty.
A case could be made either way: 'it (the car) is working' or 'it's
not working'.

To the extent that, pragmatically, 'not working' means something attended
to by a mechanic, it's not in that category.
To the extent that (even more pragmatically) I missed an important
appointment, it is in that category.

Both of which gloss over the fact that after filling the petrol it may
still not work.
So to conclude: "since I could not check, it's vacuously working"
is more problematic than the contrary
"since I could not check, it's vacuously not working".

Package systems need to 'federate', so to speak, workingness from a
zillion packages to the whole system.
The problem is that workingness is peculiar to each package.
Therefore it seems reasonable to me to ask of a package system that
- it allows a maximum number of different configurations for different
requirements ('without crap')
- it disallows all kinds of misconfigured/non-working systems --
therefore conservative dependencies are good
- the above is subject to reasonable best efforts -- so don't cater to
fringe pathological cases (like "I want Tkinter but not X")

BTW I suggested earlier that python could have something like KDE
(Kde-full and a smaller Kde-standard).
I just checked that python already has python2.7 and python2.7-minimal,
where the description of the latter says it can be used in the boot
process for basic tasks.


Re: optomizations

2013-04-22 Thread Steven D'Aprano
On Mon, 22 Apr 2013 21:19:23 -0400, Rodrick Brown wrote:

 I would like some feedback on possible solutions to make this script run
 faster.
 The system is pegged at 100% CPU and it takes a long time to complete.

Have you profiled the app to see where it is spending all its time?

What does "a long time" mean? For instance:

"It takes two hours to process a 15KB file" -- you have a problem.

"It takes 20 minutes to process a 15GB file" -- and why are you
complaining?


Or somewhere in the middle... 


But before profiling, I suggest you clean up the program. For example:

if args.inputfile and os.path.exists(args.inputfile):

Don't do that. There really isn't any point in checking whether the input 
file exists, since:

1) Just because it exists doesn't mean you can read it;

2) Just because you can read it doesn't mean it is a valid gzip file;

3) Just because it is a valid gzip file that you can read *now*, doesn't 
mean that it still will be in 10 milliseconds when you actually try to 
open the file.


A lot can happen in 10ms, or 1ms. The file might be deleted, or 
overwritten, or permissions changed. Change that to:

try:
    with gzip.open(args.inputfile) as datafile:
        for line in datafile:

and catch the exception if the file doesn't exist, or cannot be read. 
Which you already do, which just demonstrates that the call to 
os.path.exists is a waste of effort. 
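
A minimal sketch of that "just try it" approach, assuming Python 3.
The function name count_lines is invented for the example, and
counting lines merely stands in for the real per-line work:

```python
import gzip
import sys


def count_lines(path):
    """Open and read a gzip file, reporting any failure in one place."""
    try:
        with gzip.open(path) as datafile:
            return sum(1 for _ in datafile)
    except (OSError, EOFError) as err:
        # OSError covers a missing or unreadable file and data that is
        # not in gzip format; EOFError covers a truncated archive.
        sys.stderr.write(
            'Error: unable to read or extract {}: {}\n'.format(path, err))
        return None
```

No separate existence check is needed: a missing file, a permissions
problem and a non-gzip file all end up in the same except clause.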


Then look for wasted effort like this:

line = line.replace('cdn.xxx', 'www.xxx')
line = line.replace('cdn.xx', 'www.xx')


Surely the first line is redundant, since it would be correctly caught 
and replaced by the second?
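
A quick way to convince yourself, using made-up sample lines (the
function names are only for this comparison):

```python
def rewrite_both(line):
    """The two calls as they appear in the script."""
    line = line.replace('cdn.xxx', 'www.xxx')
    line = line.replace('cdn.xx', 'www.xx')
    return line


def rewrite_once(line):
    """'cdn.xx' is a prefix of 'cdn.xxx', so one call covers both."""
    return line.replace('cdn.xx', 'www.xx')


samples = [
    'GET http://cdn.xxx.com/a.png',
    'GET http://cdn.xx.co.uk/b.css',
    'GET http://www.xxx.com/index.html',
]
assert all(rewrite_both(s) == rewrite_once(s) for s in samples)
```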

Also, you're searching the file system *for every line* in the input 
file. Pull this outside of the loop and have it run once:

if not os.path.exists(outdir):
    os.makedirs(outdir)

Likewise for opening and closing the output file, which you currently
do for every line. It only needs to be opened and closed once.

If it comes down to micro-optimizations to shave a few microseconds off, 
consider using string % formatting rather than the format method. But 
really, if you find yourself shaving microseconds off something that runs 
for ten minutes, you have to ask why you're bothering.
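
If you do want numbers, timeit from the standard library will measure
both spellings on your own machine. This is only a sketch; the
iteration count is arbitrary and the result will vary between Python
versions:

```python
import timeit

setup = "month, day, year = 'Apr', '22', '2013'"

# Time each way of building the same 'Apr 22 2013' string.
with_format = timeit.timeit(
    "'{} {} {}'.format(month, day, year)", setup=setup, number=100000)
with_percent = timeit.timeit(
    "'%s %s %s' % (month, day, year)", setup=setup, number=100000)

print('str.format: {:.4f}s   % operator: {:.4f}s'.format(
    with_format, with_percent))
```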



-- 
Steven


Re: optomizations

2013-04-22 Thread Chris Angelico
On Tue, Apr 23, 2013 at 2:00 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 Also, you're searching the file system *for every line* in the input
 file. Pull this outside of the loop and have it run once:

 if not os.path.exists(outdir):
     os.makedirs(outdir)

 Likewise for opening and closing the output file, which you currently
 do for every line. It only needs to be opened and closed once.

The outdir depends on the line, though. Hence my suggestion to retain
the open files in a dictionary.
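
Roughly like this, as a sketch. The class name OutputFiles and its
methods are invented for the example:

```python
import os


class OutputFiles(object):
    """Keep one open handle per output path and reuse it across lines."""

    def __init__(self):
        self._handles = {}

    def write(self, outdir, outfile, line):
        path = os.path.join(outdir, outfile)
        handle = self._handles.get(path)
        if handle is None:
            # First line for this path: create the directory and open
            # the file once; later lines reuse the stored handle.
            if not os.path.isdir(outdir):
                os.makedirs(outdir)
            handle = self._handles[path] = open(path, 'w')
        handle.write(line)

    def close(self):
        for handle in self._handles.values():
            handle.close()
        self._handles.clear()
```

The file system is then touched once per distinct output file rather
than once per line. With a very large number of distinct sites this
could run into the per-process limit on open files.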

ChrisA


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Steven D'Aprano
On Tue, 23 Apr 2013 10:36:38 +1000, Chris Angelico wrote:

 On Tue, Apr 23, 2013 at 10:22 AM, Steven D'Aprano
 steve+comp.lang.pyt...@pearwood.info wrote:
 It's only when I actually try to do something that requires an X
 display that it will fail. I won't show the entire traceback, because
 it is long and not particularly enlightening, but the final error
 message explains exactly why it isn't working:

 _tkinter.TclError: no display name and no $DISPLAY environment variable
 
 You presumably have a system to test this on. Can you try using ssh -X
 to get to it, and then retry that action? It looks like you actually
 have everything you need, just no display... which is exactly what you'd
 get if you ssh to something that has a real GUI. Not a dependency
 problem.

I didn't say it was a dependency problem. I'm just demonstrating that it 
is possible for tkinter code to fail even if all the dependencies are 
met; and on the other hand, it is useful to be able to import tkinter 
even if you cannot display any tkinter windows.
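
The two cases can be told apart in code. This is a sketch for
Python 3 (the module is called Tkinter in Python 2); the function
name tk_status and the returned strings are made up:

```python
def tk_status():
    """Report whether tkinter imports and whether a window can be created."""
    try:
        import tkinter
    except ImportError:
        return 'tkinter not installed'
    try:
        root = tkinter.Tk()
    except tkinter.TclError as err:
        # For instance: no display name and no $DISPLAY environment variable
        return 'imported, but no display: {}'.format(err)
    root.destroy()
    return 'imported, display available'
```

Over a plain ssh session with no X forwarding, the import succeeds
and only the attempt to create a window fails.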


-- 
Steven


Re: Ubuntu package python3 does not include tkinter

2013-04-22 Thread Chris Angelico
On Tue, Apr 23, 2013 at 2:03 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 On Tue, 23 Apr 2013 10:36:38 +1000, Chris Angelico wrote:

 On Tue, Apr 23, 2013 at 10:22 AM, Steven D'Aprano
 steve+comp.lang.pyt...@pearwood.info wrote:
 It's only when I actually try to do something that requires an X
 display that it will fail. I won't show the entire traceback, because
 it is long and not particularly enlightening, but the final error
 message explains exactly why it isn't working:

 _tkinter.TclError: no display name and no $DISPLAY environment variable

 You presumably have a system to test this on. Can you try using ssh -X
 to get to it, and then retry that action? It looks like you actually
 have everything you need, just no display... which is exactly what you'd
 get if you ssh to something that has a real GUI. Not a dependency
 problem.

 I didn't say it was a dependency problem. I'm just demonstrating that it
 is possible for tkinter code to fail even if all the dependencies are
 met; and on the other hand, it is useful to be able to import tkinter
 even if you cannot display any tkinter windows.

Sure. But I don't know that the situation you're seeing is the same as
the one you'd see if you install tkinter without tk.

ChrisA


Re: optomizations

2013-04-22 Thread Rodrick Brown
On Apr 22, 2013, at 11:18 PM, Dan Stromberg drsali...@gmail.com wrote:


On Mon, Apr 22, 2013 at 6:53 PM, Roy Smith r...@panix.com wrote:


 So, my real advice to you is to fire up the profiler and see what it
 says.


I agree.

Fire up a line-oriented profiler and only then start trying to improve the
hot spots.


Got a doc or URL? I have no experience working with Python profilers.

