Re: Ubuntu package python3 does not include tkinter
lcrocker wrote:
> I understand that for something like a server distribution, but Ubuntu is a user-focused desktop distribution. It has a GUI, always.

Irrelevant.

> The purpose of a distro like that is to give users a good experience. If I install Python on Windows, I get to use Python. On Ubuntu, I don't, and I think that will confuse some users.

Nonsense. No one is keeping anyone off tkinter. If you want it, install it. There are official packages in the repositories such as python-tk and python3-tk. If someone else doesn't want them then they aren't forced to pack their Ubuntu systems with more cruft. There's nothing worse than being forced to install piles of irrelevant and useless stuff as a dependency to a fundamental package.

> I recently recommended Python to a friend who wants to start learning programming. Hurdles like this don't help someone like him.

If your friend believes that having to do an extra pair of clicks or typing sudo apt-get install python-tk is an unbeatable hurdle then your friend's computer skills are awfully lacking and he won't have much success learning how to write programs.

Rui Maciel
-- http://mail.python.org/mailman/listinfo/python-list
Re: Ubuntu package python3 does not include tkinter
Steven D'Aprano wrote: It's only easy to install a package on Ubuntu if you know that you have to, and can somehow work out the name of the package. No one actually has to install tkinter. That's the whole point of providing it as a separate package: only those who want to use it have to install it. The rest of us don't. Rui Maciel -- http://mail.python.org/mailman/listinfo/python-list
Re: Ubuntu package python3 does not include tkinter
On Mon, 22 Apr 2013 07:36:47 +0100, Rui Maciel wrote:
> Steven D'Aprano wrote:
>> It's only easy to install a package on Ubuntu if you know that you have to, and can somehow work out the name of the package.
>
> No one actually has to install tkinter. That's the whole point of providing it as a separate package: only those who want to use it have to install it. The rest of us don't.

I think that if you are worrying about the overhead of the tkinter bindings for Python, you're guilty of premature optimization. The tkinter package in Python 3.3 is trivially small, under 2 MB.

Besides, how far do we go? Do we expect people to install (say) python3-copy so that those who don't need the copy module don't have to install it?

    sudo apt-get install python3 python3-copy python3-dis python3-doctest \
        python3-csv python3-logging python3-shutil ...

There are advantages to having the *standard library* actually be, you know, *standard*.

In my perfect world, the tk/tcl bindings and the tkinter package would be installed with any Python installation. Naturally they won't work if you don't install Tcl, but to make them work, all you need is:

    sudo apt-get install python3 tcl

Don't want Tcl? Fine, don't install it, and "import tkinter" will fail at import time, preferably with a sensible error message like "Tcl not installed".

Naturally I'm just talking about the standard CPython implementation on Linux systems where Tcl is standard. If you have an embedded system, where tkinter's 2MB is *not* trivially small, or a platform where Tcl does not exist, then that's a different story. But in a standard Linux desktop install of Python, tkinter should Just Work once you install Tcl.

In my perfect world.

--
Steven
Re: Ubuntu package python3 does not include tkinter
On 2013.04.22 02:17, Steven D'Aprano wrote: I think that if you are worrying about the overhead of the tkinter bindings for Python, you're guilty of premature optimization. The tkinter package in Python 3.3 is trivially small, under 2 MB. Besides, how far do we go? Do we expect people to install (say): python3-copy so that those who don't need the copy module don't have to install it? Much of the stdlib doesn't rely on anything but the core interpreter. tkinter by itself is not the issue. As you said, the bindings are tiny. However, in order to be usable, it requires quite a few things - most notably X. On desktop Linux, this is already installed, but on server systems, it generally is not (or at least shouldn't be in most cases). Going back to my example of a web server using a Python-based framework, I'll repeat that there is no reason such a system should have X even installed in order to serve web pages. Even on a lean, mean server machine, CPython requires only a few extra libraries. Add tkinter, and suddenly you have to install a LOT of things. If you plan to actually use tkinter, this is fine. If not, you've just added a lot of stuff that you don't need. This adds unnecessary overhead in several places (like your package system's database). -- CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1 -- http://mail.python.org/mailman/listinfo/python-list
pip does not find packages
I am using virtualenv and pip (from Arch Linux). What I have done: virtualenv was installed by my distribution. I created a virtual environment and activated it; it installed pip. So far so good. Now I am trying to install a package in the virtual environment:

    $ pip install Impacket
    Downloading/unpacking Impacket
      Could not find any downloads that satisfy the requirement Impacket
    No distributions at all found for Impacket

but Impacket is found by pip search:

    $ pip search Impacket
    Impacket - Network protocols Constructors and Dissectors

Exactly the same happens with pcapy. With PyGTK, the pip command just hangs when trying to download it. What is going on? Maybe a misconfigured server? Is there anything that I can do?

Olive
Re: Ubuntu package python3 does not include tkinter
On Mon, 22 Apr 2013 03:08:24 -0500, Andrew Berg wrote: Much of the stdlib doesn't rely on anything but the core interpreter. tkinter by itself is not the issue. As you said, the bindings are tiny. However, in order to be usable, it requires quite a few things - most notably X. On desktop Linux, this is already installed, but on server systems, it generally is not (or at least shouldn't be in most cases). Going back to my example of a web server using a Python-based framework, I'll repeat that there is no reason such a system should have X even installed in order to serve web pages. Even on a lean, mean server machine, CPython requires only a few extra libraries. Add tkinter, and suddenly you have to install a LOT of things. If you plan to actually use tkinter, this is fine. If not, you've just added a lot of stuff that you don't need. This adds unnecessary overhead in several places (like your package system's database). I can't disagree with any of this, except to say that none of this justifies having a separate package for Tkinter. Naturally if you don't have X, Tcl won't work, and if Tcl won't work, Tkinter won't work and should give an import error. But that doesn't imply that X must be a dependency for Python. It's a dependency for having Tkinter *work*, but not for *installing* Tkinter as part of the standard library. Hell, even if you have X installed, and Tcl, and the Tkinter packages, importing tkinter can still fail, if Python wasn't built with the right magic incantations for it to recognise that Tcl is installed. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Serial Port Issue
Hi,

I'm new to Python and am trying to learn serial communication with it. In the process I'm facing serial port issues. Please find the attached COMPorttest.py file and correct me if anything is wrong in the code. With my code, it always goes into the exception handler. I noted down the COM port number from the Windows Device Manager list.

Operating system: XP
Python ver: 2.5
Pyserial: 2.5

I even tried passing the commands below from the Python shell:

    import serial
    ser = serial.Serial(port=21, baudrate=9600)

I observe the error below in the Python shell:

    File "C:\Python25\lib\serial\serialwin32.py", line 55, in open
        raise SerialException("could not open port: %s" % msg)
    SerialException: could not open port: (2, 'CreateFile', 'The system cannot find the file specified.')

Thanks in advance.

Best Regards,
Chandan.

COMPorttest.py:

    import serial

    def checkPort():
        """This function reads the serial port and writes it."""
        try:
            ser = serial.Serial(
                port=21,
                baudrate=9600,
                bytesize=serial.EIGHTBITS,
                parity=serial.PARITY_NONE,
                stopbits=serial.STOPBITS_ONE,
                timeout=10
            )
            if ser.isOpen():
                print "Port Open"
            else:
                ser.close()
                print "port closed"
        except serial.serialutil.SerialException:
            print "Failed to open port"

    checkPort()

attachment: DeviceManager.PNG
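One possible cause, offered as a guess rather than a diagnosis: as far as I know, pySerial 2.x counts numeric ports from zero, so port=21 asks Windows for COM22. Assuming Device Manager really shows COM21, passing the device name as a string avoids the off-by-one. The helper names com_port_name and open_com_port below are made up for illustration and are not part of pySerial.

```python
def com_port_name(device_manager_number):
    """Return the Windows device name for the number shown in Device Manager."""
    if device_manager_number < 1:
        raise ValueError("COM port numbers start at 1")
    return "COM%d" % device_manager_number


def open_com_port(device_manager_number, baudrate=9600, timeout=10):
    """Open the port by name; this part requires pySerial to be installed."""
    import serial  # imported lazily so com_port_name works without pySerial
    return serial.Serial(
        port=com_port_name(device_manager_number),
        baudrate=baudrate,
        timeout=timeout,
    )


print(com_port_name(21))
```

If the port still cannot be opened by name, the number noted from Device Manager is worth double-checking, since the error says Windows could not find the device at all.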
Error in Import gv module
https://code.google.com/p/python-graph/wiki/Example When I am trying to run the code to draw a graph, given on above link, I am getting following error: ImportError: No module named gv What can be the reasons? Thank you! -- http://mail.python.org/mailman/listinfo/python-list
Re: Error in Import gv module
On 22.04.2013 12:13, Megha Agrawal wrote: https://code.google.com/p/python-graph/wiki/Example When I am trying to run the code to draw a graph, given on above link, I am getting following error: ImportError: No module named gv What can be the reasons? Which OS? It looks like you are missing graphviz or you need to adapt your paths: https://code.google.com/p/python-graph/issues/detail?id=15 Bye, Andreas -- http://mail.python.org/mailman/listinfo/python-list
Re: Error in Import gv module
On 04/22/2013 06:13 AM, Megha Agrawal wrote: https://code.google.com/p/python-graph/wiki/Example When I am trying to run the code to draw a graph, given on above link, I am getting following error: ImportError: No module named gv What can be the reasons? Simplest is that you haven't installed python-graph https://code.google.com/p/python-graph/downloads/list or, more directly, https://code.google.com/p/python-graph/ -- DaveA -- http://mail.python.org/mailman/listinfo/python-list
Re: Ubuntu package python3 does not include tkinter
On Apr 21, 11:36 pm, Rui Maciel rui.mac...@gmail.com wrote:
> Steven D'Aprano wrote:
>> It's only easy to install a package on Ubuntu if you know that you have to, and can somehow work out the name of the package.
>
> No one actually has to install tkinter. That's the whole point of providing it as a separate package: only those who want to use it have to install it. The rest of us don't.

I'm a programmer, I installed Tkinter, and use it. I'd like to deploy programs written with it to others. **Those** people know nothing about it, and **shouldn't have to**. I've given them a program in Python, they have Python, but it doesn't run, and doesn't give them a helpful error. They'll probably just blame me and move on.

Not every Python user is a programmer. If I write a program in Java, any user with Java installed can run it. As it stands, that's not true for Python. That's not good PR for the cause.
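A program can at least fail with a helpful message when tkinter is missing. The following is a minimal sketch, assuming Python 3 on a Debian-style system; the function name import_or_explain and the wording of the hint are illustrative and do not come from any library.

```python
import importlib


def import_or_explain(module_name, install_hint):
    """Import a module, or exit with a message telling the user what to install."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        raise SystemExit(
            "This program needs the '%s' module. %s" % (module_name, install_hint)
        )


# Typical use at the top of a GUI script (left commented out so the sketch
# runs even where tkinter is not installed):
#     tkinter = import_or_explain(
#         "tkinter", "On Ubuntu, try: sudo apt-get install python3-tk")
```

This does not make the program run, but it replaces a bare traceback with an instruction a non-programmer can follow.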
Re: Ubuntu package python3 does not include tkinter
On Apr 22, 11:35 am, Rui Maciel rui.mac...@gmail.com wrote:
> lcrocker wrote:
>> I recently recommended Python to a friend who wants to start learning programming. Hurdles like this don't help someone like him.
>
> If your friend believes that having to do an extra pair of clicks or typing sudo apt-get install python-tk is an unbeatable hurdle then your friend's computer skills are awfully lacking and he won't have much success learning how to write programs.

There are two worldviews here and they are as far apart as can be. It's good to see them before arguing.

1. Python as a standalone language
2. Python as part of an (OS-related) ecosystem

In Windows, Python may or may not exist. And if it exists and I go inside the Python directories and start messing around -- deleting some files, modifying others etc -- what will happen? Nothing much. My Python programs will stop working. Presumably if I reinstall, it will be fine thereafter.

What about Linux? As an experiment I just tried

    $ aptitude purge python    # Noobs BEWARE of that command

and aptitude was too confused to give me a coherent report. I then tried

    $ aptitude purge python2.7

The list of packages that it would purge ran to hundreds. Here's a small sample of what would go. Firstly there are all the python-* packages. This is obvious. Not so obvious is that some, like python-csound, were probably installed by me. Others, like python-debian, are needed for the basic health and functioning of the system. And besides these there is a pile of others that have no relation to Python. A sample: asciidoc, bzr, dia, eog, gcj-*, gdb(!!), gimp, gnome-* (about 20 of these), printconf…

So Python is completely optional on Windows. It is a part of the infrastructure on Linux. Messing with it is almost like saying: "I don't see what that vmlinuz file is doing in /boot. So I removed it."

Coming to the OP's question:

a. The Python that the PSF provides is suitable for learning Python.
b. The Python that Linux distros provide is part of the wireframe on which the system rests.

b may be derived from a, but they are hardly the same. They may look very similar but their intents are quite different.

So when you say

> If your friend believes that having to do an extra pair of clicks or typing sudo apt-get install python-tk is an unbeatable hurdle then your friend's computer skills are awfully lacking and he won't have much success learning how to write programs.

everything you say is correct. But you won't have too many people learning from you if that's how you say it. Remember that the difference between an expert and a noob is rarely a question of intelligence or diligence. It's just some boring trivial mountain of data that the expert has picked up over time.
List Count
I would be grateful for any advice people can offer on the fastest way to count items in a sub-sequence of a large list. I have a list of boolean values that can contain many hundreds of millions of elements for which I want to count the number of True values in a sub-sequence, one from the start up to some value (say hi). I am currently using: sieve[:hi].count(True) but I believe this may be costly because it copies a possibly large part of the sieve. Ideally I would like to be able to use: sieve.count(True, hi) where 'hi' sets the end of the count but this function is, sadly, not available for lists. The use of a bytearray with a memoryview object instead of a list solves this particular problem but it is not a solution for me as it creates more problems than it solves in other aspects of the program. Can I assume that one possible solution would be to sub-class list and create a C based extension to provide list.count(value, limit)? Are there any other solutions that will avoid copying a large part of the list? -- http://mail.python.org/mailman/listinfo/python-list
Re: Ubuntu package python3 does not include tkinter
On Apr 22, 4:18 pm, lcrocker leedanielcroc...@gmail.com wrote: On Apr 21, 11:36 pm, Rui Maciel rui.mac...@gmail.com wrote: Steven D'Aprano wrote: It's only easy to install a package on Ubuntu if you know that you have to, and can somehow work out the name of the package. No one actually has to install tkinter. That's the whole point of providing it as a separate package: only those who want to use it have to install it. The rest of us don't. I'm a programmer, I installed Tkinter, and use it. I'd like to deploy programs written with it to others. **Those** people know nothing about it, and **shouldn't have to**. I've given them a program in Python, they have Python, but it doesn't run, and doesn't give them a helpful error. They'll probably just blame me and move on. Not every Python user is a programmer. If I write a program in Java, any user with Java installed can run it. As it stands, that's no true for Python. That's not good PR for the cause. On the whole agree -- except for the java part -- maybe you've not heard of 'jar hell'? On the whole easy-deployability without losing easy-programmability is a major research issue. See this for someone choosing C++ over Lisp http://comments.gmane.org/gmane.comp.finance.ledger.general/1955 -- http://mail.python.org/mailman/listinfo/python-list
Selenium Webdriver + Python (How to get started ??)
Note: I have some experience of using Selenium IDE and WebDriver (Java), but no prior experience of Python. Now there is a project for which I will need to work with WebDriver + Python. So far I have done the following steps:

- Installed the JDK
- Set up Eclipse
- Downloaded and installed Python v3.3.1
- Downloaded, installed and configured PyDev (for Eclipse)
- Downloaded and installed Distribute + pip from http://www.lfd.uci.edu/~gohlke/pythonlibs/#pip
- Installed Selenium using the command prompt

Running the following commands from the Windows 7 command prompt successfully opens the Firefox browser:

    python
    from selenium import webdriver
    webdriver.Firefox()

The issue is that I do not know the exact steps for creating a Python WebDriver test project. I created a new PyDev project with a src folder and used sample Python code from the internet, but the Selenium classes are not recognized. I have tried various approaches to importing the libraries but none seems to work. Can anyone guide me, step by step, on what I need to do to successfully run a simple test via Python WebDriver (Eclipse + PyDev)? Thank you.
Re: Error in Import gv module
Please avoid top posting and answer to the list.

On 22.04.2013 12:38, Megha Agrawal wrote:
> Windows 7, and I have the pygraphviz library in the python27 -> lib -> site-packages folder.

Sorry, I don't know much about Windows. Have you read through all the issues involving "import gv" errors?

https://code.google.com/p/python-graph/issues/list?can=1&q=import+gv&colspec=ID+Type+Status+Priority+Milestone+Owner+Summary&cells=tiles

Bye, Andreas
Re: How to set my gui?
On Friday 19 April 2013 22:16:18 Chris Angelico did opine: On Sat, Apr 20, 2013 at 9:10 AM, Dennis Lee Bieber wlfr...@ix.netcom.com wrote: On Fri, 19 Apr 2013 09:24:36 +1000, Chris Angelico ros...@gmail.com declaimed the following in gmane.comp.python.general: On Fri, Apr 19, 2013 at 8:57 AM, Walter Hurry walterhu...@lavabit.com wrote: On Fri, 19 Apr 2013 08:00:11 +1000, Chris Angelico wrote: But 1 Corinthians 13:11 You are grown up now, I surmise. : :) Born in 1984, so that'll give you some idea where I was in the :1990s. : A puppy to be taught by greymuzzles (unfortunately, /this/ greymuzzle [1958] has reached the point of being an old dog that only learns new tricks with extreme difficulty G) Yep, taught by my Dad, who has often told the story of how he once held a whole kilobyte of memory in his hands (something like a cubic meter in size). He introduced me to programming, to fiddling with the system configs (actually he forbade that, for ages - because he had to clean up the mess if the system wouldn't boot), and to the joys of networking. So in a large way he's why I'm a geek... and actually he started that even earlier, when I was given the name Chris at birth. That on its own probably is the biggest cause of my geekery, I think! ChrisA Buncha spring chickens, the whole lot of you. Born in '34, I was a geek before the word was invented. But like some of you claim, I am now that old dog that doesn't learn new tricks easily. Cheers, Gene -- There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) My web page: http://coyoteden.dyndns-free.com:85/gene is up! My views http://www.armchairpatriot.com/What%20Has%20America%20Become.shtml Mandrell: You know what I think? Doctor: Ah, ah that's a catch question. With a brain your size you don't think, right? -- Dr. Who A pen in the hand of this president is far more dangerous than a gun in the hands of 200 million law-abiding citizens. 
Confusing Algorithm
I'm facing the following problem: In a city of towerblocks, Spiderman can “cover” all the towers by connecting the first tower with a spider-thread to the top of a later tower and then to a next tower and then to yet another tower until he reaches the end of the city. Threads are straight lines and cannot intersect towers. Your task is to write a program that finds the minimal number of threads to cover all the towers. The list of towers is given as a list of single digits indicating their height. -Example: List of towers: 1 5 3 7 2 5 2 Output: 4 I'm not sure how a 'towerblock' could be defined. How square does a shape have to be to qualify as a towerblock? Any help on solving this problem? Thank you. -- http://mail.python.org/mailman/listinfo/python-list
Re: List Count
On 04/22/2013 07:58 AM, Blind Anagram wrote:
> I would be grateful for any advice people can offer on the fastest way to count items in a sub-sequence of a large list. I have a list of boolean values that can contain many hundreds of millions of elements for which I want to count the number of True values in a sub-sequence, one from the start up to some value (say hi).
>
> I am currently using:
>
>     sieve[:hi].count(True)
>
> but I believe this may be costly because it copies a possibly large part of the sieve.
>
> Ideally I would like to be able to use:
>
>     sieve.count(True, hi)
>
> where 'hi' sets the end of the count but this function is, sadly, not available for lists.
>
> The use of a bytearray with a memoryview object instead of a list solves this particular problem but it is not a solution for me as it creates more problems than it solves in other aspects of the program.
>
> Can I assume that one possible solution would be to sub-class list and create a C based extension to provide list.count(value, limit)? Are there any other solutions that will avoid copying a large part of the list?

Instead of using the default slice notation, why not use itertools.islice()?

Something like (untested):

    import itertools

    it = itertools.islice(sieve, 0, hi)
    sum(itertools.imap(bool, it))

I only broke it into two lines for clarity. It could also be:

    sum(itertools.imap(bool, itertools.islice(sieve, 0, hi)))

If you're using Python 3.x, say so, and I'm sure somebody can simplify these, since in Python 3, many functions already produce iterators instead of lists.

--
DaveA
Re: Ubuntu package python3 does not include tkinter
Op 22-04-13 11:18, Steven D'Aprano schreef: On Mon, 22 Apr 2013 03:08:24 -0500, Andrew Berg wrote: Much of the stdlib doesn't rely on anything but the core interpreter. tkinter by itself is not the issue. As you said, the bindings are tiny. However, in order to be usable, it requires quite a few things - most notably X. On desktop Linux, this is already installed, but on server systems, it generally is not (or at least shouldn't be in most cases). Going back to my example of a web server using a Python-based framework, I'll repeat that there is no reason such a system should have X even installed in order to serve web pages. Even on a lean, mean server machine, CPython requires only a few extra libraries. Add tkinter, and suddenly you have to install a LOT of things. If you plan to actually use tkinter, this is fine. If not, you've just added a lot of stuff that you don't need. This adds unnecessary overhead in several places (like your package system's database). I can't disagree with any of this, except to say that none of this justifies having a separate package for Tkinter. Naturally if you don't have X, Tcl won't work, and if Tcl won't work, Tkinter won't work and should give an import error. But that doesn't imply that X must be a dependency for Python. It's a dependency for having Tkinter *work*, but not for *installing* Tkinter as part of the standard library. Hell, even if you have X installed, and Tcl, and the Tkinter packages, importing tkinter can still fail, if Python wasn't built with the right magic incantations for it to recognise that Tcl is installed. Then don't use a package system. The job of a package system is, that if you install something, it install all dependencies that are needed to make it work. And if, as the OP you thinks, python working, means tkinter working, not installing tcl and not installing X, is not an option. 
Your solution doesn't make sense in view of your earlier response, where you argue that tkinter should be installed because it is part of the standard library and point to the advantage of having a standard library. But IMO a part of that standard library not working is just as harmful as part of that standard library not being installed. From a user's/programmer's point of view the result is the same: it is unusable.
Re: Confusing Algorithm
On Mon, Apr 22, 2013 at 10:39 PM, RBotha r...@ymond.co.za wrote:
> I'm facing the following problem:
>
> In a city of towerblocks, Spiderman can “cover” all the towers by connecting the first tower with a spider-thread to the top of a later tower and then to a next tower and then to yet another tower until he reaches the end of the city. Threads are straight lines and cannot intersect towers. Your task is to write a program that finds the minimal number of threads to cover all the towers. The list of towers is given as a list of single digits indicating their height.
>
> -Example:
> List of towers: 1 5 3 7 2 5 2
> Output: 4
>
> I'm not sure how a 'towerblock' could be defined. How square does a shape have to be to qualify as a towerblock? Any help on solving this problem?

First start by clarifying the problem. My reading of this is that Spiderman iterates over the towers, connecting his thread from one to the next, but only so long as the towers get shorter:

    First thread 1
    New thread 5-3
    New thread 7-2
    New thread 5-2

There are other possible readings of the problem.

Once you fully understand the problem, write it out in pseudo-code - something like this:

    Iterate over towers sequentially
        If next tower is taller than current tower, new thread
    Report number of new threads

And then turn it into code, and start running it and playing with it... and debugging it. (There's a small error in the pseudo-code I just posted; can you spot it?) Once you're at that point, if you get stuck, post your code and we can try to help.

But fundamentally, I think you don't _need_ to define a towerblock. :)

ChrisA
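Under the reading given in this reply (a new thread starts whenever the next tower is taller than the current one), the count could be sketched in Python roughly as follows. This is only one interpretation of the puzzle, not a confirmed solution, and the function name count_threads is made up here.

```python
def count_threads(towers):
    """Count threads, starting a new one whenever the next tower is taller."""
    if not towers:
        return 0
    threads = 1  # the first tower always starts a thread
    for current, following in zip(towers, towers[1:]):
        if following > current:
            threads += 1
    return threads


print(count_threads([1, 5, 3, 7, 2, 5, 2]))  # prints 4
```

For the example list this gives 4, which matches the expected output in the problem statement, though agreement on one example does not prove the reading is the intended one.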
Re: List Count
On 22/04/2013 13:51, Dave Angel wrote: On 04/22/2013 07:58 AM, Blind Anagram wrote: I would be grateful for any advice people can offer on the fastest way to count items in a sub-sequence of a large list. I have a list of boolean values that can contain many hundreds of millions of elements for which I want to count the number of True values in a sub-sequence, one from the start up to some value (say hi). I am currently using: sieve[:hi].count(True) but I believe this may be costly because it copies a possibly large part of the sieve. Ideally I would like to be able to use: sieve.count(True, hi) where 'hi' sets the end of the count but this function is, sadly, not available for lists. The use of a bytearray with a memoryview object instead of a list solves this particular problem but it is not a solution for me as it creates more problems than it solves in other aspects of the program. Can I assume that one possible solution would be to sub-class list and create a C based extension to provide list.count(value, limit)? Are there any other solutions that will avoid copying a large part of the list? Instead of using the default slice notation, why not use itertools.islice() ? Something like (untested): import itertools it = itertools.islice(sieve, 0, hi) sum(itertools.imap(bool, it)) I only broke it into two lines for clarity. It could also be: sum(itertools.imap(bool, itertools.islice(sieve, 0, hi))) If you're using Python 3.x, say so, and I'm sure somebody can simplify these, since in Python 3, many functions already produce iterators instead of lists. Thanks, I'll look at these ideas. And, yes, my interest is mainly in Python 3. -- http://mail.python.org/mailman/listinfo/python-list
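In Python 3, map is already lazy and itertools.imap no longer exists, so the islice suggestion can be written more simply. A minimal sketch follows; the function name count_true_upto is hypothetical. It avoids the sliced copy, but that alone says nothing about speed, which would need to be measured.

```python
from itertools import islice


def count_true_upto(sieve, hi):
    """Count truthy items in sieve[:hi] without building a sliced copy."""
    return sum(map(bool, islice(sieve, hi)))


sieve = [False, False, True, True, False, True]
print(count_true_upto(sieve, 4))  # prints 2
```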
Re: Error in Import gv module
Yes, I did. They said the gv module doesn't exist for Windows.

On Mon, Apr 22, 2013 at 5:56 PM, Andreas Perstinger andiper...@gmail.com wrote:
> Please avoid top posting and answer to the list.
>
> On 22.04.2013 12:38, Megha Agrawal wrote:
>> Windows 7, and I have the pygraphviz library in the python27 -> lib -> site-packages folder.
>
> Sorry, I don't know much about Windows. Have you read through all the issues involving "import gv" errors?
>
> https://code.google.com/p/python-graph/issues/list?can=1&q=import+gv&colspec=ID+Type+Status+Priority+Milestone+Owner+Summary&cells=tiles
>
> Bye, Andreas
Re: List Count
On Mon, 22 Apr 2013 12:58:20 +0100, Blind Anagram wrote:
> I would be grateful for any advice people can offer on the fastest way to count items in a sub-sequence of a large list. I have a list of boolean values that can contain many hundreds of millions of elements for which I want to count the number of True values in a sub-sequence, one from the start up to some value (say hi).
>
> I am currently using:
>
>     sieve[:hi].count(True)
>
> but I believe this may be costly because it copies a possibly large part of the sieve.

Have you timed it? Because Python is a high-level language, it is rarely obvious what code will be fast. Yes, sieve[:hi] will copy the first hi entries, but that's likely to be fast, basically just a memcopy, unless sieve is huge and memory is short. In other words, unless your sieve is so huge that the operating system cannot find enough memory for it, making a copy is likely to be relatively insignificant.

I've just tried seven different techniques to optimize this, and the simplest, most obvious technique is by far the fastest. Here are the seven different code snippets I measured, with results:

    sieve[:hi].count(True)
    sum(sieve[:hi])
    sum(islice(sieve, hi))
    sum(x for x in islice(sieve, hi) if x)
    sum(x for x in islice(sieve, hi) if x is True)
    sum(1 for x in islice(sieve, hi) if x is True)
    len(list(filter(None, islice(sieve, hi))))

Here's the code I used to time them. Just copy and paste into an interactive interpreter:

=== cut ===
import random
sieve = [random.random() < 0.5 for i in range(10**7)]

from timeit import Timer
setup = """
from __main__ import sieve
from itertools import islice
hi = 7*10**6
"""

t1 = Timer("sieve[:hi].count(True)", setup)
t2 = Timer("sum(sieve[:hi])", setup)
t3 = Timer("sum(islice(sieve, hi))", setup)
t4 = Timer("sum(x for x in islice(sieve, hi) if x)", setup)
t5 = Timer("sum(x for x in islice(sieve, hi) if x is True)", setup)
t6 = Timer("sum(1 for x in islice(sieve, hi) if x is True)", setup)
t7 = Timer("len(list(filter(None, islice(sieve, hi))))", setup)

for t in (t1, t2, t3, t4, t5, t6, t7):
    print( min(t.repeat(number=10)) )
=== cut ===

On my computer, using Python 3.3, here are the timing results I get:

2.3714727330952883
7.96061935601756
7.230580328032374
10.080201900098473
11.544118068180978
9.216834562830627
3.499635103158653

Times shown are in seconds, and are for the best of three trials, each trial having 10 repetitions of the code being tested.

As you can see, clever tricks using sum are horrible pessimisations; the only thing that comes close to the obvious solution is the one using filter.

Although I have only tested a list with ten million items, not hundreds of millions, I don't expect that the results will be significantly different if you use a larger list, unless you are very short of memory.

[...]

> Can I assume that one possible solution would be to sub-class list and create a C based extension to provide list.count(value, limit)?

Of course. But don't optimize this until you know that you *need* to optimize it. Is it really a bottleneck in your code? There's no point in saving the 0.1 second it takes to copy the list if it takes 2 seconds to count the items regardless.

> Are there any other solutions that will avoid copying a large part of the list?

Yes, but they're slower.

Perhaps a better solution might be to avoid counting anything. If you can keep a counter, and each time you add a value to the list you update the counter, then getting the number of True values will be instantaneous.

--
Steven
Re: Ubuntu package python3 does not include tkinter
On Mon, Apr 22, 2013 at 9:18 PM, lcrocker leedanielcroc...@gmail.com wrote: On Apr 21, 11:36 pm, Rui Maciel rui.mac...@gmail.com wrote: Steven D'Aprano wrote: It's only easy to install a package on Ubuntu if you know that you have to, and can somehow work out the name of the package. No one actually has to install tkinter. That's the whole point of providing it as a separate package: only those who want to use it have to install it. The rest of us don't. I'm a programmer, I installed Tkinter, and use it. I'd like to deploy programs written with it to others. **Those** people know nothing about it, and **shouldn't have to**. I've given them a program in Python, they have Python, but it doesn't run, and doesn't give them a helpful error. They'll probably just blame me and move on. Not every Python user is a programmer. If I write a program in Java, any user with Java installed can run it. As it stands, that's no true for Python. That's not good PR for the cause. If you're deploying only to Debian-based Linuxes (such as the Ubuntu you mentioned originally), then it may be worth distributing your program as a .deb file and declaring all the appropriate dependencies (which would then include python3-tk). Alternatively, just put an apt-get install python3-tk into your install script (which is what I do for internal deployments - if you need package XYZ for program Foo, inst-foo will install XYZ), or simply tell people they need to install it. How do you make sure they even have a Python 3.x? Whatever you do to ensure that, just add python3-tk to it. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: itertools.groupby
Jason Friedman jsf80238 at gmail.com writes:

> Thank you for the responses! Not sure yet which one I will pick.

Hi again,
I was working a bit on my own solution and on the one from Steven/Joshua, and maybe that helps you decide:

from itertools import groupby

def separate_on(iterable, separator):
    # based on groupby
    sep_len = len(separator)
    for is_header, item in groupby(iterable, lambda line: line[:sep_len] == separator):
        if is_header:
            header_tails = [h[sep_len:].strip() for h in item]
            for naked_header in header_tails[:-1]:
                yield (naked_header, [])
            header_tail = header_tails[-1]
        else:
            try:
                yield (header_tail, [s.strip() for s in item])
            except UnboundLocalError:
                yield (None, [s.strip() for s in item])

def group(iterable, separator):
    # Steven's/Joshua's rewritten
    sep_len = len(separator)
    accum = None
    header = None
    for item in iterable:
        item = item.strip()
        if item[:sep_len] == separator:
            if accum is not None:  # Don't bother if there are no accumulated lines.
                yield (header, accum)
            header = item[sep_len:]
            accum = []
        else:
            try:
                accum.append(item)
            except AttributeError:
                accum = [item]
    # Don't forget the last group of lines.
    yield (header, accum)

Both versions behave as follows:

- any line that *starts* with the separator is treated as a header line. The tail of that line is returned as the group's title in a tuple with the group's content, i.e. (header, [body]). If there's only the separator, the title is ''. I find this a more useful behaviour as it allows things like:

##Group1
elem1
elem2
elem3
##Group2
a
b
c
...

- if there are headers without body, they are reported as (header, []).
- if the first body has no header, that's reported as (None, [body]).

Advantages & Disadvantages of either form:

Steven's/Joshua's:
simple and fast: it's more readable I'd say, and for small groups the groupby implementation is about 1.5x slower than this one. The groupby version catches up with increasing group sizes (because it uses comprehensions instead of list.append I think), but it only wins with groups of ~1000 elements.

the groupby implementation:
more flexible: its yield statement deliberately returns a list of the elements, but before that you just have an iterator, which you could just as well turn into a tuple, set, string or anything without constructing the list in memory. So in terms of code recycling this might be preferable.

Cheers,
Wolfgang
Re: List Count
Blind Anagram wrote:

> I would be grateful for any advice people can offer on the fastest way to count items in a sub-sequence of a large list.
>
> I have a list of boolean values that can contain many hundreds of millions of elements for which I want to count the number of True values in a sub-sequence, one from the start up to some value (say hi).
>
> I am currently using:
>
>     sieve[:hi].count(True)
>
> but I believe this may be costly because it copies a possibly large part of the sieve.
>
> Ideally I would like to be able to use:
>
>     sieve.count(True, hi)
>
> where 'hi' sets the end of the count but this function is, sadly, not available for lists.
>
> The use of a bytearray with a memoryview object instead of a list solves this particular problem but it is not a solution for me as it creates more problems than it solves in other aspects of the program.
>
> Can I assume that one possible solution would be to sub-class list and create a C based extension to provide list.count(value, limit)?
>
> Are there any other solutions that will avoid copying a large part of the list?

If the list doesn't change often you can convert it to a string

    >>> items = [True, False, False] * 10
    >>> sitems = "".join("FT"[i] for i in items)
    >>> sitems
    'TFFTFFTFFTFFTFFTFFTFFTFFTFFTFF'
    >>> sitems.count("T", 3, 10)
    3
    >>> sitems.count("F", 3, 10)
    4

Or you use a[3:10].sum() on a boolean numpy array. Its slices are views rather than copies:

    >>> import numpy
    >>> a = numpy.array([True, False, False]*10)
    >>> a[3:10].sum()
    3
kbhit/getch python equivalent
Hi everyone,

I'm looking for a kbhit/getch equivalent in python in order to be able to stop my inner loop in a controlled way (communication with external hardware is involved and breaking it abruptly may cause unwanted errors on the protocol).

I'm programming on *nix systems, no need to be portable on Windows. I've seen the msvcrt module, but it looks like it is for Windows only.

Any ideas/suggestions?

Al

-- 
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Re: List Count
On Monday, 22 April 2013 at 7:58:20 PM UTC+8, Blind Anagram wrote:

> I would be grateful for any advice people can offer on the fastest way to count items in a sub-sequence of a large list.
>
> I have a list of boolean values that can contain many hundreds of millions of elements for which I want to count the number of True values in a sub-sequence, one from the start up to some value (say hi).
>
> I am currently using:
>
>     sieve[:hi].count(True)
>
> but I believe this may be costly because it copies a possibly large part of the sieve.
>
> Ideally I would like to be able to use:
>
>     sieve.count(True, hi)
>
> where 'hi' sets the end of the count but this function is, sadly, not available for lists.
>
> The use of a bytearray with a memoryview object instead of a list solves this particular problem but it is not a solution for me as it creates more problems than it solves in other aspects of the program.
>
> Can I assume that one possible solution would be to sub-class list and create a C based extension to provide list.count(value, limit)?
>
> Are there any other solutions that will avoid copying a large part of the list?

For problems involving a homogeneous list of numbers, please check whether numpy arrays fit your needs in practice. Sometimes I work on numbers in varied ranges, and then the list and the long integers in Python are really handy.
HTTPServer again
Hi, a few weeks back I posed a question about passing static data to a request server, and thanks to some useful suggestions, got it working. I see yesterday there is a suggestion to use a framework like Tornado rather than base classes. However I can't figure out how to achieve the same effect using Tornado (BTW this is all python 2.7). The key point is how to access the server class from within do_GET, and from the server class instance, to access its get and set methods.

Here are some code fragments that work with HTTPServer:

class MyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        ss = self.server
        tracks = ss.tracks
        . . .

class MyWebServer(object):
    def get_params(self):
        return self.global_params
    def set_params(self, params):
        self.global_params = params
    def get_tracks(self):
        return self.tracks
    def __init__(self):
        self.global_params = self.tracks = setup_()
        myServer = HTTPServer
        myServer.tracks = self.get_tracks()
        myServer.params = self.get_params()
        self.server = myServer(('', 7878), MyHandler)
        print 'started httpserver on port 7878...'
    . . . .

def main():
    aServer = MyWebServer()
    aServer.runIt()

if __name__ == '__main__':
    main()
Re: List Count
Numpy is a big improvement here. In Py 2.7 I get this output if I run Steven's benchmark:

2.10364603996
3.68471002579
4.01849389076
7.41974878311
10.4202470779
9.16782712936
3.36137390137

(confirming his results). If I then run the numpy idiom for this:

import random
from timeit import Timer
import numpy

sieve = numpy.array([random.random() < 0.5 for i in range(10**7)], dtype=bool)

setup = """from __main__ import sieve
from itertools import islice
hi = 7*10**6
"""

t1 = Timer("(True == sieve[:hi]).sum()", setup)
print(min(t1.repeat(number=10)))

###

I get:

0.344316959381

It likely consumes less space as well, since it doesn't store Python objects in the array.

Skip
Re: kbhit/getch python equivalent
alb wrote: I'm looking for a kbhit/getch equivalent in python in order to be able to stop my inner loop in a controlled way (communication with external hardware is involved and breaking it abruptly may cause unwanted errors on the protocol). I'm programming on *nix systems, no need to be portable on Windows. I've seen the msvcrt module, but it looks like is for Windows only. Any ideas/suggestions? Curses? http://docs.python.org/dev/library/curses.html -- http://mail.python.org/mailman/listinfo/python-list
Re: kbhit/getch python equivalent
On 2013-04-22, alb alessandro.bas...@cern.ch wrote:

> I'm looking for a kbhit/getch equivalent in python in order to be able to stop my inner loop in a controlled way (communication with external hardware is involved and breaking it abruptly may cause unwanted errors on the protocol).
>
> I'm programming on *nix systems, no need to be portable on Windows. I've seen the msvcrt module, but it looks like is for Windows only.
>
> Any ideas/suggestions?

Signals, ncurses, termios.

-- 
Grant Edwards               grant.b.edwards        Yow! ANN JILLIAN'S HAIR
                                  at               makes LONI ANDERSON'S
                              gmail.com            HAIR look like RICARDO
                                                   MONTALBAN'S HAIR!
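For the termios route, a rough sketch might look like the following. It is *nix only and has not been tried against real hardware; the names cbreak, kbhit, getch and run_until_keypress are invented here to mirror the DOS functions, and step stands for whatever one complete exchange with the device is.

```python
import os
import select
import sys
import termios
import tty
from contextlib import contextmanager


@contextmanager
def cbreak(fd):
    """Put the terminal on fd into cbreak mode and restore it afterwards."""
    old_settings = termios.tcgetattr(fd)
    try:
        tty.setcbreak(fd)
        yield
    finally:
        termios.tcsetattr(fd, termios.TCSADRAIN, old_settings)


def kbhit(fd, timeout=0.0):
    """Return True if at least one byte is waiting to be read on fd."""
    readable, _, _ = select.select([fd], [], [], timeout)
    return bool(readable)


def getch(fd):
    """Read a single byte from fd (blocks if nothing is waiting)."""
    return os.read(fd, 1)


def run_until_keypress(step):
    """Call step() repeatedly until any key is pressed on stdin.

    step is a hypothetical callable performing one complete exchange
    with the hardware, so the loop only stops between exchanges.
    """
    fd = sys.stdin.fileno()
    with cbreak(fd):
        while not kbhit(fd):
            step()
        return getch(fd)  # consume the key that ended the loop
```

Because the key is only checked between calls to step(), the protocol exchange in progress always runs to completion before the loop exits.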
Re: List Count
On 22/04/2013 14:13, Steven D'Aprano wrote: On Mon, 22 Apr 2013 12:58:20 +0100, Blind Anagram wrote: I would be grateful for any advice people can offer on the fastest way to count items in a sub-sequence of a large list. I have a list of boolean values that can contain many hundreds of millions of elements for which I want to count the number of True values in a sub-sequence, one from the start up to some value (say hi). I am currently using: sieve[:hi].count(True) but I believe this may be costly because it copies a possibly large part of the sieve. Have you timed it? Because Python is a high-level language, it is rarely obvious what code will be fast. Yes, sieve[:hi] will copy the first hi entries, but that's likely to be fast, basically just a memcopy, unless sieve is huge and memory is short. In other words, unless your sieve is so huge that the operating system cannot find enough memory for it, making a copy is likely to be relatively insignificant. I've just tried seven different techniques to optimize this, and the simplest, most obvious technique is by far the fastest. Here are the seven different code snippets I measured, with results: sieve[:hi].count(True) sum(sieve[:hi]) sum(islice(sieve, hi)) sum(x for x in islice(sieve, hi) if x) sum(x for x in islice(sieve, hi) if x is True) sum(1 for x in islice(sieve, hi) if x is True) len(list(filter(None, islice(sieve, hi Yes, I did time it and I agree with your results (where my tests overlap with yours). But when using a sub-sequence, I do suffer a significant reduction in speed for a count when compared with count on the full list. When the list is small enough not to cause memory allocation issues this is about 30% on 100,000,000 items. But when the list is 1,000,000,000 items, OS memory allocation becomes an issue and the cost on my system rises to over 600%. 
I agree that this is not a big issue but it seems to me a high price to pay for the lack of a sieve.count(value, limit), which I feel is a useful function (given that memoryview operations are not available for lists). Of course. But don't optimize this until you know that you *need* to optimize it. Is it really a bottleneck in your code? There's no point in saving the 0.1 second it takes to copy the list if it takes 2 seconds to count the items regardless. Are there any other solutions that will avoid copying a large part of the list? Yes, but they're slower. Perhaps a better solution might be to avoid counting anything. If you can keep a counter, and each time you add a value to the list you update the counter, then getting the number of True values will be instantaneous. Creating the sieve is currently very fast as it is not done by adding single items but by adding a large number of items at the same time using a slice operation. I could count the items in each slice as it is added but this would add complexity that I would prefer to avoid because the creation of the sieve is quite tricky to get right and I would prefer not to fiddle with this. Thank you (and others) for advice on this. -- http://mail.python.org/mailman/listinfo/python-list
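One pure-Python way to keep the speed of list.count while bounding the size of the temporary copy is to count in fixed-size chunks. This is only a sketch: count_upto and the chunk parameter are names made up for illustration, hi is assumed to be non-negative, and it has not been timed on a list of the size discussed here.

```python
def count_upto(seq, value, hi, chunk=1 << 20):
    """Count occurrences of value in seq[:hi].

    At most `chunk` items are copied at any one time, so the peak extra
    memory stays fixed however large hi is. Assumes hi >= 0.
    """
    hi = min(hi, len(seq))
    total = 0
    for start in range(0, hi, chunk):
        stop = min(start + chunk, hi)
        # Each slice is a short-lived copy of at most `chunk` items.
        total += seq[start:stop].count(value)
    return total
```

With this, count_upto(sieve, True, hi) plays the role of the wished-for sieve.count(True, hi), at the cost of one Python-level loop iteration per chunk.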
Re: Preparing sqlite, dl and tkinter for Python installation (no admin rights)
On 21.04.13 23:31, James Jong wrote: I see, just to be clear, do you mean that Python 2.7.4 (stable) is incompatible with Tk 8.6 (stable)? Yes. -- http://mail.python.org/mailman/listinfo/python-list
Re: itertools.groupby
On 2013-04-20, Jason Friedman jsf80...@gmail.com wrote:

> I have a file such as:
>
> $ cat my_data
> Starting a new group
> a
> b
> c
> Starting a new group
> 1
> 2
> 3
> 4
> Starting a new group
> X
> Y
> Z
> Starting a new group
>
> I am wanting a list of lists:
> ['a', 'b', 'c']
> ['1', '2', '3', '4']
> ['X', 'Y', 'Z']
> []

Hrmmm, hoomm. Nobody cares for slicing any more.

def headered_groups(lst, header):
    b = lst.index(header) + 1
    while True:
        try:
            e = lst.index(header, b)
        except ValueError:
            yield lst[b:]
            break
        yield lst[b:e]
        b = e+1

for group in headered_groups([line.strip() for line in open('data.txt')],
                             "Starting a new group"):
    print(group)

-- 
Neil Cerutti
Re: Weird behaviour?
On Apr 21, 9:19 pm, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote:
> On Mon, 22 Apr 2013 10:56:11 +1000, Chris Angelico wrote:
>
>> You're running this under Windows. The convention on Windows is for end-of-line to be signalled with \r\n, but the convention inside Python is to use just \n. With the normal use of buffered and parsed input, this is all handled for you; with unbuffered input, that translation also seems to be disabled, so your string actually contains '120\r', as will be revealed by its repr().
>
> If that's actually the case, then I would call that a bug in raw_input.
>
> Actually, raw_input doesn't seem to cope well with embedded newlines even without the -u option. On Linux, I can embed a control character by typing Ctrl-V followed by Ctrl-char. E.g. Ctrl-V Ctrl-M to embed a carriage return, Ctrl-V Ctrl-J to embed a newline. So watch:
>
> [steve@ando ~]$ python2.7 -c "x = raw_input('Hello? '); print repr(x)"
> Hello? 120^M^Jabc
> '120\r'
>
> Everything after the newline is lost.
>
> --
> Steven

Maybe it is related to this bug?

http://bugs.python.org/issue11272
Re: There must be a better way
On 21 April 2013 14:15, Colin J. Williams c...@ncf.ca wrote:

> In the end, I used:
>
> inData= csv.reader(inFile)
>
> def main():
>     if ver == '2':
>         headerLine= inData.next()
>     else:
>         headerLine= inData.__next__()
>     ...
>     for item in inData:
>         assert len(dataStore) == len(item)
>         j= findCardinal(item[10])
>         ...

This may not be relevant for what you're doing but if you use csv.DictReader there's no need to retrieve the top line separately:

    $ cat tmp.csv
    a,b,c
    1,2,3
    4,5,6
    $ python
    Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import csv
    >>> with open('tmp.csv', 'rb') as csvfile:
    ...     for row in csv.DictReader(csvfile):
    ...         print(row)
    ...
    {'a': '1', 'c': '3', 'b': '2'}
    {'a': '4', 'c': '6', 'b': '5'}

Oscar
Re: There must be a better way
On 2013-04-21, Colin J. Williams c...@ncf.ca wrote:
> On 20/04/2013 9:07 PM, Terry Jan Reedy wrote:
>> On 4/20/2013 8:34 PM, Tim Chase wrote:
>>> In 2.x, the csv.reader() class (and csv.DictReader() class) offered a .next() method that is absent in 3.x
>>
>> In Py 3, .next was renamed to .__next__ for *all* iterators. The intention is that one iterate with for item in iterable or use builtin functions iter() and next().
>
> Thanks to Chris, Tim and Terry for their helpful comments.
>
> I was seeking some code that would be acceptable to both Python 2.7 and 3.3.
>
> In the end, I used:
>
> inData= csv.reader(inFile)
>
> def main():
>     if ver == '2':
>         headerLine= inData.next()
>     else:
>         headerLine= inData.__next__()
>     ...
>     for item in inData:
>         assert len(dataStore) == len(item)
>         j= findCardinal(item[10])
>         ...
>
> This is acceptable to both versions.
>
> It is not usual to have a name with preceding and following underscores in user code.
>
> Presumably, there is a rationale for the change from csv.reader.next to csv.reader.__next__.
>
> If next is not acceptable for the version 3 csv.reader, perhaps __next__ could be added to the version 2 csv.reader, so that the same code can be used in the two versions.
>
> This would avoid the kluge I used above.

Would using csv.DictReader instead of csv.reader be an option?

-- 
Neil Cerutti
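As Terry's remark about the builtin next() suggests, the version check can be dropped: next(reader) is available from Python 2.6 onwards as well as in 3.x. A small sketch, where read_with_header and the sample lines are invented for illustration (csv.reader accepts any iterable of strings, so a list stands in for the open file):

```python
import csv


def read_with_header(lines):
    """Split CSV input into (header_row, data_rows)."""
    reader = csv.reader(lines)
    # The builtin next() calls .next() on 2.x and .__next__() on 3.x.
    header = next(reader)
    rows = list(reader)
    return header, rows


sample = ["a,b,c", "1,2,3", "4,5,6"]
header, rows = read_with_header(sample)
```

The same call works unchanged on csv.DictReader, although there the header is already consumed for you.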
Re: itertools.groupby
On 22 April 2013 15:24, Neil Cerutti ne...@norwich.edu wrote:

> Hrmmm, hoomm. Nobody cares for slicing any more.
>
> def headered_groups(lst, header):
>     b = lst.index(header) + 1
>     while True:
>         try:
>             e = lst.index(header, b)
>         except ValueError:
>             yield lst[b:]
>             break
>         yield lst[b:e]
>         b = e+1

This requires the whole file to be read into memory. Iterators are typically preferred over list slicing for sequential text file access since you can avoid loading the whole file at once. This means that you can process a large file while only using a constant amount of memory.

> for group in headered_groups([line.strip() for line in open('data.txt')],
>                              "Starting a new group"):
>     print(group)

The list comprehension above loads the entire file into memory. Assuming that .strip() is just being used to remove the newline at the end it would be better to just use read().splitlines(), since that loads everything into memory and removes the newlines (readlines() would keep them). To remove them without reading everything you can use map (or imap in Python 2):

with open('data.txt') as inputfile:
    for group in headered_groups(map(str.strip, inputfile), "Starting a new group"):
        print(group)

Oscar
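Note that Neil's headered_groups calls lst.index, so it needs a real list; handing it a lazy map or imap object would fail. An accumulating variant that accepts any iterable could look roughly like this (iter_headered_groups is a name invented here, and, like the slicing version, it ignores anything before the first header):

```python
def iter_headered_groups(lines, header):
    """Yield one list per header line, holding the lines that follow it."""
    group = None  # None until the first header has been seen
    for line in lines:
        if line == header:
            if group is not None:
                yield group
            group = []
        elif group is not None:
            group.append(line)
    if group is not None:
        yield group  # the last group, possibly empty


data = [
    "Starting a new group", "a", "b", "c",
    "Starting a new group", "1", "2", "3", "4",
    "Starting a new group", "X", "Y", "Z",
    "Starting a new group",
]
groups = list(iter_headered_groups(iter(data), "Starting a new group"))
```

Only the current group is held in memory, so the input can be a file object wrapped in map(str.strip, ...) as suggested above.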
error importing modules
I'm using the fabric api (fabfile.org)

I'm executing my fab script like the following:

$ fab -H server set_nic_buffers -f set_nic_buffers.py
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/fabric/main.py", line 739, in main
    *args, **kwargs
  File "/usr/lib/python2.7/site-packages/fabric/tasks.py", line 316, in execute
    multiprocessing
  File "/usr/lib/python2.7/site-packages/fabric/tasks.py", line 213, in _execute
    return task.run(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/fabric/tasks.py", line 123, in run
    return self.wrapped(*args, **kwargs)
  File "/home/rbrown/repos/unix-tools/tools/fabfiles/nb.py", line 5, in set_nic_buffers
    f_exec = modules.Fabexec('set_nic_buffers', '/var/tmp/unix-tools/tools/set_nic_buffers.sh')
TypeError: 'module' object is not callable

My paths all seem to be fine, not sure what's going on

$ python -c 'import modules.Fabexec; print (modules.Fabexec)'
module 'modules.Fabexec' from 'modules/Fabexec.pyc'

Fabfiles
|-- modules
|   |-- Fabexec.py
|   |-- Fabexec.pyc
|   |-- __init__.py
|   `-- __init__.pyc
|-- systune.py
|-- systune.pyc
`-- set_nic_buffers.py

--- set_nic_buffers.py ---
import modules
from modules import Fabexec

def set_nic_buffers():
    f_exec = modules.Fabexec('set_nic_buffers', '/var/tmp/unix-tools/tools/set_nic_buffers.sh')
    f_exec.run()

--- Fabexec.py ---
from fabric.api import run, cd, sudo, env
from fabric.contrib import files
from fabric.colors import green

class Fabexec(object):
    repobase='/var/tmp/unix-tools'

    def __init__(self,script_name,install_script):
        self.script_name = script_name
        self.install_script = install_script

    def run(self):
        if files.exists(self.install_script):
            with cd(self.repobase):
                result = sudo(self.install_script + ' %s ' % env.host)
                if result.return_code != 0:
                    print(red('Error occured executing %s' % self.install_script))
                else:
                    print(green('%s executed successfully'))
        else:
            print(red('Error no such dir %s try running repo deploy script to host %s' % (self.repobase, env.host)))
            raise SystemExit()
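The python -c output above shows that modules.Fabexec is the module Fabexec.py, not the class of the same name defined inside it, which would explain the TypeError. The sketch below reproduces the situation without Fabric; the module object is built by hand with types.ModuleType purely as a stand-in for modules/Fabexec.py, and the class body is cut down to its constructor.

```python
import types

# Stand-in for modules/Fabexec.py: a module that contains a class
# with the same name as the module itself.
Fabexec_module = types.ModuleType("Fabexec")


class Fabexec(object):
    def __init__(self, script_name, install_script):
        self.script_name = script_name
        self.install_script = install_script


Fabexec_module.Fabexec = Fabexec

# Calling the module raises the same kind of TypeError as in the traceback.
try:
    Fabexec_module("set_nic_buffers", "/var/tmp/unix-tools/tools/set_nic_buffers.sh")
except TypeError as exc:
    error_message = str(exc)

# Calling the class that lives inside the module works.
f_exec = Fabexec_module.Fabexec(
    "set_nic_buffers", "/var/tmp/unix-tools/tools/set_nic_buffers.sh"
)
```

If that reading is right, set_nic_buffers.py would need either modules.Fabexec.Fabexec(...) or "from modules.Fabexec import Fabexec" followed by Fabexec(...).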
Re: Error in Import gv module
You are still top posting. On 22.04.2013 14:43, Megha Agrawal wrote: yes, I did. They said, gv module doesn't exist for windows. Then I'm afraid you are out of luck. Two possible alternatives: 1) Save your graph to a file and use the command line tools: http://stackoverflow.com/a/12698636 2) Try some other Graphviz bindings. A quick search on PyPi gave me: https://pypi.python.org/pypi/pygraphviz/1.1 https://pypi.python.org/pypi/pydot/1.0.28 https://pypi.python.org/pypi/yapgvb/1.2.0 Bye, Andreas -- http://mail.python.org/mailman/listinfo/python-list
Re: Confusing Algorithm
On Tue, Apr 23, 2013 at 12:57 AM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 22 April 2013 13:56, Chris Angelico ros...@gmail.com wrote: There are other possible readings of the problem. I read it differently. I thought the threads would go 1-5-7-5-2. I hadn't thought of that one, but agreed, that's also plausible, and it results in an answer of 4. It's a stronger contender than the one I posited, because the wording implies that there are multiple ways to do it and you have to pick/find the best. Seems to me the problem's a little under-specified, tbh. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: kbhit/getch python equivalent
On Mon, Apr 22, 2013 at 11:34 PM, alb alessandro.bas...@cern.ch wrote: I'm looking for a kbhit/getch equivalent in python in order to be able to stop my inner loop in a controlled way (communication with external hardware is involved and breaking it abruptly may cause unwanted errors on the protocol). Catch KeyboardInterrupt and hit Ctrl-C. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
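A bare-bones sketch of that suggestion follows. The names run_until_interrupted, do_transaction and shutdown are placeholders invented here; shutdown stands for whatever leaves the hardware link in a clean state.

```python
def run_until_interrupted(do_transaction, shutdown):
    """Repeat do_transaction() until Ctrl-C, then always call shutdown().

    Returns the number of transactions that completed.
    """
    completed = 0
    try:
        while True:
            do_transaction()
            completed += 1
    except KeyboardInterrupt:
        pass  # Ctrl-C is the normal way to stop this loop
    finally:
        shutdown()
    return completed
```

One caveat: the interrupt can arrive in the middle of do_transaction(), so shutdown() has to cope with a half-finished exchange. If that is not acceptable, a signal handler that only sets a flag, checked between transactions, may be the safer variant.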
Re: Confusing Algorithm
On 22 April 2013 13:56, Chris Angelico ros...@gmail.com wrote: On Mon, Apr 22, 2013 at 10:39 PM, RBotha r...@ymond.co.za wrote: I'm facing the following problem: In a city of towerblocks, Spiderman can “cover” all the towers by connecting the first tower with a spider-thread to the top of a later tower and then to a next tower and then to yet another tower until he reaches the end of the city. Threads are straight lines and cannot intersect towers. Your task is to write a program that finds the minimal number of threads to cover all the towers. The list of towers is given as a list of single digits indicating their height. -Example: List of towers: 1 5 3 7 2 5 2 Output: 4 I'm not sure how a 'towerblock' could be defined. How square does a shape have to be to qualify as a towerblock? Any help on solving this problem? First start by clarifying the problem. My reading of this is that Spiderman iterates over the towers, connecting his thread from one to the next, but only so long as the towers get shorter: -Example: List of towers: 1 5 3 7 2 5 2 Output: 4 First thread 1 New thread 5-3 New thread 7-2 New thread 5-2 There are other possible readings of the problem. I read it differently. I thought the threads would go 1-5-7-5-2. Oscar -- http://mail.python.org/mailman/listinfo/python-list
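Under the 1-5-7-5-2 reading, the task becomes a shortest-path search: tower j is reachable from tower i if the straight line between their tops clears every tower in between, and the answer is the fewest hops from the first tower to the last. The sketch below is one interpretation only. It assumes towers sit at x = 0, 1, 2, ... and that a thread may graze the top of a tower; visible and min_threads are names made up here.

```python
from collections import deque


def visible(heights, i, j):
    """True if a thread from the top of tower i to the top of tower j
    (with i < j) passes over every tower in between.

    Assumption: touching a tower top exactly does not count as intersecting.
    """
    hi, hj = heights[i], heights[j]
    for k in range(i + 1, j):
        # Line height at x = k is hi + (hj - hi) * (k - i) / (j - i);
        # both sides are multiplied by (j - i) to stay in integers.
        if heights[k] * (j - i) > hi * (j - i) + (hj - hi) * (k - i):
            return False
    return True


def min_threads(heights):
    """Fewest threads needed to get from the first tower to the last (BFS)."""
    n = len(heights)
    if n < 2:
        return 0
    dist = [None] * n
    dist[0] = 0
    queue = deque([0])
    while queue:
        i = queue.popleft()
        if i == n - 1:
            break
        for j in range(i + 1, n):
            if dist[j] is None and visible(heights, i, j):
                dist[j] = dist[i] + 1
                queue.append(j)
    # Neighbouring towers are always mutually visible, so the end is reachable.
    return dist[n - 1]
```

For the example list 1 5 3 7 2 5 2 this gives 4, which matches the expected output, though so does the descending-runs reading, so the example alone does not settle which interpretation is intended.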
Re: itertools.groupby
On 2013-04-22, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 22 April 2013 15:24, Neil Cerutti ne...@norwich.edu wrote: Hrmmm, hoomm. Nobody cares for slicing any more. def headered_groups(lst, header): b = lst.index(header) + 1 while True: try: e = lst.index(header, b) except ValueError: yield lst[b:] break yield lst[b:e] b = e+1 This requires the whole file to be read into memory. Iterators are typically preferred over list slicing for sequential text file access since you can avoid loading the whole file at once. This means that you can process a large file while only using a constant amount of memory. I agree, but this application processes unknowns-sized slices, you have to build lists anyhow. I find slicing much more convenient than accumulating in this case, but it's possibly a tradeoff. with open('data.txt') as inputfile: for group in headered_groups(map(str.strip, inputfile)): print(group) Thanks, that's a nice improvement. -- Neil Cerutti -- http://mail.python.org/mailman/listinfo/python-list
Re: itertools.groupby
On Tue, Apr 23, 2013 at 12:49 AM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: Iterators are typically preferred over list slicing for sequential text file access since you can avoid loading the whole file at once. This means that you can process a large file while only using a constant amount of memory. And, perhaps even more importantly, allows you to pipe text in and out. Obviously some operations (eg grep) lend themselves better to this than do others (eg sort), but with this it ought at least to output each group as it comes. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: Serial Port Issue
Have you tried 'port=20'? The documentation says that the port numbering starts at zero. I don't use Windows so I can't test it for you. You could also try port="COM21"

Phil

On Apr 22, 2013, at 4:34 AM, chandan kumar wrote:

> Hi,
>
> I'm new to python and trying to learn serial communication using python. In this process I'm facing serial port issues. Please find the attached COMPorttest.py file, and correct me if anything is wrong in the code. With my code it always goes into the exception. I noted down the COM port number from the Windows device manager list.
>
> Operating system: XP
> Python Ver: 2.5
> Pyserial: 2.5
>
> I even tried from the python shell, passing the commands below
>
> import serial
> ser=ser=serial.Serial(port=21,baudrate=9600)
>
> I observe the error below on the python shell
>
> File "C:\Python25\lib\serial\serialwin32.py", line 55, in open
>     raise SerialException("could not open port: %s" % msg)
> SerialException: could not open port: (2, 'CreateFile', 'The system cannot find the file specified.')
>
> Thanks in advance.
>
> Best Regards,
> Chandan.
Re: List Count
On 22 April 2013 15:15, Blind Anagram blindanag...@nowhere.org wrote:
> On 22/04/2013 14:13, Steven D'Aprano wrote:
>> On Mon, 22 Apr 2013 12:58:20 +0100, Blind Anagram wrote:
>>> I would be grateful for any advice people can offer on the fastest way to count items in a sub-sequence of a large list.
>>>
>>> I have a list of boolean values that can contain many hundreds of millions of elements for which I want to count the number of True values in a sub-sequence, one from the start up to some value (say hi).
>>>
>>> I am currently using:
>>>
>>>     sieve[:hi].count(True)
>>>
>>> but I believe this may be costly because it copies a possibly large part of the sieve.
>
> [snip]
>
> But when using a sub-sequence, I do suffer a significant reduction in speed for a count when compared with count on the full list. When the list is small enough not to cause memory allocation issues this is about 30% on 100,000,000 items. But when the list is 1,000,000,000 items, OS memory allocation becomes an issue and the cost on my system rises to over 600%.

Have you tried using numpy? I find that it reduces the memory required to store a list of bools by a factor of 4 on my 32 bit system. I would expect that to be a factor of 8 on a 64 bit system:

    >>> import sys
    >>> a = [True] * 100
    >>> sys.getsizeof(a)
    436
    >>> import numpy
    >>> a = numpy.ndarray(100, bool)
    >>> sys.getsizeof(a)  # This does not include the data buffer
    40
    >>> a.nbytes
    100

The numpy array also has the advantage that slicing does not actually copy the data (as has already been mentioned). On this system slicing a numpy array has a 40 byte overhead regardless of the size of the slice.

> I agree that this is not a big issue but it seems to me a high price to pay for the lack of a sieve.count(value, limit), which I feel is a useful function (given that memoryview operations are not available for lists).

It would be very easy to subclass list and add this functionality in cython if you decide that you do need a builtin method.

Oscar
Re: List Count
On 22/04/2013 16:14, Oscar Benjamin wrote: On 22 April 2013 15:15, Blind Anagram blindanag...@nowhere.org wrote: On 22/04/2013 14:13, Steven D'Aprano wrote: On Mon, 22 Apr 2013 12:58:20 +0100, Blind Anagram wrote: I would be grateful for any advice people can offer on the fastest way to count items in a sub-sequence of a large list. I have a list of boolean values that can contain many hundreds of millions of elements for which I want to count the number of True values in a sub-sequence, one from the start up to some value (say hi). I am currently using: sieve[:hi].count(True) but I believe this may be costly because it copies a possibly large part of the sieve. [snip] But when using a sub-sequence, I do suffer a significant reduction in speed for a count when compared with count on the full list. When the list is small enough not to cause memory allocation issues this is about 30% on 100,000,000 items. But when the list is 1,000,000,000 items, OS memory allocation becomes an issue and the cost on my system rises to over 600%. Have you tried using numpy? I find that it reduces the memory required to store a list of bools by a factor of 4 on my 32 bit system. I would expect that to be a factor of 8 on a 64 bit system: import sys a = [True] * 100 sys.getsizeof(a) 436 import numpy a = numpy.ndarray(100, bool) sys.getsizeof(a) # This does not include the data buffer 40 a.nbytes 100 The numpy array also has the advantage that slicing does not actually copy the data (as has already been mentioned). On this system slicing a numpy array has a 40 byte overhead regardless of the size of the slice. I agree that this is not a big issue but it seems to me a high price to pay for the lack of a sieve.count(value, limit), which I feel is a useful function (given that memoryview operations are not available for lists). It would be very easy to subclass list and add this functionality in cython if you decide that you do need a builtin method. Thanks Oscar, I'll take a look at this. 
But I was really wondering if there was a simple solution that worked without people having to add libraries to their basic Python installations. As I have never tried building an extension with cython, I am inclined to try this as a learning exercise if nothing else. -- http://mail.python.org/mailman/listinfo/python-list
Re: List Count
On 22 April 2013 16:50, Blind Anagram blindanag...@nowhere.org wrote: It would be very easy to subclass list and add this functionality in cython if you decide that you do need a builtin method. [snip] But I was really wondering if there was a simple solution that worked without people having to add libraries to their basic Python installations. There are simple solutions and some have already been listed. You are attempting to push your program to the limit of your hardware capabilities and it's natural that in a high-level language you'll often want special libraries for that. I don't know what your application is but I would say that my first port of call here would be to consider a different algorithmic approach. An obvious question would be about the sparsity of this data structure. How frequent are the values that you are trying to count? Would it make more sense to store a list of their indices? If the problem needs to be solved the way that you are currently doing it and the available methods are not fast enough then you will need to consider additional libraries. As I have never tried building an extension with cython, I am inclined to try this as a learning exercise if nothing else. I definitely recommend this over writing a C extension directly. Oscar -- http://mail.python.org/mailman/listinfo/python-list
How to connect to a website?
Hi, i just try to connect to a website, read that page and display the rules get from it. Then i get this error message: File D:/Python/Py projects/socket test/sockettest.py, line 21, in module fileobj.write(GET +filename+ HTTP/1.0\n\n) io.UnsupportedOperation: not writable My code: # import sys for handling command line argument # import socket for network communications import sys, socket # hard-wire the port number for safety's sake # then take the names of the host and file from the command line port = 80 host = 'www..nl' filename = 'index.php' # create a socket object called 'c' c = socket.socket(socket.AF_INET, socket.SOCK_STREAM) # connect to the socket c.connect((host, port)) # create a file-like object to read fileobj = c.makefile('r', 1024) # Ask the server for the file fileobj.write(GET +filename+ HTTP/1.0\n\n) # read the lines of the file object into a buffer, buff buff = fileobj.readlines() # step through the buffer, printing each line for line in buff: print (line) I started with invent games with python (book 1 2) Now I want to write a multiplayergame, which connects to a website, where all players and gamedata will be stored/controlled. Players need to subscribe and to login via the game software. (executable, made from python script) Sending gamedata preferable in JSON, because of low traffic resources then. No idea about how authentication proces should be done I made many searches with Google, but got confused about my first steps. I am new to python, but code for many years in php/mysql. Spent most time in an online chessgame project. -- http://mail.python.org/mailman/listinfo/python-list
Re: error importing modules
On 22/04/2013 15:54, Rodrick Brown wrote: I'm using the fabric api (fabfile.org http://fabfile.org) I’m executing my fab script like the following: $ fab -H server set_nic_buffers -f set_nic_buffers.py Traceback (most recent call last): File /usr/lib/python2.7/site-packages/fabric/main.py, line 739, in main *args, **kwargs File /usr/lib/python2.7/site-packages/fabric/tasks.py, line 316, in execute multiprocessing File /usr/lib/python2.7/site-packages/fabric/tasks.py, line 213, in _execute return task.run(*args, **kwargs) File /usr/lib/python2.7/site-packages/fabric/tasks.py, line 123, in run return self.wrapped(*args, **kwargs) File /home/rbrown/repos/unix-tools/tools/fabfiles/nb.py, line 5, in set_nic_buffers f_exec = modules.Fabexec('set_nic_buffers', '/var/tmp/unix-tools/tools/set_nic_buffers.sh') TypeError: 'module' object is not callable My paths all seem to be fine not sure what’s going on $ python -c 'import modules.Fabexec; print (modules.Fabexec)' module 'modules.Fabexec' from 'modules/Fabexec.pyc' Fabfiles |-- modules | |-- Fabexec.py | |-- Fabexec.pyc | |-- __init__.py | `-- __init__.pyc |-- systune.py |-- systune.pyc `-- set_nic_buffers.py --- set_nic_buffers.py --- import modules from modules import Fabexec def set_nic_buffers(): f_exec = modules.Fabexec('set_nic_buffers', '/var/tmp/unix-tools/tools/set_nic_buffers.sh') f_exec.run() 'modules.Fabexec' is the module/script 'Fabexec'. What you want is the 'Fabexec' class within the 'Fabexec' module. 
--- Fabexec.py --- from fabric.api import run, cd, sudo, env from fabric.contrib import files from fabric.colors import green class Fabexec(object): repobase='/var/tmp/unix-tools' def __init__(self,script_name,install_script): self.script_name = script_name self.install_script = install_script def run(self): if files.exists(self.install_script): with cd(self.repobase): result = sudo(self.install_script + ' %s ' % env.host) if result.return_code != 0: print(red('Error occured executing %s' % self.install_script)) else: print(green('%s executed successfully')) else: print(red('Error no such dir %s try running repo deploy script to host %s' % (self.repobase, env.host))) raise SystemExit() -- http://mail.python.org/mailman/listinfo/python-list
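The module-versus-class point can be shown with a small self-contained sketch. It builds a stand-in module in memory; the name `fabexec_module`, the trimmed-down `Fabexec` class and the script path are made up for illustration and are not the poster's real code. In the original script the matching fix would presumably be `from modules.Fabexec import Fabexec`, or calling `modules.Fabexec.Fabexec(...)`.

```python
import types

# Hypothetical stand-in for modules/Fabexec.py: a module object holding
# a class that has the same name as the module.
fabexec_module = types.ModuleType("Fabexec")


class Fabexec(object):
    def __init__(self, script_name, install_script):
        self.script_name = script_name
        self.install_script = install_script


fabexec_module.Fabexec = Fabexec

# Calling the module itself reproduces the error from the traceback.
try:
    fabexec_module("set_nic_buffers", "/var/tmp/example.sh")
except TypeError as exc:
    error_message = str(exc)

# Calling the class inside the module is what was intended.
f_exec = fabexec_module.Fabexec("set_nic_buffers", "/var/tmp/example.sh")
print(error_message)
print(f_exec.script_name)
```

Giving a module and the class inside it the same name makes this mistake easy to make, which is one reason many projects use lowercase module names.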
Re: List Count
On 22/04/2013 17:06, Oscar Benjamin wrote: On 22 April 2013 16:50, Blind Anagram blindanag...@nowhere.org wrote: It would be very easy to subclass list and add this functionality in cython if you decide that you do need a builtin method. [snip] But I was really wondering if there was a simple solution that worked without people having to add libraries to their basic Python installations. There are simple solutions and some have already been listed. You are attempting to push your program to the limit of your hardware capabilities and it's natural that in a high-level language you'll often want special libraries for that. Hi Oscar Yes, but it is a tribute to Python that I can do this quite fast for huge lists provided that I only count on the full list. And, unless I have completely misunderstood Python internals, it would probably be just as fast on a sub-sequence if I had a list.count(value, limit) function (however, I admit that I could be wrong here since the fact that count on lists does not offer this may mean that it is not as easy to implement as it might seem). I don't know what your application is but I would say that my first port of call here would be to consider a different algorithmic approach. An obvious question would be about the sparsity of this data structure. How frequent are the values that you are trying to count? Would it make more sense to store a list of their indices? Actually it is no more than a simple prime sieve implemented as a Python class (and, yes, I realize that there are plenty of these around). If the problem needs to be solved the way that you are currently doing it and the available methods are not fast enough then you will need to consider additional libraries. As I have never tried building an extension with cython, I am inclined to try this as a learning exercise if nothing else. I definitely recommend this over writing a C extension directly. Thanks again - I will definitely look at this. 
Brian -- http://mail.python.org/mailman/listinfo/python-list
Re: How to connect to a website?
In 566767a8-35cc-47f2-9f75-032ce5629...@googlegroups.com webmas...@terradon.nl writes:

Hi, i just try to connect to a website, read that page and display the rules get from it. Then i get this error message:

  File "D:/Python/Py projects/socket test/sockettest.py", line 21, in <module>
    fileobj.write("GET "+filename+" HTTP/1.0\n\n")
io.UnsupportedOperation: not writable

I haven't worked with the socket library, but I think this error is because you specified a mode of 'r' when calling makefile(). fileobj is read-only, and you're trying to write to it.

If you just want to connect to a website, try using the urllib2 module instead of socket. It's higher-level and handles a lot of details for you. Here's an example:

import urllib2
request = urllib2.Request('http://www.voidspace.org.uk')
response = urllib2.urlopen(request)
content = response.readlines()

--
John Gordon                   A is for Amy, who fell down the stairs
gor...@panix.com              B is for Basil, assaulted by bears
                                -- Edward Gorey, The Gashlycrumb Tinies
-- http://mail.python.org/mailman/listinfo/python-list
Re: How to connect to a website?
On 22/04/2013 17:16, webmas...@terradon.nl wrote:

Hi, i just try to connect to a website, read that page and display the rules get from it. Then i get this error message:

  File "D:/Python/Py projects/socket test/sockettest.py", line 21, in <module>
    fileobj.write("GET "+filename+" HTTP/1.0\n\n")
io.UnsupportedOperation: not writable

My code:

# import sys for handling command line argument
# import socket for network communications
import sys, socket

# hard-wire the port number for safety's sake
# then take the names of the host and file from the command line
port = 80
host = 'www..nl'
filename = 'index.php'

# create a socket object called 'c'
c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# connect to the socket
c.connect((host, port))

# create a file-like object to read
fileobj = c.makefile('r', 1024)

You're creating a file-like object for reading...

# Ask the server for the file
fileobj.write("GET "+filename+" HTTP/1.0\n\n")

...and then trying to write to it.

# read the lines of the file object into a buffer, buff
buff = fileobj.readlines()

# step through the buffer, printing each line
for line in buff:
    print (line)

[snip]
-- http://mail.python.org/mailman/listinfo/python-list
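As the replies point out, the file object was opened read-only. A minimal sketch of one way round it, assuming Python 3: send the request with the socket's own `sendall()` and use `makefile()` only for reading the reply. The helper names `build_request` and `fetch` are placeholders of mine, not anything from the original post, and the added `Host:` header is my own addition because many servers expect it.

```python
import socket


def build_request(host, filename):
    """Return a minimal HTTP/1.0 GET request as bytes."""
    request = "GET /{0} HTTP/1.0\r\nHost: {1}\r\n\r\n".format(filename, host)
    return request.encode("ascii")


def fetch(host, filename, port=80):
    """Send the request over a plain socket and return the reply lines."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(build_request(host, filename))   # write via the socket
        with conn.makefile("rb") as fileobj:          # read-only file object
            return fileobj.readlines()


print(build_request("example.com", "index.php"))
```

Calling `fetch(host, 'index.php')` with a real host name should return the raw response lines, headers included, as bytes. For anything beyond experimenting, the higher-level urllib modules are probably the easier route.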
Python Developer Needed in Ottawa
Python programmer needed in Ottawa, Ontario, Canada. Must be eligible to work in Canada and preferably already in Ottawa with a security clearance in place. Phone Al at (613) 425-1634 -- http://mail.python.org/mailman/listinfo/python-list
Re: There must be a better way
On 22/04/2013 10:42 AM, Neil Cerutti wrote: On 2013-04-21, Colin J. Williams c...@ncf.ca wrote: On 20/04/2013 9:07 PM, Terry Jan Reedy wrote: On 4/20/2013 8:34 PM, Tim Chase wrote:

In 2.x, the csv.reader() class (and csv.DictReader() class) offered a .next() method that is absent in 3.x

In Py 3, .next was renamed to .__next__ for *all* iterators. The intention is that one iterate with for item in iterable or use builtin functions iter() and next().

Thanks to Chris, Tim and Terry for their helpful comments. I was seeking some code that would be acceptable to both Python 2.7 and 3.3. In the end, I used:

inData= csv.reader(inFile)
def main():
    if ver == '2':
        headerLine= inData.next()
    else:
        headerLine= inData.__next__()
    ...
    for item in inData:
        assert len(dataStore) == len(item)
        j= findCardinal(item[10])
        ...

This is acceptable to both versions. It is not usual to have a name with preceding and following underscores in user code. Presumably, there is a rationale for the change from csv.reader.next to csv.reader.__next__. If next is not acceptable for the version 3 csv.reader, perhaps __next__ could be added to the version 2 csv.reader, so that the same code can be used in the two versions. This would avoid the kluge I used above.

Would using csv.DictReader instead of csv.reader be an option?

Since I'm only interested in one or two columns, the simpler approach is probably better.

Colin W.
-- http://mail.python.org/mailman/listinfo/python-list
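The builtin next() that Terry mentions removes the need for the version check, since it exists in both Python 2.6+ and 3.x. A small sketch, written for Python 3 and using an in-memory file with made-up column names in place of the real input:

```python
import csv
import io

# Stand-in for the poster's input file; the column names are made up.
in_file = io.StringIO("name,value\nalpha,1\nbeta,2\n")

in_data = csv.reader(in_file)
header_line = next(in_data)      # builtin next(): no .next/.__next__ branch

# The remaining rows can be consumed with an ordinary for loop.
rows = [item for item in in_data]
print(header_line)
print(rows)
```

Only the `next(in_data)` call is the portable part; how the file is opened for csv still differs between 2.7 and 3.3.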
Re: List Count
But I was really wondering if there was a simple solution that worked without people having to add libraries to their basic Python installations. I think installing numpy is approximately pip install numpy assuming you have write access to your site-packages directory. If not, install using --prefix and set PYTHONPATH accordingly. I forgot that Python also has an array module. With numpy available, mature, and well-supported, I imagine it doesn't get much love these days though. Still, I gave it a whirl: ### import random import array from timeit import Timer import numpy stuff = [random.random() 0.5 for i in range(10**7)] sieve1 = numpy.array(stuff, dtype=bool) sieve2 = array.array('B', stuff) setup = from __main__ import sieve1, sieve2 from itertools import islice hi = 7*10**6 t1 = Timer((True == sieve1[:hi]).sum(), setup) t2 = Timer(sieve2[:hi].count(True), setup) # t3 = Timer(sum(islice(sieve, hi)), setup) # t4 = Timer(sum(x for x in islice(sieve, hi) if x), setup) # t5 = Timer(sum(x for x in islice(sieve, hi) if x is True), setup) # t6 = Timer(sum(1 for x in islice(sieve, hi) if x is True), setup) # t7 = Timer(len(list(filter(None, islice(sieve, hi, setup) print(min(t1.repeat(number=10))) print(min(t2.repeat(number=10))) # for t in (t1, t2, t3, t4, t5, t6, t7): # print( min(t.repeat(number=10)) ) ### Performance was not all that impressive: 0.340315103531 5.42102503777 Still, you might fiddle around with it a bit. Perhaps unsigned ints instead of unsigned bytes will provide more efficient counting... Skip -- http://mail.python.org/mailman/listinfo/python-list
Lists and arrays
Hello! I need your help! I have an array and I need pick some data from that array and put it in a list, for example: array= [a,b,c,1,2,3] list=array[0]+ array[3]+ array[4] list: [a,1,2] When I do it like this: list=array[0]+ array[3]+ array[4] I get an error: TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'numpy.ndarray' Can you help me? -- http://mail.python.org/mailman/listinfo/python-list
Re: Lists and arrays
On 04/22/2013 02:13 PM, Ana Dionísio wrote: Hello! I need your help! I have an array I think you mean you have a numpy array, which is very different than a python array.array and I need pick some data from that array and put it in a list, for example: array= [a,b,c,1,2,3] That's a list. list=array[0]+ array[3]+ array[4] Nothing wrong with that, other than that you just hid the name of the list type, making it tricky to later convert things to lists. list: [a,1,2] You'll never get that. When you assign an object to a list, the object itself is referenced in that list, not the name that it happened to have before. So if a was an object of type float and value 41.5, then you presumably want: mylist: [41.5, 1, 2] When I do it like this: list=array[0]+ array[3]+ array[4] I get an error: TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'numpy.ndarray' Apparently you did not use the line array= [a,b,c,1,2,3] as you said above, but some other assignment, perhaps using a numpy method or six. Worse, apparently the elements of that collection aren't simple numbers but some kind of numpy thingies as well. If you show what you actually did, probably someone here can help, though the more numpy you use, the less likely that it'll be me. If you really had a list, you wouldn't have gotten an error, but neither would you have gotten anything like you're asking. array[3] + array[4] == 1+2 == 3. If you're trying to make a list using + from a subscripted list, you'd have to enclose each integer in square brackets. mylist = [array[0]] + [array[3]] + [array[4]] Alternatively, you could just do mylist = [ array[0], array[3], array[4] ] -- DaveA -- http://mail.python.org/mailman/listinfo/python-list
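Both suggestions can be tried with a plain Python list. This is only a sketch: the values given to a, b and c are placeholders, since the question never says what they hold.

```python
# Placeholder values standing in for a, b and c from the question.
a, b, c = 41.5, 7.0, 8.5
array = [a, b, c, 1, 2, 3]

# Wrap each element in its own one-item list before using +.
mylist_concat = [array[0]] + [array[3]] + [array[4]]

# Or build the list directly from the wanted indices.
wanted = (0, 3, 4)
mylist = [array[i] for i in wanted]

print(mylist_concat)
print(mylist)
```

If the data really is a one-dimensional numpy array, indexing it with a list of positions, e.g. `array[[0, 3, 4]].tolist()`, should give the same kind of result, but that depends on what the array actually contains.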
Re: Encoding NaN in JSON
On Sat, 20 Apr 2013, Chris “Kwpolska” Warrick wrote: On Fri, Apr 19, 2013 at 9:42 PM, Grant Edwards invalid@invalid.invalid wrote: The OP asked for a string, and I thought you were proposing the string 'null'. If one is to use a string, then 'NaN' makes the most sense, since it can be converted back into a floating point NaN object. I infer that you were proposing a JSON null value and not the string 'null'? Not me, Wayne Werner proposed to use the JSON null value. I parsed the backticks (`) used by him as a way to delimit it from text and not as a string. That was, in fact, my intention. Though it seems to me that you'll have to suffer between some sort of ambiguity - in Chrome, at least, `Number(null)` evaluates to `0` instead of NaN. But `Number('Whatever')` evaluates to NaN. However, a JSON parser obviously wouldn't be able to make the semantic distinction, so I think you'll be left with whichever API makes the most sense to you: NaN maps to null or NaN maps to NaN (or any other string, really) Obviously you're not limited to these particular choices, but they're probably the easiest to implement and communicate. HTH, -W-- http://mail.python.org/mailman/listinfo/python-list
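For the "NaN maps to null" choice, here is a rough sketch using the standard json module. The helper name `nan_to_none` and the sample data are made up for illustration; by default `json.dumps` writes a bare `NaN` token, which is not valid JSON, so the sketch converts NaN to None before encoding.

```python
import json
import math


def nan_to_none(value):
    """Recursively replace float NaN with None so it encodes as JSON null."""
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, dict):
        return {key: nan_to_none(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [nan_to_none(item) for item in value]
    return value


data = {"reading": float("nan"), "samples": [1.0, float("nan")]}

# Default behaviour: bare NaN tokens in the output.
default_text = json.dumps(data, sort_keys=True)

# With NaN replaced first, allow_nan=False acts as a safety net.
null_text = json.dumps(nan_to_none(data), sort_keys=True, allow_nan=False)
print(default_text)
print(null_text)
```

The receiving side then has to decide what null means for that field, which is the ambiguity discussed above.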
Re: List Count
On 22/04/2013 18:48, Skip Montanaro wrote: But I was really wondering if there was a simple solution that worked without people having to add libraries to their basic Python installations. I think installing numpy is approximately pip install numpy assuming you have write access to your site-packages directory. If not, install using --prefix and set PYTHONPATH accordingly. I forgot that Python also has an array module. With numpy available, mature, and well-supported, I imagine it doesn't get much love these days though. Still, I gave it a whirl: ### import random import array from timeit import Timer import numpy stuff = [random.random() 0.5 for i in range(10**7)] sieve1 = numpy.array(stuff, dtype=bool) sieve2 = array.array('B', stuff) setup = from __main__ import sieve1, sieve2 from itertools import islice hi = 7*10**6 t1 = Timer((True == sieve1[:hi]).sum(), setup) t2 = Timer(sieve2[:hi].count(True), setup) # t3 = Timer(sum(islice(sieve, hi)), setup) # t4 = Timer(sum(x for x in islice(sieve, hi) if x), setup) # t5 = Timer(sum(x for x in islice(sieve, hi) if x is True), setup) # t6 = Timer(sum(1 for x in islice(sieve, hi) if x is True), setup) # t7 = Timer(len(list(filter(None, islice(sieve, hi, setup) print(min(t1.repeat(number=10))) print(min(t2.repeat(number=10))) # for t in (t1, t2, t3, t4, t5, t6, t7): # print( min(t.repeat(number=10)) ) ### Performance was not all that impressive: 0.340315103531 5.42102503777 Still, you might fiddle around with it a bit. Perhaps unsigned ints instead of unsigned bytes will provide more efficient counting... I spent a lot of time comparing python arrays and lists but found that lists were always much faster in this application. I do have numpy installed but I remember that when I did this (some time ago) it was far from easy with Python 3.x running natively on Windows x64. Brian -- http://mail.python.org/mailman/listinfo/python-list
Re: How to connect to a website?
thanks! solved with:

import urllib.request
import urllib.parse

user = 'user'
pw = 'password'
login_url = 'http://www.riskopoly.nl/test/index.php'
data = urllib.parse.urlencode({'user': user, 'pw': pw})
data = data.encode('utf-8')
# adding charset parameter to the Content-Type header.
request = urllib.request.Request(login_url)
request.add_header("Content-Type", "application/x-www-form-urlencoded;charset=utf-8")
f = urllib.request.urlopen(request, data)
print(f.read().decode('utf-8'))

And then i get next answer:

<pre>Array ( [pw] => password [user] => user ) </pre>

Solved and thanks again:)
-- http://mail.python.org/mailman/listinfo/python-list
How to get JSON values and how to trace sessions??
Hi all, from python I post data to a webpage using urllib and can print that content. See code below.

But now i am wondering how to trace sessions? It is needed for a multiplayer game, connected to a webserver. How do i trace a PHP-session? I suppose i have to save a cookie with the sessionID from the webserver? Is this possible with Python? Are there other ways to keep control over which player sends the gamedata?

Secondly, can i handle JSON values? I know how to create them serverside, but how do i handle that response in python?

Thank you very much for any answer!

Code:

import urllib.request
import urllib.parse

user = 'user'
pw = 'password'
login_url = 'http://www..nl/test/index.php'
data = urllib.parse.urlencode({'user': user, 'pw': pw})
data = data.encode('utf-8')
# adding charset parameter to the Content-Type header.
request = urllib.request.Request(login_url)
request.add_header("Content-Type", "application/x-www-form-urlencoded;charset=utf-8")
f = urllib.request.urlopen(request, data)
print(f.read().decode('utf-8'))
-- http://mail.python.org/mailman/listinfo/python-list
Re: List Count
On 22 April 2013 17:38, Blind Anagram blindanag...@nowhere.org wrote: On 22/04/2013 17:06, Oscar Benjamin wrote: I don't know what your application is but I would say that my first port of call here would be to consider a different algorithmic approach. An obvious question would be about the sparsity of this data structure. How frequent are the values that you are trying to count? Would it make more sense to store a list of their indices? Actually it is no more than a simple prime sieve implemented as a Python class (and, yes, I realize that there are plenty of these around). If I understand correctly, you have a list of roughly a billion True/False values indicating which integers are prime and which are not. You would like to discover how many prime numbers there are between two numbers a and b. You currently do this by counting the number of True values in your list between the indices a and b. If my description is correct then I would definitely consider using a different algorithmic approach. The density of primes from 1 to 1 billlion is about 5%. Storing the prime numbers themselves in a sorted list would save memory and allow a potentially more efficient way of counting the number of primes within some interval. To see how it saves memory (on a 64 bit system): $ python Python 2.7.3 (default, Sep 26 2012, 21:51:14) [GCC 4.7.2] on linux2 Type help, copyright, credits or license for more information. import sys a = ([True] + [False]*19) * 5 len(a) 100 sys.getsizeof(a) 872 a = list(range(5)) sys.getsizeof(a) 450120 sum(sys.getsizeof(x) for x in a) 120 So you're using about 1/5th of the memory with a list of primes compared to a list of True/False values. Further savings would be possible if you used an array to store the primes as 64 bit integers. In this case it would take about 400MB to store all the primes up to 1 billion. The more efficient way of counting the primes would then be to use the bisect module. 
This gives you a way of counting the primes between a and b with a cost that is logarithmic in the total number of primes stored rather than linear in the size of the range (e.g. b-a). For large enough primes/ranges this is certain to be faster. Whether it actually works that way for your numbers I can't say. Oscar -- http://mail.python.org/mailman/listinfo/python-list
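A minimal sketch of the bisect idea, using a small sieve to build the sorted list of primes. The function names `primes_up_to` and `count_primes_between` are made up for this example, and the limit of 100 is only there to keep it quick to check.

```python
from bisect import bisect_left, bisect_right


def primes_up_to(n):
    """Return a sorted list of the primes <= n (n >= 2), via a simple sieve."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Clear every multiple of i starting at i*i.
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def count_primes_between(primes, a, b):
    """Count primes p with a <= p <= b using two binary searches."""
    return bisect_right(primes, b) - bisect_left(primes, a)


primes = primes_up_to(100)
print(count_primes_between(primes, 10, 50))
```

Each count costs two binary searches over the stored primes, so it does not depend on how wide the interval is. Whether that beats a plain count on a small sieve would need measuring.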
Re: kbhit/getch python equivalent
I'm looking for a kbhit/getch equivalent in python in order to be able to stop my inner loop in a controlled way (communication with external hardware is involved and breaking it abruptly may cause unwanted errors)

A curses example:

import curses

stdscr = curses.initscr()
curses.cbreak()
stdscr.keypad(1)
stdscr.addstr(0, 10, "Hit 'q' to quit")
stdscr.refresh()

key = ''
while key != ord('q'):
    key = stdscr.getch()
    stdscr.addch(20, 25, key)
    stdscr.refresh()

curses.endwin()
-- http://mail.python.org/mailman/listinfo/python-list
Re: Confusing Algorithm
Am 22.04.13 16:57, schrieb Oscar Benjamin: On 22 April 2013 13:56, Chris Angelico ros...@gmail.com wrote: On Mon, Apr 22, 2013 at 10:39 PM, RBotha r...@ymond.co.za wrote: Threads are straight lines and cannot intersect towers. Your task is to write a program that finds the minimal number of threads to cover all the towers. -Example: List of towers: 1 5 3 7 2 5 2 Output: 4 I read it differently. I thought the threads would go 1-5-7-5-2. I'd agree with your interpretation. Threads are straight lines and cannot intersect towers - I read it such that the answer is the convex hull of the set of points given by the tower height. The convex hull can be computed for this 1D problem by initializing with line segments between every point and repeatedly pulling up every non-convex piece, if I'm not mistaken. Christian -- http://mail.python.org/mailman/listinfo/python-list
Re: How to get JSON values and how to trace sessions??
On Tue, Apr 23, 2013 at 6:09 AM, webmas...@terradon.nl wrote: But now i am wondering how to trace sessions? it is needed for a multiplayer game, connected to a webserver. How do i trace a PHP-session? I suppose i have to save a cookie with the sessionID from the webserver? Is this possible with Python? Are their other ways to keep control over which players sends the gamedata? Secondly, can i handle JSON values? I know how to create them serverside, but how do i handle that response in python? Python has a JSON module that should do what you want: http://docs.python.org/3.3/library/json.html I don't know the details of cookie handling in Python, but this looks to be what you want: http://docs.python.org/3.3/library/http.cookiejar.html#http.cookiejar.CookieJar Tip: The Python docs can be searched very efficiently with a web search (eg Google, Bing, DuckDuckGo, etc). Just type python and whatever it is you want - chances are you'll get straight there. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
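A rough sketch of how the two modules mentioned above might fit together, assuming Python 3. The helper name `post_json` and the sample reply text are invented for illustration; the idea is that an opener built with a cookie jar stores whatever session cookie the server sets (for PHP that is normally PHPSESSID) and sends it back on later requests.

```python
import json
import http.cookiejar
import urllib.parse
import urllib.request

# A cookie jar plus an opener that stores and resends cookies, so a
# session cookie set at login would accompany later requests.
cookie_jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(cookie_jar))


def post_json(url, fields):
    """POST form fields with the shared opener and decode a JSON reply."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    with opener.open(url, data) as response:
        return json.loads(response.read().decode("utf-8"))


# Decoding a JSON reply; this sample text is made up for illustration.
sample_reply = '{"player": "user", "score": 12, "moves": ["e4", "e5"]}'
game_data = json.loads(sample_reply)
print(game_data["moves"])
```

Every call made through the same `opener` shares the jar, so the server should see one session per running game client. This assumes the server replies with JSON rather than the print_r style output.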
Re: List Count
On 22 April 2013 21:18, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 22 April 2013 17:38, Blind Anagram blindanag...@nowhere.org wrote: On 22/04/2013 17:06, Oscar Benjamin wrote: I don't know what your application is but I would say that my first port of call here would be to consider a different algorithmic approach. An obvious question would be about the sparsity of this data structure. How frequent are the values that you are trying to count? Would it make more sense to store a list of their indices? Actually it is no more than a simple prime sieve implemented as a Python class (and, yes, I realize that there are plenty of these around). If I understand correctly, you have a list of roughly a billion True/False values indicating which integers are prime and which are not. You would like to discover how many prime numbers there are between two numbers a and b. You currently do this by counting the number of True values in your list between the indices a and b. If my description is correct then I would definitely consider using a different algorithmic approach. The density of primes from 1 to 1 billlion is about 5%. Storing the prime numbers themselves in a sorted list would save memory and allow a potentially more efficient way of counting the number of primes within some interval. In fact it is probably quicker if you don't mind using all that memory to just store the cumulative sum of your prime True/False indicator list. This would be the prime counting function pi(n). You can then count the primes between a and b in constant time with pi[b] - pi[a]. Oscar -- http://mail.python.org/mailman/listinfo/python-list
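The cumulative-sum idea can be sketched on a tiny stand-in sieve; the hand-written flags for 0..20 and the function name `count_primes` are only for illustration. Note that pi[b] - pi[a] counts the primes p with a < p <= b, so a itself is excluded.

```python
from itertools import accumulate

# Tiny stand-in sieve: sieve[n] is True when n is prime (n = 0..20).
sieve = [False, False, True, True, False, True, False, True, False, False,
         False, True, False, True, False, False, False, True, False, True,
         False]

# pi[n] = number of primes <= n, built once in a single pass.
pi = list(accumulate(int(flag) for flag in sieve))


def count_primes(a, b):
    """Count primes p with a < p <= b in constant time."""
    return pi[b] - pi[a]


print(count_primes(0, 20))
print(count_primes(10, 20))
```

The trade-off is memory: a list of Python ints this long costs more than the list of bools, which may matter at the billion-element end of the range.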
Re: Ubuntu package python3 does not include tkinter
lcrocker wrote: I'm a programmer, I installed Tkinter, and use it. I'd like to deploy programs written with it to others. **Those** people know nothing about it, and **shouldn't have to**. They don't need to. The only person that needs to know what he is doing is you. You want to distribute a software package? Package it. Learn the very basics and set python-tkinter as a dependency. http://wiki.debian.org/Packaging Rui Maciel -- http://mail.python.org/mailman/listinfo/python-list
Re: Ubuntu package python3 does not include tkinter
Steven D'Aprano wrote: I think that if you are worrying about the overhead of the tkinter bindings for Python, you're guilty of premature optimization. I'm not worried about that. No one should be forced to install crap that they don't use or will ever need, no matter how great the average HD capacity is nowadays. Rui Maciel -- http://mail.python.org/mailman/listinfo/python-list
Re: Lists and arrays
Ana Dionísio anadionisio...@gmail.com wrote in message news:de1cc79e-cbf7-4b0b-ae8e-18841a1ef...@googlegroups.com... Hello! I need your help! I have an array and I need pick some data from that array and put it in a list, for example: array= [a,b,c,1,2,3] list=array[0]+ array[3]+ array[4] list: [a,1,2] When I do it like this: list=array[0]+ array[3]+ array[4] I get an error: TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'numpy.ndarray' You're calculating a+1+2. Probably a isn't something that can be added to 1+2. -- Bartc -- http://mail.python.org/mailman/listinfo/python-list
Re: List Count
On 22/04/2013 21:18, Oscar Benjamin wrote: On 22 April 2013 17:38, Blind Anagram blindanag...@nowhere.org wrote: [snip]

If my description is correct then I would definitely consider using a different algorithmic approach. The density of primes from 1 to 1 billion is about 5%. Storing the prime numbers themselves in a sorted list would save memory and allow a potentially more efficient way of counting the number of primes within some interval.

That is correct but I need to say that the lengths I have been describing are limiting cases - almost all of the time the sieve length will be quite small. But I was still interested to see if I could push the limit without changing the essential simplicity of the sieve. And here the cost of creating the slice (which I have measured) set me wondering why a list.count(value, limit) function did not exist. I also wondered whether I had missed any obvious way of avoiding the slicing cost (intellectually it seemed wrong to me to have to copy the list in order to count items within it).

[snip]

So you're using about 1/5th of the memory with a list of primes compared to a list of True/False values. Further savings would be possible if you used an array to store the primes as 64 bit integers. In this case it would take about 400MB to store all the primes up to 1 billion.

I have looked at solutions based on listing primes and here I have found that they are very much slower than my existing solution when the sieve is not large (which is the majority use case). I have also tried counting using a loop such as:

while i < limit:
    i = sieve.index(1, i) + 1
    cnt += 1

but this is slower than count even on huge lists.

Thank you again for your advice.

Brian
-- http://mail.python.org/mailman/listinfo/python-list
Re: List Count
On 22/04/2013 22:03, Oscar Benjamin wrote: On 22 April 2013 21:18, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 22 April 2013 17:38, Blind Anagram blindanag...@nowhere.org wrote: On 22/04/2013 17:06, Oscar Benjamin wrote: I don't know what your application is but I would say that my first port of call here would be to consider a different algorithmic approach. An obvious question would be about the sparsity of this data structure. How frequent are the values that you are trying to count? Would it make more sense to store a list of their indices? Actually it is no more than a simple prime sieve implemented as a Python class (and, yes, I realize that there are plenty of these around). If I understand correctly, you have a list of roughly a billion True/False values indicating which integers are prime and which are not. You would like to discover how many prime numbers there are between two numbers a and b. You currently do this by counting the number of True values in your list between the indices a and b. If my description is correct then I would definitely consider using a different algorithmic approach. The density of primes from 1 to 1 billlion is about 5%. Storing the prime numbers themselves in a sorted list would save memory and allow a potentially more efficient way of counting the number of primes within some interval. In fact it is probably quicker if you don't mind using all that memory to just store the cumulative sum of your prime True/False indicator list. This would be the prime counting function pi(n). You can then count the primes between a and b in constant time with pi[b] - pi[a]. I did wonder whether, after creating the sieve, I should simply go through the list and replace the True values with a count. This would certainly speed up the prime count function, which is where the issue arises. I will try this and see what sort of performance trade-offs this involves. Brian -- http://mail.python.org/mailman/listinfo/python-list
Re: Confusing Algorithm
On 22/04/13 13:39, RBotha wrote: I'm facing the following problem: In a city of towerblocks, Spiderman can “cover” all the towers by connecting the first tower with a spider-thread to the top of a later tower and then to a next tower and then to yet another tower until he reaches the end of the city. Threads are straight lines and cannot intersect towers. Your task is to write a program that finds the minimal number of threads to cover all the towers. The list of towers is given as a list of single digits indicating their height. -Example: List of towers: 1 5 3 7 2 5 2 Output: 4 I'm not sure how a 'towerblock' could be defined. How square does a shape have to be to qualify as a towerblock? Any help on solving this problem? It's not the algorithm that's confusing, it's the problem. First clarify the problem. This appears to be a variation of the travelling-salesman problem. Except the position of the towers is not defined, only their height. So either the necessary information is missing or whoever set the problem intended something else. -- http://mail.python.org/mailman/listinfo/python-list
Re: Confusing Algorithm
On Mon, Apr 22, 2013 at 2:33 PM, Christian Gollwitzer aurio...@gmx.de wrote: I'd agree with your interpretation. Threads are straight lines and cannot intersect towers - I read it such that the answer is the convex hull of the set of points given by the tower height. The convex hull can be computed for this 1D problem by initializing with line segments between every point and repeatedly pulling up every non-convex piece, if I'm not mistaken. I agree that seems the likely intention. One also must assume that the towers are evenly spaced and have point width, neither of which is stated in the problem. -- http://mail.python.org/mailman/listinfo/python-list
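Under the convex-hull reading, the answer would be the number of segments on the upper hull of the tower tops. The sketch below is one way to compute that, assuming (as noted above, and not stated in the problem) that towers are evenly spaced and have point width, and additionally assuming that a thread may graze the top of a tower it passes over. The function name count_threads is invented for the example.

```python
def count_threads(heights):
    """Count the segments on the upper convex hull of the tower tops."""
    hull = []  # tower tops currently on the hull, left to right
    for x, h in enumerate(heights):
        # Drop the previous top while it lies on or below the straight
        # line from the top before it to the new top (cross product >= 0).
        while len(hull) >= 2:
            (ax, ay), (bx, by) = hull[-2], hull[-1]
            if (bx - ax) * (h - ay) - (by - ay) * (x - ax) >= 0:
                hull.pop()
            else:
                break
        hull.append((x, h))
    return max(len(hull) - 1, 0)

print(count_threads([1, 5, 3, 7, 2, 5, 2]))  # 4
```

For the example in the problem statement the hull runs over the towers of height 1, 5, 7, 5 and 2, which gives the expected 4 threads.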
Re: List Count
On Mon, 22 Apr 2013 15:15:19 +0100, Blind Anagram wrote: But when using a sub-sequence, I do suffer a significant reduction in speed for a count when compared with count on the full list. When the list is small enough not to cause memory allocation issues this is about 30% on 100,000,000 items. But when the list is 1,000,000,000 items, OS memory allocation becomes an issue and the cost on my system rises to over 600%. Buy more memory :-) I agree that this is not a big issue but it seems to me a high price to pay for the lack of a sieve.count(value, limit), which I feel is a useful function (given that memoryview operations are not available for lists). There is no need to complicate the count method for such a specialised use-case. A more general solution would be to provide list views. Another solution might be to use arrays rather than lists. Since your sieve list is homogeneous, you could possibly use an array of 1 or 0 bytes rather than a list of True or False bools. That would reduce the memory overhead by a factor of four, and similarly reduce the overhead of any copying:

    py> from array import array
    py> from sys import getsizeof
    py> L = [True, False, False, True]*1000
    py> A = array('b', L)
    py> getsizeof(L)
    16032
    py> getsizeof(A)
    4032

-- Steven -- http://mail.python.org/mailman/listinfo/python-list
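A related option, not proposed in the thread, is a bytearray of 0 and 1 bytes. Assuming Python 3.3 or later, bytearray.count accepts an integer value plus optional start and end positions, so a sub-range can be counted without slicing first. The helper name count_in_range is made up for this sketch.

```python
def count_in_range(sieve, start, stop):
    """Count the 1 bytes in sieve[start:stop] without making a slice."""
    return sieve.count(1, start, stop)

# Toy data: indices 2, 3, 5 and 7 are flagged
sieve = bytearray(b'\x00\x00\x01\x01\x00\x01\x00\x01')
print(count_in_range(sieve, 2, 6))  # 3
```

Like the array approach, this stores one byte per entry, so whether it helps depends on how the rest of the sieve class uses the list.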
Re: List Count
On 22 April 2013 22:25, Blind Anagram blindanag...@nowhere.org wrote: On 22/04/2013 21:18, Oscar Benjamin wrote: On 22 April 2013 17:38, Blind Anagram blindanag...@nowhere.org wrote: I also wondered whether I had missed any obvious way of avoiding the slicing cost (intellectually it seemed wrong to me to have to copy the list in order to count items within it). [snip] I have looked at solutions based on listing primes and here I have found that they are very much slower than my existing solution when the sieve is not large (which is the majority use case). What matters is not so much the size of the sieve but the size of the interval you want to query. You say that slicing cost is somehow significant which suggests to me that it's not a small interval. An approach using a sorted list of primes and bisect would have a cost that is independent of the size of the interval (and depends only logarithmically on the size of the sieve). Oscar -- http://mail.python.org/mailman/listinfo/python-list
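The sorted-list-plus-bisect approach might look like the sketch below. It assumes the primes are already available as a sorted list; count_primes_between is a hypothetical name, and the short list of primes is only sample data.

```python
from bisect import bisect_left, bisect_right

def count_primes_between(primes, a, b):
    """Count primes p with a <= p <= b, given a sorted list of primes."""
    return bisect_right(primes, b) - bisect_left(primes, a)

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(count_primes_between(primes, 10, 20))  # 4
```

Each query does two binary searches, so its cost grows with the logarithm of the number of stored primes and does not depend on how far apart a and b are.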
Re: Weird behaviour?
On Tuesday, April 23, 2013 12:29:57 AM UTC+10, nn wrote: Maybe it is related to this bug? http://bugs.python.org/issue11272 I'm running Python 2.7.2 (on Windows) and that version doesn't appear to have that bug:

    Python 2.7.2 (default, Apr 23 2013, 11:49:52) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> print(repr(input()))
    "testing"
    'testing'

Cheers Jussi -- http://mail.python.org/mailman/listinfo/python-list
Re: List Count
On Mon, 22 Apr 2013 22:25:50 +0100, Blind Anagram wrote: I have looked at solutions based on listing primes and here I have found that they are very much slower than my existing solution when the sieve is not large (which is the majority use case). Yes. This is hardly surprising. Algorithms suitable for dealing with the first million primes are not suitable for dealing with the first trillion primes, and vice versa. We like to pretend that computer programming is an abstraction, and for small enough data we often can get away with that, but like all abstractions eventually it breaks and the cost of dealing with real hardware becomes significant. But I must ask, given that the primes are so widely distributed, why are you storing them in a list instead of a sparse array (i.e. a dict)? There are 50,847,534 primes less than or equal to 1,000,000,000, so you are storing roughly 18 False values for every True value. That ratio will only get bigger. With a billion entries, you are using 18 times more memory than necessary. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Ubuntu package python3 does not include tkinter
On Mon, 22 Apr 2013 22:09:14 +0100, Rui Maciel wrote: Steven D'Aprano wrote: I think that if you are worrying about the overhead of the tkinter bindings for Python, you're guilty of premature optimization. I'm not worried about that. No one should be forced to install crap that they don't use or will ever need, no matter how great the average HD capacity is nowadays. Nobody forces you to do anything. Python is open source, and the source code is freely available. Feel free to hand-optimize your Python installation, selecting carefully each and every module, class, and function in the standard library so that only the ones you absolutely know you will need to use are installed, using your godlike powers of precognition to foresee exactly what you need in seventeen months from now and what is crap that you will never need. Good luck with that. I look forward to hearing about the results. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Weird behaviour?
On Tue, Apr 23, 2013 at 9:06 AM, jus...@zeusedit.com wrote: On Tuesday, April 23, 2013 12:29:57 AM UTC+10, nn wrote: Maybe it is related to this bug? http://bugs.python.org/issue11272 I'm running Python 2.7.2 (on Windows) and that version doesn't appear to have that bug:

    Python 2.7.2 (default, Apr 23 2013, 11:49:52) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> print(repr(input()))
    "testing"
    'testing'

Careful there; go with raw_input() on Py2. And then it does happen. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: Ubuntu package python3 does not include tkinter
On Mon, 22 Apr 2013 14:52:39 +0200, Antoon Pardon wrote: Op 22-04-13 11:18, Steven D'Aprano schreef: On Mon, 22 Apr 2013 03:08:24 -0500, Andrew Berg wrote: Much of the stdlib doesn't rely on anything but the core interpreter. tkinter by itself is not the issue. As you said, the bindings are tiny. However, in order to be usable, it requires quite a few things - most notably X. On desktop Linux, this is already installed, but on server systems, it generally is not (or at least shouldn't be in most cases). Going back to my example of a web server using a Python-based framework, I'll repeat that there is no reason such a system should have X even installed in order to serve web pages. Even on a lean, mean server machine, CPython requires only a few extra libraries. Add tkinter, and suddenly you have to install a LOT of things. If you plan to actually use tkinter, this is fine. If not, you've just added a lot of stuff that you don't need. This adds unnecessary overhead in several places (like your package system's database). I can't disagree with any of this, except to say that none of this justifies having a separate package for Tkinter. Naturally if you don't have X, Tcl won't work, and if Tcl won't work, Tkinter won't work and should give an import error. But that doesn't imply that X must be a dependency for Python. It's a dependency for having Tkinter *work*, but not for *installing* Tkinter as part of the standard library. Hell, even if you have X installed, and Tcl, and the Tkinter packages, importing tkinter can still fail, if Python wasn't built with the right magic incantations for it to recognise that Tcl is installed. Then don't use a package system. The job of a package system is, that if you install something, it install all dependencies that are needed to make it work. No, the job of the package system is to manage dependencies. It makes no guarantee about whether or not something will work. 
    $ sudo apt-get install rule_world
    $ rule_world --start-from Australia
    Error: cannot connect to US nuclear arsenal from here, you cannot rule the world

A joke example, of course, but a serious point. Successful installation doesn't necessarily mean the program will run successfully, or work in any meaningful way. We're also glossing over what it means to be a dependency. This is not obvious, and in fact I would argue that X is NOT a dependency for tkinter, even though tkinter will not work without it, for some definition of "work". I can quite happily import tkinter on a remote machine over ssh:

    py> from tkinter.messagebox import showinfo

or do the same thing on a local machine from a non-X terminal. I haven't tried it, but quite possibly even on a headless machine without X installed at all. And why not? Tkinter is a big module, there are all sorts of things that I might want to access that don't actually require an X display. If nothing else, I can do this:

    py> help(showinfo)

and read the docs. Tkinter does not actually require X to work. It merely requires X in order to *display an X window*. It's only when I actually try to do something that requires an X display that it will fail. I won't show the entire traceback, because it is long and not particularly enlightening, but the final error message explains exactly why it isn't working:

    _tkinter.TclError: no display name and no $DISPLAY environment variable

Your solution doesn't make sense in view of your earlier response where you argue tkinter should be installed because it is part of the standard combined with the advantage of having a standard library. But IMO a part of that standard library not working, is just as harmful as part of that standard library not being installed. From a user/programmer's point of view the result is the same. It is unusable. Not at all. As I said earlier, I would expect that trying to import tkinter on such a system should give a meaningful error message. 
Actually, it need not even fail at import time. As I show above, I can happily import tkinter without an X display. I haven't tried it, but I expect that I can probably import tkinter without Tcl either. Let me put this another way: It should not matter whether I install Tcl before Python, or after Python, the end result should be that once both are installed, tkinter will be usable (provided you have an X display). To put it in Ubuntu terms, if I do this: apt-get tcl apt-get python or this: apt-get python apt-get tcl on a machine with X, tkinter should Just Work. And if I don't install tcl, tkinter should still import, it just won't be able to, you know, interface to tcl. What we're arguing here is merely the design of the dependency graph, and that's a matter of taste. My design would be different from that of the Ubuntu folks. That's fine. If we all agreed about everything, we'd have nothing to argue about *wink* But I think we can all agree that something like this is pretty crappy:
Re: Weird behaviour?
On Mon, 22 Apr 2013 07:29:57 -0700, nn wrote: On Apr 21, 9:19 pm, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: On Mon, 22 Apr 2013 10:56:11 +1000, Chris Angelico wrote: You're running this under Windows. The convention on Windows is for end-of-line to be signalled with \r\n, but the convention inside Python is to use just \n. With the normal use of buffered and parsed input, this is all handled for you; with unbuffered input, that translation also seems to be disabled, so your string actually contains '120\r', as will be revealed by its repr(). If that's actually the case, then I would call that a bug in raw_input. Actually, raw_input doesn't seem to cope well with embedded newlines even without the -u option. On Linux, I can embed a control character by typing Ctrl-V followed by Ctrl-char. E.g. Ctrl-V Ctrl-M to embed a carriage return, Ctrl-V Ctrl-J to embed a newline. So watch:

    [steve@ando ~]$ python2.7 -c "x = raw_input('Hello? '); print repr(x)"
    Hello? 120^M^Jabc
    '120\r'

Everything after the newline is lost. -- Steven Maybe it is related to this bug? http://bugs.python.org/issue11272 I doubt it, I'm not using Windows and that bug is specific to Windows. Here's the behaviour on Python 3.3:

    py> result = input("Type something with control chars: ")
    Type something with control chars: something ^T^M else and a second line
    py> print(repr(result))
    'something \x14\r else \nand a second line'

Much better! -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Ubuntu package python3 does not include tkinter
On Tue, Apr 23, 2013 at 10:22 AM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: It's only when I actually try to do something that requires an X display that it will fail. I won't show the entire traceback, because it is long and not particularly enlightening, but the final error message explains exactly why it isn't working: _tkinter.TclError: no display name and no $DISPLAY environment variable You presumably have a system to test this on. Can you try using ssh -X to get to it, and then retry that action? It looks like you actually have everything you need, just no display... which is exactly what you'd get if you ssh to something that has a real GUI. Not a dependency problem. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: Ubuntu package python3 does not include tkinter
On 2013.04.22 19:22, Steven D'Aprano wrote: It's only when I actually try to do something that requires an X display that it will fail. I won't show the entire traceback, because it is long and not particularly enlightening, but the final error message explains exactly why it isn't working: _tkinter.TclError: no display name and no $DISPLAY environment variable So you want to go from "this won't work because it's not installed" to "this won't work, and there could be a hundred different reasons why"? tkinter's main function is to display something on a display. To say that displaying something is an optional feature is absurd. "You can install this, but your package manager won't pull in any dependencies because a few minor things will work without them. If you want it to actually do what it was made for, you need to install them yourself." Much bigger problem than the OP's, no? -- CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1 -- http://mail.python.org/mailman/listinfo/python-list
optomizations
I would like some feedback on possible solutions to make this script run faster. The system is pegged at 100% CPU and it takes a long time to complete.

    #!/usr/bin/env python
    import gzip
    import re
    import os
    import sys
    from datetime import datetime
    import argparse

    if __name__ == '__main__':
        parser = argparse.ArgumentParser()
        parser.add_argument('-f', dest='inputfile', type=str, help='data file to parse')
        parser.add_argument('-o', dest='outputdir', type=str, default=os.getcwd(), help='Output directory')
        args = parser.parse_args()

        if len(sys.argv[1:]) < 1:
            parser.print_usage()
            sys.exit(-1)

        print(args)
        if args.inputfile and os.path.exists(args.inputfile):
            try:
                with gzip.open(args.inputfile) as datafile:
                    for line in datafile:
                        line = line.replace('mediacdn.xxx.com', 'media.xxx.com')
                        line = line.replace('staticcdn.xxx.co.uk', ' static.xxx.co.uk')
                        line = line.replace('cdn.xxx', 'www.xxx')
                        line = line.replace('cdn.xxx', 'www.xxx')
                        line = line.replace('cdn.xx', 'www.xx')
                        siteurl = line.split()[6].split('/')[2]
                        line = re.sub(r'\bhttps?://%s\b' % siteurl, '', line, 1)

                        (day, month, year, hour, minute, second) = (line.split()[3]).replace('[','').replace(':','/').split('/')
                        datelog = '{} {} {}'.format(month, day, year)
                        dateobj = datetime.strptime(datelog, '%b %d %Y')

                        outfile = '{}{}{}_combined.log'.format(dateobj.year, dateobj.month, dateobj.day)
                        outdir = (args.outputdir + os.sep + siteurl)

                        if not os.path.exists(outdir):
                            os.makedirs(outdir)

                        with open(outdir + os.sep + outfile, 'w+') as outf:
                            outf.write(line)

            except IOError, err:
                sys.stderr.write("Error unable to read or extract inputfile: {} {}\n".format(args.inputfile, err))
                sys.exit(-1)

-- http://mail.python.org/mailman/listinfo/python-list
Re: optomizations
On Tue, Apr 23, 2013 at 11:19 AM, Rodrick Brown rodrick.br...@gmail.com wrote:

    with gzip.open(args.inputfile) as datafile:
        for line in datafile:
            outfile = '{}{}{}_combined.log'.format(dateobj.year, dateobj.month, dateobj.day)
            outdir = (args.outputdir + os.sep + siteurl)
            with open(outdir + os.sep + outfile, 'w+') as outf:
                outf.write(line)

You're opening files and closing them again for every line. This wouldn't cause you to spin the CPU (more likely it'd thrash the hard disk - unless you have an SSD), but it is certainly an optimization target. Can you know in advance what files you need? If not, I'd try something like this:

    outf = {} # Might want a better name though
    ...
        outfile = ...
        if outfile not in outf:
            os.makedirs(...)
            outf[outfile] = open(...)
        outf[outfile].write(line)

    for f in outf.values():
        f.close()

Open files only as needed, close 'em all at the end. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: List Count
On 04/22/2013 05:32 PM, Blind Anagram wrote: On 22/04/2013 22:03, Oscar Benjamin wrote: On 22 April 2013 21:18, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 22 April 2013 17:38, Blind Anagram blindanag...@nowhere.org wrote: On 22/04/2013 17:06, Oscar Benjamin wrote: I don't know what your application is but I would say that my first port of call here would be to consider a different algorithmic approach. An obvious question would be about the sparsity of this data structure. How frequent are the values that you are trying to count? Would it make more sense to store a list of their indices? Actually it is no more than a simple prime sieve implemented as a Python class (and, yes, I realize that there are plenty of these around). If I understand correctly, you have a list of roughly a billion True/False values indicating which integers are prime and which are not. You would like to discover how many prime numbers there are between two numbers a and b. You currently do this by counting the number of True values in your list between the indices a and b. If my description is correct then I would definitely consider using a different algorithmic approach. The density of primes from 1 to 1 billion is about 5%. Storing the prime numbers themselves in a sorted list would save memory and allow a potentially more efficient way of counting the number of primes within some interval. In fact it is probably quicker if you don't mind using all that memory to just store the cumulative sum of your prime True/False indicator list. This would be the prime counting function pi(n). You can then count the primes between a and b in constant time with pi[b] - pi[a]. I did wonder whether, after creating the sieve, I should simply go through the list and replace the True values with a count. This would certainly speed up the prime count function, which is where the issue arises. I will try this and see what sort of performance trade-offs this involves. 
By doing that replacement, you'd increase memory usage manyfold (maybe 3:1, I don't know). As long as you're only using bools in the list, you only have the list overhead to consider, because all the objects involved are already cached (True and False exist only once each). If you have integers, you'll need a new object for each nonzero count. -- DaveA -- http://mail.python.org/mailman/listinfo/python-list
Re: optomizations
In article mailman.944.1366680414.3114.python-l...@python.org, Rodrick Brown rodrick.br...@gmail.com wrote: I would like some feedback on possible solutions to make this script run faster. If I had to guess, I would think this stuff:

    line = line.replace('mediacdn.xxx.com', 'media.xxx.com')
    line = line.replace('staticcdn.xxx.co.uk', ' static.xxx.co.uk')
    line = line.replace('cdn.xxx', 'www.xxx')
    line = line.replace('cdn.xxx', 'www.xxx')
    line = line.replace('cdn.xx', 'www.xx')
    siteurl = line.split()[6].split('/')[2]
    line = re.sub(r'\bhttps?://%s\b' % siteurl, '', line, 1)

You make 6 copies of every line. That's slow. But I'm also going to quote something I wrote here a couple of months back: I've been doing some log analysis. It's been taking a grovelingly long time, so I decided to fire up the profiler and see what's taking so long. I had a pretty good idea of where the ONLY TWO POSSIBLE hotspots might be (looking up IP addresses in the geolocation database, or producing some pretty pictures using matplotlib). It was just a matter of figuring out which it was. As with most attempts to out-guess the profiler, I was totally, absolutely, and embarrassingly wrong. So, my real advice to you is to fire up the profiler and see what it says. -- http://mail.python.org/mailman/listinfo/python-list
Re: optomizations
On 23/04/2013 02:19, Rodrick Brown wrote: I would like some feedback on possible solutions to make this script run faster. The system is pegged at 100% CPU and it takes a long time to complete.

    #!/usr/bin/env python
    import gzip
    import re
    import os
    import sys
    from datetime import datetime
    import argparse

    if __name__ == '__main__':
        parser = argparse.ArgumentParser()
        parser.add_argument('-f', dest='inputfile', type=str, help='data file to parse')
        parser.add_argument('-o', dest='outputdir', type=str, default=os.getcwd(), help='Output directory')
        args = parser.parse_args()

        if len(sys.argv[1:]) < 1:
            parser.print_usage()
            sys.exit(-1)

        print(args)
        if args.inputfile and os.path.exists(args.inputfile):
            try:
                with gzip.open(args.inputfile) as datafile:
                    for line in datafile:
                        line = line.replace('mediacdn.xxx.com', 'media.xxx.com')
                        line = line.replace('staticcdn.xxx.co.uk', 'static.xxx.co.uk')

These next 2 lines are duplicates; the second will have no effect (I think!).

                        line = line.replace('cdn.xxx', 'www.xxx')
                        line = line.replace('cdn.xxx', 'www.xxx')

Won't the next line also do the work of the preceding 2 lines?

                        line = line.replace('cdn.xx', 'www.xx')
                        siteurl = line.split()[6].split('/')[2]
                        line = re.sub(r'\bhttps?://%s\b' % siteurl, '', line, 1)

                        (day, month, year, hour, minute, second) = (line.split()[3]).replace('[','').replace(':','/').split('/')
                        datelog = '{} {} {}'.format(month, day, year)
                        dateobj = datetime.strptime(datelog, '%b %d %Y')

                        outfile = '{}{}{}_combined.log'.format(dateobj.year, dateobj.month, dateobj.day)
                        outdir = (args.outputdir + os.sep + siteurl)

                        if not os.path.exists(outdir):
                            os.makedirs(outdir)

                        with open(outdir + os.sep + outfile, 'w+') as outf:
                            outf.write(line)

            except IOError, err:
                sys.stderr.write("Error unable to read or extract inputfile: {} {}\n".format(args.inputfile, err))
                sys.exit(-1)

I wonder whether it'll make a difference if you read a chunk at a time (datafile.read(chunk_size) + datafile.readline() to ensure you have complete lines), perform the replacements on it (so that you're working on several lines in one go), and then split it into lines for further processing. Another thing you could try is caching the result of parsing the date, using (month, day, year) as the key and outfile as the value in a dict. A third thing you could try is not writing a file for every line (doesn't the 'w+' mode truncate the file?), but saving the output for each chunk (see first suggestion) and then writing the files afterwards, at the end of the chunk. -- http://mail.python.org/mailman/listinfo/python-list
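The date-caching suggestion could be sketched as follows, assuming Python 3.2 or later for functools.lru_cache (a plain dict keyed on (month, day, year) would do the same job). The function name outfile_name is invented here; the filename format is copied from the original script, including its unpadded month and day.

```python
from datetime import datetime
from functools import lru_cache

@lru_cache(maxsize=None)
def outfile_name(month, day, year):
    """Parse the date once per distinct (month, day, year) and reuse it."""
    dateobj = datetime.strptime('{} {} {}'.format(month, day, year), '%b %d %Y')
    return '{}{}{}_combined.log'.format(dateobj.year, dateobj.month, dateobj.day)

print(outfile_name('Apr', '22', '2013'))  # 2013422_combined.log
```

A log file usually covers only a handful of dates, so almost every line would hit the cache instead of calling strptime. Whether that matters in practice is something a profiler run would have to confirm.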
Re: optomizations
On Mon, Apr 22, 2013 at 6:53 PM, Roy Smith r...@panix.com wrote: So, my real advice to you is to fire up the profiler and see what it says. I agree. Fire up a line-oriented profiler and only then start trying to improve the hot spots. -- http://mail.python.org/mailman/listinfo/python-list
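For anyone who has not used a profiler before, here is a minimal sketch with cProfile and pstats from the standard library. Note that cProfile reports per function, not per line, so it is a starting point and not the line-oriented profiler mentioned above. The process function is a hypothetical stand-in for the real per-line work.

```python
import cProfile
import io
import pstats

def process(lines):
    # Hypothetical stand-in for the per-line work in the script
    return [line.replace('cdn.', 'www.') for line in lines]

def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats('cumulative').print_stats(10)  # ten most expensive entries
    return result, out.getvalue()

result, report = profile_call(process, ['http://cdn.example.com/a'] * 1000)
print(report)
```

A whole script can also be profiled without editing it, by running python -m cProfile -s cumulative followed by the script name and its arguments.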
Re: Ubuntu package python3 does not include tkinter
On Apr 23, 5:22 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: We're also glossing over what it means to be a dependency. This is not obvious, and in fact I would argue that X is NOT a dependency for tkinter, even though tkinter will not work without it, for some definition of "work". I can quite happily import tkinter on a remote machine over ssh: Yes, the crux of the matter is what it means 'to work' and therefore 'to not work'. Let's say my car is 'not working'. On further investigation it's found that the petrol tank is empty. A case could be made either way: 'it (the car) is working' or 'it's not working'. To the extent that, pragmatically, 'not working' is something attended to by a mechanic, it's not in that category. To the extent that (even more pragmatically) I missed an important appointment, it is in that category. Both of which gloss over the fact that after filling the petrol it may still not work. So to conclude: 'since I could not check, it's vacuously working' is more problematic than the contrary, 'since I could not check, it's vacuously not working'. Package systems need to 'federate', so to speak, workingness from a zillion packages to the whole system. The problem is that workingness is peculiar to each package. Therefore it seems reasonable to me to ask of a package system that
- it allows a maximum number of different configurations for different requirements ('without crap')
- it disallows all kinds of misconfigured/non-working systems -- therefore conservative dependencies are good
- the above subject to reasonable best efforts -- so don't cater to fringe pathological cases (like "I want Tkinter but not X")
BTW I suggested earlier that python could have something like KDE (Kde-full and a smaller Kde-standard). Just checked that python already has python2.7 and python2.7-minimal, where the description of the latter says: it can be used in the boot process for basic tasks -- http://mail.python.org/mailman/listinfo/python-list
Re: optomizations
On Mon, 22 Apr 2013 21:19:23 -0400, Rodrick Brown wrote: I would like some feedback on possible solutions to make this script run faster. The system is pegged at 100% CPU and it takes a long time to complete. Have you profiled the app to see where it is spending all its time? What does "a long time" mean? For instance: "It takes two hours to process a 15KB file" -- you have a problem. "It takes 20 minutes to process a 15GB file" -- and why are you complaining? Or somewhere in the middle... But before profiling, I suggest you clean up the program. For example:

    if args.inputfile and os.path.exists(args.inputfile):

Don't do that. There really isn't any point in checking whether the input file exists, since: 1) Just because it exists doesn't mean you can read it; 2) Just because you can read it doesn't mean it is a valid gzip file; 3) Just because it is a valid gzip file that you can read *now*, doesn't mean that it still will be in 10 milliseconds when you actually try to open the file. A lot can happen in 10ms, or 1ms. The file might be deleted, or overwritten, or permissions changed. Change that to:

    try:
        with gzip.open(args.inputfile) as datafile:
            for line in datafile:

and catch the exception if the file doesn't exist, or cannot be read. Which you already do, which just demonstrates that the call to os.path.exists is a waste of effort. Then look for wasted effort like this:

    line = line.replace('cdn.xxx', 'www.xxx')
    line = line.replace('cdn.xx', 'www.xx')

Surely the first line is redundant, since it would be correctly caught and replaced by the second? Also, you're searching the file system *for every line* in the input file. Pull this outside of the loop and have it run once:

    if not os.path.exists(outdir):
        os.makedirs(outdir)

Likewise for opening and closing the output file, which you currently open and close for every line. It only needs to be opened and closed once. 
If it comes down to micro-optimizations to shave a few microseconds off, consider using string % formatting rather than the format method. But really, if you find yourself shaving microseconds off something that runs for ten minutes, you have to ask why you're bothering. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
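The "just try it and catch the exception" pattern could look like the sketch below on Python 3. The function name count_lines and the choice to return None on failure are assumptions made for the example; the original script is Python 2 and exits instead.

```python
import gzip
import sys

def count_lines(path):
    """Return the number of lines in a gzip file, or None if it cannot be read."""
    try:
        with gzip.open(path, 'rt') as datafile:
            return sum(1 for _ in datafile)
    except (OSError, EOFError) as err:
        # Covers a missing file, bad permissions, a non-gzip file
        # and a truncated archive, with no prior os.path.exists check.
        sys.stderr.write('Error: unable to read {}: {}\n'.format(path, err))
        return None

print(count_lines('no-such-file.log.gz'))  # None
```

The open and the read sit inside the same try block, so a file that disappears or turns out to be corrupt between "check" and "use" is handled by the same code path.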
Re: optomizations
On Tue, Apr 23, 2013 at 2:00 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: Also, you're searching the file system *for every line* in the input file. Pull this outside of the loop and have it run once: if not os.path.exists(outdir): os.makedirs(outdir) Likewise for opening and closing the output file, which you currently open and close it for every line. It only needs to be opened and closed once. The outdir depends on the line, though. Hence my suggestion to retain the open files in a dictionary. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
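A runnable sketch of the keep-the-files-open-in-a-dictionary idea, assuming Python 3.3 or later for contextlib.ExitStack. The function name split_by_key and the (key, line) record format are made up for illustration; in the real script the key would be derived from siteurl and the date.

```python
import os
from contextlib import ExitStack

def split_by_key(records, outdir):
    """Write each (key, line) record to outdir/<key>.log, opening each file once."""
    with ExitStack() as stack:
        handles = {}  # key -> open file object
        for key, line in records:
            outf = handles.get(key)
            if outf is None:
                path = os.path.join(outdir, key + '.log')
                os.makedirs(os.path.dirname(path), exist_ok=True)
                # ExitStack closes every registered file when the block ends
                outf = stack.enter_context(open(path, 'w'))
                handles[key] = outf
            outf.write(line)
    return sorted(handles)
```

One caveat: with a very large number of distinct keys this could run into the operating system's limit on open files, in which case some form of batching would be needed.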
Re: Ubuntu package python3 does not include tkinter
On Tue, 23 Apr 2013 10:36:38 +1000, Chris Angelico wrote: On Tue, Apr 23, 2013 at 10:22 AM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: It's only when I actually try to do something that requires an X display that it will fail. I won't show the entire traceback, because it is long and not particularly enlightening, but the final error message explains exactly why it isn't working: _tkinter.TclError: no display name and no $DISPLAY environment variable You presumably have a system to test this on. Can you try using ssh -X to get to it, and then retry that action? It looks like you actually have everything you need, just no display... which is exactly what you'd get if you ssh to something that has a real GUI. Not a dependency problem. I didn't say it was a dependency problem. I'm just demonstrating that it is possible for tkinter code to fail even if all the dependencies are met; and on the other hand, it is useful to be able to import tkinter even if you cannot display any tkinter windows. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
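The distinction between "not installed" and "installed, but no display" can be probed from code. This is a rough sketch assuming Python 3; the function name probe_tkinter and the wording of the returned messages are invented for the example.

```python
def probe_tkinter():
    """Report whether tkinter can be imported and whether a window can be created."""
    try:
        import tkinter
    except ImportError:
        return 'tkinter is not installed'
    try:
        root = tkinter.Tk()  # this is the step that needs a display
    except tkinter.TclError as err:
        return 'tkinter imported, but no window could be created: {}'.format(err)
    root.destroy()
    return 'tkinter is usable'

print(probe_tkinter())
```

On a machine with tkinter installed but no display, the second branch would be expected to report the "no display name and no $DISPLAY environment variable" error quoted above.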
Re: Ubuntu package python3 does not include tkinter
On Tue, Apr 23, 2013 at 2:03 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: On Tue, 23 Apr 2013 10:36:38 +1000, Chris Angelico wrote: On Tue, Apr 23, 2013 at 10:22 AM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: It's only when I actually try to do something that requires an X display that it will fail. I won't show the entire traceback, because it is long and not particularly enlightening, but the final error message explains exactly why it isn't working: _tkinter.TclError: no display name and no $DISPLAY environment variable You presumably have a system to test this on. Can you try using ssh -X to get to it, and then retry that action? It looks like you actually have everything you need, just no display... which is exactly what you'd get if you ssh to something that has a real GUI. Not a dependency problem. I didn't say it was a dependency problem. I'm just demonstrating that it is possible for tkinter code to fail even if all the dependencies are met; and on the other hand, it is useful to be able to import tkinter even if you cannot display any tkinter windows. Sure. But I don't know that the situation you're seeing is the same as the one you'd see if you install tkinter without tk. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: optomizations
On Apr 22, 2013, at 11:18 PM, Dan Stromberg drsali...@gmail.com wrote: On Mon, Apr 22, 2013 at 6:53 PM, Roy Smith r...@panix.com wrote: So, my real advice to you is to fire up the profiler and see what it says. I agree. Fire up a line-oriented profiler and only then start trying to improve the hot spots. Got a doc or URL? I have no experience working with Python profilers. -- http://mail.python.org/mailman/listinfo/python-list