[Tutor] Weighted Random Choice - Anyone have an efficient algorithm?
Does anyone know of an efficient way of doing a weighted random choice? (I don't even know what algorithms like this would be called.) Preferably, something that doesn't grow exponentially with the number of elements in the list, or the size of their respective values.

For example: Assume I have a list of integer elements. I want to pick an index from that list, using the value of said elements to control the probability of selecting a given index:

w = [5, 20, 75]
wrand(w)

Where wrand() will return '2' (the index of 75) about 75% of the time. It would return '1' (the index of 20) about 20% of the time, and '0' about 5% of the time. It could return the actual value, but returning the index seems more flexible.

The implementation doesn't have to work exactly like this; if the weight values don't add up to 100, or are arbitrary, that's fine. Hopefully you get the idea. Here's what I tried (it works, but is slow):

### Begin Code Example ###

import random
random.seed(2) #<-- So we can reproduce the sequence for testing.

def wrandom(weights):
    '''
    Return a randomly selected index of the 'weights' list, using the
    values of that list to affect probability of said index being returned.
    '''
    debug = False
    flattened = []
    for i, w in enumerate(weights):
        if debug: print "i, w: ", i, w
        for k in range(w):
            flattened.append(i)
    if debug: print "flattened: ", flattened
    rnd = random.randint(0, (len(flattened) - 1))
    return flattened[rnd]

# Now test it:
print wrandom([5, 20, 75])
print wrandom([5, 20, 75])
print wrandom([5, 20, 75])

### End Code Example ###

It works and is easy enough to understand, but when the number of list items gets large, or the weights get heavy, things get ugly.

-Modulok-
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
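One common way to avoid building the flattened list is to keep running totals of the weights and binary-search them. The sketch below is not from the thread; it is written for Python 3 and assumes non-negative integer weights, as in the question. The name wrandom is reused from the post, while the rng parameter is an addition made only so the behaviour can be reproduced in tests.

```python
import bisect
import itertools
import random


def wrandom(weights, rng=random):
    """Return a random index into 'weights', chosen with probability
    proportional to the (non-negative integer) weight at that index."""
    # Running totals, e.g. [5, 20, 75] -> [5, 25, 100].
    cumulative = list(itertools.accumulate(weights))
    if not cumulative or cumulative[-1] <= 0:
        raise ValueError("weights must be non-empty and sum to a positive number")
    # Pick an integer point in [0, total) and find the first running
    # total that is strictly greater than it.
    point = rng.randrange(cumulative[-1])
    return bisect.bisect_right(cumulative, point)


# Example call; the result varies from run to run.
print(wrandom([5, 20, 75]))
```

The work per call is proportional to the number of weights rather than to their sum, and if the same weights are reused the running totals could be computed once so that each draw is only a binary search. On Python 3.6 and later, random.choices(range(len(w)), weights=w) in the standard library covers the same need.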
[Tutor] classes for setting 2D envelopes
Hi all,

I've been working on creating 2D bounding box (envelope) classes to describe spatial data. Variations of these are available in other spatial libraries (e.g. Shapely), although I haven't found envelopes specific to raster data that also specify cell size. Could be I just haven't found them yet.

I have two classes (Envelope and RasterEnvelope). Envelope specifies a bounding box and does error checking (through __setattr__) when a coordinate is changed. RasterEnvelope additionally specifies a cell size (along with n_rows and n_cols), does the bounds checking on a coordinate change and adjusts the spatial envelope accordingly (again through __setattr__). I've posted the code (and some unit tests) here:

http://pastebin.com/Twf3RjWa

So far things work, but I have a nagging feeling that there's too much work devoted to changing coordinates. In another thread I posted a while back (http://mail.python.org/pipermail/tutor/2010-August/077940.html), Steven D'Aprano recommended using immutable types for point coordinate data and I'm guessing the advice might be applicable here as well? If so, I'm a slow learner ;)

There are also (at least) a couple of design flaws right now that could be remedied if I put more work into it:

- A RasterEnvelope is 'pinned' by its upper-left coordinate and only changes to x_min or y_max will cause changes in this coordinate (by design but probably a limitation)
- Changing cell size currently doesn't change the corresponding window because there is no specification as to whether n_cols/n_rows should change or x_max/y_min should change.

Both these probably suggest creating new RasterEnvelope instances any time a coordinate changes?

Any feedback would be welcome, so that I don't devote too much more time down a bad route.

thanks, matt
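The idea of creating a new instance whenever a coordinate changes could look roughly like the sketch below. It is not the pastebin code: it assumes Python 3.7+ dataclasses, and the field names and the single validation rule are illustrative guesses based on the description above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Envelope:
    """Immutable bounding box; every 'change' produces a new instance."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def __post_init__(self):
        # Runs on every construction, including the ones made by replace().
        if self.x_min > self.x_max or self.y_min > self.y_max:
            raise ValueError("min coordinates must not exceed max coordinates")


env = Envelope(0.0, 0.0, 10.0, 5.0)
wider = replace(env, x_max=20.0)  # new object; 'env' itself is untouched
print(env, wider)
```

Because validation happens only at construction time, no rollback logic is needed: an invalid change raises before any new object exists, and the old one is still intact.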
Re: [Tutor] calling setters of superclasses
On 12/18/2010 2:06 AM, Peter Otten wrote:
> I don't think /how/ you are trying it is stupid though I'm not so sure about
> /what/ .

Thank you all for very helpful suggestions. It took me a while to chew on this before I could respond. I learned a lot about descriptors and their interactions with properties that I hadn't fully understood before.

Peter and Alan's advice to create a check method that is overridden in subclasses makes sense in order to avoid the naming conflicts. And I also like Hugo's idea of applying separate descriptor classes to handle the constraints introduced. That seems to be a flexible way of doing things.

As far as the /what/, my example given was obviously contrived. I'm really trying to create classes for 2D envelopes that describe the bounding extent of spatial data. I have both Envelope and RasterEnvelope classes - the former being just a bounding box around any spatial data, the latter additionally specifying a raster cell size and being able to discern rows, columns, etc. I had been using the setter to do bounds checking on the Envelope class (e.g. make sure x_min isn't bigger than x_max, etc. and rolling back changes if so). For the RasterEnvelope class, I first wanted to call the Envelope bounds checking and then to adjust rows/columns if a bigger extent was requested.

But you've successfully scared me away from using properties (in a hierarchical way at least) and I've been able to get what I need by just defining __setattr__ in both classes. Whether I did that correctly is a story for another thread ...

matt
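The "check method overridden in subclasses" suggestion mentioned above might look something like this. It is only a guess at the shape of that advice, written for Python 3: the hook name _check_x_min and the whole-number-of-cells rule are hypothetical and do not come from the thread.

```python
class Envelope:
    """Bounding box whose x_min setter delegates validation to a hook."""

    def __init__(self, x_min, x_max):
        self._x_min = x_min
        self._x_max = x_max

    @property
    def x_min(self):
        return self._x_min

    @x_min.setter
    def x_min(self, value):
        # Subclasses extend the hook, not the property itself.
        self._check_x_min(value)
        self._x_min = value

    def _check_x_min(self, value):
        if value > self._x_max:
            raise ValueError("x_min must not exceed x_max")


class RasterEnvelope(Envelope):
    def __init__(self, x_min, x_max, cell_size):
        super().__init__(x_min, x_max)
        self.cell_size = cell_size

    def _check_x_min(self, value):
        super()._check_x_min(value)  # base-class bounds check first
        # Hypothetical extra rule: the extent must be a whole number of cells.
        if (self._x_max - value) % self.cell_size != 0:
            raise ValueError("extent must be a whole number of cells")


raster = RasterEnvelope(0, 100, 10)
raster.x_min = 20  # passes both checks
```

Since the value is only stored after every check has passed, a failed assignment leaves the object unchanged and there is nothing to roll back.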
Re: [Tutor] Dictionary Question
On Wed, 22 Dec 2010 23:31:39 +1100, Steven D'Aprano wrote:

> In this case, you need to sum the number of races for all the fixtures:
>
> num_races = sum(len(racetimes) for racetimes in FixtureDict.values())

Many thanks Steven for your explanation and final golden nugget of code.

On Wed, 22 Dec 2010 10:11:25 -0500, bob gailer wrote:

> Also note: len(dict.keys()) == len(dict.values()) == len(dict)

Yup, thanks Bob.

Cheers,
G.
Re: [Tutor] xml.etree.ElementTree.parse() against a XMLShema file
On 12/22/2010 10:32 PM, Stefan Behnel wrote:
> Karim, 22.12.2010 22:09:
>> Will using lxml (except for the different import) be fully compatible
>> with the ET code? Do I have to adapt it?
>
> There are certain differences.
>
> http://codespeak.net/lxml/compatibility.html
>
> This page hasn't been changed for ages, but it should still be mostly
> accurate.

I will have a look. Anyway, I must deliver my current version. I have 300 lines of code which should be easy to translate for the improved future version.

>> I saw your fantastic benchmarks! Why the hell is lxml not integrated into
>> the stdlib? I thought they put into it the things that work best for
>> Python's interests?
>
> I proposed it but it was rejected with the argument that it's a huge
> dependency and brings in two large C libraries that will be hard to control
> for future long-term maintenance. I think that's a reasonable objection.

One should never say never... I think they will reconsider it. Thanks for your support!

Regards
Karim
Re: [Tutor] xml.etree.ElementTree.parse() against a XMLShema file
Karim, 22.12.2010 22:09:
> Will using lxml (except for the different import) be fully compatible
> with the ET code? Do I have to adapt it?

There are certain differences.

http://codespeak.net/lxml/compatibility.html

This page hasn't been changed for ages, but it should still be mostly accurate.

> I saw your fantastic benchmarks! Why the hell is lxml not integrated into
> the stdlib? I thought they put into it the things that work best for
> Python's interests?

I proposed it but it was rejected with the argument that it's a huge dependency and brings in two large C libraries that will be hard to control for future long-term maintenance. I think that's a reasonable objection.

Stefan
Re: [Tutor] xml.etree.ElementTree.parse() against a XMLShema file
Thanks Stefan for answering. That's what I came up with.

Will using lxml (except for the different import) be fully compatible with the ET code? Do I have to adapt it?

I saw your fantastic benchmarks! Why the hell is lxml not integrated into the stdlib? I thought they put into it the things that work best for Python's interests?

Regards
Karim

On 12/22/2010 09:56 PM, Stefan Behnel wrote:
> Karim, 22.12.2010 19:28:
>> On 12/22/2010 07:07 PM, Karim wrote:
>>> Does somebody have an example of how to parse an XML file against a
>>> "grammar" file (file.xsd)?
>>
>> I found this:
>> http://www.velocityreviews.com/forums/t695106-re-xml-parsing-with-python.html
>>
>> Stefan, is the limitation of etree still true in Python 2.7.1?
>
> Yes, ElementTree (which is in Python's stdlib) and lxml.etree are separate
> implementations. If you want validation, use the lxml package.
>
> Stefan
Re: [Tutor] xml.etree.ElementTree.parse() against a XMLShema file
Karim, 22.12.2010 19:28:
> On 12/22/2010 07:07 PM, Karim wrote:
>> Does somebody have an example of how to parse an XML file against a
>> "grammar" file (file.xsd)?
>
> I found this:
> http://www.velocityreviews.com/forums/t695106-re-xml-parsing-with-python.html
>
> Stefan, is the limitation of etree still true in Python 2.7.1?

Yes, ElementTree (which is in Python's stdlib) and lxml.etree are separate implementations. If you want validation, use the lxml package.

Stefan
Re: [Tutor] xml.etree.ElementTree.parse() against a XMLShema file
On 12/22/2010 07:07 PM, Karim wrote:
> Does somebody have an example of how to parse an XML file against a
> "grammar" file (file.xsd)?

I found this:
http://www.velocityreviews.com/forums/t695106-re-xml-parsing-with-python.html

Stefan, is the limitation of etree still true in Python 2.7.1?

Regards
Karim
Re: [Tutor] Pyserial and invalid handle
Hi all,

A few days ago I faced the same issue on my Windows 7 64-bit machine; it works well on Windows XP 32-bit. FYI, I'm currently using Python 2.6 (amd64) as my core programming tool. I've started studying Arduino robotics and rewriting the Python-Arduino API.

I did a quick dissection of the pyserial code and compared the ctypes HANDLE ranges. The two differ for the same kernel32 DLL function call:

for 64-bit, ctypes.wintypes.HANDLE(-1) ==> c_void_p(18446744073709551615L)
in 32-bit, it gives c_void_p(4294967295L)

So my guess is that it is caused by platform and environment differences. Below is what I modified on the 64-bit server:

a. win32.py (C:\Python26\Lib\site-packages\serial)
   a. use HRESULT to replace HANDLE for the kernel32 function handle return
b. serialwin32.py (C:\Python26\Lib\site-packages\serial)
   a. use win32file, pywintypes, win32event to rewrite the readfile and writefile functions

It seems to work now on my 64-bit server. I didn't put any effort into the self.timeout == 0 case for serial, as I don't have enough time for it; I will leave that to the pyserial owner to take the honor.

I hope this helps all open source members... (that's why I like Python most)

Regards,
Cheeng Shu Chin

#! python
# Python Serial Port Extension for Win32, Linux, BSD, Jython
# serial driver for win32
# see __init__.py
#
# (C) 2001-2009 Chris Liechti
# this is distributed under a free software license, see license.txt
#
# Initial patch to use ctypes by Giovanni Bajo

import ctypes
import win32
import win32file
import pywintypes
import win32event
#from ctypes.wintypes import HWND
from serialutil import *
#from pywin32_testutil import str2bytes

def device(portnum):
    """Turn a port number into a device name"""
    return 'COM%d' % (portnum+1) # numbers are transformed to a string

class Win32Serial(SerialBase):
    """Serial port implementation for Win32 based on ctypes."""

    BAUDRATES = (50, 75, 110, 134, 150, 200, 300, 600, 1200, 1800, 2400, 4800,
                 9600, 19200, 38400, 57600, 115200)

    def __init__(self, *args, **kwargs):
        self.hComPort = None
        SerialBase.__init__(self, *args, **kwargs)

    def open(self):
        """Open port with current settings. This may throw a SerialException
           if the port cannot be opened."""
        if self._port is None:
            raise SerialException("Port must be configured before it can be used.")
        # the "\\.\COMx" format is required for devices other than COM1-COM8
        # not all versions of windows seem to support this properly
        # so that the first few ports are used with the DOS device name
        port = self.portstr
        try:
            if port.upper().startswith('COM') and int(port[3:]) > 8:
                port = '\\\\.\\' + port
        except ValueError:
            # for like COMnotanumber
            pass
        self.hComPort = win32.CreateFile(port,
               win32.GENERIC_READ | win32.GENERIC_WRITE,
               0, # exclusive access
               None, # no security
               win32.OPEN_EXISTING,
               win32.FILE_ATTRIBUTE_NORMAL | win32.FILE_FLAG_OVERLAPPED,
               0)
        #self.pComPort=pywintypes.HANDLE(self.hComPort)
        #self.cComPort=HWND(self.hComPort)
        #print ctypes.sizeof(self.pComPort)
        #print ctypes.sizeof(self.hComPort)
        if self.hComPort == win32.INVALID_HANDLE_VALUE:
            self.hComPort = None    # 'cause __del__ is called anyway
            raise SerialException("could not open port %s: %s" % (self.portstr, ctypes.WinError()))

        # Setup a 4k buffer
        win32.SetupComm(self.hComPort, 4096, 4096)

        # Save original timeout values:
        self._orgTimeouts = win32.COMMTIMEOUTS()
        win32.GetCommTimeouts(self.hComPort, ctypes.byref(self._orgTimeouts))

        self._rtsState = win32.RTS_CONTROL_ENABLE
        self._dtrState = win32.DTR_CONTROL_ENABLE

        self._reconfigurePort()

        # Clear buffers:
        # Remove anything that was there
        win32.PurgeComm(self.hComPort,
                        win32.PURGE_TXCLEAR | win32.PURGE_TXABORT |
                        win32.PURGE_RXCLEAR | win32.PURGE_RXABORT)

        self._overlappedRead = win32.OVERLAPPED()
        self._overlappedRead.hEvent = win32.CreateEvent(None, 1, 0, None)
        self._overlappedWrite = win32.OVERLAPPED()
        #~ self._overlappedWrite.hEvent = win32.CreateEvent(None, 1, 0, None)
        self._overlappedWrite.hEvent = win32.CreateEvent(None, 0, 0, None)
        self._isOpen = True

    def _reconfigurePort(self):
        """Set communication parameters on opened port."""
        if not self.hComPort:
            raise SerialException("Can only operate on a valid port handle")

        # Set Windows timeout values
        # timeouts is a tuple with the following items:
        # (ReadIntervalTimeout,ReadTotalTimeoutMultiplier,
        #  ReadTotalTimeoutConst
[Tutor] xml.etree.ElementTree.parse() against a XMLShema file
Hello all,

Does somebody have an example of how to parse an XML file against a "grammar" file (file.xsd)? The default parser checks closing tags and attributes, but I would like to validate against an XSD file. I use the ElementTree module.

Regards
Karim
Re: [Tutor] vim as a python editor
On Wed, Dec 15, 2010 at 7:30 PM, Alan Gauld wrote:
> I also use split screen view in vim so that within vim I often have
> two or three buffers open at once all displayed in a split screen.

I know you prefer "default" settings, but one mapping I tend to stick in my .vimrcs wherever I go:

nmap w_

Which moves to the next screen and maximizes it. = will return to the equally distributed split screens. When coding an AJAX app, for example, I'll often have the HTML, the JS, the CSS, and the Python service, and the tests all up at once.

I also use "screen" a lot because I'm often working on remote machines without X involved, plus it lets me bounce between shell (docs, git, etc) and editing easily with the added benefit of not having to worry about communication disruption (local power outage, etc) killing my work in progress. Often once I get an environment perfectly tailored (yay for virtualenv) I'll have a screen session manage the entire thing until the project is done, just reconnecting to it when I'm working on it.

I highly recommend both screen and virtualenv to anyone that is unfamiliar with them.

--
Brett Ritter / SwiftOne
swift...@swiftone.org
Re: [Tutor] Dictionary Question
On 12/22/2010 7:31 AM, Steven D'Aprano wrote:

Also note: len(dict.keys()) == len(dict.values()) == len(dict)

--
Bob Gailer
919-636-4239
Chapel Hill NC
Re: [Tutor] Dictionary Question
Garry Bettle wrote:
> Howdy all,
>
> Hope this message finds everyone well.
>
> I have dictionary of keys and a string of values.
>
> i.e.
>
> 8 Fixtures:

I assume each fixture is a key, e.g. Swin, HGrn, etc.

> Swin  1828 1844 1901 1916 1932 1948 2004 2019 2036 2052 2107 2122
> HGrn  1148 1204 1218 1232 1247 1304 1319 1333 1351
> Newc  1142 1157 1212 1227 1242 1258 1312 1327 1344 1403
> Yarm  1833 1849 1906 1922 1937 1953 2009 2024 2041 2057 2112
> BVue  1418 1437 1457 1517 1538 1558 1618 1637 1657 1717 1733 1747 1804 181
> Hove  1408 1427 1447 1507 1528 1548 1608 1627 1647 1707 1722 1738 1756 181
> Romfd 1930 1946 2003 2019 2035 2053 2109 2125 2141 2157 2213 2230
> Sund  1839 1856 1911 1927 1943 1958 2014 2031 2047 2102 2117
>
> I print that with the following:
>
> f = open(SummaryFile, 'a')
> header = "%d Fixtures:\n" % len(FixtureDict.keys())
> print header
> f.write(header)
> f.write("\n")
> for fixture, racetimes in FixtureDict.iteritems():
>     line = "%s\t%s" % (fixture, " ".join(racetimes))

According to your description, racetimes is already a single string, so using join on it would be the wrong thing to do:

    >>> racetimes = "1839 1856 1911"
    >>> " ".join(racetimes)
    '1 8 3 9   1 8 5 6   1 9 1 1'

So what is racetimes? Is it a string, or is it a list of strings?

    ['1839', '1856', '1911']

I'm going to assume the latter. That's the right way to do it.

>     print line
>     f.write(line + "\n")
> f.write("\n")
> f.close()
>
> What I'd like to do is add the number of values to the Header line.
>
> So how would I get i.e.
>
> 8 Fixtures, 93 Races
>
> I tried
>
> header = "%d Fixtures, %d Races:\n" % (len(FixtureDict.keys()),
> len(FixtureDict.values()))
>
> But I get
>
> print header
> 8 Fixtures, 8 Races
>
> Any ideas?

You need len(racetimes) rather than len(FixtureDict.values()).

Every dict has exactly one value for every key, always, without exception. That is, len(dict.keys()) == len(dict.values()). In this case, the values are lists of multiple start times, but each list counts as one value. You need to count the number of items inside each value, not the number of values.

In this case, you need to sum the number of races for all the fixtures:

num_races = sum(len(racetimes) for racetimes in FixtureDict.values())

--
Steven
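A small worked example of the difference between counting values and counting the items inside them. The sample data is made up for illustration (it is not the poster's full dictionary), each value is assumed to be a list of strings, and the snippet uses Python 3 print syntax.

```python
# Hypothetical sample data: fixture name -> list of race-time strings.
fixture_dict = {
    "Swin": ["1828", "1844", "1901"],
    "HGrn": ["1148", "1204"],
}

num_fixtures = len(fixture_dict)            # one entry per key
num_values = len(fixture_dict.values())     # always equals num_fixtures
num_races = sum(len(racetimes) for racetimes in fixture_dict.values())

header = "%d Fixtures, %d Races:" % (num_fixtures, num_races)
print(header)  # 2 Fixtures, 5 Races:
```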
[Tutor] Dictionary Question
Howdy all,

Hope this message finds everyone well.

I have dictionary of keys and a string of values.

i.e.

8 Fixtures:

Swin  1828 1844 1901 1916 1932 1948 2004 2019 2036 2052 2107 2122
HGrn  1148 1204 1218 1232 1247 1304 1319 1333 1351
Newc  1142 1157 1212 1227 1242 1258 1312 1327 1344 1403
Yarm  1833 1849 1906 1922 1937 1953 2009 2024 2041 2057 2112
BVue  1418 1437 1457 1517 1538 1558 1618 1637 1657 1717 1733 1747 1804 181
Hove  1408 1427 1447 1507 1528 1548 1608 1627 1647 1707 1722 1738 1756 181
Romfd 1930 1946 2003 2019 2035 2053 2109 2125 2141 2157 2213 2230
Sund  1839 1856 1911 1927 1943 1958 2014 2031 2047 2102 2117

I print that with the following:

f = open(SummaryFile, 'a')
header = "%d Fixtures:\n" % len(FixtureDict.keys())
print header
f.write(header)
f.write("\n")
for fixture, racetimes in FixtureDict.iteritems():
    line = "%s\t%s" % (fixture, " ".join(racetimes))
    print line
    f.write(line + "\n")
f.write("\n")
f.close()

What I'd like to do is add the number of values to the Header line.

So how would I get i.e.

8 Fixtures, 93 Races

I tried

header = "%d Fixtures, %d Races:\n" % (len(FixtureDict.keys()), len(FixtureDict.values()))

But I get

print header
>> 8 Fixtures, 8 Races

Any ideas?

Cheers,
Garry
Re: [Tutor] pyodbc/date values in MS Access
Hi,

Sorry for the late reply, but thanks a lot for helping me. It's solved now. Peter, the link you posted in another thread (or should I say 'query') was also relevant AND funny (http://xkcd.com/327/)

Merry Christmas and Happy Coding! *)

Cheers!!
Albert-Jan

*) Including those who have to parse a huge xml file *winks* ;-)

~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~

From: Peter Otten <__pete...@web.de>
To: tutor@python.org
Sent: Wed, December 15, 2010 3:06:19 PM
Subject: Re: [Tutor] pyodbc/date values in MS Access

Albert-Jan Roskam wrote:

> Hi,
>
> I'm using pyodbc (Python 2.5) to insert records in an MS Access database.
> For security reasons, question marks should be used for string replacement
> [*]. The standard %s would make the code vulnerable to sql code injection.
> Problem is, string replacement in the Good Way somehow doesn't work when
> the values are dates. Below, snippet #1 does not work (Access says the
> inserted value is not consistent with the defined datatype), but #2 does.
> I tried various other ways (ie. DateValue, CDate, etc.) but none of them
> works. Is there a solution for this?
>
> [*] see http://code.google.com/p/pyodbc/wiki/GettingStarted --> under
> 'Parameters'
>
> ### 1
> sql = "INSERT INTO tblSomeTable (myDate) VALUES (?);"
> cursor.execute(sql, "#01/01/2010#")

(1) Try providing the date in ISO format "yyyy-mm-dd"

"2010-01-01"

or (even better if supported) as a date value

from datetime import date
date(2010, 1, 1)

(2) Wrap the value into a tuple which I think is required by the Python DBAPI.

cursor.execute(sql, ("2010-01-01",))
cursor.execute(sql, (date(2010, 1, 1),))

Peter
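The two suggestions (an ISO-formatted date, and parameters wrapped in a tuple) can be tried without an Access database. The sketch below uses the standard library's sqlite3 module purely as a stand-in, on the assumption that it is comparable because it also uses "?" placeholders; it says nothing about how pyodbc or Access treat date values. The table and column names are taken from the quoted snippet.

```python
import sqlite3
from datetime import date

# Stand-in: an in-memory SQLite database instead of MS Access via pyodbc.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE tblSomeTable (myDate TEXT)")

sql = "INSERT INTO tblSomeTable (myDate) VALUES (?);"
# Parameters go in a sequence, even when there is only one of them.
cursor.execute(sql, (date(2010, 1, 1).isoformat(),))
conn.commit()

rows = cursor.execute("SELECT myDate FROM tblSomeTable").fetchall()
print(rows)  # [('2010-01-01',)]
```

The value never becomes part of the SQL text, which is what protects against injection. With sqlite3, passing the bare string instead of a one-element tuple is rejected, because the string is treated as a sequence of ten separate parameters.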
Re: [Tutor] Trying to parse a HUGE(1gb) xml file in python
Walter Prins, 21.12.2010 22:13:
> On 21 December 2010 17:57, Alan Gauld wrote:
>> "Stefan Behnel" wrote
>>>> But I don't understand how uncompressing a file before parsing it can
>>>> be faster than parsing the original uncompressed file?
>>>
>>> I didn't say "uncompressing a file *before* parsing it". I meant
>>> uncompressing the data *while* parsing it.
>>
>> Ah, ok that can work, although it does add a layer of processing
>> to identify compressed v uncompressed data, but if I/O is the
>> bottleneck then it could give an advantage.
>
> OK my apologies, I see my previous response was already circumscribed by
> later emails (which I had not read yet.) Feel free to ignore it. :)

Not much of a reason to apologize. Especially on a newbie list like python-tutor, a few words more or a different way of describing things may help in widening the set of readers who understand and manage to follow other people's arguments.

Stefan
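"Uncompressing the data while parsing it" can be sketched with the standard library alone: gzip.open() returns a file-like object that decompresses as it is read, and ElementTree's iterparse() consumes it incrementally. This is an illustrative Python 3 sketch rather than the code discussed in the thread; the helper name count_elements and the tiny demo file are made up.

```python
import gzip
import os
import tempfile
import xml.etree.ElementTree as ET


def count_elements(gz_path, tag):
    """Count <tag> elements in a gzip-compressed XML file without first
    writing a decompressed copy or building the whole tree in memory."""
    count = 0
    # The stream is decompressed chunk by chunk as the parser reads it.
    with gzip.open(gz_path, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if elem.tag == tag:
                count += 1
            elem.clear()  # discard the finished element's content
    return count


# Tiny demonstration file (hypothetical content).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.xml.gz")
with gzip.open(demo_path, "wb") as out:
    out.write(b"<root><item>a</item><item>b</item><item>c</item></root>")

print(count_elements(demo_path, "item"))  # 3
```

Whether this is faster than reading an uncompressed file depends on whether disk I/O or CPU is the bottleneck, as discussed above.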