Re: [Tutor] Retrieving data from a web site

Peter Otten Sat, 18 May 2013 05:47:12 -0700

Phil wrote:

> On 18/05/13 19:25, Peter Otten wrote:
>>
>> Are there alternatives that give the number as plain text?
> 
> Further investigation shows that the numbers are available if I view the
> source of the page. So, all I have to do is parse the page and extract
> the drawn numbers. I'm not sure, at the moment, how I might do that but
> I have something to work with.


You can use a tool like lxml that "understands" html (though in this case 
you'd need a javascript parser on top of that) -- or hack something together 
with string methods or regular expressions. For example:

import urllib2
import json

s = urllib2.urlopen("http://*********/goldencasket";).read()
s = s.partition("latestResults_productResults")[2].lstrip(" =")
s = s.partition(";")[0]
data = json.loads(s)
lotto = data["GoldLottoSaturday"]

print lotto["drawDayDateNumber"]
print map(int, lotto["primaryNumbers"])
print map(int, lotto["secondaryNumbers"])

While this is brittle I've found that doing it "right" is usually not 
worthwhile as it won't survive the next website redesign eighter.

PS: <http://*********/goldencasket/results/download-results>
has links to zipped csv files with the results. Downloading, inflating and 
reading these should be the simplest and best way to get your data.

_______________________________________________
Tutor maillist  -  [email protected]
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] Retrieving data from a web site

Reply via email to