Phil wrote:
> On 18/05/13 19:25, Peter Otten wrote:
>>
>> Are there alternatives that give the number as plain text?
>
> Further investigation shows that the numbers are available if I view the
> source of the page. So, all I have to do is parse the page and extract
> the drawn numbers. I'm not sure, at the moment, how I might do that but
> I have something to work with.

You can use a tool like lxml that "understands" HTML (though in this case
you'd need a JavaScript parser on top of that) -- or hack something together
with string methods or regular expressions. For example:
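
import json
import urllib2

# Fetch the raw page source (Python 2).
s = urllib2.urlopen("http://*********/goldencasket").read()
# Keep what follows the JavaScript variable name, minus the " = ".
s = s.partition("latestResults_productResults")[2].lstrip(" =")
# Cut at the first semicolon, which ends the assignment.
s = s.partition(";")[0]
data = json.loads(s)
lotto = data["GoldLottoSaturday"]
print lotto["drawDayDateNumber"]
# Convert the drawn numbers to ints before printing.
print map(int, lotto["primaryNumbers"])
print map(int, lotto["secondaryNumbers"])
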
While this is brittle, I've found that doing it "right" is usually not
worthwhile, as it won't survive the next website redesign either.

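If semicolons inside the data worry you, here is a rough sketch of a slightly less fragile variant. It finds the assignment with a regular expression and lets json.JSONDecoder.raw_decode() read exactly one JSON value, ignoring whatever follows. It is written for Python 3, unlike the Python 2 snippet above, and it runs against a made-up sample string rather than the live page. The names extract_results and SAMPLE_PAGE are mine, and the sample values are invented; only the variable and key names come from the snippet above.

```python
import json
import re

# Made-up stand-in for the page source. Only the variable and key names
# are taken from the real page; the values are invented for illustration.
SAMPLE_PAGE = """
<script>
var latestResults_productResults = {"GoldLottoSaturday":
    {"drawDayDateNumber": "sample; draw 1",
     "primaryNumbers": ["1", "2", "3"],
     "secondaryNumbers": ["4", "5"]}};
var somethingElse = 1;
</script>
"""


def extract_results(page):
    """Return the JSON value assigned to latestResults_productResults."""
    match = re.search(r"latestResults_productResults\s*=\s*", page)
    if match is None:
        raise ValueError("results variable not found in page")
    # raw_decode() parses one JSON value starting at the given index and
    # returns it together with the index where parsing stopped.
    data, _end = json.JSONDecoder().raw_decode(page, match.end())
    return data


if __name__ == "__main__":
    lotto = extract_results(SAMPLE_PAGE)["GoldLottoSaturday"]
    print(lotto["drawDayDateNumber"])
    print([int(n) for n in lotto["primaryNumbers"]])
    print([int(n) for n in lotto["secondaryNumbers"]])
```

This only helps if the assigned value really is valid JSON, which I am assuming because json.loads() accepted it above. It is still tied to the variable name, so a redesign will break it just the same.
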
PS: <http://*********/goldencasket/results/download-results>
has links to zipped csv files with the results. Downloading, inflating and
reading these should be the simplest and best way to get your data.
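
A minimal sketch of the download/inflate/read route, in Python 3. I have not looked at the layout of the files, so it just yields raw rows and leaves picking columns to you. The UTF-8 encoding is an assumption, the function names are mine, and you have to supply the actual link to one of the zip files yourself.

```python
import csv
import io
import zipfile
from urllib.request import urlopen


def download(url):
    """Return the raw bytes behind url (pass the link to one zip file)."""
    with urlopen(url) as response:
        return response.read()


def rows_from_zipped_csv(zip_bytes, encoding="utf-8"):
    """Yield (member_name, row) for each row of each .csv file in the zip.

    The encoding is a guess; change it if the files use something else.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as raw:
                # csv.reader wants text, so wrap the binary member stream.
                text = io.TextIOWrapper(raw, encoding=encoding, newline="")
                for row in csv.reader(text):
                    yield name, row
```

Usage would be something like

    for name, row in rows_from_zipped_csv(download(zip_url)):
        print(name, row)

where zip_url is one of the links on that page.
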
_______________________________________________
Tutor maillist - [email protected]
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor