On Dec 13, 3:10 am, "Alec Bennett" <[email protected]> wrote:
> Interesting. Do you know if it's limited to 64 results?
I believe there is a limit. This is to prevent scraping and misuse of
the API.
>
> FYI, here's how I went about it. It works, but is restricted to the first 64
> results. Note that all I needed was the URLs of the search results. And
> there's lots of duct tape here. To neaten this up, research JSON.
Take a look at the simplejson module:
http://code.google.com/p/simplejson/
Note that you can extract the unescaped URL, the escaped URL, and the
visible (base) URL from the JSON results:
responseData.results[i].unescapedUrl
responseData.results[i].url
responseData.results[i].visibleUrl (base url)
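
For instance, here is a minimal sketch of reading those fields with a JSON
parser rather than exec. It uses the standard-library json module
(Python 2.6+); simplejson offers the same loads() call. The sample payload
and the extract_urls name are made up for illustration, and only the
responseData/results field names come from the list above.

```python
import json

# Hypothetical sample response, for illustration only. The field names under
# responseData.results are the ones listed above; the values are invented.
SAMPLE = '''
{"responseData": {"results": [
  {"unescapedUrl": "http://www.example.com/page?id=1",
   "url": "http://www.example.com/page%3Fid%3D1",
   "visibleUrl": "www.example.com"}
]}, "responseDetails": null}
'''

def extract_urls(data_string):
    """Return the unescaped URL of every result in a JSON response string."""
    data = json.loads(data_string)  # JSON null becomes None, no string replace needed
    response = data.get("responseData") or {}  # tolerate a null responseData
    return [result["unescapedUrl"] for result in response.get("results", [])]

print(extract_urls(SAMPLE))
```

Reading unescapedUrl directly should make the split("%") cleanup
unnecessary, assuming the API returns that field for every result.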
>
> def query_google(query, start):
>
>     import urllib2, urllib
>
>     #http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=Paris%20H...
>
>     # urlencode the query
>     query = urllib.quote(query)
>
>     url = ('http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=' +
>            query + '&start=' + str(start) + '&rsz=large')
>
>     try:
>         req = urllib2.Request(url)
>         opener = urllib2.build_opener()
>         data_string = opener.open(req).read()
>
>     except urllib2.URLError:
>         print "------ Error opening " + url + "..... Timed out?"
>         return None
>
>     # Should use json to parse the results, but instead we're converting the
>     # string to dictionary. Duct tape.
>
>     # replace the "null" with "None" for Python
>     data_string = data_string.replace(": null,", ": None,")
>
>     # convert the string to a dictionary
>     exec("data = " + data_string)
>
>     # simplify the results a bit
>     results = data["responseData"]["results"]
>
>     # build list of urls
>     urls = []
>
>     for i in results:
>         url = i["url"]
>         # get rid of some garbage from the url. Probably avoidable by using Json.
>         url = url.split("%")[0]
>         urls.append(url)
>
>     return urls
>
> results = query_google("whatever", start=1)
> print results
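
As a side note on building the request URL, here is a rough sketch that
lets urlencode escape every parameter instead of concatenating strings by
hand. The build_search_url helper and BASE_URL constant are names I made
up; the endpoint and the v, q, start and rsz parameters are the ones from
your code. It only builds the URL and does not send a request.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode        # Python 2

# Endpoint taken from the quoted code above.
BASE_URL = "http://ajax.googleapis.com/ajax/services/search/web"

def build_search_url(query, start=0):
    """Build the request URL (hypothetical helper); urlencode escapes each value."""
    params = [("v", "1.0"), ("q", query), ("start", start), ("rsz", "large")]
    return BASE_URL + "?" + urlencode(params)

print(build_search_url("hello world", start=8))
```

Note that urlencode writes spaces as "+" rather than "%20", which is the
usual form for query strings.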