Setting the user agent did the trick, at least in my case.

(ns google-search
  (:import [java.net URL URLEncoder]))

(def google-search-url "http://www.google.com/search?q=")

;; A common desktop browser user-agent string.
(def user-agent "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.22 (KHTML, like Gecko) Chrome/25.0.1364.172")

;; Open the connection and set the User-Agent header before any data is read.
(defn open-connection [url]
  (doto (.openConnection url)
    (.setRequestProperty "User-Agent" user-agent)))

;; Read the response one byte at a time into a string.
;; Each byte is treated as one character, which is fine for ASCII content.
(defn get-response [url]
  (let [conn (open-connection url)
        in (.getInputStream conn)
        sb (StringBuilder.)]
    (loop [c (.read in)]
      (if (neg? c)
        (str sb)
        (do (.append sb (char c))
            (recur (.read in)))))))

;; The one-argument URLEncoder/encode is deprecated, so pass the charset explicitly.
(defn search [query]
  (let [url (URL. (str google-search-url (URLEncoder/encode query "UTF-8")))]
    (get-response url)))

(spit "response.html" (search "URLEncoder java 7"))
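In case it is useful, here is a slightly more defensive sketch of the same idea. It checks the HTTP status before reading the body, so a 403 shows up as data rather than as an IOException from getInputStream. The names build-search-url and fetch-body are hypothetical helpers of my own, and I am assuming the response body is UTF-8 and that anything other than a 200 should be treated as a failure (redirects included).

```clojure
(ns google-search-sketch
  (:import [java.net HttpURLConnection URL URLEncoder]))

(def base-url "http://www.google.com/search?q=")

(def user-agent
  "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.22 (KHTML, like Gecko) Chrome/25.0.1364.172")

(defn build-search-url
  "Hypothetical helper: form-encodes the query as UTF-8 and appends it
  to the base URL."
  [query]
  (str base-url (URLEncoder/encode ^String query "UTF-8")))

(defn fetch-body
  "Hypothetical helper: returns the response body on HTTP 200, otherwise
  throws ex-info carrying the status code. Assumes a UTF-8 body."
  [^String url]
  (let [^HttpURLConnection conn (.openConnection (URL. url))]
    ;; The header must be set before the request is actually sent.
    (.setRequestProperty conn "User-Agent" user-agent)
    (let [status (.getResponseCode conn)]
      (if (= 200 status)
        ;; slurp closes the reader (and the underlying stream) when done.
        (slurp (.getInputStream conn) :encoding "UTF-8")
        (throw (ex-info "Unexpected HTTP status"
                        {:status status :url url}))))))
```

Usage would be something like (spit "response.html" (fetch-body (build-search-url "URLEncoder java 7"))); on a refusal, (:status (ex-data e)) in a catch block gives the code.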
HIH,
Juan

On Friday, March 22, 2013 4:32:33 AM UTC-3, Cedric Greevey wrote:
>
> Change your code so it spoofs a common browser user-agent, change your
> DHCP-assigned IP address, and try again. They're probably trying to
> obstruct bots from making overwhelming numbers of requests or something. As
> long as you don't flood them with requests at a higher rate than a human
> would generate by clicking, I don't see any ethical issue with
> circumventing their countermeasures, especially not if the search will be
> triggered by a user input to your application anyway.
>
>
> On Fri, Mar 22, 2013 at 3:09 AM, Rich Morin <r...@cfcl.com> wrote:
>
>> I've been successfully using slurp and laser to harvest and pull
>> apart some web pages. However, I can't figure out how to use
>> Google Search from my code.
>>
>> My first thought was to use the Google Search API, but after
>> a lot of frustration in trying to get and use an API key, I
>> gave up on that.
>>
>> My next thought was to slurp in a page from the interactive
>> Google Search facility, using the URL from Advanced Search:
>>
>>   "http://www.google.com/search?hl=en&as_q=..."
>>
>> However, this gives me a 403 nastygram:
>>
>>   IOException Server returned HTTP response code: 403 for URL:
>>   https://www.google.com/search?hl=en&as_q=&as_epq=...
>>   sun.net.www.protocol.http.HttpURLConnection.getInputStream
>>   (HttpURLConnection.java:1436)
>>
>> Has anyone here, by chance, been able to do this sort of thing?
>>
>> -r
>>
>> --
>> http://www.cfcl.com/rdm           Rich Morin
>> http://www.cfcl.com/rdm/resume    r...@cfcl.com
>> http://www.cfcl.com/rdm/weblog    +1 650-873-7841
>>
>> Software system design, development, and documentation
>

--
--
You received this message because you are subscribed to the Google Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.