I'm using the Google AJAX Search API from a Perl script to answer the question: does Google index these particular URLs?
It works fine unless there are query parameters, in which case I hit an escaping problem. For example, the query

site:http://www.bepress.com/jdpa/vol1/iss1/art3

produces exactly one API result, the correct one. But a query with parameters, such as

site:http://www.bepress.com/cgi/viewcontent.cgi?article=1005&context=jdpa

is broken into words at the punctuation and returns multiple irrelevant results. Surrounding the URL in quotes, and similar variations, either returns zero results or makes no difference.

Here are the exact request URLs used and the exact results:

------------------------------------------------------------------------
TITLE: An Administrative Remedy for the Crack Mandatory Sentencing Problem

http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=site%3Ahttp%3A%2F%2Fwww.bepress.com%2Fjdpa%2Fvol1%2Fiss1%2Fart3
1. Journal of Drug Policy Analysis (http://www.bepress.com/jdpa/vol1/iss1/art3/)

http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=site%3Ahttp%3A%2F%2Fwww.bepress.com%2Fcgi%2Fviewcontent.cgi%3Farticle%3D1003%26amp%3Bcontext%3Djdpa
1. Error - The Berkeley Electronic Press (http://www.bepress.com/cgi/viewcontent.cgi?arti-)
2. qY‚ iZj‚ CP $t‚TqY'„I (http://www.bepress.com/cgi/viewcontent.cgi?article=1004&context=giwp)
3. View the article - TOTAL RETURN ECONOMICS (http://www.bepress.com/cgi/viewcontent.cgi?article=1000&context=giwp)
4. International Journal of Nursing Education Scholarship (http://www.bepress.com/cgi/viewcontent.cgi?article=1979&context=ijnes)
------------------------------------------------------------------------

Is there a way to escape the punctuation so that the query goes straight to the binary determination: is this URL in Google or not?
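One detail visible in the second request URL above is the sequence `%26amp%3B`, which is the percent-encoded form of `&amp;` (an HTML-escaped ampersand) rather than of a plain `&`. The sketch below, written in Python rather than the Perl of the original script, shows one way the `q` parameter could be built so that a plain `&` is encoded as `%26`. The function name `build_site_query_url` and the assumption that the input URLs may carry `&amp;` from an HTML source are illustrative only. Whether the API then matches the full URL is not something this sketch can confirm.

```python
from urllib.parse import quote

# Endpoint taken from the request URLs quoted in the post.
API_ENDPOINT = "http://ajax.googleapis.com/ajax/services/search/web"


def build_site_query_url(page_url: str) -> str:
    """Build a request URL for a `site:` query (hypothetical helper).

    Assumption: page_url may have been copied from HTML, so `&amp;` is
    turned back into a plain `&` before percent-encoding.
    """
    plain_url = page_url.replace("&amp;", "&")
    query = "site:" + plain_url
    # safe="" makes quote() encode every reserved character, including
    # "/", ":", "?", "=" and "&", so the whole URL stays inside q=...
    return f"{API_ENDPOINT}?v=1.0&q={quote(query, safe='')}"


if __name__ == "__main__":
    print(build_site_query_url(
        "http://www.bepress.com/cgi/viewcontent.cgi?article=1003&amp;context=jdpa"
    ))
```

With this encoding, the first example from the post yields the same request URL as the one quoted above, while the second no longer contains `%26amp%3B`.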
