Hi Andrzej,
This was a very interesting experiment -- thanks for sharing the results
with us.
The last range was the maximum in this case - Google wouldn't display
any hit above 652 (which I find curious, too - because the total number
of hits is, well, significantly higher - and Google claims to return up
to the first 1000 results).
I believe this may have something to do with the way Google compacts
URLs. My guess is that initially 1000 results are found and ranked.
Then pruning is performed on that set, leaving just a subset of results
for the user to select from.
If you try this self-indulgent query (with filtering enabled):
http://www.google.com/search?as_q=dawid+weiss&num=10&hl=en&as_qdr=all&as_occt=any&as_dt=i&safe=active&start=900
You get: "Results 781 - 782 of about 61,700"
Now try disabling filtering:
http://www.google.com/search?as_q=dawid+weiss&num=10&hl=en&as_qdr=all&as_occt=any&as_dt=i&safe=images&start=900
Then you get: "Results 781 - 782 of about 65,500"
Hmmm... still the same number of available results, but the total
estimate is higher.
So far I have used the URL parameters found on the "advanced" search
page. Next I tried to "display the omitted search results", as Google
suggested.
Interestingly, this led to:
http://www.google.com/search?q=dawid+weiss&hl=en&shb=t&filter=0&start=900
"Results 541 - 549 of about 65,400"
And that's the maximum you can get.
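
For anyone who wants to see exactly how the three requests above differ,
here is a minimal sketch in Python (standard library only). It only
parses the URLs quoted in this message and does not contact Google; the
helper names query_params and param_diff are made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# The three URLs quoted in this message.
FILTERED = ("http://www.google.com/search?as_q=dawid+weiss&num=10&hl=en"
            "&as_qdr=all&as_occt=any&as_dt=i&safe=active&start=900")
UNFILTERED = ("http://www.google.com/search?as_q=dawid+weiss&num=10&hl=en"
              "&as_qdr=all&as_occt=any&as_dt=i&safe=images&start=900")
OMITTED = ("http://www.google.com/search?q=dawid+weiss&hl=en&shb=t"
           "&filter=0&start=900")


def query_params(url):
    """Return the query string of `url` as a {name: value} dict."""
    return {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}


def param_diff(url_a, url_b):
    """Return {name: (value_in_a, value_in_b)} for parameters that differ.

    A parameter missing from one URL is reported as None on that side.
    """
    a, b = query_params(url_a), query_params(url_b)
    return {k: (a.get(k), b.get(k))
            for k in sorted(a.keys() | b.keys())
            if a.get(k) != b.get(k)}


if __name__ == "__main__":
    print(param_diff(FILTERED, UNFILTERED))
    print(param_diff(FILTERED, OMITTED))
```

The first two URLs differ only in the "safe" parameter, while the third
drops the advanced-search parameters and adds "shb" and "filter".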
Sorry, my initial intuition proved wrong -- there is no clear logic
behind the maximum limit of results you can see (unless you can find
some logic in the fact that I can see _more_ results when I _exclude_
repeated ones from the total).
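
To put the three observations side by side, here is a small sketch in
Python that parses the "Results X - Y of about Z" lines quoted above.
The regular expression and the parse_header name are my own assumptions
about the header format, based only on the three strings in this
message.

```python
import re

# Matches headers such as "Results 781 - 782 of about 61,700".
HEADER = re.compile(r"Results\s+([\d,]+)\s*-\s*([\d,]+)\s+of about\s+([\d,]+)")


def parse_header(text):
    """Return (first, last, estimate) as integers from a result header."""
    m = HEADER.search(text)
    if m is None:
        raise ValueError(f"unrecognized header: {text!r}")
    first, last, estimate = (int(g.replace(",", "")) for g in m.groups())
    return first, last, estimate


# Headers as quoted in this message, keyed by the parameter that changed.
OBSERVATIONS = {
    "safe=active": "Results 781 - 782 of about 61,700",
    "safe=images": "Results 781 - 782 of about 65,500",
    "filter=0": "Results 541 - 549 of about 65,400",
}

if __name__ == "__main__":
    for label, text in OBSERVATIONS.items():
        first, last, estimate = parse_header(text)
        print(f"{label:12} last visible hit: {last:4d}  estimate: {estimate}")
```

The "filter=0" request reports the lowest last visible hit of the three,
even though its estimate is close to the unfiltered one.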
Dawid
_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers