They claim to have fewer than 500 servers that contain 10 billion pages

Such statements are not always supported by evidence. As a side effect of another experiment, we compared document-count estimates from Google, Yahoo, Live and Gigablast; the estimates seem to reflect the actual index proportions between these search engines.

It's an internal tech report, so it may be rough around the edges, but even the illustrations should be pretty self-evident:

http://www.cs.put.poznan.pl/dweiss/xml/publications/index.xml?lang=en&highlight=phrasals#phrasals

Here is a direct PDF link:

http://www.cs.put.poznan.pl/dweiss/site/publications/download/2008-weiss-chamielec.pdf

Dawid
