Hi, I'm trying to design a database to store web pages for a search engine. Can you give me some advice on this?
I read the Bigtable paper. Google gives Webtable as an example, but it leaves me a little confused. The paper shows how www.cnn.com is stored, but if I have two pages, www.cnn.com/a.html and www.cnn.com/b.html, I don't know whether to store both pages in one row. The paper says "In Webtable, we would use URLs as row keys, various aspects of web pages as column names, and store the contents of the web pages in the contents". To me this reads as if Google uses the domain name as the row key and stores a.html and b.html as column names. But with that layout the anchor design seems impossible: how can users tell which page, a.html or b.html, an anchor text refers to?
Luo Lei
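P.S. To make the question concrete, here is a rough sketch of the two layouts I am trying to choose between. It is only a plain Python dictionary standing in for a table (row key -> column name -> value), not a real Bigtable API. The referring site www.example.org, the column names, and the helper anchors_for are all made up for illustration.

```python
# Hypothetical in-memory stand-in for a Bigtable-style table:
# row key -> column name -> value. Nothing here is a real Bigtable call.

# Option A: one row per full URL.
per_url = {
    "www.cnn.com/a.html": {
        "contents:": "<html>page a</html>",
        "anchor:www.example.org": "link text pointing at a.html",
    },
    "www.cnn.com/b.html": {
        "contents:": "<html>page b</html>",
        "anchor:www.example.org": "link text pointing at b.html",
    },
}

# Option B: one row per domain, with page paths folded into column names.
per_domain = {
    "www.cnn.com": {
        "contents:/a.html": "<html>page a</html>",
        "contents:/b.html": "<html>page b</html>",
        # Only one anchor cell per referring site, so the target page is unclear.
        "anchor:www.example.org": "link text, but for which page?",
    },
}


def anchors_for(table, row_key):
    """Return {referring site: anchor text} for the given row."""
    prefix = "anchor:"
    return {
        column[len(prefix):]: text
        for column, text in table[row_key].items()
        if column.startswith(prefix)
    }


if __name__ == "__main__":
    print(anchors_for(per_url, "www.cnn.com/a.html"))
    print(anchors_for(per_domain, "www.cnn.com"))
```

In option A each page has its own anchor columns, so the row key alone says which page an anchor points to. In option B the row only identifies the domain, which is exactly where I get stuck.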
