I think it is not only the database that matters here. It is also the clustering technique they use. If you have a distributed array of computers and a fast Internet connection, you can build a huge database in one or two days. The real problem is deciding what the user actually wants when they type in a topic of interest. This boils down to how you cluster your database so that you can select the particular sites the user wants. Designing an algorithm for clustering or grouping large datasets is normally extremely difficult and computationally intensive, because the data is generally multi-dimensional (just imagine computing the derivatives, partial derivatives and nth power of a 1000 x 1000 matrix) and you have to know as much as possible about who the user is (gender, nationality, age, education, job, etc.). This is why Google, as far as I know, requires Beowulf computing just to cluster their database quickly. They use Python, not Perl, though :)
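
To make the clustering idea a bit more concrete, here is a minimal sketch in Python. It is only a toy illustration, not how Google or any real search engine does it: it assumes each page is reduced to word counts over a made-up four-word vocabulary, groups the pages with plain k-means (one common clustering method, picked here only as an example), and then maps a query to the nearest group. The names `kmeans`, `nearest_cluster`, `pages` and `query` are all hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise average of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def kmeans(points, k, iterations=10):
    """Plain k-means, seeded with the first k points so the run is repeatable."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        # Assign every point to its closest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the average of its members (keep it if empty).
        centroids = [
            mean(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def nearest_cluster(query, centroids):
    """Index of the centroid closest to the query vector."""
    return min(range(len(centroids)), key=lambda i: distance(query, centroids[i]))


# Hypothetical vocabulary: ["linux", "kernel", "recipe", "cooking"]
pages = [
    [3, 2, 0, 0],  # a page mostly about linux
    [0, 0, 2, 3],  # a page mostly about cooking
    [2, 3, 0, 0],  # another linux page
    [0, 1, 3, 2],  # another cooking page
]

centroids = kmeans(pages, k=2)
query = [1, 1, 0, 0]  # the user typed "linux kernel"
cluster_id = nearest_cluster(query, centroids)
print("query belongs to cluster", cluster_id)
```

With only four dimensions this runs instantly. The cost discussed above comes from doing the same distance computations over millions of pages and thousands of dimensions, plus whatever is known about the user.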
rowel

On Mon, 29 Apr 2002, fooler wrote:

> as of the moment yes because google has a big cache storage or database of
> entire web on the net... let us see when teoma reaches that :->
>
> fooler.
