Andrew Lentvorski([EMAIL PROTECTED])@Fri, Sep 14, 2007 at 01:05:27AM -0700:
> Bob La Quey wrote:
> >OK. So create a better search engine.
>
> Non-sequitur.
>
> The problem is not creating a better search engine. Those dead
> pages will still exist. The pages with nothing but Google-dreck
> will still exist. Even your mythical "better" engine will have
> to deal with that.

I don't see the problem as being /caused/ by Google. It is caused by
people trying to manipulate Google's search service. That isn't to
say that Google didn't unintentionally create an environment that
encourages that behavior, but a better search engine might be able to
compensate. In fact, it seems to me that this is exactly how Google
came to prominence. Everyone knew how to manipulate AltaVista, Yahoo,
etc., but Google was useful because it hadn't been figured out yet.
Taking that point of view, it may just be necessary for a new
"mythical, better" search engine to come around every few years. It
certainly couldn't hurt if those new tools were able to compensate
for the abundance of garbage and provide more useful results.

> The solution (for the end user) is for Yahoo, Microsoft, and
> Google to all split the search engine about equally. Then,
> specifically gaming one engine will be likely to drop your
> ranking on the other two. Suddenly, Yahoo can exploit the
> Google-dreck to clean up their listings and the Google-dreck will
> go away.

I understand your reasoning, but it isn't realistic. The end users
would have to use multiple search engines at the same time that the
search providers are trying to /distinguish/ themselves from the
other companies. At some point, one *will* be more useful than the
others. Expecting the worldwide user base to constantly change search
engines in order to manipulate the content providers and/or the
search services is far removed from real possibility.

> What I really want from a search engine is the "Junk" button from
> Thunderbird which is used to help train your spam filter. When I
> run a search, I want to be able to classify sites as "Junk" so
> that they start dropping in Googlerank for me.

Bayesian filtering of search results -- now this idea sounds useful.
It puts power in the users' hands, and would make it /much/ more
difficult for any content provider to even /guess/ how search results
are ranked for any given user, regardless of the choice of search
engine.

In the past, I was aware of a search provider called dogpile.com.
Their strategy was to query several of the popular search engines and
provide a certain number of results from each of them. It sounded
useful, but the presentation of the data left something to be
desired. Let's say, though, that a user could control an aggregate
search, and then a Bayesian-type filter could re-rank and collate the
results.
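To make the aggregate-plus-filter idea a little more concrete, here is a
minimal sketch, assuming a plain naive Bayes classifier over the words in
each result's URL and snippet. Everything in it is hypothetical: the names
JunkFilter and rerank, the "junk"/"good" labels, and the example.* URLs are
made up for illustration, and it does not reproduce how Thunderbird,
dogpile.com, or any real search engine works.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class JunkFilter:
    """Hypothetical per-user naive Bayes filter trained by "Junk" clicks."""

    LABELS = ("junk", "good")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        """Record one result the user marked as "junk" or "good"."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def junk_probability(self, text):
        """Estimate P(junk | text) with add-one (Laplace) smoothing."""
        vocab = set(self.word_counts["junk"]) | set(self.word_counts["good"])
        total_docs = sum(self.doc_counts.values())
        log_score = {}
        for label in self.LABELS:
            # Smoothed prior, then smoothed per-word likelihoods, in log space.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(-700.0, min(700.0, log_score["good"] - log_score["junk"]))
        return 1.0 / (1.0 + math.exp(diff))


def rerank(results_by_engine, junk_filter):
    """Collate (url, snippet) results from several engines, drop duplicate
    URLs, and return the URLs sorted from least to most junk-like."""
    snippets = {}
    for results in results_by_engine.values():
        for url, snippet in results:
            snippets.setdefault(url, snippet)
    return sorted(
        snippets,
        key=lambda url: junk_filter.junk_probability(url + " " + snippets[url]),
    )
```

A user would call train() each time they press "Junk" (or accept a result),
and rerank() on the combined output of whichever engines they query. Because
the counts live with the user, a content provider can't see what any given
user has trained the filter on.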
Of course, now we're back to creating that "mythical, better" tool :-)

> Even better would be to be able to download and then share this
> with somebody else. For example, I'm pretty sure that I would be
> very happy to add Stewart's "Google Junk Corpus" to my own.

I don't know. On the surface this sounds good. In practice, I suspect
that people will get better results by sticking with their own data.
One concern is that, if one corpus of data became popular, the
content providers might find it useful to manipulate the results at
that level. I might even balk at having a default corpus for new
installations of such a tool. In the end, though, the results for
those who maintain their own filters would still be much improved.

> >Meanwhile we muddle along with less than perfect tools, blaming
> >them if we must, but rarely throwing them away.
>
> I don't think I agree with that.
>
> I would argue that computer folks are a little too eager to throw
> things completely away when a little more polish is actually
> called for.

The problem isn't the tools, their quality, or their polish. The
problem is human behavior. It is generally believed among computer
folks that tools can help avoid some of the results of that bad
behavior. The more "mythical" the solution sounds, the more
interesting it is, and maybe the results are, too. It's difficult to
stay on top of changes in human behavior, and to examine new
approaches, by only polishing the old tools.

Wade Curry
syntaxman

--
[email protected]
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-list
