There are a couple of problems that no one has mentioned yet. For one thing, no one knows how Google's search engine works except people who work at Google and are under NDA. (Heck, I had to sign an NDA to have lunch at Google. Of course, I had to sign an NDA to go to the ColdFusion birthday party too.) There aren't any CFCs that work like Google because it's not possible to build them without information that would get you sued into oblivion. Also, the search algorithm changes frequently: when people figure out how to cheat their way to higher rankings, the algorithm changes to eliminate the cheats. Not only is it a task that would require PhD mathematicians to build, but it would also require constant maintenance by said mathematicians.
Getting your site ranked on Google is such a hot topic that people write books about it, and the better websites that discuss ranking techniques tend to make you pay to view the user-submitted threads. We host websites in our crawlspace to improve their Google rankings, and the people we host for basically make a living by getting their sites ranked on Google. Even then, all Google ranking techniques are merely theory because Google doesn't release information about the algorithm and makes frequent changes to it.
That said, even if you *could* store all the data in terabytes of RAM (which is *highly* questionable), databases exist for a reason. They search data, and they search it very, very well. It is highly unlikely that you would be able to create an algorithm that could search data as fast as, for instance, Oracle can. Again, PhD mathematicians work on search algorithms for databases. It basically IS rocket science.
Everyone is so amused because you can't start with Google and improve upon it (for legal reasons), and it would take such enormous hardware and expert personnel that it would cost literally millions of dollars. It's fine to dream big, but Microsoft has already tried to beat Google and hasn't succeeded. I guarantee you that they have more expendable capital for such experiments than your company. I bet you a dollar they do.
On 12/22/05, Jason Parkils <[EMAIL PROTECTED]> wrote:
>
> I have already identified a potential competitive advantage over google.
> Currently they store all their site data in custom databases. As everyone
> knows, database access is notoriously slow. So if the metadata was moved
> into ram (the "application" scope) onApplicationStart, you would be able to
> perform deeper search functions in the same amount of time - getting you
> better results. Nowadays, you could install several terabytes of ram on each
> machine (64-bit computing) - so space shouldn't be an issue.
> The only thing is that onApplicationStart will take several hours to load
> the data into ram - but the servers will be clustered so that should only
> happen once.
>
> So no-one knows of any good CFCs to do google-style searching? The
> important thing is that they are open source so that I can improve upon
> them. Are CFCs even the best way to go? I don't mind doing it through TAGs
> or anything else. I was just told that CFCs were the best.
>
> Also, is it worth it to get CF Enterprise edition for this or is Standard
> ok?
>
> Jason Parkils
>
> ----------------------------------------------------------
> You are subscribed to cfcdev. To unsubscribe, send an email to
> [email protected] with the words 'unsubscribe cfcdev' as the subject of the
> email.
>
> CFCDev is run by CFCZone (www.cfczone.org) and supported by CFXHosting
> (www.cfxhosting.com).
>
> An archive of the CFCDev list is available at
> www.mail-archive.com/[email protected]

--
It's my birthday! Buy me a present: http://www.blivit.org/mr_urc/birthday.cfm
Now blogging....
http://www.blivit.org/blog/index.cfm
http://www.blivit.org/mr_urc/index.cfm
