There is actually a project called open crawl, or something like that, which is a publicly accessible web crawler. Someone wrote an article on how to convert its DB dump into a real search engine using Hadoop or something similar from Amazon. I'll try to pick through my links and find it when I get home.

On Apr 27, 2012 12:08 PM, "Daniel Fussell" <[email protected]> wrote:
> On 04/27/2012 10:54 AM, Michael Torrie wrote:
> > On 04/27/2012 09:27 AM, Lonnie Olson wrote:
> >> On Fri, Apr 27, 2012 at 9:17 AM, Michael Torrie<[email protected]> wrote:
> >>> I'm trying to find our last discussion on domain name registrars, but I
> >>> can't find a good way to search this list archives. Have I missed
> >>> anything on plug.org's site, or maybe my google fu is failing me?
> >> Just use Google.
> >>
> >> "site:plug.org rhel"
> > Are you suggesting "rhel" is a keyword that will return name registrar
> > threads? Cause I already tried "site:plug.org domain name registrars"
> > and got nothing relevant. I got a 2006 thread as the second hit. Not
> > terribly relevant anymore. When I tried to restrict the hits to last
> > year, google turns up nothing. I should try bing as Google is sucking
> > more and more at returning relevant links. Google used to be great for
> > linux results,
>
> I yearn for the days when my search page had only a simple text field,
> an image, and 2 search buttons; and when the result list was just as
> simple and helpful. Then they started sticking their fingers in
> everyone else's pie, tasting each one repeatedly. Now a 500MHz ARM
> processor isn't enough to render the simplified mobile search page in
> under 60 seconds, let alone the results. I'm beginning to think wading
> through the unsorted results from AOLs' original Webcrawler would be
> faster and easier. Or even surfing semi-random links directly.
>
> Yes, Google is now the Walmart of the Internets, and has gone down the
> series-of-tubes. I'm half tempted to start an open-source, distributed
> search engine akin to SETI@home.
>
> ;-Daniel Fussell
>
> /*
> PLUG: http://plug.org, #utah on irc.freenode.net
> Unsubscribe: http://plug.org/mailman/options/plug
> Don't fear the penguin.
> */
