Craig Andrews wrote:
So... how can we implement federated, cross OMB search? I have some crazy
ideas, but I'm interested in what others think first.

(Ideally, search would search OMB microblogs, not just StatusNet instances.)
There are three ways this can work that I can think of.

  1. *Peer-to-peer search*, like Gnutella. When you do a global search
     on a site, it asks all the sites it knows about (through
     subscriptions or whatever) for /their/ results for the search
     term. They maybe ask /their/ peers for the search term, out to
     some arbitrary depth. At an outer limit, we could demand that
     /all/ sites respond to the request, either by keeping them in a
     master list somewhere, or some other weird way.

     This would perform terribly: a single search could take minutes
     or hours. It wouldn't get /all/ notices, and it would be hard for
     the site where the search originated to present the results in
     any reasonable way. As far as I'm concerned, distributed
     peer-to-peer search is not a problem we should have to solve.

  2. *Centralized search, pull model*. A centralized search engine
     provides the UI for search (and maybe an API that OMB sites can
     embed into their sites, like Collecta's API). The search engine
     discovers OMB servers on its own and polls the public feed or
     individuals' feeds at regular intervals. It may use SUP to make
     this crawling more efficient; it may even use PuSH or RSSCloud to
     do push-mode subscriptions. We could also support subscriptions to
     the public feed through OMB.

     The great part about this is that we (StatusNet developers) don't
     have to do anything. The bad part is that it's hard to run a
     search engine. You have to have a lot of resources and smarts.

  3. *Centralized search, push model*. We could configure StatusNet to,
     by default, ping or push public notices to one or more centralized
     search engines. For status.net hosted sites, we ping pingomatic,
     Google Blog Search, and weblogs.com by default. We could probably
     add these and others to lib/common.php for pings by default. We
     could do the same thing with the public XMPP feed (identi.ca feeds
     most search engines using XMPP, but very few StatusNet sites are
     XMPP-enabled). Pings are a little inefficient; we could use PuSH,
     RSSCloud, or a custom notification system.

     We could also run a big weblogs.com-style "reflector" for all
     sites. So, every StatusNet site (and other OMB or even non-OMB
     microblogging sites) could ping "reflector.status.net" or
     whatever, and that site could in turn ping lots of other search
     services. We could even host a Free Network Service for search
     (although I would still want to have the reflector service).
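
To make the reflector idea in #3 a bit more concrete, here is a minimal sketch of the fan-out step. Everything in it is an assumption for illustration: the function name `fan_out`, the shape of the ping (a dict with `name` and `url`), and the `send` callback that would do the actual network call. It only shows the relaying logic, where one incoming ping is passed to every downstream search service and a failure at one target does not block the others.

```python
from typing import Callable, Dict, Iterable

# A ping is modelled as a plain dict, e.g. {"name": ..., "url": ...}.
# This shape is hypothetical, not a StatusNet data structure.
Ping = Dict[str, str]


def fan_out(
    ping: Ping,
    targets: Iterable[str],
    send: Callable[[str, Ping], None],
) -> Dict[str, bool]:
    """Relay one incoming ping to every downstream target.

    `send` is a caller-supplied function that delivers the ping to a
    single target URL and raises on failure. Returns a map of
    target URL -> True (delivered) or False (failed).
    """
    results: Dict[str, bool] = {}
    for target in targets:
        try:
            send(target, ping)
            results[target] = True
        except Exception:
            # One unreachable search service should not stop the rest.
            results[target] = False
    return results
```

A real reflector would presumably queue the relays and retry failures rather than calling each target inline, but the shape of the job is the same.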

Of the three, I think #3 makes the most sense. I think we should add
some default ping targets in 0.9.0, and try to get a reflector working
for status.net ASAP.
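
As a rough illustration of what a default ping would carry, here is a sketch that builds the request body, assuming the targets accept the conventional weblogs.com-style `weblogUpdates.ping` XML-RPC call (site name, site URL). The helper name `build_ping` and the example site are made up; this is Python for brevity, not the StatusNet code.

```python
import xmlrpc.client


def build_ping(site_name: str, site_url: str) -> str:
    """Return the XML-RPC request body for a weblogUpdates.ping call.

    Assumes the conventional two-argument form: (site name, site URL).
    """
    return xmlrpc.client.dumps(
        (site_name, site_url),
        methodname="weblogUpdates.ping",
    )


if __name__ == "__main__":
    # Hypothetical site, for illustration only.
    print(build_ping("Example Microblog", "http://example.org/"))
```

The same body could be POSTed to each configured target, or just once to a reflector if we run one.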

-Evan

--
Evan Prodromou
CEO, StatusNet, Inc.
[email protected] - http://identi.ca/evan - +1-514-554-3826
