[Nutch-dev] database exchange of 2 nutches (hybridity of nutch with yacy)

thomasasta Mon, 01 Jan 2007 16:03:07 -0800

Hi

There are some quite interesting projects out there:
http://search.wikia.com/wiki/Search_Wikia


I want to suggest another one here.

Nutch is used either to index specific pages for specific customers, or as an 
open source engine for the whole World Wide Web.

*Two* Nutch engines indexing the web separately make no sense.
It would be useful if all Nutch installations that index the web could be 
connected together and perform a database exchange.

You all know www.yacy.net, the p2p search engine. I do not want to suggest 
the same for Nutch, only some interoperability between two Nutch nodes.

Is it possible to add / import the indexed database of Nutch A into Nutch B?

At the moment such an import has to be done manually, but why not over a 
network?

If we have 5 Nutch engines in the world indexing the web (I am not talking 
about customer solutions for partial intranet webs), why not accumulate their 
indexes?
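
To make "accumulating indexes" a bit more concrete, here is a minimal sketch 
in Java of merging the URL records of two engines. Everything in it is an 
assumption for illustration: UrlEntry and IndexExchange are hypothetical names, 
not Nutch or YaCy classes, and the rule "the more recently fetched record wins" 
is just one possible way to resolve duplicates.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical exchange record; NOT the real Nutch or YaCy storage format.
final class UrlEntry {
    final String url;
    final long fetchTimeMillis; // when the page was last fetched
    final double score;         // any ranking value, e.g. a human rating

    UrlEntry(String url, long fetchTimeMillis, double score) {
        this.url = url;
        this.fetchTimeMillis = fetchTimeMillis;
        this.score = score;
    }
}

final class IndexExchange {
    // Returns a new map holding every URL known to engine A or engine B.
    // For a URL known to both, the more recently fetched record is kept
    // (an assumed policy; A's record is kept on a tie).
    static Map<String, UrlEntry> merge(Map<String, UrlEntry> a,
                                       Map<String, UrlEntry> b) {
        Map<String, UrlEntry> merged = new LinkedHashMap<>(a);
        for (UrlEntry incoming : b.values()) {
            merged.merge(incoming.url, incoming, (existing, candidate) ->
                    candidate.fetchTimeMillis > existing.fetchTimeMillis
                            ? candidate
                            : existing);
        }
        return merged;
    }
}
```

Calling IndexExchange.merge(dbA, dbB) leaves both inputs untouched, so each 
engine could keep its own database and only publish the merged view.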

I want to suggest a structure that is a hybrid with yacy.net.

Would it be possible to build a database structure that is usable for YaCy 
as well?


Then the Nutch index could be spread to YaCy nodes as well and get a backup 
there; other Nutch installations could then add the YaCy-indexed media to 
their databases.


So YaCy p2p is the way to exchange and back up the databases of several Nutch 
installations, and Nutch can back up and exchange with YaCy nodes and with 
other Nutch engines.

I therefore think every Nutch should run a YaCy node as well, and the 
databases must be made interoperable.


Would this be possible?

Well, you know the emule-project.net filesharing structure. Or take Gnutella 
with its ultrapeers. The eMule servers support collecting URLs/hashes, and 
eMule also has a p2p node system called Kademlia.


Would such a p2p engine structure be possible, with YaCy as the p2p node and 
Nutch as the ultrapeer, indexing on its own but also backing up its database 
to the p2p YaCy network and getting redundant URLs back from the network?

See also the Search Wikia project at the link above.

As URLs get a human ranking (that is, a page is ranked after it has been 
viewed with the YaCy bar), the Nutch database could also receive these 
human-ranked URLs through the database exchange.

Any idea whether a common database structure is possible, and whether Nutch 
could implement a YaCy node to hold connections to YaCy's DHT network, so that 
Nutch could (also) be a YaCy node? As both are Java, this should work?
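
For readers not familiar with DHTs, here is a rough Java sketch of the general 
idea: every node and every URL is hashed into the same ID space, and a URL is 
handed to the node whose ID is closest. The sketch uses the XOR distance known 
from Kademlia only as an example. It is not YaCy's actual distribution scheme, 
and DhtSketch, id and responsibleNode are hypothetical names.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Illustration only; not taken from the YaCy or Nutch code base.
final class DhtSketch {
    // Hash any string to a non-negative 160-bit ID using SHA-1.
    static BigInteger id(String s) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(s.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // Pick the node whose ID has the smallest XOR distance to the URL's ID.
    // Returns null if the node list is empty.
    static String responsibleNode(String url, List<String> nodeNames) {
        BigInteger key = id(url);
        String best = null;
        BigInteger bestDistance = null;
        for (String node : nodeNames) {
            BigInteger distance = id(node).xor(key);
            if (bestDistance == null || distance.compareTo(bestDistance) < 0) {
                best = node;
                bestDistance = distance;
            }
        }
        return best;
    }
}
```

With such a rule, a Nutch engine and a YaCy peer given the same node list 
would agree on which node should store a given URL without asking a central 
server.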

You are welcome to subscribe to the yacy.net forums as well, to play around 
with the node, the toolbar, and the human ranking, which is already 
implemented but still needs further development.

Thanks for any collaboration ideas.
tom


