Our research is mainly concerned with improving the update interval and
coverage of search engines. We have proposed a new system, a web search
engine based on DNS, intended as a public search engine. In theory, this
system can cover all the pages on the Internet, and its update interval
could be as short as one day. We would welcome any advice.
A brief introduction to our work:
At present, no search engine covers more than 60 percent of the pages on
the Internet. The update interval of most page databases is almost one
month. This situation hasn't changed for many years. Coverage and recency
have become the bottleneck problems of current web search systems. These
bottlenecks mainly result from the centralized architecture of current
systems, which cannot continue to index close to the entire Web as it
grows. So a better solution must adopt a completely different
architecture. Some research, such as SIREN in the IRTF, has considered
extending DNS's navigation function into a web page search function. The
hierarchical distributed architecture of DNS is an efficient architecture
for managing the WWW. Our research group offers a practical system based
on this basic idea. A BOF session on it was held at the 59th IETF meeting
(March 1-5).
This system is a layered search engine. We only need to build a search
engine within the scope of each local network and then integrate them
through metadata harvesting and other technologies. So it may also be a
good topic for Nutch.
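
To make the layered idea a little more concrete, here is a minimal Python
sketch of how local indexes might be integrated along a DNS-like zone
hierarchy. It is only an illustration under our own assumptions, not the
design from the paper: the DomainNode class, its methods, and the example
zones and URLs are all hypothetical, and "metadata harvesting" is reduced
here to passing term lists from child zones up to their parents.

```python
from collections import defaultdict


class DomainNode:
    """One search node per DNS zone (hypothetical structure for illustration)."""

    def __init__(self, zone):
        self.zone = zone                      # e.g. "edu.cn"
        self.children = []                    # nodes for sub-zones
        self.local_index = defaultdict(set)   # term -> URLs indexed locally
        self.harvested = defaultdict(set)     # term -> child zones that hold it

    def add_child(self, child):
        self.children.append(child)
        return child

    def index_page(self, url, text):
        """Index a page in the local network only (naive whitespace tokens)."""
        for term in text.lower().split():
            self.local_index[term].add(url)

    def terms(self):
        """All terms known at or below this node."""
        return set(self.local_index) | set(self.harvested)

    def harvest(self):
        """Pull term metadata up from the children, bottom-up."""
        self.harvested.clear()
        for child in self.children:
            child.harvest()
            for term in child.terms():
                self.harvested[term].add(child.zone)

    def search(self, term):
        """Answer locally, then forward only to child zones known to hold the term."""
        term = term.lower()
        hits = set(self.local_index.get(term, ()))
        for child in self.children:
            if child.zone in self.harvested.get(term, ()):
                hits |= child.search(term)
        return hits


# Assumed example hierarchy: cn -> edu.cn -> example.edu.cn
root = DomainNode("cn")
edu = root.add_child(DomainNode("edu.cn"))
uni = edu.add_child(DomainNode("example.edu.cn"))
uni.index_page("http://www.example.edu.cn/lib", "digital library catalogue")
edu.index_page("http://www.edu.cn/", "education network")
root.harvest()
print(root.search("library"))
```

In this sketch each node crawls and indexes only its own zone, so the
upper layers never need to hold the pages themselves, only the harvested
term metadata that tells them where to forward a query.
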
Our research is part of a digital library project. We are now building an
experimental system on CERNET (China Education and Research Network).
Progress on this research has been very slow in recent years, and very
little academic work has been done on it. So we also need more
collaboration to build a better search engine.
Unfortunately, a better search engine may have to be a public search engine :)
A research paper can be found at:
Web search engine based on DNS, http://arxiv.org/abs/cs.NI/0405099
IIRI agenda of IETF: http://www.ietf.org/ietf/04mar/iiri.txt
Wang Liang