[ECCO] Fwd: Fwd: Scientists get their own Google

C. Gershenson Mon, 22 Nov 2004 11:17:12 -0800

Probably some of you already saw this, which Sven Aerts sent to some
of us...


It seems closely related to the database we were speaking about...
Probably the best way into it would be to join efforts (and resources)
with Google. We should write a paper describing what the current search
engine lacks and how to overcome it, and then suggest it to the Google
people...

    Carlos Gershenson...
    Centrum Leo Apostel, Vrije Universiteit Brussel
    Krijgskundestraat 33. B-1160 Brussels, Belgium
    http://homepages.vub.ac.be/~cgershen/

  "Knowledge brings more questions than answers"



This is a forwarded message
From: Ricardo Barbosa <[EMAIL PROTECTED]>
Date: Monday, November 22, 2004, 7:25:22 PM
Subject: Fwd: Scientists get their own Google

===8<==============Original message text===============

    Scientists get their own Google

Declan Butler <http://www.nature.com/news/about/aboutus.html#Butler>

*New search engine ranks papers by importance, and finds the free 
versions.*



Google Scholar searches for scientific articles instead of web pages.

Imagine searching the Internet and being able to restrict your results 
to academic texts. Today Google launched a free search engine that aims 
to do just that. Google Scholar searches only journal articles, theses, 
books, preprints, and technical reports across any area of research.

A test version of the search engine is available at 
http://scholar.google.com, so you can try 
it out. In a search for the phrase "human genome", for example, a normal 
Google web search throws back 450,000 or so hits, with genome centres 
and databases and other websites ranked top.

In contrast, Google Scholar returns just 113,000 hits, and all the 
top-ranked items are not websites but seminal papers on the subject. In 
fact, the number one hit is the landmark article "Initial sequencing and 
analysis of the human genome"^1 
<http://www.nature.com/news/2004/041115/full/041115-13.html#B1> 
published in /Nature/ in 2001.

*On the links*

The tool is based on principles similar to those of Google's web search. 
The original search manages to make the most useful references appear at 
the top of the page using algorithms that exploit the structure of the 
links between web pages. Pages with many links pointing to them are 
considered 'authorities', and ranked highest in search returns.

The ranking is refined by taking into account the importance of the 
origins of links to a paper. "We don't just look at the number of 
links," says Sergey Brin, a cofounder of Google. "A link from the Nature 
home page will be given more weight than a link from my home page," he 
explains.
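
As a rough illustration of the weighting described above, here is a minimal sketch of a link-based ranking in the PageRank style. It is not Google's actual implementation: the function name `rank_pages`, the damping value, the iteration count and the toy page names are all assumptions made for the example. The point it shows is that a link counts for more when the page it comes from is itself well linked.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages by power iteration over a link graph.

    `links` maps each page to the list of pages it links to.
    All names and parameter values here are illustrative assumptions.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally important.
    score = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_score = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split across its outgoing links.
                share = damping * score[page] / len(targets)
                for target in targets:
                    new_score[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * score[page] / n
                for other in pages:
                    new_score[other] += share
        score = new_score
    return score


# Toy graph (hypothetical page names): both papers have exactly one
# incoming link, but one comes from a heavily linked home page.
toy_links = {
    "hub1": ["nature_home"],
    "hub2": ["nature_home"],
    "hub3": ["nature_home"],
    "nature_home": ["paper_a"],
    "personal_home": ["paper_b"],
}
toy_scores = rank_pages(toy_links)
for page in sorted(toy_scores, key=toy_scores.get, reverse=True):
    print(page, round(toy_scores[page], 4))
```

In this toy graph "paper_a" ends up above "paper_b" even though each has a single incoming link, because the page linking to "paper_a" is itself linked from three others.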

Google Scholar works in much the same way, using the citations at the 
end of each paper, rather than web links. It automatically identifies 
the format and content of scientific texts from around the web, extracts 
the references and builds automatic citation analyses for all the papers 
indexed.
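
The citation step can be sketched in the same spirit. The article does not say how Google Scholar parses papers or scores them, so this example assumes the reference lists have already been extracted and only shows the simplest possible analysis: inverting them into a "cited by" index and ordering papers by citation count. The function names and paper identifiers are hypothetical.

```python
from collections import defaultdict


def build_citation_index(references):
    """Invert reference lists into a 'cited by' index.

    `references` maps each paper to the papers listed in its
    reference section (assumed to be extracted already).
    """
    cited_by = defaultdict(set)
    for paper, refs in references.items():
        for ref in refs:
            if ref != paper:  # ignore self-citations
                cited_by[ref].add(paper)
    return cited_by


def rank_by_citations(references):
    """Order papers by how many distinct papers cite them."""
    cited_by = build_citation_index(references)
    papers = set(references) | set(cited_by)
    # Most cited first; ties are broken alphabetically for a stable order.
    return sorted(papers, key=lambda p: (-len(cited_by.get(p, ())), p))


# Hypothetical paper identifiers, for illustration only.
toy_references = {
    "paper_a": ["paper_c"],
    "paper_b": ["paper_c", "paper_a"],
    "paper_c": [],
}
print(rank_by_citations(toy_references))
```

A plain citation count treats every citing paper alike; weighting each citation by the importance of the citing paper, as in the link-based ranking above, would be the natural refinement.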

This approach has been pioneered in computer science by ResearchIndex, 
software produced by the information technology company NEC.

*Search for success*

Much of the peer-reviewed material has been made available to Google by 
publishers, including Nature Publishing Group, the Association for 
Computing Machinery and the Institute of Electrical and Electronics 
Engineers, through a pilot cross-publisher search engine called CrossRef 
Search.

Publishers have arranged for Google robots to scan the full texts of 
their articles. Users clicking on a hit returned by Google Scholar are 
directed to the article on the publisher's site, where subscribers can 
access full text and non-subscribers get an abstract or information on 
how to buy an article.

Google Scholar has a subversive feature, however. Each hit also links to 
all the free versions of the article it has found saved on other sites, 
for example on personal home pages, elsewhere on the Internet.
 

===8<===========End of original message text===========

