Great, thanks Rob. I had already tried site:dspace.mit.edu/handle, which is basically the same as "inurl:handle", but show=full can indeed make the difference between community/collection pages and item pages.
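
For anyone who wants to repeat this across several repositories, the two queries from Rob's reply can be put together with a small script. This is only a rough sketch: the function names are made up for illustration, the Google search URL is an assumption about where the query would be pasted, and the hit counts Google reports remain ballpark figures.

```python
from urllib.parse import urlencode


def item_query(host: str) -> str:
    """Query that narrows results down to the "full item record" pages."""
    return f"site:{host} inurl:handle inurl:show=full"


def bitstream_query(host: str) -> str:
    """Query that narrows results down to bitstream (file download) URLs."""
    return f"site:{host} inurl:bitstream"


def google_search_url(query: str) -> str:
    """Build a search URL for the query (assumed endpoint, for manual use in a browser)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # dspace.mit.edu is the example host used in this thread.
    for query in (item_query("dspace.mit.edu"), bitstream_query("dspace.mit.edu")):
        print(query)
        print(google_search_url(query))
```

The script only builds the query strings and URLs; reading the result counts is still a manual step in the browser.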

best regards,

Bram

@mire NV
Romeinse Straat 18
3001 Heverlee
Belgium
+32 2 888 29 56

http://www.atmire.com - Institutional Repository Solutions
http://www.togather.eu - Before getting together, get t...@ther

On Thu, Feb 19, 2009 at 6:04 PM, Robert Tansley <[email protected]> wrote:

> You won't get entirely accurate numbers but you can get ballpark figures
> with e.g.
>
> site:dspace.mit.edu inurl:handle inurl:show=full
>
> Basically this narrows things down to the "full item record" pages. Looks
> like there may be dups in there -- you could try some additional conditions.
>
> For the number of bitstreams:
>
> site:dspace.mit.edu inurl:bitstream
>
> Hope this helps
>
> Rob
>
>
> On Thu, Feb 19, 2009 at 05:47, Bram Luyten <[email protected]> wrote:
>
>> Hi Rob,
>>
>> I had a question somewhat related to robots.txt and the way DSpace
>> instances are being indexed by Google.
>>
>> As a part of the Google Analytics - DSpace comparison that I've been
>> running, I would like to analyse which repositories are being indexed best
>> by Google, and how that impacts their number of visits.
>>
>> As a first, very rough estimate, I searched for:
>>
>> "site:<<repository url>>" to get an indication of how many useful pages
>> were indexed. It was interesting to see that these numbers did not really
>> correlate with visits to the repository.
>> I assumed that for many repositories, different browse pages were being
>> indexed, and that these indexed pages were not very useful to generate
>> visits or expose the content.
>>
>> In a second step, I tried to look for "site:<<repository url>> -browse".
>> The returned numbers were in some cases even less than half of the original
>> number.
>> But I realise this search is too restrictive: because many pages
>> include the word "browse" in their navigation bar, I'm probably excluding
>> useful item pages etc. from the search.
>>
>> So my question is the following:
>> which search query could I use in Google to get the number of useful
>> indexed pages (item pages, bitstreams, collection & community
>> pages)?
>>
>> Already an interesting finding from my research:
>> the 15 repositories already included in the research get 60% of their
>> visits through search engines (average calculated on the visits in December
>> 2008). So even more reason to make exposure through search engines as
>> optimized as possible.
>>
>> best regards,
>>
>> Bram
>>
>>
>> On Thu, Feb 5, 2009 at 10:21 PM, Robert Tansley <[email protected]>
>> wrote:
>>
>>> To all users of DSpace 1.5 and DSpace 1.5.1:
>>> These versions of DSpace ship with a bad robots.txt file that prevents
>>> search engines such as Google Scholar or Yahoo from indexing any content on
>>> a DSpace site. To check if this applies to you:
>>> - Visit your site's robots.txt --
>>> http://your_dspace_hostname.edu/robots.txt
>>> - If you see the following line you have a bad robots.txt:
>>>
>>> Disallow: /browse
>>>
>>> It is important that you REMOVE this line from your robots.txt to ensure
>>> that your DSpace instance is correctly indexed by search engines. More info
>>> on ensuring your DSpace site is correctly indexed here:
>>>
>>> http://wiki.dspace.org/index.php?title=Ensuring_your_instance_is_indexed
>>>
>>> Robert Tansley / Google
>>>
>>> _______________________________________________
>>> DSpace-tech mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/dspace-tech
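
The robots.txt check from the original announcement quoted above can also be scripted. The following is a minimal sketch, assuming the file contents have already been downloaded into a string; the function names are hypothetical and it matches only the exact rule "Disallow: /browse" named in the announcement, not variants such as "/browse/".

```python
def _is_browse_disallow(line: str) -> bool:
    """True if the line is the 'Disallow: /browse' rule (comments and case ignored)."""
    rule = line.split("#", 1)[0].strip()  # drop any trailing comment
    if ":" not in rule:
        return False
    field, value = rule.split(":", 1)
    return field.strip().lower() == "disallow" and value.strip() == "/browse"


def has_browse_disallow(robots_txt: str) -> bool:
    """Report whether the robots.txt text contains the bad rule."""
    return any(_is_browse_disallow(line) for line in robots_txt.splitlines())


def remove_browse_disallow(robots_txt: str) -> str:
    """Return the robots.txt text with the bad rule removed, other lines untouched."""
    kept = [
        line
        for line in robots_txt.splitlines(keepends=True)
        if not _is_browse_disallow(line)
    ]
    return "".join(kept)


if __name__ == "__main__":
    # Made-up sample file for illustration, not the file shipped with DSpace.
    sample = "User-agent: *\nDisallow: /browse\nDisallow: /search\n"
    print(has_browse_disallow(sample))
    print(remove_browse_disallow(sample))
```

The cleaned text would still need to be written back to the robots.txt file served by the DSpace instance by hand.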
_______________________________________________ Dspace-general mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/dspace-general
