Hi Peter,

Assuming the same methodology as in the following paper, the details are in the two quotes below.

Gargouri Y, Hajjem C, Larivière V, Gingras Y, Carr L, et al. (2010) Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research. PLoS ONE 5(10): e13636. doi:10.1371/journal.pone.0013636
Available from: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0013636

Quote:

The full-text OA status of the articles in our sample was verified using an automated webwide search-robot [8] <http://www.plosone.org/article/info:doi/10.1371/journal.pone.0013636#pone.0013636-Hajjem1> as well as an automated Google Scholar search. (Note that any OA articles that our robot missed would reduce any OA Advantage. Hence our estimate of the OA Advantage is conservative.) *Figure 1* <http://www.plosone.org/article/info:doi/10.1371/journal.pone.0013636#pone-0013636-g001> shows each of our four mandated institutions' verified annual OA article deposits as a percentage of the institution's total published article output for each year based (only) on those articles published in the journals indexed by the Thomson-Reuters citation database; the resulting estimate of the overall OA mandate compliance rate is about 60% (for publishing years 2002–2006, with the deposits up to 2009, when the analysis was conducted). Note also the robot data's confirmation of the approximately 15% baseline for spontaneous, self-selected (i.e., non-mandated) OA self-archiving among the control articles in the same journal/years [19] <http://www.plosone.org/article/info:doi/10.1371/journal.pone.0013636#pone.0013636-Bjrk1>.

Ref 8 is to:

Hajjem, Chawki, Harnad, Stevan and Gingras, Yves (2005) Ten-Year Cross-Disciplinary Comparison of the Growth of Open Access and How it Increases Research Citation Impact. *IEEE Data Engineering Bulletin*, 28, (4), 39-47.
Available from: http://eprints.soton.ac.uk/262906/

Quote:

The robot’s search algorithm was the following:
(1) Send request to ISI database for metadata of article (firstauthor name and article title).
(2) Send request (name, title) to: Yahoo, Metacrawler, Vivissimo, Eo, AlltheWeb and Altavista.
(3) Extract external (irrelevant) links.
(4) Remove duplicate URLs.
(5) Sort URLs to process PDF and PS files first (probable full-texts).
(5) Convert files (PDF, PS, Latex, HTML, XML, RTF, and Word) to text.
(6) Parse files to test for full-text of reference article (name/title in first 20% of text, references in last 20%).
(7) If, in parsing HTML file, title found but not full text, extract and follow links in file further as references possibly leading to the full text (to depth of 3 levels).
(8) Sort articles by discipline/journal/issue/year; calculate percent OA articles within each; then by discipline/journal; and finally for each discipline.
(9) Sort articles by discipline/journal/issue/year, calculate citation ratio as (OA - NOA/NOA) within each, then by discipline/journal and finally for each discipline.
(10) Exclude data for all journals that are 100% OA (OA journals) from both the article counts and the citation counts (as we are only doing within-journal comparisons for NOA journals); exclude data from all single issues that are 100% OA (to eliminate denominators).

On Mon, Jul 16, 2012 at 2:20 PM, Peter Murray-Rust <[email protected]> wrote:

> Thanks very much Alma,
> This is very useful - I have some more questions, and would be grateful
> for answers if you can...
>
>> The data are from Yassine Gargouri (who has used the methodology he
>> previously used, which consists of trawling the web for openly accessible
>> full-texts and comparing the number of those with the papers in Web of
>> Science, which is not a perfect, but a reasonable measure of the ‘universe’
>> for UK researchers).
>
> Is this published anywhere (formally or informally) such that we can
> understand the details?
> * How does he or Google know that the full-text is "openly accessible"? Is
> this by trying to read it or is there a Google flag for openly accessible?
>
>> Previously, Yassine has done this only on a global basis, but this time
>> he has looked for papers with at least one UK author.
>
> * How is this done? Does *he* analyze the author affiliations or does he
> get them from WoS?
>
>> * is there an open electronic list of the publications (and their
>> funders) so that I can access them
>>
>> He used Google to search for the papers.
>
> More questions:
> * Google or GoogleScholar? [Apparently they can give very different
> answers]
>
> Assuming it was GoogleScholar.
> * How was the subject classification done?
>
> I can see one method how the "Gold" access papers were retrieved - by
> mapping the Journal onto known Gold journals (sic). (I cannot see how
> hybrid gold were easily measured but the numbers are probably too small to
> worry about statistically)
>
> I cannot see the next phase but I can conjecture. More questions:
> * did he use his/Google results to compare with WoS?
>
> * how did he determine that the paper was Green? Almost by definition this
> has to be somewhere other than the publisher's site. [so the paper needs
> another search for the paper mounted somewhere OTHER than the publisher.]
>
> * does he then have a system to determine whether the paper is readable
> (not all papers in repositories are readable, as we have seen).
>
> If he has such a system then it would seem to answer the key question:
> * if I find a paper on a publisher's site can I find a free-as-in-beer
> copy somewhere else on the web?
>
> If he can really answer that question then is his system openly available?
>
> P.
>
>> _______________________________________________
>> GOAL mailing list
>> [email protected]
>> http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal
>
> --
> Peter Murray-Rust
> Reader in Molecular Informatics
> Unilever Centre, Dep. Of Chemistry
> University of Cambridge
> CB2 1EW, UK
> +44-1223-763069
>
> _______________________________________________
> GOAL mailing list
> [email protected]
> http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal
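
On the question of how the robot decides that a retrieved file is really the full text: step (6) of the algorithm quoted above can be sketched roughly as below. This is only an illustration, not the robot's actual code. The function names, the requirement that both author name and title appear, and the reading of "references in last 20%" as the presence of a "References" or "Bibliography" heading are all assumptions of mine.

```python
import re


def _normalize(s: str) -> str:
    """Lowercase and collapse whitespace so line breaks in converted files do not block a match."""
    return re.sub(r"\s+", " ", s).strip().lower()


def looks_like_full_text(text: str, first_author: str, title: str,
                         fraction: float = 0.2) -> bool:
    """Hypothetical version of step (6): author and title near the start, references near the end.

    `fraction` is the 20% window mentioned in the quoted algorithm.
    """
    norm = _normalize(text)
    if not norm:
        return False
    cut = max(1, int(len(norm) * fraction))
    head = norm[:cut]    # first 20% of the converted text
    tail = norm[-cut:]   # last 20% of the converted text

    # Assumption: both the first author's name and the title must be present.
    has_author = _normalize(first_author) in head
    has_title = _normalize(title) in head
    # Assumption: a reference list is signalled by one of these heading words.
    has_refs = any(marker in tail for marker in ("references", "bibliography"))
    return has_author and has_title and has_refs
```

Under these assumptions, a page that carries only the title and abstract (for example a publisher landing page) fails the test because no reference list appears near the end, which is the kind of case step (7) then handles by following links.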
