[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
Heather,

Thank you for this deep analysis. I don't feel like an expert on licensing issues, so I will let others comment, but every new idea on how to fund academic services like Paperity in general is more than welcome. The individual who finally discovers a satisfactory solution should get a Nobel Prize at the very least.

Best
Marcin

On 10/12/2014 10:22 PM, Heather Morrison wrote:

Thank you for providing the information, Marcin.

Since there is a subset of the open access community that demands blanket permissions for commercial rights downstream (a position I strongly disagree with), it is important to discuss what the potential commercial uses might be to determine whether these actually advance open access or scholarly knowledge or not.

Some comments on these options for Paperity:

In the subscriptions model, aggregators (such as EBSCO and ProQuest) typically pay journals to include their content, or in the case of open access journals, at least do not charge the journals. Charging journals to include them in an aggregated service changes a revenue stream to an expense stream for the journals. This makes it harder to find the revenue to produce journals; a barrier to publishing journals in the first place is not in the interests of advancing scholarly knowledge.

Advertising is one of the potential revenue streams for open access journals (and one that some journals are currently using). If Paperity is using journal content to sell advertising, then Paperity could easily be competing with the journals for this revenue.

It is lovely to hear of Paperity's good intentions starting out to be fair, efficient and acceptable for everyone. But what can happen with services like this down the road when there are bills to be paid, journals are less than keen to pay for this service and advertisers continue to prefer Google?

The following is addressed to my fellow open access advocates, as this is a good discussion about open access downstream, and these comments are not intended to apply to Paperity:

If the purpose of insisting on re-use and commercial rights downstream is to facilitate the design of services such as Paperity, let's discuss these possibilities downstream that I argue are facilitated by CC-BY and/or CC-BY-SA licenses:

- aggregator takes CC-BY content and develops a toll-access value-added service. By way of illustration of this: Elsevier's Scopus claims to include 2,800 gold open access journals. Scopus is a subscription-based service.
- aggregator takes CC-BY content, initially develops an open access value-added search service, then sells the service to a for-profit company that changes the business model to toll access. By way of illustration of the sales aspect, consider that Elsevier bought Mendeley and Springer bought BioMedCentral. Both are still free services, but offered by largely subscription-based companies; why would we assume that they would never change the business model?
- aggregator follows the Paperity suggestion of charging journals, but with a twist: does not include journals that do not pay and/or returns results based on payments by journals (i.e. pay-to-play)

Are these models seen as desirable by advocates of requiring CC-BY and/or CC-BY-SA licenses? Are any of these scenarios aligned with the Budapest vision? If you agree that they are not, can you explain why you think these are unlikely or how the licenses would prevent this from happening? For example, perhaps someone can explain how it is that Elsevier is able to charge to direct people to OA journals through Scopus?

A comment on SA: although Sharealike is the most copyleft of the CC license elements, it does not come with an obligation to share in the same way, rather an obligation to use the same license when including re-used content. One can take a work that is licensed SA and is freely available on the web and include it in a work that is limited in any of a variety of fashions (part of a presentation to an audience limited to those who are willing and able to pay to attend; a toll access work, etc.), as long as the work downstream uses the same license. In other words, CC-BY-SA does not do as much to protect OA downstream as one might think.

best,
Heather Morrison

On 2014-10-12, at 3:20 PM, Marcin Wojnarski wrote:

Hi Serge,

We're working on this. Paperity started as a non-profit academic project, but yes, we need to develop a business model to make it sustainable and to achieve the goal of 100% OA aggregated. Most likely we'll expect participating journals to support our services, which we think is a fair solution when many of them charge APCs and we actually help them do their job (dissemination). We're aware however that there are also many small non-profit journals which don't charge APC at all, and we definitely want to aggregate them all, too. So the
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
Dear Stevan,

We started with Gold, because we believe that journals play a fundamental role in the system of scholarly communication and every service that tries to facilitate access to literature must start with journals, not only with a flat collection of papers like the one found in repositories. For 400 years, journals have been the backbone of the system, the main structural element. They provide a brand name for papers, create consistent editorial policy and take responsibility for the quality and relevance of articles they publish - these features are of topmost importance for readers, without them navigating through millions of articles becomes infeasible.

That said, we're fully aware how much great unique content there is in repositories and we'd like very much to merge these two streams - Gold and Green - in Paperity at some point. Although there are some tensions inside OA community between the Gold and Green camps, I think they are unjustified, because these routes are complementary, not competitive.

As to indexing, it is actually much easier to be done for repositories than for journals, because most repos expose standardized interfaces. So we don't need Google Scholar for this purpose, only as I said, we believe that the right order is journals first.

Best
Marcin

On 10/12/2014 01:51 PM, Stevan Harnad wrote:

Harvesting Gold OA journal articles is a piece of cake. How will Paperity/redex harvest Green OA articles published in non-OA journals but made OA somewhere on the Web - via Google Scholar? Sounds like a splendid idea if it can be done… But not if it is just Gold-biassed, because most refereed research is not Gold, and the fastest growing form of OA is Green (because of mandates, and absence of extra cost).

SH

--
Marcin Wojnarski, Founder of Paperity, www.paperity.org
www.linkedin.com/in/marcinwojnarski
www.facebook.com/Paperity
www.twitter.com/Paperity
Paperity. Open science aggregated.
___ GOAL mailing list GOAL@eprints.org http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal
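
Editorial sketch: the message above says that most repositories "expose standardized interfaces" without naming one. A likely candidate is OAI-PMH with Dublin Core metadata, but that is an assumption, not something stated in the thread. The Python sketch below parses a made-up, single-record response of that kind and pulls out the journal information from `dc:source`, a field that some repositories use for the journal citation. The record, the identifier and the function name `parse_records` are all hypothetical. Because the field is author-supplied, it may be absent, which is the metadata concern raised later in the thread.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH responses that carry unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up response trimmed to one record (illustrative only, not real data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1234</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Self-Archived Article</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:source>Journal of Examples, 12(3), 45-67</dc:source>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Return one dict per record; 'source' is None when the field is missing."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", default=None, namespaces=NS),
            "title": rec.findtext(".//dc:title", default=None, namespaces=NS),
            # Author-supplied journal citation; may be absent or unreliable.
            "source": rec.findtext(".//dc:source", default=None, namespaces=NS),
        })
    return records


if __name__ == "__main__":
    for record in parse_records(SAMPLE_RESPONSE):
        print(record)
```

A real harvester would fetch such responses over HTTP and follow resumption tokens; both are left out here to keep the sketch self-contained.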
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
On Oct 12, 2014, at 4:50 PM, Marcin Wojnarski mwojnar...@paperity.org wrote:

> Dear Stevan, We started with Gold, because we believe that journals play a fundamental role in the system of scholarly communication and every service that tries to facilitate access to literature must start with journals, not only with a flat collection of papers like the one found in repositories.

Dear Marcin,

I think there may be a fundamental misunderstanding here. Green OA consists of self-archived journal articles and their bibliographic metadata, including journal name. And institutional repositories consist of an institution’s journal article output. Nothing “flat” about those! Were you perhaps thinking that repositories just contain unpublished preprints and gray literature?

> For 400 years, journals have been the backbone of the system, the main structural element.

I don’t understand why you are pointing this out: From the very outset the Open Access movement has been very specifically about opening access to journal articles. Please see the original BOAI statement: http://www.budapestopenaccessinitiative.org/read

“The literature that should be freely accessible online is that which scholars give to the world without expectation of payment. Primarily, this category encompasses their peer-reviewed journal articles…”

> They provide a brand name for papers, create consistent editorial policy and take responsibility for the quality and relevance of articles they publish - these features are of topmost importance for readers, without them navigating through millions of articles becomes infeasible.

Marcin, it remains unclear why you are telling us this. We all know it. What I asked you was:

Harvesting Gold OA journal articles is a piece of cake. How will Paperity/redex harvest Green OA articles published in non-OA journals but made OA somewhere on the Web?

> That said, we're fully aware how much great unique content there is in repositories and we’d like very much to merge these two streams - Gold and Green - in Paperity at some point.

The great unique content in repositories is the very same great unique content that there is in journals. Gold OA and Green OA both consist of journal articles. There are many more non-Gold journals and non-Gold journal-articles than Gold ones. Why is Paperity focusing on Gold? Why is all the rest only to be merged “at some point”? And how, exactly?

> Although there are some tensions inside OA community between the Gold and Green camps, I think they are unjustified, because these routes are complementary, not competitive.

You are quite right, the two roads to OA are complementary, not competitive. But in order to complement one another they must both be clearly understood, and much of the tension is about misunderstandings, for example, that OA = Gold OA while Green OA is about something else (preprints, gray literature).

And another point of tension is about priorities: Which needs to come first, Gold or Green? (My own reply is that it is for many important reasons Green that must come first: (1) because Green does not cost the author money, (2) because Green can be mandated by institutions and funders, and (3) because by coming first Green will make subscriptions unsustainable, force journals to cut obsolete costs, downsize to providing peer review alone, and convert to affordable, sustainable, Fair Gold instead of today’s over-priced, double-paid pre-Green Fool’s Gold: http://j.mp/fairgoldOA)

> As to indexing, it is actually much easier to be done for repositories than for journals, because most repos expose standardized interfaces.

Then why is Paperity starting with Gold OA journal articles instead of Green OA journal articles in repositories?

> So we don't need Google Scholar for this purpose, only as I said, we believe that the right order is journals first.

What you have said is that you believe the right order is Gold OA first, but you have certainly not explained why, apart from the fact that Gold OA is certainly much easier to access and aggregate: Gold OA journal article bibliographic data can be harvested from the journals’ websites using DOAJ to identify all the journals. But how are you going to find all the Green OA journal articles, if not with Google Scholar? (WoS or SCOPUS can find you all journal articles, but won’t tell you which ones are Green OA.) (BASE provides some of these data; ROAR 2.0 will soon provide it all.)

Best wishes,
Stevan
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
Many thanks, indeed.
Your answer is clear, and I wish you success.

Cheers
Serge

From: goal-boun...@eprints.org [mailto:goal-boun...@eprints.org] On Behalf Of Marcin Wojnarski
Sent: Sunday, 12 October 2014 21:20
To: Global Open Access List (Successor of AmSci)
Subject: [GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers

Hi Serge,

We're working on this. Paperity started as a non-profit academic project, but yes, we need to develop a business model to make it sustainable and to achieve the goal of 100% OA aggregated. Most likely we'll expect participating journals to support our services, which we think is a fair solution when many of them charge APCs and we actually help them do their job (dissemination). We're aware however that there are also many small non-profit journals which don't charge APC at all, and we definitely want to aggregate them all, too. So the details are still to be sorted out, but I'm confident that over time we'll come up with a good solution: one that's fair, efficient and acceptable for everybody. Of course, there are also more traditional solutions that we'll investigate, like adverts.

Cheers
Marcin

On 10/11/2014 09:07 PM, BAUIN Serge wrote:

Marcin,

May I ask what is the economic model of Paperity? I didn't find any information about that on your web site.

Cheers
Serge

Sent from a mobile phone, sorry for the inelegant style...
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
Stevan,

Repositories are not an authoritative source of metadata about paper-journal relation. Metadata is put there by authors themselves and it can be missing, incomplete or erroneous, in extreme cases even fake. Thus in practice repository collections are flat even if metadata is present.

If you think that finding Green articles is impossible, then you shall not be surprised that we focus on Gold first, right?

Best
Marcin
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
On Oct 13, 2014, at 1:06 PM, Marcin Wojnarski mwojnar...@paperity.org wrote:

> Repositories are not an authoritative source of metadata about paper-journal relation. Metadata is put there by authors themselves and it can be missing, incomplete or erroneous, in extreme cases even fake. Thus in practice repository collections are flat even if metadata is present.

Are you looking for “authoritative metadata” or metadata of OA journal articles? The majority of OA journal articles are Green, not Gold. Focussing on the Gold because it is more “authoritative” calls to mind the joke about the drunkard who prefers to keep looking for his keys by the lamp-post because it is brighter there.

> If you think that finding Green articles is impossible, then you shall not be surprised that we focus on Gold first, right?

I certainly did not say it was impossible! (We do it all the time! So does Google Scholar.) I only said it was not as easy as it is to just go to DOAJ journal websites (the lamp-post) for only the Gold.

And I think the preoccupation with “authoritative” sources of metadata is monumentally misplaced. (In fact, the notion of “aggregation” is probably obsolescent too): we have journal articles all over the web, and all that’s needed is a way to find them. Google Scholar’s pretty good, and can potentially be made even better.

But what’s missing now is not a better harvester or more “authoritative” metadata, but more OA articles (whether Gold or Green). Only about 30% of journal articles published today are OA (the majority of it Green). The fastest and surest (and cheapest) way to provide the remaining 70% is to mandate and provide Green.
Stevan Harnad
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
It would be nice if 'Paperity' would maintain a listing of the publishers of the journals they index. T-R does this for Web of Science Journal Citation Reports, and it is very helpful.

Dana L. Roth
Millikan Library / Caltech 1-32
1200 E. California Blvd. Pasadena, CA 91125
626-395-6423 fax 626-792-7540
dzr...@library.caltech.edu
http://library.caltech.edu/collections/chemistry.htm

From: goal-boun...@eprints.org [goal-boun...@eprints.org] on behalf of BAUIN Serge [serge.ba...@cnrs.fr]
Sent: Saturday, October 11, 2014 12:07 PM
To: Global Open Access List (Successor of AmSci)
Subject: [GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers

Marcin,

May I ask what is the economic model of Paperity? I didn't find any information about that on your web site.

Cheers
Serge

Sent from a mobile phone, sorry for the inelegant style...

On 10 Oct 2014, at 08:22, Marcin Wojnarski mwojn...@ns.onet.pl wrote:

Jeroen,

Thanks, it's great to hear that you like Paperity!

"True peer-reviewed" means published in a peer-reviewed journal, in contrast to a PDF just posted somewhere on the web (think Google Scholar), which can be anything: a peer-reviewed paper or not, published or not, even randomly generated to resemble a scholarly article, for example to pump up G Scholar citations (http://arxiv.org/abs/1212.0638).

The new technology is called REgular Document EXpressions (redex). It is a computer language for analyzing long and complex documents, particularly those written in a markup language like HTML or XML. It facilitates analysis of the web context where the paper occurred, which is critical for maintaining the link between the paper and its journal. Redex builds on top of the very fundamental technology of regular expressions (regex), but redefines the language entirely to make it suitable for large structured texts.

Best,
Marcin

On 10/09/2014 05:02 PM, Bosman, J.M. (Jeroen) wrote:

Marcin,

This is a great initiative. I had been hoping BASEsearch would take on this task, but it is good to see others are stepping in. Congrats on the initiative. Still, a long way to go.

Could you elaborate on how your technology is able to recognize “true peer reviewed papers” and what you consider to be “true peer reviewed papers”?

Best,
Jeroen Bosman
@jeroenbosman
Utrecht University Library

From: goal-boun...@eprints.org [mailto:goal-boun...@eprints.org] On Behalf Of Marcin Wojnarski
Sent: Thursday, 9 October 2014 14:51
To: Global Open Access List (Successor of AmSci)
Subject: [GOAL] Paperity launched. The 1st multidisciplinary aggregator of OA journals papers

(press release, apologies for cross-posting)

With the beginning of the new academic year, Paperity (http://paperity.org), the first multidisciplinary aggregator of Open Access journals and papers, has been launched. Paperity will connect authors with readers, boost dissemination of new discoveries and consolidate academia around open literature.

Right now, Paperity (http://paperity.org/) includes over 160,000 open articles, gold and hybrid, from 2,000 scholarly journals, and growing. The goal of the team is to cover - with the support of journal editors and publishers - 100% of Open Access literature in 3 years from now. In order to achieve this, Paperity utilizes an original technology for article indexing, designed by Marcin Wojnarski, a data geek from Poland and a medalist of the International Mathematical Olympiad. This technology indexes only true peer-reviewed scholarly papers and filters out irrelevant entries, which easily make it into other aggregators and search engines.

The amount of scholarly literature has grown enormously in the last decades. Successful dissemination became a big issue. New tools are needed to help readers access vast amounts of literature dispersed all over the web and to help authors reach their target audience. Moreover, research is interdisciplinary now and scholars need broad access to literature from many fields, also from outside of their core research area. This is the reason why Paperity covers all subjects, from Sciences, Technology, Medicine, through Social Sciences, to Humanities and Arts.

"There are lots of great articles out there which report new significant findings, yet attract no attention, only because they are hard to find. No more than top 10% of research institutions have good access to communication channels and can share their findings efficiently. The remaining 90%, especially authors from developing countries and early-career researchers, start from a much lower stand and often stay unnoticed despite high quality of their work," says Wojnarski. He adds that it is not by accident that Paperity partners right now with the EU Contest for Young Scientists, the biggest science fair in Europe
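
Editorial sketch: the reply to Jeroen above describes redex as building on regular expressions to analyze the markup context in which a paper appears, so that the paper stays linked to its journal. The thread does not show redex syntax, so the Python below is not redex. It is a plain `re` illustration of the underlying idea, under the assumption that an article page carries `citation_*` meta tags (a convention some publisher pages follow). The HTML snippet, the URL and the function names are made up for illustration.

```python
import re

# Made-up article landing page (not taken from any real journal site).
SAMPLE_HTML = """
<html><head>
<meta name="citation_title" content="An Example Article">
<meta name="citation_journal_title" content="Journal of Examples">
<meta name="citation_pdf_url" content="http://journal.example.org/article/42.pdf">
</head><body>...</body></html>
"""

# Matches <meta name="citation_KEY" content="VALUE"> and captures KEY and VALUE.
# Assumes the name attribute comes before content, which real pages do not guarantee.
META_RE = re.compile(
    r'<meta\s+name="citation_(?P<key>[a-z_]+)"\s+content="(?P<value>[^"]*)"\s*/?>',
    re.IGNORECASE,
)


def extract_citation_meta(html):
    """Collect citation_* meta tags into a dict keyed by the part after 'citation_'."""
    return {m.group("key").lower(): m.group("value") for m in META_RE.finditer(html)}


def link_paper_to_journal(html):
    """Return (pdf_url, journal_title); either is None when its tag is absent."""
    meta = extract_citation_meta(html)
    return meta.get("pdf_url"), meta.get("journal_title")


if __name__ == "__main__":
    print(link_paper_to_journal(SAMPLE_HTML))
```

A pattern like this breaks as soon as attribute order or nesting changes, which may be the kind of limitation on large structured texts that the redex description alludes to.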
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
On Sun, Oct 12, 2014 at 2:08 AM, Dana Roth dzr...@library.caltech.edu wrote:

> It would be nice if 'Paperity' would maintain a listing of the publishers of the journals they index. T-R does this for Web of Science Journal Citation Reports, and it is very helpful.

Is this listing
(a) publicly visible - or only available to WoS subscribers?
(b) re-usable without further permission from T-R? (CC-BY or weaker?)

If it's not re-usable then we need a fully Open equivalent for indexable journals.

> Dana L. Roth
> Millikan Library / Caltech 1-32
> 1200 E. California Blvd. Pasadena, CA 91125
> 626-395-6423 fax 626-792-7540
> dzr...@library.caltech.edu
> http://library.caltech.edu/collections/chemistry.htm

--
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
Harvesting Gold OA journal articles is a piece of cake. How will Paperity/redex harvest Green OA articles published in non-OA journals but made OA somewhere on the Web - via Google Scholar? Sounds like a splendid idea if it can be done… But not if it is just Gold-biassed, because most refereed research is not Gold, and the fastest growing form of OA is Green (because of mandates, and absence of extra cost).

SH
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
On 12 Oct 2014, at 12:51, Stevan Harnad har...@ecs.soton.ac.uk wrote:

> Harvesting Gold OA journal articles is a piece of cake.

Indeed. Not just for Paperity, but for anybody else. It's one of the attractions and benefits of open access via the 'gold' route. Another is that most articles can be harvested in XML-format, which enables sophisticated and worthwhile services to be added to aggregations. And aggregations enable researchers to conveniently make large-scale pattern- and meta-analyses without first having to gather all the material from different and disparate sources. Few 'green' repositories that I'm aware of have XML-versions (correct me if I'm wrong – and should I be wrong, is there a list of such repositories?).

Aggregations, by the way, cannot be made without clarity about rights and licences, since they are a form of re-use. Those rights are clear, and properly included in metadata, for proper 'gold', but often not for 'green' versions of paywalled articles in repositories.

> How will Paperity/redex harvest Green OA articles published in non-OA journals but made OA somewhere on the Web - via Google Scholar?

Indeed, how will they? Or anybody else?

JV

> Sounds like a splendid idea if it can be done… But not if it is just Gold-biassed, because most refereed research is not Gold, and the fastest growing form of OA is Green (because of mandates, and absence of extra cost).
>
> SH
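Jan Velterop's point that aggregation depends on rights being clear and present in the metadata can be illustrated with a small check that an aggregator might run before ingesting an article. This is only a sketch under stated assumptions: it assumes each article is available as JATS-style XML with a `<license>` element carrying an `xlink:href` attribute, and the allow-list `REUSABLE_PREFIXES`, the function names and the sample records are all hypothetical. Nothing here describes how Paperity, redex or any repository actually works.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the xlink:href attribute used on JATS <license> elements.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical allow-list: licences this imaginary aggregator treats as
# permitting re-use. The choice of entries is an assumption for illustration.
REUSABLE_PREFIXES = (
    "creativecommons.org/licenses/by/",
    "creativecommons.org/licenses/by-sa/",
    "creativecommons.org/publicdomain/zero/",
)


def license_url(jats_xml):
    """Return the first licence URL declared in the article XML, or None."""
    root = ET.fromstring(jats_xml)
    for lic in root.iter("license"):
        href = lic.get(XLINK_HREF)
        if href:
            return href
    return None


def may_aggregate(jats_xml):
    """True only if a licence is declared AND it is on the allow-list.

    A missing licence is treated as 'not clear', so the article is skipped.
    """
    url = license_url(jats_xml)
    if url is None:
        return False
    bare = url.split("://", 1)[-1].lower()  # drop the http:// or https:// scheme
    if bare.startswith("www."):
        bare = bare[4:]
    return bare.startswith(REUSABLE_PREFIXES)


# Made-up sample records, not taken from any real journal.
SAMPLE_OPEN = """<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front><article-meta><permissions>
    <license xlink:href="https://creativecommons.org/licenses/by/4.0/"/>
  </permissions></article-meta></front>
</article>"""

SAMPLE_NC_ND = """<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front><article-meta><permissions>
    <license xlink:href="http://creativecommons.org/licenses/by-nc-nd/3.0/"/>
  </permissions></article-meta></front>
</article>"""

SAMPLE_UNCLEAR = "<article><front><article-meta/></front></article>"

if __name__ == "__main__":
    for name, doc in [("open", SAMPLE_OPEN),
                      ("nc-nd", SAMPLE_NC_ND),
                      ("unclear", SAMPLE_UNCLEAR)]:
        print(name, license_url(doc), may_aggregate(doc))
```

The design choice worth noting is that a record with no licence statement is rejected rather than guessed at, which mirrors the situation described above for 'green' copies whose rights are not stated in the metadata.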
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
On Sun, Oct 12, 2014 at 1:44 PM, Jan Velterop velte...@gmail.com wrote:

> On 12 Oct 2014, at 12:51, Stevan Harnad har...@ecs.soton.ac.uk wrote:
>
>> Harvesting Gold OA journal articles is a piece of cake.
>
> Indeed. Not just for Paperity, but for anybody else. It's one of the attractions and benefits of open access via the 'gold' route.

Yes. It's noteworthy that almost all modern text and data mining exercises are carried out on the Open Access subset of the literature. In some cases this is an attempt to get the whole Open literature - in others it's a subsubset such as EuropePubMedCentral. (The alternatives to this are (a) to ignore rights and mine anyway - something we are legally allowed to do in the UK but almost nowhere else - or (b) to do it in private, hoping you won't be found and scared of publishing your sources as a good scholar should.)

> Another is that most articles can be harvested in XML-format, which enables sophisticated and worthwhile services to be added to aggregations.

This is true for born-Open publishers such as BioMedCentral, PLOS*, eLife, PeerJ, Ubiquity ... This is a straightforward sale - author payment = freedom for re-use. It works very well for text miners. (And please don't tell us that mining is a minority sport which has to tread water for another 5-10 years.)

I have not systematically surveyed whether XML is offered in the Gold Open Access journals of other major publishers nor whether the licence is always permissive. Those people who argue that CC-NC-ND protects authors (it doesn't) should realise that it has a massive negative impact on useful re-use including mining. Hybrid journals almost certainly do not offer XML. It's hard enough for them to offer CC-BY for Open Access.

It works less well for born-Closed publishers (such as Elsevier, NPG, ACS, etc.). Rather than having the simple

> And aggregations enable researchers to conveniently make large-scale pattern- and meta-analyses without first having to gather all the material from different and disparate sources.

Yes - we have built the apparatus to do this in contentmine.org

> Few 'green' repositories that I'm aware of have XML-versions (correct me if I'm wrong – and should I be wrong, is there a list of such repositories?).
>
> Aggregations, by the way, cannot be made without clarity about rights and licences, since they are a form of re-use. Those rights are clear, and properly included in metadata, for proper 'gold', but often not for 'green' versions of paywalled articles in repositories.

Exactly. Most Green repositories make it very hard to re-use material. This is primarily due to copyright - the default library approach is to say "this may be copyright and you cannot use it unless you write to the author and get permission in writing with real ink".

Then there is the technology. University repositories are constructed on the basis that each document is a priceless artefact that scholars will spend hours discovering and reading. The reality of science is that most of these documents will probably only be read by machines. Some countries (NL, FR for example) at least aggregate some documents - such as theses - and the UK has CORE to try to remedy the situation, but even so it's extremely difficult to index and search repositories. I wrote to Bernard Rentier offering to index his repository for scientific terms but was told - sadly - that there was a new phase of investment required before this would be possible.

Another problem with most repositories is that they insist on transforming DOCX or LaTeX into PDF. Even for their own theses. This is an act of barbarism. PDF has no semantics and it destroys about 50-75% of the science in the document.

Anyway we expect to announce our own Open indexing of the literature RSN.
--
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge CB2 1EW, UK
+44-1223-763069
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
Hi Serge,

We're working on this. Paperity started as a non-profit academic project, but yes, we need to develop a business model to make it sustainable and to achieve the goal of 100% OA aggregated. Most likely we'll expect participating journals to support our services, which we think is a fair solution when many of them charge APCs and we actually help them do their job (dissemination). We're aware, however, that there are also many small non-profit journals which don't charge APCs at all, and we definitely want to aggregate them all, too. So the details are still to be sorted out, but I'm confident that over time we'll come up with a good solution: one that's fair, efficient and acceptable for everybody. Of course, there are also more traditional solutions that we'll investigate, like adverts.

Cheers
Marcin

On 10/11/2014 09:07 PM, BAUIN Serge wrote:

> Marcin,
> May I ask what is the economic model of Paperity? I didn't find any information about that on your web site.
> Cheers
> Serge
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
Thanks Dana. On our to-do list. :)

Marcin

On 10/12/2014 03:08 AM, Dana Roth wrote:

> It would be nice if 'Paperity' would maintain a listing of the publishers of the journals they index. T-R does this for Web of Science Journal Citation Reports, and it is very helpful.
>
> Dana L. Roth
> Millikan Library / Caltech 1-32
> 1200 E. California Blvd. Pasadena, CA 91125
> 626-395-6423 fax 626-792-7540
> dzr...@library.caltech.edu
> http://library.caltech.edu/collections/chemistry.htm
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
Thank you for providing the information, Marcin.

Since there is a subset of the open access community that demands blanket permissions for commercial rights downstream (a position I strongly disagree with), it is important to discuss what the potential commercial uses might be to determine whether these actually advance open access or scholarly knowledge or not.

Some comments on these options for Paperity:

In the subscriptions model, aggregators (such as EBSCO and ProQuest) typically pay journals to include their content, or in the case of open access journals, at least do not charge the journals. Charging journals to include them in an aggregated service changes a revenue stream to an expense stream for the journals. This makes it harder to find the revenue to produce journals; a barrier to publishing journals in the first place is not in the interests of advancing scholarly knowledge.

Advertising is one of the potential revenue streams for open access journals (and one that some journals are currently using). If Paperity is using journal content to sell advertising, then Paperity could easily be competing with the journals for this revenue.

It is lovely to hear of Paperity's good intentions starting out to be fair, efficient and acceptable for everyone. But what can happen with services like this down the road when there are bills to be paid, journals are less than keen to pay for this service and advertisers continue to prefer Google?

The following is addressed to my fellow open access advocates as this is a good discussion about open access downstream, and these comments are not intended to apply to Paperity:

If the purpose of insisting on re-use and commercial rights downstream is to facilitate the design of services such as Paperity, let's discuss these downstream possibilities that I argue are facilitated by CC-BY and/or CC-BY-SA licenses:

- aggregator takes CC-BY content and develops a toll-access value-added service. By way of illustration of this: Elsevier's Scopus claims to include 2,800 gold open access journals. Scopus is a subscription-based service.

- aggregator takes CC-BY content, initially develops an open access value-added search service, then sells the service to a for-profit company that changes the business model to toll access. By way of illustration of the sales aspect, consider that Elsevier bought Mendeley and Springer bought BioMedCentral. Both are still free services, but offered by largely subscription-based companies; why would we assume that they would never change the business model?

- aggregator follows the Paperity suggestion of charging journals, but with a twist: does not include journals that do not pay and/or returns results based on payments by journals (i.e. pay-to-play)

Are these models seen as desirable by advocates of requiring CC-BY and/or CC-BY-SA licenses? Are any of these scenarios aligned with the Budapest vision? If you agree that they are not, can you explain why you think these are unlikely or how the licenses would prevent this from happening? For example, perhaps someone can explain how it is that Elsevier is able to charge to direct people to OA journals through Scopus?

A comment on SA: although Sharealike is the most copyleft of the CC license elements, it does not come with an obligation to share in the same way, rather an obligation to use the same license when including re-used content. One can take a work that is licensed SA and is freely available on the web and include it in a work that is limited in any of a variety of fashions (part of a presentation to an audience limited to those who are willing and able to pay to attend; a toll access work, etc.) - as long as the downstream work uses the same license. In other words, CC-BY-SA does not do as much to protect OA downstream as one might think.

best,
Heather Morrison
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
Marcin,

May I ask what is the economic model of Paperity? I didn't find any information about that on your web site.

Cheers
Serge

Sent from a mobile phone, sorry for the inelegant style...

On 10 Oct 2014, at 08:22, Marcin Wojnarski mwojn...@ns.onet.pl wrote:

> Jeroen,
>
> Thanks, it's great to hear that you like Paperity!
>
> True peer-reviewed means published in a peer-reviewed journal, in contrast to a PDF just posted somewhere on the web (think Google Scholar), which can be anything: a peer-reviewed paper or not, published or not, even randomly generated to resemble a scholarly article, for example to pump up G Scholar citations (http://arxiv.org/abs/1212.0638).
>
> The new technology is called REgular Document EXpressions (redex). It is a computer language for analyzing long and complex documents, particularly those written in a markup language, like HTML or XML. It facilitates analysis of the web context where the paper occurred, which is critical for maintaining the link between the paper and its journal. Redex builds on top of the very fundamental technology of regular expressions (regex), but redefines the language entirely to make it suitable for large structured texts.
>
> Best,
> Marcin
>
> On 10/09/2014 05:02 PM, Bosman, J.M. (Jeroen) wrote:
>
>> Marcin,
>>
>> This is a great initiative. I had been hoping BASEsearch would take on this task, but it is good to see others are stepping in. Congrats on the initiative. Still, a long way to go. Could you elaborate on how your technology is able to recognize “true peer reviewed papers” and what you consider to be “true peer reviewed papers”?
>>
>> Best,
>> Jeroen Bosman
>> @jeroenbosman
>> Utrecht University Library
>>
>> From: goal-boun...@eprints.org [mailto:goal-boun...@eprints.org] On Behalf Of Marcin Wojnarski
>> Sent: Thursday, 9 October 2014 14:51
>> To: Global Open Access List (Successor of AmSci)
>> Subject: [GOAL] Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
>>
>>> (press release, apologies for cross-posting)
>>>
>>> With the beginning of the new academic year, Paperity (http://paperity.org), the first multidisciplinary aggregator of Open Access journals and papers, has been launched. Paperity will connect authors with readers, boost dissemination of new discoveries and consolidate academia around open literature.
>>>
>>> Right now, Paperity (http://paperity.org/) includes over 160,000 open articles, gold and hybrid, from 2,000 scholarly journals, and growing. The goal of the team is to cover - with the support of journal editors and publishers - 100% of Open Access literature in 3 years from now. In order to achieve this, Paperity utilizes an original technology for article indexing, designed by Marcin Wojnarski, a data geek from Poland and a medalist of the International Mathematical Olympiad. This technology indexes only true peer-reviewed scholarly papers and filters out irrelevant entries, which easily make it into other aggregators and search engines.
>>>
>>> The amount of scholarly literature has grown enormously in the last decades. Successful dissemination has become a big issue. New tools are needed to help readers access vast amounts of literature dispersed all over the web and to help authors reach their target audience. Moreover, research is interdisciplinary now and scholars need broad access to literature from many fields, including from outside of their core research area. This is the reason why Paperity covers all subjects, from Sciences, Technology, Medicine, through Social Sciences, to Humanities and Arts.
>>>
>>> "There are lots of great articles out there which report new significant findings, yet attract no attention, only because they are hard to find. No more than the top 10% of research institutions have good access to communication channels and can share their findings efficiently. The remaining 90%, especially authors from developing countries and early-career researchers, start from a much lower position and often stay unnoticed despite the high quality of their work," says Wojnarski. He adds that it is not by accident that Paperity partners right now with the EU Contest for Young Scientists, the biggest science fair in Europe. With the help of Paperity, the Contest wants to improve dissemination of discoveries authored by its participants - top young talents from all over the continent.
>>>
>>> Paperity is the first service of this kind. The most similar existing website, PubMed Central, aggregates open journals, too, but is limited to life sciences alone. Another related service, the Directory of Open Access Journals, does index articles from multiple periodicals and different disciplines, but does not provide aggregation, only pure indexing: it shows metadata of articles, but for fulltext access redirects to external sites. Moreover, both PMC and DOAJ impose strict technical requirements on participating journals, which limits the scope of aggregation. Paperity adapts to
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
Jeroen, Thanks, it's great to hear that you like Paperity! True peer-reviewed means published in a peer-reviewed journal, in contrast to a PDF just posted somewhere on the web (think Google Scholar), which can be anything: a peer-reviewed paper or not, published or not, even randomly generated to resemble a scholarly article, for example to pump up G Scholar citations (http://arxiv.org/abs/1212.0638). The new technology is called REgular Document EXpressions (redex). It is a computer language for analyzing long and complex documents, particularly those written in a markup language, like HTML or XML. It facilitates analysis of the web context where the paper occurred, which is critical for maintaining the link between the paper and its journal. Redex builds on top of the very fundamental technology of regular expressions (regex), but redefines the language entirely to make it suitable for large structured texts. Best, Marcin On 10/09/2014 05:02 PM, Bosman, J.M. (Jeroen) wrote: Marcin, This is a great initiative. I had been hoping BASEsearch would take on this task, but it is good to see others are stepping in. Congrats on the initiative. Still, a long way to go. Could you elaborate on how your technology is able to recognize “true peer reviewed papers” and what you consider to be “true peer reviewed papers”? Best, Jeroen Bosman @jeroenbosman Utrecht University Library *From:* goal-boun...@eprints.org [mailto:goal-boun...@eprints.org] *On Behalf Of* Marcin Wojnarski *Sent:* Thursday, 9 October 2014 14:51 *To:* Global Open Access List (Successor of AmSci) *Subject:* [GOAL] Paperity launched. The 1st multidisciplinary aggregator of OA journals papers (press release, apologies for cross-posting) *With the beginning of the new academic year, Paperity (http://paperity.org), the first multidisciplinary aggregator of Open Access journals and papers, has been launched.
Paperity will connect authors with readers, boost dissemination of new discoveries and consolidate academia around open literature.* Right now, Paperity (http://paperity.org/) includes over 160,000 open articles, gold and hybrid, from 2,000 scholarly journals, and growing. The goal of the team is to cover - with the support of journal editors and publishers - 100% of Open Access literature in 3 years from now. In order to achieve this, Paperity utilizes an original technology for article indexing, designed by Marcin Wojnarski, a data geek from Poland and a medalist of the International Mathematical Olympiad. This technology indexes only true peer-reviewed scholarly papers and filters out irrelevant entries, which easily make it into other aggregators and search engines. The amount of scholarly literature has grown enormously in the last decades. Successful dissemination became a big issue. New tools are needed to help readers access vast amounts of literature dispersed all over the web and to help authors reach their target audience. Moreover, research is interdisciplinary now and scholars need broad access to literature from many fields, also from outside of their core research area. This is the reason why Paperity covers all subjects, from Sciences, Technology, Medicine, through Social Sciences, to Humanities and Arts. - /There are lots of great articles out there which report new significant findings, yet attract no attention, only because they are hard to find. No more than top 10% of research institutions have good access to communication channels and can share their findings efficiently. The remaining 90%, especially authors from developing countries and early-career researchers, start from a much lower stand and often stay unnoticed despite high quality of their work/ – says Wojnarski. He adds that it is not by accident that Paperity partners right now with the EU Contest for Young Scientists, the biggest science fair in Europe.
With the help of Paperity, the Contest wants to improve dissemination of discoveries authored by its participants – top young talents from all over the continent. Paperity is the first service of this kind. The most similar existing website, PubMed Central, aggregates open journals, too, but is limited to life sciences alone. Another related service, the Directory of Open Access Journals, does index articles from multiple periodicals and different disciplines, but does not provide aggregation, only pure indexing: it shows metadata of articles, but for fulltext access redirects to external sites. Moreover, both PMC and DOAJ impose strict technical requirements on participating journals, which limits the scope of aggregation. Paperity adapts to whatever technology a given periodical employs. Paperity website: http://paperity.org/ -- Marcin Wojnarski, Founder of Paperity, http://www.paperity.org http://www.linkedin.com/in/marcinwojnarski
[GOAL] Re: Paperity launched. The 1st multidisciplinary aggregator of OA journals papers
Marcin, This is a great initiative. I had been hoping BASEsearch would take on this task, but it is good to see others are stepping in. Congrats on the initiative. Still, a long way to go. Could you elaborate on how your technology is able to recognize “true peer reviewed papers” and what you consider to be “true peer reviewed papers”? Best, Jeroen Bosman @jeroenbosman Utrecht University Library From: goal-boun...@eprints.org [mailto:goal-boun...@eprints.org] On Behalf Of Marcin Wojnarski Sent: Thursday, 9 October 2014 14:51 To: Global Open Access List (Successor of AmSci) Subject: [GOAL] Paperity launched. The 1st multidisciplinary aggregator of OA journals papers (press release, apologies for cross-posting) With the beginning of the new academic year, Paperity (http://paperity.org), the first multidisciplinary aggregator of Open Access journals and papers, has been launched. Paperity will connect authors with readers, boost dissemination of new discoveries and consolidate academia around open literature. Right now, Paperity (http://paperity.org/) includes over 160,000 open articles, gold and hybrid, from 2,000 scholarly journals, and growing. The goal of the team is to cover - with the support of journal editors and publishers - 100% of Open Access literature in 3 years from now. In order to achieve this, Paperity utilizes an original technology for article indexing, designed by Marcin Wojnarski, a data geek from Poland and a medalist of the International Mathematical Olympiad. This technology indexes only true peer-reviewed scholarly papers and filters out irrelevant entries, which easily make it into other aggregators and search engines. The amount of scholarly literature has grown enormously in the last decades. Successful dissemination became a big issue. New tools are needed to help readers access vast amounts of literature dispersed all over the web and to help authors reach their target audience.
Moreover, research is interdisciplinary now and scholars need broad access to literature from many fields, also from outside of their core research area. This is the reason why Paperity covers all subjects, from Sciences, Technology, Medicine, through Social Sciences, to Humanities and Arts. - There are lots of great articles out there which report new significant findings, yet attract no attention, only because they are hard to find. No more than top 10% of research institutions have good access to communication channels and can share their findings efficiently. The remaining 90%, especially authors from developing countries and early-career researchers, start from a much lower stand and often stay unnoticed despite high quality of their work – says Wojnarski. He adds that it is not by accident that Paperity partners right now with the EU Contest for Young Scientists, the biggest science fair in Europe. With the help of Paperity, the Contest wants to improve dissemination of discoveries authored by its participants – top young talents from all over the continent. Paperity is the first service of this kind. The most similar existing website, PubMed Central, aggregates open journals, too, but is limited to life sciences alone. Another related service, the Directory of Open Access Journals, does index articles from multiple periodicals and different disciplines, but does not provide aggregation, only pure indexing: it shows metadata of articles, but for fulltext access redirects to external sites. Moreover, both PMC and DOAJ impose strict technical requirements on participating journals, which limits the scope of aggregation. Paperity adapts to whatever technology a given periodical employs. 
Paperity website: http://paperity.org/ -- Marcin Wojnarski, Founder of Paperity, http://www.paperity.org http://www.linkedin.com/in/marcinwojnarski http://www.facebook.com/Paperity http://www.twitter.com/Paperity Paperity. Open science aggregated. ___ GOAL mailing list GOAL@eprints.org http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal