I like it! I have often asked the question: what is different in principle between paper data and publications and digital data and publications? I've yet to get an answer. My own answer is "nothing", which implies that librarians, archivists and curators need to work out systems along the lines you describe to deal with digital material alongside paper material.
If those methods initially turn out to be fairly crude, then as long as they are CONSISTENTLY crude we have the computer power, network bandwidth and intelligent software writers to develop tools to trawl digital material much more easily than trawling the shelves of the planet for books, journals, logs and lab notebooks. Google wasn't exactly sophisticated when it took over the world and seemed to be useful, purely because it was better than the alternative. We should not be striving for perfection; we should be starting a journey. Remember that "useful but incomplete" is a worthwhile benchmark of new tools and methods!

John K. Milner
mailto: john.mil...@btinternet.com or john.mil...@ja.net

From: owner-dcc-associa...@lists.ed.ac.uk [mailto:owner-dcc-associa...@lists.ed.ac.uk] On Behalf Of Simon Fenton-Jones
Sent: 11 November 2011 05:45
To: 'John Milner'; 'Peter Murray-Rust'
Cc: 'Joy Davidson'; research-data...@jiscmail.ac.uk; email@example.com
Subject: RE: [dcc-associates] Manchester and Elsevier team up on text-mining tool

Thanks John, Peter,

Well, let me play devil's advocate. Curators are very nice people, sorry, public servants, who might say to their institutional researchers: OK, you're paid by the public purse to do a job. Part of that job is to put your papers, or audios, or videos into a repository which is open to everyone. I'm a busy person, so put it in our institution's database. Someone is probably going to trawl our open space and then suck it up into some huge repository in the sky. Then you'll have the pleasure of sifting through haystacks to find something useful. Hey, it's better than nothing. As for translations, forget it. OK, we'd better make some noise about the massive amount of money we have to pay to take a copy of the rapacious publishers' database/platform. But what more can we do?
So now we have public money being spent on enhancing private publishers' (these days) platforms and, in my world, the functionality of private platforms like Adobe Connect. Thank you very much, Mr. Taxpayer.

Now no one could ever accuse curators of original thoughts. They look at the back end in much the same way as many of my correspondents look at the front = lecture/event/conference capture. They also make sure that they look only at information, (sometimes) in all its formats, in the same way my mates look at the real-time communications. Professional courtesy appears to discourage interest.

So I have to ask. Do remember an old saying: "Libraries are not made, they grow." The idea, it seems, was that one needed a community to build a library around/on behalf of. These days our communities are global. The web has made that perfectly obvious. Our curators, meantime, are institutionalized.

So what is there to stop librarians agreeing on a DNS system, based (perhaps) on a bibliographic classification system, so that when a researcher says "where do I put my paper?" or "where do I find my peers?", curators could say "stick it up on that domain number", like they would if a researcher had a physical book to put on a shelf? And just like a Wikipedia article, when one reads its history, peers could discover one another.

I know that people who are interested in just classifying things would just laugh. Classify social media? You're daft.
http://www.semantico.com/2011/11/a-taxonomy-of-social-media-forget-it/?mid=51

But I'm not suggesting that they just classify things. I'm suggesting that they curate on behalf of a global community.

Or is it more to do with the idea that building a public platform on a global basis is all a bit too hard? Hmm. Doesn't seem like that's an excuse either.
http://www.geant.net/service/edugain/about_edugain/how_eduGAIN_works/Pages/home.aspx

Just a lack of perspective, imagination and collaboration.
Sounds like it's time for another conference where, just for a change, peers do more than invite the usual suspects.

Hasta Luego,
si

From: John Milner [mailto:john.mil...@btinternet.com]
Sent: Thursday, 10 November 2011 2:33 PM
To: 'Peter Murray-Rust'; 'Simon Fenton-Jones'
Cc: 'Joy Davidson'; research-data...@jiscmail.ac.uk; firstname.lastname@example.org
Subject: RE: [dcc-associates] Manchester and Elsevier team up on text-mining tool

Hmmm, so it's not your money. If you are paid from the public purse too, then it may not be, but it might be mine and I don't like it much either! I thought public policy was all about open access these days. Moreover, I think Elsevier are not even acting in their own best interests. In my experience defending IPR in that way is always doomed to failure; they need to start looking at new business models, not trying to defend a doomed one.

John K. Milner

From: owner-dcc-associa...@lists.ed.ac.uk [mailto:owner-dcc-associa...@lists.ed.ac.uk] On Behalf Of Peter Murray-Rust
Sent: 10 November 2011 02:25
To: Simon Fenton-Jones
Cc: Joy Davidson; research-data...@jiscmail.ac.uk; email@example.com
Subject: Re: [dcc-associates] Manchester and Elsevier team up on text-mining tool

On Thu, Nov 10, 2011 at 2:50 AM, Simon Fenton-Jones <simo...@cols.com.au> wrote:

Let me see if I got this right. "Elsevier, a leading provider of scientific, technical and medical information products and services", at a cost which increases much faster than inflation, to libraries who can't organize their researchers to back up a copy of their journal articles so they can be aggregated, is to have their platform, Sciverse, made more attractive, by the public purse, by a simple text mining tool which they could build on a shoestring. Sciverse Applications, in return, will take advantage of this public largesse to charge more for the journals which should/could have been compiled by public digital curators in the first instance. Hmmm. So this is progress. Hey. It's not my money!
Thanks very much Simon

No - it's worse. I have been expressly and consistently asking Elsevier for permission to text-mine factual data from their (sorry, OUR) papers. They have prevaricated and fudged and the current situation is:

"you can sign a text-mining licence which forbids you to publish any results and hands over all results to Elsevier"

I shall not let this drop - I am very happy to collect allies. Basically I am forbidden to deploy my text-mining tools on Elsevier content.

P.

-----Original Message-----
From: owner-dcc-associa...@lists.ed.ac.uk [mailto:owner-dcc-associa...@lists.ed.ac.uk] On Behalf Of Joy Davidson
Sent: Monday, 7 November 2011 11:59 PM
To: research-data...@jiscmail.ac.uk; firstname.lastname@example.org
Subject: [dcc-associates] Manchester and Elsevier team up on text-mining tool

This press release may be of interest to list members.

University enters collaboration to develop text mining applications
07 Nov 2011
http://www.manchester.ac.uk/aboutus/news/display/?id=7627

The University of Manchester has joined forces with Elsevier, a leading provider of scientific, technical and medical information products and services, to develop new applications for text mining, a crucial research tool.

The primary goal of text mining is to extract new information, such as named entities and relations hidden in text, and to enable scientists to systematically and efficiently discover, collect, interpret and curate knowledge required for research.

The collaborative team will develop applications for SciVerse Applications, which provides opportunities for researchers to collaborate with developers in creating and promoting new applications that improve research workflows.

The University's National Centre for Text Mining (NaCTeM), the first publicly-funded text mining centre in the world, will work with Elsevier's Application Marketplace and Developer Network team on the project.
Text mining extracts semantic metadata such as terms, relationships and events, which enable more pertinent search. NaCTeM provides a number of text mining services, tools and resources for leading corporations and government agencies that enhance search and discovery.

Sophia Ananiadou, Professor in the University's School of Computer Science and Director of the National Centre for Text Mining, said: "Text mining supports new knowledge discovery and hypothesis generation.

"Elsevier's SciVerse platform will enable access to sophisticated text mining techniques and content that can deliver more pertinent, focused search results."

"NaCTeM has developed a number of innovative, semantic-based and time-saving text mining tools for various organizations," said Rafael Sidi, Vice President Product Management, Applications Marketplace and Developer Network, Elsevier. "We are excited to work with the NaCTeM team to bring this expertise to the research community."

Notes for editors

Elsevier is a world-leading provider of scientific, technical and medical information products and services. The company works in partnership with the global science and health communities to publish more than 2,000 journals, and close to 20,000 book titles. A global business headquartered in Amsterdam, Elsevier employs 7,000 people worldwide.

NaCTeM is the first publicly funded text mining centre in the world, providing resources, tools and services to academia and industry. NaCTeM collaborates with both academia and industry, nationally and internationally.

The University of Manchester

The University of Manchester, a member of the Russell Group, is the most popular university in the UK. It has 22 academic schools and hundreds of specialist research groups undertaking pioneering multi-disciplinary teaching and research of worldwide significance.
According to the results of the 2008 Research Assessment Exercise, The University of Manchester is now one of the country's major research universities, rated third in the UK in terms of 'research power'. The University had an annual income of £788 million in 2009/10.

For media enquiries please contact:
Daniel Cochlin
Media Relations Officer
The University of Manchester
0161 275 8387
daniel.coch...@manchester.ac.uk

*****************
Joy Davidson
DCC Associate Director
Humanities Advanced Technology and Information Institute (HATII)
George Service House, 11 University Gardens, University of Glasgow
Glasgow G12 8QJ
Scotland
Tel: +44(0)141 330 8592
Fax: +44(0)141 330 3788
http://www.dcc.ac.uk

--
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069