peter murray-rust <[EMAIL PROTECTED]> writes:

> that whatever we do here will require a lot of work and we have to
> make our case carefully. (I have been making all of these points in
> different forums for about 5 years and while they are accepted in
> some communities - eScience, Digital libraries, etc. they have very
> little traction in chemistry.
Hi Peter,

What a pity. Maybe open data concepts attract so little interest because the current system works so well? Or, as Borat would say - NOT? Well, it does not work for me.

> Indeed Henry Rzepa and I wrote a
> "manifesto for Open chemistry" in 2004.

Well, and the need was acknowledged even before that:

A plea for publishing without loss of data
G. Kaupp, M. Haak, G. Gauglitz, H.-J. Schneider, R. Moll, H. Schmelz, T. Fröhlich;
http://kaupp.chemie.uni-oldenburg.de/global-info/plea/plea_end.html

> Note that NMRShiftDB is not
> highly valued by many in the chemistry community. It will be
> interesting to see what they think of CrystalEye when we launch it.

Well, if you have 80% coverage and 90% accuracy - it rocks! Regarding the NMRShiftDB, the keyword is curation. The same goes for mass spectral data: I would still buy the NIST mass spectral database because it is curated, but there will be a shift towards machine curation, and in that case it is good to have open data. Anyway, every database vendor should be happy to have a free and open machine-readable format available.

> One important thing is that we must be *better* than the current
> position - free/open is not always compelling in this community. We
> have to produce more than arguments - we have to create new things
> that people want. I wish it weren't so.

I absolutely agree.

> I think we should restrict our work to mainstream chemistry. Work is
> already going on elsewhere. Moreover chemistry is seen in the Open
> Access and Open Data communities as an area of darkness so it's a
> simple and attractive target to point to. I also think we should
> restrict ourselves to data - Open Access is already well worked over

I absolutely agree, and I never wrote about Open Access. It's very easy: if I don't have access to certain journals, I just don't cite them. Unfortunately the University of California is very rich so we subscribe to most of the journals.
However, I am sick of commenting on journal prices. Hey, this is the free market; price and demand; it's very easy. Smith and Marx wrote about it, and Keynesian microeconomics tries to explain why prices are sometimes very sticky :-). I still think that 1500 Euro-Dollar author fees for open access articles are a rip-off. Hopefully the prices will fall in the future. On the other hand, I think that 1500 Dollar for a commercial publication is an extreme rip-off. A high-quality 12-issue print magazine like Geo or Mare or National Geographic costs around 60 Euro-Dollar per year (full color)! I think the price issue is not that important, at least not for the developed countries (which have money anyway); more important is the intellectual property issue.

More important than open access is open data (Peter Murray-Rust, XTECH 2007):

* publishers should adopt a positive policy of making scientific data openly available and remove restrictions.
* authors, editors and publishers should recognize the value of publications in semantic form ("machine-understandable").
* funders should require data to be semantic and open.
* authors should deposit data in institutional repositories.

> I think we could do the following - but again be warned it's a lot of work:
> * collect metadata systematically from publishers.

Sure. I find it interesting that ACS journals have a pretty good tools section: you can submit ZIPs, CIFs, MOLs, SDFs. But most chemists submit PDF supplement data. Open access journals, where you can submit anything, have almost no requirements, and chemists submit almost nothing.

***
One of the most progressive requirements for a chemistry journal is published in an ACS journal (hear hear). That's incredible. It's not about open access, it's about open data.

QSAR/QSPR and Proprietary Data
William L. Jorgensen
J. Chem. Inf. Model.; 2006; 46(3) pp 937 - 937;
http://dx.doi.org/10.1021/ci0680079

...
All data and molecular structures used to carry out a QSAR/QSPR study are to be reported in the paper and/or in its Supporting Information, or be readily available, without infringements or restrictions. The use of proprietary data is generally not acceptable because it is inconsistent with the ACS Ethical Guidelines for publications: "A primary research report should contain sufficient detail and reference to public sources of information to permit the author's peers to repeat the work." This is fundamental, though possible exceptions can be discussed with the editor in the unusual circumstance that a convincing case could be made that the data are somehow a secondary issue.
...
***

> Heather (Remix)
> has done this very nicely for some in the biomedical community. For
> that we need an agreed list of journals (note that policies vary
> between journals of the same publisher). For each we should list:
> - publisher details and contacts
> - current openly stated license policies
> - apparent current practice (formats, etc.). You will see I asked the
> blogosphere for some of this yesterday in synthetic org chem
> - anecdotes as to whether the license is, in fact, waivable if authors ask.
> * prepare a document summarising our views on desirable practice.

I think if the requirements are fulfilled (existing formats, existing software) there should be no problem. Why does it work in proteomics or genomics?

> This will include some of the points below, some of our previous
> manifestos, etc. It will need to be very carefully worded as we
> cannot go forward with a substandard document. It *must* refer to
> general protocols on publications (e.g. the ALPSP/STM document) which
> IMO should induce the publishers to open data but doesn't in
> chemistry. We should also collect policies from funding bodies.
> * then (or earlier) enlist the help of sympathetic bodies. These
> might include SPARC (e.g. on the Open Data mailing list) and Peter
> Suber.
> They will be able to relate our suggestions to other
> manifestos, protocols, etc.
> * (optional but desirable). Show from our own discipline the value of
> Openness. This is not easy as we don't have many examples. We might
> also collect examples of negative (anticommons) practice.

Agreed.

> Then - and only then - do we have a credible public face to take to
> publishers. Remember that many of these have a large data business
> (abstracts, databases) and have every reason to oppose us.

Well, look at Guy Kawasaki (How to Change the World): there's no such thing as bad PR. There is an absolute need for commercial publishers, commercial database vendors, companies etc., but on a new, progressive and innovative level. Let's call it Darwinism, let's use disruptive technologies; there will be a way out of the current dilemma.

> publishers have lobbied the EU to refuse the request to make funded
> research open. So it is naive to think that a good argument will carry the
> day.

Thank you :-) I think it's all about moderation. I have no problem buying databases, journals, software; I have no problem selling data, software, publications. It's about balance. Currently we are off-balance.

> What I have written is a lot of work. Until recently I would have
> said that it required formal research funding to carry out. But I now
> believe in the power of the blogosphere and I think we have an ideal
> position. But it has to be done well. And that is work.
>
> We might start with subfields. SPECTRa created a questionnaire for
> crystallography, comp chem and spectra. It took a person-year to take
> it round colleagues. A person-year is now quite feasible in the
> blogosphere - much can be done at coffee meetings, etc.
>
> It is critical that we have a systematic and professional approach to
> our interaction with journals and editors.
> Here is an example of part
> of a possible standard letter
>
> Dear (X), Editor of Y
>
> We represent a large group of practicing chemical scientists who are
> concerned that lack of access to primary data is holding back
> chemistry and sciences that rely on it. We note that bodies such as
> CODATA, [funding bodies], provosts, etc. have expressed similar
> concerns in science and have argued for... We have summarized a
> number of areas where access to data enhance science and lack of
> access is harmful. We have prepared a summary of the practices we
> feel would be valuable for journals publishing chemistry and ask for
> your help in clarifying your current practice and your comments on
> our protocol.
>
> All our discussion is hosted Openly and we will publish your reply
> verbatim and with attribution. Comments on our site will be factual
> rather than judgmental but we may make comparisons with other
> publishers. We shall record lack of a reply after [... days] as
> "failed to reply".
>
> =========

Yes, there need to be some simple questions and some room for comments. I am usually too lazy to comment on such questionnaires, so the questions should be very easy. And some positive examples, like the supplement requirements from J. Chem. Inf. Model. or Nature(?) or Science(?) or RSC(?), should be given. The questions should be tailored to the journal. It doesn't make sense to ask unrelated questions, such as asking an NMR journal whether it supports MS exchange formats. But the questions should not be too generic either. That's why I would like to fulfill some of my posted requirements first, like collecting existing software or databases.

And I would only request info from journals with an impact factor > 1, or from new journals which would like to self-report for that purpose. There are more than 1400 chemistry journals and even WIKIPEDIA has only 200 or 300. If somebody has experience with auto-generating entries for WIKIPEDIA, please step forward. A nice template would be this one:
http://en.wikipedia.org/wiki/Chemical_Society_Reviews
http://en.wikipedia.org/wiki/Category:Chemistry_journals

An additional section could contain information on openly accessible supplement data and the requirements for it. However, I would host this on BlueObelisk in the first place because WIKIPEDIA is prone to vandalism.

> So if you wish to take up the challenge - and no-one will think less
> if you do not - you can see the way that I feel we should take it.

Let's see. I will do it in my free time, so it will be extremely slow anyway. But at one journal a day it should be finished in one year, although I don't have the correct numbers on chemistry journals with IF>1 (maybe 300 or 400). And I would like to have some software or databases at hand first, or maybe chemists should just put all the supplement data in a ZIP file? Well, that's why it is important to be clear about formats and software in the first place (aka requirements).

> Others may take this up as well and you are unlikely to be without
> help. But building a systematic framework is essential and probably tedious.
>
> P.

Thank you.
Tobias

_______________________________________________
Blue-obelisk mailing list
http://hardly.cubic.uni-koeln.de/mailman/listinfo/blue-obelisk