Dear all,
Considering the several posts about this topic, I would like to share with you
my personal experience in using HTML(+RDF) as a format for
preparing/submitting/processing papers in scientific events.
In the past months, I (together with several people in my research group at the
University of Bologna plus other interested researchers from other institutions) have
released a format for writing academic articles called RASH, i.e., Research Articles in
Simplified HTML. RASH is a markup language that restricts the use of HTML elements to
only 25 elements for writing academic research articles. It is also possible to include
RDFa annotations within any element of the language, and other RDF statements in Turtle
or JSON-LD format, by using the appropriate "script" element. The RASH
documentation is available online at [1] and documents RASH version 0.3.5, defined as a
RelaxNG grammar [2].
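As a rough illustration of the JSON-LD embedding mentioned above, the following minimal Python sketch pulls JSON-LD blocks out of an HTML document using only the standard library. It assumes the statements sit in "script" elements with the standard media type application/ld+json. The sample document, the class name JsonLdExtractor and the function extract_jsonld are hypothetical and are not part of the RASH tools.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        # Start buffering only for script elements that declare JSON-LD.
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Parse the buffered text once the script element closes.
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html_text):
    """Return a list with one parsed JSON value per JSON-LD script block."""
    parser = JsonLdExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.blocks


# Hypothetical sample document (not taken from a real RASH paper).
SAMPLE = """<html><head><title>Example</title>
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "ScholarlyArticle", "name": "Example paper"}
</script>
</head><body><section><h1>Introduction</h1><p>Text.</p></section></body></html>"""

blocks = extract_jsonld(SAMPLE)
print(blocks[0]["name"])  # prints: Example paper
```

Turning the extracted JSON-LD into RDF triples, or reading Turtle blocks and RDFa attributes, would need an RDF library and is left out of this sketch.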
RASH is the core component of a larger framework that includes a set of
specifications and writing/conversion/extraction tools for academic articles.
All the sources (released with Open Source and Creative Commons Licences) are
available on GitHub [3] and have been developed by a group of several people so
far. An internal note [4] provides a complete overview of the RASH Framework;
for your convenience, the structured abstract of that note is included at the
end of this email.
Currently, the RASH Framework includes the following tools:
- a script that lets RASH users check their documents both against the
specific requirements in the RASH RelaxNG grammar and against the full set of
checks that the W3C Nu HTML Checker (a.k.a. the HTML5 validator) performs on
all HTML documents (i.e., all the requirements given in the HTML
specification);
- JavaScript scripts (based on Bootstrap and jQuery) and CSS stylesheets
(partially based on the Linked Research [5] CSS files) implementing the
visualisation of RASH documents in the browser. These scripts also add to RASH
papers a footbar with statistics about the paper (i.e., number of words,
figures, tables and formulas), a menu to change the layout of the page, the
automatic reordering of footnotes and references, the visualisation of the
metadata of the paper, etc.;
- XSLT 2.0 files for converting RASH documents into LaTeX according to the ACM
ICPS [6] and Springer LNCS [7] styles (other styles to come soon);
- an XSLT 2.0 file to perform conversions from OpenOffice documents into RASH
documents;
- a Java application called SPAR Xtractor suite that takes a RASH document as
input and returns a new RASH document where all its markup elements have been
annotated with their actual (structural) semantics according to the Document
Components Ontology (DoCO) [8].
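
To give an idea of the kind of statistics shown in the footbar mentioned in the list above, here is a minimal Python sketch that counts the words, figures and tables of an HTML document. It is not how the actual RASH scripts work (those are written in JavaScript and their counting rules are not described here); the class PaperStats, the function paper_stats and the sample document are hypothetical. Formulas are left out because their markup is not described in this email.

```python
from html.parser import HTMLParser


class PaperStats(HTMLParser):
    """Count body words plus <figure> and <table> elements."""

    SKIPPED = {"script", "style"}  # text inside these is not counted

    def __init__(self):
        super().__init__()
        self._in_body = False
        self._skip_depth = 0
        self.words = 0
        self.figures = 0
        self.tables = 0

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self._in_body = True
        elif tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "figure":
            self.figures += 1
        elif tag == "table":
            self.tables += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False
        elif tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Words are whitespace-separated tokens found inside <body>.
        if self._in_body and not self._skip_depth:
            self.words += len(data.split())


def paper_stats(html_text):
    """Return a dict with the word, figure and table counts."""
    parser = PaperStats()
    parser.feed(html_text)
    parser.close()
    return {"words": parser.words, "figures": parser.figures,
            "tables": parser.tables}


# Hypothetical sample document (not a real RASH paper).
SAMPLE = (
    "<html><head><title>Ignored title</title></head><body>"
    "<section><h1>Results</h1><p>Two short sentences. Here they are.</p>"
    "<figure><img src='plot.png' alt='A plot'/>"
    "<figcaption>One plot</figcaption></figure>"
    "<table><tr><td>42</td></tr></table></section>"
    "</body></html>"
)

stats = paper_stats(SAMPLE)
print(stats)  # prints: {'words': 10, 'figures': 1, 'tables': 1}
```

Note that this sketch counts caption and table cell text as words, which is a design choice of the example and may differ from what the footbar reports.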
In order to experiment with the use of RASH in official venues, it has already
been proposed as one of the possible submission formats in three academic
events, i.e., the Semantic Publishing Challenge 2015 [9] (that will be held
during ESWC 2015), and the workshops SAVE-SD 2015 [10] (held during WWW 2015)
and Linking in the Cloud 2015 [11] (that will be held during Hypertext 2015).
In particular, six papers were submitted in RASH to the SAVE-SD 2015
workshop [10] (which I co-organised); the sources of these papers are
available on the workshop program webpage [12]. All the RASH papers also
include RDF statements (for a total of about 1300 RDF triples) concerning
article metadata, basic article structures (mainly based on DoCO [8]), citation
functions (based on CiTO [13]), and even semantic descriptions of figures as in
the case of the SAVE-SD 2015 Best RASH Paper [14].
It is worth mentioning that the conversion of the RASH submissions into the ACM
format requested by the publisher Sheridan (responsible for publishing all
WWW proceedings, including the workshop proceedings) was handled by us, the
workshop organisers, through a semi-automatic process. In particular, we used
the aforementioned XSLT files to convert RASH papers into LaTeX files compliant
with the official ACM format requested [6], and then we fixed only a few
layout misalignments.
I hope that the RASH Framework (together with others, e.g., Linked Research [5]
and Scholarly Markdown [15]) and the related initiatives and adoption in
academic events can be considered a first concrete step towards the possible
adoption of HTML(+RDF) for scientific publications in academic venues.
I'm looking forward to receiving your comments about RASH and its framework.
If you are already an early adopter of it, please feel free to
participate in a 10-minute survey about the use of RASH for writing academic
papers, available at http://esurv.org/?u=rash-format.
Please don't hesitate to contact me (email: essepunt...@gmail.com) for
comments, suggestions, and further questions.
Have a nice day :-)
S.
# References
1. http://cs.unibo.it/save-sd/rash/documentation/index.html
2. http://cs.unibo.it/save-sd/rash/grammar/rash.rng
3. http://github.com/essepuntato/rash
4. http://www.essepuntato.it/2015/sepublica/rash-sepublica2015.html
5. https://github.com/csarven/linked-research
6. http://www.acm.org/sigs/publications/proceedings-templates
7. http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0
8. Constantin, A., Peroni, S., Pettifer, S., Shotton, D., Vitali, F. (in
press). The Document Components Ontology (DoCO). To appear in Semantic Web –
Interoperability, Usability, Applicability. OA available at
http://www.semantic-web-journal.net/content/document-components-ontology-doco-0
9. https://github.com/ceurws/lod/wiki/SemPub2015
10. http://cs.unibo.it/save-sd/2015/index.html
11. http://lc2015.dibris.unige.it/
12. http://cs.unibo.it/save-sd/2015/program.html
13. Peroni, S., Shotton, D. (2012). FaBiO and CiTO: ontologies for describing
bibliographic resources and citations. In Journal of Web Semantics: Science,
Services and Agents on the World Wide Web, 17 (December 2012): 33-43.
Amsterdam, The Netherlands: Elsevier.
http://dx.doi.org/10.1016/j.websem.2012.08.001
14. Kuhn, T. (2015). Science Bots: A Model for the Future of Scientific
Computation? http://cs.unibo.it/save-sd/2015/papers/html/kuhn-savesd2015.html
15. http://scholarlymarkdown.com
# Abstract of [4]
Purpose: this paper introduces the RASH Framework, i.e., a set of
specifications and tools for writing academic articles in RASH (a simplified
version of HTML).
Design: RASH has been developed in order to: be easy to learn and use; share
scholarly documents (and embedded semantic annotations) through the Web;
support its adoption within the publishing workflow.
Findings: RASH has been used for papers submitted to the SAVE-SD 2015 workshop.
The authors of papers were able to self-learn it by simply referring to its
documentation page without facing particular issues. The conversion of the RASH
submissions into the format requested by the publisher was handled by the
workshop organisers quickly through a semi-automatic process.
Research limitations: additional tools are needed, e.g., for extracting
additional RDF statements from RASH documents and for enabling additional
conversions from/to existing formats.
Practical implications: the RASH Framework is another step towards enabling the
definition of formal representations of the meaning of the content of an
article, facilitating its automatic discovery, enabling its linking to
semantically related articles, providing access to data within the article in
actionable form, and allowing integration of data between papers.
Social implications: RASH addresses the intrinsic needs related to the various
users of a scholarly article: researchers (focussing on its content), readers
(experiencing new ways for browsing it), citizen scientists (reusing available
data formally defined within it through semantic annotations), publishers
(using the advantages of new technologies as envisioned by the Semantic
Publishing movement).
Value: RASH focuses strictly on writing the content of the paper (i.e.,
organisation of text + semantic annotations) and leaves all the issues about
validation, visualisation, conversion, and semantic data extraction to the
various tools developed within the framework.
----------------------------------------------------------------------------
Silvio Peroni, Ph.D.
Department of Computer Science and Engineering
University of Bologna, Bologna (Italy)
Tel: +39 051 2094871
E-mail: silvio.per...@unibo.it
Web: http://www.essepuntato.it
Blog: http://palindrom.es/phd
Twitter: essepuntato