Re: PDFBox and COVID-19

Karl Heinz Kremer Thu, 26 Mar 2020 03:49:33 -0700

The correct URL is https://github.com/petermr/openVirus - it is github.com, not 
.org


Karl Heinz Kremer
PDF Acrobatics Without a Net 
PDF Software Development, Training and More... 
http://www.khkonsulting.com 

> On Mar 26, 2020, at 6:15 AM, Peter Murray-Rust <pm...@cam.ac.uk> wrote:
> 
> One of the tools in tackling epidemics is to collect, clean and analyze the
> science. I have set up a site https://github.org/petermr/openVirus to
> download scientific papers and convert them to semantic, queryable form.
> There can be many thousands of papers on topics such as ventilators, or
> epidemics and schools, as well as genomes and chemistry. Many of us
> believe these could hold information useful to current and future COVID-19
> strategies (e.g. the Liberian Ebola outbreak was actually predicted in a
> paper).
> Most papers are exposed as PDF, which means it's hard for machines to read
> them reliably. PDFBox is an essential tool in my workflow. Work on parsing
> PDF is never complete, so any volunteers would be welcome (mail me NOT this
> list). (I do diagrams as well as text).
> 
> And sincere thanks to the small team that keep PDFBox going. Simply seeing
> the daily stream of factual issues is an encouragement to all of us.
> 
> P.
> 
> -- 
> "I always retain copyright in my papers, and nothing in any contract I sign
> with any publisher will override that fact. You should do the same".
> 
> Peter Murray-Rust
> Reader Emeritus in Molecular Informatics
> Unilever Centre, Dept. Of Chemistry
> University of Cambridge
> CB2 1EW, UK
> +44-1223-763069
