[Tika Wiki] Update of "GrobidJournalParser" by NickBurch

Apache Wiki Wed, 19 Aug 2015 10:38:10 -0700

The "GrobidJournalParser" page has been changed by NickBurch:
https://wiki.apache.org/tika/GrobidJournalParser?action=diff&rev1=7&rev2=8

Comment:
Add a note on binaries, and where to track the progress

  The GrobidJournalParser uses the 
[[http://grobid.readthedocs.org/en/latest/Introduction/|GROBID (or Grobid) 
GeneRation Of BIbliographic Data]] machine learning framework to parse PDF 
files and extract information such as title, abstract, authors, 
affiliations, and keywords from journal publications. The parser has been 
integrated into Tika; follow this guide to get it working on your 
system.
  
  == Installing GROBID ==
+ GROBID currently has to be installed from source. We are working with the 
GROBID community to get pre-built binaries into Maven Central; progress is 
tracked in [[https://github.com/kermitt2/grobid/issues/59|issue #59]]. For 
now, a Git checkout of head is recommended, as detailed below.
  
  You should be able to install GROBID from a Git checkout as shown below.
