Forwarding to the Discovery list, since this project seems like it might be
of interest even outside the Wikidata context. Blame me if you've already
seen this elsewhere. :)

Kevin Smith
Agile Coach, Wikimedia Foundation


---------- Forwarded message ----------
From: Marco Fossati <[email protected]>
Date: Wed, Jun 15, 2016 at 9:06 AM
Subject: [Wikimedia-l] [ANNOUNCEMENT] StrepHit 1.0 Beta Release
To: "Discussion list for the Wikidata project." <[email protected]>
Cc: [email protected], [email protected]


[Feel free to blame me if you read this more than once]

To whom it may interest,

Full of delight, I would like to announce the first beta release of
*StrepHit*:

https://github.com/Wikidata/StrepHit

TL;DR: StrepHit is an intelligent reading agent that understands text and
translates it into *referenced* Wikidata statements.
It is an IEG project funded by the Wikimedia Foundation.

Key features:
- Web spiders to harvest a collection of documents (corpus) from reliable
  sources
- automatic corpus analysis to identify the most meaningful verbs
- sentence and semi-structured data extraction
- training of a machine learning classifier via crowdsourcing
- *supervised and rule-based fact extraction from text*
- Natural Language Processing utilities
- parallel processing

You can find all the details here:
https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References
https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References/Midpoint

If you like it, star it on GitHub!

Best,

Marco
_______________________________________________
discovery mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/discovery
