What I am looking for is an openly and collectively maintained DB a
la Wikipedia, whose interface would let you download all search hits
as well-formatted, parsable lines in a text file without having to
"click next", copy and paste, and all that kind of nonsense.
I could imagine someone in the corpora research community has taken
the time to compile a database which IMO should include:
...
That sounds like a description of Wikidata; that is the kind of thing it
was created to achieve, at least.
You can construct your own queries in SPARQL, which I've found quite
hard, though I've not done much with it in the past couple of years,
so it may have got better.
The query builder will make it a bit easier:
https://query.wikidata.org/querybuilder/?uselang=en
It will still be a bit of work to find the property numbers (the
P-identifiers, such as P31 for "instance of") for all the fields you
want included.
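
One way to look up property numbers without clicking around is the
wbsearchentities action of the Wikidata web API. The following is only
a rough sketch: the helper names and the User-Agent string are my own
placeholders, and the search returns candidates that you still have to
check by hand against what you actually mean.

```python
import json
import urllib.parse
import urllib.request

API = "https://www.wikidata.org/w/api.php"


def property_search_url(label, language="en"):
    """Build a wbsearchentities URL restricted to properties (P-numbers)."""
    params = {
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "type": "property",
        "format": "json",
    }
    return API + "?" + urllib.parse.urlencode(params)


def extract_candidates(response):
    """Reduce the API response to (P-number, label) pairs."""
    return [(hit["id"], hit.get("label", "")) for hit in response.get("search", [])]


def find_properties(label, user_agent="property-lookup-sketch/0.1 (add your contact details)"):
    """Query the live API; the User-Agent value is a placeholder to replace."""
    request = urllib.request.Request(
        property_search_url(label), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return extract_candidates(json.load(resp))
```

Calling find_properties("author"), for instance, should give a short
list of candidate P-numbers with their labels, one of which you would
then use in the query.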
(If you go down this route, and it works, maybe you can post back the
SPARQL query you end up with.)
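
For the "parsable lines in a text file" part, a minimal sketch of the
download step might look like this. It assumes the public endpoint at
https://query.wikidata.org/sparql and the standard SPARQL JSON result
format. The query is just the well-known "house cats" example (P31 =
"instance of", Q146 = "house cat") standing in for whatever you end up
building, and the function names and User-Agent string are placeholders
of mine.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# Placeholder query: replace the class and properties with your own.
QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
"""


def fetch(query, user_agent="sparql-download-sketch/0.1 (add your contact details)"):
    """Send the query to the endpoint and return the parsed JSON result."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url,
        headers={"User-Agent": user_agent, "Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.load(resp)


def clean(value):
    """Keep every hit on one line: tabs and line breaks become spaces."""
    return value.replace("\t", " ").replace("\r", " ").replace("\n", " ")


def to_tsv_lines(result):
    """Return a header line plus one tab-separated line per hit."""
    columns = result["head"]["vars"]
    lines = ["\t".join(columns)]
    for row in result["results"]["bindings"]:
        # Unbound variables are simply absent from a row, so use an empty field.
        lines.append("\t".join(clean(row.get(c, {}).get("value", "")) for c in columns))
    return lines


def save_hits(query, path):
    """Run the query, write the hits to a text file, return the hit count."""
    lines = to_tsv_lines(fetch(query))
    with open(path, "w", encoding="utf-8") as out:
        out.write("\n".join(lines) + "\n")
    return len(lines) - 1
```

Something like save_hits(QUERY, "hits.tsv") would then write all hits
in one go, with no "click next". Large result sets may still run into
the query service's timeout, so you may need LIMIT and OFFSET, or a
narrower query.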
Darren
_______________________________________________
Corpora mailing list -- [email protected]
https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
To unsubscribe send an email to [email protected]