You could try the Adelung
(http://woerterbuchnetz.de/cgi-bin/WBNetz/wbgui_py?sigle=Adelung). It is
18th c. German, but it should not contain poetic contractions. It was also
the basis of the morphological analyzer developed in TextGrid (Morphisto),
though probably using abridged orthography. Furthermore, the
Goethe dictionary or Grimm may be applicable.
Neither dictionary is directly available in machine-readable form, but you
may contact the Trier Center for Digital Humanities for the XML sources or
just scrape the generated HTML (if that's allowed in your jurisdiction --
for Germany, the new UrhWissG allows you to use up to 75% of a resource *for
your own scientific research* [but to disseminate only up to 15%]).
Suggestion: Use the Adelung for 18th c. German and maybe Grimm for
pre-Duden 19th c. German, write expansion rules for contractions (there
should be very few such rules, mostly e-insertion as in your examples) and
double-check whether your rules hit anything in the dictionary.
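
A minimal sketch of what such expansion rules could look like in Python.
Everything in it is an assumption for illustration: the toy lexicon stands
in for a headword list extracted from Adelung or Grimm, the three rules are
derived only from the examples in the question, and the names (LEXICON,
RULES, candidates, lookup) are hypothetical. It reads "double-check" as
"accept an expanded form only if the dictionary attests it"; the th -> t
rule is an orthographic normalization rather than a contraction rule.

```python
import re

# Hypothetical toy lexicon (lower-cased headwords); in practice this would
# be loaded from the dictionary sources.
LEXICON = {"euer", "abenteuer", "abentheuer"}

# Illustrative rules as (pattern, replacement) pairs, based only on the
# examples in the original question.
RULES = [
    (re.compile(r"'"), "e"),        # Eu'r -> Euer (apostrophe marks elided e)
    (re.compile(r"eur$"), "euer"),  # Eur -> Euer, Abenteur -> Abenteuer
    (re.compile(r"th"), "t"),       # Abentheuer -> Abenteuer (normalization)
]


def candidates(form):
    """Return the lower-cased form plus every form reachable via RULES."""
    start = form.lower()
    seen = {start}
    queue = [start]
    while queue:
        current = queue.pop()
        for pattern, replacement in RULES:
            new = pattern.sub(replacement, current)
            if new not in seen:
                seen.add(new)
                queue.append(new)
    return seen


def lookup(form, lexicon=LEXICON):
    """Return, sorted, the candidate forms that the lexicon attests."""
    return sorted(candidates(form) & lexicon)


if __name__ == "__main__":
    for word in ["Euer", "Eur", "Eu'r", "Abentheu'r", "Abenteur"]:
        print(word, "->", lookup(word))
```

Rules like the second one overgenerate on purpose (a word such as Friseur
would also be expanded), which is why the dictionary check is the step that
decides whether an expansion is kept.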
On .03.2018, at 15:26, Angelika Peljak-Łapińska wrote:
We're currently working on the corpus of 18th-21st century German
translations of 'Othello' (the corpus is accessible with some
purpose-built analysis tools at www.delightedbeauty.org/vvvclosed) and
we encountered a problem while lemmatizing the data. A dictionary found
at the Institut für Deutsche Sprache and the WebLicht tool do not work well
with antiquated and contracted (poetic and vernacular) word forms (e.g.
Euer/Eur/Eu'r or Abentheuer/Abentheu'r/Abenteuer/Abenteu'r/Abenteur).
Does anybody know a dictionary that would contain such old orthographic
PhD student at Swansea University
PS. In case of any specific questions concerning the corpus, please
contact Prof. Tom Cheesman (t.chees...@swansea.ac.uk).