For my personal use, I would like to acquire electronic dictionaries, principally for the major European languages, with the following characteristics:
- reputable source
- "raw" datafiles accessible: I appreciate the interfaces that dictionary vendors may provide, but I want to be able to write my own code to find the data I am looking for
- the wordlist is the principal aspect; I can live without definitions
- "markup" about the structure of words, for things like hyphenation, etc. (or from which hyphenation can be derived)
- some form of frequency count would be nice

For example, I'd like to compute something like: "the average French character occupies x bytes in UTF-8", with average defined in sync with the frequency count. And I'd like to compute things like spelling changes introduced by hyphenation in Dutch.

Any pointers?

Thanks,
Eric.
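
P.S. To make the UTF-8 example concrete, here is a minimal sketch of the kind of computation I have in mind. It assumes the dictionary can be loaded as a mapping from word to frequency count; the function name and the three sample entries are made up purely for illustration and are not real frequency data. Characters are counted as Unicode code points after NFC normalization, which is itself an assumption about how the data would be stored.

```python
import unicodedata


def average_utf8_bytes_per_char(word_freqs):
    """Frequency-weighted average number of UTF-8 bytes per character.

    word_freqs: mapping of word -> frequency count (hypothetical format).
    """
    total_bytes = 0
    total_chars = 0
    for word, freq in word_freqs.items():
        # Normalize so that an accented letter counts as one code point.
        word = unicodedata.normalize("NFC", word)
        total_bytes += freq * len(word.encode("utf-8"))
        total_chars += freq * len(word)
    if total_chars == 0:
        raise ValueError("word list is empty or all frequencies are zero")
    return total_bytes / total_chars


# Illustrative counts only, not taken from any real corpus.
sample = {"été": 3, "le": 10, "où": 2}
print(average_utf8_bytes_per_char(sample))
```

With a real wordlist and real counts, the same loop would give the "x" in the sentence above.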

