Hi Rob, *,

On Tue, Nov 8, 2011 at 1:58 AM, Rob Weir <[email protected]> wrote:
> On Mon, Nov 7, 2011 at 7:29 PM, Christian Lohmaier <[email protected]> wrote:
>> On Tue, Nov 8, 2011 at 12:34 AM, Rob Weir <[email protected]> wrote:
>> [...]
>> Why don't you just admit that you have absolutely no clue about how
>> the dictionaries (or for that matter hunspell/affix compression as a
>> whole) works?
>
> Actually, I know quite a lot about spell checking and dictionaries.
> And copyright. How a work is created is totally irrelevant. A
> painting is not copyrightable because of what paint the artist uses or
> how they hold the brush.

Yes, and nobody claims that the words a dictionary represents, or
whatever fragment of grammar it entails, would be copyrightable.

> Spell checking dictionaries are just compilations of facts

As is any other software. Following that logic, you could not put a
copyright on *anything*, as the laws of physics or math are the same
for everyone.

> that are
> constrained by the preexisting external facts of the language. The
> compiler of the dictionary does not create these facts.

No computer dictionary in the world is a perfect representation of the
"facts" that make up a language. There is no dictionary with 100%
accuracy. There is no way to take a dictionary and reverse-engineer the
language from it. Unmunge a hunspell dictionary, especially one with
compounds enabled, and you will get gigabyte after gigabyte of "valid"
words (valid by the rules of the dictionary, not by the rules of the
language). Claiming that a dictionary represents external *facts* of
the language just doesn't make any sense.

> He merely
> encodes them.

No, this is not true. If it were merely an encoding of the facts, you
would end up with a perfect dictionary. But which affix transformations
are created depends on the creator of the dictionary, on the stems that
are included in the dictionary, and on the level of accuracy that is
targeted. The existing affix rules affect other rules in a complex way.
These are not just "outside facts".

> The particular dictionary might be copyrightable as a
> specific selection, coordination and arrangement of these facts, but
> fair use would allow me to extract the same facts from the
> dictionaries, via reverse engineering, and make my own selection,
> coordination and arrangement of these same facts and distribute them
> as my own dictionary.

Of course you are free to create your own dictionary. But once again
your conclusion is silly to the point where I cannot take you seriously.
Your logic really means that you cannot copyright any kind of software,
because anyone is still able to write their own copy that does the
same: the fundamental math that makes up software is the same, and you
are just rearranging some keywords, the fundamental facts of the
programming language. This is stupid. While it is true that you can
rewrite software to do the same, and copyright does not hinder you from
doing so (as Tor implied, there might be other means, like patents or
other things that have nothing to do with copyright), you all still put
copyright statements in source code.

> In other words, you might be able to protect
> the compilation of facts, but you cannot protect the underlying facts,

Yes, with that I agree. (Everyone does, I guess.) Except for the
"compilation of facts" part. It is not a compilation of facts. It is
"guesswork", closing the gaps to the actual facts. You might be able to
do a "just a compilation of facts" style dictionary for an artificial
language, but not for a language that people are actually using in real
life.

> or prevent people from copying your encoding of these facts and
> distributing a different arrangement of them.

Here I (and others) strongly disagree with you. Copying the encoding of
the facts and just altering it is no different from taking the source
code of any software, putting your name tag on it and shuffling things
around. The important part here is *different arrangement*.

Once again: You are free to (attempt to) create a dictionary for the
same language by yourself. Language itself of course is not protected.
But you will not end up with the same "encoding" (approximation) of the
language, since it is not just a matter of collecting facts. It is a
creative process.

And once again I challenge your knowledge about the dictionaries. I just
cannot explain otherwise how you can claim it is just a collection of
facts with no creative effort behind it.

Once again: Following your path of thought, you could not put a
copyright on any software, as the rules of math are the same for
everyone, and you're just applying those rules to create the same
result. And this is nonsense. Copyright is not there to prevent others
from creating stuff that does or behaves the same. Copyright covers the
actual way it is done; it applies to the concrete solution to the given
problem.

> This should not be hard to understand. Free software advocates argue
> all the time that software cannot be patented because it is "just
> math". Books have been written about it. So why is it so hard to
> understand that linguistic facts cannot be copyrighted?

Because you don't understand that a dictionary doesn't represent
linguistic facts. There is no such thing as a linguistic fact. If there
were, you could create a perfect dictionary. There is an approximation
at best. A computer dictionary is not a list of words, and it is not a
list of hard facts.

> In any case, these are important concepts to understand. If it is not
> clearer, after reading this response, then try doing a Google query on
> terms such as copyright, compilation and facts.

No, it is pointless to search for those terms when the basic assumption
that a dictionary is a mere compilation of facts is already wrong.

ciao
Christian
