On Sat, 23 Jan 2010 at 11:27 +0100, Jeff Allen wrote:
> All,
>
> Can we please end this discussion about nitpicking over licensing and
> potential lawsuits.

Choosing the right licence is important.

> As Bob and Alon have clearly said, this licensing statement has been used
> for over a decade with other CMU published stuff. Did anything negative
> ever come of that? If nothing negative, then case is closed.

"Because nothing happened in the past, means nothing will happen in the
future" doesn't quite gel. There is a nice test in the DFSG FAQ called the
"Tentacles of Evil" test (see 9c, http://people.debian.org/~bap/dfsg-faq.html)
that gives an idea of this. It is better to be sure now than to wait until
it is too late, the work has been used, and it cannot be distributed, or
worse, it has to be removed from distribution.

> As Bob said, we just want credit to be given where credit is due. I work
> outside of this industry and had the guts (in the midst of an already
> stressed out software dev cycle) to approach my boss a week ago with a
> proposal to participate in this special Haiti Disaster Relief project, and
> to see how the company would let me participate. So just give credit to
> the participants and their affiliations. Create a little "About" menu item
> in your tools with a link off to the content providers.

I'm pretty sure all free software licences give credit where credit is due.
In our project (GPL), in the source for each of the translators, we have an
AUTHORS file, which lists authors, and if necessary, an ACKNOWLEDGEMENTS
file, which lists people who have contributed. You may appreciate that while
we may make a nuisance of ourselves about choosing the right licence, we are
fiercely honest about giving credit where credit is due and acknowledging
other people's work.

> Last night was the first time in a week that I've slept more than 2 or 3
> hours. And we are working on unarchiving, verifying, validating more CMU
> content. This takes time to do, and I am thankful to my employer for
> allowing me to participate.
>
> I've been receiving so many emails and calls for a week with people
> screaming to have Haitian Creole data. Who else has been willing to give
> such content to the community at large?
>
> We were able to deliver to Doctors without Borders in Haiti the list of
> 1600 sentences that were translated in 1 day by an experienced translation
> company. Watch the news. These sentences are needed NOW because in a few
> days, the value of that small corpus would have dropped to null if the
> people who could benefit from the communication are no longer living.
>
> And I'm right now answering emails to 2 separate Haitian Creole content
> providers who did not give me their data 12 years ago, and now might need
> to arrange some confcalls. These are relationships that took years to
> develop and maintain. It's about who you know and how much they trust you.
> And there is no magic wand to accelerate this, even in a crisis situation.
>
> The statements in this discussion thread can undermine and destroy all
> efforts at present to get others to also make their content available, in
> whatever way and conditions that they are willing to do so.

Releasing data under the wrong licence or not under a clear licence may
effectively be the same as not releasing it at all.

> So if you want to potentially see more data from orgs other than CMU, then
> it would be more constructive to send the CMU guys (Bob as the Haitian
> Creole Project PI, Alon as the AMTA president) and me (as participant now +
> industry advisor to both the LINGUIST-list and also to MultiLingual
> Computing and Technology) your statements of support of the CMU
> initiative. Your letters of support (email is OK) for the continued release
> of Haitian Creole data will be one way to make things possible.
>
> And spread the word all over the internet (every blog, every discussion
> forum, every newsletter, every news list) that this initiative by CMU is a
> first and essential step to make language data available for building
> language technologies upon it.
> And encouraging others to do the same with their data, as they are able to
> do so.

Encouraging people to make their data freely available under an appropriate
licence takes up a good proportion of my time. Getting some resources
released under standard licences has taken back-and-forth emails over two
years or more -- so you can appreciate that I'm impressed by how efficiently
you are releasing yours.

> Please try to grasp and feel that moment in time of 10 days back in March
> 1998 where Haitian colleagues and I were speaking in Creole with 200
> students at the Universite d'Haiti with instructions on how to do the
> recording sessions. And the final day with my seminar (in Creole) to a full
> room of Haitians about the reason, the need and the ways of doing text and
> speech data collection, and what it could do to develop a speech-to-speech
> MT system. I was inventing terminology in Haitian Creole that had never
> existed. (BTW, that seminar is sitting on a VHS cassette and I need to get
> the conversion kit and find the time to convert it to DV format and make it
> available to all.)
>
> And then a week later when we built the system and were creating the demos
> along with our Haitian colleagues, and to see the incredible excitement in
> their eyes and the emotion in their voices. And at that moment in time, I
> told them that no one in the world could ever again say that Haitian Creole
> was a substandard, broken patois. Because you can't creole a
> speech-to-speech MT system to convert a "non-language" to a language. You
> can only match two entities which are at the same level of the hierarchy: a
> real language to a real language.
>
> And then as I sat there a week ago watching the TV and was nearly in tears
> as 10 years of my life (half of it only in my free time, late nights, while
> family members were already sleeping) had been spent working with people of
> several different French-based Creoles, helping promote the idea of
> elevating the value of their mother tongue from the oppressed, substandard,
> broken-patois to the right to call it a "language". All of that in vain if
> thousands of people would suffer and die today because the system is
> sitting in a box, and the data is archived on various media. Something had
> to be done to help to resolve immediate needs, short-term needs, and
> longer-term needs. And in one week, we have started to release the data
> that has been carefully checked. It's not like the crap content that you
> will find on internet sites. We had already done the triage and clean-up of
> that over a decade ago.

Jeff, I'm not sure if describing other people's work as "crap" is quite
right. I'm not sure if you got any money for working on this, but I'm sure
the people who are releasing stuff on the web are doing the best they can
with limited resources. Maybe when your work is released, they'll be able to
make use of it, and everyone will win.

Best,

Fran
_______________________________________________
Mt-list mailing list
