On Wed, May 27, 2009 at 7:36 AM, Francesco Poli <f...@firenze.linux.it> wrote:
> I think that in the case of machine learning models, source form is
> even more clearly distinct from compiled object.
> We can consider an artificial neural network, for instance (Mathieu,
> correct me if it's a wrong example).
> I am under the impression that basically nobody would change connection
> weights by hand, in order to modify a neural network.

Yes, the connection weights of an artificial neural network are a good
example of the parameters I was talking about. In practice, nobody would
change a connection weight by hand because it's impossible to predict
the effect of this particular weight on the overall performance of the
model. Training algorithms are mostly clever ways to find a good model
without trying the infinity of parameter combinations.

So in practice, yes, a model without the original data would be barely
useful for further work on that model. In that regard, the original data
AND the program used to train the model (this includes the
implementations and the options passed to the algorithm) can be seen as
the only real source.

But then again, I could pretend that I just happened to find the model
parameters by hand. After all, a model is just a big set of numbers. Who
could tell what data I used to train my model?

Due to the lack of quality free data, it's quite tempting to use
non-free data in order to create free models. However, this is not good
in the long term, since it makes the model dependent on whoever holds
the data.

I mentioned Voxforge in my previous email. Their goal is to use their
free speech data to train models with HTK and use the models with
Julius. You can get the source code of HTK after registration on their
website, but the license has severe restrictions, so HTK is not free
software. Julius is a free software speech recognition engine that can
use models trained with HTK. Note that HTK is pretty much THE speech
recognition framework in the speech recognition community.

If you consider that the ultimate source of a model is not only the data
but also the software used to train it, then Voxforge models built with
HTK can't be free, even though the data are free. Is it forbidden for
someone to release an image made with Photoshop as free?

Regarding Debian packaging, I think it's a wise decision to rebuild the
model whenever the data and the training program are free, the data is
not too large and the computation not too long. Should objective
criteria for what is too large and what is too long be decided, or
should that be left to the DD?

A remaining question is then what to do with models for which we don't
have the original data or the original training program?

Thank you,
Mathieu