On Wed, Jul 20, 2011 at 9:56 AM, Fabian Christ <[email protected]> wrote:
> 2011/7/19 Olivier Grisel <[email protected]>:
>> Maybe we could work on extending the DataFileProvider to make the
>> defaultdata provider only provide download URLs from the existing gray
>> licensed opennlp 1.5 models from
>> http://opennlp.sourceforge.net/models-1.5/ and let the
>> DataFileProvider download them from there automatically the first time
>> they are required. The issue then is that every integration tests job
>> will re-download the same data from sourceforge over and over again...
>> That will slowdown the builds / tests and waste bandwith for nothing +
>> add a new way for the builds and test to fail (dependency on the
>> network / sourceforge availability).
>
> I think having the OpenNLP models in our trunk and use them during
> development in incubation is no problem. So we don't need to change
> anything for build and integration tests right now.
>

The models were never in the trunk; they were downloaded using a shell
script. I am currently upgrading this to use the maven-ant-plugin
instead. This will allow data files to be downloaded automatically
during the normal build process.

> I would propose to exclude the models when a release is made. In this
> case the OpenNLP engine has to be prepared to recognize that the
> required model is missing and download it from Sourceforge. If the
> model is not missing as during development in our trunk everything is
> fine.
>

In other words, we would exclude such bundles from the release, forcing
users to

* check out /data and build them locally, or
* download those bundles from a Maven repository.

But if we also want to release the launchers, then we would need to
include those bundles. Otherwise the launcher would not work out of the
box, which is very important for adoption.

I would assume that a normal user would double-click the jar, open the
browser, and paste some text to /engines. With the models missing, all
he would get is "Invalid query" :(

We cannot expect him to go to the Felix Web Console, open the
DataFileProvider tab, look at the list of missing files, download them
from SourceForge, and copy them to the /datafiles directory.

best
Rupert

--
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11             ++43-699-11108907
| A-5500 Bischofshofen
