Hi Suman,

[2] describes it very well.
As far as I understand, Sphinx4 uses JAR files as containers for passing in the multiple required configuration files. If this is correct, you should not try to add those files to the classpath (e.g. by adding a new dependency), but rather allow users to simply copy them to the `stanbol/datafiles` folder.

The DataFileProvider allows you to look up resources by their name. There is also a DataFileTracker that can be used to track files. The tracker will give you a callback as soon as a resource becomes available.

So all you need to do is implement a service that allows components to request an "Acoustic Model", "Dictionary file" or "Language model file" by its name and that provides the loaded models as Java objects. The names need to be provided by the requesting component (the Enhancement Engine). You should define default naming templates (convention over configuration).

The OpenNLP service [3] does exactly this for all the OpenNLP engines, so I guess it is the best place to look.

best
Rupert

> [1] https://sites.google.com/site/gsoc2014stanbol/home/abstract
> [2] http://stanbol.apache.org/docs/trunk/utils/datafileprovider
[3] http://svn.apache.org/repos/asf/stanbol/trunk/commons/opennlp/src/main/java/org/apache/stanbol/commons/opennlp/OpenNLP.java

On Wed, May 28, 2014 at 12:05 AM, Suman Saurabh <ss.sumansaurab...@gmail.com> wrote:
> Hi Rupert, All
>
> I am building a Speech To Text Engine (see [1] for those who need an introduction).
> The engine requires the DataFileProvider infrastructure for handling the
> configuration files of the acoustic and language models. Basically, the client
> provides the *Acoustic Model* *folder*, the *Dictionary file* and the *Language
> model file* in a JAR file in the following format,
> e.g.
> sphinx4-data-1.0-SNAPSHOT.jar is the default model file; it contains
> /edu/cmu/sphinx/models/language/en-us.lm.dmp *File* for language model
> /edu/cmu/sphinx/models/acoustic/wsj/dict/cmudict.0.6d *File* for dictionary
> /edu/cmu/sphinx/models/acoustic/wsj/ *Folder* for acoustic model
>
> This JAR can be added to the project using the following dependency:
> <dependency>
>   <groupId>edu.cmu.sphinx</groupId>
>   <artifactId>sphinx4-data</artifactId>
>   <version>1.0-SNAPSHOT</version>
> </dependency>
>
> But when a client wants to use his own model files, Stanbol
> has the DataFileProvider infrastructure for handling such big binary
> configuration files.
>
> I went through the documentation of DataFileProvider [2] and the source code
> of some enhancement engines that use the DataFileProvider service, such as the
> Sentiment Word Classifier, to see how it is used,
> but I am not yet clear on how to use it.
>
> Maybe you can provide some *insights* or *links* that describe it better.
> It would save a lot of time.
>
> Regards,
> Suman Saurabh
>
> [1] https://sites.google.com/site/gsoc2014stanbol/home/abstract
> [2] http://stanbol.apache.org/docs/trunk/utils/datafileprovider

--
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11             ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/
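
The service suggested in the reply (request a model by name, resolve the name through default naming templates, return the loaded model) might look roughly like the sketch below. Everything in it is an assumption made for illustration: `ResourceLookup` is a hypothetical stand-in for a by-name lookup and is not the Stanbol DataFileProvider API, the naming templates other than `en-us.lm.dmp` are invented, and plain byte arrays stand in for the loaded Sphinx4 model objects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SphinxModelSketch {

    /** Hypothetical stand-in for "look up a resource by its name"; not the Stanbol API. */
    public interface ResourceLookup {
        /** Returns a stream for the named file, or null if the file is not available. */
        InputStream open(String fileName) throws IOException;
    }

    /** Model kinds with assumed default naming templates (convention over configuration). */
    public enum ModelType {
        LANGUAGE("%s.lm.dmp"),        // matches en-us.lm.dmp from the sphinx4-data JAR
        DICTIONARY("%s.dict"),        // hypothetical template
        ACOUSTIC("%s.acoustic.zip");  // hypothetical: acoustic model folder packed as one file

        private final String template;

        ModelType(String template) {
            this.template = template;
        }

        public String defaultFileName(String language) {
            return String.format(template, language);
        }
    }

    /** Loads models by name and caches them; byte[] stands in for a real model object. */
    public static class ModelService {
        private final ResourceLookup lookup;
        private final Map<String, byte[]> cache = new ConcurrentHashMap<>();

        public ModelService(ResourceLookup lookup) {
            this.lookup = lookup;
        }

        /** Resolves the file name from the default template for the given language. */
        public byte[] getModel(ModelType type, String language) {
            return getModel(type.defaultFileName(language));
        }

        /** Loads the named file, or returns null if the lookup does not know it. */
        public byte[] getModel(String fileName) {
            byte[] cached = cache.get(fileName);
            if (cached != null) {
                return cached;
            }
            try (InputStream in = lookup.open(fileName)) {
                if (in == null) {
                    return null; // not (yet) copied to the data file folder
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[8192];
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                byte[] data = out.toByteArray();
                cache.put(fileName, data);
                return data;
            } catch (IOException e) {
                throw new UncheckedIOException("Unable to load " + fileName, e);
            }
        }
    }

    /** In-memory lookup used only to make the sketch runnable without Stanbol. */
    public static ResourceLookup inMemory(Map<String, byte[]> files) {
        return name -> {
            byte[] content = files.get(name);
            return content == null ? null : new ByteArrayInputStream(content);
        };
    }

    public static void main(String[] args) {
        ModelService service = new ModelService(
                inMemory(Collections.singletonMap("en-us.lm.dmp", new byte[] {1, 2, 3})));
        System.out.println(service.getModel(ModelType.LANGUAGE, "en-us").length); // prints 3
        System.out.println(service.getModel("missing.bin"));                      // prints null
    }
}
```

In an actual engine the in-memory lookup would be replaced by whatever the DataFileProvider documentation [2] prescribes, and the requesting Enhancement Engine could pass an explicit file name whenever the default template does not fit.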