Hi Mohammad,

Your question may be better suited to the OpenNLP mailing list, but I can try to help. First, you need to clarify whether you want to build a document classifier or an enhancer, because a document classifier may not really fit what an enhancement means in Stanbol.
If you want to build a custom "concept" or named entity recognition engine, you have a few options. Perhaps the easiest is to train a custom OpenNLP NER model and then integrate it into an engine in Stanbol. You can follow the OpenNLP documentation for that [1]. You will need custom training data for your problem domain.

On the other hand, if you have your own dataset or vocabulary and want to link surface forms or concept mentions in text to that dataset, you should create an EntityHub site for your data and configure a new Entity Linking engine. For that, there is a quite helpful guide on the Stanbol website [2].

I hope these two links are useful for your first steps.

Cheers

[1] http://opennlp.apache.org/documentation/1.5.3/manual/opennlp.html#tools.namefind
[2] https://stanbol.apache.org/docs/trunk/customvocabulary.html

On Thursday, 30 May 2013, Mohammad Benslimne wrote:

> Hello folks,
>
> I am developing a document classifier/extractor for my undergraduate
> project.
> I would like to use your tools, especially the OpenNLP Custom NER Model
> extraction engine, to define what kind of data to extract.
> Can you please give me examples of how to make it work?
> How can I make my own name finder models and type mapping?
>
> Thanks in advance for your precious hints
>
> Regards,
> Med