Hi Furkan,
On 20/03/14 14:10, Furkan KAMACI wrote:
Hi;
If anybody can suggest something to make this issue clearer, that
would be nice.
Thanks;
Furkan KAMACI
Welcome to the Stanbol community :-). As you can see in STANBOL-1294,
this issue is about further improvements to the current Topic
Classification engine in Stanbol. Although there are some clear points
for improvement (mainly the features currently missing in STANBOL-197), it is
still a high-level idea that would be nice to discuss in detail here.
Some of the expected new features could be the following:
1. Different implementations for managing the TrainingSet. In the
current approach, the training set has to be stored in Solr, and
users have to configure which fields are used for training and which
fields are used as categories. It would be nice to have an abstract
API for managing a TrainingSet in Stanbol, independent of the final
backend, which could be Solr or any other storage system.
2. Different implementations of the Classifier. The current classifier API
is also completely coupled to the current implementation, so it
should be refactored to allow different implementations based on,
for instance, different frameworks such as OpenNLP and Apache Mahout.
3. Change the current TopicClassification engine to work with the new APIs.
4. Also, as Rupert pointed out in another email, evaluation support would
be great.
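To make points 1, 2 and 4 a bit more concrete, here is a rough Java sketch of
what a storage-independent training set and a pluggable classifier could look
like. All names here (TrainingSet, Example, InMemoryTrainingSet,
TopicClassifier, WordOverlapClassifier, Evaluation) are made up for
illustration and are not taken from the current Stanbol code; the word-overlap
classifier is only a toy stand-in for a real OpenNLP or Mahout backend.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One labelled training document (hypothetical type). */
final class Example {
    final String text;
    final String topic;
    Example(String text, String topic) { this.text = text; this.topic = topic; }
}

/** Hypothetical storage-independent training set API (point 1). */
interface TrainingSet {
    void addExample(String text, String topic);
    List<Example> examples();
}

/** Simplest possible backend; a Solr-backed one would implement the same interface. */
class InMemoryTrainingSet implements TrainingSet {
    private final List<Example> examples = new ArrayList<>();
    public void addExample(String text, String topic) { examples.add(new Example(text, topic)); }
    public List<Example> examples() { return examples; }
}

/** Hypothetical classifier API, decoupled from any framework (point 2). */
interface TopicClassifier {
    void train(TrainingSet trainingSet);
    /** Returns the best matching topic, or null if nothing matches. */
    String classify(String text);
}

/** Toy implementation: scores each topic by how often it has seen the input words. */
class WordOverlapClassifier implements TopicClassifier {
    private final Map<String, Map<String, Integer>> countsByTopic = new HashMap<>();

    public void train(TrainingSet trainingSet) {
        for (Example e : trainingSet.examples()) {
            Map<String, Integer> counts =
                    countsByTopic.computeIfAbsent(e.topic, k -> new HashMap<>());
            for (String word : tokenize(e.text)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
    }

    public String classify(String text) {
        String best = null;
        int bestScore = 0;
        for (Map.Entry<String, Map<String, Integer>> entry : countsByTopic.entrySet()) {
            int score = 0;
            for (String word : tokenize(text)) {
                score += entry.getValue().getOrDefault(word, 0);
            }
            if (score > bestScore) {
                best = entry.getKey();
                bestScore = score;
            }
        }
        return best;
    }

    private static String[] tokenize(String text) {
        return text.toLowerCase().split("\\W+");
    }
}

/** Minimal evaluation helper (point 4): fraction of examples classified correctly. */
final class Evaluation {
    static double accuracy(TopicClassifier classifier, TrainingSet labelled) {
        int correct = 0;
        for (Example e : labelled.examples()) {
            if (e.topic.equals(classifier.classify(e.text))) {
                correct++;
            }
        }
        return labelled.examples().isEmpty() ? 0.0 : (double) correct / labelled.examples().size();
    }
}
```

With interfaces along these lines, the engine in point 3 would only depend on
TopicClassifier and TrainingSet, so swapping the backend or the learning
framework would not require touching the engine itself.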
These are, of course, initial ideas, but we are looking forward to hearing
more suggestions.
Cheers,
Rafa
2014-03-20 14:51 GMT+02:00 Furkan KAMACI <furkankam...@gmail.com>:
Hi;
I'm pursuing a Master's degree in Computer Engineering, focusing on Machine
Learning and NLP on Big Data, at one of the top universities in Turkey. I
am a Senior Java Developer and the team lead of a large project that uses
Solr. I am also one of the most active people on the Solr mailing
list and one of its moderators.
I would like to work on STANBOL-1294, Topic Classification Framework for
Stanbol, if I can meet the deadline.
Thanks;
Furkan KAMACI