Hi Rafa;

Thanks for the explanations and ideas. I will start writing a proposal for it. Also, is there a mentor who would volunteer for this project? :)

Thanks;
Furkan KAMACI

2014-03-20 15:47 GMT+02:00 Rafa Haro <rh...@apache.org>:

> Hi Furkan,
>
> On 20/03/14 14:10, Furkan KAMACI wrote:
>
>> Hi;
>>
>> If anybody can suggest something to make this issue clearer, it
>> would be nice.
>>
>> Thanks;
>> Furkan KAMACI
>
> Welcome to the Stanbol community :-). As you can check at STANBOL-1294,
> this issue is related to further improvements of the current Topic
> Classification engine in Stanbol. Although there are some clear points of
> improvement (mainly the currently missing features at STANBOL-197), it is
> still a high-level idea that would be nice to discuss in detail here. Some
> of the possible expected new features would be the following:
>
> 1. Different implementations for managing the TrainingSet. In the current
> approach, the training set has to be stored in Solr and the users have to
> configure which fields will be used for training and which fields will be
> used as categories. It would be nice to have an abstract API for managing a
> TrainingSet in Stanbol, independent of the final backend, which could
> be Solr or any other storage system.
>
> 2. Different implementations of the Classifier. The current classifier API
> is also completely coupled with the current implementation, so it
> should be refactored to allow different implementations based on, for
> instance, different frameworks like OpenNLP and Apache Mahout.
>
> 3. Change the current TopicClassification engine to work with the new APIs.
>
> 4. Also, as Rupert pointed out in another email, evaluation support would
> be great.
>
> These are, of course, initial ideas, but we are looking forward to hearing
> more suggestions.
>
> Cheers,
> Rafa
>
>> 2014-03-20 14:51 GMT+02:00 Furkan KAMACI <furkankam...@gmail.com>:
>>
>>> Hi;
>>>
>>> I'm attending a Master's program in Computer Engineering, focused on
>>> Machine Learning and NLP on Big Data, at one of the top universities of
>>> Turkey. I am a Senior Java Developer and the team lead of a big project
>>> which uses Solr. I am also one of the most active people on the Solr
>>> mailing list and one of its moderators.
>>>
>>> I want to work on STANBOL-1294 (Topic Classification Framework for
>>> Stanbol) if I can meet the deadline.
>>>
>>> Thanks;
>>> Furkan KAMACI
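
To make points 1, 2 and 4 of Rafa's list a bit more concrete, here is a minimal Java sketch of what a backend-independent training set and a pluggable classifier might look like. Every name in it (`Example`, `TrainingSet`, `TopicClassifier`, `InMemoryTrainingSet`, `WordOverlapClassifier`, `Evaluation`) is hypothetical and does not exist in Stanbol; the word-overlap scoring is only a toy stand-in for a real implementation backed by OpenNLP, Mahout or Solr.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// All types below are hypothetical illustrations, not Stanbol APIs.

/** One labelled training document. */
final class Example {
    final String text;
    final String topic;
    Example(String text, String topic) { this.text = text; this.topic = topic; }
}

/** Point 1: training data behind an interface, so the backend can be swapped. */
interface TrainingSet {
    void add(Example example);
    List<Example> examples();
}

/** Point 2: a classifier contract that only depends on TrainingSet. */
interface TopicClassifier {
    void train(TrainingSet trainingSet);
    /** Returns the best topic, or null when no training word matches. */
    String classify(String text);
}

/** Simplest possible backend; a Solr-backed one would implement the same interface. */
final class InMemoryTrainingSet implements TrainingSet {
    private final List<Example> store = new ArrayList<>();
    public void add(Example example) { store.add(example); }
    public List<Example> examples() { return Collections.unmodifiableList(store); }
}

/** Toy classifier: scores each topic by how often its training words occur in the text. */
final class WordOverlapClassifier implements TopicClassifier {
    private final Map<String, Map<String, Integer>> countsByTopic = new HashMap<>();

    public void train(TrainingSet trainingSet) {
        countsByTopic.clear();
        for (Example e : trainingSet.examples()) {
            Map<String, Integer> counts =
                countsByTopic.computeIfAbsent(e.topic, k -> new HashMap<>());
            for (String word : tokens(e.text)) counts.merge(word, 1, Integer::sum);
        }
    }

    public String classify(String text) {
        String best = null;
        int bestScore = 0;
        for (Map.Entry<String, Map<String, Integer>> entry : countsByTopic.entrySet()) {
            int score = 0;
            for (String word : tokens(text)) score += entry.getValue().getOrDefault(word, 0);
            if (score > bestScore) { bestScore = score; best = entry.getKey(); }
        }
        return best;
    }

    /** Lower-cases the text and splits on non-word characters, dropping empty tokens. */
    private static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) result.add(t);
        }
        return result;
    }
}

/** Point 4: evaluation as the fraction of held-out examples classified correctly. */
final class Evaluation {
    static double accuracy(TopicClassifier classifier, TrainingSet heldOut) {
        List<Example> examples = heldOut.examples();
        if (examples.isEmpty()) return 0.0;
        int correct = 0;
        for (Example e : examples) {
            if (e.topic.equals(classifier.classify(e.text))) correct++;
        }
        return (double) correct / examples.size();
    }
}
```

Under this assumed design, the engine from point 3 would hold only a `TopicClassifier` and a `TrainingSet` reference, so switching from Solr to another store, or from one learning framework to another, would not touch the engine code.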