Hello everyone! I have a question. Maybe it is a silly one, but I don't know how to handle it. I need to build a classifier for CVs. To do this, I assume I need to build a model file from a set of skills. I have a list of skills, but I don't know how to build the input file. Here is a sample of my input file:
Tiles and clinkers, setting experience Tile layer . Silk screen printing Lead typesetter, printing shop . CTI, computer telephony Alarm operator . GifBuilder animation program Specialist book writer . Gardening, study circle leadership Sports centre manager . ........ etc.

The first part, up to the next capital letter, is the skill name and the second part is the job name. For example, in "Gardening, study circle leadership Sports centre manager", "Gardening, study circle leadership" is the skill name and "Sports centre manager" is the job name.

To create the actual model file I use the following command:

    opennlp DoccatTrainer -encoding UTF-8 -lang en -data /tmp/jobs.txt -model /tmp/en-language-jobs.bin

Now, my question is whether the input file I am passing to the above command has the right format. Please also note that I was able to create the model file, but when I run the command

    opennlp Doccat /tmp/en-language-jobs.bin < /tmp/programmer.txt

the results are 100% irrelevant.

Best regards,
Florin
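P.S. As far as I understand the OpenNLP manual, DoccatTrainer expects one sample per line, with the category as the first whitespace-delimited token followed by the text. In case it helps to discuss the format, here is a minimal Python sketch of how my sample could be reshaped that way. It assumes the job name is the category and the skill is the text, that records are separated by a standalone " . ", and that the job name starts at the first capitalised word after the first word. The function names are my own, not part of OpenNLP.

```python
import re

# Sample copied from the post; the real file is longer.
SAMPLE = (
    "Tiles and clinkers, setting experience Tile layer . "
    "Silk screen printing Lead typesetter, printing shop . "
    "CTI, computer telephony Alarm operator . "
    "GifBuilder animation program Specialist book writer . "
    "Gardening, study circle leadership Sports centre manager ."
)


def split_record(record):
    """Split 'skill Job name' at the first capitalised word after the first word."""
    words = record.split()
    for i in range(1, len(words)):
        if words[i][0].isupper():
            return " ".join(words[:i]), " ".join(words[i:])
    raise ValueError(f"no job name found in: {record!r}")


def to_doccat_lines(raw):
    """Return lines of the form '<Category_without_spaces> <text>'."""
    lines = []
    # Records are assumed to be separated by a dot surrounded by whitespace.
    for record in re.split(r"\s+\.(?:\s+|$)", raw):
        record = record.strip()
        if not record:
            continue
        skill, job = split_record(record)
        # The category must be a single token, so join its words with underscores.
        category = re.sub(r"\W+", "_", job).strip("_")
        lines.append(f"{category} {skill}")
    return lines


if __name__ == "__main__":
    for line in to_doccat_lines(SAMPLE):
        print(line)
```

With this, the first record would become "Tile_layer Tiles and clinkers, setting experience". I am not sure this is the intended layout, so please correct me if the category and the text should be the other way round.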
