Hello everybody! Could someone explain why I should split my documents into separate sentences to train my models? My documents are resumes/CVs, and their sentences can look very different from ordinary prose. For example, a "sentence" could also be:
1. Name: John
2. Surname: Travolta
etc.

So my question is: what is the problem if I train my models (namefinder, tokenizer) with one complete resume/CV per line? Could it be a problem? In that case, when I want to tokenize a resume and run NER on it, I would simply pass the complete resume text, skipping the sentence detection step.

Thanks in advance for your opinion!

Best,
Damiano
