Hello, we need to categorize our documents into 80 sectors. The documents are resumes/CVs.
We have many documents (more than 30k), but there is a problem. Should we try to extract the job positions from each resume and categorize those, or can we just feed in the entire document and assign it to one or more categories (max 3 categories)?

I think there is a lot of noisy data that could give us many false positives if we use the entire document, for example the personal data, hobbies, etc. BUT I also know that extracting every job position from all the documents would take years! Can anyone suggest a workaround?

Thank you so much!
Damiano