Hi, I'm new to Solr and I want to know whether it's the right choice for my problem. I need to index PDF documents stored on the filesystem and run queries over them. So I used Solr with SolrJ and the ExtractingRequestHandler, and it all works, but I'm not interested in indexing the PDF metadata, only the text content of the document. I saw that the content is indexed as a whole in a single field ("attr_content" in my case), but what I want is to index the individual pieces of information that are inside that content as separate fields.
For example: I have a PDF document that contains an invoice. I need to extract and index the information about the recipient, price, sold items, item descriptions, and so on. Is Solr the right choice for this purpose, or do I need to use another framework in addition, before posting the document to Solr? Thanks in advance.
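
P.S. To make clearer what I mean by "fields inside the content", here is a rough sketch of the kind of pre-processing I imagine doing on the extracted text before posting to Solr. It is only an illustration under assumptions: the class name, the field names ("recipient", "total") and the "Recipient:" / "Total:" line layout are all hypothetical, and real invoices will certainly need different patterns.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls a few named values out of plain invoice text.
public class InvoiceFieldExtractor {

    // Assumed layout: one "Recipient: <name>" line somewhere in the text.
    private static final Pattern RECIPIENT =
            Pattern.compile("(?m)^Recipient:\\s*(.+)$");

    // Assumed layout: one "Total: <number>" line, with optional decimals.
    private static final Pattern TOTAL =
            Pattern.compile("(?m)^Total:\\s*([0-9]+(?:\\.[0-9]{1,2})?)\\s*$");

    // Returns only the fields that were actually found in the text.
    public static Map<String, String> extract(String text) {
        Map<String, String> fields = new LinkedHashMap<>();

        Matcher m = RECIPIENT.matcher(text);
        if (m.find()) {
            fields.put("recipient", m.group(1).trim());
        }

        m = TOTAL.matcher(text);
        if (m.find()) {
            fields.put("total", m.group(1));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Stand-in for the text that would come out of the PDF.
        String text = "Invoice 17\nRecipient: ACME Ltd\nItem: Widget\nTotal: 42.50\n";
        System.out.println(extract(text));
    }
}
```

Each entry of the resulting map would then be added to a SolrInputDocument with addField(name, value) next to the full text, assuming matching fields exist in the schema. What I don't know is whether Solr can do this step for me or whether it has to happen outside, as above.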