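The extracted content goes into the text field, which is not stored, so it does not show up in the results. You can make it stored, but the output will not be pretty: PDF is not a linear storage format, so the extracted text often does not come out in reading order.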
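If you do want to try storing it, here is a minimal sketch of how the change could be prepared through the Schema API's replace-field command. The field name `_text_`, the type `text_general`, the collection name `mycollection`, and the URL are all assumptions on my part, so check your own schema first. Also note that replace-field replaces the whole field definition, so pass along any other attributes your field already has.

```python
import json
import urllib.request


def build_replace_field_payload(name, field_type, **attrs):
    """Build a Schema API command that redefines a field as indexed and stored.

    Extra keyword arguments (for example multiValued=True) are copied into the
    definition, because replace-field expects the full field definition.
    """
    definition = {"name": name, "type": field_type, "indexed": True, "stored": True}
    definition.update(attrs)
    return {"replace-field": definition}


def build_schema_request(solr_url, collection, payload):
    """Prepare (but do not send) a POST to the collection's schema endpoint."""
    url = f"{solr_url.rstrip('/')}/{collection}/schema"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed names, not taken from the original setup: adjust to your own schema.
payload = build_replace_field_payload("_text_", "text_general", multiValued=True)
request = build_schema_request("http://localhost:8983/solr", "mycollection", payload)
# To apply the change: urllib.request.urlopen(request)
```

Stored values are only captured at index time, so the PDFs have to be posted again after the schema change before the content appears in the JSON.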
Regards,
   Alex

On 14 Sep 2016 5:16 AM, "Alexandre Martins" <alexandremart...@gmail.com> wrote:

> Hi Guys,
>
> I'm trying to use the latest version of Solr. I used the post tool to
> upload 28 PDF files and it works fine. However, I don't know how to show
> the content of the files in the resulting JSON. Does anybody know how to
> include this field?
>
> "responseHeader":{
>   "zkConnected":true,
>   "status":0,
>   "QTime":43,
>   "params":{
>     "q":"ABC",
>     "indent":"on",
>     "wt":"json",
>     "_":"1473804420750"}},
> "response":{"numFound":40,"start":0,"maxScore":9.1066065,"docs":[
>   {
>     "id":"/home/alexandre/desenvolvimento/workspace/solr-6.2.0/pdfs_hack/abc.pdf",
>     "date":["2016-09-13T14:44:17Z"],
>     "pdf_pdfversion":[1.5],
>     "xmp_creatortool":["PDFCreator Version 1.7.3"],
>     "stream_content_type":["application/pdf"],
>     "access_permission_modify_annotations":[false],
>     "access_permission_can_print_degraded":[false],
>     "dc_creator":["abc"],
>     "dcterms_created":["2016-09-13T14:44:17Z"],
>     "last_modified":["2016-09-13T14:44:17Z"],
>     "dcterms_modified":["2016-09-13T14:44:17Z"],
>     "dc_format":["application/pdf; version=1.5"],
>     "title":["ABC tittle"],
>     "xmpmm_documentid":["uuid:100ccff2-7c1c-11e6-0000-ab7b62fc46ae"],
>     "last_save_date":["2016-09-13T14:44:17Z"],
>     "access_permission_fill_in_form":[false],
>     "meta_save_date":["2016-09-13T14:44:17Z"],
>     "pdf_encrypted":[false],
>     "dc_title":["Tittle abc"],
>     "modified":["2016-09-13T14:44:17Z"],
>     "content_type":["application/pdf"],
>     "stream_size":[101948],
>     "x_parsed_by":[
>       "org.apache.tika.parser.DefaultParser",
>       "org.apache.tika.parser.pdf.PDFParser"],
>     "creator":["mauricio.tostes"],
>     "meta_author":["mauricio.tostes"],
>     "meta_creation_date":["2016-09-13T14:44:17Z"],
>     "created":["Tue Sep 13 14:44:17 UTC 2016"],
>     "access_permission_extract_for_accessibility":[false],
>     "access_permission_assemble_document":[false],
>     "xmptpg_npages":[3],
>     "creation_date":["2016-09-13T14:44:17Z"],
>     "resourcename":[
>       "/home/alexandre/desenvolvimento/workspace/solr-6.2.0/pdfs_hack/abc.pdf"],
>     "access_permission_extract_content":[false],
>     "access_permission_can_print":[false],
>     "author":["abc.add"],
>     "producer":["GPL Ghostscript 9.10"],
>     "access_permission_can_modify":[false],
>     "_version_":1545395897488113664},
>
> Alexandre Costa Martins
> DATAPREV - IT Analyst
> Software Reuse Researcher
> MSc Federal University of Pernambuco
> RiSE Member - http://www.rise.com.br
> Sun Certified Programmer for Java 5.0 (SCPJ5.0)
>
> MSN: xandecmart...@hotmail.com
> GTalk: alexandremart...@gmail.com
> Skype: xandecmartins
> Mobile: +55 (85) 9626-3631