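The extracted content goes into the text field, which is indexed but not
stored, so you can search it but it is not returned in the results. You can
make the field stored, but the output really will not be pretty: PDF is not
a linear storage format, so the extracted text rarely reads cleanly.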
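
If you want to try it anyway, here is a rough sketch of marking the field as stored through the Schema API's replace-field command. The field name `_text_`, the type `text_general`, the base URL and the collection name are all assumptions about your setup, not something I can see from your output, so check your own schema first. The helper names are made up for illustration.

```python
import json
import urllib.request


def build_replace_field_request(base_url, collection,
                                field="_text_", field_type="text_general"):
    """Build (but do not send) a Schema API request that marks a field stored.

    The field name and type defaults are assumptions; adjust them to
    whatever your schema actually defines.
    """
    payload = {
        "replace-field": {
            "name": field,
            "type": field_type,
            "indexed": True,
            "stored": True,       # the only change we actually care about
            "multiValued": True,
        }
    }
    url = f"{base_url}/{collection}/schema"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request to a running Solr and return the response body."""
    with urllib.request.urlopen(request) as resp:
        return resp.read().decode("utf-8")


# Example (hypothetical URL and collection name, needs a running Solr):
# print(send(build_replace_field_request("http://localhost:8983/solr", "mycollection")))
```

Documents that are already indexed will not pick up the stored value; you would have to post the PDFs again after changing the schema.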

Regards,
    Alex

On 14 Sep 2016 5:16 AM, "Alexandre Martins" <alexandremart...@gmail.com>
wrote:

> Hi Guys,
>
> I'm trying to use the latest version of Solr, and I used the post tool to
> upload 28 PDF files, which works fine. However, I don't know how to show
> the content of the files in the resulting JSON. Does anybody know how to
> include this field?
>
> "responseHeader":{
>   "zkConnected":true,
>   "status":0,
>   "QTime":43,
>   "params":{
>     "q":"ABC",
>     "indent":"on",
>     "wt":"json",
>     "_":"1473804420750"}},
> "response":{"numFound":40,"start":0,"maxScore":9.1066065,"docs":[
>   {
>     "id":"/home/alexandre/desenvolvimento/workspace/solr-6.2.0/pdfs_hack/abc.pdf",
>     "date":["2016-09-13T14:44:17Z"],
>     "pdf_pdfversion":[1.5],
>     "xmp_creatortool":["PDFCreator Version 1.7.3"],
>     "stream_content_type":["application/pdf"],
>     "access_permission_modify_annotations":[false],
>     "access_permission_can_print_degraded":[false],
>     "dc_creator":["abc"],
>     "dcterms_created":["2016-09-13T14:44:17Z"],
>     "last_modified":["2016-09-13T14:44:17Z"],
>     "dcterms_modified":["2016-09-13T14:44:17Z"],
>     "dc_format":["application/pdf; version=1.5"],
>     "title":["ABC tittle"],
>     "xmpmm_documentid":["uuid:100ccff2-7c1c-11e6-0000-ab7b62fc46ae"],
>     "last_save_date":["2016-09-13T14:44:17Z"],
>     "access_permission_fill_in_form":[false],
>     "meta_save_date":["2016-09-13T14:44:17Z"],
>     "pdf_encrypted":[false],
>     "dc_title":["Tittle abc"],
>     "modified":["2016-09-13T14:44:17Z"],
>     "content_type":["application/pdf"],
>     "stream_size":[101948],
>     "x_parsed_by":[
>       "org.apache.tika.parser.DefaultParser",
>       "org.apache.tika.parser.pdf.PDFParser"],
>     "creator":["mauricio.tostes"],
>     "meta_author":["mauricio.tostes"],
>     "meta_creation_date":["2016-09-13T14:44:17Z"],
>     "created":["Tue Sep 13 14:44:17 UTC 2016"],
>     "access_permission_extract_for_accessibility":[false],
>     "access_permission_assemble_document":[false],
>     "xmptpg_npages":[3],
>     "creation_date":["2016-09-13T14:44:17Z"],
>     "resourcename":["/home/alexandre/desenvolvimento/workspace/solr-6.2.0/pdfs_hack/abc.pdf"],
>     "access_permission_extract_content":[false],
>     "access_permission_can_print":[false],
>     "author":["abc.add"],
>     "producer":["GPL Ghostscript 9.10"],
>     "access_permission_can_modify":[false],
>     "_version_":1545395897488113664},
>
> Alexandre Costa Martins
> DATAPREV - IT Analyst
> Software Reuse Researcher
> MSc Federal University of Pernambuco
> RiSE Member - http://www.rise.com.br
> Sun Certified Programmer for Java 5.0 (SCPJ5.0)
>
> MSN: xandecmart...@hotmail.com
> GTalk: alexandremart...@gmail.com
> Skype: xandecmartins
> Mobile: +55 (85) 9626-3631
>
