Alexandre Madurell created TIKA-1252:
----------------------------------------
Summary: Tika is not indexing all authors of a PDF
Key: TIKA-1252
URL: https://issues.apache.org/jira/browse/TIKA-1252
Project: Tika
Issue Type: Bug
Components: metadata, parser
Affects Versions: 1.4
Environment: Ubuntu 12.04 (x64), Solr 4.6.0 (Amazon Web Services, Bitnami Stack)
Reporter: Alexandre Madurell
When submitting a PDF with this information in its XMP metadata:
...
<dc:creator>
<rdf:Bag>
<rdf:li>Author 1</rdf:li>
<rdf:li>Author 2</rdf:li>
</rdf:Bag>
</dc:creator>
...
Only the first author appears in the indexed Solr document:
...
"author":["Author 1"],
"author_s":"Author 1",
...
This happens even though the author field is declared as multiValued in the Solr schema:
<field name="author" type="text_general" indexed="true" stored="true"
multiValued="true"/>
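For reference, the behavior I would expect can be sketched with the JDK's XML parser alone. This is not Tika's code: the class name XmpCreators and the rdf:RDF / rdf:Description wrapper around the snippet are assumptions made so the example is self-contained. It collects every rdf:li under dc:creator instead of stopping at the first one.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Tika: lists all dc:creator entries in an XMP packet.
public class XmpCreators {
    static final String DC = "http://purl.org/dc/elements/1.1/";
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns the text of every rdf:li found under any dc:creator element. */
    static List<String> creators(String xmp) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the namespace-based lookups below
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmp.getBytes(StandardCharsets.UTF_8)));

        List<String> result = new ArrayList<>();
        NodeList creatorNodes = doc.getElementsByTagNameNS(DC, "creator");
        for (int i = 0; i < creatorNodes.getLength(); i++) {
            Element creator = (Element) creatorNodes.item(i);
            NodeList items = creator.getElementsByTagNameNS(RDF, "li");
            for (int j = 0; j < items.getLength(); j++) {
                result.add(items.item(j).getTextContent().trim());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // The rdf:RDF and rdf:Description wrapper is assumed; the report elides it.
        String xmp =
              "<rdf:RDF xmlns:rdf=\"" + RDF + "\" xmlns:dc=\"" + DC + "\">"
            + "<rdf:Description>"
            + "<dc:creator><rdf:Bag>"
            + "<rdf:li>Author 1</rdf:li>"
            + "<rdf:li>Author 2</rdf:li>"
            + "</rdf:Bag></dc:creator>"
            + "</rdf:Description>"
            + "</rdf:RDF>";
        System.out.println(creators(xmp)); // prints [Author 1, Author 2]
    }
}
```

With the two-author metadata shown earlier, both names come back, which is the list I would expect to reach the multiValued author field.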
Let me know if I can provide any further information.
Thanks in advance!