[ https://issues.apache.org/jira/browse/TIKA-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947940#comment-14947940 ]

Hudson commented on TIKA-1765:
------------------------------

SUCCESS: Integrated in tika-trunk-jdk1.7 #865 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/865/])
TIKA-1765 (tallison: [http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1707427])
* trunk/CHANGES.txt
* trunk/tika-core/src/main/java/org/apache/tika/metadata/Metadata.java
* trunk/tika-core/src/main/java/org/apache/tika/metadata/OfficeOpenXMLExtended.java
* trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/JackcessExtractor.java
* trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/SummaryExtractor.java
* trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/MetadataExtractor.java
* trunk/tika-parsers/src/test/java/org/apache/tika/parser/microsoft/WordParserTest.java
* trunk/tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java
* trunk/tika-parsers/src/test/resources/test-documents/testWORD_multi_authors.doc
* trunk/tika-parsers/src/test/resources/test-documents/testWORD_multi_authors.docx


> Some doc and docx store multiple authors as semi-colon delimited list
> ---------------------------------------------------------------------
>
>                 Key: TIKA-1765
>                 URL: https://issues.apache.org/jira/browse/TIKA-1765
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Trivial
>
> It looks like doc and docx files store multiple authors in a single author 
> field, delimited by semicolons.  We should parse this value and add multiple 
> authors where appropriate.
> Note: when I tried to add an author with a semicolon in the name, the result 
> was two authors, so it doesn't look like any escaping is going on.
> We should check what's going on in the other MS formats and with other 
> metadata items that are allowed to be multivalued in Dublin Core.
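
The parsing described above could look roughly like the following. This is only a minimal sketch in plain Java, not the change committed for this issue: the class name AuthorSplitter and the method splitAuthors are hypothetical, and it assumes, as the description notes, that no escaping is applied to semicolons inside a name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the code committed for TIKA-1765.
final class AuthorSplitter {

    // Splits a raw author field on semicolons, trimming whitespace and
    // dropping empty entries. No escaping is honoured, so a name that
    // itself contains a semicolon comes back as two authors.
    static List<String> splitAuthors(String raw) {
        List<String> authors = new ArrayList<String>();
        if (raw == null) {
            return authors;
        }
        for (String part : raw.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                authors.add(trimmed);
            }
        }
        return authors;
    }

    public static void main(String[] args) {
        // prints [Alice Smith, Bob Jones]
        System.out.println(splitAuthors("Alice Smith; Bob Jones"));
    }
}
```

Each returned value would then be added as a separate entry to a multivalued metadata field, rather than stored as one delimited string.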



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
