[ https://issues.apache.org/jira/browse/TIKA-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164145#comment-14164145 ]
Tyler Palsulich commented on TIKA-1438:
---------------------------------------
In my opinion, a single multivalued metadata entry is better than a single
metadata entry with a list of comma-separated numbers. Users should be able to
decide what they want to do with the array of numbers. See [the
test|https://github.com/apache/tika/blob/trunk/tika-parsers/src/test/java/org/apache/tika/sax/PhoneExtractingContentHandlerTest.java]
for an example.
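
To make the difference concrete, here is a rough sketch in plain Java. It is not Tika code: the class `MultiValuedMetadataSketch`, the `phonenumbers` key, and the numbers are all made up for illustration, and the `add`/`getValues` names are only meant to resemble a multivalued metadata container. It shows that a multivalued entry keeps each number separately addressable, while a comma-separated string can still be derived from it if a user wants one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in for a metadata container; not Tika's own Metadata class.
public class MultiValuedMetadataSketch {
    private final Map<String, List<String>> entries = new LinkedHashMap<>();

    // Append a value under a name, keeping any values already stored there.
    public void add(String name, String value) {
        entries.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // Return all values stored under a name, in insertion order (empty if none).
    public List<String> getValues(String name) {
        List<String> values = entries.get(name);
        return values == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(values);
    }

    public static void main(String[] args) {
        MultiValuedMetadataSketch metadata = new MultiValuedMetadataSketch();
        // Hypothetical key and made-up numbers.
        metadata.add("phonenumbers", "5550100");
        metadata.add("phonenumbers", "5550101");

        // Each number is available on its own ...
        for (String number : metadata.getValues("phonenumbers")) {
            System.out.println(number);
        }
        // ... and a caller who wants one comma-separated string can still build it.
        System.out.println(String.join(",", metadata.getValues("phonenumbers")));
    }
}
```

Going the other way (splitting a comma-separated string back into numbers) pushes parsing onto every consumer, which is the main reason I prefer the multivalued form.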
> PhoneExtractingContentHandler to not add individual MD entries for individual
> phone numbers
> -------------------------------------------------------------------------------------------
>
> Key: TIKA-1438
> URL: https://issues.apache.org/jira/browse/TIKA-1438
> Project: Tika
> Issue Type: Bug
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Minor
> Fix For: 1.7
>
> Attachments: TIKA-1438.patch
>
>
> Right now we have the PhoneExtractingContentHandler adding phone numbers as
> individual metadata entries. I feel that this is cumbersome.
> For example, a webpage with several phone numbers on it leaves us with many
> fields of the same type, each with a different value!
> I propose we reverse this and have one field with multiple values.
> I would fully understand the current behaviour if we wished to augment the
> phone numbers further by associating dialing code, country, carrier, etc.;
> however, we are not currently doing this.
> Patch coming for trunk.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)