Lewis John McGibbney created TIKA-1609:
------------------------------------------

             Summary: Leverage Google's LibPhonenumber for enhanced phone 
number extraction and metadata modeling
                 Key: TIKA-1609
                 URL: https://issues.apache.org/jira/browse/TIKA-1609
             Project: Tika
          Issue Type: New Feature
          Components: core
            Reporter: Lewis John McGibbney
            Assignee: Lewis John McGibbney
             Fix For: 1.9


Google's libphonenumber can give us comprehensive support for modeling 
phone number metadata properly in Tika.
During the development of this patch I realized three things, namely
 * This is not a parser as such, because phone numbers are not mapped to any 
particular MIME type
 * In addition, there can be many phone numbers per document, so this is most 
likely a Content Handler of sorts
 * Tika's Metadata support is currently too restrictive to allow us to persist 
many complex objects e.g. String, Object. We need to expand Metadata support 
over and above String, String[].
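
To make the Content Handler idea concrete, here is a minimal sketch, not the 
patch itself. It uses only the JDK's SAX API. The class name 
PhoneNumberCollector and the regex are hypothetical; the regex is a crude 
stand-in (an assumption) for libphonenumber's matching and only recognizes 
North American style numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: buffers SAX character events and collects every
// phone-number-like string found in the document text.
class PhoneNumberCollector extends DefaultHandler {

    // Stand-in pattern (an assumption, not libphonenumber): matches forms
    // such as 818-555-1234, (626) 555-9876 or +1 818 555 1234.
    private static final Pattern PHONE = Pattern.compile(
            "(?:\\+?1[-. ])?\\(?\\d{3}\\)?[-. ]\\d{3}[-. ]\\d{4}");

    private final StringBuilder text = new StringBuilder();
    private final List<String> numbers = new ArrayList<>();

    @Override
    public void characters(char[] ch, int start, int length) {
        // A number may be split across several events, so buffer first.
        text.append(ch, start, length);
    }

    @Override
    public void endDocument() {
        // One document can yield many numbers; keep all of them.
        Matcher m = PHONE.matcher(text);
        while (m.find()) {
            numbers.add(m.group());
        }
    }

    List<String> getNumbers() {
        return numbers;
    }
}
```

Because the handler is independent of any MIME type, it could in principle 
be attached to whatever parser produces the SAX events. The collected 
strings could be stored as a multi-valued String entry in Metadata; anything 
richer than a String per number would need the Metadata expansion described 
above.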

https://github.com/googlei18n/libphonenumber/




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
