On collision, the precedence order defines which key takes precedence and
_overwrites_ the other. Overwrite is but one option (you could save *all* the
values; it’s a multi-valued key structure, so…)
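
To make the two options above concrete, here is a minimal sketch in plain Python. It is not Tika or OODT code: the function name `merge_metadata`, the `keep_all` flag, and the choice to list the highest-precedence source first are all assumptions made for illustration. The source names and the dc:creator values are borrowed from the examples quoted below.

```python
def merge_metadata(sources, precedence, keep_all=False):
    """Merge per-source metadata into one multi-valued dict.

    sources    -- {source_name: {key: [values]}}
    precedence -- source names, highest precedence first (an assumption)
    keep_all   -- False: the highest-precedence source wins on collision;
                  True: keep every distinct value, highest precedence first
    """
    merged = {}
    for name in precedence:
        for key, values in sources.get(name, {}).items():
            if key not in merged:
                # First (highest-precedence) source to supply this key.
                merged[key] = list(values)
            elif keep_all:
                # Collision: append values we have not seen yet.
                for value in values:
                    if value not in merged[key]:
                        merged[key].append(value)
            # Otherwise the lower-precedence values are dropped.
    return merged


# Hypothetical parser outputs that collide on dc:creator.
sources = {
    "ImageParser.METADATA": {"dc:creator": ["Tim"]},
    "TikaOCR.METADATA": {"dc:creator": ["Chris"], "dc:title": ["Scan"]},
}
precedence = ["ImageParser.METADATA", "TikaOCR.METADATA"]

overwrite_result = merge_metadata(sources, precedence)
keep_all_result = merge_metadata(sources, precedence, keep_all=True)
```

With this ordering, the overwrite mode keeps only Tim as dc:creator, while `keep_all=True` keeps both Tim and Chris. Keys that do not collide (dc:title here) come through unchanged either way.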

Cheers,
Chris




On 10/26/17, 9:43 AM, "Nick Burch" <[email protected]> wrote:

    On Thu, 26 Oct 2017, Chris Mattmann wrote:
    > My general approach to conflicting metadata is simply to define 
    > precedence orders.
    >
    > For example here is one documented from OODT:
    >
    > https://cwiki.apache.org/confluence/display/OODT/Understanding+CAS-PGE+Metadata+Precendence
    >
    > We can do similar things with Tika, e.g.,
    >
    > [CoreMetadata.PROPERTIES]
    > [ImageParser.METADATA]
    > [TikaOCR.METADATA]
    
    What happens if two different parsers both output the same bit of metadata 
    though? eg Tim's example of one giving dc:creator of Tim and the second 
    giving dc:creator of Chris?
    
    
    Secondly, what about the XHTML sax events stream? I think that's probably 
    the harder case...
    
    Nick
    

