I'm trying to parse an XML file that contains data like this:

<metadata name="key1">value1</metadata>
<metadata name="key2">value2</metadata>
<metadata name="key3">value3</metadata>
<metadata name="key4">value4</metadata>

I'd like to add the key=value pairs to the Tika metadata.

I've tried two different things:

ContentHandler elementMetadataHandler = new ElementMetadataHandler("",
"metadata", metadata, "metadata");

This pulls out:
    metadata=value1, metadata=value2, metadata=value3, metadata=value4

And the second one:

ContentHandler attributeHandler = new AttributeMetadataHandler("", "name",
metadata, "name");

This pulls out:
    name=key1, name=key2, name=key3, name=key4

What I really want is this:
    key1=value1, key2=value2, key3=value3, key4=value4

Is there a way to do this using Tika's built-in parsing? If not, what do I
need to do to extend the parsing for this purpose?
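
For reference, here is a rough sketch of the pairing logic I have in mind,
written as a plain SAX handler so that it runs without Tika on the classpath.
The class name NameValueMetadataHandler and the parse() helper are made up for
illustration, and the pairs go into a Map rather than into Tika's Metadata.
In a real Tika extension I assume the put() call would become
metadata.add(name, value), and that the handler would be hooked in from an
XMLParser subclass, but I have not verified that part.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: pairs each <metadata> element's "name" attribute
// with the element's text content.
public class NameValueMetadataHandler extends DefaultHandler {
    private final Map<String, String> pairs = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();
    private String currentName; // non-null only while inside a named <metadata>

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attrs) {
        // The default SAXParserFactory is not namespace-aware, so match on qName.
        if ("metadata".equals(qName)) {
            currentName = attrs.getValue("name");
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so accumulate it.
        if (currentName != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("metadata".equals(qName) && currentName != null) {
            // Assumption: with Tika this would be metadata.add(currentName, value).
            pairs.put(currentName, text.toString().trim());
            currentName = null;
        }
    }

    public Map<String, String> getPairs() {
        return pairs;
    }

    // Convenience helper for trying the handler on an XML string.
    public static Map<String, String> parse(String xml) throws Exception {
        NameValueMetadataHandler handler = new NameValueMetadataHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getPairs();
    }
}
```

Calling NameValueMetadataHandler.parse() on the four elements above, wrapped
in a single root element (the <root> wrapper is my assumption, since the
snippet on its own is not a well-formed document), should give a map of
key1=value1 through key4=value4 in document order.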

Thank you!
