I'm trying to parse content from an XML file that contains data that looks
like this:
<metadata name="key1">value1</metadata>
<metadata name="key2">value2</metadata>
<metadata name="key3">value3</metadata>
<metadata name="key4">value4</metadata>
I'd like to add the key=value pairs to the Tika metadata.
I've tried two different approaches. The first one:
    ContentHandler elementMetadataHandler =
        new ElementMetadataHandler("", "metadata", metadata, "metadata");
This pulls out every element value under the same "metadata" key:
metadata=value1, metadata=value2, metadata=value3, metadata=value4
And the second one:
    ContentHandler attributeHandler =
        new AttributeMetadataHandler("", "name", metadata, "name");
This pulls out every attribute value under the same "name" key:
name=key1, name=key2, name=key3, name=key4
What I really want is this:
key1=value1, key2=value2, key3=value3, key4=value4
Is there a way to do this using Tika's built-in parsing? If not, what do I
need to do to extend the parsing for this purpose?
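
To make the extension route concrete, here is a rough sketch of the kind of custom SAX handler I have in mind. It remembers the "name" attribute when a <metadata> element opens, buffers the element text, and emits the pair when the element closes. Everything here is an assumption for illustration: NamedMetadataHandler and its parse helper are hypothetical names, and the sketch collects into a plain Map so it runs without Tika on the classpath. In Tika the put call would presumably become metadata.add(key, value) on the Metadata object.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: turns <metadata name="K">V</metadata> into K=V. */
class NamedMetadataHandler extends DefaultHandler {
    // Stand-in for Tika's Metadata object (assumption, keeps the sketch stdlib-only).
    private final Map<String, String> pairs = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();
    // Non-null only while inside a <metadata> element that has a name attribute.
    private String currentKey;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        // qName is used because the default JAXP parser is not namespace-aware.
        if ("metadata".equals(qName)) {
            currentKey = attrs.getValue("name");
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so buffer it until the element ends.
        if (currentKey != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("metadata".equals(qName) && currentKey != null) {
            pairs.put(currentKey, text.toString().trim());
            currentKey = null;
        }
    }

    Map<String, String> getPairs() {
        return pairs;
    }

    /** Convenience helper for trying the handler on an XML string. */
    static Map<String, String> parse(String xml) {
        NamedMetadataHandler handler = new NamedMetadataHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.getPairs();
    }
}
```

Calling NamedMetadataHandler.parse on the sample above, wrapped in a single root element, should give a map from key1 through key4 to value1 through value4. How best to plug such a handler into Tika's XML parsing is the part I'm unsure about.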
Thank you!