Yes. We need to do a better job of documenting this. To inject
custom/external metadata, add a "metadata" object to the request,
something like this:
{
  "emitKey": "emitKey1",
  "emitter": "my_emitter",
  "fetchKey": "fetchKey1",
  "fetcher": "my_fetcher",
  "handlerConfig": {
    "maxEmbeddedResources": 10,
    "parseMode": "concatenate",
    "type": "xml",
    "writeLimit": 10000
  },
  "id": "my_id",
  "metadata": {
    "m1": [
      "v1",
      "v1"
    ],
    "m2": [
      "v2",
      "v3"
    ],
    "m3": "v4"
  },
  "onParseException": "skip"
}
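
If you build these requests programmatically, a rough sketch in Python might look like the following. It only assembles the JSON body with a per-document "metadata" object and shows one way to POST it. The function names, the example URL, the document id, and the field names (created_by, tags) are all made up for illustration, and the server URL is left as a parameter because the right endpoint depends on your setup. I would expect the metadata keys to show up as fields on the emitted document, but please verify that against your Solr schema.

```python
import json
from urllib import request


def build_fetch_emit_request(fetch_key, emit_key, user_metadata,
                             fetcher="http", emitter="solr"):
    """Build the JSON body for one fetch/emit request (hypothetical helper).

    Values under "metadata" may be a single string or a list of strings,
    mirroring the example above.
    """
    return {
        "fetcher": fetcher,
        "fetchKey": fetch_key,
        "emitter": emitter,
        "emitKey": emit_key,
        "metadata": dict(user_metadata),
        "onParseException": "skip",
    }


def post_request(server_url, body):
    """POST the body as JSON to server_url (hypothetical helper).

    server_url is whatever Tika server endpoint you already post to.
    """
    req = request.Request(
        server_url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


# Illustrative values only; none of these come from a real deployment.
body = build_fetch_emit_request(
    "https://example.com/report.pdf",
    "doc-123",
    {"created_by": "user-42", "tags": ["pdf", "report"]},
)
print(json.dumps(body, indent=2))
```

Since the metadata is passed per request, each document can carry its own values for your ten or so custom fields.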
On Thu, Nov 3, 2022 at 2:08 PM sam k wrote:
> Hi,
>
> I'm running a Tika server with HttpFetcher and SolrEmitter, and it works
> great.
>
> When asking Tika to send documents to Solr, I can specify the document id
> as "emitKey" parameter:
>
> curl -X POST -H "Content-Type: application/json" -d '{"fetcher":"http",
> "fetchKey":"<URL>", "emitter":"solr", "emitKey":"<Document Id>"}'
> http://tika.server
>
> Is there a way to specify more custom fields for the Solr document being
> submitted, like:
>
> curl -X POST -H "Content-Type: application/json" -d '{"fetcher":"http",
> "fetchKey":"<URL>", "emitter":"solr", "emitKey":"<Document Id>",
> "anotherSolrField":"<Value>", "yetAnotherSolrField":"<Value>"}'
> http://tika.server
>
> We would like to set around 10 custom fields in each Solr document, such
> as the id of the user who created the PDF/Word, etc, so the values for the
> Solr fields would be different for each Solr document.
>
> Thanks,
> -Sam
>