Yes. We need to do a better job of documenting this. To inject
custom/external metadata, add a "metadata" object to the request,
something like this:

{
    "emitKey": "emitKey1",
    "emitter": "my_emitter",
    "fetchKey": "fetchKey1",
    "fetcher": "my_fetcher",
    "handlerConfig": {
        "maxEmbeddedResources": 10,
        "parseMode": "concatenate",
        "type": "xml",
        "writeLimit": 10000
    },
    "id": "my_id",
    "metadata": {
        "m1": [
            "v1",
            "v1"
        ],
        "m2": [
            "v2",
            "v3"
        ],
        "m3": "v4"
    },
    "onParseException": "skip"
}
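
If you are generating these requests per document, here is a minimal
sketch of building the JSON in Python. It is an illustration, not Tika
code: the helper name build_fetch_emit_request and the field names
"created_by" and "department" are made up for the example, and the
"http" and "solr" defaults are just the fetcher/emitter names from your
curl command. It only builds and prints the request body; POST it to
your server the same way you do now.

```python
import json


def build_fetch_emit_request(fetch_key, emit_key, custom_fields,
                             fetcher="http", emitter="solr"):
    """Build a request body that carries per-document custom metadata.

    Hypothetical helper for illustration only. custom_fields maps a
    field name to a string or a list of strings, mirroring the
    "metadata" object in the example above.
    """
    return {
        "fetcher": fetcher,
        "fetchKey": fetch_key,
        "emitter": emitter,
        "emitKey": emit_key,
        # The custom/external values go under "metadata",
        # not at the top level of the request.
        "metadata": dict(custom_fields),
    }


# Placeholder values; substitute your own URL, document id and fields.
payload = build_fetch_emit_request(
    fetch_key="<URL>",
    emit_key="<Document Id>",
    custom_fields={
        "created_by": "user-123",           # single value
        "department": ["legal", "sales"],   # multiple values
    },
)

# Use this string as the body of the POST request.
print(json.dumps(payload, indent=4))
```

The only change relative to your second curl example is that the extra
fields are nested under "metadata" instead of sitting next to
"emitKey".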

On Thu, Nov 3, 2022 at 2:08 PM sam k <[email protected]> wrote:

> Hi,
>
> I'm running a Tika server with HttpFetcher and SolrEmitter, and it works
> great.
>
> When asking Tika to send documents to Solr, I can specify the document id
> as "emitKey" parameter:
>
> curl -X POST -H "Content-Type: application/json" -d '{"fetcher":"http",
> "fetchKey":"<URL>", "emitter":"solr", "emitKey":"<Document Id>"}'
> http://tika.server
>
> Is there a way to specify more custom fields for the Solr document being
> submitted, like:
>
> curl -X POST -H "Content-Type: application/json" -d '{"fetcher":"http",
> "fetchKey":"<URL>", "emitter":"solr", "emitKey":"<Document Id>",
> "anotherSolrField":"<Value>", "yetAnotherSolrField":"<Value>"}'
> http://tika.server
>
> We would like to set around 10 custom fields in each Solr document, such
> as the id of the user who created the PDF/Word, etc, so the values for the
> Solr fields would be different for each Solr document.
>
> Thanks,
> -Sam
>
