Thanks Tim, this is great!

I was experimenting to see whether
org.apache.tika.metadata.filter.FieldNameMappingFilter in tika-config.xml
can also be used to rename those custom metadata fields, but it seems to
let them pass through without renaming. I'm not sure it would be a very
useful feature anyhow.
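
For anyone finding this thread later, here is a minimal sketch of building
the request body with custom fields under "metadata", following the JSON
quoted below. The helper name build_request, the example URL, the document
id, and the field values are placeholders I made up for illustration; only
the JSON keys come from this thread. It just prints the body and does not
contact a server.

```python
import json


def build_request(fetch_key, emit_key, metadata, fetcher="http", emitter="solr"):
    """Build a fetch/emit request body with injected custom metadata.

    build_request is a hypothetical helper, not part of Tika. The keys
    match the JSON example quoted in this thread.
    """
    return {
        "fetcher": fetcher,
        "fetchKey": fetch_key,
        "emitter": emitter,
        "emitKey": emit_key,
        # Custom fields go under "metadata". As in the quoted example,
        # a value may be a single string or a list of strings.
        "metadata": metadata,
    }


body = build_request(
    "https://example.com/doc.pdf",  # placeholder URL
    "doc-1",                        # placeholder document id
    {
        "anotherSolrField": "value1",
        "yetAnotherSolrField": ["value2", "value3"],
    },
)

# Serialize for use as the -d payload of the curl call quoted below.
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with -d in place of the inline body
shown in my original message.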

Thanks,
-Sam

On Thu, Nov 3, 2022 at 12:52 PM Tim Allison <[email protected]> wrote:

> Yes.  We need to do a better job of documenting this. To inject
> custom/external metadata, do something like this:
>
> {
>     "emitKey": "emitKey1",
>     "emitter": "my_emitter",
>     "fetchKey": "fetchKey1",
>     "fetcher": "my_fetcher",
>     "handlerConfig": {
>         "maxEmbeddedResources": 10,
>         "parseMode": "concatenate",
>         "type": "xml",
>         "writeLimit": 10000
>     },
>     "id": "my_id",
>     "metadata": {
>         "m1": [
>             "v1",
>             "v1"
>         ],
>         "m2": [
>             "v2",
>             "v3"
>         ],
>         "m3": "v4"
>     },
>     "onParseException": "skip"
> }
>
> On Thu, Nov 3, 2022 at 2:08 PM sam k <[email protected]> wrote:
>
>> Hi,
>>
>> I'm running a Tika server with HttpFetcher and SolrEmitter, and it works
>> great.
>>
>> When asking Tika to send documents to Solr, I can specify the document id
>> as the "emitKey" parameter:
>>
>> curl -X POST -H "Content-Type: application/json" -d '{"fetcher":"http",
>> "fetchKey":"<URL>", "emitter":"solr", "emitKey":"<Document Id>"}'
>> http://tika.server
>>
>> Is there a way to specify more custom fields for the Solr document being
>> submitted, like:
>>
>> curl -X POST -H "Content-Type: application/json" -d '{"fetcher":"http",
>> "fetchKey":"<URL>", "emitter":"solr", "emitKey":"<Document Id>",
>> "anotherSolrField":"<Value>", "yetAnotherSolrField":"<Value>"}'
>> http://tika.server
>>
>> We would like to set around 10 custom fields in each Solr document, such
>> as the id of the user who created the PDF/Word, etc, so the values for the
>> Solr fields would be different for each Solr document.
>>
>> Thanks,
>> -Sam
>>
>
