Hello,

I'm getting back two or three metadata properties that describe a temp
file which Tika is apparently creating under the hood:

File Modified Date (the current date)
File Name (temp file name: apache-tika-3021300783416279997.tmp)
File Size

I assume this is because I only have a stream to give Tika and no longer
have a physical file.  However, the users are seeing these (particularly
the modified date) and misinterpreting them.

I'd like to exclude these, which I could of course do with a simple
string-based filter.  However, that feels a little hackish... I was hoping
there might be some way to deactivate file metadata when Tika is the one
that created the temp file.  I tried to find the spot in Tika where these
are being added by grepping all the source, but I came up empty for some
reason.
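
For reference, the string-based filter I mean would be roughly the sketch
below.  It uses a plain Map as a stand-in for Tika's Metadata object, the
class and method names are made up for illustration, and the key names are
simply the ones showing up in my output, assumed to match exactly:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TempFileMetadataFilter {
    // Keys as they appear in my output (assumption: exact, case-sensitive).
    public static final List<String> TEMP_FILE_KEYS =
            List.of("File Modified Date", "File Name", "File Size");

    // Returns a copy of the metadata with the temp-file entries dropped;
    // the input map is left untouched.
    public static Map<String, String> withoutTempFileKeys(Map<String, String> metadata) {
        Map<String, String> filtered = new LinkedHashMap<>(metadata);
        for (String key : TEMP_FILE_KEYS) {
            filtered.remove(key);
        }
        return filtered;
    }
}
```

Against a real Metadata object the same idea should work with names() and
remove(name), but it still depends on matching key strings, which is the
part that feels hackish.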

Thanks for any pointers,
Brian
