Hi list,

I'm using the ExtractingRequestHandler to extract content from
documents. It extracts the "last_modified" field just fine, but of
course only for documents where this field is set. If the field is not
set, I want to pass the file system timestamp of the file instead.

I'm doing:

final ContentStreamUpdateRequest up =
   new ContentStreamUpdateRequest("/update/extract");

up.setParam("literal.last_modified",
   format.format(new Date(file.lastModified())));
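
(The snippet above does not show how "format" is set up. A minimal
sketch, assuming it is a SimpleDateFormat that produces the ISO 8601
UTC syntax Solr date fields expect. The class and method names here are
made up for illustration only.)

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class SolrDateSketch {

    // Hypothetical helper: formats epoch milliseconds (e.g. the value of
    // file.lastModified()) as an ISO 8601 UTC timestamp.
    static String toSolrDate(long epochMillis) {
        // SimpleDateFormat is not thread-safe, so a fresh instance is
        // created per call in this sketch.
        SimpleDateFormat format =
            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        // The literal 'Z' is only correct if the formatter runs in UTC.
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        // Print the formatted timestamp for the current time.
        System.out.println(toSolrDate(System.currentTimeMillis()));
    }
}
```

The result of toSolrDate(file.lastModified()) would then be the value
passed as "literal.last_modified".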

This works fine, but only for documents that don't contain a last
modified field themselves. For documents that do (as many PDFs do), I get

"multiple values encountered for non multiValued field last_modified"

Is it possible to make ExtractingRequestHandler overwrite the
last_modified value I passed as a parameter with the one Tika extracted?

Thanks,
 Chris
