Hello, friends,

I'm using Tika 0.7.

While testing content and metadata extraction with Tika, I ran into the
following cases:
- Dates in metadata (DublinCore.DATE, MSOffice.LAST_SAVED,
MSOffice.CREATION_DATE)
Dates are returned as Strings, but the format differs between document
types. You are probably already working on this (I saw a Date object in
the metadata in Tika 0.8), but if not, how can I configure Tika to use a
single date format?

- Dates in Excel file content
As we know, Excel has date fields, and Tika extracts them well. But the
output format is not what I need.

For example:
The cell contains 03/10/2005
Tika extracts it as 10/03/2005
But I need "yyyy-MM-dd HH:mm:ss.SSSZ" - 2005-10-03 00:00:00.000+0300
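
As a fallback I could post-process the extracted string myself. Below is a minimal sketch of what I mean, using only java.text.SimpleDateFormat. The class and method names (DateNormalizer, normalize) are my own and not part of Tika, and I am assuming the extracted value is in MM/dd/yyyy order and that my time zone is GMT+03:00. I would still prefer a way to configure this in Tika itself.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of Tika: re-formats a date string that
// has already been extracted into one fixed target pattern.
public class DateNormalizer {

    // The target pattern I need.
    static final String TARGET_PATTERN = "yyyy-MM-dd HH:mm:ss.SSSZ";

    // sourcePattern is an assumption about how the extracted text is laid out.
    static String normalize(String extracted, String sourcePattern, TimeZone zone)
            throws ParseException {
        SimpleDateFormat in = new SimpleDateFormat(sourcePattern, Locale.US);
        in.setLenient(false);   // reject impossible dates such as 13/45/2005
        in.setTimeZone(zone);   // interpret the date as midnight in this zone

        SimpleDateFormat out = new SimpleDateFormat(TARGET_PATTERN, Locale.US);
        out.setTimeZone(zone);  // render the offset of the same zone

        return out.format(in.parse(extracted));
    }

    public static void main(String[] args) throws ParseException {
        TimeZone zone = TimeZone.getTimeZone("GMT+03:00");
        System.out.println(normalize("10/03/2005", "MM/dd/yyyy", zone));
    }
}
```

The weak point is that the source pattern has to be guessed per document type, which is exactly what I would like to avoid.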

So, my questions are:
- Can I configure Tika to use a single date format?
- Can I configure the Excel parser to extract date/time values in a
specified date format?


Thanks
