I created https://issues.apache.org/jira/browse/TIKA-3493 with a test case that reproduces the issue.

HTH
David

On 22 Jul 2021 at 11:43 +0200, Nick Burch <[email protected]> wrote:
> On Thu, 22 Jul 2021, David Pilato wrote:
> > TL;DR: the created date of the document changes depending on the timezone.
>
> That does seem a bug
>
> > For example:
> >
> > • Asia/Sakhalin gives dcterms:created=2016-07-06T23:38:00Z
> > • Asia/Colombo gives dcterms:created=2016-07-07T05:08:00Z
> > • Europe/Stockholm gives dcterms:created=2016-07-07T08:38:00Z
>
> As a general rule, if we know the timezone, we should be returning it, or
> taking account of it. If the file format doesn't store the timezone, we
> should be returning a datetime without any timezone specified
>
> > I don't know if it's a bug or expected. Maybe the RTF format does not
> > specify the Timezone.
>
> If there's no timezone in the format, there shouldn't be a timezone (eg Z
> for UTC) in the output
>
> Any chance you could report a bug in JIRA, and upload a small sample file
> showing the problem and a small unit test demonstrating it?
>
> Thanks
> Nick
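
For anyone following the thread, here is a minimal sketch of the effect, using only java.time. It is not Tika's parser code and not the test attached to the JIRA issue. The class name, the helper names and the wall-clock value 2016-07-07T10:38:00 are assumptions: that value is inferred from the three outputs quoted above, since all three correspond to the same local time once each zone's offset is removed.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical demo class, not part of Tika.
public class ZonelessDateDemo {

    // Treats a zoneless date-time as if it had been written in the given
    // zone and converts it to UTC. This is the step that makes the result
    // depend on the zone.
    static Instant interpretIn(LocalDateTime local, String zoneId) {
        return local.atZone(ZoneId.of(zoneId)).toInstant();
    }

    // Formats the value with no offset and no trailing "Z", which is what
    // Nick suggests when the file format stores no timezone.
    static String formatZoneless(LocalDateTime local) {
        return DateTimeFormatter.ISO_LOCAL_DATE_TIME.format(local);
    }

    public static void main(String[] args) {
        // Assumed wall-clock value, inferred from the outputs in the thread.
        LocalDateTime created = LocalDateTime.of(2016, 7, 7, 10, 38, 0);

        String[] zones = {"Asia/Sakhalin", "Asia/Colombo", "Europe/Stockholm"};
        for (String zone : zones) {
            // Same local value, different UTC instant per zone.
            System.out.println(zone + " gives " + interpretIn(created, zone));
        }

        // Zone-independent output.
        System.out.println("zoneless gives " + formatZoneless(created));
    }
}
```

The zone is passed explicitly here so the difference is visible in a single run. A parser that relies on the JVM default timezone would show the same spread across machines.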
