To fix the date formatting error specifically: you are correct that you need
the Expression Language functions toDate() [1] and format() [2] to convert
between plain strings and date objects. You are currently concatenating the two
date values (the year-month-day segment and the hour:minute:second segment),
then changing the delimiter from a space to a 'T' (you can do this explicitly
in the first step), then appending the timezone offset and trying to convert
the result to a timestamp via a prescribed format, but that format doesn't
match the input you have.
Please use the values below (I tested these against the current main branch
build, but nothing should have changed since prior releases):
To concatenate the string attributes into a parseable format and convert it to
a date object (internally represented as the number of milliseconds since the
epoch began at Jan 1, 1970 00:00:00 UTC):
${fileMetadata.6:append('T'):append(${fileMetadata.7:substringBefore('.')}):append(' '):append(${fileMetadata.8}):toDate("yyyy-MM-dd'T'HH:mm:ss Z")}
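
As a rough illustration outside NiFi, the same concatenate-then-parse step can
be sketched in Python. This is only an analogy: the strptime directives are
Python's counterparts to the Java-style pattern used by toDate(), and the
variable names are hypothetical stand-ins for the fileMetadata.6, .7, and .8
attributes from the original message.

```python
from datetime import datetime

# Hypothetical stand-ins for the fileMetadata.6 / .7 / .8 attributes
date_part = "2020-02-14"
time_part = "14:14:07.000000000"
offset_part = "+0000"

# Mirror the append()/substringBefore('.') chain: drop the fractional seconds,
# join date and time with 'T', then add a space and the timezone offset
combined = f"{date_part}T{time_part.split('.')[0]} {offset_part}"

# Python analogue of the pattern "yyyy-MM-dd'T'HH:mm:ss Z"
parsed = datetime.strptime(combined, "%Y-%m-%dT%H:%M:%S %z")

# A date object is just an instant; here it is as milliseconds since the epoch
epoch_millis = int(parsed.timestamp() * 1000)

print(combined)
print(epoch_millis)
```

The point of the sketch is that the parsed value is an instant, not a string,
so it has to be formatted again before it can be used where a string is
expected.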
To format the result of the above (stored here in an attribute named
parsedTimestamp) into various timezones:
Local timezone: ${parsedTimestamp:format("yyyy-MM-dd'T'HH:mm:ss Z")}
UTC timezone: ${parsedTimestamp:format("yyyy-MM-dd'T'HH:mm:ss Z", "UTC")}
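
To make the local-versus-UTC distinction concrete, here is a small Python
sketch that renders one instant in two zones. It is an analogy rather than
NiFi behavior: the fixed UTC-8 offset is an assumption inferred from the 06:14
local and 14:14 UTC times in this message, and the strftime directives are
Python's equivalents of the Java-style pattern.

```python
from datetime import datetime, timedelta, timezone

# The sample instant from the original question: 2020-02-14 14:14:07 +0000
parsed = datetime(2020, 2, 14, 14, 14, 7, tzinfo=timezone.utc)

# Python analogue of the pattern "yyyy-MM-dd'T'HH:mm:ss Z"
fmt = "%Y-%m-%dT%H:%M:%S %z"

# Assumption: a fixed UTC-8 offset stands in for the "local" timezone
assumed_local = timezone(timedelta(hours=-8))

# Same instant, two renderings
utc_string = parsed.astimezone(timezone.utc).strftime(fmt)
local_string = parsed.astimezone(assumed_local).strftime(fmt)

print(utc_string)
print(local_string)
```

Both strings describe the same moment; only the wall-clock fields and the
offset suffix differ.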
If you set the PutFile Last Modified Time to ${timestampUTCString} (or whatever
you name the attribute that holds the UTC string from the second step above),
it will successfully set the file's timestamp when writing it out (06:14 in
February in my timezone is equal to 14:14 UTC):
/tmp ll timestamptest 15:28:55
total 0
drwxrwxrwx 14 alopresto wheel 448B Jul 2 15:29 ./
drwxrwxrwt 7 root wheel 224B Jul 2 15:28 ../
-rw-r--r-- 1 alopresto wheel 0B Feb 14 06:14 0eec229c-5658-4a86-b6ba-3fe507672bd4
-rw-r--r-- 1 alopresto wheel 0B Feb 14 06:14 113fb95e-5a10-48e4-ba9b-616909b68684
-rw-r--r-- 1 alopresto wheel 0B Feb 14 06:14 13fd2b13-fc8e-455d-8ca9-4afa2886a8e8
-rw-r--r-- 1 alopresto wheel 0B Feb 14 06:14 3228111c-476d-4cf6-a141-587270d821e2
-rw-r--r-- 1 alopresto wheel 0B Feb 14 06:14 397e7a21-944b-4a0c-a0d7-6150e10b385e
-rw-r--r-- 1 alopresto wheel 0B Feb 14 06:14 400313d8-9511-451a-ba40-6a37e7649906
-rw-r--r-- 1 alopresto wheel 0B Feb 14 06:14 46c587f6-06ee-463e-8e91-b432073aa98d
-rw-r--r-- 1 alopresto wheel 0B Feb 14 06:14 4a783b61-2304-44c6-9820-045e0cfaac52
-rw-r--r-- 1 alopresto wheel 0B Feb 14 06:14 a30a3e6c-e3ed-4180-9486-3de274116652
-rw-r--r-- 1 alopresto wheel 0B Feb 14 06:14 d4cdafc4-b5f3-4a18-9548-c7a5a2a3ea68
-rw-r--r-- 1 alopresto wheel 0B Feb 14 06:14 e6b94e07-9bd1-4fbc-aee2-27b687681849
-rw-r--r-- 1 alopresto wheel 0B Feb 14 06:14 f6802781-d820-4e18-b803-ceeaf5abee11
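
If you want to check a result like this independently of NiFi, a short Python
sketch can set a file's modification time and read it back. This says nothing
about how PutFile is implemented; the file name is hypothetical, and the
timestamp is the sample value from the original question.

```python
import os
import tempfile
from datetime import datetime, timezone

# The sample instant from the original question: 2020-02-14 14:14:07 +0000
target = datetime(2020, 2, 14, 14, 14, 7, tzinfo=timezone.utc)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "timestamptest.txt")  # hypothetical file name
    open(path, "w").close()  # create an empty file

    # os.utime takes (access time, modification time) as seconds since the epoch
    os.utime(path, (target.timestamp(), target.timestamp()))

    # Read the modification time back and interpret it as UTC
    mtime = datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)

print(mtime.isoformat())
```

A directory listing shows this modification time in the local timezone of the
machine, which is why the listing above shows 06:14 rather than 14:14.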
I'm not sure I understand your other concerns. ListFile and GetFile do not
accept incoming connections because they are designed to retrieve either a
listing of files or the files themselves from a particular file system location
(e.g. you want to list all the files that appear in
/some/location/where/another/process/puts/them/over/time as they appear). If
you have some other initial process to determine an absolute file path, you can
pass it to FetchFile as you're doing.
You can also file a feature request in Jira to read the file metadata and make
it available as named attributes on the flowfile after the file is read, as
this seems like a useful behavior for you and others moving forward.
[1] https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#todate
[2] https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#format
Andy LoPresto
[email protected]
[email protected]
He/Him
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
> On Jul 2, 2020, at 6:32 AM, Valentina Ivanova <[email protected]> wrote:
>
> Hello!
>
> I need to set Last Modified Time in PutFile however I cannot use
> file.creationTime as it is retrieved from either ListFile or GetFile.
>
> I am retrieving files from a folder in the middle of my flow using FetchFile
> and passing the absolute path to the files (as ListFile and GetFile have no
> input connections).
> After FetchFile I retrieve the file metadata with ls -l
> --time-style=full-iso which outputs something like this:
>
> -rw-r--r-- 1 nifi nifi 60 2020-02-14 14:14:07.000000000 +0000 file.txt
>
> From this I retrieve all components of the date and time that are needed and
> merge them together with the following:
>
> fileMetadata.6 value:2020-02-14
> fileMetadata.7 value:14:14:07.000000000
> fileMetadata.8 value:+0000
>
> dateMetadata value:${fileMetadata.6:append(' '):append(${fileMetadata.7:substringBefore('.')})}
> Last Modified Time value:${dateMetadata:replace(' ', 'T'):append(' '):append(${fileMetadata.8}):toDate("yyyy-MM-dd'T'HH:mm:ssZ")}
>
> After this I expect the following value for Last Modified Time
> 2020-02-14T14:14:07 +0000 which should correspond to the format required
> in the PutFile processor (yyyy-MM-dd'T'HH:mm:ssZ).
> Instead, after the above I obtain Fri Feb 14 15:10:56 CET 2020 which makes
> me think that there is some other transformation taking place which I am not
> aware of.
> When the above value Fri Feb 14 15:10:56 CET 2020 is used in the Last
> Modified Time I get the following error message:
>
> Could not set file lastModifiedTime to Fri Feb 14 15:10:56 CET 2020 because
> unparsable date:Fri Feb 14 15:10:56 CET 2020
>
> So I am wondering what I could do to address this issue and if there is
> another transformation taking place.
>
> Thanks in advance & all the best
>
> Valentina