Joel Croteau created AIRFLOW-5612:
-------------------------------------
Summary: Add ability to actually do things with created and modified date in GoogleCloudStorageHook
Key: AIRFLOW-5612
URL: https://issues.apache.org/jira/browse/AIRFLOW-5612
Project: Apache Airflow
Issue Type: Improvement
Components: gcp
Affects Versions: 1.10.5
Reporter: Joel Croteau
{{GoogleCloudStorageHook}} seems to support only a very small subset of the
actual GCS API. In particular, the only thing it lets you do with the dates of
an object is check whether the metadata was updated after a specified time,
using {{is_updated_after}}. First, this only looks at the metadata update
date, which is probably not what is wanted for most purposes, as
{{timeCreated}} is generally what conveys useful information. Second, it seems
rather arbitrary to only let me check whether the updated time is later than
some other time, rather than just giving me the time and letting me draw my
own conclusions. In particular, for a scheduled workflow with a potential
backfill, I would like to check that the creation date falls between a minimum
and a maximum value, which this doesn't allow.
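As a rough illustration of the check described above, here is a minimal sketch that works on the object resource as the GCS JSON API returns it, where {{timeCreated}} is an RFC 3339 string. The helper names {{parse_rfc3339}} and {{created_between}} are hypothetical and not part of Airflow; the half-open window is an assumption chosen so that adjacent schedule intervals do not overlap.

```python
from datetime import datetime, timezone


def parse_rfc3339(value):
    """Parse a GCS timestamp such as '2019-10-07T12:00:00.000Z' (hypothetical helper)."""
    # fromisoformat() does not accept a trailing 'Z' on older Python versions,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def created_between(resource, min_time, max_time):
    """Return True if the object was created in the window [min_time, max_time).

    `resource` is assumed to be the dict form of a GCS object resource.
    Both bounds must be timezone-aware datetimes.
    """
    created = parse_rfc3339(resource["timeCreated"])
    return min_time <= created < max_time


if __name__ == "__main__":
    # Stand-in for the resource a single metadata request would return.
    sample = {
        "timeCreated": "2019-10-07T12:00:00.000Z",
        "updated": "2019-10-09T08:30:00.000Z",
    }
    start = datetime(2019, 10, 7, tzinfo=timezone.utc)
    end = datetime(2019, 10, 8, tzinfo=timezone.utc)
    print(created_between(sample, start, end))
```

In a backfill, {{min_time}} and {{max_time}} would presumably come from the bounds of the schedule interval being processed, so each run only picks up the objects created in its own window.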
Also, tangentially, if you want to get multiple pieces of information about an
object, {{GoogleCloudStorageHook}} requires a separate call to
{{objects().get()}} for each piece of information, even though that one call
already returns everything. Would it not make more sense to return an object
structure containing all of the needed information?
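One possible shape for such a structure, sketched under the assumption that the hook has the full object resource as a dict after one metadata request. {{ObjectInfo}} and {{from_resource}} are hypothetical names, not an existing Airflow API; the keys read from the resource ({{name}}, {{size}}, {{md5Hash}}, {{crc32c}}, {{timeCreated}}, {{updated}}) are fields of the GCS object resource.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


def _parse_rfc3339(value):
    # GCS timestamps end in 'Z'; rewrite as an explicit offset for fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass(frozen=True)
class ObjectInfo:
    """Hypothetical container for the metadata returned by one request."""

    name: str
    size: int
    md5_hash: Optional[str]
    crc32c: Optional[str]
    time_created: datetime
    updated: datetime

    @classmethod
    def from_resource(cls, resource):
        """Build an ObjectInfo from the dict form of a GCS object resource."""
        return cls(
            name=resource["name"],
            # The JSON API encodes the size as a string.
            size=int(resource["size"]),
            # Hashes are read with .get() because a resource may omit them.
            md5_hash=resource.get("md5Hash"),
            crc32c=resource.get("crc32c"),
            time_created=_parse_rfc3339(resource["timeCreated"]),
            updated=_parse_rfc3339(resource["updated"]),
        )
```

A caller could then read {{info.size}}, {{info.time_created}} and {{info.updated}} from the same value instead of issuing one request per attribute.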