[
https://issues.apache.org/jira/browse/AIRFLOW-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Manish Zope reassigned AIRFLOW-5612:
------------------------------------
Assignee: Manish Zope
> Add ability to actually do things with created and modified date in
> GoogleCloudStorageHook
> ------------------------------------------------------------------------------------------
>
> Key: AIRFLOW-5612
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5612
> Project: Apache Airflow
> Issue Type: Improvement
> Components: gcp
> Affects Versions: 1.10.5
> Reporter: Joel Croteau
> Assignee: Manish Zope
> Priority: Major
>
> {{GoogleCloudStorageHook}} seems to support only a small subset of the actual
> GCS API. In particular, the only thing it lets you do with the date of an
> object is check whether the metadata was updated after a specified time,
> using {{is_updated_after}}. First, this looks only at the metadata update
> time, which is probably not what is wanted for most purposes, since
> {{timeCreated}} is generally the field that conveys useful information.
> Second, it seems rather arbitrary to allow only a check that the updated time
> is later than some other time, rather than returning the time and letting me
> draw my own inferences. For example, for a scheduled workflow with a
> potential backfill, I would like to check that the creation date falls
> between a minimum and a maximum value, which this does not allow.
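The minimum/maximum check described above can be sketched in plain Python. This is only an illustration, assuming the caller already holds the object's creation time as a timezone-aware datetime; the function name {{created_within}} and the sample dates are hypothetical and are not part of {{GoogleCloudStorageHook}}.

```python
from datetime import datetime, timezone


def created_within(time_created, earliest, latest):
    """Return True if earliest <= time_created < latest.

    All three arguments must be timezone-aware datetimes, so that the
    comparison is unambiguous (hypothetical helper, not an Airflow API).
    """
    for value in (time_created, earliest, latest):
        if value.tzinfo is None:
            raise ValueError("expected timezone-aware datetimes")
    return earliest <= time_created < latest


# Hypothetical schedule interval for one backfilled run.
window_start = datetime(2019, 10, 1, tzinfo=timezone.utc)
window_end = datetime(2019, 10, 2, tzinfo=timezone.utc)

# Hypothetical creation time of an object in the bucket.
created = datetime(2019, 10, 1, 6, 30, tzinfo=timezone.utc)

in_window = created_within(created, window_start, window_end)
```

The half-open interval (lower bound inclusive, upper bound exclusive) is a design choice made for this sketch, so that consecutive schedule intervals do not both claim an object created exactly on the boundary.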
>
> Also, tangentially, getting multiple pieces of information about an object
> through {{GoogleCloudStorageHook}} requires a separate call to
> {{objects().get()}} for each piece, even though a single call returns
> everything. Would it not make more sense to return an object structure
> containing all of the needed information?
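One possible shape for such an object structure is sketched below. It assumes the response of {{objects().get()}} is available as a plain dict using GCS JSON API field names ({{bucket}}, {{name}}, {{size}}, {{timeCreated}}, {{updated}}). The class {{ObjectInfo}}, the helper {{parse_rfc3339}}, and the sample values are hypothetical and do not exist in Airflow.

```python
from dataclasses import dataclass
from datetime import datetime


def parse_rfc3339(value):
    """Parse a timestamp such as "2019-10-07T12:34:56.789Z".

    Assumes a trailing "Z" and millisecond precision; other RFC 3339
    variants may need a more general parser.
    """
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass(frozen=True)
class ObjectInfo:
    """Hypothetical container for the fields of one object resource."""

    bucket: str
    name: str
    size: int
    time_created: datetime
    updated: datetime

    @classmethod
    def from_resource(cls, resource):
        # The JSON API reports size as a decimal string, so convert it.
        return cls(
            bucket=resource["bucket"],
            name=resource["name"],
            size=int(resource["size"]),
            time_created=parse_rfc3339(resource["timeCreated"]),
            updated=parse_rfc3339(resource["updated"]),
        )


# Made-up resource dict standing in for a single objects().get() response.
sample = {
    "bucket": "example-bucket",
    "name": "data/2019-10-07.csv",
    "size": "1024",
    "timeCreated": "2019-10-07T12:34:56.789Z",
    "updated": "2019-10-08T01:02:03.456Z",
}

info = ObjectInfo.from_resource(sample)
```

With a structure like this, one request would be enough to read the size, the creation time, and the update time of an object, and callers could compare the times however they need.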