[ 
https://issues.apache.org/jira/browse/AIRFLOW-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894725#comment-16894725
 ] 

Joel Croteau edited comment on AIRFLOW-5046 at 8/3/19 5:00 PM:
---------------------------------------------------------------

[~m1racoli], I agree that the third solution sounds best, and I would suggest 
doing it at template expansion time. An XCom value can be any pickleable 
object, and template expansion simply replaces the templated value with the 
expanded value returned by Jinja, so nothing on the Airflow side requires 
that expanded value to be a string. Jinja itself renders to strings, though, 
so changing that behavior would require modifying or extending Jinja. 
Alternatively, there could be a second phase of template expansion, or a 
special template syntax, that checks specifically for XCom operators and 
changes the templated value to whatever was actually passed to XCom.
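
As a rough illustration of the "second phase" idea, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the `{{ xcom:KEY }}` marker syntax, the `expand_field` helper, and the plain dict standing in for XCom storage are hypothetical and are not part of Airflow or Jinja.

```python
import re

# Hypothetical marker syntax: a field whose entire content is "{{ xcom:KEY }}".
# This is NOT an existing Airflow or Jinja feature.
XCOM_REF = re.compile(r"^\s*\{\{\s*xcom:(?P<key>[\w.-]+)\s*\}\}\s*$")


def expand_field(value, xcom_store):
    """Replace a whole-field XCom reference with the stored object.

    `xcom_store` is a plain dict standing in for XCom. Any value that is
    not a string, or does not match the marker, is returned unchanged so
    that ordinary string templating can still handle it.
    """
    if isinstance(value, str):
        match = XCOM_REF.match(value)
        if match:
            return xcom_store[match.group("key")]
    return value


store = {"new_objects": ["logs/a.csv", "logs/b.csv"]}
source_objects = expand_field("{{ xcom:new_objects }}", store)
# source_objects is now the original list object, not its string form.
```

Because the reference must fill the whole field, the result can stay a list; a reference embedded in a longer string would still have to go through normal string rendering. Jinja 2.10 and later also provide `jinja2.nativetypes.NativeEnvironment`, which renders templates to native Python types; whether that fits Airflow's templating is not something this comment settles.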


> Allow GoogleCloudStorageToBigQueryOperator to accept source_objects as a 
> string or otherwise take input from XCom
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-5046
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5046
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: contrib, gcp
>    Affects Versions: 1.10.2
>            Reporter: Joel Croteau
>            Priority: Minor
>
> `GoogleCloudStorageToBigQueryOperator` should be able to have its 
> `source_objects` dynamically determined by the results of a previous 
> workflow. This is hard to do because the operator expects a list, while any 
> template expansion renders as a string. This could be implemented either as 
> a check for whether `source_objects` is a string, parsing it as a list if it 
> is, or as a separate argument for a string encoded as a list.
> My particular use case for this is as follows:
>  # A daily DAG scans a GCS bucket for all objects created in the last day and 
> loads them into BigQuery.
>  # To find these objects, a `PythonOperator` scans the bucket and returns a 
> list of object names.
>  # A `GoogleCloudStorageToBigQueryOperator` is used to load these objects 
> into BigQuery.
> The operator should be able to have its list of objects provided by XCom, but 
> there is no functionality to do this, and trying to do a template expansion 
> along the lines of 
> `source_objects='{{ task_instance.xcom_pull(key="KEY") }}'` 
> doesn't work because this is rendered as a string, which 
> `GoogleCloudStorageToBigQueryOperator` will try to treat as a list, with each 
> character being a single item.
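
The failure mode and the first proposed fix (check whether `source_objects` is a string and parse it as a list) can be sketched without Airflow. The helper name `normalize_source_objects` is hypothetical, and using `ast.literal_eval` to parse the rendered string is an assumption about how such a check might be written, not the operator's actual behavior.

```python
import ast


def normalize_source_objects(value):
    """Return a list of object names from a list or a rendered string.

    Hypothetical helper: a string that parses as a Python list or tuple
    literal is converted to a list; any other string is treated as a
    single object name.
    """
    if isinstance(value, str):
        try:
            parsed = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            return [value]
        if isinstance(parsed, (list, tuple)):
            return [str(item) for item in parsed]
        return [value]
    return list(value)


# Roughly what a template renders when the XCom value is a list.
rendered = str(["logs/a.csv", "logs/b.csv"])

# Treating the rendered string as a list yields one item per character.
per_character = list(rendered)

# Parsing it first recovers the intended object names.
recovered = normalize_source_objects(rendered)
```

`ast.literal_eval` only evaluates Python literals, so it does not execute arbitrary code from the rendered string; a JSON-encoded list parsed with `json.loads` would be another option under the same idea.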



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
