lexey-e-shelf commented on issue #37577: URL: https://github.com/apache/airflow/issues/37577#issuecomment-1964789983
> I did a regression and can confirm the analysis you made. The problem is that the expansion makes a lazy XCom resolution, and this requires the Airflow codebase and XCom database access, so both the Airflow Python code and a correct DB connection string need to be there.
>
> I can also confirm that your proposed workaround of putting a PythonOperator in between resolves the problem, as this would do the lazy aggregate.
>
> If this is not easy to fix, I'd propose that the documentation should at least mention it. The current restrictions on XCom in KPO are solely about how XCom is passed out of the operator, with no statement about the input in the case of task mapping aggregation / the lazy XCom restriction.

That's great to hear, thanks for confirming these things!

I agree that at a minimum the documentation (probably on the Dynamic Task Mapping page [here](https://airflow.apache.org/docs/apache-airflow/stable/authoring-and-scheduling/dynamic-task-mapping.html#:~:text=Values%20passed%20from,number%20is%20large.)) should reflect this limitation, and potentially should also mention the `PythonOperator` workaround, depending on how difficult it is to find and implement a widely agreeable solution.

A potential solution would be to build eager loading of `LazyXComAccess` arguments into all operators that are meant to work without a connection to the XCom database (e.g., all operators listed under the "Using the TaskFlow API with complex/conflicting Python dependencies" section of the docs [here](https://airflow.apache.org/docs/apache-airflow/stable/tutorial/taskflow.html#using-the-taskflow-api-with-complex-conflicting-python-dependencies)). I'm not sure how difficult that would be, though.
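
To make the lazy-versus-eager distinction concrete, here is a minimal, Airflow-free sketch of why an intermediate task helps. Everything in it is a hypothetical stand-in rather than Airflow's actual implementation: `FakeXComDb` plays the XCom database, `FakeLazyXCom` plays a lazily resolved sequence such as `LazyXComAccess`, `materialize` plays the in-between `PythonOperator`, and `run_isolated` plays a task that runs without database access (for example, inside a pod).

```python
class FakeXComDb:
    """Hypothetical stand-in for the XCom database."""

    def __init__(self, rows):
        self.rows = rows
        self.reachable = True

    def fetch(self, key):
        # Reading values only works while the database is reachable.
        if not self.reachable:
            raise ConnectionError("XCom database is not reachable")
        return list(self.rows[key])


class FakeLazyXCom:
    """Hypothetical lazy sequence: stores a reference, reads values on iteration."""

    def __init__(self, db, key):
        self.db = db
        self.key = key

    def __iter__(self):
        # The database is hit here, at iteration time, not at construction time.
        return iter(self.db.fetch(self.key))


def materialize(values):
    """What the intermediate task would do: force evaluation into a plain list."""
    return list(values)


def run_isolated(db, values):
    """Simulate a task that consumes its arguments without database access."""
    db.reachable = False
    try:
        return sum(values)
    finally:
        db.reachable = True


db = FakeXComDb({"upstream": [1, 2, 3]})
lazy = FakeLazyXCom(db, "upstream")

# Passing the lazy object straight into the isolated task fails on iteration.
try:
    run_isolated(db, lazy)
    lazy_failed = False
except ConnectionError:
    lazy_failed = True

# Materializing first (while the database is reachable) yields a plain list
# that the isolated task can consume without any database access.
eager = materialize(lazy)
total = run_isolated(db, eager)
```

The eager-loading idea suggested above would amount to running the equivalent of `materialize` on such arguments before they are handed to an operator that cannot reach the XCom database.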
