mszpot-future-processing opened a new issue, #41306:
URL: https://github.com/apache/airflow/issues/41306

   ### Apache Airflow version
   
   main (development)
   
   ### If "Other Airflow 2 version" selected, which one?
   
   2.8.1
   
   ### What happened?
   
   The Airflow task below throws an error:
   
   ```
   [2024-08-07T09:05:00.142+0000] {{xcom.py:664}} ERROR - Object of type GlueJobOperator is not JSON serializable. If you are using pickle instead of JSON for XCom, then you need to enable pickle support for XCom in your airflow config or make sure to decorate your object with attr.
   ```
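
   For context on the error itself: XCom serializes a task's return value with `json.dumps` by default, and an operator instance is not JSON-serializable. A few lines of plain Python reproduce the failure mode (the `GlueJobOperator` class here is a minimal stand-in, not the real operator):

   ```python
   import json

   # Minimal stand-in for the real operator class, just to show the failure mode.
   class GlueJobOperator:
       def __init__(self, task_id):
           self.task_id = task_id

   op = GlueJobOperator(task_id="demo")
   try:
       json.dumps(op)  # effectively what XCom does with a @task's return value
   except TypeError as err:
       print(err)  # Object of type GlueJobOperator is not JSON serializable
   ```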
   
   
   Code:
   
   ```
   @task
   def lag_tasks_with_filter(a, b, c, d, e, f, g, h):
       return GlueJobOperator(
           task_id=f"create_lags_task_{a}_{b}_w{c}_lag{d}_filter{e}",
           job_name=config.generate_job_name(f"lag{d}-weeks{c}-filter{e}-job-{a}-{b}"),
           script_location=config.get_bridge_script("lags_bridge_script.py"),
           iam_role_name=f,
           script_args={
               "--lagWithCatPath": f"s3://{g}/output/with_cat/a={a}/demographic={b}",
               "--rawDataInputPath": f"s3://{h}/output/oneyear/a={a}/demographic_code={b}/",
               "--numberOfLagWeeks": str(d),
               "--windowSizeWeeks": str(c),
               "--filterCol": e,
               "--taskId": f"create_lags_task_{a}_{b}_w{c}_lag{d}_filter{e}",
           },
           create_job_kwargs={
               "WorkerType": "G.2X",
               "NumberOfWorkers": 5,
               "GlueVersion": "4.0",
               "DefaultArguments": {
                   "--job-language": "python",
                   "--enable-job-insights": "true",
                   "--enable-metrics": "true",
                   "--enable-auto-scaling": "true",
                   "--enable-observability-metrics": "true",
                   "--TempDir": f"s3://{config.get_environment_variable('glue_tmp_dir_location', default_var='undefined')}",
                   "--extra-py-files": config.get_asset_file_location(
                       "ctc_telligence_forecasting_data_product-0.0.1-py3-none-any.whl"
                   ),
                   "--enable-spark-ui": "true",
                   "--spark-event-logs-path": f"s3://{config.get_environment_variable('glue_spark_ui_logs_location', default_var='undefined')}",
               },
           },
           update_config=True,
       )


   ts = DummyOperator(task_id='start')
   te = DummyOperator(task_id='end')
   t1 = lag_tasks_with_filter.partial(
       f=stage3_task_role, g=intermittent_data_location, h=playground_bucket
   ).expand(a=as_, b=bs, c=cs, d=ds, e=es)


   # setting dependencies
   ts >> t1 >> te
   ```
   
   When the `return` statement is removed, the DAG parses, but the Glue jobs are never created or triggered. I want to keep the `@task` decorator syntax, since it allows creating mapped task instances with `expand()`.
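
   Since the underlying problem is that a `@task`'s return value must be XCom-serializable, one pattern that keeps `expand()`-style mapping without returning an operator is to map the classic operator directly: operators support `.partial()`/`.expand()` themselves (Airflow 2.3+), and per-combination kwargs can be built by an upstream `@task` and fed through `.expand_kwargs()` (Airflow 2.4+). A rough sketch, reusing names from the snippet above, not a tested drop-in replacement:

   ```
   @task
   def glue_kwargs(a, b, c, d, e):
       # One dict of operator kwargs per mapped combination; plain dicts
       # and strings are JSON-serializable, so this passes through XCom.
       return {
           "job_name": config.generate_job_name(f"lag{d}-weeks{c}-filter{e}-job-{a}-{b}"),
           "script_args": {"--numberOfLagWeeks": str(d), "--windowSizeWeeks": str(c)},
       }

   t1 = GlueJobOperator.partial(
       task_id="create_lags_task",
       script_location=config.get_bridge_script("lags_bridge_script.py"),
       iam_role_name=stage3_task_role,
       update_config=True,
   ).expand_kwargs(glue_kwargs.expand(a=..., b=..., c=..., d=..., e=...))  # the five input lists
   ```

   The mapped `glue_kwargs` output resolves to a list of dicts, which is exactly what `expand_kwargs()` expects, so each Glue job still gets its own mapped task instance.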
   
   Thanks in advance for any help!
   
   ### What you think should happen instead?
   
   Glue jobs should get created in AWS.
   
   ### How to reproduce
   
   Please use the code provided above for the `@task`-decorated function.
   
   ### Operating System
   
   NA
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Amazon (AWS) MWAA
   
   ### Deployment details
   
   Airflow version == 2.8.1
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

