bolkedebruin commented on issue #5254: [AIRFLOW-4473] Add papermill operator
URL: https://github.com/apache/airflow/pull/5254#issuecomment-490224686

Papermill is awesome! Consider the following DAG:

```
from datetime import timedelta

from airflow.models import DAG
from airflow.operators.papermill_operator import PapermillOperator
from airflow.utils.dates import days_ago

args = {
    'owner': 'airflow',
    'start_date': days_ago(2)
}

# Run once a day at midnight; fail the DAG run if it takes over an hour.
dag = DAG(
    dag_id='example_papermill_operator',
    default_args=args,
    schedule_interval='0 0 * * *',
    dagrun_timeout=timedelta(minutes=60))

# Execute the input notebook and write the result to a per-run output
# notebook; the templated fields are rendered with the execution date.
run_this = PapermillOperator(
    task_id="run_example_notebook",
    dag=dag,
    input_nb="/tmp/hello_world.ipynb",
    output_nb="/tmp/out-{{ execution_date }}.ipynb",
    parameters={"msgs": "Ran from Airflow at {{ execution_date }}!"}
)

if __name__ == "__main__":
    dag.cli()
```

The simple notebook looks like this:

```
msgs = "Hello!"  # <-- parameterized cell
print(msgs)
```

BTW: you will also like this in the context of Amundsen. This operator auto-generates lineage information. If you implement your own lineage client in Airflow, you can integrate this with Neo4j/Elastic. Atlas is already supported ;-) (Overall, the lineage capability in Airflow needs some love; usage needs to guide it.)
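To make the "parameterized cell" idea concrete, here is a rough, stdlib-only sketch of what happens to the notebook: papermill looks for the cell tagged `parameters` and adds a new cell right after it that overrides the defaults. This is an illustration of the idea, not papermill's actual code; the `inject_parameters` helper, the in-memory `hello_world` dict and the date in the message are all made up for the example.

```python
import copy


def inject_parameters(notebook, parameters):
    """Return a copy of `notebook` with an extra cell overriding the defaults.

    Hypothetical helper: it mimics the idea behind papermill's
    parameterization and is not part of papermill or Airflow.
    """
    nb = copy.deepcopy(notebook)
    cells = nb["cells"]
    # Locate the first cell tagged "parameters" (the tag papermill looks for).
    index = next(
        (i for i, cell in enumerate(cells)
         if "parameters" in cell.get("metadata", {}).get("tags", [])),
        -1,
    )
    injected = {
        "cell_type": "code",
        "metadata": {"tags": ["injected-parameters"]},
        "source": "\n".join(
            f"{name} = {value!r}" for name, value in parameters.items()
        ),
        "outputs": [],
        "execution_count": None,
    }
    # Insert right after the tagged cell, or at the top if no cell is tagged.
    cells.insert(index + 1, injected)
    return nb


# In-memory stand-in for /tmp/hello_world.ipynb (assumed structure).
hello_world = {
    "cells": [
        {"cell_type": "code", "metadata": {"tags": ["parameters"]},
         "source": 'msgs = "Hello!"', "outputs": [], "execution_count": None},
        {"cell_type": "code", "metadata": {},
         "source": "print(msgs)", "outputs": [], "execution_count": None},
    ],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 2,
}

# The date below is a placeholder for the rendered {{ execution_date }}.
out = inject_parameters(hello_world, {"msgs": "Ran from Airflow at 2019-05-07!"})
for cell in out["cells"]:
    print(cell["source"])
```

Because the injected cell runs after the default assignment, `print(msgs)` in the output notebook shows the value passed from Airflow rather than "Hello!". The input notebook itself is left untouched, which is why the DAG writes to a separate `output_nb`.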
