marcosmarxm opened a new pull request #14492: URL: https://github.com/apache/airflow/pull/14492
This PR adds a new community provider for Airbyte. [Airbyte](www.airbyte.io) is an open-source data integration tool. I work as a Data Engineer, and being able to control Airbyte jobs through Airflow would help a lot in my projects, so I think this integration will be useful to others as well.

The Airbyte API was released at the beginning of the month, so I took the opportunity to start building the Operator and Hook. In a discussion on the [Airbyte GitHub](https://github.com/airbytehq/airbyte/issues/836), it was suggested to use the UUID (`connectionId`) of the connection itself. This way, the Operator acts as a trigger that calls the Airbyte API.

I am still working on some additional files, such as the logo and the docs changes. I would appreciate any comments on them.

---

**About the execution steps from the Operator and Hook**

I built the **AirbyteTriggerSyncOperator**, which calls the **AirbyteHook** (like most providers do). The AirbyteHook calls the `submit_job` function, which sends a request to the API and returns the `job_id` value. Because jobs can take a long time, the hook then waits until the job reports a status other than running. Using the `job_id`, the `wait_for_job` function monitors the job until it returns a **success** or **error** status, or until a **timeout** is reached.

---

I used the Google Dataproc and OpsGenie providers as the basis for building mine. OpsGenie also uses API calls based on HttpHook, and Dataproc helped me build the logic that waits for the job to finish before the DAG continues.

I have already run the tests locally, but using my own setup, because I had a problem configuring Airflow by following CONTRIBUTORS_QUICK_START. As this is my first contribution, I expect there are points that can be improved, and I will be very grateful for your feedback.
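The wait step described above can be sketched roughly as follows. This is a minimal illustration and not the hook code from this PR: the status strings, the `get_status` callable, the exception classes, and the `wait_seconds` and `timeout` parameter names are all assumptions made for the example, and the real hook would fetch the status through an HTTP request to the Airbyte API.

```python
import time
from typing import Callable

# Hypothetical status names; the values returned by the real Airbyte API may differ.
RUNNING = "running"
SUCCEEDED = "succeeded"


class JobFailed(Exception):
    """Raised when the job finishes with any status other than success."""


class JobTimeout(Exception):
    """Raised when the job is still running after the timeout has elapsed."""


def wait_for_job(
    get_status: Callable[[int], str],  # hypothetical: returns the status for a job_id
    job_id: int,
    wait_seconds: float = 3.0,
    timeout: float = 3600.0,
) -> str:
    """Poll the job status until success, a non-running error status, or timeout."""
    start = time.monotonic()
    while True:
        status = get_status(job_id)
        if status == SUCCEEDED:
            return status
        if status != RUNNING:
            # Any terminal status other than success is treated as an error here.
            raise JobFailed(f"Job {job_id} ended with status {status!r}")
        if time.monotonic() - start > timeout:
            raise JobTimeout(f"Job {job_id} still running after {timeout} seconds")
        time.sleep(wait_seconds)
```

In this sketch the operator would call a `submit_job`-style function first, then pass the returned `job_id` to `wait_for_job`, so the Airflow task only finishes once the sync has succeeded, failed, or timed out.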
