koustreak opened a new pull request, #43408:
URL: https://github.com/apache/airflow/pull/43408
Purpose of the PR: git-sync from multiple repositories and branches.

Change list:
- dags.gitSync can now hold multiple entries, listed under dags.gitSync.repositories
- values.schema.json is updated with the new schema for dags.gitSync.repositories
- check-values.yaml now enforces the following rules:
  - duplicate repo and branch combinations are rejected
  - containerName must be unique in values.yaml for every entry in dags.gitSync.repositories
- _helpers.yaml (the helper functions) is updated to accommodate multi-repo settings
- the deployment YAML files are updated to support the multi-repo setting
- the dags folder path is changed so that a symlink can be created for every repo
- known-hosts and sshKey are supported for multiple repos or branches
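
The two check-values.yaml rules above can be illustrated with a small Python sketch. This is not the chart's template code, only a restatement of the rules; the entry keys `repo`, `branch`, and `containerName` are assumed to mirror the existing dags.gitSync fields, and `validate_repositories` is a hypothetical name.

```python
from collections import Counter


def validate_repositories(repositories):
    """Return a list of error messages; an empty list means the entries pass.

    Each entry is assumed (for illustration) to be a dict with the keys
    "repo", "branch", and "containerName".
    """
    errors = []

    # Rule 1: the same repo and branch combination must not appear twice.
    pair_counts = Counter((r["repo"], r["branch"]) for r in repositories)
    for (repo, branch), count in pair_counts.items():
        if count > 1:
            errors.append(f"duplicate repo/branch combination: {repo}@{branch}")

    # Rule 2: containerName must be unique across all entries.
    name_counts = Counter(r["containerName"] for r in repositories)
    for name, count in name_counts.items():
        if count > 1:
            errors.append(f"duplicate containerName: {name}")

    return errors
```

In the chart itself a violation is expected to fail `helm install`; here the caller decides what to do with the returned messages.
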

**This is a new feature that I have been working on. I am happy to cooperate on any further test cases; below are the test cases I have performed.**

Test Case | Expected Behaviour | Result
-- | -- | --
Multiple git repos (2 private and 1 public) | gitSync pulls from multiple repos or branches | PASSED
Multiple git repos (2 private with ssh) | gitSync pulls from multiple repos or branches | PASSED
Single git repo | Existing behaviour is not interrupted | PASSED
Duplicate entries in the dags.gitSync.repositories section | helm install raises an error | PASSED
Known-hosts for all repos | Known-hosts are listed in airflow.cfg for all repos | PASSED

**Change List**:

File Name | Changes
-- | --
templates/NOTES.txt | Adds a note about the multi-repo changes; it is informational only.
templates/check-values.yaml | The multi-repo changes come with two conditions: <br>**1**. containerName must be unique for every repo in the dags.gitSync.repositories list <br>**2**. the dags.gitSync.repositories list must not contain duplicate entries
templates/_helpers.yaml | <br>**1**. Creates a git-sync-init and a git-sync container for every entry in the dags.gitSync.repositories list <br>**2**. Creates separate volumeMounts and volumes for the ssh key and repo of each entry in dags.gitSync.repositories <br>**3**. Creates the dags folder path for all entries in dags.gitSync.repositories <br>**4**. Adds a new helper that creates a symlink for every repo; this helper is called in the lifecycle of each container (scheduler, worker, triggerer, webserver, ...)
templates/scheduler/scheduler-deployment.yaml | Lists the volumes for all repos and for the sshKey. The container lifecycle also calls the airflow.dags_poststart helper, which creates a symlink for every repo only after git-sync has finished, to avoid a race condition.
templates/worker/worker-deployment.yaml | Lists the volumes for all repos and for the sshKey. The container lifecycle also calls the airflow.dags_poststart helper, which creates a symlink for every repo only after git-sync has finished, to avoid a race condition.
templates/webserver/webserver-deployment.yaml | Lists the volumes for all repos and for the sshKey. The container lifecycle also calls the airflow.dags_poststart helper, which creates a symlink for every repo only after git-sync has finished, to avoid a race condition.
templates/triggerer/triggerer-deployment.yaml | Lists the volumes for all repos and for the sshKey. The container lifecycle also calls the airflow.dags_poststart helper, which creates a symlink for every repo only after git-sync has finished, to avoid a race condition.
templates/secrets/git-ssh-key-secrets.yaml | Changes the secrets so that they can hold the ssh keys of multiple repos from dags.gitSync.repositories.
templates/dag-processor/dag-processor-deployment.yaml | Lists the volumes for all repos and for the sshKey. The container lifecycle also calls the airflow.dags_poststart helper, which creates a symlink for every repo only after git-sync has finished, to avoid a race condition.
templates/configmaps/configmap.yaml | Supports multiple known-hosts entries from dags.gitSync.repositories.
files/pod-template-file.kubernetes-helm-yaml | Changes the volumes and the container lifecycle to support the multi-repo changes.
values.yaml | Introduces a new list element in dags.gitSync, named repositories, which holds multiple repo or branch entries.
values.schema.json | Updated because a new element has been introduced in dags.gitSync.
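
The symlink step performed by airflow.dags_poststart can be pictured roughly as follows. This is a Python sketch of the idea only, not the helper's actual contents (which run as a container lifecycle hook); the directory layout, the function name `link_repositories`, and the "skip if the checkout is missing" behaviour are all assumptions made for illustration.

```python
import os
import tempfile


def link_repositories(sync_root, dags_dir, repo_names):
    """Create dags_dir/<name> -> sync_root/<name> for each synced repo.

    Returns the list of links that were created. Repos whose checkout
    directory does not exist yet are skipped, which stands in for
    "only link after git-sync has finished".
    """
    os.makedirs(dags_dir, exist_ok=True)
    created = []
    for name in repo_names:
        target = os.path.join(sync_root, name)
        link = os.path.join(dags_dir, name)
        if not os.path.isdir(target):
            # Checkout is not present yet, so do not create a dangling link.
            continue
        if os.path.islink(link):
            # Replace a stale link so that repeated runs are idempotent.
            os.remove(link)
        os.symlink(target, link)
        created.append(link)
    return created


if __name__ == "__main__":
    # Demonstration in a throwaway directory with one synced repo.
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "sync", "repo-a"))
    links = link_repositories(
        os.path.join(root, "sync"), os.path.join(root, "dags"), ["repo-a", "repo-b"]
    )
    print(links)
```

Linking only after the checkout exists is what avoids the race condition described in the table: a component never sees a dags path that points at a repo that has not been pulled yet.
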
Thanks and Regards,
koushik