On 2018-01-13 08:12, Daniel Imberman <[email protected]> wrote:
> @jordan can you turn delete mode off and post the kubectl describe results
> for the workers?
Already had delete mode turned off. This was a really useful command. I can see
the basic logs in the k8s dashboard:
+ airflow run jordan_dag_3 run_this_1 2018-01-15T10:00:00 --local -sd /root/airflow/dags/jordan3.py
[2018-01-15 19:40:52,978] {__init__.py:46} INFO - Using executor LocalExecutor
[2018-01-15 19:40:53,012] {models.py:187} INFO - Filling up the DagBag from /root/airflow/dags/jordan3.py
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 27, in <module>
    args.func(args)
  File "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 350, in run
    dag = get_dag(args)
  File "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 128, in get_dag
    'parse.'.format(args.dag_id))
airflow.exceptions.AirflowException: dag_id could not be found: jordan_dag_3. Either the dag did not exist or it failed to parse.
I know the DAG is there in the scheduler and the webserver. I have reason to
believe the git-sync init container in the worker isn't placing the checked-out
files where the worker can find them. Here's the info you requested:
Name:         jordandag3runthis1-8df809f80c874d6ca50acb0d0480307c
Namespace:    default
Node:         minikube/192.168.99.100
Start Time:   Mon, 15 Jan 2018 11:59:57 -0800
Labels:       airflow-slave=
              dag_id=jordan_dag_3
              execution_date=2018-01-15T19_59_54.838835
              task_id=run_this_1
Annotations:  pod.alpha.kubernetes.io/init-container-statuses=[{"name":"git-sync-clone","state":{"terminated":{"exitCode":0,"reason":"Completed","startedAt":"2018-01-15T19:59:58Z","finishedAt":"2018-01-15T19:59:59Z...
              pod.alpha.kubernetes.io/init-containers=[{"name":"git-sync-clone","image":"gcr.io/google-containers/git-sync-amd64:v2.0.5","env":[{"name":"GIT_SYNC_REPO","value":"<our git repo>...
              pod.beta.kubernetes.io/init-container-statuses=[{"name":"git-sync-clone","state":{"terminated":{"exitCode":0,"reason":"Completed","startedAt":"2018-01-15T19:59:58Z","finishedAt":"2018-01-15T19:59:59Z"...
              pod.beta.kubernetes.io/init-containers=[{"name":"git-sync-clone","image":"gcr.io/google-containers/git-sync-amd64:v2.0.5","env":[{"name":"GIT_SYNC_REPO","value":"https://github.com/pubnub/caravan.git"...
Status:       Failed
IP:
Init Containers:
  git-sync-clone:
    Container ID:   docker://c3dcc435d18362271fe5ab8098275d082c01ab36fc451d695e6e0e54ad71132a
    Image:          gcr.io/google-containers/git-sync-amd64:v2.0.5
    Image ID:       docker-pullable://gcr.io/google-containers/git-sync-amd64@sha256:904833aedf3f14373e73296240ed44d54aecd4c02367b004452dfeca2465e5bf
    Port:           <none>
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 15 Jan 2018 11:59:58 -0800
      Finished:     Mon, 15 Jan 2018 11:59:59 -0800
    Ready:          True
    Restart Count:  0
    Environment:
      GIT_SYNC_REPO:      <dag repo>
      GIT_SYNC_BRANCH:    master
      GIT_SYNC_ROOT:      /tmp
      GIT_SYNC_DEST:      dags
      GIT_SYNC_ONE_TIME:  true
      GIT_SYNC_USERNAME:  jzucker2
      GIT_SYNC_PASSWORD:  <password>
    Mounts:
      /root/airflow/airflow.cfg from airflow-config (ro)
      /root/airflow/dags/ from airflow-dags (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-0bq1k (ro)
Containers:
  base:
    Container ID:  <container id>
    Image:         <image>
    Image ID:      <our image id>
    Port:          <none>
    Command:
      bash
      -cx
      --
    Args:
      airflow run jordan_dag_3 run_this_1 2018-01-15T19:59:54.838835 --local -sd /root/airflow/dags/jordan3.py
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Mon, 15 Jan 2018 12:00:00 -0800
      Finished:     Mon, 15 Jan 2018 12:00:01 -0800
    Ready:          False
    Restart Count:  0
    Environment:
      AIRFLOW__CORE__AIRFLOW_HOME:   /root/airflow
      AIRFLOW__CORE__EXECUTOR:       LocalExecutor
      AIRFLOW__CORE__DAGS_FOLDER:    /tmp/dags
      SQL_ALCHEMY_CONN:              <set to the key 'sql_alchemy_conn' in secret 'airflow-secrets'>  Optional: false
      GIT_SYNC_USERNAME:             <set to the key 'username' in secret 'gitsecret'>  Optional: false
      GIT_SYNC_PASSWORD:             <set to the key 'password' in secret 'gitsecret'>  Optional: false
      AIRFLOW_CONN_PORTAL_DB_URI:    <set to the key 'portal_mysql_conn' in secret 'portaldbsecret'>  Optional: false
      AIRFLOW_CONN_OVERMIND_DB_URI:  <set to the key 'overmind_mysql_conn' in secret 'overminddbsecret'>  Optional: false
    Mounts:
      /root/airflow/airflow.cfg from airflow-config (ro)
      /root/airflow/dags/ from airflow-dags (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-0bq1k (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  airflow-dags:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  airflow-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      airflow-configmap
    Optional:  false
  default-token-0bq1k:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-0bq1k
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type    Reason                 Age  From               Message
  ----    ------                 ---- ----               -------
  Normal  Scheduled              3m   default-scheduler  Successfully assigned jordandag3runthis1-8df809f80c874d6ca50acb0d0480307c to minikube
  Normal  SuccessfulMountVolume  3m   kubelet, minikube  MountVolume.SetUp succeeded for volume "airflow-dags"
  Normal  SuccessfulMountVolume  3m   kubelet, minikube  MountVolume.SetUp succeeded for volume "airflow-config"
  Normal  SuccessfulMountVolume  3m   kubelet, minikube  MountVolume.SetUp succeeded for volume "default-token-0bq1k"
  Normal  Pulled                 3m   kubelet, minikube  Container image "gcr.io/google-containers/git-sync-amd64:v2.0.5" already present on machine
  Normal  Created                3m   kubelet, minikube  Created container
  Normal  Started                3m   kubelet, minikube  Started container
  Normal  Pulled                 3m   kubelet, minikube  Container image "artifactnub1-docker-local.jfrog.io/pubnub/pnairflow:0.1.6" already present on machine
  Normal  Created                3m   kubelet, minikube  Created container
  Normal  Started                3m   kubelet, minikube  Started container
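For what it's worth, the describe output above seems to explain the failure on its own: the init container has GIT_SYNC_ROOT=/tmp and GIT_SYNC_DEST=dags, so the checkout lands in /tmp/dags, but the shared emptyDir volume is mounted at /root/airflow/dags, which is also where the worker's -sd flag points. A minimal sketch of that mismatch (paths copied from the pod spec above; this only reproduces the path arithmetic, it does not talk to a cluster):

```shell
# Values taken from the worker pod spec above.
GIT_SYNC_ROOT=/tmp
GIT_SYNC_DEST=dags
checkout_dir="${GIT_SYNC_ROOT}/${GIT_SYNC_DEST}"   # where git-sync writes the repo
dags_mount="/root/airflow/dags"                    # where the shared emptyDir is mounted

echo "git-sync writes to:   ${checkout_dir}"
echo "worker -sd points at: ${dags_mount}"
if [ "${checkout_dir}" != "${dags_mount}" ]; then
  echo "MISMATCH: the synced DAGs never land in the shared volume"
fi
```

If that reading is right, either GIT_SYNC_ROOT needs to point at the volume mount path, or the volume needs to be mounted where git-sync actually writes.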
>
> On Sat, Jan 13, 2018, 3:20 AM Koen Mevissen <[email protected]> wrote:
>
> > Are you using kubernetes on Google Cloud Platform? (GKE)
> >
> > You should be able to capture the logs from your nodes. If you run GKE
> > with logging automatically deployed, then DaemonSets running fluentd will
> > ship logs from /var/log/containers on the nodes to Google Cloud Logging.
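As a concrete sketch of that suggestion (the filter values are assumptions about the legacy GKE/Stackdriver logging schema, not verified against this cluster, and the commands are printed rather than executed here since they need cluster access):

```shell
# What fluentd tails on each node, and how to query the shipped logs
# afterwards from a workstation. Both commands are only echoed because
# they require a live GKE cluster; the pod-name filter is hypothetical.
node_cmd='ls /var/log/containers/'
read_cmd='gcloud logging read "resource.type=\"container\" AND resource.labels.pod_id:jordandag3" --limit 50'

echo "# on the node (what fluentd tails):"
echo "$node_cmd"
echo "# from a workstation, even after the pod is gone:"
echo "$read_cmd"
```

The useful property is the second command: Cloud Logging retains the worker's stdout after the pod itself has been deleted.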
> >
> > Koen
> >
> > On Sat, Jan 13, 2018 at 01:18, Anirudh Ramanathan
> > <[email protected]> wrote:
> >
> > > > Any good way to debug this?
> > >
> > > One way might be reading the events from "kubectl get events". That
> > > should reveal some information about the pod removal event.
> > > This brings up another question - should errored pods be persisted for
> > > debugging?
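Until errored pods are persisted, one workaround is to keep a copy of the worker alive by hand. A sketch (the image name is taken from the thread, but the pod name and override JSON are assumptions, so treat this as a starting point; commands are printed rather than executed since they need a live cluster). Replacing the worker's command with a sleep keeps the pod running long enough to exec in and compare the candidate DAG directories:

```shell
# Build the kubectl invocations for a long-lived debug copy of the worker.
# Only echoed here; against a real cluster you would run them directly.
image="artifactnub1-docker-local.jfrog.io/pubnub/pnairflow:0.1.6"
overrides='{"spec":{"containers":[{"name":"debug","image":"'"$image"'","command":["sleep","3600"]}]}}'

echo "kubectl get events --sort-by=.metadata.creationTimestamp"
echo "kubectl run airflow-debug --image=$image --restart=Never --overrides='$overrides'"
echo "kubectl exec -it airflow-debug -- ls -la /tmp/dags /root/airflow/dags"
```

The exec at the end is the payoff: it shows directly whether the git-sync checkout ever reached the directory the worker searches.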
> > >
> > > On Fri, Jan 12, 2018 at 3:07 PM, [email protected] <
> > > [email protected]> wrote:
> > >
> > > > I'm trying to use Airflow with Kubernetes and am having trouble using
> > > > git-sync to pull DAGs into workers.
> > > >
> > > > I use a git-sync init container on the scheduler to pull in DAGs
> > > > initially, and that works. But when worker pods are spawned, they
> > > > terminate almost immediately because they cannot find the DAGs. Since
> > > > the workers terminate so quickly, I can't even inspect the file
> > > > structure to see where the DAGs ended up during the worker's git-sync
> > > > init container.
> > > >
> > > > I noticed that the git-sync init container for the workers is
> > > > hard-coded to /tmp/dags, and there is a git_subpath config setting as
> > > > well. But I can't understand how the git-synced DAGs ever end up in
> > > > /root/airflow/dags.
> > > >
> > > > I am successfully using a git-sync init container for the scheduler,
> > > > so I know my git credentials are valid. Any good way to debug this? Or
> > > > an example of how to set this up correctly?
> > > >
> > >
> > >
> > >
> > > --
> > > Anirudh Ramanathan
> > >
> > --
> > Kind regards,
> >
> > *Koen Mevissen*
> > Principal BI Developer
> >
> >
> > *Travix Nederland B.V.*
> > Piet Heinkade 55
> > 1019 GM Amsterdam
> > The Netherlands
> >
> > T. +31 (0)20 203 3241
> > E: [email protected]
> > www.travix.com
> >
> > *Brands: * CheapTickets | Vliegwinkel | Vayama | BudgetAir |
> > Flugladen
> >
>