surahman commented on pull request #3710:
URL: https://github.com/apache/incubator-heron/pull/3710#issuecomment-939633719
I have finalised the feature. I am moving on to tracing where the `submit`
command originates within the `scheduler-core` to locate where to delete the
topologies from in the event of a failed submission to K8s. I shall update the
documentation shortly. Input on what follows is genuinely appreciated and we
need broader testing.
<details><summary>Pod Template</summary>
```yaml
apiVersion: v1
kind: PodTemplate
metadata:
  name: pod-template-example
  namespace: default
template:
  metadata:
    name: acking-pod-template-example
  spec:
    containers:
      # Executor container
      - name: executor
        securityContext:
          allowPrivilegeEscalation: false
        env:
          - name: var_one
            value: "variable one"
          - name: var_two
            value: "variable two"
          - name: var_three
            value: "variable three"
          - name: POD_NAME
            value: "MUST BE OVERWRITTEN"
          - name: HOST
            value: "REPLACED WITH ACTUAL HOST"
        ports:
          - name: overwritten
            protocol: TCP
            containerPort: 6001
          - name: tcp-port-kept
            protocol: TCP
            containerPort: 5555
          - name: udp-port-kept
            protocol: UDP
            containerPort: 5556
        volumeMounts:
          - name: shared-volume
            mountPath: /shared_volume
      # Sidecar container
      - name: sidecar-container
        image: alpine
        volumeMounts:
          - name: shared-volume
            mountPath: /shared_volume
    # Volumes
    volumes:
      - name: shared-volume
        emptyDir: {}
```
</details>
<details><summary>describe pod acking-0</summary>
```bash
Name: acking-0
Namespace: default
Priority: 0
Node: <none>
Labels: app=heron
controller-revision-hash=acking-748f986d6f
statefulset.kubernetes.io/pod-name=acking-0
topology=acking
Annotations: prometheus.io/port: 8080
prometheus.io/scrape: true
Status: Pending
IP:
IPs: <none>
Controlled By: StatefulSet/acking
Containers:
executor:
Image: apache/heron:testbuild
Ports: 5555/TCP, 5556/UDP, 6001/TCP, 6002/TCP, 6003/TCP, 6004/TCP,
6005/TCP, 6006/TCP, 6007/TCP, 6008/TCP, 6009/TCP
Host Ports: 0/TCP, 0/UDP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP,
0/TCP, 0/TCP, 0/TCP
Command:
sh
-c
./heron-core/bin/heron-downloader-config kubernetes &&
./heron-core/bin/heron-downloader
distributedlog://zookeeper:2181/heronbkdl/acking-saad-tag-0-5579618957728031586.tar.gz
. && SHARD_ID=${POD_NAME##*-} && echo shardId=${SHARD_ID} &&
./heron-core/bin/heron-executor --topology-name=acking
--topology-id=acking6276c0d9-866f-4ac4-b8af-d54a9d51b3f9
--topology-defn-file=acking.defn --state-manager-connection=zookeeper:2181
--state-manager-root=/heron
--state-manager-config-file=./heron-conf/statemgr.yaml
--tmanager-binary=./heron-core/bin/heron-tmanager
--stmgr-binary=./heron-core/bin/heron-stmgr
--metrics-manager-classpath=./heron-core/lib/metricsmgr/*
--instance-jvm-opts="LVhYOitIZWFwRHVtcE9uT3V0T2ZNZW1vcnlFcnJvcg(61)(61)"
--classpath=heron-api-examples.jar
--heron-internals-config-file=./heron-conf/heron_internals.yaml
--override-config-file=./heron-conf/override.yaml
--component-ram-map=exclaim1:1073741824,word:1073741824 --component-jvm-opts=""
--pkg-type=jar --topology-binary-file=heron-api-examples.jar
--heron-java-home=$JAVA_HOME
--heron-shell-binary=./heron-core/bin/heron-shell --cluster=kubernetes
--role=saad --environment=default
--instance-classpath=./heron-core/lib/instance/*
--metrics-sinks-config-file=./heron-conf/metrics_sinks.yaml
--scheduler-classpath=./heron-core/lib/scheduler/*:./heron-core/lib/packing/*:./heron-core/lib/statemgr/*
--python-instance-binary=./heron-core/bin/heron-python-instance
--cpp-instance-binary=./heron-core/bin/heron-cpp-instance
--metricscache-manager-classpath=./heron-core/lib/metricscachemgr/*
--metricscache-manager-mode=disabled --is-stateful=false
--checkpoint-manager-classpath=./heron-core/lib/ckptmgr/*:./heron-core/lib/statefulstorage/*:
--stateful-config-file=./heron-conf/stateful.yaml
--checkpoint-manager-ram=1073741824 --health-manager-mode=disabled
--health-manager-classpath=./heron-core/lib/healthmgr/* --shard=$SHARD_ID
--server-port=6001 --tmanager-controller-port=6002 --tmanager-stats-port=6003
--shell-port=6004 --metrics-manager-port=6005 --scheduler-port=6006
--metricscache-manager-server-port=6007 --metricscache-manager-stats-port=6008
--checkpoint-manager-port=6009
Limits:
cpu: 3
memory: 4Gi
Requests:
cpu: 3
memory: 4Gi
Environment:
HOST: (v1:status.podIP)
POD_NAME: acking-0 (v1:metadata.name)
var_one: variable one
var_three: variable three
var_two: variable two
Mounts:
/shared_volume from shared-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from
kube-api-access-csb95 (ro)
sidecar-container:
Image: alpine
Port: <none>
Host Port: <none>
Environment: <none>
Mounts:
/shared_volume from shared-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from
kube-api-access-csb95 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
shared-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kube-api-access-csb95:
Type: Projected (a volume that contains injected data
from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.alpha.kubernetes.io/notReady:NoExecute
op=Exists for 10s
node.alpha.kubernetes.io/unreachable:NoExecute
op=Exists for 10s
node.kubernetes.io/not-ready:NoExecute
op=Exists for 10s
node.kubernetes.io/unreachable:NoExecute
op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 17s default-scheduler 0/1 nodes are
available: 1 Insufficient cpu.
```
</details>
## Heron Configured Items in Pod Templates
Heron will locate the container named `executor` in the Pod Template and
customize it as outlined below. All other containers within the Pod Template
will remain unchanged.
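
A minimal sketch of this selection step, assuming the Pod Template's `containers` list has already been parsed into plain dictionaries. The helper name `split_containers` is hypothetical and is not taken from Heron's code base.

```python
EXECUTOR_NAME = "executor"  # container name Heron looks for, per the text above


def split_containers(containers):
    """Return (executor, others); executor is None if the template has none."""
    executor = None
    others = []
    for container in containers:
        # Only the first container named "executor" is treated as the executor.
        if executor is None and container.get("name") == EXECUTOR_NAME:
            executor = container
        else:
            others.append(container)
    return executor, others


template_containers = [
    {"name": "executor", "env": [{"name": "var_one", "value": "variable one"}]},
    {"name": "sidecar-container", "image": "alpine"},
]
executor, others = split_containers(template_containers)
```

Here `executor` is the entry that would be customized, while `others` (the sidecar) is passed through untouched.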
### Executor Container
All metadata for the `executor` container will be overwritten by Heron. In
some other cases, values from the Pod Template for the `executor` will be
overwritten by Heron as outlined below.
| Name | Description | Policy |
|---|---|---|
| `image` | The `executor` container's image. | Overwritten by Heron using values from the config. |
| `env` | Environment variables made available within the container. The `HOST` and `POD_NAME` keys are required by Heron and are thus reserved. | Merged with Heron's values taking precedence. Deduplication is based on `name`. |
| `ports` | Port numbers opened within the container. Some of these port numbers are required by Heron and are thus reserved. The reserved ports are defined in Heron's constants as [`6001`-`6010`]. | Merged with Heron's values taking precedence. Deduplication is based on the `containerPort` value. |
| `limits` | Heron will attempt to load values for `cpu` and `memory` from its configs. If these values are not provided in the container's specs, Heron will place values from its configs. | User input takes precedence over Heron's values. This allows for per-job custom resource limits. |
| `volumeMounts` | These are the mount points within the `executor` container for the `volumes` available in the Pod. | Merged with Heron's values taking precedence. Deduplication is based on the `name` value. |
| Annotation: `prometheus.io/scrape` | Flag to indicate whether Prometheus logs can be scraped; set to `true`. | Value is overridden by Heron. |
| Annotation: `prometheus.io/port` | Port address for Prometheus log scraping; set to `8080`. | Value is overridden by Heron. |
| Annotation: Pod | Pod's revision/version hash. | Automatically set. |
| Annotation: Service | Labels services can use to attach to the Pod. | Automatically set. |
| Label: `app` | Name of the application launching the Pod; set to `heron`. | Value is overridden by Heron. |
| Label: `topology` | The name of the topology, as provided at submission. | User defined and supplied on the CLI. |
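
The merge policies in the table can be illustrated with a short sketch. It assumes the template and Heron's values are plain lists of dictionaries; `merge_by_key` and `resolve_limits` are hypothetical helpers, the `server` port name is made up for the example, and the per-key fallback for `limits` is an assumption about how missing values are filled in.

```python
def merge_by_key(user_items, heron_items, key):
    """Merge two lists of dicts, deduplicating on `key`; Heron's entries win."""
    merged = {item[key]: item for item in user_items}
    merged.update({item[key]: item for item in heron_items})
    return list(merged.values())


def resolve_limits(user_limits, heron_limits):
    """User-supplied limits win; Heron's config fills in any missing keys."""
    return {**heron_limits, **user_limits}


user_env = [
    {"name": "var_one", "value": "variable one"},
    {"name": "POD_NAME", "value": "MUST BE OVERWRITTEN"},
]
# Placeholder value: in the Pod above POD_NAME resolves from metadata.name.
heron_env = [{"name": "POD_NAME", "value": "<from metadata.name>"}]
env = merge_by_key(user_env, heron_env, "name")

user_ports = [
    {"name": "overwritten", "protocol": "TCP", "containerPort": 6001},
    {"name": "tcp-port-kept", "protocol": "TCP", "containerPort": 5555},
]
# "server" is an illustrative name for Heron's reserved port 6001.
heron_ports = [{"name": "server", "protocol": "TCP", "containerPort": 6001}]
ports = merge_by_key(user_ports, heron_ports, "containerPort")

limits = resolve_limits({"cpu": "2"}, {"cpu": "3", "memory": "4Gi"})
```

This matches the `describe` output above in spirit: the user's `5555` port survives, the user's entry on `6001` is replaced, and the placeholder `POD_NAME` value is overwritten.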
### Pod
The following items will be set in the Pod Template's `spec` by Heron.
| Name | Description | Policy |
|---|---|---|
| `terminationGracePeriodSeconds` | Grace period to wait before shutting down the Pod after a `SIGTERM` signal; set to `0` seconds. | Value is overridden by Heron. |
| `tolerations` | Attempts to colocate Pods with `tolerations` and `taints` onto nodes hosting Pods with matching `tolerations` and `taints`. <br> Keys:<br>`node.kubernetes.io/not-ready` <br> `node.alpha.kubernetes.io/notReady` <br> `node.alpha.kubernetes.io/unreachable` <br> Values (common):<br> `operator: "Exists"`<br> `effect: NoExecute`<br> `tolerationSeconds: 10L` | Values are overridden by Heron. |
| `containers` | Container images to be used on the executor Pods. | All `containers`, excluding the `executor`, are loaded as-is. |
| `volumes` | Volumes to be made available to the entire Pod. | Merged with Heron's values taking precedence. Deduplication is based on the `name` value. |
| `secretVolumes` | Secrets to be mounted as volumes within the Pod. | Loaded from the Heron configs if present. |
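
As an illustration of the `tolerations` row, the sketch below builds one toleration entry per key using the common values listed in the table. The function name `build_tolerations` is hypothetical, and the dictionary layout simply mirrors the Kubernetes toleration fields; it is not Heron's actual implementation.

```python
TOLERATION_KEYS = [
    "node.kubernetes.io/not-ready",
    "node.alpha.kubernetes.io/notReady",
    "node.alpha.kubernetes.io/unreachable",
]


def build_tolerations(keys, seconds=10):
    """Return one toleration per key, sharing the common values from the table."""
    return [
        {
            "key": key,
            "operator": "Exists",
            "effect": "NoExecute",
            "tolerationSeconds": seconds,
        }
        for key in keys
    ]


tolerations = build_tolerations(TOLERATION_KEYS)
```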