Taragolis commented on code in PR #25931:
URL: https://github.com/apache/airflow/pull/25931#discussion_r954748912


##########
docs/apache-airflow-providers-amazon/logging/s3-task-handler.rst:
##########
@@ -47,3 +47,86 @@ You can also use `LocalStack <https://localstack.cloud/>`_ to emulate Amazon S3
 To configure it, you must additionally set the endpoint url to point to your local stack.
 You can do this via the Connection Extra ``host`` field.
 For example, ``{"host": "http://localstack:4572"}``
+
+Enabling remote logging for Amazon S3 with AWS IRSA
+'''''''''''''''''''''''''''''''''''''''''''''''''''
+`IRSA <https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html>`_ is a feature that allows you to assign an IAM role to a Kubernetes service account.
+It works by leveraging a `Kubernetes <https://kubernetes.io/>`_ feature known as `Service Account <https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/>`_ Token Volume Projection.
+When Pods are configured with a Service Account that references an `IAM Role <https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html>`_, the Kubernetes API server calls the public OIDC discovery endpoint for the cluster on startup. When an AWS API is invoked, the AWS SDK calls ``sts:AssumeRoleWithWebIdentity``. IAM exchanges the Kubernetes-issued token for temporary AWS role credentials after validating the token's signature.
+
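+For example, you can verify that the projected identity is available inside a running Pod. The Pod name and namespace below are placeholders; with IRSA the role is exposed through environment variables such as ``AWS_ROLE_ARN`` and ``AWS_WEB_IDENTITY_TOKEN_FILE``.
+
+.. code-block:: bash
+
+    # Illustrative check only: replace <AIRFLOW_WORKER_POD> and the namespace with your own values.
+    kubectl exec -n airflow <AIRFLOW_WORKER_POD> -- env | grep AWS_
+    # Expected output (values will differ):
+    #   AWS_ROLE_ARN=arn:aws:iam::<ACCOUNT_ID>:role/<IRSA_ROLE_NAME>
+    #   AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
+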
+It is a recommended best practice to use IAM Roles for Service Accounts to access AWS services (e.g., S3) from Amazon EKS.
+The steps below guide you through creating a new IAM role with a Service Account and using it with the Airflow Webserver and Worker (Kubernetes Executor) Pods.
+
+Step 1: Create IAM role for service account (IRSA)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+This step creates the IAM role and service account using `eksctl <https://eksctl.io/>`_.
+Also, note that this example attaches a managed policy with full S3 permissions to the IAM role. This is for testing purposes only.
+We highly recommend creating a restricted S3 IAM policy and using it with ``--attach-policy-arn``, for example as sketched below.
+
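+A minimal sketch of such a restricted policy, assuming the logs only need to be written and read back; the policy name, bucket name and exact set of actions are placeholders you should adjust:
+
+.. code-block:: bash
+
+    # Illustrative only: adjust the policy name, bucket name and actions for your environment.
+    aws iam create-policy \
+        --policy-name airflow-s3-logs \
+        --policy-document '{
+          "Version": "2012-10-17",
+          "Statement": [{
+            "Effect": "Allow",
+            "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
+            "Resource": [
+              "arn:aws:s3:::<ENTER_YOUR_BUCKET_NAME>",
+              "arn:aws:s3:::<ENTER_YOUR_BUCKET_NAME>/*"
+            ]
+          }]
+        }'
+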
+Alternatively, you can use other IaC tools such as Terraform. For an example of deploying Airflow with Terraform, including IRSA, check out this `link <https://github.com/aws-ia/terraform-aws-eks-blueprints/tree/main/examples/analytics/airflow-on-eks>`_.
+
+Execute the following command by providing all the necessary inputs.
+
+.. code-block:: bash
+
+    eksctl create iamserviceaccount --cluster="<EKS_CLUSTER_ID>" --name="<SERVICE_ACCOUNT_NAME>" --namespace="<NAMESPACE>" --attach-policy-arn="<IAM_POLICY_ARN>" --approve
+
+Example with sample inputs:
+
+.. code-block:: bash
+
+    eksctl create iamserviceaccount --cluster=airflow-eks-cluster --name=airflow-sa --namespace=airflow --attach-policy-arn=arn:aws:iam::aws:policy/AmazonS3FullAccess --approve
+
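+You can optionally verify that the role and annotated Service Account were created. The names below are taken from the sample inputs above; adjust them for your cluster:
+
+.. code-block:: bash
+
+    # Optional verification; adjust the cluster, namespace and service account names.
+    eksctl get iamserviceaccount --cluster airflow-eks-cluster --namespace airflow
+    kubectl get serviceaccount airflow-sa -n airflow -o yaml   # shows the eks.amazonaws.com/role-arn annotation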
+
+Step 2: Update Helm Chart ``values.yaml`` with Service Account
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+This step uses the `Airflow Helm Chart <https://github.com/apache/airflow/tree/main/chart>`_ deployment.
+If you are deploying Airflow using the Helm Chart, you can modify ``values.yaml`` as shown below.
+Add the Service Account (e.g., ``airflow-sa``) created in Step 1 to the Helm Chart ``values.yaml`` under the following sections.
+We are using the existing ``serviceAccount``, hence ``create: false``, with the existing name ``name: airflow-sa``.
+
+
+.. code-block:: yaml
+
+    workers:
+      serviceAccount:
+        create: false
+        name: airflow-sa
+        # The annotation below is added automatically by Step 1, so you do not
+        # need to set it here. It is shown for information purposes only.
+        annotations:
+          eks.amazonaws.com/role-arn: <ENTER_IAM_ROLE_ARN_CREATED_BY_EKSCTL_COMMAND>
+
+    webserver:
+      serviceAccount:
+        create: false
+        name: airflow-sa
+        # The annotation below is added automatically by Step 1, so you do not
+        # need to set it here. It is shown for information purposes only.
+        annotations:
+          eks.amazonaws.com/role-arn: <ENTER_IAM_ROLE_ARN_CREATED_BY_EKSCTL_COMMAND>
+
+    config:
+      logging:
+        remote_logging: 'True'
+        logging_level: 'INFO'
+        remote_base_log_folder: 's3://<ENTER_YOUR_BUCKET_NAME>/<FOLDER_PATH>' # Specify the S3 bucket and folder used for logging
+        remote_log_conn_id: 'aws_s3_conn' # This name is used in Step 3 when creating the connection through the Airflow UI
+        delete_worker_pods: 'False'
+        encrypt_s3_logs: 'True'
+
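+After updating ``values.yaml``, apply it with ``helm upgrade``. A sketch, assuming the chart repository alias ``apache-airflow``, the release name ``airflow`` and the ``airflow`` namespace:
+
+.. code-block:: bash
+
+    # Release name, namespace and values file path are placeholders for your deployment.
+    helm repo add apache-airflow https://airflow.apache.org
+    helm upgrade --install airflow apache-airflow/airflow --namespace airflow -f values.yaml
+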
+Step 3: Create Amazon S3 connection in Airflow Web UI
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+With the above configuration, the Webserver and Worker Pods can access the Amazon S3 bucket and write logs without using any access key, secret key, or instance profile credentials.
+
+The final step is to create the connection in the Airflow UI before executing the DAGs.
+
+* Log in to the Airflow Web UI with ``admin`` credentials and navigate to ``Admin -> Connections``
+* Create a connection for ``S3`` and select the options (Connection ID and Connection Type) as shown in the image; a CLI alternative is sketched below.
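+
+As an alternative to the UI, a connection with the matching ID could also be created with the Airflow CLI. A sketch, assuming the ``aws`` connection type used by the Amazon provider; with IRSA no access keys need to be stored in the connection:
+
+.. code-block:: bash
+
+    # The connection ID must match remote_log_conn_id from Step 2; no credentials are required with IRSA.
+    airflow connections add aws_s3_conn --conn-type aws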

Review Comment:
   S3Hook, as well as all hooks based on AwsBaseHook, is intended to use an [Amazon Web Services Connection](https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/connections/aws.html).
   
   It could use an Amazon S3 Connection (not documented, exists for historical reasons???), but in that case users get a UserWarning that the connection has an incorrect connection type.
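   
   For example, the same connection could instead be declared with the `aws` connection type; a sketch using an environment variable, with the connection ID from the docs above (with IRSA no credentials go into the URI):
   
   ```bash
   # Hypothetical sketch: declares an empty AWS-type connection named aws_s3_conn.
   export AIRFLOW_CONN_AWS_S3_CONN='aws://'
   ```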



##########
docs/apache-airflow-providers-amazon/logging/s3-task-handler.rst:
##########
@@ -47,3 +47,86 @@ You can also use `LocalStack <https://localstack.cloud/>`_ to emulate Amazon S3
 To configure it, you must additionally set the endpoint url to point to your local stack.
 You can do this via the Connection Extra ``host`` field.
 For example, ``{"host": "http://localstack:4572"}``
+
+Enabling remote logging for Amazon S3 with AWS IRSA
+'''''''''''''''''''''''''''''''''''''''''''''''''''

Review Comment:
   [Using IAM Roles for Service Accounts (IRSA) on EKS](https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/connections/aws.html#using-iam-roles-for-service-accounts-irsa-on-eks) already contains information about IRSA on EKS on the Connection page.
   
   Might it be better to append the general configuration steps to the connection page rather than only to S3 Logging?
   This configuration might be useful not only for S3 Logging but also for secrets backends and CloudWatch Logging.


