Taragolis commented on PR #26162:
URL: https://github.com/apache/airflow/pull/26162#issuecomment-1250873799

   @potiuk Right now I think this is the only approach which could cover:
   - Auth living in provider packages or user-defined scripts
   - Not using `import_string`
   
   It only requires small changes in the current operators and doesn't need to 
re-implement everything in cases where only the Auth method needs to change. This PR 
is also in draft for one reason: I don't have an idea of the best place to 
write the documentation on how to use Auth.
   
   It might be:
   - In the docker provider, just information on how to implement a custom Auth class and what 
DockerHook actually sends to this class 
   - In the amazon provider, information about what the user should provide in the ECR-related class
   
   I would also like to add some information about other Hooks and 
things which might (or might not) be nice to implement in the future, which 
are not related to the current PR.
   
   ---
   
   ### HttpHook
   
   My daily workload doesn't require it right now, but a couple of years ago we were 
using it very actively. I don't remember exactly, but maybe in 1.10.4 HttpHook didn't even 
have this parameter for request Auth (today I'm too lazy to check it).
   
   But the major issue currently is that you need to create a new hook and override 
`get_conn` if you need to provide more than just `login` and `password` from the 
connection. So maybe it's also a good idea to implement some generic way to grab 
credentials from a Connection and provide them to HttpHook, so it wouldn't require 
re-creating the hook and operators in cases where only a custom Auth is required.
   
   I found this code in an old project (note: the code was created for 1.10.x).
   
   #### Custom Auth
   
   ```python
   from requests.auth import AuthBase


   class BearerAuth(AuthBase):
       """Bearer Authorization implementation for requests."""

       def __init__(self, token):
           self._token = token

       def __call__(self, r):
           r.headers["Authorization"] = "Bearer %s" % self._token
           return r
   ```
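   A quick way to sanity-check such an auth class without any network call: preparing a `requests.Request` applies the auth callable, so the header can be inspected directly (the class is re-declared here so the snippet runs standalone; the URL is a placeholder).

```python
import requests
from requests.auth import AuthBase


class BearerAuth(AuthBase):
    """Bearer Authorization implementation for requests (same as above)."""

    def __init__(self, token):
        self._token = token

    def __call__(self, r):
        r.headers["Authorization"] = "Bearer %s" % self._token
        return r


# requests applies `auth` while preparing the request, before anything is sent
prepared = requests.Request(
    "GET", "https://example.com/api", auth=BearerAuth("my-token")
).prepare()
print(prepared.headers["Authorization"])  # Bearer my-token
```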
   
   ```python
   from requests.auth import AuthBase


   class SomeDumbAuth(AuthBase):
       """Authorization for some private API: injects credentials into the form-encoded body."""

       def __init__(self, key, secret):
           self._key = key
           self._secret = secret

       def __call__(self, r):
           # Parse the existing application/x-www-form-urlencoded body, if any
           if r.body:
               b = dict(pair.split("=", 1) for pair in r.body.split("&"))
           else:
               b = {}

           b.update({
               "some_key": self._key,
               "some_secret": self._secret,
           })
           r.body = "&".join("%s=%s" % (k, v) for k, v in b.items())

           return r
   ```
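   The same trick works for a body-mutating auth: the prepared request's body shows the injected credentials (re-declared so the snippet runs standalone; the URL and field names are placeholders).

```python
import requests
from requests.auth import AuthBase


class SomeDumbAuth(AuthBase):
    """Injects key/secret into the form-encoded body (same idea as above)."""

    def __init__(self, key, secret):
        self._key = key
        self._secret = secret

    def __call__(self, r):
        # merge credentials into the existing form-encoded body
        b = dict(pair.split("=", 1) for pair in r.body.split("&")) if r.body else {}
        b.update({"some_key": self._key, "some_secret": self._secret})
        r.body = "&".join("%s=%s" % (k, v) for k, v in b.items())
        return r


prepared = requests.Request(
    "POST",
    "https://example.com/api",
    data={"q": "1"},
    auth=SomeDumbAuth("k", "s"),
).prepare()
print(prepared.body)  # q=1&some_key=k&some_secret=s
```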
   
   ```python
   import jwt  # PyJWT
   from datetime import datetime, timedelta, timezone
   from requests.auth import AuthBase


   class AppStoreAuth(AuthBase):
       """AppStore Connect Authorization implementation for requests."""

       def __init__(self, private_key, key_id, issuer_id):
           self._private_key = private_key
           self._key_id = key_id
           self._issuer_id = issuer_id

       def __call__(self, r):
           headers = {
               "alg": "ES256",
               "kid": self._key_id,
               "typ": "JWT",
           }
           payload = {
               "iss": self._issuer_id,
               # proper UTC epoch instead of the non-portable strftime("%s")
               "exp": int((datetime.now(tz=timezone.utc) + timedelta(minutes=20)).timestamp()),
               "aud": "appstoreconnect-v1",
           }

           # PyJWT >= 2 returns str; on PyJWT 1.x (the 1.10.x era) append .decode("utf-8")
           token = jwt.encode(
               payload=payload,
               key=self._private_key,
               algorithm="ES256",
               headers=headers,
           )

           r.headers["Authorization"] = "Bearer %s" % token
           return r
   ```
   
   #### Hook which uses custom Auth from the connection
   
   ```python
   import requests

   from airflow.hooks.http_hook import HttpHook  # airflow.providers.http.hooks.http in 2.x


   class AppStoreSalesHttpHook(HttpHook):
       """HTTP Hook for the AppStore Connect API."""

       def __init__(self, endpoint, *args, **kwargs):
           super().__init__(*args, **kwargs)
           self.endpoint = endpoint

       def get_conn(self, headers: dict = None):
           """Return a requests session for the AppStore Connect API.

           Args:
               headers: additional headers to be passed through as a dictionary
           """
           session = requests.Session()
           if self.http_conn_id:
               conn = self.get_connection(self.http_conn_id)

               if conn.password:
                   private_key = conn.password.replace("\\n", "\n")
                   key_id = conn.extra_dejson.get("KeyId")
                   issuer_id = conn.extra_dejson.get("IssuerId")

                   session.auth = AppStoreAuth(
                       private_key=private_key,
                       key_id=key_id,
                       issuer_id=issuer_id,
                   )
               else:
                   raise ValueError("Missing extra parameters for connection %r" % self.http_conn_id)

               if conn.host and "://" in conn.host:
                   self.base_url = conn.host
               else:
                   # schema defaults to HTTP
                   schema = conn.schema if conn.schema else "http"
                   host = conn.host if conn.host else ""
                   self.base_url = schema + "://" + host

               if conn.port:
                   self.base_url = self.base_url + ":" + str(conn.port)

           if headers:
               session.headers.update(headers)

           return session
   ```
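   As a rough illustration of the "generic way" mentioned above: the credential plumbing could be factored out so that only the Auth class differs between use cases, instead of subclassing the hook each time. Everything below (`SimpleConnection`, `build_session`, `TokenAuth`) is a hypothetical sketch, not an existing Airflow API:

```python
from dataclasses import dataclass, field

import requests
from requests.auth import AuthBase, HTTPBasicAuth


@dataclass
class SimpleConnection:
    """Minimal stand-in for airflow.models.Connection (illustration only)."""

    login: str = ""
    password: str = ""
    extra_dejson: dict = field(default_factory=dict)


def build_session(conn, auth_type=None):
    """Build a requests session, delegating credentials to a pluggable auth class.

    ``auth_type`` is any callable taking the connection and returning a
    requests-compatible auth object; without it we fall back to HTTP Basic
    auth, which is roughly what HttpHook does today.
    """
    session = requests.Session()
    if auth_type is not None:
        session.auth = auth_type(conn)
    elif conn.login:
        session.auth = HTTPBasicAuth(conn.login, conn.password)
    return session


class TokenAuth(AuthBase):
    """Example pluggable auth: takes a bearer token from the connection password."""

    def __init__(self, conn):
        self._token = conn.password

    def __call__(self, r):
        r.headers["Authorization"] = "Bearer %s" % self._token
        return r
```

   With this shape, switching from Basic auth to a bearer token is just `build_session(conn, auth_type=TokenAuth)`; no new hook or operator subclass is needed.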
   
   ---
   
   ### PostgresHook
   
   It is not a big deal to change the hook to use custom Auth. One thing I 
would change is to drop Redshift support from PostgresHook:
   
   
https://github.com/apache/airflow/blob/6045f7ad697e2bdb934add1a8aeae5a817306b22/airflow/providers/postgres/hooks/postgres.py#L191-L205
   
   In the Amazon provider there are two different ways (and two different hooks) 
to interact with Redshift:
   1. DB-API via `redshift-connector`
   2. Via the AWS API and boto3
   
   Even though Redshift uses the PostgreSQL protocol, I'm not sure whether `psycopg2` 
officially supports Redshift, but `psycopg` (v3) does not officially support 
it - https://github.com/psycopg/psycopg/issues/122 .
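   For completeness, the linked `get_iam_token` logic could become a standalone pluggable Auth. A hypothetical sketch (the token generator is injected so the example doesn't need boto3; in real code it would be the RDS client's `generate_db_auth_token(...)`):

```python
class PostgresIamAuth:
    """Hypothetical pluggable Auth producing (login, password, port) for the DB-API."""

    def __init__(self, token_generator, port=5432):
        # token_generator stands in for boto3's rds_client.generate_db_auth_token
        self._token_generator = token_generator
        self._port = port

    def credentials(self, host, login):
        # an IAM auth token replaces the static password from the Connection
        token = self._token_generator(host, self._port, login)
        return login, token, self._port
```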
   
   ---
   
   ### MySQL
   
   I'm still not sure that the current implementation actually allows using AWS 
IAM (I need to check).
   But also for MySQL, the Auth class needs the ability to specify which driver 
the user actually uses:
   
   
https://github.com/apache/airflow/blob/6045f7ad697e2bdb934add1a8aeae5a817306b22/airflow/providers/mysql/hooks/mysql.py#L169-L179
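   To make the driver point concrete: the supported clients spell their connect kwargs differently (`mysqlclient` historically takes `passwd`, `mysql-connector-python` takes `password`), so a pluggable MySQL Auth would have to branch on the driver. A hypothetical sketch with an injected token generator:

```python
class MySqlIamAuth:
    """Hypothetical driver-aware Auth for a MySQL hook."""

    def __init__(self, token_generator):
        # stands in for boto3's rds_client.generate_db_auth_token
        self._token_generator = token_generator

    def conn_kwargs(self, client_name, host, user):
        token = self._token_generator(host, 3306, user)
        if client_name == "mysqlclient":
            return {"user": user, "passwd": token}
        if client_name == "mysql-connector-python":
            return {"user": user, "password": token}
        raise ValueError("Unknown MySQL client: %r" % client_name)
```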
   
   --- 
   
   ### Extensible Connection / Pluggable Auth
   
   In my head this part is a bit more than just Auth.
   Currently an Airflow Connection might be used for different things:
   
   - Authentication
   - Client configuration
   - Parameters for an API call (like the Amazon EMR Connection, which is not actually a 
connection and is used in only one place)
   
   The main problem is how it is shown in the UI and how it is stored. A good example is the 
AWS Connection:
   - Quite a few different auth methods and extra parameters which won't 
actually work together
   - boto3 configuration
   
   An Extensible Connection might show a different UI depending on the selected Auth Type; 
it would also be nice to have separate tabs for Auth and Configuration.
   
   But again, it is just an idea which in my head might be brilliant but in the real 
world would require a lot of changes.
   Also, I haven't even tried to sketch on paper how it would work and 
integrate with Airflow and its components.

