imamdigmi opened a new issue #16828:
URL: https://github.com/apache/airflow/issues/16828


   **Apache Airflow version**: 2.0.0
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   ```
   Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-13T13:28:09Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
   Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.8-aliyun.1", GitCommit:"94f1dc8", GitTreeState:"", BuildDate:"2021-01-10T02:57:47Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
   ```
   
   **Environment**: -
   
   - **Cloud provider or hardware configuration**: Alibaba Cloud
   - **OS** (e.g. from /etc/os-release): Debian GNU/Linux 10 (buster)
   - **Kernel** (e.g. `uname -a`): `Linux airflow-webserver-fb89b7f8b-fgzvv 3.10.0-1160.11.1.el7.x86_64 #1 SMP Fri Dec 18 16:34:56 UTC 2020 x86_64 GNU/Linux`
   - **Install tools**: Helm (Custom)
   - **Others**: None
   
   **What happened**:
   My Airflow deployment uses fluent-bit to collect the stdout logs from the Airflow containers and ship them to Elasticsearch on a remote machine. That part works well: I can see the logs through Kibana. However, Airflow itself cannot display the logs because of this error:
   ```
   ERROR - Exception on /get_logs_with_metadata [GET]
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
       response = self.full_dispatch_request()
     File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
       rv = self.handle_user_exception(e)
     File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
       reraise(exc_type, exc_value, tb)
     File "/home/airflow/.local/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
       raise value
     File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
       rv = self.dispatch_request()
     File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
       return self.view_functions[rule.endpoint](**req.view_args)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/www/auth.py", line 34, in decorated
       return func(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/www/decorators.py", line 60, in wrapper
       return f(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/session.py", line 65, in wrapper
       return func(*args, session=session, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/www/views.py", line 1054, in get_logs_with_metadata
       logs, metadata = task_log_reader.read_log_chunks(ti, try_number, metadata)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/log/log_reader.py", line 58, in read_log_chunks
       logs, metadatas = self.log_handler.read(ti, try_number, metadata=metadata)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/log/file_task_handler.py", line 217, in read
       log, metadata = self._read(task_instance, try_number_element, metadata)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/elasticsearch/log/es_task_handler.py", line 160, in _read
       logs = self.es_read(log_id, offset, metadata)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/elasticsearch/log/es_task_handler.py", line 233, in es_read
       max_log_line = search.count()
     File "/home/airflow/.local/lib/python3.8/site-packages/elasticsearch_dsl/search.py", line 701, in count
       return es.count(index=self._index, body=d, **self._params)["count"]
     File "/home/airflow/.local/lib/python3.8/site-packages/elasticsearch/client/utils.py", line 84, in _wrapped
       return func(*args, params=params, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/elasticsearch/client/__init__.py", line 528, in count
       return self.transport.perform_request(
     File "/home/airflow/.local/lib/python3.8/site-packages/elasticsearch/transport.py", line 351, in perform_request
       status, headers_response, data = connection.perform_request(
     File "/home/airflow/.local/lib/python3.8/site-packages/elasticsearch/connection/http_urllib3.py", line 261, in perform_request
       self._raise_error(response.status, raw_data)
     File "/home/airflow/.local/lib/python3.8/site-packages/elasticsearch/connection/base.py", line 181, in _raise_error
       raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
   elasticsearch.exceptions.AuthorizationException: AuthorizationException(403, 'security_exception', 'no permissions for [indices:data/read/search] and User [name=airflow, backend_roles=[], request
   ```
   but when I debug with the following code, I can see the logs:
   ```
   import elasticsearch

   # es_kwargs and dsl hold my connection settings and the query body
   es = elasticsearch.Elasticsearch(['...'], **es_kwargs)
   es.search(index="airflow-*", body=dsl)
   ```
   
   and when I look into the source code of the Elasticsearch provider, there is no index pattern defined there:
   
   https://github.com/apache/airflow/blob/88199eefccb4c805f8d6527bab5bf600b397c35e/airflow/providers/elasticsearch/log/es_task_handler.py#L216
   
   so I assume the issue is that the Airflow user has insufficient permission to search across all indices. Therefore, how can I set an index pattern so that Airflow only reads certain indices?
   Thank you!
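   
   For illustration, here is a minimal sketch of how I imagine the query could be scoped to an index pattern with the `elasticsearch_dsl` `Search` API the handler already uses (the `es` client, the `airflow-*` pattern, and the `log_id` value are placeholders from my debugging above, not something the provider exposes today):
   ```
   from elasticsearch_dsl import Search

   # Limit the query to one index pattern instead of searching every index;
   # "airflow-*" is the pattern my fluent-bit setup writes to.
   search = (
       Search(using=es, index="airflow-*")
       .query("match_phrase", log_id=log_id)
       .sort("offset")
   )
   max_log_line = search.count()
   ```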
   
   **What you expected to happen**: The Airflow configuration should have an option to set an Elasticsearch index pattern, so that Airflow only queries certain indices rather than all indices on the Elasticsearch server.
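   
   A rough sketch of what I have in mind, reusing the placeholder `es` client and `log_id` from above (the `index_patterns` option and the `[elasticsearch]` section lookup are only my assumption of how it could look; they do not exist in Airflow 2.0.0):
   ```
   from airflow.configuration import conf
   from elasticsearch_dsl import Search

   # Hypothetical option: read an index pattern from airflow.cfg, falling back
   # to "_all", which matches the current behaviour of searching every index.
   index_patterns = conf.get("elasticsearch", "index_patterns", fallback="_all")
   search = Search(using=es, index=index_patterns).query("match_phrase", log_id=log_id)
   ```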
   
   **How to reproduce it**: Click the Log button in the task instance popup modal to open the log page.
   
   **Anything else we need to know**: This happens every time.

