harshgadhia opened a new issue #17347:
URL: https://github.com/apache/superset/issues/17347


   
   I have installed the 
[elasticsearch-dbapi](https://github.com/preset-io/elasticsearch-dbapi) 
library and set up a connection to AWS Elasticsearch (running Kibana 
version 6.8.0) with the following connection string:
   ```
   
odelasticsearch+https://vpc-some-search-domain.us-east-1.es.amazonaws.com:443/
   ```
   Superset successfully connects to the ES endpoint and retrieves the 
list of indexes: the dropdown for tables is populated correctly. However, it 
is not able to parse the index metadata or run a SQL query on ES.
   
   Below I have provided as much detail as possible for each of the two issues.
   
   ### 1. Superset is not able to parse the index metadata:
   
   - When I select any of the indexes (called a table schema in the 
Superset UI) from the list, I get a UI error at the bottom.
   - ERROR **An error occurred while fetching table metadata**
   - Presumably this is because Superset is not able to parse the index 
metadata coming from the AWS Elasticsearch endpoint.
   - I have verified with a curl command that the ES server responds with 
data. This is the same endpoint that Superset is hitting (see logs below).
   ```
   curl --location --request GET 'https://vpc-some-search-domain.us-east-1.es.amazonaws.com:443/<INDEX-NAME>/_mapping?format=json'
   ```
   #### Error
   File "/usr/local/lib/python3.7/site-packages/es/opendistro/api.py", line 
236, in get_valid_columns
       response[index_real_name]["mappings"]["properties"], []
   KeyError: 'properties'
   
   
   #### Application Logs
   
   ```bash
   2021-11-04 20:08:21,317:INFO:elasticsearch:GET 
https://vpc-some-search-domain.us-east-1.es.amazonaws.com:443/dummy_index_alias/_mapping?format=json
 [status:200 request:0.026s]
   2021-11-04 20:08:21,317:DEBUG:elasticsearch:> None
   2021-11-04 20:08:21,317:DEBUG:elasticsearch:< 
{"dummy_app":{"mappings":{"dummy_app":{"properties":{"param1":{"type":"boolean"},"param2":{"type":"float"},"param3":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"param4":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"param5":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"param6":{"type":"date"},"deleted":{"type":"boolean"},"event_time":{"type":"date"},"gwGen":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}}}}}
   2021-11-04 20:08:21,318:ERROR:root:'properties'
   Traceback (most recent call last):
     File 
"/usr/local/lib/python3.7/site-packages/flask_appbuilder/api/__init__.py", line 
85, in wraps
       return f(self, *args, **kwargs)
     File "/app/superset/views/base_api.py", line 85, in wraps
       raise ex
     File "/app/superset/views/base_api.py", line 82, in wraps
       duration, response = time_function(f, self, *args, **kwargs)
     File "/app/superset/utils/core.py", line 1429, in time_function
       response = func(*args, **kwargs)
     File "/app/superset/utils/log.py", line 241, in wrapper
       value = f(*args, **kwargs)
     File "/app/superset/databases/api.py", line 517, in table_metadata
       table_info = get_table_metadata(database, table_name, schema_name)
     File "/app/superset/databases/utils.py", line 66, in get_table_metadata
       columns = database.get_columns(table_name, schema_name)
     File "/app/superset/models/core.py", line 650, in get_columns
       return self.db_engine_spec.get_columns(self.inspector, table_name, 
schema)
     File "/app/superset/db_engine_specs/base.py", line 887, in get_columns
       return inspector.get_columns(table_name, schema)
     File 
"/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/reflection.py", line 
391, in get_columns
       self.bind, table_name, schema, info_cache=self.info_cache, **kw
     File "/usr/local/lib/python3.7/site-packages/es/opendistro/sqlalchemy.py", 
line 60, in get_columns
       result = connection.execute(query)
     File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", 
line 2235, in execute
       return connection.execute(statement, *multiparams, **params)
     File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", 
line 1003, in execute
       return self._execute_text(object_, multiparams, params)
     File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", 
line 1178, in _execute_text
       parameters,
     File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", 
line 1317, in _execute_context
       e, statement, parameters, cursor, context
     File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", 
line 1514, in _handle_dbapi_exception
       util.raise_(exc_info[1], with_traceback=exc_info[2])
     File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", 
line 182, in raise_
       raise exception
     File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", 
line 1277, in _execute_context
       cursor, statement, parameters, context
     File 
"/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 
593, in do_execute
       cursor.execute(statement, parameters)
     File "/usr/local/lib/python3.7/site-packages/es/baseapi.py", line 37, in 
wrap
       return f(self, *args, **kwargs)
     File "/usr/local/lib/python3.7/site-packages/es/opendistro/api.py", line 
275, in execute
       return self.get_valid_columns(re_table_name[1])
     File "/usr/local/lib/python3.7/site-packages/es/opendistro/api.py", line 
236, in get_valid_columns
       response[index_real_name]["mappings"]["properties"], []
   KeyError: 'properties'
   ```
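
To illustrate what the log above suggests, here is a minimal sketch, not the library's actual code. The logged response nests the field definitions under a mapping type name (`mappings.dummy_app.properties`), which is the layout Elasticsearch 6.x returns, while the failing line looks up `mappings.properties` directly. The helper `extract_properties` is hypothetical and only shows a lookup that tolerates both layouts; the sample response is abridged from the log.

```python
# Abridged copy of the mapping response from the log above (ES 6.x layout:
# a mapping type name sits between "mappings" and "properties").
typed_response = {
    "dummy_app": {
        "mappings": {
            "dummy_app": {
                "properties": {
                    "param1": {"type": "boolean"},
                    "param2": {"type": "float"},
                }
            }
        }
    }
}


def extract_properties(response, index_name):
    """Hypothetical helper: return field definitions for either layout."""
    mappings = response[index_name]["mappings"]
    # Typeless layout: "properties" sits directly under "mappings".
    if "properties" in mappings:
        return mappings["properties"]
    # Typed layout: collect "properties" from every mapping type.
    properties = {}
    for type_body in mappings.values():
        if isinstance(type_body, dict):
            properties.update(type_body.get("properties", {}))
    return properties


# The direct lookup from the traceback fails on the typed layout.
try:
    typed_response["dummy_app"]["mappings"]["properties"]
except KeyError as exc:
    print("direct lookup failed with KeyError:", exc)

# The tolerant lookup finds the fields.
print(sorted(extract_properties(typed_response, "dummy_app")))
```

Whether the library should support the typed layout at all, or only document a minimum Elasticsearch version, is a separate question; the sketch only demonstrates why the lookup raises `KeyError: 'properties'` for this response.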
   
   
   ### 2. Superset gets a 400 Bad Request error from the ES endpoint when running a SQL query against an ES index
   
   - When running a SQL query in the SQL Lab editor, we get the error shown 
in screenshot 2 below.
   - The issue seems to be in Superset, as it is sending a malformed POST 
request to the ES server.
   - I have verified using curl that the ES server responds correctly 
with the appropriate data.
   - Below is the endpoint hit by Superset, which responds with a 400 Bad 
Request error.
   ```
   https://search-SOME-CLUSTER.us-west-2.es.amazonaws.com:443/_opendistro/_sql
   ```
   - Using curl, I verified that the ES server returns an OK response for the same query:
   ```
   curl --location --request POST 'https://vpc-some-search-domain.us-east-1.es.amazonaws.com:443/_opendistro/_sql' \
   --header 'Content-Type: application/json' \
   --data-raw '{
    "query": "SELECT * FROM dummy_index where some_id = 1 order by id DESC limit 1"
   }'
   ```
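
For comparison with whatever the library sends, the working curl call above can be sketched in Python using only the standard library. The function name `build_sql_request` is hypothetical, and the sketch only builds the request so that its URL, headers, and JSON body can be inspected; it does not send anything and says nothing about what elasticsearch-dbapi actually puts in its payload.

```python
import json
import urllib.request


def build_sql_request(base_url, sql):
    """Hypothetical helper mirroring the curl call: POST a JSON body to /_opendistro/_sql."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_opendistro/_sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_sql_request(
    "https://vpc-some-search-domain.us-east-1.es.amazonaws.com:443/",
    "SELECT * FROM dummy_index where some_id = 1 order by id DESC limit 1",
)

# Inspect the request; calling urllib.request.urlopen(req) would actually send it.
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Logging the library's outgoing payload and diffing it against this body would show which field or format the server rejects with `Failed to parse request payload`.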
   
   
   #### Application Logs
   
   ```bash
   Query 118: Set query to 'running'
   [2021-11-04 20:48:03,858: INFO/ForkPoolWorker-14] Query 118: Set query to 
'running'
   Query 118: Running statement 1 out of 1
   [2021-11-04 20:48:03,941: INFO/ForkPoolWorker-14] Query 118: Running 
statement 1 out of 1
   [2021-11-04 20:48:04,000: WARNING/ForkPoolWorker-14] POST 
https://vpc-some-search-domain.us-east-1.es.amazonaws.com:443/_opendistro/_sql/ 
[status:400 request:0.021s]
   Query 118: <class 'es.exceptions.ProgrammingError'>
   Traceback (most recent call last):
     File "/usr/local/lib/python3.7/site-packages/es/baseapi.py", line 319, in 
elastic_query
       response = self.es.transport.perform_request("POST", path, body=payload)
     File "/usr/local/lib/python3.7/site-packages/elasticsearch/transport.py", 
line 415, in perform_request
       raise e
     File "/usr/local/lib/python3.7/site-packages/elasticsearch/transport.py", 
line 388, in perform_request
       timeout=timeout,
     File 
"/usr/local/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py",
 line 277, in perform_request
       self._raise_error(response.status, raw_data)
     File 
"/usr/local/lib/python3.7/site-packages/elasticsearch/connection/base.py", line 
331, in _raise_error
       status_code, error_message, additional_info
   elasticsearch.exceptions.RequestError: RequestError(400, 
'IllegalArgumentException')
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
   
     File "/app/superset/sql_lab.py", line 266, in execute_sql_statement
       db_engine_spec.execute(cursor, sql, async_=True)
     File "/app/superset/db_engine_specs/base.py", line 1094, in execute
       raise cls.get_dbapi_mapped_exception(ex)
     File "/app/superset/db_engine_specs/base.py", line 1092, in execute
       cursor.execute(query)
     File "/usr/local/lib/python3.7/site-packages/es/baseapi.py", line 37, in 
wrap
       return f(self, *args, **kwargs)
     File "/usr/local/lib/python3.7/site-packages/es/opendistro/api.py", line 
278, in execute
       results = self.elastic_query(query)
     File "/usr/local/lib/python3.7/site-packages/es/baseapi.py", line 323, in 
elastic_query
       raise exceptions.ProgrammingError(f"Error ({ex.error}): {ex.info}")
   es.exceptions.ProgrammingError: Error (IllegalArgumentException): {'error': 
{'reason': 'Invalid SQL query', 'details': 'Failed to parse request payload', 
'type': 'IllegalArgumentException'}, 'status': 400}
   [2021-11-04 20:48:04,004: ERROR/ForkPoolWorker-14] Query 118: <class 
'es.exceptions.ProgrammingError'>
   
   Traceback (most recent call last):
     File "/usr/local/lib/python3.7/site-packages/es/baseapi.py", line 319, in 
elastic_query
       response = self.es.transport.perform_request("POST", path, body=payload)
     File "/usr/local/lib/python3.7/site-packages/elasticsearch/transport.py", 
line 415, in perform_request
       raise e
     File "/usr/local/lib/python3.7/site-packages/elasticsearch/transport.py", 
line 388, in perform_request
       timeout=timeout,
     File 
"/usr/local/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py",
 line 277, in perform_request
       self._raise_error(response.status, raw_data)
     File 
"/usr/local/lib/python3.7/site-packages/elasticsearch/connection/base.py", line 
331, in _raise_error
       status_code, error_message, additional_info
   elasticsearch.exceptions.RequestError: RequestError(400, 
'IllegalArgumentException')
   ```
   
   #### How to reproduce the bug
   
   1. As mentioned above, set up a connection to Elasticsearch from Superset.
   2. To see the first error, click on any index name on the left side of the 
SQL Lab editor; this results in the error shown in screenshot 1 below.
   3. To see the second error, run a SQL query as explained in problem 2 above; 
this results in the error shown in screenshot 2 below.
   
   ### Expected results
   
   For problem 1, Superset should correctly parse the index metadata.
   For problem 2, Superset should correctly form the query request for the ES 
server.
   
   ### Actual results
   
   For problem 1: See Screenshot 1
   For problem 2: See Screenshot 2
   
   
   #### Screenshots
   
   For Problem 1:
   
![image](https://user-images.githubusercontent.com/10794287/140436093-117efe06-fad7-4438-93e8-74f4007925a7.png)
   
   
   For problem 2:
   
   Run query like: `SELECT * FROM dummy_index where some_id = 1 order by id 
DESC limit 1`
   
   
![error2](https://user-images.githubusercontent.com/10794287/140437016-cc693de4-eb05-4b40-ab90-052e71a9f741.png)
   
   ### Environment
   
   - browser type and version: Google Chrome [Version 95.0.4638.69 (Official 
Build) (x86_64)]
   - superset version: `1.3.1`
   - python version: `3.7`
   - node.js version: `node -v`
   - any feature flags active: None
   - pip elasticsearch-dbapi version: `0.2.6`
   - pip elasticsearch version: `7.13.4`
   - Kibana version: `6.8.0`
   
   ### Checklist
   
   Make sure to follow these steps before submitting your issue - thank you!
   
   - [x] I have checked the superset logs for python stacktraces and included 
it here as text if there are any.
   - [x] I have reproduced the issue with at least the latest released version 
of superset.
   - [x] I have checked the issue tracker for the same issue and I haven't 
found one similar.
   
   ### Additional context
   I looked into a similar issue, [Trouble connecting to AWS OpenSearch via Superset](https://github.com/preset-io/elasticsearch-dbapi/issues/70), but my 
issue seems to be different. In my case, the connection to Elasticsearch 
succeeds; the problems are with parsing the data and forming the right POST 
request in the library.
   Any help is greatly appreciated.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


