harshgadhia opened a new issue #17347: URL: https://github.com/apache/superset/issues/17347
I have installed the [elasticsearch-dbapi](https://github.com/preset-io/elasticsearch-dbapi) library and set up a connection to AWS Elasticsearch [running Kibana version 6.8.0] with the following connection string:

```
odelasticsearch+https://vpc-some-search-domain.us-east-1.es.amazonaws.com:443/
```

Superset successfully connects to the ES endpoint and retrieves the list of indexes: the dropdown for tables is populated correctly. However, it is not able to parse index metadata or run a SQL query on ES. Below I have provided as much detail as possible for each issue.

### 1. Superset is not able to parse the index metadata

- When I select any of the indexes (called table schemas in the Superset UI) from the list, I get a UI error at the bottom.
- ERROR **An error occurred while fetching table metadata**
- Presumably this is because Superset is not able to parse the index metadata coming from the AWS Elasticsearch endpoint.
- I have verified with a curl command that the ES server is responding with data. This is the same endpoint that Superset is hitting (see logs below).
```
curl --location --request GET 'https://vpc-some-search-domain.us-east-1.es.amazonaws.com:443/<INDEX-NAME>/_mapping?format=json'
```

#### Error

    File "/usr/local/lib/python3.7/site-packages/es/opendistro/api.py", line 236, in get_valid_columns
        response[index_real_name]["mappings"]["properties"], []
    KeyError: 'properties'

#### Application Logs

```bash
2021-11-04 20:08:21,317:INFO:elasticsearch:GET https://vpc-some-search-domain.us-east-1.es.amazonaws.com:443/dummy_index_alias/_mapping?format=json [status:200 request:0.026s]
2021-11-04 20:08:21,317:DEBUG:elasticsearch:> None
2021-11-04 20:08:21,317:DEBUG:elasticsearch:< {"dummy_app":{"mappings":{"dummy_app":{"properties":{"param1":{"type":"boolean"},"param2":{"type":"float"},"param3":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"param4":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"param5":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"param6":{"type":"date"},"deleted":{"type":"boolean"},"event_time":{"type":"date"},"gwGen":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}}}}}
2021-11-04 20:08:21,318:ERROR:root:'properties'
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/flask_appbuilder/api/__init__.py", line 85, in wraps
    return f(self, *args, **kwargs)
  File "/app/superset/views/base_api.py", line 85, in wraps
    raise ex
  File "/app/superset/views/base_api.py", line 82, in wraps
    duration, response = time_function(f, self, *args, **kwargs)
  File "/app/superset/utils/core.py", line 1429, in time_function
    response = func(*args, **kwargs)
  File "/app/superset/utils/log.py", line 241, in wrapper
    value = f(*args, **kwargs)
  File "/app/superset/databases/api.py", line 517, in table_metadata
    table_info = get_table_metadata(database, table_name, schema_name)
  File "/app/superset/databases/utils.py", line 66, in get_table_metadata
    columns = database.get_columns(table_name, schema_name)
  File "/app/superset/models/core.py", line 650, in get_columns
    return self.db_engine_spec.get_columns(self.inspector, table_name, schema)
  File "/app/superset/db_engine_specs/base.py", line 887, in get_columns
    return inspector.get_columns(table_name, schema)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/reflection.py", line 391, in get_columns
    self.bind, table_name, schema, info_cache=self.info_cache, **kw
  File "/usr/local/lib/python3.7/site-packages/es/opendistro/sqlalchemy.py", line 60, in get_columns
    result = connection.execute(query)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2235, in execute
    return connection.execute(statement, *multiparams, **params)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1003, in execute
    return self._execute_text(object_, multiparams, params)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1178, in _execute_text
    parameters,
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
    e, statement, parameters, cursor, context
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1514, in _handle_dbapi_exception
    util.raise_(exc_info[1], with_traceback=exc_info[2])
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
    cursor, statement, parameters, context
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/local/lib/python3.7/site-packages/es/baseapi.py", line 37, in wrap
    return f(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/es/opendistro/api.py", line 275, in execute
    return self.get_valid_columns(re_table_name[1])
  File "/usr/local/lib/python3.7/site-packages/es/opendistro/api.py", line 236, in get_valid_columns
    response[index_real_name]["mappings"]["properties"], []
KeyError: 'properties'
```

### 2. Superset gets a 400 Bad Request error from the ES endpoint when running a SQL query on an ES index

- When running a SQL query in the SQL Lab editor, we get the error shown in screenshot 2 below.
- The issue seems to be in Superset, as it is sending a malformed POST request to the ES server.
- I have verified using curl that the ES server responds correctly with the appropriate data.
- Below is the endpoint hit by Superset, which responds with a 400 Bad Request error.

```
https://search-SOME-CLUSTER.us-west-2.es.amazonaws.com:443/_opendistro/_sql
```

- Using curl, I verified an OK response from the ES server:

```
curl --location --request POST 'https://vpc-some-search-domain.us-east-1.es.amazonaws.com:443/_opendistro/_sql' \
--header 'Content-Type: application/json' \
--data-raw '{
    "query": "SELECT * FROM dummy_index where some_id = 1 order by id DESC limit 1"
}'
```

#### Application Logs

```bash
Query 118: Set query to 'running'
[2021-11-04 20:48:03,858: INFO/ForkPoolWorker-14] Query 118: Set query to 'running'
Query 118: Running statement 1 out of 1
[2021-11-04 20:48:03,941: INFO/ForkPoolWorker-14] Query 118: Running statement 1 out of 1
[2021-11-04 20:48:04,000: WARNING/ForkPoolWorker-14] POST https://vpc-some-search-domain.us-east-1.es.amazonaws.com:443/_opendistro/_sql/ [status:400 request:0.021s]
Query 118: <class 'es.exceptions.ProgrammingError'>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/es/baseapi.py", line 319, in elastic_query
    response = self.es.transport.perform_request("POST", path, body=payload)
  File "/usr/local/lib/python3.7/site-packages/elasticsearch/transport.py", line 415, in perform_request
    raise e
  File "/usr/local/lib/python3.7/site-packages/elasticsearch/transport.py", line 388, in perform_request
    timeout=timeout,
  File "/usr/local/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py", line 277, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/local/lib/python3.7/site-packages/elasticsearch/connection/base.py", line 331, in _raise_error
    status_code, error_message, additional_info
elasticsearch.exceptions.RequestError: RequestError(400, 'IllegalArgumentException')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/superset/sql_lab.py", line 266, in execute_sql_statement
    db_engine_spec.execute(cursor, sql, async_=True)
  File "/app/superset/db_engine_specs/base.py", line 1094, in execute
    raise cls.get_dbapi_mapped_exception(ex)
  File "/app/superset/db_engine_specs/base.py", line 1092, in execute
    cursor.execute(query)
  File "/usr/local/lib/python3.7/site-packages/es/baseapi.py", line 37, in wrap
    return f(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/es/opendistro/api.py", line 278, in execute
    results = self.elastic_query(query)
  File "/usr/local/lib/python3.7/site-packages/es/baseapi.py", line 323, in elastic_query
    raise exceptions.ProgrammingError(f"Error ({ex.error}): {ex.info}")
es.exceptions.ProgrammingError: Error (IllegalArgumentException): {'error': {'reason': 'Invalid SQL query', 'details': 'Failed to parse request payload', 'type': 'IllegalArgumentException'}, 'status': 400}
[2021-11-04 20:48:04,004: ERROR/ForkPoolWorker-14] Query 118: <class 'es.exceptions.ProgrammingError'>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/es/baseapi.py", line 319, in elastic_query
    response = self.es.transport.perform_request("POST", path, body=payload)
  File "/usr/local/lib/python3.7/site-packages/elasticsearch/transport.py", line 415, in perform_request
    raise e
  File "/usr/local/lib/python3.7/site-packages/elasticsearch/transport.py", line 388, in perform_request
    timeout=timeout,
  File "/usr/local/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py", line 277, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/local/lib/python3.7/site-packages/elasticsearch/connection/base.py", line 331, in _raise_error
    status_code, error_message, additional_info
elasticsearch.exceptions.RequestError: RequestError(400, 'IllegalArgumentException')
```

#### How to reproduce the bug

1. As mentioned above, set up a connection to Elasticsearch from Superset.
2. To see the first error: click on any index name on the left side of the SQL Lab editor. This results in the error shown in screenshot 1 below.
3. To see the second error: run a SQL query as explained in problem 2 above. This results in the error shown in screenshot 2 below.

### Expected results

For problem 1, Superset should correctly parse the index metadata.
For problem 2, Superset should correctly form the query request for the ES server.

### Actual results

For problem 1: See Screenshot 1
For problem 2: See Screenshot 2

#### Screenshots

For problem 1:

For problem 2: Run a query like `SELECT * FROM dummy_index where some_id = 1 order by id DESC limit 1`

### Environment

- browser type and version: Google Chrome [Version 95.0.4638.69 (Official Build) (x86_64)]
- superset version: `1.3.1`
- python version: `3.7`
- node.js version: `node -v`
- any feature flags active: None
- pip elasticsearch-dbapi version: `0.2.6`
- pip elasticsearch version: `7.13.4`
- Kibana version: `6.8.0`

### Checklist

Make sure to follow these steps before submitting your issue - thank you!

- [x] I have checked the superset logs for python stacktraces and included it here as text if there are any.
- [x] I have reproduced the issue with at least the latest released version of superset.
- [x] I have checked the issue tracker for the same issue and I haven't found one similar.
### Additional context

I looked into a similar issue described here, [Trouble connecting to AWS OpenSearch via Superset](https://github.com/preset-io/elasticsearch-dbapi/issues/70), but my issue seems to be different from theirs. In my case the connection to Elasticsearch succeeds; the problems are in parsing the returned data and in forming the right POST request in the library.

Any help is greatly appreciated.
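
For problem 1, the `_mapping` response in the application logs nests `properties` under a mapping type name (`"mappings" -> "dummy_app" -> "properties"`), while the failing line reads `["mappings"]["properties"]` directly, which is the typeless layout that Elasticsearch 7.x returns. Assuming that mismatch is what raises the `KeyError`, a lookup that tolerates both layouts could look roughly like the sketch below. `extract_properties` is a hypothetical helper written for illustration, not part of elasticsearch-dbapi, and the sample data is a trimmed copy of the logged response.

```python
from typing import Any, Dict


def extract_properties(mapping_response: Dict[str, Any], index_name: str) -> Dict[str, Any]:
    """Return the field properties of one index from a `_mapping` response.

    Hypothetical helper for illustration only; not elasticsearch-dbapi code.
    """
    mappings = mapping_response[index_name]["mappings"]

    # Typeless layout: {"mappings": {"properties": {...}}}
    if "properties" in mappings:
        return mappings["properties"]

    # Typed layout (what my 6.8 domain returns):
    # {"mappings": {"<type name>": {"properties": {...}}}}
    merged: Dict[str, Any] = {}
    for type_mapping in mappings.values():
        if isinstance(type_mapping, dict):
            merged.update(type_mapping.get("properties", {}))
    return merged


# Trimmed copy of the response body shown in the application logs.
typed_response = {
    "dummy_app": {
        "mappings": {
            "dummy_app": {
                "properties": {
                    "param1": {"type": "boolean"},
                    "param2": {"type": "float"},
                }
            }
        }
    }
}

print(sorted(extract_properties(typed_response, "dummy_app")))
# → ['param1', 'param2']
```

Note that the top-level key in the logged response is the real index name (`dummy_app`), not the alias that was requested (`dummy_index_alias`), so the sketch is called with the real name.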
