dosubot[bot] commented on issue #30245:
URL: https://github.com/apache/superset/issues/30245#issuecomment-2344646724

   @mibarannikov Hello! I'm here to help you with any bugs, questions, or contributions.
   
   The issue you're seeing, where the Shillelagh adapter returns null for columns beyond the 64th in JSON data sources, may stem from a configuration setting or from a limitation in how the adapter handles JSON data.
   
   ### Potential Causes and Solutions:
   
   1. **Row Limit Configuration**:
      The Shillelagh adapter applies a limit that can truncate query results. The `get_data` method in the `SupersetShillelaghAdapter` class clamps the requested limit to `app_limit`, which it reads from the `SUPERSET_META_DB_LIMIT` key in the application configuration:
   
      ```python
      def get_data(
          self,
          bounds: dict[str, Filter],
          order: list[tuple[str, RequestedOrder]],
          limit: int | None = None,
          offset: int | None = None,
          **kwargs: Any,
      ) -> Iterator[Row]:
          app_limit: int | None = current_app.config["SUPERSET_META_DB_LIMIT"]
          if limit is None:
              limit = app_limit
          elif app_limit is not None:
              limit = min(limit, app_limit)
   
          query = self._build_sql(bounds, order, limit, offset)
   
          with self.engine_context() as engine:
              connection = engine.connect()
              rows = connection.execute(query)
              for i, row in enumerate(rows):
                  data = dict(zip(self.columns, row))
                  data["rowid"] = data[self._rowid] if self._rowid else i
                  yield data
      ```
      Note that this limit caps the number of rows fetched, not the number of columns, but make sure `SUPERSET_META_DB_LIMIT` is not set so low that expected data is truncated.
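
      For example, you could raise or remove the cap in your `superset_config.py`; the value below is only illustrative:

      ```python
      # superset_config.py -- a minimal sketch; pick a value that fits your data.
      # SUPERSET_META_DB_LIMIT caps the number of rows the meta-database
      # (Shillelagh) adapter fetches per query; setting it to None removes
      # the cap entirely, per the clamping logic in get_data above.
      SUPERSET_META_DB_LIMIT = 10_000
      ```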
   
   2. **Column Inspection**:
      The `_set_columns` method in the `SupersetShillelaghAdapter` class reflects the table to discover its columns during initialization. If reflection fails or returns an incomplete column list, columns will be missing from the adapter:
   
      ```python
      def _set_columns(self) -> None:
          from superset.models.core import Database
   
          database = (
              db.session.query(Database).filter_by(database_name=self.database).first()
          )
          if database is None:
              raise ProgrammingError(f"Database not found: {self.database}")
          self._allow_dml = database.allow_dml
   
          table = sql_parse.Table(self.table, self.schema, self.catalog)
          security_manager.raise_for_access(database=database, table=table)
   
          self.engine_context = partial(
              database.get_sqla_engine,
              catalog=self.catalog,
              schema=self.schema,
          )
   
          metadata = MetaData()
          with self.engine_context() as engine:
              try:
                  self._table = Table(
                      self.table,
                      metadata,
                      schema=self.schema,
                      autoload=True,
                      autoload_with=engine,
                  )
              except NoSuchTableError as ex:
                  raise ProgrammingError(f"Table does not exist: {self.table}") from ex
   
          primary_keys = [
              column for column in list(self._table.primary_key) if column.primary_key
          ]
          if len(primary_keys) == 1 and primary_keys[0].type.python_type == int:
              self._rowid = primary_keys[0].name
   
          self.columns = {
              column.name: self.get_field(column.type.python_type)
              for column in self._table.c
          }
      ```
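
      To rule out a reflection problem, you can inspect the table directly with SQLAlchemy and confirm that every column (including those past the 64th) is reported. This is a minimal sketch with placeholder connection details, not code from Superset itself:

      ```python
      # Hypothetical check: does SQLAlchemy reflection see all of the columns?
      # Replace the URI, table name, and schema with your own values.
      from sqlalchemy import create_engine, inspect

      engine = create_engine("postgresql://user:password@host/dbname")
      inspector = inspect(engine)
      columns = inspector.get_columns("my_table", schema="my_schema")

      print(len(columns))  # should match the column count you expect
      for column in columns:
          print(column["name"], column["type"])
      ```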
   
   3. **Logging and Debugging**:
      Enable detailed logging to surface any issues during data retrieval. Set the logging level to `DEBUG` in your Superset configuration (for example, in `superset_config.py`):
   
      ```python
      LOG_LEVEL = logging.DEBUG
      TIME_ROTATE_LOG_LEVEL = logging.DEBUG
      ```
   
   4. **Check for Known Issues**:
      There may be existing bug reports about handling wide tables or large JSON objects and arrays in the Shillelagh adapter. Make sure you are running the latest version of the adapter, and search the repository for related issues.
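
      For example, you can confirm the installed version from Python before chasing adapter-level bugs:

      ```python
      # Minimal sketch: print the installed shillelagh version.
      from importlib.metadata import version

      print(version("shillelagh"))
      ```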
   
   By addressing these potential causes, you should be able to resolve the issue of columns beyond the 64th returning null values in JSON data sources. If the problem persists, a closer look at the specific configuration and limitations of the Shillelagh adapter may be necessary.
   
   
   *To continue talking to [Dosu](https://dosu.dev), mention @dosu.*
   

