villebro commented on code in PR #23476:
URL: https://github.com/apache/superset/pull/23476#discussion_r1158341466
##########
superset/common/query_context.py:
##########
@@ -86,10 +86,9 @@ def __init__(
self._processor = QueryContextProcessor(self)
def get_data(
- self,
- df: pd.DataFrame,
+ self, df: pd.DataFrame, coltypes: List[str]
Review Comment:
I believe `coltypes` is incorrectly typed here; they're `GenericDataType`,
not `str`:
```suggestion
self, df: pd.DataFrame, coltypes: List[GenericDataType]
```
##########
superset/common/query_context_processor.py:
##########
@@ -447,7 +448,9 @@ def processing_time_offsets( # pylint:
disable=too-many-locals,too-many-stateme
rv_df = pd.concat(rv_dfs, axis=1, copy=False) if time_offsets else df
return CachedTimeOffset(df=rv_df, queries=queries,
cache_keys=cache_keys)
- def get_data(self, df: pd.DataFrame) -> Union[str, List[Dict[str, Any]]]:
+ def get_data(
+ self, df: pd.DataFrame, coltypes: List[str]
Review Comment:
Same here:
```suggestion
self, df: pd.DataFrame, coltypes: List[GenericDataType]
```
##########
superset/common/query_context_processor.py:
##########
@@ -461,7 +464,17 @@ def get_data(self, df: pd.DataFrame) -> Union[str,
List[Dict[str, Any]]]:
df, index=include_index, **config["CSV_EXPORT"]
)
elif self._query_context.result_format ==
ChartDataResultFormat.XLSX:
- result = excel.df_to_excel(df, **config["EXCEL_EXPORT"])
+ ndf = df.copy()
+ columns = list(df)
+ i = 0
+ for col in columns:
+ coltype = None
+ if i < len(coltypes):
+ coltype = coltypes[i]
+ if coltype == GenericDataType.NUMERIC:
+ ndf[col] = pd.to_numeric(df[col], errors="ignore")
+ i += 1
Review Comment:
Do we really need to be prepared for a case where `len(coltypes !=
len(columns)`? AFAIK this should never happen, and if it does, I wouldn't be
comfortable setting column types based on the respective index in `coltypes`..
##########
superset/common/query_context_processor.py:
##########
@@ -461,7 +464,17 @@ def get_data(self, df: pd.DataFrame) -> Union[str,
List[Dict[str, Any]]]:
df, index=include_index, **config["CSV_EXPORT"]
)
elif self._query_context.result_format ==
ChartDataResultFormat.XLSX:
- result = excel.df_to_excel(df, **config["EXCEL_EXPORT"])
+ ndf = df.copy()
+ columns = list(df)
+ i = 0
+ for col in columns:
+ coltype = None
+ if i < len(coltypes):
+ coltype = coltypes[i]
+ if coltype == GenericDataType.NUMERIC:
+ ndf[col] = pd.to_numeric(df[col], errors="ignore")
Review Comment:
Maybe we can leave this for a follow-up if it's needed, but do you think it
would make sense to apply temporal formatting for `GenericDataType.TEMPORAL`?
```suggestion
if coltype == GenericDataType.NUMERIC:
ndf[col] = pd.to_numeric(df[col],
errors="ignore")
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]