mbcsa opened a new issue, #20919:
URL: https://github.com/apache/superset/issues/20919

   When exporting to CSV from cached data, the data is converted to a pandas DataFrame without 
column formats.
   This breaks the CSV_EXPORT option 'decimal' (decimal separator).
   
   The first time the data is exported, everything works and the decimal separator is 
respected.
   
   But when the query is executed again within the cache timeout, the data is 
read from "results_backend" and a DataFrame is created dynamically. As a 
result, the DataFrame doesn't have column format specifications.
   
   It worked fine until the following commit was merged: 
e1fd90697c1ed4f72e7982629779783ad9736a47
   https://github.com/apache/superset/pull/20760
   
   In file superset/core.py, the line `dtype=object` sets the "object" type for all 
columns of the DataFrame.
   
   ```
   def csv(...):
       ...
       df = pd.DataFrame(
           data=obj["data"],
           dtype=object,
           columns=[c["name"] for c in obj["columns"]],
       )
       ...
   ```
   After removing the line "dtype=object,", the CSV export works correctly:
   ```
   def csv(...):
       ...
       df = pd.DataFrame(
           data=obj["data"],
           columns=[c["name"] for c in obj["columns"]],
       )
       ...
   ```
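
The difference between the two constructions can be reproduced outside Superset. The following is a minimal sketch, not the Superset code path: the `obj` payload and the column names are hypothetical stand-ins for what is read back from "results_backend", and only `sep` and `decimal` from CSV_EXPORT are passed to `to_csv`.

```python
import pandas as pd

# Hypothetical stand-in for the payload read back from the results backend.
obj = {
    "columns": [{"name": "id"}, {"name": "amount"}],
    "data": [{"id": 1, "amount": 1.5}, {"id": 2, "amount": 2.25}],
}
columns = [c["name"] for c in obj["columns"]]

# Same separator options as the CSV_EXPORT setting in the reproduction steps.
csv_export = {"sep": ";", "decimal": ","}

# Construction with dtype=object: every column is stored as generic objects.
df_object = pd.DataFrame(data=obj["data"], dtype=object, columns=columns)

# Construction without dtype: pandas infers a float dtype for "amount".
df_inferred = pd.DataFrame(data=obj["data"], columns=columns)

csv_object = df_object.to_csv(index=False, **csv_export)
csv_inferred = df_inferred.to_csv(index=False, **csv_export)

# Print the dtype of the float column next to the CSV text it produces.
print(df_object.dtypes["amount"], csv_object, sep="\n")
print(df_inferred.dtypes["amount"], csv_inferred, sep="\n")
```

With the object-typed frame the `decimal` option is not applied to the float values, which matches the behaviour described in this report; with inferred dtypes the configured separator is used.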
   
   #### How to reproduce the bug
   
   1. In superset_config.py configure CSV_EXPORT options:
   ```
   CSV_EXPORT = {
       'encoding': 'utf-8',
       'sep': ';',
       'decimal': ',',
   }
   ```
   2. Go to /superset/sqllab/
   3. Create a NEW SQL query having a Decimal / Float / Real column.
   For example:
   
![image](https://user-images.githubusercontent.com/92950610/181808295-b7d3b433-857e-4307-ab58-70b87d1099be.png)
   4. Export results to CSV
   5. Note that the exported CSV file is correctly formed with the configured decimal 
separator ","
   6. Execute the SAME SQL again by pressing the Run button
   7. Export results to CSV AGAIN
   8. Note that the exported CSV file has a point "." as the decimal separator, 
instead of ",".
   
   ### Expected results
   
   Export to CSV should use the configured decimal separator, whether or not 
cached data is used.
   
   ### Actual results
   
   OK - Export to CSV uses the configured decimal separator when the data 
comes directly from the DB, without caching.
   FAIL - Export to CSV does NOT use the configured decimal separator when the data 
comes from the "results_backend" CACHE.
   
   #### Screenshots
   
   pandas DataFrame when cached
   
   
![image](https://user-images.githubusercontent.com/92950610/181809289-c527f913-0758-4aaf-b25a-306c58460fd9.png)
   
   pandas DataFrame when NOT cached
   
   
![image](https://user-images.githubusercontent.com/92950610/181809345-4bf545a9-fb64-41d0-9f2b-f7c1ed2d0a59.png)
   
   
   ### Environment
   - browser type and version: Chrome / Brave / Firefox
   - superset version: Docker `Superset 0.0.0dev`
   - docker build rusackas Thu Jul 28 17:36:00 UTC 2022
   - any feature flags active:
       "ALERT_REPORTS": True,
       "ENABLE_TEMPLATE_PROCESSING": True
   
   ### Checklist
   
   Make sure to follow these steps before submitting your issue - thank you!
   
   - [x] I have checked the superset logs for python stacktraces and included 
it here as text if there are any.
   - [x] I have reproduced the issue with at least the latest released version 
of superset.
   - [x] I have checked the issue tracker for the same issue and I haven't 
found one similar.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

