EBoisseauSierra opened a new issue #15036:
URL: https://github.com/apache/superset/issues/15036


   ## Issue
   
   I have a simple metric (say: `SUM(orders_count)`), and I would like to track 
its evolution over time. My objective is to visualize the total for different 
time granularity (group by hour, day, week, etc.).
   
   However, I don't always have records for each time bucket (e.g. I had no 
orders between 2 and 3am, or on a Sunday, etc.). In this situation, **I want 
to plot `0` for each time bucket I had no data for**.
   
   Currently, the line chart simply omits the empty time bucket and 
interpolates the line between the previous and next data points:
   
   ![Screenshot from 2021-06-08 
09-26-46_shadow](https://user-images.githubusercontent.com/37387755/121152480-27218300-c83d-11eb-8394-798110b73378.png)
   
   I am aware that I can use `pandas` resampling methods to actually plot a `0` 
data point on each given hour I had no order:
   
   ![Screenshot from 2021-06-08 
09-27-10_shadow](https://user-images.githubusercontent.com/37387755/121153236-cc3c5b80-c83d-11eb-8b04-e74029efad46.png)
   
   However, this workaround doesn't “follow” the time granularity I aggregate 
data on:
   
   ![Screenshot from 2021-06-08 
09-27-37_shadow](https://user-images.githubusercontent.com/37387755/121154367-d6128e80-c83e-11eb-85e7-000374fe2daf.png)
   
   This means that each time I want to change the granularity, I have to modify 
it in two different places to get the correct graph.
   
   ![Screenshot from 2021-06-08 
09-27-55_shadow](https://user-images.githubusercontent.com/37387755/121155016-5802b780-c83f-11eb-9243-b9b1d74fc1cf.png)
   
   Moreover, this means that I cannot pass the time granularity as a parameter 
from a native filter.
   
   ## Requested feature
   
   I would like a simple option to replace missing values (i.e. time buckets 
with no record to aggregate) with either:
   
   * linear interpolation (current behaviour),
   * nothing (can be emulated via `pandas.resample(<granularity>, mean)`), 
   * 0 (can be emulated via `pandas.resample(<granularity>, sum)`).
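
   To make the three options concrete, here is a minimal pandas sketch of what 
such a setting could compute from a single granularity value. The function name 
`fill_missing_buckets`, the `method` labels and the sample data are hypothetical, 
not Superset code. Note that the "nothing" case uses `sum(min_count=1)` rather 
than `mean`, so that buckets holding several records still show their total.

```python
import pandas as pd

# Hypothetical sparse data: no orders recorded between 02:00 and 04:00.
orders = pd.DataFrame(
    {"orders_count": [3, 5, 2]},
    index=pd.to_datetime(
        ["2021-06-08 00:15", "2021-06-08 01:40", "2021-06-08 04:05"]
    ),
)


def fill_missing_buckets(df, granularity, method):
    """Sum orders_count per time bucket and handle empty buckets.

    method is one of "zero", "nothing" or "interpolate" (illustrative labels).
    """
    resampled = df["orders_count"].resample(granularity)
    if method == "zero":
        # The sum of an empty bucket is 0.
        return resampled.sum()
    # min_count=1 turns empty buckets into NaN instead of 0.
    totals = resampled.sum(min_count=1)
    if method == "nothing":
        return totals
    if method == "interpolate":
        return totals.interpolate(method="linear")
    raise ValueError(f"unknown method: {method!r}")


hourly_zero = fill_missing_buckets(orders, "h", "zero")
hourly_gap = fill_missing_buckets(orders, "h", "nothing")
hourly_line = fill_missing_buckets(orders, "h", "interpolate")
```

   Because the granularity is passed in once, switching from `"h"` to `"D"` 
changes both the aggregation and the gap filling together.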
   
   Metabase uses a simple dropdown menu for that:
   
   ![Screenshot from 2021-06-08 
11-02-27_shadow.png](https://user-images.githubusercontent.com/37387755/121167244-a917a900-c849-11eb-9086-7d76927e2a54.png)
   
   
   ## Alternatives
   
   As seen above, using `pandas.resample` only partially solves the issue, 
as it doesn't dynamically adjust to the selected time granularity.
   
   One could of course write a custom query that joins the list of every single 
hour between `min(timestamp)` and `max(timestamp)` to force the generation of 
records for these time buckets, but that is a lot of work, and again it 
wouldn't adapt dynamically to different time grains.
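
   For illustration, the same "join against a generated list of buckets" idea 
can be sketched in pandas rather than SQL. The helper name `zero_filled_totals` 
and the column names are assumptions made for this example. As written it only 
handles fixed-width grains such as `"h"` or `"D"`, because `dt.floor` does not 
accept calendar-based grains like weeks or months.

```python
import pandas as pd


def zero_filled_totals(df, time_col, value_col, granularity):
    """Sum value_col per bucket, adding a 0 row for every empty bucket."""
    # Aggregate only the buckets that have records (what a GROUP BY returns).
    buckets = df[time_col].dt.floor(granularity)
    totals = df.groupby(buckets)[value_col].sum()
    # Generate every bucket between the first and the last record.
    spine = pd.date_range(
        totals.index.min(), totals.index.max(), freq=granularity
    )
    # Align the totals on the full list; buckets without records become 0.
    return totals.reindex(spine, fill_value=0)


# Hypothetical sample: no orders between 02:00 and 04:00.
records = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2021-06-08 00:15", "2021-06-08 01:40", "2021-06-08 04:05"]
        ),
        "orders_count": [3, 5, 2],
    }
)

hourly = zero_filled_totals(records, "timestamp", "orders_count", "h")
daily = zero_filled_totals(records, "timestamp", "orders_count", "D")
```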
   
   ## Context
   
   Examples generated on Superset 1.1.0.

