yurmix commented on issue #6320: Moving Average query type
URL: 
https://github.com/apache/incubator-druid/issues/6320#issuecomment-449161171
 
 
   @b-slim :
   
   > In your example i see that the first data point has <...> not sure what am 
missing here but isn't for the first data point `delta30Min` should be equal to 
`trailing30MinChanges`
   
   The formula for the `longMean` averager is roughly `sum(range(n))/n`, where 
`n` is the number of buckets. The divisor is always `n`, even when fewer than 
`n` buckets contain data.
   Here's the computation for the first record:
   `trailing30MinChange = sum(delta30Min[i-6..i0])/7 = 
sum(delta30Min[i0..i0])/7 = sum(delta30Min[i0])/7 = 30490/7 = 
4355.714285714285`.
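
As a rough illustration of that arithmetic (not the actual Druid code), here is a minimal Python sketch. The function name `long_mean` and its arguments are hypothetical; the only assumption taken from the computation above is that the sum of whatever buckets exist is divided by the full bucket count.

```python
def long_mean(available_values, buckets):
    """Sum the buckets that actually hold data, then divide by the
    configured bucket count (hypothetical helper, not Druid's API)."""
    # Missing buckets contribute nothing to the sum, but the divisor
    # stays at `buckets`, which pulls the average down.
    return sum(available_values) / buckets


# First record: only one of the 7 buckets (i0) has data.
first_record = long_mean([30490], buckets=7)
print(first_record)
```

With a single populated bucket the result is one seventh of `delta30Min`, which is why the two values differ for the first data point.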
   
   There are two things happening with regard to `intervals` that I should clarify:
   
   * movingAverage takes the `intervals` defined by the user and expands them 
by an offset when running the underlying groupBy query. This allows the averager 
to iterate over all N buckets contributing to a record, even for periods that 
are not part of the original `intervals`. The additional time range is not 
displayed, but its data contributes to the aggregator result. This is done 
[here](https://github.com/apache/incubator-druid/pull/6430/files#diff-41837c2a60693bdeee35530e5dcfc6d5R109).
   
   * When querying the datasource's "beginning of time", there will be a 
noticeable drop in value for the first few rows, as you noticed in your comment. 
This is because the prior buckets for these records are empty.
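
To make the first point concrete, here is a small Python sketch of the interval expansion. It is an assumption-laden illustration, not the code behind the link: `expand_interval` is a hypothetical name, and the offset of `buckets - 1` periods is my reading of "all N buckets contributing to a record"; the real implementation may compute the offset differently.

```python
from datetime import date, timedelta


def expand_interval(start, end, buckets, period=timedelta(days=1)):
    """Push the interval start back so the first displayed row has a
    full window behind it (hypothetical helper, assumed offset)."""
    # The current bucket counts as one of the `buckets`, so only
    # `buckets - 1` extra periods are needed before `start`.
    offset = (buckets - 1) * period
    return start - offset, end


# User asks for January 2018 with a 7-bucket daily averager.
inner_start, inner_end = expand_interval(date(2018, 1, 1), date(2018, 1, 9), 7)
print(inner_start, inner_end)
```

Under these assumptions the inner query would start at 2017-12-26, while only rows from 2018-01-01 onward are shown to the user.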
   
   To clarify, here's an example of a datasource and a result:
   
   Datasource:
   
   date       | metric
   -----------|-------
   12/23/2017 | 3     
   12/24/2017 | 3     
   12/25/2017 | 3     
   12/26/2017 | 3     
   12/27/2017 | 3     
   12/28/2017 | 3     
   12/29/2017 | 3     
   12/30/2017 | 3     
   12/31/2017 | 3     
   01/01/2018 | 9     
   01/02/2018 | 9     
   01/03/2018 | 9     
   01/04/2018 | 9     
   01/05/2018 | 9     
   01/06/2018 | 9     
   01/07/2018 | 9     
   01/08/2018 | 9     
   01/09/2018 | 9     
   
   Query #1 - Last 7 days averager, Interval Jan-2018:
   `{"queryType": "movingAverage", "granularity" :{"period":"P1D"}, 
"intervals": ["2018-01-01/2018-01-09"], "averagers": {"buckets":7}}`
   (I skipped a few parameters here).
   
   Result #1:
   
   date       | metric | averager 
   -----------|--------|----------
   01/01/2018 | 9      | 3.9      
   01/02/2018 | 9      | 4.7      
   01/03/2018 | 9      | 5.6      
   01/04/2018 | 9      | 6.4      
   01/05/2018 | 9      | 7.3      
   01/06/2018 | 9      | 8.1      
   01/07/2018 | 9      | 9.0      
   01/08/2018 | 9      | 9.0      
   01/09/2018 | 9      | 9.0      
   
   Query #2 - Last 7 days averager, All time:
   `{"queryType": "movingAverage", "granularity" :{"period":"P1D"}, 
"intervals": ["2017-12-23/2018-01-09"], "averagers": {"buckets":7}}`
   
   Result #2:
   
   date       | metric | averager 
   -----------|--------|----------
   12/23/2017 | 3      | 0.4      
   12/24/2017 | 3      | 0.9      
   12/25/2017 | 3      | 1.3      
   12/26/2017 | 3      | 1.7      
   12/27/2017 | 3      | 2.1      
   12/28/2017 | 3      | 2.6      
   12/29/2017 | 3      | 3.0      
   12/30/2017 | 3      | 3.0      
   12/31/2017 | 3      | 3.0      
   01/01/2018 | 9      | 3.9      
   01/02/2018 | 9      | 4.7      
   01/03/2018 | 9      | 5.6      
   01/04/2018 | 9      | 6.4      
   01/05/2018 | 9      | 7.3      
   01/06/2018 | 9      | 8.1      
   01/07/2018 | 9      | 9.0      
   01/08/2018 | 9      | 9.0      
   01/09/2018 | 9      | 9.0      
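
Both result tables can be reproduced with a short Python simulation. This is only a sketch of the behavior described above, not Druid's implementation: the helper names are hypothetical, days with no data are assumed to count as 0, the divisor is always the bucket count, and the last displayed day is treated as inclusive to match the tables.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

# Example datasource: nine days of 3 followed by nine days of 9.
DATA = {}
for i in range(9):
    DATA[date(2017, 12, 23) + i * ONE_DAY] = 3
    DATA[date(2018, 1, 1) + i * ONE_DAY] = 9


def trailing_mean(data, day, buckets=7):
    """Average over the `buckets` days ending at `day`; days before the
    start of the datasource are treated as empty (0)."""
    window = [data.get(day - k * ONE_DAY, 0) for k in range(buckets)]
    return sum(window) / buckets


def moving_average(data, first, last, buckets=7):
    """Return (date, metric, averager) rows for first..last inclusive."""
    rows = []
    day = first
    while day <= last:
        rows.append((day, data[day], round(trailing_mean(data, day, buckets), 1)))
        day += ONE_DAY
    return rows


result1 = moving_average(DATA, date(2018, 1, 1), date(2018, 1, 9))
result2 = moving_average(DATA, date(2017, 12, 23), date(2018, 1, 9))

for day, metric, averager in result2:
    print(day.strftime("%m/%d/%Y"), metric, averager)
```

The January rows are identical in both results because their windows only reach back to 12/26/2017, which holds data in either case; only the rows at the very start of the datasource see empty buckets.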
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
