ankitsultana commented on issue #13760:
URL: https://github.com/apache/pinot/issues/13760#issuecomment-2370935188

   @gortiz thanks Gonzalo for sharing your views. You make several points here 
and in the design document, but I think it's best to first establish the 
broader picture in which we are doing this.
   
   As you know, most companies have been using Pinot for OLAP on mainly 
"events". The idea is that you can power user-facing dashboards or other 
end-user app UI elements, like counters, from real-time data with low latency.
   
   At least at Uber, over the last 2 years, we have seen an increase in the 
number of users who are using Pinot for alerting and monitoring ("metrics"). 
And recently we also shared in the Uber Meetup that we are now using Pinot for 
our logging platform too. (side-note: Pinot specifically solves the problem of 
High Cardinality Metrics)
   
   The term being used these days to cover this landscape of use-cases is 
[MELT](https://www.youtube.com/watch?v=CgrDLykZ21I): Metrics, Events, Logs and 
Traces.
   
   Pinot is able to solve all of these problems, but how well it solves each 
of them differs. What our proposal aims to do is to immediately improve 
Pinot's ability to tackle the Metrics and the Logging problems, and IMO bring 
it up to the state of the art in the HICAM (High Cardinality Metrics) space. 
(Note that ClickHouse also announced in their latest release that they [plan 
to support PromQL](https://clickhouse.com/blog/clickhouse-release-24-08).)
   
   re: your proposal about Streaming queries in Pinot, I think it addresses a 
different problem.
   
   re: your specific points that this is not necessary and that we can instead 
leverage the MSE or enable some SQL extensions, I shared the transformNull 
(gapfill) example in the doc, which should help clear that up. For other 
readers, see the comment thread [at the bottom of this 
doc](https://docs.google.com/document/d/1SBDDf71QZINYUjAbRSWguNMfbrWRGfdcF1JPi8SJZlM/edit).
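
   For readers unfamiliar with the term, here is a minimal sketch of what a 
gapfill step does to a bucketed time series. It is not the example from the 
doc and not Pinot code: `bucketize` and `transform_null` are hypothetical 
helper names, and the sketch assumes fixed-width time buckets, with empty 
buckets represented as `None` and filled with a default value.

```python
from typing import List, Optional, Tuple


def bucketize(
    points: List[Tuple[int, float]], start: int, end: int, step: int
) -> List[Optional[float]]:
    """Sum (timestamp, value) points into fixed-width buckets over [start, end).

    Buckets that receive no points stay None (hypothetical helper).
    """
    buckets: List[Optional[float]] = [None] * ((end - start) // step)
    for ts, value in points:
        if start <= ts < end:
            idx = (ts - start) // step
            buckets[idx] = value if buckets[idx] is None else buckets[idx] + value
    return buckets


def transform_null(series: List[Optional[float]], default: float = 0.0) -> List[float]:
    """Gapfill: replace every empty (None) bucket with a default value."""
    return [default if v is None else v for v in series]


# Three data points with a gap between t=10 and t=40.
points = [(0, 1.0), (10, 2.0), (40, 5.0)]
series = bucketize(points, start=0, end=50, step=10)
filled = transform_null(series)
```

   The point of the sketch is that the gap is a property of the time grid 
rather than of any stored row: the two empty buckets exist only because the 
series is defined over every bucket in the queried range.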
   
   Finally, I think this time-series engine may stick out like a sore thumb 
right now, but once we integrate it with the Multistage Engine Shuffle 
Framework, I think it will sit very neatly with the rest of the code. And from 
a use-case perspective, it takes Pinot forward and expands its capabilities 
around use-cases that it theoretically can already support, but can't support 
as well. (side-note: at a high level, I am thinking in the direction that we 
will have a common Operator interface in the SPI that is oblivious to the 
data model: relational, time-series, etc.)
   
   A potential future direction is that we may make the entire engine (above 
the V1 server-level engine) pluggable, but it might make sense to consider that 
only after making sure that the time-series engine becomes a success.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

