prodeezy commented on issue #617:
URL: https://github.com/apache/iceberg/issues/617#issuecomment-655595194


   Posting the conversation from the last sync:
   
   
   > Gautam: How are others monitoring dataset health? We are discussing additional 
metrics about datasets, such as the number of manifests and the number of data files.
   > Ryan: What is the use case?
   > Gautam: We want to know when tables are misconfigured or need attention. 
For example, when commits take too long due to retries, or when scan planning 
takes a long time. Are current stats enough?
   > Ryan: That would be useful, we don’t currently do much to keep track of 
those. I think the current approach of emitting events to listeners is good, 
but we will need to add more data to those events, like how long a commit or 
planning run took. We can also start tracking total number of metadata files in 
snapshot summaries to emit with the events for context.
   > Edgar: Should we add similar listeners to the catalogs?
   > Ryan: That’s a good idea, so you could know how long it took to create or 
drop a table.
   
   
   
   So it seems we want to stick with registering listeners. Anyone who wants to 
can use a Codahale metrics client to bubble up any metrics. Does this sound 
good, @raptond? To support that, some additional metrics need to be 
bubbled up in the snapshot summary, scan events, etc. I'll create an issue for 
that.
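
   To make the listener idea concrete, here is a rough, self-contained sketch of the pattern. Nothing in it is Iceberg's or Codahale's actual API: `CommitEvent`, `EventBus`, and `SimpleMetrics` are hypothetical stand-ins, and the event fields (duration, retries, manifest count) are only the kind of extra context discussed in the sync. In practice `SimpleMetrics` would be replaced by whatever metrics client you already use.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Consumer;

// Hypothetical event type (not an Iceberg class). It carries the additional
// context proposed in the discussion: commit duration, retries, manifest count.
final class CommitEvent {
  final String tableName;
  final long durationMs;
  final int retries;
  final int manifestCount;

  CommitEvent(String tableName, long durationMs, int retries, int manifestCount) {
    this.tableName = tableName;
    this.durationMs = durationMs;
    this.retries = retries;
    this.manifestCount = manifestCount;
  }
}

// Hypothetical listener registry: listeners are registered per event class
// and invoked synchronously when an event of exactly that class is published.
final class EventBus {
  private final Map<Class<?>, List<Consumer<Object>>> listeners = new ConcurrentHashMap<>();

  <E> void register(Class<E> type, Consumer<E> listener) {
    listeners
        .computeIfAbsent(type, k -> new CopyOnWriteArrayList<>())
        .add(event -> listener.accept(type.cast(event)));
  }

  void publish(Object event) {
    List<Consumer<Object>> registered =
        listeners.getOrDefault(event.getClass(), Collections.emptyList());
    for (Consumer<Object> listener : registered) {
      listener.accept(event);
    }
  }
}

// Hypothetical stand-in for a metrics client: named, thread-safe counters.
final class SimpleMetrics {
  private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

  void add(String name, long amount) {
    counters.computeIfAbsent(name, k -> new LongAdder()).add(amount);
  }

  long get(String name) {
    LongAdder counter = counters.get(name);
    return counter == null ? 0L : counter.sum();
  }
}

final class ListenerMetricsSketch {
  public static void main(String[] args) {
    EventBus bus = new EventBus();
    SimpleMetrics metrics = new SimpleMetrics();

    // The listener forwards event data to the metrics client.
    bus.register(CommitEvent.class, e -> {
      metrics.add("commit.count", 1);
      metrics.add("commit.duration.ms", e.durationMs);
      metrics.add("commit.retries", e.retries);
    });

    // The table name and values below are made up for illustration.
    bus.publish(new CommitEvent("db.events", 120L, 2, 40));
    System.out.println("commits recorded: " + metrics.get("commit.count"));
  }
}
```

   The point of the split is that the table code only has to emit events with enough context. Whether those end up in Codahale, logs, or somewhere else is decided by whoever registers the listener.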




