leerho opened a new issue #9806:
URL: https://github.com/apache/druid/issues/9806


   ### Description
   
   Debugging Segments with sketches 
   
   ### Motivation
   
   Debugging Segments with sketches is challenging because there is currently 
no simple way to extract the sketch binaries from a segment.  
   
   This idea was motivated by Issue #9736 where the segment with the sketches 
that were causing problems was provided to the DataSketches team, but without 
familiarity with the segment structure schema, it was all but impossible to 
analyze.  @gianm supported the idea of creating this issue for discussion.
   
   The 
[dump-segment tool](https://druid.apache.org/docs/latest/operations/dump-segment.html)
 was partially helpful, but it presents the sketches as a column of Base64 
encodings in JSON format.  To understand the internal state of each sketch, 
however, the sketch needs to be in binary form.  Once the sketches are in 
binary form, there are a number of tools in the DataSketches library that can 
be used to analyze the exact state of the sketches. 
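   
   As a rough illustration of the conversion step described above, the 
following Python sketch decodes Base64 sketch values from dumped rows into raw 
binary files.  It assumes the dump is newline-delimited JSON with one object 
per row; the function names and the column name passed in are hypothetical 
and are not part of Druid or DataSketches.

```python
import base64
import json
from pathlib import Path


def extract_sketches(lines, column):
    """Yield (row_index, sketch_bytes) for rows with a Base64 string in `column`.

    Assumption: each non-empty line is one JSON object representing a row.
    """
    for index, line in enumerate(lines):
        line = line.strip()
        if not line:
            continue  # skip blank lines in the dump
        row = json.loads(line)
        encoded = row.get(column)
        if isinstance(encoded, str):
            yield index, base64.b64decode(encoded)


def write_sketches(lines, column, out_dir):
    """Write each decoded sketch to <out_dir>/<column>_<row_index>.bin."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, data in extract_sketches(lines, column):
        path = out / f"{column}_{index}.bin"
        path.write_bytes(data)
        paths.append(path)
    return paths
```

   Each resulting `.bin` file holds the bytes of one sketch, which could then 
be loaded with the DataSketches tooling for the matching sketch type.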
   
   Sketches, by their nature, are complex state machines.  Providing a tool 
that extracts the sketch binaries from a segment will significantly enhance our 
ability to debug sketching problems in the future.
   




