paul-rogers commented on issue #7124:
URL: https://github.com/apache/druid/issues/7124#issuecomment-893938762


   +1 for this feature. As noted, without it I cannot tell how much 
space a column consumes or whether the column is worth that cost. I could 
infer the size by creating a new table without the column and comparing the 
two table sizes, but that is a hassle.
   
   The number returned should account for all the space dedicated to the 
column, including any dictionary overhead and the effect of whatever 
encoding is applied, such as run-length encoding. It would be wonderful to 
have separate numbers for in-memory and on-disk sizes if the two differ 
significantly.
   
   The key thing we want to know is the cost of column X relative to the 
overall table size. So, as long as the in-memory and on-disk sizes are 
proportional, a single size is good enough (provided it is accurate).
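
   To make "cost relative to the overall table size" concrete, here is a 
minimal Python sketch. It assumes we already have a per-column size in bytes 
from some source; the `column_shares` helper, the column names, and the byte 
counts are all made up for illustration and are not an existing Druid API.

```python
def column_shares(column_sizes):
    """Return each column's fraction of the summed column sizes.

    column_sizes: dict mapping column name -> size in bytes (hypothetical input).
    """
    total = sum(column_sizes.values())
    if total == 0:
        # Avoid dividing by zero for an empty or all-zero table.
        return {name: 0.0 for name in column_sizes}
    return {name: size / total for name, size in column_sizes.items()}


# Illustrative numbers only, not measurements from a real segment.
sizes = {"__time": 400, "page": 1200, "added": 400}
shares = column_shares(sizes)

for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {share:.0%}")
```

   Because only the ratios matter here, the same calculation works with 
either in-memory or on-disk sizes as long as the two are proportional.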
   
   A good check would be that the sum of the column sizes in a segment 
roughly equals the segment size, aside from any segment-level overhead.
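
   That check could be sketched as follows. This is only an illustration 
under assumptions: the `sizes_are_consistent` helper is hypothetical, and the 
5% default allowance for segment overhead is an arbitrary placeholder, not a 
figure taken from Druid.

```python
def sizes_are_consistent(column_sizes, segment_size, overhead_tolerance=0.05):
    """Check that per-column sizes add up to roughly the segment size.

    column_sizes: dict mapping column name -> size in bytes (hypothetical input).
    segment_size: total size of the segment in bytes.
    overhead_tolerance: allowed gap as a fraction of segment_size (assumed value).
    """
    if segment_size <= 0:
        return False
    column_total = sum(column_sizes.values())
    # The gap is whatever the per-column numbers fail to explain.
    gap = abs(segment_size - column_total)
    return gap <= overhead_tolerance * segment_size


# Illustrative numbers only.
sizes = {"__time": 400, "page": 1200, "added": 400}
ok = sizes_are_consistent(sizes, segment_size=2050)
suspicious = sizes_are_consistent(sizes, segment_size=4000)
```

   A large gap in either direction would suggest that the reported column 
sizes are missing something (for example, dictionary overhead) or are 
double counting.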


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


