[ https://issues.apache.org/jira/browse/HBASE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107770#comment-17107770 ]
Rushabh Shah commented on HBASE-17756:
--------------------------------------

> I could do some ground work since have idea where to plug in and how to test
> and could then hand over to you [~shahrs87] if that'd help?

[~stack] Sounds like a good idea. Will wait for your green signal then.

> We should have better introspection of HFiles
> ---------------------------------------------
>
>                 Key: HBASE-17756
>                 URL: https://issues.apache.org/jira/browse/HBASE-17756
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: HFile
>            Reporter: Esteban Gutierrez
>            Assignee: Rushabh Shah
>            Priority: Major
>
> [~saint....@gmail.com] was suggesting to use DataSketches
> (https://datasketches.github.io) in order to write additional statistics to
> the HFiles. This could be used to improve our split decisions,
> troubleshooting or potentially do other interesting analysis without having
> to perform full table scans. The statistics could be stored as part of the
> HFile but we could initially improve the visibility of the data by adding
> some statistics to HFilePrettyPrinter.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)