[ https://issues.apache.org/jira/browse/PHOENIX-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237524#comment-14237524 ]
James Taylor commented on PHOENIX-1453:
---------------------------------------

[~ramkrishna] - one other thought I had. It'd be fine to keep a single ROW_COUNT on each row of the stats table that is the total number of rows for that region. We also have the total number of bytes for that region. These two together are what we need. We just have to make sure that in PTableStats we end up with either a) two parallel arrays: bytes per guidepost and total row count, or b) a kind of per-guidepost structure that includes the byte count and number of rows. In other words, we don't necessarily need a row count for every guidepost - it's just an estimate, and it's fine if we maintain this per region. On the client side it's useful to have the parallel arrays, as we can easily maintain the equivalent parallel array for row count and bytes to get the information we need.

> Collect row counts per region in stats table
> --------------------------------------------
>
>                 Key: PHOENIX-1453
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1453
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: James Taylor
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: Phoenix-1453.patch, Phoenix-1453_1.patch,
> Phoenix-1453_2.patch, Phoenix-1453_3.patch
>
>
> We currently collect guideposts per equal chunk, but we should also capture
> row counts. Should we have a parallel array with the guideposts that count
> rows per guidepost, or is it enough to have a per region count?
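
As a rough illustration of option (a) from the comment, the sketch below derives a row-count array parallel to the per-guidepost byte counts, using only the per-region total row count. Everything in it is an assumption made for illustration: the class name `GuidePostEstimates`, the method `estimateRows`, and the proportional split (rows assumed to be spread evenly across bytes within a region) are hypothetical and are not taken from Phoenix's PTableStats or from the attached patches.

```java
// Hypothetical sketch only; this is NOT Phoenix's PTableStats API.
final class GuidePostEstimates {
    private GuidePostEstimates() {}

    /**
     * Builds a row-count array parallel to bytesPerGuidePost by splitting the
     * region's total row count in proportion to each guidepost's byte count.
     * Assumption: rows are spread roughly evenly across the region's bytes.
     */
    static long[] estimateRows(long[] bytesPerGuidePost, long regionRowCount) {
        long regionBytes = 0;
        for (long b : bytesPerGuidePost) {
            regionBytes += b;
        }
        long[] rows = new long[bytesPerGuidePost.length];
        if (regionBytes == 0) {
            return rows; // no guideposts or no bytes: nothing to distribute
        }
        long cumulativeBytes = 0;
        long assignedRows = 0;
        for (int i = 0; i < rows.length; i++) {
            cumulativeBytes += bytesPerGuidePost[i];
            // Round the cumulative share rather than each share, so rounding
            // error does not accumulate across guideposts.
            long cumulativeRows =
                Math.round((double) regionRowCount * cumulativeBytes / regionBytes);
            rows[i] = cumulativeRows - assignedRows;
            assignedRows = cumulativeRows;
        }
        return rows;
    }

    public static void main(String[] args) {
        long[] bytes = {100L, 300L, 600L}; // bytes per guidepost in one region
        long[] rows = estimateRows(bytes, 50L); // region holds 50 rows in total
        System.out.println(java.util.Arrays.toString(rows)); // prints [5, 15, 30]
    }
}
```

Option (b) would instead pair each byte count with its row estimate in a single per-guidepost object. The parallel-array form shown here matches the client-side use the comment describes, where both arrays are indexed by guidepost position.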