Yes James, the query uses the guideposts of the cf referenced in the filter. But I think Maryann expects rowCount and byteCount to be available for each guidepost key, which we currently don't store. Currently, we can only use metrics (like rowCount/byteCount) at the cf level, right?
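
To make the distinction concrete, here is a minimal sketch in plain Java, with no Phoenix or HBase classes. It assumes the layout described in this thread: the aggregated region totals are written on the first guidepost row of each cf, and the other guidepost rows carry no count cells. `GuidePost`, `totalsPerCf`, `rowsWithoutCounts` and the sample values are hypothetical names and data made up for illustration, not Phoenix APIs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CfLevelStatsSketch {

    /** Hypothetical stand-in for one stats row; a null count means the cell is absent. */
    static final class GuidePost {
        final String cf;
        final String key;
        final Long rowCount;
        final Long byteCount;

        GuidePost(String cf, String key, Long rowCount, Long byteCount) {
            this.cf = cf;
            this.key = key;
            this.rowCount = rowCount;
            this.byteCount = byteCount;
        }
    }

    /** Sums whatever count cells are present, per cf. Value is {rowCount, byteCount}. */
    static Map<String, long[]> totalsPerCf(List<GuidePost> rows) {
        Map<String, long[]> totals = new LinkedHashMap<>();
        for (GuidePost gp : rows) {
            long[] t = totals.computeIfAbsent(gp.cf, k -> new long[2]);
            if (gp.rowCount != null) {
                t[0] += gp.rowCount;
            }
            if (gp.byteCount != null) {
                t[1] += gp.byteCount;
            }
        }
        return totals;
    }

    /** Counts guidepost rows that have neither a rowCount nor a byteCount cell. */
    static int rowsWithoutCounts(List<GuidePost> rows) {
        int missing = 0;
        for (GuidePost gp : rows) {
            if (gp.rowCount == null && gp.byteCount == null) {
                missing++;
            }
        }
        return missing;
    }

    /** Made-up data: only the first guidepost of each cf carries the aggregated counts. */
    static List<GuidePost> sampleRows() {
        List<GuidePost> rows = new ArrayList<>();
        rows.add(new GuidePost("0", "a", 300L, 30000L));
        rows.add(new GuidePost("0", "b", null, null));
        rows.add(new GuidePost("0", "c", null, null));
        rows.add(new GuidePost("CF2", "a", 120L, 9000L));
        rows.add(new GuidePost("CF2", "m", null, null));
        return rows;
    }

    public static void main(String[] args) {
        List<GuidePost> rows = sampleRows();
        for (Map.Entry<String, long[]> e : totalsPerCf(rows).entrySet()) {
            System.out.println("cf=" + e.getKey()
                    + " rowCount=" + e.getValue()[0]
                    + " byteCount=" + e.getValue()[1]);
        }
        System.out.println("guidepost rows without count cells: " + rowsWithoutCounts(rows));
    }
}
```

Under that assumption, a scan restricted to a key range that skips the first guidepost of a cf would see only rows without count cells, which may be what Maryann ran into, while the cf-level totals stay recoverable.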

On Sat, Feb 13, 2016 at 11:34 AM, James Taylor <[email protected]> wrote:

> We should have separate guideposts per cf, as the data distribution may be
> different. We use the default cf if it's being filtered on, but otherwise
> use a different cf.
>
> Is that how it works currently, Ankit?
>
> On Friday, February 12, 2016, Ankit Singhal <[email protected]> wrote:
>
> > but I think we need these metrics at cf only right as per this comment-
> >
> > https://issues.apache.org/jira/browse/PHOENIX-2143?focusedCommentId=15069779&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15069779
> >
> > that's why we serialize aggregated value of region at cf level in first
> > guide post only.
> >
> > Regards,
> > Ankit Singhal
> >
> > On Sat, Feb 13, 2016 at 9:07 AM, Maryann Xue <[email protected]> wrote:
> >
> > > Thanks a lot for the answer, James! The data size has well exceeded the
> > > guidepost width and the guideposts do exist but without corresponding
> > > "rowCount" or "byteCount" cell. I'll try doing a Phoenix query instead
> > > and confirm that it is a bug.
> > >
> > > Thanks,
> > > Maryann
> > >
> > > On Fri, Feb 12, 2016 at 10:21 PM, James Taylor <[email protected]> wrote:
> > >
> > > > Hi Maryann,
> > > > If the amount of data in a region is less than the guidepost width,
> > > > then it's possible you'd get no guideposts for that region. Do you
> > > > think that's the case? If not, it sound like there may be a bug.
> > > >
> > > > Assuming you're querying to get the stats information, I'd recommend
> > > > doing a Phoenix query directly. The code you're emulating uses
> > > > straight HBase APIs because it's called from the server-side. It'd be
> > > > a one liner as a Phoenix query.
> > > >
> > > > Thanks,
> > > > James
> > > >
> > > > On Fri, Feb 12, 2016 at 11:23 AM, Maryann Xue <[email protected]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > This was something I noticed when applying Phoenix table stats into
> > > > > Calcite-Phoenix cost calculation: When executing the following code
> > > > > (a slightly modified version of the existing StatisticsUtil method)
> > > > > to scan stats table for a specific column-family and a specific
> > > > > start/stop key range, I got guidepost rows that did not contain the
> > > > > rowCount or byteCount cell, for all rows in the specified range.
> > > > > Apparently, I had set the corresponding columns in the Scan (as
> > > > > shown below). Meanwhile, another range of stats in the same table
> > > > > gave me the right result. I am wondering if this is an expected
> > > > > behavior or it is a bug?
> > > > >
> > > > > public static PTableStats readStatistics(HTableInterface statsHTable,
> > > > >         byte[] tableNameBytes, ImmutableBytesPtr cf, byte[] startKey, byte[] stopKey,
> > > > >         long clientTimeStamp)
> > > > >         throws IOException {
> > > > >     ImmutableBytesWritable ptr = new ImmutableBytesWritable();
> > > > >     Scan s;
> > > > >     if (cf == null) {
> > > > >         s = MetaDataUtil.newTableRowsScan(tableNameBytes,
> > > > >                 MetaDataProtocol.MIN_TABLE_TIMESTAMP, clientTimeStamp);
> > > > >     } else {
> > > > >         s = MetaDataUtil.newTableRowsScan(getAdjustedKey(startKey, tableNameBytes, cf, false),
> > > > >                 getAdjustedKey(stopKey, tableNameBytes, cf, true),
> > > > >                 MetaDataProtocol.MIN_TABLE_TIMESTAMP,
> > > > >                 clientTimeStamp);
> > > > >     }
> > > > >     s.addColumn(QueryConstants.DEFAULT_COLUMN_FAMILY_BYTES,
> > > > >             PhoenixDatabaseMetaData.GUIDE_POSTS_WIDTH_BYTES);
> > > > >     s.addColumn(QueryConstants.DEFAULT_COLUMN_FAMILY_BYTES,
> > > > >             PhoenixDatabaseMetaData.GUIDE_POSTS_ROW_COUNT_BYTES);
> > > > >     s.addColumn(QueryConstants.DEFAULT_COLUMN_FAMILY_BYTES,
> > > > >             QueryConstants.EMPTY_COLUMN_BYTES);
> > > > >     ResultScanner scanner = null;
> > > > >     long timeStamp = MetaDataProtocol.MIN_TABLE_TIMESTAMP;
> > > > >     TreeMap<byte[], GuidePostsInfoBuilder> guidePostsInfoWriterPerCf =
> > > > >             new TreeMap<byte[], GuidePostsInfoBuilder>(Bytes.BYTES_COMPARATOR);
> > > > >     try {
> > > > >         scanner = statsHTable.getScanner(s);
> > > > >         Result result = null;
> > > > >         while ((result = scanner.next()) != null) {
> > > > >             CellScanner cellScanner = result.cellScanner();
> > > > >             long rowCount = 0;
> > > > >             long byteCount = 0;
> > > > >             byte[] cfName = null;
> > > > >             int tableNameLength;
> > > > >             int cfOffset;
> > > > >             int cfLength;
> > > > >             boolean valuesSet = false;
> > > > >             // Only the two cells with quals GUIDE_POSTS_ROW_COUNT_BYTES and GUIDE_POSTS_BYTES would be retrieved
> > > > >             while (cellScanner.advance()) {
> > > > >                 Cell current = cellScanner.current();
> > > > >                 if (!valuesSet) {
> > > > >                     tableNameLength = tableNameBytes.length + 1;
> > > > >                     cfOffset = current.getRowOffset() + tableNameLength;
> > > > >                     cfLength = getVarCharLength(current.getRowArray(), cfOffset,
> > > > >                             current.getRowLength() - tableNameLength);
> > > > >                     ptr.set(current.getRowArray(), cfOffset, cfLength);
> > > > >                     valuesSet = true;
> > > > >                 }
> > > > >                 cfName = ByteUtil.copyKeyBytesIfNecessary(ptr);
> > > > >                 if (Bytes.equals(current.getQualifierArray(), current.getQualifierOffset(),
> > > > >                         current.getQualifierLength(),
> > > > >                         PhoenixDatabaseMetaData.GUIDE_POSTS_ROW_COUNT_BYTES, 0,
> > > > >                         PhoenixDatabaseMetaData.GUIDE_POSTS_ROW_COUNT_BYTES.length)) {
> > > > >                     rowCount = PLong.INSTANCE.getCodec().decodeLong(current.getValueArray(),
> > > > >                             current.getValueOffset(), SortOrder.getDefault());
> > > > >                 } else if (Bytes.equals(current.getQualifierArray(), current.getQualifierOffset(),
> > > > >                         current.getQualifierLength(),
> > > > >                         PhoenixDatabaseMetaData.GUIDE_POSTS_WIDTH_BYTES, 0,
> > > > >                         PhoenixDatabaseMetaData.GUIDE_POSTS_WIDTH_BYTES.length)) {
> > > > >                     byteCount = PLong.INSTANCE.getCodec().decodeLong(current.getValueArray(),
> > > > >                             current.getValueOffset(), SortOrder.getDefault());
> > > > >                 }
> > > > >                 if (current.getTimestamp() > timeStamp) {
> > > > >                     timeStamp = current.getTimestamp();
> > > > >                 }
> > > > >             }
> > > > >             if (cfName != null) {
> > > > >                 byte[] newGPStartKey = getGuidePostsInfoFromRowKey(
> > > > >                         tableNameBytes, cfName, result.getRow());
> > > > >                 GuidePostsInfoBuilder guidePostsInfoWriter =
> > > > >                         guidePostsInfoWriterPerCf.get(cfName);
> > > > >                 if (guidePostsInfoWriter == null) {
> > > > >                     guidePostsInfoWriter = new GuidePostsInfoBuilder();
> > > > >                     guidePostsInfoWriterPerCf.put(cfName, guidePostsInfoWriter);
> > > > >                 }
> > > > >                 guidePostsInfoWriter.addGuidePosts(newGPStartKey, byteCount, rowCount);
> > > > >             }
> > > > >         }
> > > > >         if (!guidePostsInfoWriterPerCf.isEmpty()) {
> > > > >             return new PTableStatsImpl(
> > > > >                     getGuidePostsPerCf(guidePostsInfoWriterPerCf), timeStamp);
> > > > >         }
> > > > >     } finally {
> > > > >         if (scanner != null) {
> > > > >             scanner.close();
> > > > >         }
> > > > >     }
> > > > >     return PTableStats.EMPTY_STATS;
> > > > > }
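
For the Phoenix query James suggested earlier in the thread, something along these lines might work. This is only a sketch under assumptions: the column names (PHYSICAL_NAME, COLUMN_FAMILY, GUIDE_POST_KEY, GUIDE_POSTS_ROW_COUNT, GUIDE_POSTS_WIDTH) are my reading of the SYSTEM.STATS schema and should be checked against the Phoenix version in use, and `StatsQuerySketch`, `statsQuery` and `printStats` are hypothetical names. It uses `ResultSet.wasNull()` to tell a missing count cell apart from a stored zero.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Arrays;

public class StatsQuerySketch {

    /** Builds the stats query. Column names are assumptions; verify them for your version. */
    static String statsQuery() {
        return "SELECT GUIDE_POST_KEY, GUIDE_POSTS_ROW_COUNT, GUIDE_POSTS_WIDTH"
                + " FROM SYSTEM.STATS"
                + " WHERE PHYSICAL_NAME = ? AND COLUMN_FAMILY = ?";
    }

    /** Prints one line per guidepost row, flagging count cells that are absent. */
    static void printStats(Connection conn, String physicalName, String cf) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(statsQuery())) {
            ps.setString(1, physicalName);
            ps.setString(2, cf);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    byte[] key = rs.getBytes(1);
                    long rows = rs.getLong(2);
                    boolean rowsMissing = rs.wasNull();   // true if the row count cell is absent
                    long bytes = rs.getLong(3);
                    boolean bytesMissing = rs.wasNull();  // true if the width cell is absent
                    System.out.println(Arrays.toString(key)
                            + " rowCount=" + (rowsMissing ? "<missing>" : String.valueOf(rows))
                            + " byteCount=" + (bytesMissing ? "<missing>" : String.valueOf(bytes)));
                }
            }
        }
    }

    public static void main(String[] args) {
        // Only prints the SQL; running printStats needs a live Phoenix JDBC connection.
        System.out.println(statsQuery());
    }
}
```

Running `printStats` for the same table and cf as the HBase scan should show whether the count cells are missing on every guidepost row or only on those after the first one.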
