Yes James, the query uses the guideposts of the cf referenced in the filter.
But I think Maryann expects rowCount and byteCount to be available for each
guidepost key, which we currently don't store. At the moment these metrics
(rowCount/byteCount) are only available at the cf level, right?
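
To make the cf-level point concrete, here is a minimal sketch of the layout as
described in this thread: counts are stored only on the first guidepost row of
each region, so most guidepost rows carry no rowCount/byteCount cell and the
totals are only meaningful per cf. This is an illustrative model, not Phoenix
code; the class, field and method names (GuidePostTotals, StatsRow,
aggregatePerCf) and the sample values are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of stats rows; NOT the Phoenix implementation.
class GuidePostTotals {

    /** One stats row: a guidepost key within a column family. */
    static final class StatsRow {
        final String cf;
        final String guidePostKey;
        final Long rowCount;  // null when the row has no rowCount cell
        final Long byteCount; // null when the row has no byteCount cell

        StatsRow(String cf, String guidePostKey, Long rowCount, Long byteCount) {
            this.cf = cf;
            this.guidePostKey = guidePostKey;
            this.rowCount = rowCount;
            this.byteCount = byteCount;
        }
    }

    /** Sums the counts that are present; returns cf -> {rowCount, byteCount}. */
    static Map<String, long[]> aggregatePerCf(List<StatsRow> rows) {
        Map<String, long[]> totals = new TreeMap<>();
        for (StatsRow row : rows) {
            long[] t = totals.computeIfAbsent(row.cf, k -> new long[2]);
            if (row.rowCount != null) {
                t[0] += row.rowCount;
            }
            if (row.byteCount != null) {
                t[1] += row.byteCount;
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Assumed layout: only the first guidepost of each region has counts.
        List<StatsRow> rows = Arrays.asList(
                new StatsRow("0", "a", 1000L, 50000L), // first guidepost, region 1
                new StatsRow("0", "f", null, null),    // no count cells
                new StatsRow("0", "m", 800L, 40000L),  // first guidepost, region 2
                new StatsRow("0", "t", null, null));   // no count cells
        for (Map.Entry<String, long[]> e : aggregatePerCf(rows).entrySet()) {
            System.out.println(e.getKey() + " rows=" + e.getValue()[0]
                    + " bytes=" + e.getValue()[1]);
        }
    }
}
```

Under this model, a scan restricted to a key range that skips the first
guidepost of a region would see only rows without counts, which would match
what Maryann describes.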

On Sat, Feb 13, 2016 at 11:34 AM, James Taylor <[email protected]>
wrote:

> We should have separate guideposts per cf, as the data distribution may be
> different. We use the default cf if it's being filtered on, but otherwise
> use a different cf.
>
> Is that how it works currently, Ankit?
>
> On Friday, February 12, 2016, Ankit Singhal <[email protected]>
> wrote:
>
> > But I think we only need these metrics at the cf level, as per this
> > comment:
> >
> > https://issues.apache.org/jira/browse/PHOENIX-2143?focusedCommentId=15069779&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15069779
> >
> > That's why we serialize the region's aggregated value, at the cf level,
> > in the first guidepost only.
> >
> > Regards,
> > Ankit Singhal
> >
> > On Sat, Feb 13, 2016 at 9:07 AM, Maryann Xue <[email protected]> wrote:
> >
> > > Thanks a lot for the answer, James! The data size has well exceeded the
> > > guidepost width and the guideposts do exist, but without the
> > > corresponding "rowCount" or "byteCount" cells. I'll try doing a Phoenix
> > > query instead and confirm whether it is a bug.
> > >
> > >
> > > Thanks,
> > > Maryann
> > >
> > > On Fri, Feb 12, 2016 at 10:21 PM, James Taylor <[email protected]> wrote:
> > >
> > > > Hi Maryann,
> > > > If the amount of data in a region is less than the guidepost width,
> > > > then it's possible you'd get no guideposts for that region. Do you
> > > > think that's the case? If not, it sounds like there may be a bug.
> > > >
> > > > Assuming you're querying to get the stats information, I'd recommend
> > > > doing a Phoenix query directly. The code you're emulating uses
> > > > straight HBase APIs because it's called from the server side. It'd be
> > > > a one-liner as a Phoenix query.
> > > >
> > > > Thanks,
> > > > James
> > > >
> > > > On Fri, Feb 12, 2016 at 11:23 AM, Maryann Xue <[email protected]> wrote:
> > > >
> > > > > > Hi,
> > > > > >
> > > > > > This was something I noticed when applying Phoenix table stats to
> > > > > > the Calcite-Phoenix cost calculation. When executing the following
> > > > > > code (a slightly modified version of the existing StatisticsUtil
> > > > > > method) to scan the stats table for a specific column family and a
> > > > > > specific start/stop key range, I got guidepost rows that did not
> > > > > > contain the rowCount or byteCount cell, for all rows in the
> > > > > > specified range. Apparently, I had set the corresponding columns in
> > > > > > the Scan (as shown below). Meanwhile, another range of stats in the
> > > > > > same table gave me the right result. I am wondering whether this is
> > > > > > expected behavior or a bug.
> > > > >
> > > > > >     public static PTableStats readStatistics(HTableInterface statsHTable,
> > > > > >             byte[] tableNameBytes, ImmutableBytesPtr cf, byte[] startKey, byte[] stopKey,
> > > > > >             long clientTimeStamp)
> > > > > >             throws IOException {
> > > > > >         ImmutableBytesWritable ptr = new ImmutableBytesWritable();
> > > > > >         Scan s;
> > > > > >         if (cf == null) {
> > > > > >             s = MetaDataUtil.newTableRowsScan(tableNameBytes,
> > > > > >                     MetaDataProtocol.MIN_TABLE_TIMESTAMP, clientTimeStamp);
> > > > > >         } else {
> > > > > >             s = MetaDataUtil.newTableRowsScan(getAdjustedKey(startKey, tableNameBytes, cf, false),
> > > > > >                     getAdjustedKey(stopKey, tableNameBytes, cf, true),
> > > > > >                     MetaDataProtocol.MIN_TABLE_TIMESTAMP,
> > > > > >                     clientTimeStamp);
> > > > > >         }
> > > > > >         s.addColumn(QueryConstants.DEFAULT_COLUMN_FAMILY_BYTES,
> > > > > >                 PhoenixDatabaseMetaData.GUIDE_POSTS_WIDTH_BYTES);
> > > > > >         s.addColumn(QueryConstants.DEFAULT_COLUMN_FAMILY_BYTES,
> > > > > >                 PhoenixDatabaseMetaData.GUIDE_POSTS_ROW_COUNT_BYTES);
> > > > > >         s.addColumn(QueryConstants.DEFAULT_COLUMN_FAMILY_BYTES,
> > > > > >                 QueryConstants.EMPTY_COLUMN_BYTES);
> > > > > >         ResultScanner scanner = null;
> > > > > >         long timeStamp = MetaDataProtocol.MIN_TABLE_TIMESTAMP;
> > > > > >         TreeMap<byte[], GuidePostsInfoBuilder> guidePostsInfoWriterPerCf =
> > > > > >                 new TreeMap<byte[], GuidePostsInfoBuilder>(Bytes.BYTES_COMPARATOR);
> > > > > >         try {
> > > > > >             scanner = statsHTable.getScanner(s);
> > > > > >             Result result = null;
> > > > > >             while ((result = scanner.next()) != null) {
> > > > > >                 CellScanner cellScanner = result.cellScanner();
> > > > > >                 long rowCount = 0;
> > > > > >                 long byteCount = 0;
> > > > > >                 byte[] cfName = null;
> > > > > >                 int tableNameLength;
> > > > > >                 int cfOffset;
> > > > > >                 int cfLength;
> > > > > >                 boolean valuesSet = false;
> > > > > >                 // Only the two cells with quals GUIDE_POSTS_ROW_COUNT_BYTES and GUIDE_POSTS_BYTES would be retrieved
> > > > > >                 while (cellScanner.advance()) {
> > > > > >                     Cell current = cellScanner.current();
> > > > > >                     if (!valuesSet) {
> > > > > >                         tableNameLength = tableNameBytes.length + 1;
> > > > > >                         cfOffset = current.getRowOffset() + tableNameLength;
> > > > > >                         cfLength = getVarCharLength(current.getRowArray(), cfOffset,
> > > > > >                                 current.getRowLength() - tableNameLength);
> > > > > >                         ptr.set(current.getRowArray(), cfOffset, cfLength);
> > > > > >                         valuesSet = true;
> > > > > >                     }
> > > > > >                     cfName = ByteUtil.copyKeyBytesIfNecessary(ptr);
> > > > > >                     if (Bytes.equals(current.getQualifierArray(), current.getQualifierOffset(),
> > > > > >                             current.getQualifierLength(),
> > > > > >                             PhoenixDatabaseMetaData.GUIDE_POSTS_ROW_COUNT_BYTES, 0,
> > > > > >                             PhoenixDatabaseMetaData.GUIDE_POSTS_ROW_COUNT_BYTES.length)) {
> > > > > >                         rowCount = PLong.INSTANCE.getCodec().decodeLong(current.getValueArray(),
> > > > > >                                 current.getValueOffset(), SortOrder.getDefault());
> > > > > >                     } else if (Bytes.equals(current.getQualifierArray(), current.getQualifierOffset(),
> > > > > >                             current.getQualifierLength(),
> > > > > >                             PhoenixDatabaseMetaData.GUIDE_POSTS_WIDTH_BYTES, 0,
> > > > > >                             PhoenixDatabaseMetaData.GUIDE_POSTS_WIDTH_BYTES.length)) {
> > > > > >                         byteCount = PLong.INSTANCE.getCodec().decodeLong(current.getValueArray(),
> > > > > >                                 current.getValueOffset(), SortOrder.getDefault());
> > > > > >                     }
> > > > > >                     if (current.getTimestamp() > timeStamp) {
> > > > > >                         timeStamp = current.getTimestamp();
> > > > > >                     }
> > > > > >                 }
> > > > > >                 if (cfName != null) {
> > > > > >                     byte[] newGPStartKey = getGuidePostsInfoFromRowKey(
> > > > > >                             tableNameBytes, cfName, result.getRow());
> > > > > >                     GuidePostsInfoBuilder guidePostsInfoWriter =
> > > > > >                             guidePostsInfoWriterPerCf.get(cfName);
> > > > > >                     if (guidePostsInfoWriter == null) {
> > > > > >                         guidePostsInfoWriter = new GuidePostsInfoBuilder();
> > > > > >                         guidePostsInfoWriterPerCf.put(cfName, guidePostsInfoWriter);
> > > > > >                     }
> > > > > >                     guidePostsInfoWriter.addGuidePosts(newGPStartKey, byteCount, rowCount);
> > > > > >                 }
> > > > > >             }
> > > > > >             if (!guidePostsInfoWriterPerCf.isEmpty()) {
> > > > > >                 return new PTableStatsImpl(
> > > > > >                         getGuidePostsPerCf(guidePostsInfoWriterPerCf), timeStamp);
> > > > > >             }
> > > > > >         } finally {
> > > > > >             if (scanner != null) {
> > > > > >                 scanner.close();
> > > > > >             }
> > > > > >         }
> > > > > >         return PTableStats.EMPTY_STATS;
> > > > > >     }
> > > > >
> > > >
> > >
> >
>
