In my opinion, you could try increasing the ehcache size, which is set in ehcache.xml or ehcache-test.xml depending on your "profile" setting. I see the memoryStoreEvictionPolicy there is "LRU", but I am not clear on how it works here; I hope someone can explain it.

You can avoid the automatically added limit by querying through JDBC instead of the web UI. Also, I do not find any cache in the Kylin storage engine, so it always fetches results from HBase; maybe that is a point for optimization.

I checked my result with 'select xxx from xxx order by xxx desc limit 10' and got the correct result. As far as I know, Kylin takes all data out of the storage engine for a join or table scan, so it is strange that "order by ... desc limit" returns a wrong result.
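On the cache size, here is a rough sketch of what the relevant part of ehcache.xml might look like. The cache name, sizes and timeouts are only illustrative assumptions, not values taken from Kylin's shipped config; the <sizeOfPolicy> line just follows what the Ehcache warning you quoted suggests (raise maxDepth, or set maxDepthExceededBehavior to "abort").

<ehcache maxBytesLocalHeap="512M" updateCheck="false">
    <!-- Raise the sizing depth limit, as the Ehcache warning suggests;
         10000 is only an illustrative value, not a recommendation. -->
    <sizeOfPolicy maxDepth="10000" maxDepthExceededBehavior="continue"/>

    <!-- Cache name and settings are placeholders; reuse whatever cache
         definitions already exist in your ehcache.xml / ehcache-test.xml. -->
    <cache name="StorageCache"
           eternal="false"
           timeToIdleSeconds="86400"
           memoryStoreEvictionPolicy="LRU"/>
</ehcache>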
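For the auto-added limit, a minimal JDBC sketch is below. It assumes the Kylin JDBC driver class name and URL format; the host, port, project, credentials and the table/column names are placeholders, and the paging syntax is the one from your mail.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

public class KylinJdbcExample {
    public static void main(String[] args) throws Exception {
        // Kylin JDBC driver; the URL below points at a placeholder
        // host/port/project.
        Class.forName("org.apache.kylin.jdbc.Driver");
        Properties props = new Properties();
        props.setProperty("user", "ADMIN");
        props.setProperty("password", "KYLIN");
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:kylin://localhost:7070/my_project", props);
             Statement stmt = conn.createStatement();
             // Going through JDBC instead of the web UI, so no automatic
             // 'limit 50000' is appended; paging syntax as discussed in
             // this thread.
             ResultSet rs = stmt.executeQuery(
                 "select xxx from xxx order by xxx desc "
                 + "offset 20 rows fetch next 10 rows only")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}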
2015-12-21 19:44 GMT+08:00 Yerui Sun <[email protected]>:
> Hi all,
> We met a scenario with strict performance requirements. After some code
> reading of the query service and engine, here are some questions:
>
> 1. The cache doesn't work.
> Assuming a query with 11000 result rows, the first execution consumes 5+
> seconds, and executing it again immediately still consumes 5 seconds.
> Looking into the query log, both queries have 'Hit Cache: false'. The
> duration threshold has been set to 3000, and the result count threshold
> is 10000. With additional logging, the two queries have the same hash
> code, and both are put into the cache. But there is a warning log as
> follows, which may or may not be the reason:
> [WARN][net.sf.ehcache.pool.sizeof.ObjectGraphWalker.checkMaxDepth(ObjectGraphWalker.java:209)]
> - The configured limit of 1,000 object references was reached while
> attempting to calculate the size of the object graph. Severe performance
> degradation could occur if the sizing operation continues. This can be
> avoided by setting the CacheManger or Cache <sizeOfPolicy> elements
> maxDepthExceededBehavior to "abort" or adding stop points with
> @IgnoreSizeOf annotations. If performance degradation is NOT an issue at
> the configured limit, raise the limit value using the CacheManager or Cache
> <sizeOfPolicy> elements maxDepth attribute. For more information, see the
> Ehcache configuration documentation.
>
> 2. How does offset (query with paging) work?
> The user wants to query Kylin with pages, like MySQL. I've found that
> writing 'limit 100000 offset 20 row fetch next 10 row only' works, meaning
> fetch the next 10 rows starting from the 21st row, limited to at most
> 100000. In fact, the limit is not necessary for the user; the reason to
> add it is to avoid the automatically added 'limit 50000', so is it
> possible to disable that auto-adding?
> The second question is the processing of a query with offset. One
> possible way is to scan the first 30 (or 100000) rows and filter out only
> rows 21~30 in the Calcite engine. Another way is to jump to the 21st row
> directly and scan out only the next 10 rows. I guess we process it the
> first way for now, because there is no way to jump directly to a specific
> row in HBase. If my guess is correct, what about the next query with
> 'limit 100000 offset 30 row fetch next 10 row only'? Do we need to scan
> out all results again? If yes, a cache here should accelerate subsequent
> offset queries.
>
> 3. How does 'order by desc' work?
> 'Order by' is also used, but some results don't seem right with 'order by
> desc'. Consider a query with 10000 result rows: adding 'limit 5000 order
> by desc', the result should be rows 10000~5001, but is rows 5000~1 in
> fact. I guess the reason is that the ordering is processed in the Calcite
> engine after the results are scanned, and the HBase scan is always
> ascending, which is why we get rows 1~5000. To resolve this, is it
> possible to push the order by down into the HBase scan and turn it into a
> reverse scan?
>
> I'm not very familiar with the query service and engine for now, but it
> feels like there is still some room for performance improvement.
> Looking forward to any comments.
>
>
> Best Regards,
> Yerui Sun
> [email protected]
