Re: Direct HBase vs. Phoenix query performance
Thanks James! I've made a JIRA ticket here: https://issues.apache.org/jira/projects/PHOENIX/issues/PHOENIX-4666

This is a priority for us at 23andMe, as it substantially affects some of our queries, so we'd be happy to provide a patch if the Phoenix maintainers are able to provide some guidance on the design. I've put a question in the JIRA ticket as well regarding the approach to take.

On Thu, Mar 15, 2018 at 10:44 AM, James Taylor wrote:
> Hi Marcell,
> Yes, that's correct - the cache we build for the RHS is only kept around
> while the join query is being executed. It'd be interesting to explore
> keeping the cache around longer for cases like yours (and probably not too
> difficult). We'd need to keep a map that maps the RHS query to its hash
> join cache identifier and, if found, skip running the query. Would you
> mind filing a JIRA so we can explore further?
> Thanks,
> James
> [earlier quoted messages trimmed]
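For concreteness, the map James describes, keyed by the RHS subquery text and holding the hash join cache identifier so a repeated subquery can be skipped, might look roughly like the following. This is a toy sketch in Go (Phoenix itself is Java), and every name here (`rhsCache`, `lookupOrBuild`, `cacheEntry`) is hypothetical rather than actual Phoenix internals:

```go
package main

import (
	"fmt"
	"sync"
)

// cacheEntry pairs a hash join cache identifier with the cached RHS rows.
// These names are illustrative only; they sketch the idea in PHOENIX-4666,
// not Phoenix's real (Java) data structures.
type cacheEntry struct {
	cacheID string
	rows    []string // stand-in for serialized RHS tuples
}

// rhsCache maps the text of an RHS subquery to its cache entry, so a join
// can reuse a previously built hash table instead of re-running the subquery.
type rhsCache struct {
	mu      sync.Mutex
	entries map[string]cacheEntry
	nextID  int
	runs    int // counts how often the expensive subquery actually ran
}

func newRHSCache() *rhsCache {
	return &rhsCache{entries: make(map[string]cacheEntry)}
}

// lookupOrBuild returns the cache entry for rhsQuery, running the (slow)
// subquery only on a cache miss.
func (c *rhsCache) lookupOrBuild(rhsQuery string, run func() []string) cacheEntry {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[rhsQuery]; ok {
		return e // hit: skip re-running the subquery
	}
	c.runs++
	c.nextID++
	e := cacheEntry{cacheID: fmt.Sprintf("join-cache-%d", c.nextID), rows: run()}
	c.entries[rhsQuery] = e
	return e
}

func main() {
	cache := newRHSCache()
	slow := func() []string { return []string{"row1", "row2"} } // pretend ~1s scan
	q := "SELECT id FROM big_table WHERE ..."
	a := cache.lookupOrBuild(q, slow)
	b := cache.lookupOrBuild(q, slow) // second query reuses the cache
	fmt.Println(a.cacheID == b.cacheID, cache.runs)
}
```

A real implementation would also need invalidation (the RHS table may change) and memory bounds, which is presumably part of the design discussion on the ticket.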
Re: Direct HBase vs. Phoenix query performance
Hi Marcell,

Yes, that's correct - the cache we build for the RHS is only kept around while the join query is being executed. It'd be interesting to explore keeping the cache around longer for cases like yours (and probably not too difficult). We'd need to keep a map that maps the RHS query to its hash join cache identifier and, if found, skip running the query. Would you mind filing a JIRA so we can explore further?

Thanks,
James

On Wed, Mar 14, 2018 at 3:40 PM, Marcell Ortutay wrote:
> [quoted text trimmed]
Re: Direct HBase vs. Phoenix query performance
A quick update--I did some inspection of the Phoenix codebase, and it looks like my understanding of the coprocessor cache was incorrect. I thought it was meant to be used across queries, e.g. that the RHS of the join would be saved for subsequent queries. In fact this is not the case: the coprocessor cache is meant to live only for the duration of the query. This explains the performance difference--Phoenix is re-running a long subquery for each join, whereas my direct-to-HBase script saves those results across queries.

On Tue, Mar 13, 2018 at 4:56 PM, Marcell Ortutay wrote:
> [quoted text trimmed]
Re: Direct HBase vs. Phoenix query performance
Hi James,

Thanks for the tips. Our row keys are (I think) reasonably optimized. I've made a gist which is an anonymized version of the query, and it indicates which conditions are / are not part of the PK. It is here: https://gist.github.com/ortutay23andme/12f03767db13343ee797c328a4d78c9c

I don't (yet) have an anonymized version of the raw HBase Go script available, but after comparing the performance of the two, I've figured out the root cause. The query does a subquery to produce the LHS of one of the hash joins (L5 of the gist above). This subquery is quite slow (~1sec) to execute and scans a few million rows. It is shared across all queries, so in the raw HBase script I cached and re-used it for all queries. This has a very large performance benefit, in particular under high load.

My understanding of Phoenix is that it is supposed to do the same thing: the hash join code has a mechanism for caching data for hash joins using the HBase coprocessor system. I would expect this cache to kick in and execute the large subquery only once. Looking at the performance of the query (30sec timeouts after ~2qps), this doesn't seem to be happening.

I'm wondering if my understanding of the Phoenix join cache is right. Is it correct to expect that it would cache the results of a subquery used in a join? If so, what are possible reasons why it would *not* do so? Any guidance on metrics / optimizations to look at would be appreciated.

Thanks,
Marcell

On Thu, Mar 8, 2018 at 2:59 PM, James Taylor wrote:
> [quoted text trimmed]
Re: Direct HBase vs. Phoenix query performance
Hi Marcell,

It'd be helpful to see the table DDL and the query too, along with an idea of how many regions might be involved in the query. If a query is a commonly run query, usually you'll design the row key around optimizing it. If you have other, simpler queries that have determined your row key, then another alternative is to add one or more secondary indexes. Another common technique is to denormalize your data in ways that precompute the join, to avoid having to do it at run time.

With joins, make sure to order your tables from post-filtered largest (on the LHS) to smallest (on the RHS). Also, if you're joining on the PK of both tables, you should use the USE_SORT_MERGE_JOIN hint. Another common tuning exercise is determining the best parallelization to use (i.e. guidepost width), or even disabling parallelization for more than an entire region's worth of data.

It'd also be interesting to see the raw HBase code for a query of this complexity.

Thanks,
James

On Thu, Mar 8, 2018 at 1:03 PM, Marcell Ortutay wrote:
> [quoted text trimmed]
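To make the join tips above concrete, here is a small sketch of how the hint changes the query text. `USE_SORT_MERGE_JOIN` is a real Phoenix hint; the table and column names are invented for illustration, and building SQL via string formatting is only for demonstration (a real application would use placeholders through the Phoenix JDBC or query-server driver):

```go
package main

import "fmt"

// buildJoinQuery sketches the tips as query text: the larger (post-filter)
// table goes on the LHS, the smaller on the RHS, and when both sides join
// on their primary keys the USE_SORT_MERGE_JOIN hint tells Phoenix to merge
// the already-sorted streams instead of building a hash table.
// big_table/small_table and the columns are made-up names.
func buildJoinQuery(pkJoin bool) string {
	hint := ""
	if pkJoin {
		hint = "/*+ USE_SORT_MERGE_JOIN */ "
	}
	return fmt.Sprintf(
		"SELECT %sl.id, r.val FROM big_table l JOIN small_table r ON l.id = r.id",
		hint)
}

func main() {
	fmt.Println(buildJoinQuery(true))
}
```

The sort-merge variant avoids shipping the RHS to every region server, which is exactly the cost that makes the per-query hash join cache expensive in this thread.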
Direct HBase vs. Phoenix query performance
Hi,

I am using Phoenix at my company for a large query that is meant to be run in real time as part of our application. The query involves several aggregations, anti-joins, and an inner query. Here is the (anonymized) query plan: https://gist.github.com/ortutay23andme/1da620472cc469ed2d8a6fdd0cc7eb01

The query performance on this is not great: it takes about 5sec to execute the query, and moreover it performs badly under load. If we run ~4qps of this query, Phoenix starts to time out and slow down a lot (queries take >30sec).

For comparison, I wrote a simple Go script that runs a similar query talking directly to HBase. The performance on it is substantially better: it executes in ~1.5sec, and can handle loads of ~50-100qps on the same cluster.

I'm wondering if anyone has ideas on what might be causing this difference in performance? Are there configs / optimizations we can do in Phoenix to bring the performance closer to direct HBase queries?

I can provide context on the table sizes etc. if needed.

Thanks,
Marcell