Thanks James! I've made a JIRA ticket here:
https://issues.apache.org/jira/projects/PHOENIX/issues/PHOENIX-4666
This is a priority for us at 23andMe as it substantially affects some of
our queries, so we'd be happy to provide a patch if Phoenix maintainers are
able to provide some guidance on the
Hi Marcell,
Yes, that's correct - the cache we build for the RHS is only kept around
while the join query is being executed. It'd be interesting to explore
keeping the cache around longer for cases like yours (and probably not too
difficult). We'd need to keep a map that maps the RHS query to its
A quick update--I did some inspection of the Phoenix codebase, and it looks
like my understanding of the coprocessor cache was incorrect. I thought it
was meant to be used across queries, e.g. that the RHS of the join would be
saved for subsequent queries. In fact this is not the case, the
Hi James,
Thanks for the tips. Our row keys are (I think) reasonably optimized. I've
made a gist which is an anonymized version of the query, and it indicates
which conditions are / are not part of the PK. It is here:
https://gist.github.com/ortutay23andme/12f03767db13343ee797c328a4d78c9c
I
Hi Marcell,
It'd be helpful to see the table DDL and the query too along with an idea
of how many regions might be involved in the query. If a query is a
commonly run query, usually you'll design the row key around optimizing it.
If you have other, simpler queries that have determined your row
Hi,
I am using Phoenix at my company for a large query that is meant to be run
in real time as part of our application. The query involves several
aggregations, anti-joins, and an inner query. Here is the (anonymized)
query plan:
Subject: Re: Phoenix query performance
Why can't you reduce your query to
select msbo1.PARENTID
from msbo_phoenix_comp_rowkey msbo1
where msbo1.PARENTTYPE = 'SHIPMENT'
and msbo1.OWNERORGID = 100
and msbo1.MILESTONETYPEID != 19661
and msbo1.PARENTREFERENCETIME between 1479964000 and 1480464000
group by msbo1.PARENTID
order
<maryann@gmail.com>
> Reply-To: "user@phoenix.apache.org" <user@phoenix.apache.org>
> Date: Wednesday, February 22, 2017 at 2:22 PM
> To: "user@phoenix.apache.org" <user@phoenix.apache.org>
> Subject: Re: Phoenix query performance
>
> Hi Pradhe
Hi Pradheep,
Thank you for posting the query and the log file! There are two things
going on on the server side at the same time here. I think it'd be a good
idea to isolate the problem first. So a few questions:
1. When you say data size went from "< 1M" to 30M, did the data from both
LHS and
Hi,
I was benchmarking some Phoenix queries with different compaction-level
tuning.
Something strange happens when there is a huge number of HFiles on disk.
Queries that return no data (result set size 0) execute very quickly (5-10 ms
or so) but just doing a rs.next() on result