Using Hints in Phoenix

2015-03-09 Thread Matthew Johnson
Hi guys, This is more of a general question than a problem – but I’m just wondering if someone can clarify for me what the syntax rules are for hints in Phoenix. Does it matter where in the query they go? Do they always go something like *SELECT insert hint x from y*? Or, if the hint is for a

Re: Using Hints in Phoenix

2015-03-09 Thread Maryann Xue
Hi Matt, So far in Phoenix, hints are supported only when specified immediately after the keywords SELECT, UPSERT and DELETE. The same applies to join queries. It is currently impossible to hint a particular join algorithm for a specific join node in a multi-join query. However, for subqueries, the inner query can have
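
To illustrate the placement rule described above, here is a minimal sketch. The table and column names (my_table, other_table, id, val) are hypothetical, and NO_INDEX is used only as an example hint; whether a given hint has any effect depends on the Phoenix version and the statement.

```sql
-- Hypothetical tables and columns, for illustration only.
-- The hint comment goes directly after the leading keyword.
SELECT /*+ NO_INDEX */ id, val FROM my_table WHERE val = 'x';

UPSERT /*+ NO_INDEX */ INTO other_table (id, val)
SELECT id, val FROM my_table WHERE val = 'x';

DELETE /*+ NO_INDEX */ FROM my_table WHERE val = 'x';
```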

Re: Phoenix table scan performance

2015-03-09 Thread Mujtaba Chohan
During your scan with data on a single region server (RS), do you see the RS blocked on disk I/O due to heavy reads, or CPU at 100% utilization? If that is the case, then having the data distributed across 2 RS would effectively cut the time in half. On Mon, Mar 9, 2015 at 10:01 AM, Yohan Bismuth yohan.bismu...@gmail.com

Re: Phoenix table scan performance

2015-03-09 Thread Yohan Bismuth
I've been facing this issue for a long time, so I'm pretty sure a major compaction has already occurred. Running your query returns 27006. I have run UPDATE STATISTICS on my table; this didn't solve my problem. But if I understand correctly, these guideposts are used to parallelize scan over a region, not

Re: Phoenix table scan performance

2015-03-09 Thread James Taylor
Hi Yohan, Have you done a major compaction on your table and are stats generated for your table? You can run this to confirm: SELECT sum(guide_posts_count) from SYSTEM.STATS where physical_name=your full table name; Phoenix does intra-region parallelization based on these guideposts as described
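
As a sketch of the check described above, with the placeholder filled in: MY_SCHEMA.MY_TABLE is a hypothetical name, and the guide_posts_count column is taken from the message itself, so it may differ in other Phoenix versions. UPDATE STATISTICS is the Phoenix statement that regenerates the stats being counted.

```sql
-- MY_SCHEMA.MY_TABLE is a hypothetical name; substitute your own table.
UPDATE STATISTICS MY_SCHEMA.MY_TABLE;

-- Total number of guideposts recorded for the table.
SELECT sum(guide_posts_count)
FROM SYSTEM.STATS
WHERE physical_name = 'MY_SCHEMA.MY_TABLE';
```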

Re: Phoenix table scan performance

2015-03-09 Thread Yohan Bismuth
Sorry, we're not on AWS but on bare metal. On Mon, Mar 9, 2015 at 6:13 PM, Brady, John john.br...@intel.com wrote: Hi Yohan, Apologies, I don’t have an answer to your question. Could I ask a separate question please? Is your cluster on AWS? I have Apache Phoenix installed on a 5

Re: Phoenix table scan performance

2015-03-09 Thread Yohan Bismuth
From what I've seen, we're mostly idle during scans. On Mon, Mar 9, 2015 at 6:11 PM, Mujtaba Chohan mujt...@apache.org wrote: During your scan with data on a single region server (RS), do you see the RS blocked on disk I/O due to heavy reads, or CPU at 100% utilization? If that is the case, then having

Phoenix table scan performance

2015-03-09 Thread Yohan Bismuth
Hello, we're currently using Phoenix 4.2 with HBase 0.98.6 from CDH5.3.2 on our cluster and we're experiencing some performance issues. What we need to do is a full table scan over 1 billion rows. We've got 50 regionservers and approximately 1000 regions of 1 GB equally distributed on these RS (which

RE: Phoenix table scan performance

2015-03-09 Thread Brady, John
Hi Yohan, Apologies, I don’t have an answer to your question. Could I ask a separate question please? Is your cluster on AWS? I have Apache Phoenix installed on a 5-node cluster with 3 ZooKeeper nodes on AWS. Also using Phoenix 4.2 with HBase 0.98.6 from CDH5.3.2. I put the Phoenix server

Re: Phoenix table scan performance

2015-03-09 Thread Fulin Sun
Hi Yohan, What salt value did you specify for your table? Do you have a monitoring system for HBase where you can check that your table is well load-balanced? One phenomenon we observed for your use case is that if we use DATA_BLOCK_ENCODING as PREFIX_TREE rather than the default FAST_DIFF, the full table scan
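
For readers unfamiliar with the two settings mentioned above, this is roughly how they are declared at table creation in Phoenix. It is a sketch only: the table, its columns, and the bucket count of 16 are hypothetical, and it says nothing about which encoding scans faster.

```sql
-- Hypothetical table; 16 salt buckets is an arbitrary illustrative value.
CREATE TABLE my_table (
    id BIGINT NOT NULL PRIMARY KEY,
    val VARCHAR
) SALT_BUCKETS = 16, DATA_BLOCK_ENCODING = 'FAST_DIFF';
```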