> then you would be better off not using salt buckets all together rather than having 100 parallel scan and block reads in your case. I Didn't understand you correctly. What is difference between salted/not salted table in case of "primary key leading-part select"?
2016-02-02 1:18 GMT+01:00 Mujtaba Chohan <[email protected]>: > If you are filtering on leading part of row key which is highly selective > then you would be better off not using salt buckets all together rather > than having 100 parallel scan and block reads in your case. In our test > with billion+ row table, non-salted table offer much better performance > since it ends up reading fewer blocks from a single region. > > //mujtaba > > On Mon, Feb 1, 2016 at 1:16 PM, Serega Sheypak <[email protected]> > wrote: > >> Hi, here is my table DDL: >> CREATE TABLE IF NOT EXISTS id_ref >> ( >> id1 VARCHAR NOT NULL, >> value1 VARCHAR, >> >> id2 VARCHAR NOT NULL, >> value2 VARCHAR >> CONSTRAINT id_ref_pk PRIMARY KEY (id1, id2) >> )IMMUTABLE_ROWS=true,SALT_BUCKETS=100, VERSIONS=1, TTL=691200 >> >> I'm trying to analyze result of explain: >> >> explain select id1, value1, id2, value2 from id_ref where id1 = 'xxx' >> >> . . . . . . . . . . . . . . . . . . . . . . .> ; >> >> *+------------------------------------------+* >> >> *| ** PLAN ** |* >> >> *+------------------------------------------+* >> >> *| *CLIENT 100-CHUNK PARALLEL 100-WAY RANGE SCAN OVER ID_REF >> [0,'1fd5c44a75549162ca1602dda55f6d129cab61a6']* |* >> >> *| *CLIENT MERGE SORT * |* >> >> *+------------------------------------------+* >> >> >> What happens? Client spawns 100 parallel scans (because of bucketing) and >> waits for 100 responses? >> >> Is it effective? What is the right way to optimize such query pattern: >> "select by first part of primary key"? Reduce the amount of buckets? I get >> exeption a while after restarting app: >> >> >> *Task org.apache.phoenix.job.JobManager$JobFutureTask@60a40644 rejected >> from org.apache.phoenix.job.JobManager$1@58e3fe9aRunning, pool size = 128, >> active threads = 121, queued tasks = 5000, completed tasks = 2629565* >> >> >> >
