The keyspace is just random strings of length > 10. I'll try to reuse the HBase
utility to split my string keyspace the way it does for bytes.
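A minimal sketch of what "splitting the keyspace the way HBase does for bytes" could look like, assuming hex-string keys. This mimics the spirit of HBase's HexStringSplit algorithm (evenly spaced boundaries over a hex range); the function name, `width` parameter, and defaults are hypothetical, not Phoenix or HBase API:

```python
# Hypothetical sketch: compute evenly spaced split points for a
# hex-string keyspace, similar in spirit to HBase's HexStringSplit.
def hex_split_points(num_regions, width=8):
    # Interpret the keyspace as integers in [0, 16**width).
    span = 16 ** width
    step = span // num_regions
    # num_regions - 1 boundaries, each formatted back as a hex string.
    return [format(step * i, "0{}x".format(width)) for i in range(1, num_regions)]

# 4 regions over a 2-hex-digit keyspace -> boundaries at 40, 80, c0.
print(hex_split_points(4, width=2))
```

The resulting strings could then be fed to a pre-split mechanism such as Phoenix's SPLIT ON clause.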
2016-02-02 23:23 GMT+01:00 Mujtaba Chohan :
> If you know your key space then you can use *SPLIT ON* in your table
> create DDL. See
Hm... and what is the right way to pre-split a table then?
2016-02-02 18:30 GMT+01:00 Mujtaba Chohan :
> If your filter matches few rows due to filter on leading part of PK then
> your data might only reside in a single block which leads to less
> overall disk reads for non-salted
If you know your key space then you can use *SPLIT ON* in your table create
DDL. See http://phoenix.apache.org/language
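A minimal sketch of the SPLIT ON clause described above, assuming a VARCHAR primary key; the table name, columns, and split points are made-up placeholders, not taken from the thread:

```sql
-- Hypothetical example: pre-split the table at creation time.
-- Each quoted value is a region boundary on the row key.
CREATE TABLE MY_TABLE (
    id  VARCHAR NOT NULL PRIMARY KEY,
    val VARCHAR
) SPLIT ON ('4', '8', 'c');
```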
On Tue, Feb 2, 2016 at 11:54 AM, Serega Sheypak wrote:
> Hm... and what is the right way to pre-split a table then?
>
> 2016-02-02 18:30 GMT+01:00 Mujtaba
> then you would be better off not using salt buckets altogether, rather
> than having 100 parallel scans and block reads in your case. I
I didn't understand you correctly. What is the difference between a salted and
a non-salted table in the case of a "primary key leading-part select"?
2016-02-02 1:18 GMT+01:00
If your filter matches few rows due to a filter on the leading part of the PK,
then your data might reside in only a single block, which leads to fewer
overall disk reads for the non-salted case vs. the need for multiple block
reads for the salted one.
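The point above can be illustrated with a small sketch. This is not Phoenix's actual hash function, just a deterministic stand-in (CRC32) showing why rows that share a leading key prefix scatter across salt buckets instead of staying in one contiguous range:

```python
import zlib

# Illustrative sketch: a salted table prepends a bucket id derived from
# the whole row key, so rows sharing a leading prefix scatter across
# buckets instead of staying contiguous on disk.
SALT_BUCKETS = 4

def salted_key(row_key):
    # Deterministic stand-in hash; Phoenix uses its own hash of the key bytes.
    bucket = zlib.crc32(row_key.encode()) % SALT_BUCKETS
    return (bucket, row_key)

keys = ["user1-a", "user1-b", "user1-c", "user1-d"]
buckets_hit = sorted({salted_key(k)[0] for k in keys})
# A scan filtered on the leading "user1" prefix must check every bucket
# that may hold a matching row, while an unsalted table keeps these rows
# in one contiguous key range (often a single block).
print(buckets_hit)
```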
On Tuesday, February 2, 2016, Serega Sheypak
Does phoenix have something similar:
hbase org.apache.hadoop.hbase.util.RegionSplitter MY_TABLE HexStringSplit
-c 10 -f c
The command creates a pre-split table with 10 splits, where each split takes a
part of the range from 000 to f?
2016-02-02 10:34 GMT+01:00 Serega Sheypak
If you are filtering on the leading part of the row key and it is highly
selective, then you would be better off not using salt buckets altogether,
rather than having 100 parallel scans and block reads in your case. In our
test with a billion+ row table, a non-salted table offered much better
performance since it