Could you give us an example of what you call complex logic?
Perhaps putting this logic on the client side could make sense? (probably not
what you want, just asking...)
On Sat, May 17, 2014 at 7:26 PM, Jean-Marc Spaggiari
jean-m...@spaggiari.org wrote:
Moving the discussion to the user list.
Hi
I think I should dust off my schema design talk… clearly the talks given by
some of the vendors don’t really explain things …
(Hmmm. Strata London?)
See my reply below…. Note I used SHA-1. MD5 should also give you roughly the
same results.
On May 18, 2014, at 4:28 AM, Software Dev
You may be missing the point. The primary reason for the salt prefix
pattern is to avoid hotspotting when inserting time series data AND at
the same time provide a way to perform range scans.
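The salt-prefix pattern described above can be sketched in a few lines. This is a hedged illustration rather than HBase API code: `N_BUCKETS` and `salted_key` are hypothetical names, and the bucket here is derived from a stable hash of the entity id (not a random seed), so writes spread across regions while point gets stay recomputable:

```python
import hashlib

N_BUCKETS = 8  # assumption: tune to your region/server count

def salted_key(entity_id: str, ts: int) -> bytes:
    """Build a row key with a stable, hash-derived bucket prefix.

    Because the bucket comes from a hash of the entity id rather than
    a random seed, writes for sequential timestamps fan out across
    N_BUCKETS regions (no hotspot), while the bucket for any key can
    be recomputed, so a point get stays a single get.
    """
    bucket = hashlib.sha1(entity_id.encode()).digest()[0] % N_BUCKETS
    return bytes([bucket]) + entity_id.encode() + b"\x00" + ts.to_bytes(8, "big")

# Same entity -> same bucket; keys within a bucket sort by timestamp.
k1 = salted_key("sensor-42", 1_400_000_000)
k2 = salted_key("sensor-42", 1_400_000_060)
```

A range scan over one entity's time range then touches only that entity's bucket; a scan over a time range across all entities fans out to N_BUCKETS parallel scans.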
James, thanks for the input. Not too familiar with Phoenix although it
looks like a great contrib. Unfortunately our main client is ruby
using the thrift api. Using the thrift api also makes parallel scans
tough, if not impossible.
On Sat, May 17, 2014 at 9:31 PM, James Taylor
No, you’re missing the point.
It's not a good idea or design.
Is your data mutable or static?
To your point: every time you want to do a simple get() you have to open up n
get() statements. On your range scans you will have to do n range scans, then
join and sort the result sets. The fact that
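The fan-out cost described above can be made concrete with a toy sketch. This is an in-memory stand-in for a salted table, assuming a salt the client cannot recompute from the key; `table`, `fanout_get`, and `fanout_scan` are hypothetical names:

```python
import heapq

N = 8  # assumption: number of salt buckets

# Toy in-memory stand-in for a salted HBase table: bucket -> rows
# sorted by key. Real buckets live in separate regions.
table = {b: [] for b in range(N)}

def fanout_get(key: bytes):
    """If the salt is random (not recomputable from the key), a point
    read has to check all N buckets: n gets instead of one."""
    for bucket in range(N):
        for k, v in table[bucket]:
            if k == key:
                return v
    return None

def fanout_scan(start: bytes, stop: bytes):
    """A range scan becomes N per-bucket scans whose already-sorted
    results must then be merge-sorted client side."""
    partials = ([(k, v) for k, v in table[b] if start <= k < stop]
                for b in range(N))
    return list(heapq.merge(*partials))
```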
@James,
I know and that’s the biggest problem.
Salts by definition are random seeds.
Now I have two new phrases.
1) We want to remain on a sodium-free diet.
2) Learn to kick the bucket.
When you have data that is coming in on a time series, is the data mutable or
not?
A better
@Mike,
The biggest problem is you're not listening. Please actually read my
response (and you'll understand that what we're calling salting is not a
random seed).
Phoenix already has secondary indexes in two flavors: one optimized for
write-once data and one more general for fully mutable data.
@James…
You’re not listening. There is a special meaning when you say salt.
On May 18, 2014, at 7:16 PM, James Taylor jtay...@salesforce.com wrote:
@Mike,
The biggest problem is you're not listening. Please actually read my
response (and you'll understand that what we're calling salting is
In our measurements, scanning is improved by performing against n
range scans rather than 1 (since you are effectively striping the
reads). This is even better when you don't necessarily care about the
order of every row, but want every row in a given range (then you can
just get whatever row is
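The striping effect described above can be sketched like this. `scan_bucket` and `striped_scan` are hypothetical helpers over the same toy bucket-to-sorted-rows table shape; rows come back in no global order, which is the trade-off being described:

```python
from concurrent.futures import ThreadPoolExecutor

N = 8  # assumption: number of salt buckets

def scan_bucket(table, bucket, start, stop):
    # One stripe: scan only this bucket's slice of the key space.
    return [(k, v) for k, v in table[bucket] if start <= k < stop]

def striped_scan(table, start, stop):
    """Issue the N per-bucket scans in parallel. Rows arrive in no
    global order, which is fine when you want every row in the range
    but do not care about the order of every row."""
    with ThreadPoolExecutor(max_workers=N) as pool:
        futures = [pool.submit(scan_bucket, table, b, start, stop)
                   for b in range(N)]
        return [row for f in futures for row in f.result()]
```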
The top two hits when you Google for HBase salt are
- Sematext blog describing salting as I described it in my email
- Phoenix blog again describing salting in this same way
I really don't understand what you're arguing about - the mechanism that
you're advocating for is exactly the way both
@Software Dev - if you use Phoenix, queries would leverage our Skip Scan
(which supports a superset of the FuzzyRowFilter perf improvements). Take a
look here:
http://phoenix-hbase.blogspot.com/2013/05/demystifying-skip-scan-in-phoenix.html
Assuming a row key made up of a low cardinality first
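The skip-scan idea (seek directly to each value of a low-cardinality leading column combined with the target suffix range, skipping everything in between) can be sketched as follows. This is a simplified illustration only; `skip_scan` is a hypothetical name, not Phoenix's actual implementation:

```python
from bisect import bisect_left

def skip_scan(rows, lead_values, suffix_start, suffix_stop):
    """Seek-and-read per leading value instead of filtering a full scan.

    rows: a sorted list of (lead, suffix) tuples standing in for a
    table sorted by row key. For each low-cardinality leading value we
    seek straight to (lead, suffix_start) and stop at
    (lead, suffix_stop), skipping all keys between the groups.
    """
    out = []
    for lead in lead_values:
        i = bisect_left(rows, (lead, suffix_start))
        while i < len(rows) and rows[i] < (lead, suffix_stop):
            out.append(rows[i])
            i += 1
    return out
```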
@Software Dev - it might be feasible to implement a Thrift client that speaks
Phoenix JDBC. I believe this is similar to what Hive has done.
Thanks,
James
On Sun, May 18, 2014 at 1:19 PM, Mike Axiak m...@axiak.net wrote: