> > > Totally random (even on keys that do not exist).
> It's worth checking whether that matches your real use cases. I expect that reads by row key are mostly on existing rows (as with a traditional DB relationship, or UI- or workflow-driven access), even though I'm sure something totally different is possible.
It's not going to have an impact all the time. But I can easily imagine scenarios with better performance when the row exists than when it does not. For example, you have to read more files to check that the row key is really not there. This will be even more true if you're inserting a lot of data simultaneously (i.e. the files won't have been major compacted yet). On the other hand, bloom filters may be more efficient in this case. But again, I'm not sure they're going to be efficient on random data. It's like compression algorithms: on truly random data, they will all have similarly bad results. That does not mean they are equivalent, or useless.
> I'm working on it! Thanks,
> If you can reproduce a 'bad behavior' or a performance issue, we will try to fix it for sure.
Have a nice day,
N.
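
P.S. To make the "read more files" point concrete, here is a rough Python sketch. It is not HBase code: `StoreFile`, `BloomFilter` and `get` are hypothetical names, and the model is deliberately simplified (one value per row key, and the lookup stops at the first file that contains the key). It only illustrates why a missing key can cost more file reads than an existing one, and how a per-file bloom filter may cut that cost.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: false positives are possible, false negatives are not."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit set stored in a Python int

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class StoreFile:
    """Hypothetical stand-in for one on-disk file holding some row keys."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)


def get(files, key, use_bloom):
    """Return (value, files_read), scanning the files in order."""
    files_read = 0
    for f in files:
        if use_bloom and not f.bloom.might_contain(key):
            continue  # filter says "definitely absent": skip this file
        files_read += 1
        if key in f.rows:
            return f.rows[key], files_read
    return None, files_read


# Five un-compacted files, 50 distinct row keys each.
files = [StoreFile({f"row-{i}-{j}": j for j in range(50)}) for i in range(5)]

print(get(files, "row-0-7", use_bloom=False))   # (7, 1): found in the first file
print(get(files, "no-such-row", use_bloom=False))  # (None, 5): every file is read
print(get(files, "no-such-row", use_bloom=True))   # usually far fewer files read
```

After a major compaction there would be a single file, so the gap between the two cases shrinks. The sketch says nothing about real latencies; it only counts file reads in a toy model.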