+1 to remove the sort requirement. Using a bitmap for range evaluation generally works quite well.
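
To make the bitmap idea concrete, here is a minimal sketch of how a range predicate could be evaluated against an unsorted dictionary. It is an illustration only, not code from ORC: the class name DictionaryRangeFilter, the methods matchingIds and selectRows, and the sample data are all made up for this example. It assumes the dictionary is a plain String[] and each row stores an int index into it. The predicate is tested once per dictionary entry, and rows are then filtered with a bitmap lookup, so no sort order is needed.

```java
import java.util.BitSet;

// Hypothetical sketch: names and data are illustrative, not ORC APIs.
public class DictionaryRangeFilter {

    /** Marks every dictionary id whose value v satisfies lower <= v < upper. */
    public static BitSet matchingIds(String[] dictionary, String lower, String upper) {
        BitSet matches = new BitSet(dictionary.length);
        for (int id = 0; id < dictionary.length; id++) {
            String v = dictionary[id];
            if (v.compareTo(lower) >= 0 && v.compareTo(upper) < 0) {
                matches.set(id);
            }
        }
        return matches;
    }

    /** Returns a bitmap of the row positions whose dictionary id is set in matches. */
    public static BitSet selectRows(int[] rowIds, BitSet matches) {
        BitSet selected = new BitSet(rowIds.length);
        for (int row = 0; row < rowIds.length; row++) {
            if (matches.get(rowIds[row])) {
                selected.set(row);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Dictionary kept in insertion order, i.e. unsorted.
        String[] dictionary = {"zebra", "car", "apple", "dog", "foo", "bar"};
        // Each row holds an index into the dictionary.
        int[] rowIds = {0, 1, 5, 3, 4, 2, 1};

        // myString >= "bar" and myString < "foo"
        BitSet ids = matchingIds(dictionary, "bar", "foo");
        System.out.println(ids);                     // {1, 3, 5}
        System.out.println(selectRows(rowIds, ids)); // {1, 2, 3, 6}
    }
}
```

The cost is one comparison pair per distinct value plus one bit test per row, compared with a binary search plus an index-range check per row for a sorted dictionary.
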

On Mon, Feb 8, 2021, 11:41 AM Owen O'Malley <[email protected]> wrote:

> All,
>    Now that Lei is working on creating a replacement for the red-black
> string dictionaries, it is a good time to discuss whether we should
> continue to sort the dictionaries as they are written.
>
> Reasons to stay sorted:
>
>    1. Searching for values in the dictionaries can use binary search.
>    2. Ranges in the column can be translated into ranges in the indexes
>    (e.g. if you want myString >= "bar" and myString < "foo", that will
>    translate into a range of indexes).
>
> Reasons to stop sorting:
>
>    1. Sorting the dictionary means that we need to hold all of the values
>    in uncompressed memory until the stripe is flushed. (The values are
>    translated to the sorted order.)
>    2. As far as I know, there isn't any code that takes advantage of the
>    sorted dictionaries.
>    3. It makes the writing code much simpler.
>
> Thoughts?
>
> .. Owen
