Hi Todd,

Thank you for the good description :)

Regards,
Saeid

On Mon, 6 Aug 2018, 21:26 Todd Lipcon, <[email protected]> wrote:

> Hi Saeid,
>
> It's not based on the number of distinct values, but rather on the
> combined size of the values. I believe the default is 256 KB, so assuming
> your strings are pretty short, a few thousand can likely be
> dict-encoded. Note that dictionaries are calculated per-rowset (a small
> chunk of data), so even if your overall cardinality is much larger, if you
> have some spatial locality such that rows with nearby primary keys have
> fewer distinct values, then you're likely to get a benefit here.
>
> -Todd
>
> On Sat, Aug 4, 2018 at 8:10 AM, Saeid Sattari <[email protected]>
> wrote:
>
>> Hi Kudu community,
>>
>> Does anybody know the maximum number of distinct values of a String
>> column that Kudu considers in order to set its encoding to Dictionary?
>> Many thanks :)
>>
>> br,
>>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
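
To make the size-based rule above concrete, here is a rough Python sketch that estimates, per chunk of rows, the combined size of the distinct string values and compares it against a budget. It is only an illustration, not Kudu's implementation: the 256 KB figure comes from Todd's "I believe the default is 256 KB" remark, the function names are made up for this example, and modelling a rowset as a fixed number of consecutive rows sorted by primary key is a simplifying assumption.

```python
# Assumed budget, taken from the "I believe the default is 256 KB" remark above.
ASSUMED_DICT_BUDGET_BYTES = 256 * 1024


def dict_stats_per_rowset(values, rows_per_rowset):
    """Return (distinct_count, combined_bytes) for each chunk of rows.

    `values` holds the string column in primary-key order. Cutting it into
    fixed-size chunks is a stand-in for rowsets (hypothetical simplification).
    """
    stats = []
    for start in range(0, len(values), rows_per_rowset):
        distinct = set(values[start:start + rows_per_rowset])
        combined = sum(len(v.encode("utf-8")) for v in distinct)
        stats.append((len(distinct), combined))
    return stats


def fits_budget(combined_bytes, budget=ASSUMED_DICT_BUDGET_BYTES):
    """True if the distinct values of one chunk fit in the assumed budget."""
    return combined_bytes <= budget


if __name__ == "__main__":
    # High overall cardinality, but nearby keys share few distinct values.
    column = [f"city-{i // 1000:05d}" for i in range(100_000)]
    for distinct_count, combined in dict_stats_per_rowset(column, 10_000):
        print(distinct_count, combined, fits_budget(combined))
```

The point of the per-chunk loop is the locality effect described above: what matters in this model is the combined size of the distinct values within each chunk, not the cardinality of the whole column.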
