Yes, I’m pretty sure you understood correctly (I wrote most of this, but it’s been a long time so I cannot remember much for certain).
It should be implemented like the Strings generator. It looks like both HexStrings and HexBytes are incorrect, and have been for a long time.

> On 12 Dec 2018, at 22:27, Saleil Bhat (BLOOMBERG/ 731 LEX)
> <sbha...@bloomberg.net> wrote:
>
> Hi,
>
> I have a question about the behavior of the HexStrings value generator in the
> cassandra-stress tool, particularly concerning its population/identity
> distribution.
>
> Per the discussion in JIRA item CASSANDRA-6146 concerning the stress YAML
> profile, the population field in a columnspec “represents the total unique
> population distribution of that column across rows.”
>
> I interpreted this to mean that if I specify some distribution 'F' for a
> column, then the probability of occurrence for each potential value of that
> column is given by 'F'.
>
> So, for example, if I provided the following columnspec for a text column:
>     name: fake_column
>     size: fixed(32)
>     population: gaussian(1..100)
> and then generated a large amount of data according to this specification,
> I would expect there to be 100 distinct values for ‘fake_column’, and that a
> histogram of the frequency of occurrence of each value would be roughly
> bell-shaped.
>
> However, the current implementation of the HexStrings generator deviates from
> this expectation. In the current implementation, each CHARACTER in the string
> is drawn from F, rather than the string as a whole. Therefore, if you plot
> the histogram of frequency of occurrence for each character, you get a
> bell-shaped curve, but the distribution of the occurrences of whole strings
> (the actual columns) is something else.
>
> My question is, is this the desired behavior for string columns? Was my
> expectation/interpretation incorrect? If so, can anyone give some insight as
> to why strings are designed to behave this way and what the use case is for
> this behavior?
>
> Thanks,
> -Saleil
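
For illustration, here is a minimal Python sketch of the difference described in the quoted message. It is not the cassandra-stress code. The function names, the clamped Gaussian helper, and the idea of using the drawn value as a seed for the string contents are all assumptions made for the example. `per_character` draws every character from F, which is the behaviour reported for HexStrings. `per_value` draws from F once per value and derives the whole string from that single draw, which is what the population field is expected to mean.

```python
import random
from collections import Counter

HEX = "0123456789abcdef"

def gaussian_index(rng, lo, hi):
    """Hypothetical stand-in for gaussian(lo..hi): a clamped, bell-shaped integer draw."""
    mean = (lo + hi) / 2
    stdev = (hi - lo) / 6
    while True:
        v = round(rng.gauss(mean, stdev))
        if lo <= v <= hi:
            return v

def per_character(rng, size, lo, hi):
    """Reported behaviour: each character is a separate draw from F."""
    return "".join(HEX[gaussian_index(rng, lo, hi) % 16] for _ in range(size))

def per_value(rng, size, lo, hi):
    """Expected behaviour: one draw from F selects the value's identity,
    and that identity deterministically seeds the string contents."""
    seed = gaussian_index(rng, lo, hi)
    content = random.Random(seed)
    return "".join(content.choice(HEX) for _ in range(size))

rng = random.Random(42)
n = 5000

# Per-character draws: the number of distinct strings is not bounded by the population.
distinct_per_char = len({per_character(rng, 32, 1, 100) for _ in range(n)})

# Per-value draws: at most 100 distinct strings, with frequencies that follow F.
counts = Counter(per_value(rng, 32, 1, 100) for _ in range(n))

print("distinct strings, per-character draw:", distinct_per_char)
print("distinct strings, per-value draw:   ", len(counts))
```

With the per-value approach the number of distinct strings can never exceed the population size (100 here), and a histogram of `counts` follows the shape of F. With the per-character approach only the character frequencies follow F.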