"Chris Browne" <[EMAIL PROTECTED]> writes: > - Any columns marked "unique" could keep to having somewhat smaller > numbers of bins in the histogram because we know that uniqueness > will keep values dispersed at least somewhat.
I think you're on the wrong track. What matters is not whether the values are
dispersed but how evenly they are dispersed. If the values are spread evenly
across the range from the low bound to the high bound, then a single bucket is
enough: it records the low bound, the high bound, and how many values there
are. If they're unevenly distributed, then we need enough buckets to
distinguish the dense areas from the sparse areas.

Perhaps something like this: start with 1 bucket, split it into 2, and check
whether the two halves have similar distributions. If they do, stop. If not,
repeat for each bucket.

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!
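
To make the bucket-splitting idea above concrete, here is a rough sketch in
Python. It is only an illustration, not how any existing planner builds its
histograms. The function name split_buckets and the parameters tolerance,
min_count and max_depth are all made up for this example, and "similar
distributions" is assumed to mean "the two halves hold about the same number
of values".

```python
def split_buckets(values, lo, hi, tolerance=0.2, min_count=8,
                  max_depth=8, depth=0):
    """Return a list of (low, high, count) buckets covering [lo, hi).

    Hypothetical sketch: a bucket is kept whole when its two halves hold
    a similar number of values, otherwise each half is split again.
    """
    n = len(values)
    mid = (lo + hi) / 2
    left = [v for v in values if v < mid]
    right = [v for v in values if v >= mid]

    # Assumed similarity test: the halves differ by at most `tolerance`
    # as a fraction of the values in this bucket.
    halves_similar = abs(len(left) - len(right)) <= tolerance * n

    # Also stop on tiny buckets or at the depth limit, so that heavily
    # repeated values cannot cause endless splitting.
    if halves_similar or n < min_count or depth >= max_depth:
        return [(lo, hi, n)]

    return (split_buckets(left, lo, mid, tolerance, min_count,
                          max_depth, depth + 1)
            + split_buckets(right, mid, hi, tolerance, min_count,
                            max_depth, depth + 1))


# Evenly spread values: one bucket is enough.
even = list(range(100))
even_buckets = split_buckets(even, 0, 100)

# 90 values packed below 10 plus 10 values spread up to 91:
# the dense region gets split further than the sparse one.
skewed = [i / 9 for i in range(90)] + [10 + 9 * i for i in range(10)]
skewed_buckets = split_buckets(skewed, 0, 100)

for low, high, count in skewed_buckets:
    print(f"[{low:g}, {high:g}): {count}")
```

Note that comparing only the counts of the two halves is a crude test. A
distribution that is dense in the middle and sparse at both ends would look
even at the first split, so a real implementation would presumably want a
stronger comparison.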