On Tue, Nov 16, 2010 at 6:07 AM, Alexander Korotkov <aekorot...@gmail.com> wrote: > On Tue, Nov 16, 2010 at 3:07 AM, Robert Haas <robertmh...@gmail.com> wrote: >> But on a broader note, I'm not very certain the sorting algorithm is >> sensible. For example, suppose you have 10 segments that are exactly >> '0' and 20 segments that are exactly '1'. Maybe I'm misunderstanding, >> but it seems like this will result in a 15/15 split when we almost >> certainly want a 10/20 split. I think there will be problems in more >> complex cases as well. The documentation says about the less-than and >> greater-than operators that "These operators do not make a lot of >> sense for any practical purpose but sorting." > > In order to illustrate a real problem we should think about > gist behavior with great enough amount of data. For example, I tried to > extrapolate this case to 100000 of segs where 40% are (0,1) segs and 60% are > (1,2) segs. And this case doesn't seem a problem for me.
Well, the problem with just comparing on < is that it takes very little account of the upper bounds. I think the cases where a simple split would hurt you the most are those where examining the upper bound is necessary to to get a good split. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers