On Sat, Aug 20, 2016 at 4:58 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Jeff Janes <jeff.ja...@gmail.com> writes:
>> On Thu, Aug 18, 2016 at 2:25 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> It does know it, what it doesn't know is how many duplicates there are.
>
>> Does it know whether the count comes from a parsed query-string list/array,
>> rather than being an estimate from something else? If it came from a join,
>> I can see why it would be dangerous to assume they are mostly distinct.
>> But if someone throws 6000 things into a query string and only 200 distinct
>> values among them, they have no one to blame but themselves when it makes
>> bad choices off of that.
>
> I am not exactly sold on this assumption that applications have
> de-duplicated the contents of a VALUES or IN list. They haven't been
> asked to do that in the past, so why do you think they are doing it?

It's hard to know, but my intuition is that most people would
deduplicate. I mean, nobody is going to want their query generator to
send X IN (1, 1, <repeat a zillion more times>) to the server if it
could have just sent X IN (1).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company