On Fri, Apr 26, 2013 at 11:43:18AM -0700, Joshua D. Drake wrote: > > On 04/26/2013 11:17 AM, Galen Charlton wrote: > > > >Hi, > > > >On Fri, Apr 26, 2013 at 10:58 AM, Martha Driscoll <[email protected]> > >wrote: > >>In our database of 3,161,189 copies, cutting out the deleted copies > >>eliminates only 904 copies. Why does testing for deleted improve searching > >>so significantly? > > > >The quick answer is that the asset.copy index on the barcode column > >also includes a clause to limit the index to copies that aren't > >deleted: > > Actually, > > The quick (and correct) answer is that asset.copy doesn't have an > index that is on barcode. In fact the current index on > asset.copy(barcode) is actually wrong.
Before declaring something "wrong" so confidently, you might want to double-check your assumptions. > "copy_barcode_key" UNIQUE, btree (barcode) WHERE deleted = false OR > deleted IS FALSE > > Although a valid index, the two conditions within the index are > duplicative. It would be better to drop that index and just use: And deliberately so, because at the time the index was adjusted to be duplicative, PostgreSQL (8.4 IIRC?) was not smart enough to know that "deleted = FALSE" was the same as "deleted IS FALSE"; and as we were often writing ad hoc queries that used one or the other style, we were missing the index and going into table scan mode. Which sucks for performance. Also, the reason that we want the "deleted = FALSE" condition is because the vast majority of valid queries will be restricted to copies that have not been deleted. And there are corner cases where two identical barcodes can exist, except one was deleted, and the other was created since then.
