I checked out master and put together a test case using a small percentage of production data for a known problem we have with Pg 9.2 and text search scans.
A small percentage in this case means 10 million randomly selected records; the full data set has a few billion records.

Tests ran successfully on master and I recorded the timings.

I then applied the patch included here to master, along with gin-packed-postinglists-14.patch, and ran:

make clean; ./configure; make; make install
make check (All 141 tests passed.)
initdb, import dump

The GIN index fails to build with a segfault.

DETAIL:  Failed process was running: CREATE INDEX textsearch_gin_idx ON kp USING gin (to_tsvector('simple'::regconfig, string)) WHERE (score1 IS NOT NULL);

#0  XLogCheckBuffer (holdsExclusiveLock=1 '\001', lsn=lsn@entry=0x7fffcf341920, bkpb=bkpb@entry=0x7fffcf341960,
    rdata=0x468f11 <ginFindLeafPage+529>, rdata=0x468f11 <ginFindLeafPage+529>) at xlog.c:2339
#1  0x00000000004b9ddd in XLogInsert (rmid=rmid@entry=13 '\r', info=info@entry=16 '\020',
    rdata=rdata@entry=0x7fffcf341bf0) at xlog.c:936
#2  0x0000000000468a9e in createPostingTree (index=0x7fa4e8d31030, items=items@entry=0xfb55680,
    nitems=nitems@entry=762, buildStats=buildStats@entry=0x7fffcf343dd0) at gindatapage.c:1324
#3  0x00000000004630c0 in buildFreshLeafTuple (buildStats=0x7fffcf343dd0, nitem=762, items=0xfb55680,
    category=<optimized out>, key=34078256, attnum=<optimized out>, ginstate=0x7fffcf341df0) at gininsert.c:281
#4  ginEntryInsert (ginstate=ginstate@entry=0x7fffcf341df0, attnum=<optimized out>, key=34078256,
    category=<optimized out>, items=0xfb55680, nitem=762, buildStats=buildStats@entry=0x7fffcf343dd0)
    at gininsert.c:351
#5  0x00000000004635b0 in ginbuild (fcinfo=<optimized out>) at gininsert.c:531
#6  0x0000000000718637 in OidFunctionCall3Coll (functionId=functionId@entry=2738, collation=collation@entry=0,
    arg1=arg1@entry=140346257507968, arg2=arg2@entry=140346257510448, arg3=arg3@entry=32826432) at fmgr.c:1649
#7  0x00000000004ce1da in index_build (heapRelation=heapRelation@entry=0x7fa4e8d30680,
    indexRelation=indexRelation@entry=0x7fa4e8d31030, indexInfo=indexInfo@entry=0x1f4e440,
    isprimary=isprimary@entry=0 '\000', isreindex=isreindex@entry=0 '\000') at index.c:1963
#8  0x00000000004ceeaa in index_create (heapRelation=heapRelation@entry=0x7fa4e8d30680,
    indexRelationName=indexRelationName@entry=0x1f4e660 "textsearch_gin_knn_idx", indexRelationId=16395,
    indexRelationId@entry=0, relFileNode=<optimized out>, indexInfo=indexInfo@entry=0x1f4e440,
    indexColNames=indexColNames@entry=0x1f4f728, accessMethodObjectId=accessMethodObjectId@entry=2742,
    tableSpaceId=tableSpaceId@entry=0, collationObjectId=collationObjectId@entry=0x1f4fcc8,
    classObjectId=classObjectId@entry=0x1f4fce0, coloptions=coloptions@entry=0x1f4fcf8,
    reloptions=reloptions@entry=0, isprimary=0 '\000', isconstraint=0 '\000', deferrable=0 '\000',
    initdeferred=0 '\000', allow_system_table_mods=0 '\000', skip_build=0 '\000', concurrent=0 '\000',
    is_internal=0 '\000') at index.c:1082
#9  0x0000000000546a78 in DefineIndex (stmt=<optimized out>, indexRelationId=indexRelationId@entry=0,
    is_alter_table=is_alter_table@entry=0 '\000', check_rights=check_rights@entry=1 '\001',
    skip_build=skip_build@entry=0 '\000', quiet=quiet@entry=0 '\000') at indexcmds.c:594
#10 0x000000000065147e in ProcessUtilitySlow (parsetree=parsetree@entry=0x1f7fb68,
    queryString=0x1f7eb10 "CREATE INDEX textsearch_gin_idx ON kp USING gin (to_tsvector('simple'::regconfig, string)) WHERE (score1 IS NOT NULL);",
    context=<optimized out>, params=params@entry=0x0, completionTag=completionTag@entry=0x7fffcf344c10 "",
    dest=<optimized out>) at utility.c:1163
#11 0x000000000065079e in standard_ProcessUtility (parsetree=0x1f7fb68, queryString=<optimized out>,
    context=<optimized out>, params=0x0, dest=<optimized out>, completionTag=0x7fffcf344c10 "")
    at utility.c:873
#12 0x000000000064de61 in PortalRunUtility (portal=portal@entry=0x1f4c350,
    utilityStmt=utilityStmt@entry=0x1f7fb68, isTopLevel=isTopLevel@entry=1 '\001',
    dest=dest@entry=0x1f7ff08, completionTag=completionTag@entry=0x7fffcf344c10 "") at pquery.c:1187
#13 0x000000000064e9e5 in PortalRunMulti (portal=portal@entry=0x1f4c350, isTopLevel=isTopLevel@entry=1 '\001',
    dest=dest@entry=0x1f7ff08, altdest=altdest@entry=0x1f7ff08,
    completionTag=completionTag@entry=0x7fffcf344c10 "") at pquery.c:1318
#14 0x000000000064f459 in PortalRun (portal=portal@entry=0x1f4c350, count=count@entry=9223372036854775807,
    isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x1f7ff08, altdest=altdest@entry=0x1f7ff08,
    completionTag=completionTag@entry=0x7fffcf344c10 "") at pquery.c:816
#15 0x000000000064d2d5 in exec_simple_query (
    query_string=0x1f7eb10 "CREATE INDEX textsearch_gin_idx ON kp USING gin (to_tsvector('simple'::regconfig, string)) WHERE (score1 IS NOT NULL);")
    at postgres.c:1048
#16 PostgresMain (argc=<optimized out>, argv=argv@entry=0x1f2ad40, dbname=0x1f2abf8 "rbt",
    username=<optimized out>) at postgres.c:3992
#17 0x000000000045b1b4 in BackendRun (port=0x1f47280) at postmaster.c:4085
#18 BackendStartup (port=0x1f47280) at postmaster.c:3774
#19 ServerLoop () at postmaster.c:1585
#20 0x000000000060d031 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x1f28b20) at postmaster.c:1240
#21 0x000000000045bb25 in main (argc=3, argv=0x1f28b20) at main.c:196

On Thu, Nov 14, 2013 at 12:26 PM, Alexander Korotkov <aekorot...@gmail.com> wrote:

> On Sun, Jun 30, 2013 at 3:00 PM, Heikki Linnakangas <
> hlinnakan...@vmware.com> wrote:
>
>> On 28.06.2013 22:31, Alexander Korotkov wrote:
>>
>>> Now, I got the point of three state consistent: we can keep only one
>>> consistent in opclasses that support new interface. exact true and exact
>>> false values will be passed in the case of current patch consistent;
>>> exact false and unknown will be passed in the case of current patch
>>> preConsistent. That's reasonable.
>>>
>>
>> I'm going to mark this as "returned with feedback". For the next version,
>> I'd like to see the API changed per above. Also, I'd like us to do
>> something about the tidbitmap overhead, as a separate patch before this, so
>> that we can assess the actual benefit of this patch. And a new test case
>> that demonstrates the I/O benefits.
>
>
> Revised version of patch is attached.
> Changes are so:
> 1) Patch rebased against packed posting lists, not depends on additional
> information now.
> 2) New API with tri-state logic is introduced.
>
> ------
> With best regards,
> Alexander Korotkov.
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>
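
For anyone reading along, here is how I understand the tri-state idea from the quoted discussion, as a minimal standalone sketch. This is not code from the patch: the type TriValue, its constants, and the functions tri_and and tri_or are hypothetical names I made up for illustration. The point is that a single function can serve both roles. Fed only exact true/false inputs it behaves like the current consistent check, and fed false/unknown inputs it behaves like preConsistent, where a false result means the item can be ruled out early.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-valued result; these names are illustrative only
 * and are not taken from the patch. */
typedef enum
{
    TRI_FALSE = 0,    /* key definitely does not match */
    TRI_TRUE = 1,     /* key definitely matches */
    TRI_UNKNOWN = 2   /* not checked yet, may or may not match */
} TriValue;

/* Three-valued AND over per-key check results.
 * Any FALSE input decides the result; otherwise any UNKNOWN input
 * keeps the result UNKNOWN; otherwise the result is TRUE. */
TriValue
tri_and(const TriValue *check, size_t nkeys)
{
    TriValue result = TRI_TRUE;
    size_t   i;

    assert(check != NULL || nkeys == 0);

    for (i = 0; i < nkeys; i++)
    {
        if (check[i] == TRI_FALSE)
            return TRI_FALSE;
        if (check[i] == TRI_UNKNOWN)
            result = TRI_UNKNOWN;
    }
    return result;
}

/* Three-valued OR over per-key check results.
 * Any TRUE input decides the result; otherwise any UNKNOWN input
 * keeps the result UNKNOWN; otherwise the result is FALSE. */
TriValue
tri_or(const TriValue *check, size_t nkeys)
{
    TriValue result = TRI_FALSE;
    size_t   i;

    assert(check != NULL || nkeys == 0);

    for (i = 0; i < nkeys; i++)
    {
        if (check[i] == TRI_TRUE)
            return TRI_TRUE;
        if (check[i] == TRI_UNKNOWN)
            result = TRI_UNKNOWN;
    }
    return result;
}
```

With an AND-style query, one known-false key is enough to reject an item even if every other key is still unknown, which is the early-exit case I would expect the new API to be able to exploit.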