Hi all,

OK, I really should have titled the post "CheckIndex limit with large tvd files?"
I started a new CheckIndex run about 1:00 pm on Tuesday and it seems to be stuck again while testing term vectors. I gave CheckIndex 32GB of memory, turned on GC logging, and echoed STDERR and STDOUT to a file. Maybe it is not actually stuck and it just takes several days to test a term vector file that is 343GB.

Yes, I know I said we had term vectors turned off. I forgot that we were using a slightly modified version of the schema we use when we index individual books on a page level. We are using the fast-vector highlighter, so we have term vectors turned on:

  <fieldType name="FullText" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="false" stored="true" termVectors="true" termPositions="true" termOffsets="true" omitNorms="false">

I've appended a listing of the top memory users from pmap below. It looks like the *tvd file is using about 300GB of virtual memory, followed by the *doc, *fdt and *pos files.

Since we have never run CheckIndex on large indexes with term vectors before, we have no idea how long we should expect it to take. Our normal page-level book indexes generally hold about 1,000 books (about 300,000 documents/pages) and are 10-15GB total, with the *tvf files totaling about 700MB and the *tvd files totaling a few hundred KB.
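For anyone who wants to rank a listing like the one appended below, here is a minimal Python sketch that sums mapped size per file extension. It assumes each line has the shape "address size-in-K permissions path", as in the listing in this post. Real pmap output differs between versions and flags, so the parsing would likely need adjusting. The function names parse_pmap and size_by_extension are made up for illustration, and the three sample lines are copied from the listing.

```python
import os
from collections import defaultdict

# Three lines copied from the pmap listing in this post. The column layout
# (address, size in KB with thousands separators, permissions, path) is an
# assumption based on that listing, not on any particular pmap version.
SAMPLE = """\
00002baaf526c000 300,897,888K r--s- /htsolr/lss-dev/solrs/4.2/3/core/data/index/_bch.tvd
00002b3b4bf1b000 155,250,472K r--s- /htsolr/lss-dev/solrs/4.2/3/core/data/index/_bch_Lucene41_0.doc
00002b3aee31b000 1,315,184K rw--- [ anon ]
"""


def parse_pmap(text):
    """Return a list of (size_kb, label) tuples, one per mapping line."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 4 or not parts[1].endswith("K"):
            continue  # skip totals, headers, and blank lines
        size_kb = int(parts[1][:-1].replace(",", ""))
        rows.append((size_kb, parts[3].strip()))
    return rows


def size_by_extension(rows):
    """Sum mapped KB per file extension, largest first.

    Anything that is not an absolute path (for example anonymous memory)
    is grouped under '[anon]'.
    """
    totals = defaultdict(int)
    for size_kb, label in rows:
        if label.startswith("/"):
            ext = os.path.splitext(label)[1] or "(none)"
        else:
            ext = "[anon]"
        totals[ext] += size_kb
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for ext, kb in size_by_extension(parse_pmap(SAMPLE)):
        print(f"{ext:8s} {kb / 1024 / 1024:8.1f} GiB")
```

Grouping by extension rather than by full path is just a convenience: it makes it easy to see at a glance whether term vector data (.tvd), stored fields (.fdt), or postings (.doc, .pos) dominate the mapped memory.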

Tom

----
The top 10 mappings in pmap are:

total 804,745,732K
00002baaf526c000 300,897,888K r--s- /htsolr/lss-dev/solrs/4.2/3/core/data/index/_bch.tvd
00002b3b4bf1b000 155,250,472K r--s- /htsolr/lss-dev/solrs/4.2/3/core/data/index/_bch_Lucene41_0.doc
00002b88aa709000 143,788,268K r--s- /htsolr/lss-dev/solrs/4.2/3/core/data/index/_bch.fdt
00002b604fae5000 139,820,064K r--s- /htsolr/lss-dev/solrs/4.2/3/core/data/index/_bch_Lucene41_0.pos
00002b32e6c10000  33,554,476K rw--- [ anon ]
00002b81a59ed000  29,196,076K r--s- /htsolr/lss-dev/solrs/4.2/3/core/data/index/_bch_Lucene41_0.tim
00002b3aee31b000   1,315,184K rw--- [ anon ]
00002b889b9b8000     243,012K r--s- /htsolr/lss-dev/solrs/4.2/3/core/data/index/_bch.nvd
00002b3ae6c39000     109,276K rw--- [ anon ]
00002bf2b2804000      99,272K r--s- /htsolr/lss-dev/solrs/4.2/3/core/data/index/_bch.tvx

> On Tue, Jul 30, 2013 at 1:06 PM, Tom Burton-West <tburt...@umich.edu> wrote:
> > Thanks Mike, Robert and Adrien,
> >
> > Unfortunately, I killed the processes, so it's too late to get a stack
> > trace. One thing that was suspicious was that top was reporting memory use
> > as 20GB res even though I invoked the JVM with java -Xmx10g -Xms10g.
> >
> > I'm going to double the memory, turn on GC logging, and remember to echo
> > STDERR to a log and run it again on one of the indexes.
> > I'll report back as soon as something interesting shows up. (Probably
> > tomorrow sometime.)
> >
> > Tom
> >
> > On Tue, Jul 30, 2013 at 11:22 AM, Michael McCandless <luc...@mikemccandless.com> wrote:
> >
> >> Can you get a stack trace so we can see where the thread is stuck?
> >>
> >> Mike McCandless
> >>
> >> http://blog.mikemccandless.com