I think it's the latter. I don't think the term index interval is exposed anywhere. If you expose it through the config and provide a patch, I think we can add it to the core quickly.
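For a rough sense of what a larger interval buys you: Lucene keeps roughly every termIndexInterval-th term in the tii file, so the number of entries held in memory shrinks in proportion to the interval. The sketch below is plain arithmetic and makes no Lucene calls. The class name, the method name, and the unique-term count are made up for illustration, and the real tii size also depends on term length and per-entry overhead, so treat this as an estimate only.

```java
public class TermIndexEstimate {

    // Approximate number of terms kept in the in-memory term index (tii),
    // assuming one entry per termIndexInterval terms (rounded up).
    static long indexedTerms(long uniqueTerms, int termIndexInterval) {
        if (termIndexInterval < 1) {
            throw new IllegalArgumentException("termIndexInterval must be >= 1");
        }
        return (uniqueTerms + termIndexInterval - 1) / termIndexInterval;
    }

    public static void main(String[] args) {
        // Hypothetical unique-term count for a large shingled index.
        long uniqueTerms = 2_000_000_000L;

        // 128 is the default mentioned in the original question.
        for (int interval : new int[] {128, 256, 512, 1024}) {
            System.out.println("interval " + interval + ": about "
                    + indexedTerms(uniqueTerms, interval) + " tii entries");
        }
    }
}
```

Each doubling of the interval roughly halves the number of tii entries, at the cost of scanning more of the tis file per term lookup. Until there is a config option, the value would have to be set in code via IndexWriter.setTermIndexInterval, as you suggest.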
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: "Burton-West, Tom" <tburt...@umich.edu>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Cc: "Farber, Phillip" <pfar...@umich.edu>; "Dueber, William" <dueb...@umich.edu>
> Sent: Wednesday, March 25, 2009 1:50:17 PM
> Subject: Can TermIndexInterval be set in Solr?
>
> Hello all,
>
> We are experimenting with the ShingleFilter with a very large document set
> (1 million full-text books). Because the ShingleFilter indexes every word
> pair as a token, the number of unique terms increases tremendously. In our
> experiments so far the tii and tis files are getting very large and the tii
> file will eventually be too large to fit into memory. If we set the
> TermIndexInterval to a larger number than the default 128, the tii file size
> should go down. Is it possible to set this somehow through Solr
> configuration or do we need to modify the code somewhere and call
> IndexWriter.setTermIndexInterval?
>
> Tom
>
> Tom Burton-West
> Digital Library Production Services
> University of Michigan Library