Hi Erick, I have run my indexing code a few times, and this is the behaviour I have observed:
When an indexing process starts, a new tlog file is created and all the new documents are written there, even if one or more tlog files already exist. When the indexing process ends and issues a hard commit, the older tlog files are removed but the newest one remains. Since every run of my indexing process loads a few million documents, at the end of the run the latest tlog file is still there with all of those documents in it, which is why my tlog files are so big. So the question is: why does the latest tlog file persist even though the code has issued a hard commit? Once a hard commit completes successfully, why keep the latest tlog file at all?

On Mon, May 25, 2015 at 7:24 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> OK, assuming you're not doing any commits at all until the very end,
> then the tlog contains all the docs for the _entire_ run. The article
> really doesn't care whether the commits come from the solrconfig.xml
> or SolrJ client or curl. The tlog simply is not truncated until a hard
> commit happens, no matter where it comes from.
>
> So here's what I'd do:
> 1> set autoCommit in your solrconfig.xml with openSearcher=false for
> every minute. Then the problem will probably go away.
> or
> 2> periodically issue a hard commit (openSearcher=false) from the client.
>
> Of the two, I _strongly_ recommend <1> as it's more graceful when
> there are multiple clients.
>
> Best,
> Erick
>
> On Mon, May 25, 2015 at 4:45 AM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
> > Hi Erick, thanks for your support.
> >
> > Reading the post, I realised that my scenario does not use the autoCommit
> > configuration; right now we don't have autoCommit in our solrconfig.xml.
> >
> > We need docs to be searchable only after the indexing process, and all the
> > documents are committed only at the end of the indexing process.
> >
> > Now I don't understand why the tlog files are so big, given that we do a
> > hard commit at the end of every indexing run.
> >
> > On Sun, May 24, 2015 at 5:49 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> >> Vincenzo:
> >>
> >> Here's perhaps more than you want to know about hard commits, soft
> >> commits and transaction logs:
> >>
> >> http://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >>
> >> Best,
> >> Erick
> >>
> >> On Sun, May 24, 2015 at 12:04 AM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
> >> > Thanks Shawn for your prompt support.
> >> >
> >> > Best regards,
> >> > Vincenzo
> >> >
> >> > On Sun, May 24, 2015 at 6:45 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> >> >
> >> >> On 5/23/2015 9:41 PM, Vincenzo D'Amore wrote:
> >> >> > Thanks Shawn,
> >> >> >
> >> >> > maybe this is a silly question, but I looked around and didn't find an
> >> >> > answer...
> >> >> > Well, could I update solrconfig.xml for the collection while the
> >> >> > instances are running, or should I restart the cluster/reload the cores?
> >> >>
> >> >> You can upload a new config to zookeeper with the zkcli program while
> >> >> Solr is running, and nothing will change, at least not immediately. The
> >> >> new config will take effect when you reload the collection or restart
> >> >> all the Solr instances.
> >> >>
> >> >> Thanks,
> >> >> Shawn
> >> >>
> >> >
> >
> > --
> > Vincenzo D'Amore
> > email: v.dam...@gmail.com
> > skype: free.dev
> > mobile: +39 349 8513251
>

--
Vincenzo D'Amore
email: v.dam...@gmail.com
skype: free.dev
mobile: +39 349 8513251
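
P.S. Just to be sure I apply your suggestion <1> correctly: my understanding is that the block below goes inside <updateHandler> in solrconfig.xml (the one-minute value is only an example for my load, please correct me if I read your advice wrong):

    <autoCommit>
      <maxTime>60000</maxTime>            <!-- hard commit roughly once a minute -->
      <openSearcher>false</openSearcher>  <!-- docs stay invisible to searchers -->
    </autoCommit>

And if I ever need option <2> instead, I would trigger the hard commit from the client with a plain update request, something like this (host and collection name are placeholders for my setup):

    curl "http://localhost:8983/solr/mycollection/update?commit=true&openSearcher=false"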
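
P.P.S. For pushing the edited solrconfig.xml I plan to follow Shawn's steps: upload the config to ZooKeeper with zkcli and then reload the collection so it takes effect. Roughly like this (script path, zkhost and names are just guesses from my install, not yet verified):

    ./server/scripts/cloud-scripts/zkcli.sh -zkhost zk1:2181,zk2:2181 \
        -cmd upconfig -confdir /path/to/mycollection/conf -confname mycollectionconf

    curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection"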