Thanks, Erick, for your willingness and patience.

If I understood correctly, with autoCommit and openSearcher=true, all new documents automatically become available for search at the first commit (soft or hard). With openSearcher=false, however, the commit will flush recent index changes to stable storage, but does not cause a new searcher to be opened to make those changes visible <https://cwiki.apache.org/confluence/display/solr/UpdateHandlers+in+SolrConfig#UpdateHandlersinSolrConfig-autoCommit>.
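
For reference, this is a minimal sketch of the setting I am asking about. The values are the ones Erick suggests in his message below (15 seconds, 100000 docs); the placement inside the updateHandler element is my assumption based on the stock solrconfig.xml, and I have not tried this exact fragment on our cluster:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- Hard commit after 15 seconds (value is in milliseconds)... -->
    <maxTime>15000</maxTime>
    <!-- ...or after 100000 pending documents, whichever comes first. -->
    <maxDocs>100000</maxDocs>
    <!-- Flush to disk and roll the tlog, but do not open a new searcher,
         so the newly indexed documents are not yet visible to queries. -->
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>
```
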
So it is not clear to me what this stable storage is, where it is, and when the new documents will become visible. Only at the very end of the indexing process, when my code commits? Does that mean, so to speak, that with openSearcher=false we have implicit commits done by SolrCloud <autoCommit> that are not visible to the world, and explicit commits done by clients that are visible to the world?

On Tue, May 26, 2015 at 2:55 AM, Erick Erickson <erickerick...@gmail.com> wrote:

> The design is that the latest successfully flushed tlog file is kept
> for "peer sync" in SolrCloud mode. When a replica comes up, there's a
> chance that it's not very many docs behind. So, if possible, some of
> the docs are taken from the leader's tlog and replayed to the follower
> that's just been started. If the follower is too far out of sync, a
> full old-style replication is done. So there will always be a tlog
> file (and occasionally more than one if they're very small) kept
> around, even on successful commit. It doesn't matter if you have
> leaders and replicas or not, that's still the process that's followed.
>
> Please re-read the link I sent earlier. There's absolutely no reason
> your tlog files have to be so big! Really, set your autoCommit to, say,
> 15 seconds and 100000 docs and set openSearcher=false in your
> solrconfig.xml file and your tlog file that's kept around will be much
> smaller and they'll be available for "peer sync".
>
> And if you really don't care about tlogs at all, just take this bit
> out of your solrconfig.xml
>
> <updateLog>
>   <str name="dir">${solr.ulog.dir:}</str>
>   <int name="numVersionBuckets">${solr.ulog.numVersionBuckets:256}</int>
> </updateLog>
>
> Best,
> Erick
>
> On Mon, May 25, 2015 at 4:40 PM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
> > Hi Erick,
> >
> > I have run my indexing code a few times, and this is the behaviour I
> > observed:
> >
> > When an indexing process starts, even if one or more tlog files exist, a
> > new tlog file is created and all the new documents are stored there.
> > When the indexing process ends and does a hard commit, the older tlog
> > files are removed but the new one (the latest) remains.
> >
> > As far as I can see, since my indexing process loads a few million
> > documents every time, at the end of the process the latest tlog file
> > persists with all these documents in it.
> > That is why I have such big tlog files. Now the question is, why does the
> > latest tlog file persist even though the code has done a hard commit?
> > When a hard commit completes successfully, why should we keep the latest
> > tlog file?
> >
> > On Mon, May 25, 2015 at 7:24 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> >> OK, assuming you're not doing any commits at all until the very end,
> >> then the tlog contains all the docs for the _entire_ run. The article
> >> really doesn't care whether the commits come from the solrconfig.xml
> >> or SolrJ client or curl. The tlog simply is not truncated until a hard
> >> commit happens, no matter where it comes from.
> >>
> >> So here's what I'd do:
> >> 1> set autoCommit in your solrconfig.xml with openSearcher=false for
> >> every minute. Then the problem will probably go away.
> >> or
> >> 2> periodically issue a hard commit (openSearcher=false) from the client.
> >>
> >> Of the two, I _strongly_ recommend <1> as it's more graceful when
> >> there are multiple clients.
> >>
> >> Best,
> >> Erick
> >>
> >> On Mon, May 25, 2015 at 4:45 AM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
> >> > Hi Erick, thanks for your support.
> >> >
> >> > Reading the post, I realised that my setup does not use the autoCommit
> >> > configuration: at the moment we don't have autoCommit in our
> >> > solrconfig.xml.
> >> >
> >> > We need the docs to be searchable only after the indexing process, so
> >> > all the documents are committed only at the end of the indexing process.
> >> >
> >> > Now I don't understand why the tlog files are so big, given that we do
> >> > a hard commit at the end of every indexing run.
> >> >
> >> > On Sun, May 24, 2015 at 5:49 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> >> >
> >> >> Vincenzo:
> >> >>
> >> >> Here's perhaps more than you want to know about hard commits, soft
> >> >> commits and transaction logs:
> >> >>
> >> >> http://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >> >>
> >> >> Best,
> >> >> Erick
> >> >>
> >> >> On Sun, May 24, 2015 at 12:04 AM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
> >> >> > Thanks Shawn for your prompt support.
> >> >> >
> >> >> > Best regards,
> >> >> > Vincenzo
> >> >> >
> >> >> > On Sun, May 24, 2015 at 6:45 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> >> >> >
> >> >> >> On 5/23/2015 9:41 PM, Vincenzo D'Amore wrote:
> >> >> >> > Thanks Shawn,
> >> >> >> >
> >> >> >> > maybe this is a silly question, but I looked around and didn't
> >> >> >> > find an answer...
> >> >> >> > Well, could I update solrconfig.xml for the collection while the
> >> >> >> > instances are running or should I restart the cluster/reload the
> >> >> >> > cores?
> >> >> >>
> >> >> >> You can upload a new config to zookeeper with the zkcli program while
> >> >> >> Solr is running, and nothing will change, at least not immediately.
> >> The > >> >> >> new config will take effect when you reload the collection or > restart > >> >> >> all the Solr instances. > >> >> >> > >> >> >> Thanks, > >> >> >> Shawn > >> >> >> > >> >> >> > >> >> > >> > > >> > > >> > > >> > -- > >> > Vincenzo D'Amore > >> > email: v.dam...@gmail.com > >> > skype: free.dev > >> > mobile: +39 349 8513251 > >> > > > > > > > > -- > > Vincenzo D'Amore > > email: v.dam...@gmail.com > > skype: free.dev > > mobile: +39 349 8513251 > -- Vincenzo D'Amore email: v.dam...@gmail.com skype: free.dev mobile: +39 349 8513251