Thanks Erick for your willingness and patience,

If I understand correctly, when autoCommit is configured with
openSearcher=true, then at the first commit (soft or hard) all new documents
automatically become available for search.
But when openSearcher=false, the commit "will flush recent index changes to
stable storage, but does not cause a new searcher to be opened to make
those changes visible"
<https://cwiki.apache.org/confluence/display/solr/UpdateHandlers+in+SolrConfig#UpdateHandlersinSolrConfig-autoCommit>.

So it is not clear to me what this stable storage is, where it is, and when
the new documents will become visible.
Only at the very end of the indexing process, when my code commits?

Does that mean that, with openSearcher=false, we have an implicit commit
done by SolrCloud <autoCommit> that is not visible to the world, and an
explicit commit done by clients that is visible to the world?
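
For reference, this is roughly how I read the autoCommit setup suggested
below (15 seconds, 100000 docs, openSearcher=false). It is only a sketch of
what I would put inside the <updateHandler> section of solrconfig.xml; the
values are the ones from Erick's mail, not something I have tested:

```xml
<autoCommit>
  <!-- hard commit at most 15 seconds after the first uncommitted update (milliseconds) -->
  <maxTime>15000</maxTime>
  <!-- or as soon as 100000 documents are pending, whichever comes first -->
  <maxDocs>100000</maxDocs>
  <!-- flush to disk and roll over the tlog, but do not open a new searcher -->
  <openSearcher>false</openSearcher>
</autoCommit>
```

With this in place, my assumption is that visibility would still depend on
a commit that opens a searcher, for example the explicit commit my indexing
code issues at the end.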




On Tue, May 26, 2015 at 2:55 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> The design is that the latest successfully flushed tlog file is kept
> for "peer sync" in SolrCloud mode. When a replica comes up, there's a
> chance that it's not very many docs behind. So, if possible, some of
> the docs are taken from the leader's tlog and replayed to the follower
> that's just been started. If the follower is too far out of sync, a
> full old-style replication is done. So there will always be a tlog
> file (and occasionally more than one if they're very small) kept
> around, even on successful commit. It doesn't matter if you have
> leaders and replicas or not, that's still the process that's followed.
>
> Please re-read the link I sent earlier. There's absolutely no reason
> your tlog files have to be so big! Really, set your autoCommit to, say,
> 15 seconds and 100000 docs and set openSearcher=false in your
> solrconfig.xml file and your tlog file that's kept around will be much
> smaller and they'll be available for "peer sync".
>
> And if you really don't care about tlogs at all, just take this bit
> out of your solrconfig.xml
>
>     <updateLog>
>       <str name="dir">${solr.ulog.dir:}</str>
>       <int name="numVersionBuckets">${solr.ulog.numVersionBuckets:256}</int>
>     </updateLog>
>
>
>
> Best,
> Erick
>
> On Mon, May 25, 2015 at 4:40 PM, Vincenzo D'Amore <v.dam...@gmail.com>
> wrote:
> > Hi Erick,
> >
> > I have run my indexing code a few times, and this is the behaviour I
> > observed:
> >
> > When an indexing process starts, even if one or more tlog files exist, a
> > new tlog file is created and all the new documents are stored there.
> > When the indexing process ends and does a hard commit, the older tlog files
> > are removed but the new one (the latest) remains.
> >
> > As far as I can see, since my indexing process loads a few million
> > documents every time, at the end of the process the latest tlog file
> > persists with all these documents in it.
> > So I have these big tlog files. Now the question is: why does the latest
> > tlog file persist even though the code has done a hard commit?
> > When a hard commit completes successfully, why should we keep the latest
> > tlog file?
> >
> >
> >
> > On Mon, May 25, 2015 at 7:24 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> OK, assuming you're not doing any commits at all until the very end,
> >> then the tlog contains all the docs for the _entire_ run. The article
> >> really doesn't care whether the commits come from the solrconfig.xml
> >> or SolrJ client or curl. The tlog simply is not truncated until a hard
> >> commit happens, no matter where it comes from.
> >>
> >> So here's what I'd do:
> >> 1> set autoCommit in your solrconfig.xml with openSearcher=false for
> >> every minute. Then the problem will probably go away.
> >> or
> >> 2> periodically issue a hard commit (openSearcher=false) from the
> >> client.
> >>
> >> Of the two, I _strongly_ recommend <1> as it's more graceful when
> >> there are multiple clients.
> >>
> >> Best,
> >> Erick
> >>
> >> On Mon, May 25, 2015 at 4:45 AM, Vincenzo D'Amore <v.dam...@gmail.com>
> >> wrote:
> >> > Hi Erick, thanks for your support.
> >> >
> >> > Reading the post I realised that the autoCommit configuration does not
> >> > apply to my scenario: we currently have no autoCommit in our
> >> > solrconfig.xml.
> >> >
> >> > We need docs to be searchable only after the indexing process, and all
> >> > the documents are committed only at the end of the indexing process.
> >> >
> >> > Now I don't understand why the tlog files are so big, given that we do
> >> > a hard commit at the end of every indexing run.
> >> >
> >> >
> >> >
> >> >
> >> > On Sun, May 24, 2015 at 5:49 PM, Erick Erickson <erickerick...@gmail.com>
> >> > wrote:
> >> >
> >> >> Vincenzo:
> >> >>
> >> >> Here's perhaps more than you want to know about hard commits, soft
> >> >> commits and transaction logs:
> >> >>
> >> >>
> >> >> http://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >> >>
> >> >> Best,
> >> >> Erick
> >> >>
> >> >> On Sun, May 24, 2015 at 12:04 AM, Vincenzo D'Amore <v.dam...@gmail.com>
> >> >> wrote:
> >> >> > Thanks Shawn for your prompt support.
> >> >> >
> >> >> > Best regards,
> >> >> > Vincenzo
> >> >> >
> >> >> > On Sun, May 24, 2015 at 6:45 AM, Shawn Heisey <apa...@elyograg.org>
> >> >> > wrote:
> >> >> >
> >> >> >> On 5/23/2015 9:41 PM, Vincenzo D'Amore wrote:
> >> >> >> > Thanks Shawn,
> >> >> >> >
> >> >> >> > maybe this is a silly question, but I looked around and didn't
> >> >> >> > find an answer...
> >> >> >> > Well, could I update solrconfig.xml for the collection while the
> >> >> >> > instances are running, or should I restart the cluster/reload the
> >> >> >> > cores?
> >> >> >>
> >> >> >> You can upload a new config to zookeeper with the zkcli program
> >> >> >> while Solr is running, and nothing will change, at least not
> >> >> >> immediately. The new config will take effect when you reload the
> >> >> >> collection or restart all the Solr instances.
> >> >> >>
> >> >> >> Thanks,
> >> >> >> Shawn
> >> >> >>
> >> >> >>
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Vincenzo D'Amore
> >> > email: v.dam...@gmail.com
> >> > skype: free.dev
> >> > mobile: +39 349 8513251
> >>
> >
> >
> >
> > --
> > Vincenzo D'Amore
> > email: v.dam...@gmail.com
> > skype: free.dev
> > mobile: +39 349 8513251
>



-- 
Vincenzo D'Amore
email: v.dam...@gmail.com
skype: free.dev
mobile: +39 349 8513251
