subject:"Re\: soft commit"

Re: Soft commit and new replica types

2018-12-14 Thread Tomás Fernández Löbbe

properties for TLOGs vs PULL nodes.
> > > > >
> > > > > > There is no commit on TLOG/PULL  follower replicas, only on the
> > > leader.
> > > > > > Followers fetch the segments and **reload the core** every 150
> > > seconds
> > > > >
> > > > > Edward, "reload" shouldn't really happen in regular TLOG/PULL
> > fetches.
> > > Are
> > > > > you seeing reloads?
> > > > >
> > > > > On Mon, Dec 10, 2018 at 4:41 PM Erick Erickson <
> > > erickerick...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > bq. but not every poll attempt they fetch new segment from the
> > leader
> > > > > >
> > > > > > > Ah, right. Ignore my comment. Commit will only occur on the followers
> > > > > > > when there are new segments to pull down, so you're right, roughly
> > > > > > > every second poll would find things to bring down, commit, and open a
> > > > > > > new searcher.
> > > > > > On Sun, Dec 9, 2018 at 4:14 PM Edward Ribeiro
> > > > 
> > > > > > wrote:
> > > > > > >
> > > > > > > Hi Vadim,
> > > > > > >
> > > > > > > There is no commit on TLOG/PULL  follower replicas, only on the
> > > leader.
> > > > > > > Followers fetch the segments and **reload the core** every 150
> > > seconds
> > > > > > (if
> > > > > > > there were new segments, I suppose). Yeah, followers don't pay
> > the
> > > CPU
> > > > > > > price of indexing, but there are still cache invalidation,
> > > autowarming,
> > > > > > > > etc, in addition to network and IO demand. Is that right, Erick?
> > > > > > >
> > > > > > > Besides that, Erick is pointing out that under a heavy indexing
> > > > > workload
> > > > > > > you could either have:
> > > > > > >
> > > > > > > 1. Very large transaction logs;
> > > > > > >
> > > > > > > 2. Very large numbers of segments. If that is the case, you
> could
> > > have
> > > > > > the
> > > > > > > following scenario numerous times:
> > > > > > >2.1. follower replica downloads segment A and B from leader;
> > > > > > >2.2 leader merges segments A + B into C;
> > > > > > >2.3. follower replicas discard A and B and download C on
> next
> > > poll;
> > > > > > >
> > > > > > > Under the second condition followers needlessly downloaded
> > segments
> > > > > that
> > > > > > > would eventually be merged.
> > > > > > >
> > > > > > > IMO, you should carefully evaluate if the use of TLOG/PULL is
> > > really
> > > > > > > recommended for your cluster setup, plus indexing and querying
> > > > > workload.
> > > > > > > You can very much stay with a NRT setup if it suits you better.
> > The
> > > > > > videos
> > > > > > > below provide a nice set of hints for when to choose between
> NRT
> > or
> > > > > some
> > > > > > > combination of TLOG and PULL.
> > > > > > >
> > > > > > > https://youtu.be/XIb8X3MwVKc
> > > > > > >
> > > > > > > https://youtu.be/dkWy2ykzAv0
> > > > > > >
> > > > > > > https://youtu.be/XqfTjd9KDWU
> > > > > > >
> > > > > > > Regards,
> > > > > > > Edward
> > > > > > >
> > > > > > > > On Sun, Dec 9, 2018, 16:56, <vadim.iva...@spb.ntk-intourist.ru> wrote:
> > > > > > >
> > > > > > > >
> > > > > > > >  If hard commit max time is 300 sec then commit happens every
> > 300
> > > > sec
> > > > > > on
> > > > > > > > tlog leader. And new segments pop up on the leader every 300
> > sec,
> > > > > > during
> > > > > > > > indexing. Polling interval on other replicas 150 sec, but not
> > > every

Re: Soft commit and new replica types

2018-12-14 Thread Edward Ribeiro

etch new segment from the
> leader
> > > > >
> > > > > > Ah, right. Ignore my comment. Commit will only occur on the followers
> > > > > > when there are new segments to pull down, so you're right, roughly
> > > > > > every second poll would find things to bring down, commit, and open a
> > > > > > new searcher.
> > > > > On Sun, Dec 9, 2018 at 4:14 PM Edward Ribeiro
> > > 
> > > > > wrote:
> > > > > >
> > > > > > Hi Vadim,
> > > > > >
> > > > > > There is no commit on TLOG/PULL  follower replicas, only on the
> > leader.
> > > > > > Followers fetch the segments and **reload the core** every 150
> > seconds
> > > > > (if
> > > > > > there were new segments, I suppose). Yeah, followers don't pay
> the
> > CPU
> > > > > > price of indexing, but there are still cache invalidation,
> > autowarming,
> > > > > > > etc, in addition to network and IO demand. Is that right, Erick?
> > > > > >
> > > > > > Besides that, Erick is pointing out that under a heavy indexing
> > > > workload
> > > > > > you could either have:
> > > > > >
> > > > > > 1. Very large transaction logs;
> > > > > >
> > > > > > 2. Very large numbers of segments. If that is the case, you could
> > have
> > > > > the
> > > > > > following scenario numerous times:
> > > > > >2.1. follower replica downloads segment A and B from leader;
> > > > > >2.2 leader merges segments A + B into C;
> > > > > >2.3. follower replicas discard A and B and download C on next
> > poll;
> > > > > >
> > > > > > Under the second condition followers needlessly downloaded
> segments
> > > > that
> > > > > > would eventually be merged.
> > > > > >
> > > > > > IMO, you should carefully evaluate if the use of TLOG/PULL is
> > really
> > > > > > recommended for your cluster setup, plus indexing and querying
> > > > workload.
> > > > > > You can very much stay with a NRT setup if it suits you better.
> The
> > > > > videos
> > > > > > below provide a nice set of hints for when to choose between NRT
> or
> > > > some
> > > > > > combination of TLOG and PULL.
> > > > > >
> > > > > > https://youtu.be/XIb8X3MwVKc
> > > > > >
> > > > > > https://youtu.be/dkWy2ykzAv0
> > > > > >
> > > > > > https://youtu.be/XqfTjd9KDWU
> > > > > >
> > > > > > Regards,
> > > > > > Edward
> > > > > >
> > > > > > > On Sun, Dec 9, 2018, 16:56, <vadim.iva...@spb.ntk-intourist.ru> wrote:
> > > > > >
> > > > > > >
> > > > > > >  If hard commit max time is 300 sec then commit happens every
> 300
> > > sec
> > > > > on
> > > > > > > tlog leader. And new segments pop up on the leader every 300
> sec,
> > > > > during
> > > > > > > indexing. Polling interval on other replicas 150 sec, but not
> > every
> > > > > poll
> > > > > > > attempt they fetch new segment from the leader, afaiu. Erick,
> do
> > you
> > > > > mean
> > > > > > > that on all other  tlog replicas(not leaders) commit occurs
> every
> > > > poll?
> > > > > > > > Sunday, December 9, 2018, 19:21 +03:00, from Erick Erickson
> > > > > > > > erickerick...@gmail.com:
> > > > > > >
> > > > > > > >Not quite, 60. The polling interval is half the commit
> > > > > interval
> > > > > > > >
> > > > > > > >This has always bothered me a little bit, I wonder at the
> > utility
> > > > of a
> > > > > > > >config param. We already have old-style replication with a
> > > > > > > >configurable polling interval. Under very heavy indexing
> loads,
> > it
> > > > > > > >seems to me that either the tlogs will grow quite large or
> > we'll be
> > >

Re: Soft commit and new replica types

2018-12-13 Thread Tomás Fernández Löbbe

). Yeah, followers don't pay the
> CPU
> > > > > price of indexing, but there are still cache invalidation,
> autowarming,
> > > > > > etc, in addition to network and IO demand. Is that right, Erick?
> > > > >
> > > > > Besides that, Erick is pointing out that under a heavy indexing
> > > workload
> > > > > you could either have:
> > > > >
> > > > > 1. Very large transaction logs;
> > > > >
> > > > > 2. Very large numbers of segments. If that is the case, you could
> have
> > > > the
> > > > > following scenario numerous times:
> > > > >2.1. follower replica downloads segment A and B from leader;
> > > > >2.2 leader merges segments A + B into C;
> > > > >2.3. follower replicas discard A and B and download C on next
> poll;
> > > > >
> > > > > Under the second condition followers needlessly downloaded segments
> > > that
> > > > > would eventually be merged.
> > > > >
> > > > > IMO, you should carefully evaluate if the use of TLOG/PULL is
> really
> > > > > recommended for your cluster setup, plus indexing and querying
> > > workload.
> > > > > You can very much stay with a NRT setup if it suits you better. The
> > > > videos
> > > > > below provide a nice set of hints for when to choose between NRT or
> > > some
> > > > > combination of TLOG and PULL.
> > > > >
> > > > > https://youtu.be/XIb8X3MwVKc
> > > > >
> > > > > https://youtu.be/dkWy2ykzAv0
> > > > >
> > > > > https://youtu.be/XqfTjd9KDWU
> > > > >
> > > > > Regards,
> > > > > Edward
> > > > >
> > > > > > On Sun, Dec 9, 2018, 16:56, <vadim.iva...@spb.ntk-intourist.ru> wrote:
> > > > >
> > > > > >
> > > > > >  If hard commit max time is 300 sec then commit happens every 300
> > sec
> > > > on
> > > > > > tlog leader. And new segments pop up on the leader every 300 sec,
> > > > during
> > > > > > indexing. Polling interval on other replicas 150 sec, but not
> every
> > > > poll
> > > > > > attempt they fetch new segment from the leader, afaiu. Erick, do
> you
> > > > mean
> > > > > > that on all other  tlog replicas(not leaders) commit occurs every
> > > poll?
> > > > > > > Sunday, December 9, 2018, 19:21 +03:00, from Erick Erickson
> > > > > > > erickerick...@gmail.com:
> > > > > >
> > > > > > >Not quite, 60. The polling interval is half the commit
> > > > interval
> > > > > > >
> > > > > > >This has always bothered me a little bit, I wonder at the
> utility
> > > of a
> > > > > > >config param. We already have old-style replication with a
> > > > > > >configurable polling interval. Under very heavy indexing loads,
> it
> > > > > > >seems to me that either the tlogs will grow quite large or
> we'll be
> > > > > > >pulling a lot of unnecessary segments across the wire, segments
> > > > > > >that'll soon be merged away and the merged segment re-pulled.
> > > > > > >
> > > > > > >Apparently, though, nobody's seen this "in the wild", so it's
> > > > > > >theoretical at this point.
> > > > > > >On Sun, Dec 9, 2018 at 1:48 AM Vadim Ivanov
> > > > > > < vadim.iva...@spb.ntk-intourist.ru> wrote:
> > > > > > >
> > > > > > > Thanks, Edward, for clues.
> > > > > > > What bothers me is newSearcher start, warming, cache clear...
> all
> > > > that
> > > > > > CPU consuming stuff in my heavy-indexing scenario.
> > > > > > > With NRT I had autoSoftCommit:  30 .
> > > > > > > So I had new Searcher no more than  every 5 min on every
> replica.
> > > > > > > To have more or less  the same effect with TLOG - PULL
> collection,
> > > > > > > I suppose, I have to have  :  30
> > > > > > > (yes, I understand that newSearchers start asynchronously on
> leader
> > > > and
> > > > > > replicas)
> > > > > > >

RE: Soft commit and new replica types

2018-12-13 Thread Vadim Ivanov

bq. after getting new segments from the leader the follower replica will
still apply the hard/soft commit?
As described in one of the videos below, a follower TLOG replica looks for the max
docid in the newly received segments
and purges older records from its transaction log. Then it starts a new searcher (this
may be called a soft commit).
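
For illustration, the steps described above can be sketched in Python. This is only a toy model of the behavior as stated in this message, not Solr's actual update log code; the class `FollowerReplica`, its fields, and the method `on_segments_fetched` are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FollowerReplica:
    """Hypothetical stand-in for a TLOG follower; not a Solr class."""
    tlog: list = field(default_factory=list)   # (doc_id, payload) records
    searcher_generation: int = 0

    def on_segments_fetched(self, segment_doc_ids):
        """Apply the steps described above after a successful fetch."""
        if not segment_doc_ids:
            return  # nothing new: keep the tlog and the current searcher
        max_doc_id = max(segment_doc_ids)
        # Records already covered by the fetched segments are no longer needed.
        self.tlog = [(d, p) for d, p in self.tlog if d > max_doc_id]
        # Opening a new searcher makes the fetched segments visible.
        self.searcher_generation += 1


if __name__ == "__main__":
    replica = FollowerReplica(tlog=[(1, "a"), (2, "b"), (3, "c"), (4, "d")])
    replica.on_segments_fetched([1, 2, 3])
    print(replica.tlog, replica.searcher_generation)
```

Note that in this model a fetch that brings nothing new leaves both the transaction log and the searcher untouched, which matches the point made elsewhere in the thread that a new searcher is opened only when there are new segments.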
-- 
Vadim



> -Original Message-
> From: Edward Ribeiro [mailto:edward.ribe...@gmail.com]
> Sent: Thursday, December 13, 2018 8:27 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Soft commit and new replica types
> 
> Hi Tomás,
> 
> No, I am not seeing reloads. I am trying to understand the interactions
> between hard commit, soft commit, transaction log update with a TLOG
> cluster for both leader and follower replicas. For example, after getting
> new segments from the leader the follower replica will still apply the
> hard/soft commit?
> 
> PS: congratulations on the Berlin Buzzwords' talk. :)
> 
> Thanks!
> 
> On Mon, Dec 10, 2018 at 9:24 PM Tomás Fernández Löbbe
> 
> wrote:
> 
> > I think this is a good point. The tricky part is that if TLOG replicas
> > don't replicate often, their transaction logs will get too big too, so you
> > want the replication interval of TLOG replicas to be tied to the
> > auto(hard)Commit interval (by default at least). If you are using them for
> > search, you may also not want to open a searcher for each fetch... for PULL
> > replicas, maybe the best way is to use the autoSoftCommit interval to
> > define the polling interval. That said, I'm not sure using different
> > configurations is a good idea, some people may be mixing TLOG and PULL
> and
> > querying them both alike.
> >
> > In the meantime, if you have different hosts for TLOG and PULL replicas,
> > one workaround you can have is to define the autoCommit time with a
> system
> > property, and use different properties for TLOGs vs PULL nodes.
> >
> > > There is no commit on TLOG/PULL  follower replicas, only on the leader.
> > > Followers fetch the segments and **reload the core** every 150 seconds
> >
> > Edward, "reload" shouldn't really happen in regular TLOG/PULL fetches. Are
> > you seeing reloads?
> >
> > On Mon, Dec 10, 2018 at 4:41 PM Erick Erickson 
> > wrote:
> >
> > > bq. but not every poll attempt they fetch new segment from the leader
> > >
> > > > Ah, right. Ignore my comment. Commit will only occur on the followers
> > > > when there are new segments to pull down, so you're right, roughly
> > > > every second poll would find things to bring down, commit, and open a
> > > > new searcher.
> > > On Sun, Dec 9, 2018 at 4:14 PM Edward Ribeiro
> 
> > > wrote:
> > > >
> > > > Hi Vadim,
> > > >
> > > > There is no commit on TLOG/PULL  follower replicas, only on the leader.
> > > > Followers fetch the segments and **reload the core** every 150 seconds
> > > (if
> > > > there were new segments, I suppose). Yeah, followers don't pay the CPU
> > > > price of indexing, but there are still cache invalidation, autowarming,
> > > > > etc, in addition to network and IO demand. Is that right, Erick?
> > > >
> > > > Besides that, Erick is pointing out that under a heavy indexing
> > workload
> > > > you could either have:
> > > >
> > > > 1. Very large transaction logs;
> > > >
> > > > 2. Very large numbers of segments. If that is the case, you could have
> > > the
> > > > following scenario numerous times:
> > > >2.1. follower replica downloads segment A and B from leader;
> > > >2.2 leader merges segments A + B into C;
> > > >2.3. follower replicas discard A and B and download C on next poll;
> > > >
> > > > Under the second condition followers needlessly downloaded segments
> > that
> > > > would eventually be merged.
> > > >
> > > > IMO, you should carefully evaluate if the use of TLOG/PULL is really
> > > > recommended for your cluster setup, plus indexing and querying
> > workload.
> > > > You can very much stay with a NRT setup if it suits you better. The
> > > videos
> > > > below provide a nice set of hints for when to choose between NRT or
> > some
> > > > combination of TLOG and PULL.
> > > >
> > > > https://youtu.be/XIb8X3MwVKc
> > > >
> > > > https://youtu.be/dkWy2ykzAv0
> > > >
> > > > https://youtu.be/XqfTjd

Re: Soft commit and new replica types

2018-12-13 Thread Edward Ribeiro

 I wonder at the utility
> of a
> > > > >config param. We already have old-style replication with a
> > > > >configurable polling interval. Under very heavy indexing loads, it
> > > > >seems to me that either the tlogs will grow quite large or we'll be
> > > > >pulling a lot of unnecessary segments across the wire, segments
> > > > >that'll soon be merged away and the merged segment re-pulled.
> > > > >
> > > > >Apparently, though, nobody's seen this "in the wild", so it's
> > > > >theoretical at this point.
> > > > >On Sun, Dec 9, 2018 at 1:48 AM Vadim Ivanov
> > > > < vadim.iva...@spb.ntk-intourist.ru> wrote:
> > > > >
> > > > > Thanks, Edward, for clues.
> > > > > What bothers me is newSearcher start, warming, cache clear... all
> > that
> > > > CPU consuming stuff in my heavy-indexing scenario.
> > > > > With NRT I had autoSoftCommit:  30 .
> > > > > So I had new Searcher no more than  every 5 min on every replica.
> > > > > To have more or less  the same effect with TLOG - PULL collection,
> > > > > I suppose, I have to have  :  30
> > > > > (yes, I understand that newSearchers start asynchronously on leader
> > and
> > > > replicas)
> > > > > Am I right?
> > > > > --
> > > > > Vadim
> > > > >
> > > > >
> > > > >> -Original Message-
> > > > >> From: Edward Ribeiro [mailto:edward.ribe...@gmail.com]
> > > > >> Sent: Sunday, December 09, 2018 12:42 AM
> > > > >> To:  solr-user@lucene.apache.org
> > > > >> Subject: Re: Soft commit and new replica types
> > > > >>
> > > > >> Some insights in the new replica types below:
> > > > >>
> > > > >> On Sat, December 8, 2018 08:42, Vadim Ivanov <
> > > > >> vadim.iva...@spb.ntk-intourist.ru wrote:
> > > > >>
> > > > >>>
> > > > >>> From Ref guide we have:
> > > > >>> " NRT is the only type of replica that supports soft-commits..."
> > > > >>> "If TLOG replica does become a leader, it will behave the same as
> > if it
> > > > >>> was a NRT type of replica."
> > > > > >>> Does it mean that if we do not have NRT replicas in the cluster, then the
> > > > > >>> autoSoftCommit section in solrconfig.xml is ignored completely (even on the
> > > > > >>> TLOG leader)?
> > > > >>>
> > > > >>
> > > > > >> No, not completely. Both TLOG and PULL nodes will periodically poll the
> > > > > >> leader for changes in the index segment files and download those segments
> > > > > >> from the leader. If the hard commit max time is defined in solrconfig.xml, the
> > > > > >> polling interval of each replica will be half that value. Otherwise, if the
> > > > > >> soft commit max time is defined, the replicas will use half the soft
> > > > > >> commit max time as the interval. If neither is defined, the poll
> > > > > >> interval will be 3 seconds (hard coded). See here:
> > > > > >> https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcb117f309119053/solr/core/src/java/org/apache/solr/cloud/ReplicateFromLeader.java#L68-L77
> > > > > >>
> > > > > >> If a TLOG replica is the leader, it will index locally and append the doc to the
> > > > > >> transaction log as an NRT node would, and it will synchronously
> > > > > >> replicate the data to the other TLOG replicas' transaction logs (PULL nodes
> > > > > >> don't have transaction logs). But TLOG/PULL replicas don't support soft
> > > > > >> commits or real-time gets, afaik.
> > > > >>
> > > > >>>
> > > > >>
> > > > >>>
> > > > >>> 6
> > > > >>>
> > > > >>>
> > > > >>> Should we say that in autoCommit section openSearcher is always
> > true in
> > > > >>> that case?
> > > > >>
> > > > >>
> > > > >>
> > > > >> 1
> > > > >> 3
> > > > >> 512m
> > > > >> false
> > > > >>
> > > > >>
> > > > >> Does it mean that new Searcher always starts on all replicas when
> > hard
> > > > >> commit happens on leader?
> > > > >>
> > > > >>
> > > > > >> Nope. Or at least, the searcher is not created synchronously. Each
> > > > > >> non-leader replica will periodically fetch the index changes from the
> > > > > >> leader and open a new searcher to reflect those changes, as seen here:
> > > > > >> https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcb117f309119053/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L653
> > > > > >> But it's important to note the potential delay between the leader's
> > > > > >> hard commit and the other replicas fetching those changes from the
> > > > > >> leader and opening a new searcher to reflect the latest changes.
> > > > > >>
> > > > > >> PS: I am still digging into these new replica types, so I may have
> > > > > >> misunderstood or missed some aspect of them.
> > > > >>
> > > > >> Regards,
> > > > >> Edward
> > > > >
> > > >
> >
>

Re: Soft commit and new replica types

2018-12-10 Thread Tomás Fernández Löbbe

CPU consuming stuff in my heavy-indexing scenario.
> > > > With NRT I had autoSoftCommit:  30 .
> > > > So I had new Searcher no more than  every 5 min on every replica.
> > > > To have more or less  the same effect with TLOG - PULL collection,
> > > > I suppose, I have to have  :  30
> > > > (yes, I understand that newSearchers start asynchronously on leader
> and
> > > replicas)
> > > > Am I right?
> > > > --
> > > > Vadim
> > > >
> > > >
> > > >> -Original Message-
> > > >> From: Edward Ribeiro [mailto:edward.ribe...@gmail.com]
> > > >> Sent: Sunday, December 09, 2018 12:42 AM
> > > >> To:  solr-user@lucene.apache.org
> > > >> Subject: Re: Soft commit and new replica types
> > > >>
> > > >> Some insights in the new replica types below:
> > > >>
> > > >> On Sat, December 8, 2018 08:42, Vadim Ivanov <
> > > >> vadim.iva...@spb.ntk-intourist.ru wrote:
> > > >>
> > > >>>
> > > >>> From Ref guide we have:
> > > >>> " NRT is the only type of replica that supports soft-commits..."
> > > >>> "If TLOG replica does become a leader, it will behave the same as
> if it
> > > >>> was a NRT type of replica."
> > > > >>> Does it mean that if we do not have NRT replicas in the cluster, then the
> > > > >>> autoSoftCommit section in solrconfig.xml is ignored completely (even on the
> > > > >>> TLOG leader)?
> > > >>>
> > > >>
> > > > >> No, not completely. Both TLOG and PULL nodes will periodically poll the
> > > > >> leader for changes in the index segment files and download those segments
> > > > >> from the leader. If the hard commit max time is defined in solrconfig.xml, the
> > > > >> polling interval of each replica will be half that value. Otherwise, if the
> > > > >> soft commit max time is defined, the replicas will use half the soft
> > > > >> commit max time as the interval. If neither is defined, the poll
> > > > >> interval will be 3 seconds (hard coded). See here:
> > > > >> https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcb117f309119053/solr/core/src/java/org/apache/solr/cloud/ReplicateFromLeader.java#L68-L77
> > > > >>
> > > > >> If a TLOG replica is the leader, it will index locally and append the doc to the
> > > > >> transaction log as an NRT node would, and it will synchronously
> > > > >> replicate the data to the other TLOG replicas' transaction logs (PULL nodes
> > > > >> don't have transaction logs). But TLOG/PULL replicas don't support soft
> > > > >> commits or real-time gets, afaik.
> > > >>
> > > >>>
> > > >>
> > > >>>
> > > >>> 6
> > > >>>
> > > >>>
> > > >>> Should we say that in autoCommit section openSearcher is always
> true in
> > > >>> that case?
> > > >>
> > > >>
> > > >>
> > > >> 1
> > > >> 3
> > > >> 512m
> > > >> false
> > > >>
> > > >>
> > > >> Does it mean that new Searcher always starts on all replicas when
> hard
> > > >> commit happens on leader?
> > > >>
> > > >>
> > > > >> Nope. Or at least, the searcher is not created synchronously. Each
> > > > >> non-leader replica will periodically fetch the index changes from the
> > > > >> leader and open a new searcher to reflect those changes, as seen here:
> > > > >> https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcb117f309119053/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L653
> > > > >> But it's important to note the potential delay between the leader's
> > > > >> hard commit and the other replicas fetching those changes from the
> > > > >> leader and opening a new searcher to reflect the latest changes.
> > > > >>
> > > > >> PS: I am still digging into these new replica types, so I may have
> > > > >> misunderstood or missed some aspect of them.
> > > >>
> > > >> Regards,
> > > >> Edward
> > > >
> > >
>

Re: Soft commit and new replica types

2018-12-10 Thread Erick Erickson

bq. but not every poll attempt they fetch new segment from the leader

Ah, right. Ignore my comment. Commit will only occur on the followers
when there are new segments to pull down, so you're right, roughly
every second poll would find things to bring down, commit, and open a
new searcher.
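
To make the "every second poll" arithmetic concrete, here is a rough Python sketch (not Solr code). It assumes the rule quoted further down in this thread: the follower poll interval is half the hard commit max time, else half the soft commit max time, else 3 seconds. The function names `poll_interval_secs` and `polls_that_fetch` are made up for illustration, and the model assumes the leader commits exactly on schedule and that new segments appear only at hard commits.

```python
def poll_interval_secs(auto_commit_secs=None, auto_soft_commit_secs=None):
    """Poll interval as described in this thread (illustrative, not Solr's code)."""
    if auto_commit_secs is not None:
        return auto_commit_secs / 2
    if auto_soft_commit_secs is not None:
        return auto_soft_commit_secs / 2
    return 3  # fallback mentioned in the thread


def polls_that_fetch(commit_secs, duration_secs):
    """Return (poll_time, fetched) pairs for a follower over duration_secs.

    Simplified model: the leader produces new segments only at hard commits,
    and a poll fetches only if a commit happened since the previous poll.
    """
    interval = poll_interval_secs(auto_commit_secs=commit_secs)
    results = []
    last_seen_commit = 0  # number of leader commits already replicated
    t = interval
    while t <= duration_secs:
        commits_so_far = int(t // commit_secs)
        fetched = commits_so_far > last_seen_commit
        last_seen_commit = commits_so_far
        results.append((t, fetched))
        t += interval
    return results


if __name__ == "__main__":
    for when, fetched in polls_that_fetch(commit_secs=300, duration_secs=900):
        action = "fetch + new searcher" if fetched else "nothing new"
        print(f"t={when:>5.0f}s  {action}")
```

With a 300 second hard commit the follower polls every 150 seconds, and under these assumptions only the polls at 300, 600 and 900 seconds find anything to pull down.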
On Sun, Dec 9, 2018 at 4:14 PM Edward Ribeiro  wrote:
>
> Hi Vadim,
>
> There is no commit on TLOG/PULL  follower replicas, only on the leader.
> Followers fetch the segments and **reload the core** every 150 seconds (if
> there were new segments, I suppose). Yeah, followers don't pay the CPU
> price of indexing, but there are still cache invalidation, autowarming,
> etc, in addition to network and IO demand. Is that right, Erick?
>
> Besides that, Erick is pointing out that under a heavy indexing workload
> you could either have:
>
> 1. Very large transaction logs;
>
> 2. Very large numbers of segments. If that is the case, you could have the
> following scenario numerous times:
>    2.1. follower replica downloads segments A and B from leader;
>    2.2. leader merges segments A + B into C;
>    2.3. follower replicas discard A and B and download C on next poll;
>
> Under the second condition followers needlessly downloaded segments that
> would eventually be merged.
>
> IMO, you should carefully evaluate if the use of TLOG/PULL is really
> recommended for your cluster setup, plus indexing and querying workload.
> You can very much stay with a NRT setup if it suits you better. The videos
> below provide a nice set of hints for when to choose between NRT or some
> combination of TLOG and PULL.
>
> https://youtu.be/XIb8X3MwVKc
>
> https://youtu.be/dkWy2ykzAv0
>
> https://youtu.be/XqfTjd9KDWU
>
> Regards,
> Edward
>
> On Sun, Dec 9, 2018, 16:56, <vadim.iva...@spb.ntk-intourist.ru> wrote:
> >
> >  If hard commit max time is 300 sec then commit happens every 300 sec on
> > tlog leader. And new segments pop up on the leader every 300 sec, during
> > indexing. Polling interval on other replicas 150 sec, but not every poll
> > attempt they fetch new segment from the leader, afaiu. Erick, do you mean
> > that on all other  tlog replicas(not leaders) commit occurs every poll?
> > Sunday, December 9, 2018, 19:21 +03:00, from Erick Erickson
> > erickerick...@gmail.com:
> >
> > >Not quite, 60. The polling interval is half the commit interval
> > >
> > >This has always bothered me a little bit, I wonder at the utility of a
> > >config param. We already have old-style replication with a
> > >configurable polling interval. Under very heavy indexing loads, it
> > >seems to me that either the tlogs will grow quite large or we'll be
> > >pulling a lot of unnecessary segments across the wire, segments
> > >that'll soon be merged away and the merged segment re-pulled.
> > >
> > >Apparently, though, nobody's seen this "in the wild", so it's
> > >theoretical at this point.
> > >On Sun, Dec 9, 2018 at 1:48 AM Vadim Ivanov
> > < vadim.iva...@spb.ntk-intourist.ru> wrote:
> > >
> > > Thanks, Edward, for clues.
> > > What bothers me is newSearcher start, warming, cache clear... all that
> > CPU consuming stuff in my heavy-indexing scenario.
> > > With NRT I had autoSoftCommit:  30 .
> > > So I had new Searcher no more than  every 5 min on every replica.
> > > To have more or less  the same effect with TLOG - PULL collection,
> > > I suppose, I have to have  :  30
> > > (yes, I understand that newSearchers start asynchronously on leader and
> > replicas)
> > > Am I right?
> > > --
> > > Vadim
> > >
> > >
> > >> -Original Message-
> > >> From: Edward Ribeiro [mailto:edward.ribe...@gmail.com]
> > >> Sent: Sunday, December 09, 2018 12:42 AM
> > >> To:  solr-user@lucene.apache.org
> > >> Subject: Re: Soft commit and new replica types
> > >>
> > >> Some insights in the new replica types below:
> > >>
> > >> On Sat, December 8, 2018 08:42, Vadim Ivanov <
> > >> vadim.iva...@spb.ntk-intourist.ru wrote:
> > >>
> > >>>
> > >>> From Ref guide we have:
> > >>> " NRT is the only type of replica that supports soft-commits..."
> > >>> "If TLOG replica does become a leader, it will behave the same as if it
> > >>> was a NRT type of replica."
> > >>> Does it mean that if we do not have NRT replicas in the cluster, then the
> > >>> autoSoftCommit section in solrconfig.xml is ignored completely (even

Re: Soft commit and new replica types

2018-12-09 Thread Edward Ribeiro

Hi Vadim,

There is no commit on TLOG/PULL follower replicas, only on the leader.
Followers fetch the segments and **reload the core** every 150 seconds (if
there were new segments, I suppose). Yeah, followers don't pay the CPU
price of indexing, but there are still cache invalidation, autowarming,
etc., in addition to network and IO demand. Is that right, Erick?

Besides that, Erick is pointing out that under a heavy indexing workload
you could either have:

1. Very large transaction logs;

2. Very large numbers of segments. If that is the case, you could have the
following scenario numerous times:
   2.1. follower replica downloads segments A and B from leader;
   2.2. leader merges segments A + B into C;
   2.3. follower replicas discard A and B and download C on next poll;

Under the second condition followers needlessly downloaded segments that
would eventually be merged.
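
A toy model of scenario 2 in Python, assuming made-up segment names and sizes. It is not how Solr's IndexFetcher is implemented and only counts bytes; the function name `simulate_polls` is hypothetical. It tracks which segments a follower has to download at each poll and how many of the downloaded bytes belonged to segments that the leader later merged away:

```python
def simulate_polls(leader_states):
    """leader_states: list of dicts {segment_name: size_bytes}, one per poll.

    Returns (total_downloaded, wasted), where wasted is the size of
    segments that were downloaded and later dropped because the leader
    no longer has them (for example, after a merge).
    """
    follower = {}          # segments currently on the follower
    total_downloaded = 0
    wasted = 0
    for leader in leader_states:
        # Download segments the follower does not have yet.
        for name, size in leader.items():
            if name not in follower:
                total_downloaded += size
        # Segments no longer on the leader are discarded.
        for name, size in follower.items():
            if name not in leader:
                wasted += size
        follower = dict(leader)
    return total_downloaded, wasted


if __name__ == "__main__":
    polls = [
        {"A": 100, "B": 120},   # 2.1: follower downloads A and B
        {"C": 220},             # 2.2 + 2.3: leader merged A + B into C
    ]
    total, wasted = simulate_polls(polls)
    print(f"downloaded {total} bytes, {wasted} of them for segments merged away")
```

In this example the follower transfers 440 bytes in total, half of which were for segments A and B that were replaced by C one poll later.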

IMO, you should carefully evaluate whether the use of TLOG/PULL is really
recommended for your cluster setup and your indexing and querying workload.
You can very well stay with an NRT setup if it suits you better. The videos
below provide a nice set of hints for choosing between NRT and some
combination of TLOG and PULL.

https://youtu.be/XIb8X3MwVKc

https://youtu.be/dkWy2ykzAv0

https://youtu.be/XqfTjd9KDWU

Regards,
Edward

On Sun, Dec 9, 2018, 16:56, <vadim.iva...@spb.ntk-intourist.ru> wrote:
>  If hard commit max time is 300 sec then commit happens every 300 sec on
> tlog leader. And new segments pop up on the leader every 300 sec, during
> indexing. Polling interval on other replicas 150 sec, but not every poll
> attempt they fetch new segment from the leader, afaiu. Erick, do you mean
> that on all other  tlog replicas(not leaders) commit occurs every poll?
> Sunday, December 9, 2018, 19:21 +03:00, from Erick Erickson
> erickerick...@gmail.com:
>
> >Not quite, 60. The polling interval is half the commit interval
> >
> >This has always bothered me a little bit, I wonder at the utility of a
> >config param. We already have old-style replication with a
> >configurable polling interval. Under very heavy indexing loads, it
> >seems to me that either the tlogs will grow quite large or we'll be
> >pulling a lot of unnecessary segments across the wire, segments
> >that'll soon be merged away and the merged segment re-pulled.
> >
> >Apparently, though, nobody's seen this "in the wild", so it's
> >theoretical at this point.
> >On Sun, Dec 9, 2018 at 1:48 AM Vadim Ivanov
> < vadim.iva...@spb.ntk-intourist.ru> wrote:
> >
> > Thanks, Edward, for clues.
> > What bothers me is newSearcher start, warming, cache clear... all that
> CPU consuming stuff in my heavy-indexing scenario.
> > With NRT I had autoSoftCommit:  30 .
> > So I had new Searcher no more than  every 5 min on every replica.
> > To have more or less  the same effect with TLOG - PULL collection,
> > I suppose, I have to have  :  30
> > (yes, I understand that newSearchers start asynchronously on leader and
> replicas)
> > Am I right?
> > --
> > Vadim
> >
> >
> >> -Original Message-
> >> From: Edward Ribeiro [mailto:edward.ribe...@gmail.com]
> >> Sent: Sunday, December 09, 2018 12:42 AM
> >> To:  solr-user@lucene.apache.org
> >> Subject: Re: Soft commit and new replica types
> >>
> >> Some insights in the new replica types below:
> >>
> >> On Sat, December 8, 2018 08:42, Vadim Ivanov <
> >> vadim.iva...@spb.ntk-intourist.ru wrote:
> >>
> >>>
> >>> From Ref guide we have:
> >>> " NRT is the only type of replica that supports soft-commits..."
> >>> "If TLOG replica does become a leader, it will behave the same as if it
> >>> was a NRT type of replica."
> >>> Does it mean that if we do not have NRT replicas in the cluster, then the
> >>> autoSoftCommit section in solrconfig.xml is ignored completely (even on the
> >>> TLOG leader)?
> >>>
> >>
> >> No, not completely. Both TLOG and PULL nodes will periodically poll the
> >> leader for changes in the index segment files and download those segments
> >> from the leader. If the hard commit max time is defined in solrconfig.xml, the
> >> polling interval of each replica will be half that value. Otherwise, if the
> >> soft commit max time is defined, the replicas will use half the soft
> >> commit max time as the interval. If neither is defined, the poll
> >> interval will be 3 seconds (hard coded). See here:
> >> https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcb117f309119053/solr/core/src/java/o
> >

Re: Soft commit and new replica types

2018-12-09 Thread vadim . ivanov


If hard commit max time is 300 sec, then a commit happens every 300 sec on the
TLOG leader, and new segments appear on the leader every 300 sec during
indexing. The polling interval on the other replicas is 150 sec, but not every
poll attempt fetches a new segment from the leader, as far as I understand.
Erick, do you mean that on all other TLOG replicas (non-leaders) a commit
occurs on every poll?

Sunday, 09 December 2018, 19:21 +03:00 from Erick Erickson erickerick...@gmail.com:

>Not quite, 600000. The polling interval is half the commit interval.
>
>This has always bothered me a little bit, I wonder at the utility of a
>config param. We already have old-style replication with a
>configurable polling interval. Under very heavy indexing loads, it
>seems to me that either the tlogs will grow quite large or we'll be
>pulling a lot of unnecessary segments across the wire, segments
>that'll soon be merged away and the merged segment re-pulled.
>
>Apparently, though, nobody's seen this "in the wild", so it's
>theoretical at this point.
>On Sun, Dec 9, 2018 at 1:48 AM Vadim Ivanov
< vadim.iva...@spb.ntk-intourist.ru> wrote:
>
> Thanks, Edward, for clues.
> What bothers me is newSearcher start, warming, cache clear... all that CPU 
> consuming stuff in my heavy-indexing scenario.
> With NRT I had autoSoftCommit maxTime: 300000.
> So I had a new searcher no more than every 5 min on every replica.
> To have more or less the same effect with a TLOG - PULL collection,
> I suppose I have to have autoCommit maxTime: 300000
> (yes, I understand that newSearchers start asynchronously on leader and 
> replicas)
> Am I right?
> --
> Vadim
>
>
>> -Original Message-
>> From: Edward Ribeiro [mailto:edward.ribe...@gmail.com]
>> Sent: Sunday, December 09, 2018 12:42 AM
>> To:  solr-user@lucene.apache.org
>> Subject: Re: Soft commit and new replica types
>>
>> Some insights in the new replica types below:
>>
>> On Sat, December 8, 2018 08:42, Vadim Ivanov <
>> vadim.iva...@spb.ntk-intourist.ru wrote:
>>
>>>
>>> From Ref guide we have:
>>> " NRT is the only type of replica that supports soft-commits..."
>>> "If TLOG replica does become a leader, it will behave the same as if it
>>> was a NRT type of replica."
>>> Does it mean, that if we do not have NRT replicas in the cluster then
>>> autoSoftCommit section in solconfig.xml Ignored completely (even on TLOG
>>> leader)?
>>>
>>
>> No, not completely. Both TLOG and PULL nodes will periodically poll the
>> leader for changes in index segments' files and download those segments
>> from the leader. If hard commit max time is defined in solrconfig.xml the
>> polling interval of each replica will be half that value. Or else if the
>> soft commit max time is defined then the replicas will use half the soft
>> commit max time as the interval. If neither are defined then the poll
>> interval will be 3 seconds (hard coded). See here:
>> https://github.com/apache/lucene-
>> solr/blob/75b183196798232aa6f2dcb117f309119053/solr/core/src/java/o
>> rg/apache/solr/cloud/ReplicateFromLeader.java#L68-L77
>>
>> If the TLOG is the leader it will index locally and append the doc to
>> transaction log as a NRT node would do as well as it will synchronously
>> replicate the data to other TLOG replicas' transaction logs (PULL nodes
>> don't have transaction logs). But TLOG/PULL replicas doesn't support soft
>> commits nor real time gets, afaik.
>>
>>>
>>
>>>
>>> 6
>>>
>>>
>>> Should we say that in autoCommit section openSearcher is always true in
>>> that case?
>>
>>
>>
>> 1
>> 3
>> 512m
>> false
>>
>>
>> Does it mean that new Searcher always starts on all replicas when hard
>> commit happens on leader?
>>
>>
>> Nope. Or at least, the searcher is not synchronously created. Each non
>> leader replica will periodically fetch the index changes from the leader
>> and open a new searcher to reflect those changes as seen here:
>> https://github.com/apache/lucene-
>> solr/blob/75b183196798232aa6f2dcb117f309119053/solr/core/src/java/o
>> rg/apache/solr/handler/IndexFetcher.java#L653
>> But it's important to note that the potential delay between the leader's
>> hard commit and the other replicas fetching those changes from the leader
>> and opening a new searcher to reflect latest changes.
>>
>> PS: I am still digging these new replica types so I can have misunderstood
>> or missed some aspect of it.
>>
>> Regards,
>> Edward
>

Re: Soft commit and new replica types

2018-12-09 Thread Erick Erickson

Not quite, 600000. The polling interval is half the commit interval.

This has always bothered me a little bit, I wonder at the utility of a
config param. We already have old-style replication with a
configurable polling interval. Under very heavy indexing loads, it
seems to me that either the tlogs will grow quite large or we'll be
pulling a lot of unnecessary segments across the wire, segments
that'll soon be merged away and the merged segment re-pulled.

Apparently, though, nobody's seen this "in the wild", so it's
theoretical at this point.
On Sun, Dec 9, 2018 at 1:48 AM Vadim Ivanov
 wrote:
>
> Thanks, Edward, for clues.
> What bothers me is newSearcher start, warming, cache clear... all that CPU 
> consuming stuff in my heavy-indexing scenario.
> With NRT I had autoSoftCommit maxTime: 300000.
> So I had a new searcher no more than every 5 min on every replica.
> To have more or less the same effect with a TLOG - PULL collection,
> I suppose I have to have autoCommit maxTime: 300000
> (yes, I understand that newSearchers start asynchronously on leader and 
> replicas)
> Am I right?
> --
> Vadim
>
>
> > -Original Message-
> > From: Edward Ribeiro [mailto:edward.ribe...@gmail.com]
> > Sent: Sunday, December 09, 2018 12:42 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Soft commit and new replica types
> >
> > Some insights in the new replica types below:
> >
> > On Sat, December 8, 2018 08:42, Vadim Ivanov <
> > vadim.iva...@spb.ntk-intourist.ru wrote:
> >
> > >
> > > From Ref guide we have:
> > > " NRT is the only type of replica that supports soft-commits..."
> > > "If TLOG replica does become a leader, it will behave the same as if it
> > > was a NRT type of replica."
> > > Does it mean, that if we do not have NRT replicas in the cluster then
> > > autoSoftCommit section in solconfig.xml Ignored completely (even on TLOG
> > > leader)?
> > >
> >
> > No, not completely. Both TLOG and PULL nodes will periodically poll the
> > leader for changes in index segments' files and download those segments
> > from the leader. If hard commit max time is defined in solrconfig.xml the
> > polling interval of each replica will be half that value. Or else if the
> > soft commit max time is defined then the replicas will use half the soft
> > commit max time as the interval. If neither are defined then the poll
> > interval will be 3 seconds (hard coded). See here:
> > https://github.com/apache/lucene-
> > solr/blob/75b183196798232aa6f2dcb117f309119053/solr/core/src/java/o
> > rg/apache/solr/cloud/ReplicateFromLeader.java#L68-L77
> >
> > If the TLOG is the leader it will index locally and append the doc to
> > transaction log as a NRT node would do as well as it will synchronously
> > replicate the data to other TLOG replicas' transaction logs (PULL nodes
> > don't have transaction logs). But TLOG/PULL replicas doesn't support soft
> > commits nor real time gets, afaik.
> >
> > >
> >
> > > 
> > >   6
> > > 
> > >
> > > Should we say that in autoCommit section openSearcher is always true in
> > > that case?
> >
> >
> > 
> >   1
> >   3
> >   512m
> >   false
> > 
> >
> > Does it mean that new Searcher always starts on all replicas when hard
> > commit happens on leader?
> >
> >
> > Nope. Or at least, the searcher is not synchronously created. Each non
> > leader replica will periodically fetch the index changes from the leader
> > and open a new searcher to reflect those changes as seen here:
> > https://github.com/apache/lucene-
> > solr/blob/75b183196798232aa6f2dcb117f309119053/solr/core/src/java/o
> > rg/apache/solr/handler/IndexFetcher.java#L653
> > But it's important to note that the potential delay between the leader's
> > hard commit and the other replicas fetching those changes from the leader
> > and opening a new searcher to reflect latest changes.
> >
> > PS: I am still digging these new replica types so I can have misunderstood
> > or missed some aspect of it.
> >
> > Regards,
> > Edward
>

RE: Soft commit and new replica types

2018-12-09 Thread Vadim Ivanov

Thanks, Edward, for clues.
What bothers me is newSearcher start, warming, cache clear... all that CPU 
consuming stuff in my heavy-indexing scenario.
With NRT I had autoSoftCommit maxTime: 300000.
So I had a new searcher no more than every 5 min on every replica.
To have more or less the same effect with a TLOG - PULL collection,
I suppose I have to have autoCommit maxTime: 300000
(yes, I understand that newSearchers start asynchronously on leader and 
replicas)
Am I right?
-- 
Vadim


> -Original Message-
> From: Edward Ribeiro [mailto:edward.ribe...@gmail.com]
> Sent: Sunday, December 09, 2018 12:42 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Soft commit and new replica types
> 
> Some insights in the new replica types below:
> 
> On Sat, December 8, 2018 08:42, Vadim Ivanov <
> vadim.iva...@spb.ntk-intourist.ru wrote:
> 
> >
> > From Ref guide we have:
> > " NRT is the only type of replica that supports soft-commits..."
> > "If TLOG replica does become a leader, it will behave the same as if it
> > was a NRT type of replica."
> > Does it mean, that if we do not have NRT replicas in the cluster then
> > autoSoftCommit section in solconfig.xml Ignored completely (even on TLOG
> > leader)?
> >
> 
> No, not completely. Both TLOG and PULL nodes will periodically poll the
> leader for changes in index segments' files and download those segments
> from the leader. If hard commit max time is defined in solrconfig.xml the
> polling interval of each replica will be half that value. Or else if the
> soft commit max time is defined then the replicas will use half the soft
> commit max time as the interval. If neither are defined then the poll
> interval will be 3 seconds (hard coded). See here:
> https://github.com/apache/lucene-
> solr/blob/75b183196798232aa6f2dcb117f309119053/solr/core/src/java/o
> rg/apache/solr/cloud/ReplicateFromLeader.java#L68-L77
> 
> If the TLOG is the leader it will index locally and append the doc to
> transaction log as a NRT node would do as well as it will synchronously
> replicate the data to other TLOG replicas' transaction logs (PULL nodes
> don't have transaction logs). But TLOG/PULL replicas doesn't support soft
> commits nor real time gets, afaik.
> 
> >
> 
> > 
> >   6
> > 
> >
> > Should we say that in autoCommit section openSearcher is always true in
> > that case?
> 
> 
> 
>   1
>   3
>   512m
>   false
> 
> 
> Does it mean that new Searcher always starts on all replicas when hard
> commit happens on leader?
> 
> 
> Nope. Or at least, the searcher is not synchronously created. Each non
> leader replica will periodically fetch the index changes from the leader
> and open a new searcher to reflect those changes as seen here:
> https://github.com/apache/lucene-
> solr/blob/75b183196798232aa6f2dcb117f309119053/solr/core/src/java/o
> rg/apache/solr/handler/IndexFetcher.java#L653
> But it's important to note that the potential delay between the leader's
> hard commit and the other replicas fetching those changes from the leader
> and opening a new searcher to reflect latest changes.
> 
> PS: I am still digging these new replica types so I can have misunderstood
> or missed some aspect of it.
> 
> Regards,
> Edward

Re: Soft commit and new replica types

2018-12-08 Thread Edward Ribeiro

Some insights in the new replica types below:

On Sat, December 8, 2018 08:42, Vadim Ivanov <
vadim.iva...@spb.ntk-intourist.ru wrote:

>
> From Ref guide we have:
> " NRT is the only type of replica that supports soft-commits..."
> "If TLOG replica does become a leader, it will behave the same as if it
> was a NRT type of replica."
> Does it mean that if we do not have NRT replicas in the cluster, then the
> autoSoftCommit section in solrconfig.xml is ignored completely (even on the
> TLOG leader)?
>

No, not completely. Both TLOG and PULL nodes will periodically poll the
leader for changes in index segments' files and download those segments
from the leader. If hard commit max time is defined in solrconfig.xml the
polling interval of each replica will be half that value. Or else if the
soft commit max time is defined then the replicas will use half the soft
commit max time as the interval. If neither are defined then the poll
interval will be 3 seconds (hard coded). See here:
https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcb117f309119053/solr/core/src/java/org/apache/solr/cloud/ReplicateFromLeader.java#L68-L77
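
To make the rule above concrete, here is a minimal sketch of the interval selection as described in this message. It is not the Solr source; the function name, the parameter names and the use of None (or a negative value) for "not configured" are assumptions made for illustration.

```python
from typing import Optional

# Fallback described above when neither commit interval is configured.
DEFAULT_POLL_MS = 3_000


def poll_interval_ms(auto_commit_max_ms: Optional[int],
                     auto_soft_commit_max_ms: Optional[int]) -> int:
    """Return the replica poll interval in milliseconds.

    Half the hard commit max time if it is set, otherwise half the soft
    commit max time, otherwise a fixed 3 seconds. None or a negative
    value is treated as "not configured" (an assumption of this sketch).
    """
    def is_set(value: Optional[int]) -> bool:
        return value is not None and value >= 0

    if is_set(auto_commit_max_ms):
        return auto_commit_max_ms // 2
    if is_set(auto_soft_commit_max_ms):
        return auto_soft_commit_max_ms // 2
    return DEFAULT_POLL_MS
```

With a hard commit max time of 300000 ms this gives a 150000 ms poll interval, which matches the 150 seconds mentioned elsewhere in this thread.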

If a TLOG replica is the leader, it will index locally and append the doc to
its transaction log as an NRT node would, and it will also synchronously
replicate the data to the other TLOG replicas' transaction logs (PULL nodes
don't have transaction logs). But TLOG/PULL replicas don't support soft
commits or real-time gets, afaik.

>

> 
>   6
> 
>
> Should we say that in autoCommit section openSearcher is always true in
> that case?

  1
  3
  512m
  false

Does it mean that new Searcher always starts on all replicas when hard
commit happens on leader?

Nope. Or at least, the searcher is not synchronously created. Each non
leader replica will periodically fetch the index changes from the leader
and open a new searcher to reflect those changes as seen here:
https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcb117f309119053/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L653
But it's important to note the potential delay between the leader's
hard commit and the other replicas fetching those changes from the leader
and opening a new searcher to reflect the latest changes.

PS: I am still digging into these new replica types, so I may have
misunderstood or missed some aspect of them.

Regards,
Edward

Re: Soft commit impact on replication

2018-06-18 Thread Adarsh_infor

Hi Erick,

Thanks for the response. 

First, we are not indexing on the slave, and we are not re-indexing or
optimizing the entire core on the master node.

The only warning I see in the log is "Unable clean the unused index
directory so starting full copy".
I can understand that one and have no issue with it, as it's normal
behaviour. But most of the time it just triggers a full copy without any
details in the log.

Recently, on one of the nodes, I enabled soft commit on the master and
monitored the corresponding slave. What I observed is that it didn't trigger
a full copy even once for almost 3 consecutive days. So I am wondering: do we
need soft commit enabled on the master for replication to run smoothly, and
if so, what is the dependency there?


Thanks 



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Soft commit impact on replication

2018-06-15 Thread Erick Erickson

My first guess is that you're indexing to the slave nodes.

Second guess is that you're re-indexing your entire corpus on the master node.

Third guess is that you're optimizing on the master node (don't do this)

What does the slave's log say is the reason? If all the segments on
the master have changed, they'll all need to be copied during
replication so if that's the case that's entirely normal.

And you shouldn't need to commit on the slaves, that should happen as
part of replication.

Best,
Erick

On Fri, Jun 15, 2018 at 3:25 AM, Adarsh Hd  wrote:
> Hi All,
>
> Currently I am using Solr 5.2.1 on a Linux machine. I have a cluster of 5
> nodes in a master/slave configuration, which gives 5 master nodes and 5 slave
> nodes. We have enabled only hard commit on the master nodes, and both soft
> and hard commit on the slave nodes, since searches happen on the slaves.
> Currently we are facing an issue that causes our slaves to replicate a full
> copy from the master most of the time. Sometimes I get an error like "unable
> to clean old index so trigger full copy", which I could understand from a
> couple of Solr links, but a lot of the time the full copy is triggered
> without any warning or error message. Could anyone let me know what might
> trigger full-copy replication so often?
>
>
> Regards
> Adarsh

Re: Soft commit uploading datas cant search on website

2017-08-11 Thread Erick Erickson

First, if you specify commit it's doing a hard commit with
openSearcher=true by default so the softCommit isn't necessary here.
I'd do one or the other, as it's possible that Solr is stopping at the
first one.

bq: when i do the hardcommit manually . then its shows the result on website.

I don't know what that means in this context.

The most obvious difference is that you have "fq" clauses in the one
that doesn't work, what happens when you remove them?

Best,
Erick

On Fri, Aug 11, 2017 at 3:00 AM, Abdul Ekfhy  wrote:
> I configured the soft commit in solrconfig.xml, and when I add a new
> entry (example: solrtest) from the website and run the commit URL below:
> "http://192.168.2.10:8983/solr/goods/update?softCommit=true&commit=true"
>
> When I run this and check with the query keyword:solrtest, it shows the
> entry in XML format,
>
> and it also increases numDocs for goods by 1,
>
> but it does not show up in the website search.
>
> when i do the hardcommit manually . then its shows the result on website.
>
> Any idea what could be the issue?
>
> My configurations are below.
>
> solrconfig.xml
>
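>   <autoCommit>
>     <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>     <openSearcher>false</openSearcher>
>   </autoCommit>
>
>
>   <autoSoftCommit>
>     <maxTime>1</maxTime>
>   </autoSoftCommit>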
>
>
> solr.log
> ..
> ..
> search on query(its shows hits=1)
> .
> 2017-08-11 09:15:28.060 INFO  (qtp985934102-14) [c:goods s:shard1
> r:core_node3 x:goods_v6.6] o.a.s.c.S.Request [goods_v6.6]  webapp=/solr
> path=/select params={q=keyword:solrtest=on=json&_=1502439335248}
> hits=1 status=0 QTime=1
> .
> search on website(its shows hits=0)
> ..
> 2017-08-11 09:14:37.226 INFO  (qtp985934102-47) [c:goods s:shard1
> r:core_node3 x:goods_v6.6] o.a.s.c.S.Request [goods_v6.6]  webapp=/solr
> path=/select
> params={q=keyword:"*solrtest*"=0=storePrice:[0+TO+9]=goodsStatus:0=goodsClick+desc=12=javabin=2}
> hits=0 status=0 QTime=2
>
>
> any configurations i need to  add more to solrconfig.xml ??
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Soft-commit-uploading-datas-cant-search-on-website-tp4350186.html
> Sent from the Solr - User mailing list archive at Nabble.com.

Re: Soft commit and reading data just after the commit

2016-12-20 Thread Shawn Heisey

On 12/19/2016 7:12 PM, Lasitha Wattaladeniya wrote:
> *Requirement *is, we are showing a list of entries on a page. For each
> user there's a read / unread flag. The data for listing is fetched
> from solr. And you can see the entry was previously read or not. So
> when a user views an entry by clicking. We are updating the database
> flag to READ and use real time indexing to update solr index. So when
> the user close the full view of the entry and go back to entry listing
> page, the data fetched from solr should be updated to READ. Can't we
> achieve a requirement as described above using solr ? (without
> manipulating the previously fetched results list from solr, because at
> some point we'll have to go back to search results from solr and at
> that time it should be updated).

If you want the user to see that state reflected in their own results,
it's probably far easier to have the application track what the user has
opened in the last minute or so, and if those specific IDs are in the
results, mark them as read regardless of what Solr returns.

Shortly after the application sends the update, the results coming back
from Solr should be current, which is why your application won't need to
hold onto that information for very long.
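
The application-side tracking described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the 60-second default and the "id" and "read" field names are hypothetical, and one instance per user session is assumed.

```python
import time
from typing import Callable, Dict, Iterable, List


class RecentlyReadOverlay:
    """Remembers, for a short time, which entry IDs a user has just opened."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._opened: Dict[str, float] = {}  # entry id -> time it was opened

    def mark_opened(self, entry_id: str) -> None:
        """Record that the user opened this entry just now."""
        self._opened[entry_id] = self._clock()

    def apply(self, results: Iterable[dict]) -> List[dict]:
        """Return copies of the Solr results with fresh reads forced to read."""
        now = self._clock()
        # Forget IDs older than the TTL; by then Solr should be current.
        self._opened = {entry_id: opened_at
                        for entry_id, opened_at in self._opened.items()
                        if now - opened_at <= self._ttl}
        patched = []
        for doc in results:
            doc = dict(doc)  # copy, so the caller's result list is untouched
            if doc.get("id") in self._opened:
                doc["read"] = True
            patched.append(doc)
        return patched
```

The listing page would call mark_opened() when an entry is viewed and pass every Solr result list through apply() before rendering. Once the TTL has passed, whatever Solr returns is shown unchanged.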

Thanks,
Shawn

Re: Soft commit and reading data just after the commit

2016-12-19 Thread Ere Maijala


Hi,

so, the app already has a database connection because it updates the 
READ flag when the user clicks an entry, right? If you only need the 
flag for display purposes, it sounds like it would make sense to also 
fetch it directly from the database when displaying the listing. Of 
course if you also need to search for READ/UNREAD you need to index the 
change, but perhaps you could get away with it taking longer.


--Ere

On 20.12.2016 at 4.12, Lasitha Wattaladeniya wrote:

Hi Shawn,

Thanks for your well detailed explanation. Now I understand, I won't be
able to achieve the 100ms softcommit timeout with my hardware setup.
However let's say someone has a requirement as below (quoted from my
previous mail)

*Requirement *is,  we are showing a list of entries on a page. For each
user there's a read / unread flag.  The data for listing is fetched from
solr. And you can see the entry was previously read or not. So when a user
views an entry by clicking.  We are updating the database flag to READ and
use real time indexing to update solr index.  So when the user close the
full view of the entry and go back to entry listing page,  the data fetched
from solr should be updated to READ.

Can't we achieve a requirement as described above using solr ? (without
manipulating the previously fetched results list from solr, because at some
point we'll have to go back to search results from solr and at that time it
should be updated).

Regards,
Lasitha

Lasitha Wattaladeniya
Software Engineer

Mobile : +6593896893
Blog : techreadme.blogspot.com

On Mon, Dec 19, 2016 at 6:37 PM, Shawn Heisey  wrote:


On 12/18/2016 7:09 PM, Lasitha Wattaladeniya wrote:

@eric : thanks for the lengthy reply. So let's say I increase the
autosoftcommit time out to may be 100 ms. In that case do I have to
wait much that time from client side before calling search ?. What's
the correct way of achieving this?


Some of the following is covered by the links you've already received.
Some of it may be new information.

Before you can see a change you've just made, you will need to wait for
the commit to be fired (in this case, the autoSoftCommit time) plus
however long it actually takes to complete the commit and open a new
searcher.  Opening the searcher is the expensive part.

What I typically recommend that people do is have the autoSoftCommit
time as long as they can stand, with 60-300 seconds as a "typical"
value.  That's a setting of 60000 to 300000.  What you are trying to
achieve is much faster, and much more difficult.

100 milliseconds will typically be far too small a value unless your
index is extremely small or your hardware is incredibly fast and has a
lot of memory.  With a value of 100, you'll want each of those soft
commits (which do open a new searcher) to take FAR less than 100
milliseconds to complete.  This kind of speed can be difficult to
achieve, especially if the index is large.
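
As a rough worked example of the timing described above: the wait before a change is visible is about the autoSoftCommit time plus the time to open the new searcher, and if opening a searcher takes longer than the commit interval, several searchers end up warming at once. Both functions below are back-of-the-envelope estimates with hypothetical names, not anything Solr itself computes.

```python
import math


def worst_case_visibility_ms(auto_soft_commit_ms: int,
                             searcher_open_ms: int) -> int:
    """Rough upper bound on the wait before a new document is searchable."""
    return auto_soft_commit_ms + searcher_open_ms


def overlapping_searchers(auto_soft_commit_ms: int,
                          searcher_open_ms: int) -> int:
    """Rough count of searchers warming at once under continuous indexing."""
    return max(1, math.ceil(searcher_open_ms / auto_soft_commit_ms))


# A 100 ms soft commit with a searcher that takes 250 ms to open:
print(worst_case_visibility_ms(100, 250))  # 350
print(overlapping_searchers(100, 250))     # 3
```

With a 60000 ms soft commit and a searcher that opens in a couple of seconds, the second estimate stays at 1, which is the comfortable case.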

To have any hope of fast commit times, you will need to set
autowarmCount on all Solr caches to zero.  If you are indexing
frequently enough, you might even want to completely disable Solr's
internal caches, because they may be providing no benefit.

You will want to have enough extra memory that your operating system can
cache the vast majority (or even maybe all) of your index.

https://wiki.apache.org/solr/SolrPerformanceProblems

Some other info that's helpful for understanding why plenty of *spare*
memory (not allocated by programs) is necessary for good performance:

https://en.wikipedia.org/wiki/Page_cache
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

The reason in a nutshell:  Disks are EXTREMELY slow.  Memory is very fast.

Thanks,
Shawn






--
Ere Maijala
Kansalliskirjasto / The National Library of Finland

Re: Soft commit and reading data just after the commit

2016-12-19 Thread Walter Underwood

You probably need a database instead of a search engine.

What requirement makes you want to do this with a search engine?

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Dec 19, 2016, at 6:34 PM, Lasitha Wattaladeniya  wrote:
> 
> Hi Hendrik,
> 
> Thanks for your input. Previously I was using the hard commit
> (SolrClient.commit()) but then I got some error when there are concurrent
> real time index requests from my app. The error was  "Exceeded limit of
> maxWarmingSearchers=2, try again later", then i changed the code to use
> only solrserver.add(docs) method and configured autoSoftCommit timeout and
> autoCommit timeout in solrConfig.
> 
> I think, i'll get the same error when I use the method you described
> (SolrClient.commit(String
> collection, boolean waitFlush, boolean waitSearcher, boolean softCommit)),
> Anyway i'll have a look at that method.
> 
> Best regards,
> Lasitha
> 
> Lasitha Wattaladeniya
> Software Engineer
> 
> Mobile : +6593896893 <+65%209389%206893>
> Blog : techreadme.blogspot.com
> 
> On Tue, Dec 20, 2016 at 3:31 AM, Hendrik Haddorp 
> wrote:
> 
>> Hi,
>> 
>> the SolrJ API has this method: SolrClient.commit(String collection,
>> boolean waitFlush, boolean waitSearcher, boolean softCommit).
>> My assumption so far was that when you set waitSearcher to true that the
>> method call only returns once a search would find the new data, which
>> sounds what you want. I used this already and it seemed to work just fine.
>> 
>> regards,
>> Hendrik
>> 
>> 
>> On 19.12.2016 04:09, Lasitha Wattaladeniya wrote:
>> 
>>> Hi all,
>>> 
>>> Thanks for your replies,
>>> 
>>> @dorian : the requirement is,  we are showing a list of entries on a page.
>>> For each user there's a read / unread flag.  The data for listing is
>>> fetched from solr. And you can see the entry was previously read or not.
>>> So
>>> when a user views an entry by clicking.  We are updating the database flag
>>> to READ and use real time indexing to update solr entry.  So when the user
>>> close the full view of the entry and go back to entry listing page,  the
>>> data fetched from solr should be updated to READ. That's the use case we
>>> are trying to fix.
>>> 
>>> @eric : thanks for the lengthy reply.  So let's say I increase the
>>> autosoftcommit time out to may be 100 ms.  In that case do I have to wait
>>> much that time from client side before calling search ?.  What's the
>>> correct way of achieving this?
>>> 
>>> Regards,
>>> Lasitha
>>> 
>>> On 18 Dec 2016 23:52, "Erick Erickson"  wrote:
>>>
>>>> 1 ms autocommit is far too frequent. And it's not
>>>> helping you anyway.
>>>>
>>>> There is some lag between when a commit happens
>>>> and when the docs are really available. The sequence is:
>>>> 1> commit (soft or hard-with-opensearcher=true doesn't matter).
>>>> 2> a new searcher is opened and autowarming starts
>>>> 3> until the new searcher is opened, queries continue to be served by
>>>> the old searcher
>>>> 4> the new searcher is fully opened
>>>> 5> _new_ requests are served by the new searcher.
>>>> 6> the last request is finished by the old searcher and it's closed.
>>>>
>>>> So what's probably happening is that you send docs and then send a
>>>> query and Solr is still in step <3>. You can look at your admin UI
>>>> plugins/stats page or your log to see how long it takes for a
>>>> searcher to open and adjust your expectations accordingly.
>>>>
>>>> If you want to fetch only the document (not try to get it by a
>>>> search), Real Time Get is designed to ensure that you always get the
>>>> most recent copy whether it's searchable or not.
>>>>
>>>> All that said, Solr wasn't designed for autocommits that are that
>>>> frequent. That's why the documentation talks about _Near_ Real Time.
>>>> You may need to adjust your expectations.
>>>>
>>>> Best,
>>>> Erick
>>>>
>>>> On Sun, Dec 18, 2016 at 6:49 AM, Dorian Hoxha wrote:
>>>>
>>>>> There's a very high probability that you're using the wrong tool for the
>>>>> job if you need 1ms softCommit time. Especially when you always need it
>>>>> (e.g., there are apps where you need commit-after-insert very rarely).
>>>>>
>>>>> So explain what you're using it for ?
>>>>>
>>>>> On Sun, Dec 18, 2016 at 3:38 PM, Lasitha Wattaladeniya
>>>>> <watt...@gmail.com> wrote:
>>>>>
>>>>>> Hi Furkan,
>>>>>>
>>>>>> Thanks for the links. I had read the first one but not the second one.
>>>>>> I did read it after you sent. So in my current solrconfig.xml settings
>>>>>> below are the configurations,
>>>>>>
>>>>>> <autoSoftCommit>
>>>>>>   <maxTime>${solr.autoSoftCommit.maxTime:1}</maxTime>
>>>>>> </autoSoftCommit>
>>>>>>
>>>>>> <autoCommit>
>>>>>>   <maxTime>15000</maxTime>
>>>>>>   <openSearcher>false</openSearcher>
>>>>>> </autoCommit>
>>>>>>
>>>>>> The problem i'm facing is, just after adding the documents to solr
>>>>>> using
>>>>>> solrj, when I retrieve data from solr I
Re: Soft commit and reading data just after the commit

2016-12-19 Thread Lasitha Wattaladeniya

Hi Hendrik,

Thanks for your input. Previously I was using the hard commit
(SolrClient.commit()) but then I got some error when there are concurrent
real time index requests from my app. The error was  "Exceeded limit of
maxWarmingSearchers=2, try again later", then i changed the code to use
only solrserver.add(docs) method and configured autoSoftCommit timeout and
autoCommit timeout in solrConfig.

I think, i'll get the same error when I use the method you described
(SolrClient.commit(String
collection, boolean waitFlush, boolean waitSearcher, boolean softCommit)),
Anyway i'll have a look at that method.
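
For reference, the explicit commit Hendrik describes can also be expressed as request parameters on the update handler (commit, softCommit, waitSearcher). The sketch below only builds such a URL and does not send it; the function name, host and collection name are made up for illustration. Explicit commits issued by many concurrent clients can still pile up warming searchers, so this does not by itself avoid the maxWarmingSearchers error.

```python
from urllib.parse import urlencode


def commit_url(base_url: str, collection: str, *, soft: bool = True,
               wait_searcher: bool = True) -> str:
    """Build an update-handler URL that requests an explicit commit.

    waitSearcher=true asks Solr to respond only after the new searcher
    is registered, so a search issued afterwards should see the change.
    """
    params = {
        "commit": "true",
        "softCommit": "true" if soft else "false",
        "waitSearcher": "true" if wait_searcher else "false",
    }
    return f"{base_url.rstrip('/')}/{collection}/update?{urlencode(params)}"


# Hypothetical host and collection name:
print(commit_url("http://localhost:8983/solr", "entries"))
```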

Best regards,
Lasitha

Lasitha Wattaladeniya
Software Engineer

Mobile : +6593896893 <+65%209389%206893>
Blog : techreadme.blogspot.com

On Tue, Dec 20, 2016 at 3:31 AM, Hendrik Haddorp 
wrote:

> Hi,
>
> the SolrJ API has this method: SolrClient.commit(String collection,
> boolean waitFlush, boolean waitSearcher, boolean softCommit).
> My assumption so far was that when you set waitSearcher to true that the
> method call only returns once a search would find the new data, which
> sounds what you want. I used this already and it seemed to work just fine.
>
> regards,
> Hendrik
>
>
> On 19.12.2016 04:09, Lasitha Wattaladeniya wrote:
>
>> Hi all,
>>
>> Thanks for your replies,
>>
>> @dorian : the requirement is,  we are showing a list of entries on a page.
>> For each user there's a read / unread flag.  The data for listing is
>> fetched from solr. And you can see the entry was previously read or not.
>> So
>> when a user views an entry by clicking.  We are updating the database flag
>> to READ and use real time indexing to update solr entry.  So when the user
>> close the full view of the entry and go back to entry listing page,  the
>> data fetched from solr should be updated to READ. That's the use case we
>> are trying to fix.
>>
>> @eric : thanks for the lengthy reply.  So let's say I increase the
>> autosoftcommit time out to may be 100 ms.  In that case do I have to wait
>> much that time from client side before calling search ?.  What's the
>> correct way of achieving this?
>>
>> Regards,
>> Lasitha
>>
>> On 18 Dec 2016 23:52, "Erick Erickson"  wrote:
>>
>> 1 ms autocommit is far too frequent. And it's not
>>> helping you anyway.
>>>
>>> There is some lag between when a commit happens
>>> and when the docs are really available. The sequence is:
>>> 1> commit (soft or hard-with-opensearcher=true doesn't matter).
>>> 2> a new searcher is opened and autowarming starts
>>> 3> until the new searcher is opened, queries continue to be served by
>>> the old searcher
>>> 4> the new searcher is fully opened
>>> 5> _new_ requests are served by the new searcher.
>>> 6> the last request is finished by the old searcher and it's closed.
>>>
>>> So what's probably happening is that you send docs and then send a
>>> query and Solr is still in step <3>. You can look at your admin UI
>>> pluginst/stats page or your log to see how long it takes for a
>>> searcher to open and adjust your expectations accordingly.
>>>
>>> If you want to fetch only the document (not try to get it by a
>>> search), Real Time Get is designed to insure that you always get the
>>> most recent copy whether it's searchable or not.
>>>
>>> All that said, Solr wasn't designed for autocommits that are that
>>> frequent. That's why the documentation talks about _Near_ Real Time.
>>> You may need to adjust your expectations.
>>>
>>> Best,
>>> Erick
>>>
>>> On Sun, Dec 18, 2016 at 6:49 AM, Dorian Hoxha 
>>> wrote:
>>>
 There's a very high probability that you're using the wrong tool for the
 job if you need 1ms softCommit time. Especially when you always need it

>>> (ex
>>>
 there are apps where you need commit-after-insert very rarely).

 So explain what you're using it for ?

 On Sun, Dec 18, 2016 at 3:38 PM, Lasitha Wattaladeniya <

>>> watt...@gmail.com>
>>>
 wrote:

 Hi Furkan,
>
> Thanks for the links. I had read the first one but not the second one.
> I
> did read it after you sent. So in my current solrconfig.xml settings
>
 below
>>>
 are the configurations,
>
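> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:1}</maxTime>
> </autoSoftCommit>
>
>
> <autoCommit>
>   <maxTime>15000</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>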
>
> The problem i'm facing is, just after adding the documents to solr
> using
> solrj, when I retrieve data from solr I am not getting the updated
>
 results.
>>>
 This happens time to time. Most of the time I get the correct data but
>
 in
>>>
 some occasions I get wrong results. so as you suggest, what the best
> practice to use here ? , should I wait 1 mili second before calling for
> updated results ?
>
> Regards,
> Lasitha
>
> Lasitha Wattaladeniya
> Software Engineer
>
> Mobile : +6593896893
> Blog : techreadme.blogspot.com

Re: Soft commit and reading data just after the commit

2016-12-19 Thread Lasitha Wattaladeniya

Hi Shawn,

Thanks for the detailed explanation. I understand now that I won't be
able to achieve a 100 ms soft commit interval with my hardware setup.
However, let's say someone has the following requirement (quoted from my
previous mail):

*Requirement*: we show a list of entries on a page, and each entry has a
per-user read/unread flag. The data for the listing is fetched from Solr,
so the user can see whether an entry was previously read. When a user
opens an entry by clicking on it, we set the database flag to READ and use
real-time indexing to update the Solr index. When the user closes the full
view of the entry and goes back to the listing page, the data fetched from
Solr should already show READ.

Can't a requirement like this be met using Solr? (That is, without
manipulating the previously fetched result list on the client, because at
some point we will have to go back to the search results from Solr, and by
then they should be up to date.)

Regards,
Lasitha

Lasitha Wattaladeniya
Software Engineer

Mobile : +6593896893
Blog : techreadme.blogspot.com

On Mon, Dec 19, 2016 at 6:37 PM, Shawn Heisey  wrote:

> On 12/18/2016 7:09 PM, Lasitha Wattaladeniya wrote:
> > @eric : thanks for the lengthy reply. So let's say I increase the
> > autosoftcommit time out to may be 100 ms. In that case do I have to
> > wait much that time from client side before calling search ?. What's
> > the correct way of achieving this?
>
> Some of the following is covered by the links you've already received.
> Some of it may be new information.
>
> Before you can see a change you've just made, you will need to wait for
> the commit to be fired (in this case, the autoSoftCommit time) plus
> however long it actually takes to complete the commit and open a new
> searcher.  Opening the searcher is the expensive part.
>
> What I typically recommend that people do is have the autoSoftCommit
> time as long as they can stand, with 60-300 seconds as a "typical"
> value.  That's a setting of 60000 to 300000 (the value is in
> milliseconds).  What you are trying to achieve is much faster, and
> much more difficult.
>
> 100 milliseconds will typically be far too small a value unless your
> index is extremely small or your hardware is incredibly fast and has a
> lot of memory.  With a value of 100, you'll want each of those soft
> commits (which do open a new searcher) to take FAR less than 100
> milliseconds to complete.  This kind of speed can be difficult to
> achieve, especially if the index is large.
>
> To have any hope of fast commit times, you will need to set
> autowarmCount on all Solr caches to zero.  If you are indexing
> frequently enough, you might even want to completely disable Solr's
> internal caches, because they may be providing no benefit.
>
> You will want to have enough extra memory that your operating system can
> cache the vast majority (or even maybe all) of your index.
>
> https://wiki.apache.org/solr/SolrPerformanceProblems
>
> Some other info that's helpful for understanding why plenty of *spare*
> memory (not allocated by programs) is necessary for good performance:
>
> https://en.wikipedia.org/wiki/Page_cache
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
> The reason in a nutshell:  Disks are EXTREMELY slow.  Memory is very fast.
>
> Thanks,
> Shawn
>
>

Re: Soft commit and reading data just after the commit

2016-12-19 Thread Hendrik Haddorp


Hi,

the SolrJ API has this method: SolrClient.commit(String collection,
boolean waitFlush, boolean waitSearcher, boolean softCommit).
My assumption so far has been that when you set waitSearcher to true, the
method call only returns once a search would find the new data, which
sounds like what you want. I have used this already and it seemed to work
just fine.
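
For anyone not using SolrJ, here is a minimal sketch of the same idea over
Solr's HTTP update handler, written in Python with only the standard
library. The request parameters commit, softCommit and waitSearcher are
the ones the SolrJ call sets; the base URL and the collection name
"entries" are placeholders, and the helper name soft_commit_url is made up
for this example. Keep in mind that frequent explicit commits are what
triggered the maxWarmingSearchers error reported earlier in this thread,
so this only suits low update rates.

```python
from urllib.parse import urlencode


def soft_commit_url(base_url: str, collection: str, wait_searcher: bool = True) -> str:
    """Build an update URL that asks Solr for an explicit soft commit.

    With waitSearcher=true the request is meant to return only after the
    new searcher has been opened and registered.
    """
    params = {
        "commit": "true",
        "softCommit": "true",
        "waitSearcher": "true" if wait_searcher else "false",
    }
    return f"{base_url.rstrip('/')}/{collection}/update?{urlencode(params)}"


# Placeholder host and collection; send the resulting URL with any HTTP client.
print(soft_commit_url("http://localhost:8983/solr", "entries"))
```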


regards,
Hendrik


Re: Soft commit and reading data just after the commit

2016-12-19 Thread Shawn Heisey

On 12/18/2016 7:09 PM, Lasitha Wattaladeniya wrote:
> @eric : thanks for the lengthy reply. So let's say I increase the
> autosoftcommit time out to may be 100 ms. In that case do I have to
> wait much that time from client side before calling search ?. What's
> the correct way of achieving this? 

Some of the following is covered by the links you've already received. 
Some of it may be new information.

Before you can see a change you've just made, you will need to wait for
the commit to be fired (in this case, the autoSoftCommit time) plus
however long it actually takes to complete the commit and open a new
searcher.  Opening the searcher is the expensive part.

What I typically recommend that people do is have the autoSoftCommit
time as long as they can stand, with 60-300 seconds as a "typical"
value.  That's a setting of 60000 to 300000 (the value is in
milliseconds).  What you are trying to achieve is much faster, and
much more difficult.
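
The arithmetic in the two paragraphs above can be sketched in a few lines
of Python. This is only an illustration: the function names are made up,
and the 250 ms searcher-open time is a hypothetical figure, not a
measurement from any real index.

```python
def max_time_ms(seconds: float) -> int:
    """Convert a commit interval in seconds to the millisecond value maxTime expects."""
    return int(seconds * 1000)


def worst_case_visibility_ms(auto_soft_commit_ms: int, searcher_open_ms: int) -> int:
    """Upper bound on the delay before an update is searchable:
    wait for the soft commit to fire, then for the new searcher to open."""
    return auto_soft_commit_ms + searcher_open_ms


# The "typical" 60-300 second range expressed as maxTime values.
print(max_time_ms(60), max_time_ms(300))

# Hypothetical case: 100 ms autoSoftCommit and a searcher that takes 250 ms to open.
print(worst_case_visibility_ms(100, 250))
```

In the hypothetical case the searcher takes longer to open than the commit
interval itself, which is exactly the situation that makes such a small
value impractical.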

100 milliseconds will typically be far too small a value unless your
index is extremely small or your hardware is incredibly fast and has a
lot of memory.  With a value of 100, you'll want each of those soft
commits (which do open a new searcher) to take FAR less than 100
milliseconds to complete.  This kind of speed can be difficult to
achieve, especially if the index is large.

To have any hope of fast commit times, you will need to set
autowarmCount on all Solr caches to zero.  If you are indexing
frequently enough, you might even want to completely disable Solr's
internal caches, because they may be providing no benefit.

You will want to have enough extra memory that your operating system can
cache the vast majority (or even maybe all) of your index.

https://wiki.apache.org/solr/SolrPerformanceProblems

Some other info that's helpful for understanding why plenty of *spare*
memory (not allocated by programs) is necessary for good performance:

https://en.wikipedia.org/wiki/Page_cache
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

The reason in a nutshell:  Disks are EXTREMELY slow.  Memory is very fast.

Thanks,
Shawn

Re: Soft commit and reading data just after the commit

2016-12-18 Thread Lasitha Wattaladeniya

I hadn't looked much into the Real Time Get handler.  Thanks for
mentioning it.  I'm checking it now.

On 19 Dec 2016 10:09, "Lasitha Wattaladeniya"  wrote:

> Hi all,
>
> Thanks for your replies,
>
> @dorian : the requirement is,  we are showing a list of entries on a page.
> For each user there's a read / unread flag.  The data for listing is
> fetched from solr. And you can see the entry was previously read or not. So
> when a user views an entry by clicking.  We are updating the database flag
> to READ and use real time indexing to update solr entry.  So when the user
> close the full view of the entry and go back to entry listing page,  the
> data fetched from solr should be updated to READ. That's the use case we
> are trying to fix.
>
> @eric : thanks for the lengthy reply.  So let's say I increase the
> autosoftcommit time out to may be 100 ms.  In that case do I have to wait
> much that time from client side before calling search ?.  What's the
> correct way of achieving this?
>
> Regards,
> Lasitha
>
> On 18 Dec 2016 23:52, "Erick Erickson"  wrote:
>
>> 1 ms autocommit is far too frequent. And it's not
>> helping you anyway.
>>
>> There is some lag between when a commit happens
>> and when the docs are really available. The sequence is:
>> 1> commit (soft or hard-with-opensearcher=true doesn't matter).
>> 2> a new searcher is opened and autowarming starts
>> 3> until the new searcher is opened, queries continue to be served by
>> the old searcher
>> 4> the new searcher is fully opened
>> 5> _new_ requests are served by the new searcher.
>> 6> the last request is finished by the old searcher and it's closed.
>>
>> So what's probably happening is that you send docs and then send a
>> query and Solr is still in step <3>. You can look at your admin UI
>> pluginst/stats page or your log to see how long it takes for a
>> searcher to open and adjust your expectations accordingly.
>>
>> If you want to fetch only the document (not try to get it by a
>> search), Real Time Get is designed to insure that you always get the
>> most recent copy whether it's searchable or not.
>>
>> All that said, Solr wasn't designed for autocommits that are that
>> frequent. That's why the documentation talks about _Near_ Real Time.
>> You may need to adjust your expectations.
>>
>> Best,
>> Erick
>>
>> On Sun, Dec 18, 2016 at 6:49 AM, Dorian Hoxha 
>> wrote:
>> > There's a very high probability that you're using the wrong tool for the
>> > job if you need 1ms softCommit time. Especially when you always need it
>> (ex
>> > there are apps where you need commit-after-insert very rarely).
>> >
>> > So explain what you're using it for ?
>> >
>> > On Sun, Dec 18, 2016 at 3:38 PM, Lasitha Wattaladeniya <
>> watt...@gmail.com>
>> > wrote:
>> >
>> >> Hi Furkan,
>> >>
>> >> Thanks for the links. I had read the first one but not the second one.
>> I
>> >> did read it after you sent. So in my current solrconfig.xml settings
>> below
>> >> are the configurations,
>> >>
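>> >> <autoSoftCommit>
>> >>   <maxTime>${solr.autoSoftCommit.maxTime:1}</maxTime>
>> >> </autoSoftCommit>
>> >>
>> >>
>> >> <autoCommit>
>> >>   <maxTime>15000</maxTime>
>> >>   <openSearcher>false</openSearcher>
>> >> </autoCommit>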
>> >>
>> >> The problem i'm facing is, just after adding the documents to solr
>> using
>> >> solrj, when I retrieve data from solr I am not getting the updated
>> results.
>> >> This happens time to time. Most of the time I get the correct data but
>> in
>> >> some occasions I get wrong results. so as you suggest, what the best
>> >> practice to use here ? , should I wait 1 mili second before calling for
>> >> updated results ?
>> >>
>> >> Regards,
>> >> Lasitha
>> >>
>> >> Lasitha Wattaladeniya
>> >> Software Engineer
>> >>
>> >> Mobile : +6593896893
>> >> Blog : techreadme.blogspot.com
>> >>
>> >> On Sun, Dec 18, 2016 at 8:46 PM, Furkan KAMACI > >
>> >> wrote:
>> >>
>> >> > Hi Lasitha,
>> >> >
>> >> > First of all, did you check these:
>> >> >
>> >> > https://cwiki.apache.org/confluence/display/solr/Near+
>> >> Real+Time+Searching
>> >> > https://lucidworks.com/blog/2013/08/23/understanding-
>> >> > transaction-logs-softcommit-and-commit-in-sorlcloud/
>> >> >
>> >> > after that, if you cannot adjust your configuration you can give more
>> >> > information and we can find a solution.
>> >> >
>> >> > Kind Regards,
>> >> > Furkan KAMACI
>> >> >
>> >> > On Sun, Dec 18, 2016 at 2:28 PM, Lasitha Wattaladeniya <
>> >> watt...@gmail.com>
>> >> > wrote:
>> >> >
>> >> >> Hi furkan,
>> >> >>
>> >> >> Thanks for your reply, it is generally a query heavy system. We are
>> >> using
>> >> >> realtime indexing for editing the available data
>> >> >>
>> >> >> Regards,
>> >> >> Lasitha
>> >> >>
>> >> >> Lasitha Wattaladeniya
>> >> >> Software Engineer
>> >> >>
>> >> >> Mobile : +6593896893 <+65%209389%206893>
>> >> >> Blog : techreadme.blogspot.com
>> >> >>
>> >> >> On Sun, Dec 18, 2016 at 8:12 PM, Furkan KAMACI <
>> furkankam...@gmail.com>
>> >> >> wrote:
>> >> >>
>> >> >>> Hi Lasitha,

Re: Soft commit and reading data just after the commit

2016-12-18 Thread Lasitha Wattaladeniya

Hi all,

Thanks for your replies,

@Dorian: the requirement is as follows. We show a list of entries on a
page, and each entry has a per-user read/unread flag. The data for the
listing is fetched from Solr, so the user can see whether an entry was
previously read. When a user opens an entry by clicking on it, we set the
database flag to READ and use real-time indexing to update the Solr entry.
When the user closes the full view of the entry and goes back to the
listing page, the data fetched from Solr should already show READ. That's
the use case we are trying to fix.

@Erick: thanks for the lengthy reply. Let's say I increase the
autoSoftCommit timeout to maybe 100 ms. In that case, do I have to wait
that long on the client side before calling search? What's the correct
way of achieving this?

Regards,
Lasitha

On 18 Dec 2016 23:52, "Erick Erickson"  wrote:

> 1 ms autocommit is far too frequent. And it's not
> helping you anyway.
>
> There is some lag between when a commit happens
> and when the docs are really available. The sequence is:
> 1> commit (soft or hard-with-opensearcher=true doesn't matter).
> 2> a new searcher is opened and autowarming starts
> 3> until the new searcher is opened, queries continue to be served by
> the old searcher
> 4> the new searcher is fully opened
> 5> _new_ requests are served by the new searcher.
> 6> the last request is finished by the old searcher and it's closed.
>
> So what's probably happening is that you send docs and then send a
> query and Solr is still in step <3>. You can look at your admin UI
> pluginst/stats page or your log to see how long it takes for a
> searcher to open and adjust your expectations accordingly.
>
> If you want to fetch only the document (not try to get it by a
> search), Real Time Get is designed to insure that you always get the
> most recent copy whether it's searchable or not.
>
> All that said, Solr wasn't designed for autocommits that are that
> frequent. That's why the documentation talks about _Near_ Real Time.
> You may need to adjust your expectations.
>
> Best,
> Erick
>
> On Sun, Dec 18, 2016 at 6:49 AM, Dorian Hoxha 
> wrote:
> > There's a very high probability that you're using the wrong tool for the
> > job if you need 1ms softCommit time. Especially when you always need it
> (ex
> > there are apps where you need commit-after-insert very rarely).
> >
> > So explain what you're using it for ?
> >
> > On Sun, Dec 18, 2016 at 3:38 PM, Lasitha Wattaladeniya <
> watt...@gmail.com>
> > wrote:
> >
> >> Hi Furkan,
> >>
> >> Thanks for the links. I had read the first one but not the second one. I
> >> did read it after you sent. So in my current solrconfig.xml settings
> below
> >> are the configurations,
> >>
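> >> <autoSoftCommit>
> >>   <maxTime>${solr.autoSoftCommit.maxTime:1}</maxTime>
> >> </autoSoftCommit>
> >>
> >>
> >> <autoCommit>
> >>   <maxTime>15000</maxTime>
> >>   <openSearcher>false</openSearcher>
> >> </autoCommit>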
> >>
> >> The problem i'm facing is, just after adding the documents to solr using
> >> solrj, when I retrieve data from solr I am not getting the updated
> results.
> >> This happens time to time. Most of the time I get the correct data but
> in
> >> some occasions I get wrong results. so as you suggest, what the best
> >> practice to use here ? , should I wait 1 mili second before calling for
> >> updated results ?
> >>
> >> Regards,
> >> Lasitha
> >>
> >> Lasitha Wattaladeniya
> >> Software Engineer
> >>
> >> Mobile : +6593896893
> >> Blog : techreadme.blogspot.com
> >>
> >> On Sun, Dec 18, 2016 at 8:46 PM, Furkan KAMACI 
> >> wrote:
> >>
> >> > Hi Lasitha,
> >> >
> >> > First of all, did you check these:
> >> >
> >> > https://cwiki.apache.org/confluence/display/solr/Near+
> >> Real+Time+Searching
> >> > https://lucidworks.com/blog/2013/08/23/understanding-
> >> > transaction-logs-softcommit-and-commit-in-sorlcloud/
> >> >
> >> > after that, if you cannot adjust your configuration you can give more
> >> > information and we can find a solution.
> >> >
> >> > Kind Regards,
> >> > Furkan KAMACI
> >> >
> >> > On Sun, Dec 18, 2016 at 2:28 PM, Lasitha Wattaladeniya <
> >> watt...@gmail.com>
> >> > wrote:
> >> >
> >> >> Hi furkan,
> >> >>
> >> >> Thanks for your reply, it is generally a query heavy system. We are
> >> using
> >> >> realtime indexing for editing the available data
> >> >>
> >> >> Regards,
> >> >> Lasitha
> >> >>
> >> >> Lasitha Wattaladeniya
> >> >> Software Engineer
> >> >>
> >> >> Mobile : +6593896893 <+65%209389%206893>
> >> >> Blog : techreadme.blogspot.com
> >> >>
> >> >> On Sun, Dec 18, 2016 at 8:12 PM, Furkan KAMACI <
> furkankam...@gmail.com>
> >> >> wrote:
> >> >>
> >> >>> Hi Lasitha,
> >> >>>
> >> >>> What is your indexing / querying requirements. Do you have an index
> >> >>> heavy/light  - query heavy/light system?
> >> >>>
> >> >>> Kind Regards,
> >> >>> Furkan KAMACI
> >> >>>
> >> >>> On Sun, Dec 18, 2016 at 11:35 AM, Lasitha Wattaladeniya <
> >> >>> watt...@gmail.com>
> >> >>> wrote:
> >> >>>
> >> >>> > Hello devs,

Re: Soft commit and reading data just after the commit

2016-12-18 Thread Erick Erickson

1 ms autocommit is far too frequent. And it's not
helping you anyway.

There is some lag between when a commit happens
and when the docs are really available. The sequence is:
1> commit (soft, or hard with openSearcher=true; it doesn't matter which).
2> a new searcher is opened and autowarming starts
3> until the new searcher is opened, queries continue to be served by
the old searcher
4> the new searcher is fully opened
5> _new_ requests are served by the new searcher.
6> the last request is finished by the old searcher and it's closed.
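
To make the sequence above concrete, here is a toy model in Python. It is
not Solr's implementation; the class and method names are invented for the
illustration, and "warming" is reduced to an explicit finish_warming()
call so the window in step 3 is easy to see.

```python
class ToySolrCore:
    """Toy model: queries only ever see the currently registered searcher."""

    def __init__(self):
        self.pending = {}      # updates received but not yet committed
        self.registered = {}   # snapshot served by the current (old) searcher
        self.warming = None    # snapshot of a searcher that is still opening

    def add(self, doc_id, doc):
        self.pending[doc_id] = doc

    def commit(self):
        """Steps 1-2: start opening a new searcher that includes pending updates."""
        base = self.warming if self.warming is not None else self.registered
        self.warming = {**base, **self.pending}
        self.pending = {}

    def finish_warming(self):
        """Steps 4-5: the new searcher is registered and serves new requests."""
        if self.warming is not None:
            self.registered = self.warming
            self.warming = None

    def search(self, doc_id):
        """Step 3: until the swap happens, the old searcher answers."""
        return self.registered.get(doc_id)


core = ToySolrCore()
core.add("42", {"status": "READ"})
core.commit()
print(core.search("42"))   # still the old view: None
core.finish_warming()
print(core.search("42"))   # now the update is visible
```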

So what's probably happening is that you send docs and then send a
query while Solr is still in step <3>. You can look at the Plugins/Stats
page in your admin UI or at your log to see how long it takes for a
searcher to open, and adjust your expectations accordingly.

If you want to fetch only the document (not try to get it by a
search), Real Time Get is designed to ensure that you always get the
most recent copy whether it's searchable or not.
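
As a rough sketch, a Real Time Get request is just a lookup by id against
the /get handler. The Python helper below only builds the URL; the host,
the collection name "entries" and the document id are placeholders, and
the function name is made up. Real Time Get relies on the update log being
enabled in solrconfig.xml.

```python
from urllib.parse import urlencode


def realtime_get_url(base_url: str, collection: str, doc_id: str) -> str:
    """Build a Real Time Get URL that fetches one document by its unique key."""
    return f"{base_url.rstrip('/')}/{collection}/get?{urlencode({'id': doc_id})}"


# Placeholder values; fetch the URL with any HTTP client.
print(realtime_get_url("http://localhost:8983/solr", "entries", "entry-42"))
```

Note that this returns a single document by id. It does not make the
update visible to ordinary search requests, so a listing page built from a
query still depends on the next searcher being opened.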

All that said, Solr wasn't designed for autocommits that are that
frequent. That's why the documentation talks about _Near_ Real Time.
You may need to adjust your expectations.

Best,
Erick

On Sun, Dec 18, 2016 at 6:49 AM, Dorian Hoxha  wrote:
> There's a very high probability that you're using the wrong tool for the
> job if you need 1ms softCommit time. Especially when you always need it (ex
> there are apps where you need commit-after-insert very rarely).
>
> So explain what you're using it for ?
>
> On Sun, Dec 18, 2016 at 3:38 PM, Lasitha Wattaladeniya 
> wrote:
>
>> Hi Furkan,
>>
>> Thanks for the links. I had read the first one but not the second one. I
>> did read it after you sent. So in my current solrconfig.xml settings below
>> are the configurations,
>>
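>> <autoSoftCommit>
>>   <maxTime>${solr.autoSoftCommit.maxTime:1}</maxTime>
>> </autoSoftCommit>
>>
>>
>> <autoCommit>
>>   <maxTime>15000</maxTime>
>>   <openSearcher>false</openSearcher>
>> </autoCommit>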
>>
>> The problem i'm facing is, just after adding the documents to solr using
>> solrj, when I retrieve data from solr I am not getting the updated results.
>> This happens time to time. Most of the time I get the correct data but in
>> some occasions I get wrong results. so as you suggest, what the best
>> practice to use here ? , should I wait 1 mili second before calling for
>> updated results ?
>>
>> Regards,
>> Lasitha
>>
>> Lasitha Wattaladeniya
>> Software Engineer
>>
>> Mobile : +6593896893
>> Blog : techreadme.blogspot.com
>>
>> On Sun, Dec 18, 2016 at 8:46 PM, Furkan KAMACI 
>> wrote:
>>
>> > Hi Lasitha,
>> >
>> > First of all, did you check these:
>> >
>> > https://cwiki.apache.org/confluence/display/solr/Near+
>> Real+Time+Searching
>> > https://lucidworks.com/blog/2013/08/23/understanding-
>> > transaction-logs-softcommit-and-commit-in-sorlcloud/
>> >
>> > after that, if you cannot adjust your configuration you can give more
>> > information and we can find a solution.
>> >
>> > Kind Regards,
>> > Furkan KAMACI
>> >
>> > On Sun, Dec 18, 2016 at 2:28 PM, Lasitha Wattaladeniya <
>> watt...@gmail.com>
>> > wrote:
>> >
>> >> Hi furkan,
>> >>
>> >> Thanks for your reply, it is generally a query heavy system. We are
>> using
>> >> realtime indexing for editing the available data
>> >>
>> >> Regards,
>> >> Lasitha
>> >>
>> >> Lasitha Wattaladeniya
>> >> Software Engineer
>> >>
>> >> Mobile : +6593896893 <+65%209389%206893>
>> >> Blog : techreadme.blogspot.com
>> >>
>> >> On Sun, Dec 18, 2016 at 8:12 PM, Furkan KAMACI 
>> >> wrote:
>> >>
>> >>> Hi Lasitha,
>> >>>
>> >>> What is your indexing / querying requirements. Do you have an index
>> >>> heavy/light  - query heavy/light system?
>> >>>
>> >>> Kind Regards,
>> >>> Furkan KAMACI
>> >>>
>> >>> On Sun, Dec 18, 2016 at 11:35 AM, Lasitha Wattaladeniya <
>> >>> watt...@gmail.com>
>> >>> wrote:
>> >>>
>> >>> > Hello devs,
>> >>> >
>> >>> > I'm here with another problem i'm facing. I'm trying to do a commit
>> >>> (soft
>> >>> > commit) through solrj and just after the commit, retrieve the data
>> from
>> >>> > solr (requirement is to get updated data list).
>> >>> >
>> >>> > I'm using soft commit instead of the hard commit, is previously I got
>> >>> an
>> >>> > error "Exceeded limit of maxWarmingSearchers=2, try again later"
>> >>> because of
>> >>> > too many commit requests. Now I have removed the explicit commit and
>> >>> has
>> >>> > let the solr to do the commit using autoSoftCommit *(1 mili second)*
>> >>> and
>> >>> > autoCommit *(30 seconds)* configurations. Now I'm not getting any
>> >>> errors
>> >>> > when i'm committing frequently.
>> >>> >
>> >>> > The problem i'm facing now is, I'm not getting the updated data when
>> I
>> >>> > fetch from solr just after the soft commit. So in this case what are
>> >>> the
>> >>> > best practices to use ? to wait 1 mili second before retrieving data
>> >>> after
>> >>> > soft commit ? I don't feel like waiting from client side is a good
>> >>> option.
>> >>> > Please give me some help from your expert knowledge

Re: Soft commit and reading data just after the commit

2016-12-18 Thread Dorian Hoxha

There's a very high probability that you're using the wrong tool for the
job if you need a 1 ms softCommit time, especially if you always need it
(e.g., there are apps where you need commit-after-insert only very rarely).

So, can you explain what you're using it for?

On Sun, Dec 18, 2016 at 3:38 PM, Lasitha Wattaladeniya 
wrote:

> Hi Furkan,
>
> Thanks for the links. I had read the first one but not the second one. I
> did read it after you sent. So in my current solrconfig.xml settings below
> are the configurations,
>
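> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:1}</maxTime>
> </autoSoftCommit>
>
>
> <autoCommit>
>   <maxTime>15000</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>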
>
> The problem i'm facing is, just after adding the documents to solr using
> solrj, when I retrieve data from solr I am not getting the updated results.
> This happens time to time. Most of the time I get the correct data but in
> some occasions I get wrong results. so as you suggest, what the best
> practice to use here ? , should I wait 1 mili second before calling for
> updated results ?
>
> Regards,
> Lasitha
>
> Lasitha Wattaladeniya
> Software Engineer
>
> Mobile : +6593896893
> Blog : techreadme.blogspot.com
>
> On Sun, Dec 18, 2016 at 8:46 PM, Furkan KAMACI 
> wrote:
>
> > Hi Lasitha,
> >
> > First of all, did you check these:
> >
> > https://cwiki.apache.org/confluence/display/solr/Near+
> Real+Time+Searching
> > https://lucidworks.com/blog/2013/08/23/understanding-
> > transaction-logs-softcommit-and-commit-in-sorlcloud/
> >
> > after that, if you cannot adjust your configuration you can give more
> > information and we can find a solution.
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > On Sun, Dec 18, 2016 at 2:28 PM, Lasitha Wattaladeniya <
> watt...@gmail.com>
> > wrote:
> >
> >> Hi furkan,
> >>
> >> Thanks for your reply, it is generally a query heavy system. We are
> using
> >> realtime indexing for editing the available data
> >>
> >> Regards,
> >> Lasitha
> >>
> >> Lasitha Wattaladeniya
> >> Software Engineer
> >>
> >> Mobile : +6593896893 <+65%209389%206893>
> >> Blog : techreadme.blogspot.com
> >>
> >> On Sun, Dec 18, 2016 at 8:12 PM, Furkan KAMACI 
> >> wrote:
> >>
> >>> Hi Lasitha,
> >>>
> >>> What is your indexing / querying requirements. Do you have an index
> >>> heavy/light  - query heavy/light system?
> >>>
> >>> Kind Regards,
> >>> Furkan KAMACI
> >>>
> >>> On Sun, Dec 18, 2016 at 11:35 AM, Lasitha Wattaladeniya <
> >>> watt...@gmail.com>
> >>> wrote:
> >>>
> >>> > Hello devs,
> >>> >
> >>> > I'm here with another problem i'm facing. I'm trying to do a commit
> >>> (soft
> >>> > commit) through solrj and just after the commit, retrieve the data
> from
> >>> > solr (requirement is to get updated data list).
> >>> >
> >>> > I'm using soft commit instead of the hard commit, is previously I got
> >>> an
> >>> > error "Exceeded limit of maxWarmingSearchers=2, try again later"
> >>> because of
> >>> > too many commit requests. Now I have removed the explicit commit and
> >>> has
> >>> > let the solr to do the commit using autoSoftCommit *(1 mili second)*
> >>> and
> >>> > autoCommit *(30 seconds)* configurations. Now I'm not getting any
> >>> errors
> >>> > when i'm committing frequently.
> >>> >
> >>> > The problem i'm facing now is, I'm not getting the updated data when
> I
> >>> > fetch from solr just after the soft commit. So in this case what are
> >>> the
> >>> > best practices to use ? to wait 1 mili second before retrieving data
> >>> after
> >>> > soft commit ? I don't feel like waiting from client side is a good
> >>> option.
> >>> > Please give me some help from your expert knowledge
> >>> >
> >>> > Best regards,
> >>> > Lasitha Wattaladeniya
> >>> > Software Engineer
> >>> >
> >>> > Mobile : +6593896893
> >>> > Blog : techreadme.blogspot.com
> >>> >
> >>>
> >>
> >>
> >
>

Re: Soft commit and reading data just after the commit

2016-12-18 Thread Lasitha Wattaladeniya

Hi Furkan,

Thanks for the links. I had read the first one but not the second; I read
it after you sent it. My current solrconfig.xml settings are below:

<autoSoftCommit>
   <maxTime>${solr.autoSoftCommit.maxTime:1}</maxTime>
 </autoSoftCommit>


<autoCommit>
   <maxTime>15000</maxTime>
   <openSearcher>false</openSearcher>
 </autoCommit>

The problem I'm facing is that, just after adding documents to Solr using
SolrJ, I don't always get the updated results when I retrieve data from
Solr. This happens from time to time: most of the time I get the correct
data, but on some occasions I get stale results. So, as you suggest, what
is the best practice here? Should I wait 1 millisecond before asking for
the updated results?
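
A side note on the ${solr.autoSoftCommit.maxTime:1} value in the
configuration above: the part before the colon is a property name and the
part after it is the default, so the effective interval is 1 ms unless the
property is set, for example with -Dsolr.autoSoftCommit.maxTime=... at
startup. The Python sketch below only mimics that substitution rule to
show the behaviour; it is not Solr's code, and the resolve() helper is
made up for the illustration.

```python
import re

# Matches ${name} or ${name:default}
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(value: str, properties: dict) -> str:
    """Replace ${name:default} with the property value, or the default if unset."""

    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in properties:
            return properties[name]
        if default is not None:
            return default
        raise KeyError(f"no value or default for property {name!r}")

    return _PLACEHOLDER.sub(substitute, value)


setting = "${solr.autoSoftCommit.maxTime:1}"
print(resolve(setting, {}))                                        # default applies
print(resolve(setting, {"solr.autoSoftCommit.maxTime": "60000"}))  # property wins
```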

Regards,
Lasitha

Lasitha Wattaladeniya
Software Engineer

Mobile : +6593896893
Blog : techreadme.blogspot.com

On Sun, Dec 18, 2016 at 8:46 PM, Furkan KAMACI 
wrote:

> Hi Lasitha,
>
> First of all, did you check these:
>
> https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching
> https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> after that, if you cannot adjust your configuration you can give more
> information and we can find a solution.
>
> Kind Regards,
> Furkan KAMACI
>
> On Sun, Dec 18, 2016 at 2:28 PM, Lasitha Wattaladeniya 
> wrote:
>
>> Hi furkan,
>>
>> Thanks for your reply, it is generally a query heavy system. We are using
>> realtime indexing for editing the available data
>>
>> Regards,
>> Lasitha
>>
>> Lasitha Wattaladeniya
>> Software Engineer
>>
>> Mobile : +6593896893 <+65%209389%206893>
>> Blog : techreadme.blogspot.com
>>
>> On Sun, Dec 18, 2016 at 8:12 PM, Furkan KAMACI 
>> wrote:
>>
>>> Hi Lasitha,
>>>
>>> What is your indexing / querying requirements. Do you have an index
>>> heavy/light  - query heavy/light system?
>>>
>>> Kind Regards,
>>> Furkan KAMACI
>>>
>>> On Sun, Dec 18, 2016 at 11:35 AM, Lasitha Wattaladeniya <
>>> watt...@gmail.com>
>>> wrote:
>>>
>>> > Hello devs,
>>> >
>>> > I'm here with another problem i'm facing. I'm trying to do a commit
>>> (soft
>>> > commit) through solrj and just after the commit, retrieve the data from
>>> > solr (requirement is to get updated data list).
>>> >
>>> > I'm using soft commit instead of the hard commit, is previously I got
>>> an
>>> > error "Exceeded limit of maxWarmingSearchers=2, try again later"
>>> because of
>>> > too many commit requests. Now I have removed the explicit commit and
>>> has
>>> > let the solr to do the commit using autoSoftCommit *(1 mili second)*
>>> and
>>> > autoCommit *(30 seconds)* configurations. Now I'm not getting any
>>> errors
>>> > when i'm committing frequently.
>>> >
>>> > The problem i'm facing now is, I'm not getting the updated data when I
>>> > fetch from solr just after the soft commit. So in this case what are
>>> the
>>> > best practices to use ? to wait 1 mili second before retrieving data
>>> after
>>> > soft commit ? I don't feel like waiting from client side is a good
>>> option.
>>> > Please give me some help from your expert knowledge
>>> >
>>> > Best regards,
>>> > Lasitha Wattaladeniya
>>> > Software Engineer
>>> >
>>> > Mobile : +6593896893
>>> > Blog : techreadme.blogspot.com
>>> >
>>>
>>
>>
>

Re: Soft commit and reading data just after the commit

2016-12-18 Thread Furkan KAMACI

Hi Lasitha,

First of all, did you check these:

https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching
https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

after that, if you cannot adjust your configuration you can give more
information and we can find a solution.

Kind Regards,
Furkan KAMACI

On Sun, Dec 18, 2016 at 2:28 PM, Lasitha Wattaladeniya 
wrote:

> Hi furkan,
>
> Thanks for your reply, it is generally a query heavy system. We are using
> realtime indexing for editing the available data
>
> Regards,
> Lasitha
>
> Lasitha Wattaladeniya
> Software Engineer
>
> Mobile : +6593896893 <+65%209389%206893>
> Blog : techreadme.blogspot.com
>
> On Sun, Dec 18, 2016 at 8:12 PM, Furkan KAMACI 
> wrote:
>
>> Hi Lasitha,
>>
>> What is your indexing / querying requirements. Do you have an index
>> heavy/light  - query heavy/light system?
>>
>> Kind Regards,
>> Furkan KAMACI
>>
>> On Sun, Dec 18, 2016 at 11:35 AM, Lasitha Wattaladeniya <
>> watt...@gmail.com>
>> wrote:
>>
>> > Hello devs,
>> >
>> > I'm here with another problem i'm facing. I'm trying to do a commit
>> (soft
>> > commit) through solrj and just after the commit, retrieve the data from
>> > solr (requirement is to get updated data list).
>> >
>> > I'm using soft commit instead of the hard commit, is previously I got an
>> > error "Exceeded limit of maxWarmingSearchers=2, try again later"
>> because of
>> > too many commit requests. Now I have removed the explicit commit and has
>> > let the solr to do the commit using autoSoftCommit *(1 mili second)* and
>> > autoCommit *(30 seconds)* configurations. Now I'm not getting any errors
>> > when i'm committing frequently.
>> >
>> > The problem i'm facing now is, I'm not getting the updated data when I
>> > fetch from solr just after the soft commit. So in this case what are the
>> > best practices to use ? to wait 1 mili second before retrieving data
>> after
>> > soft commit ? I don't feel like waiting from client side is a good
>> option.
>> > Please give me some help from your expert knowledge
>> >
>> > Best regards,
>> > Lasitha Wattaladeniya
>> > Software Engineer
>> >
>> > Mobile : +6593896893
>> > Blog : techreadme.blogspot.com
>> >
>>
>
>

Re: Soft commit and reading data just after the commit

2016-12-18 Thread Lasitha Wattaladeniya

Hi furkan,

Thanks for your reply. It is generally a query-heavy system; we use
real-time indexing for edits to the existing data.

Regards,
Lasitha

Lasitha Wattaladeniya
Software Engineer

Mobile : +6593896893
Blog : techreadme.blogspot.com

On Sun, Dec 18, 2016 at 8:12 PM, Furkan KAMACI 
wrote:

> Hi Lasitha,
>
> What is your indexing / querying requirements. Do you have an index
> heavy/light  - query heavy/light system?
>
> Kind Regards,
> Furkan KAMACI
>
> On Sun, Dec 18, 2016 at 11:35 AM, Lasitha Wattaladeniya  >
> wrote:
>
> > Hello devs,
> >
> > I'm here with another problem i'm facing. I'm trying to do a commit (soft
> > commit) through solrj and just after the commit, retrieve the data from
> > solr (requirement is to get updated data list).
> >
> > I'm using soft commit instead of the hard commit, is previously I got an
> > error "Exceeded limit of maxWarmingSearchers=2, try again later" because
> of
> > too many commit requests. Now I have removed the explicit commit and has
> > let the solr to do the commit using autoSoftCommit *(1 mili second)* and
> > autoCommit *(30 seconds)* configurations. Now I'm not getting any errors
> > when i'm committing frequently.
> >
> > The problem i'm facing now is, I'm not getting the updated data when I
> > fetch from solr just after the soft commit. So in this case what are the
> > best practices to use ? to wait 1 mili second before retrieving data
> after
> > soft commit ? I don't feel like waiting from client side is a good
> option.
> > Please give me some help from your expert knowledge
> >
> > Best regards,
> > Lasitha Wattaladeniya
> > Software Engineer
> >
> > Mobile : +6593896893
> > Blog : techreadme.blogspot.com
> >
>

Re: Soft commit and reading data just after the commit

2016-12-18 Thread Furkan KAMACI

Hi Lasitha,

What are your indexing / querying requirements? Is your system index
heavy or light, and query heavy or light?

Kind Regards,
Furkan KAMACI

On Sun, Dec 18, 2016 at 11:35 AM, Lasitha Wattaladeniya 
wrote:

> Hello devs,
>
> I'm here with another problem i'm facing. I'm trying to do a commit (soft
> commit) through solrj and just after the commit, retrieve the data from
> solr (requirement is to get updated data list).
>
> I'm using soft commit instead of the hard commit, is previously I got an
> error "Exceeded limit of maxWarmingSearchers=2, try again later" because of
> too many commit requests. Now I have removed the explicit commit and has
> let the solr to do the commit using autoSoftCommit *(1 mili second)* and
> autoCommit *(30 seconds)* configurations. Now I'm not getting any errors
> when i'm committing frequently.
>
> The problem i'm facing now is, I'm not getting the updated data when I
> fetch from solr just after the soft commit. So in this case what are the
> best practices to use ? to wait 1 mili second before retrieving data after
> soft commit ? I don't feel like waiting from client side is a good option.
> Please give me some help from your expert knowledge
>
> Best regards,
> Lasitha Wattaladeniya
> Software Engineer
>
> Mobile : +6593896893
> Blog : techreadme.blogspot.com
>

Re: Soft commit from curl

2016-10-22 Thread mimino

Got it. Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Soft-commit-from-curl-tp4302288p4302615.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Soft commit from curl

2016-10-21 Thread Erick Erickson

The best way is to look at your Solr logs. When you see the commit
message, you'll see things like
"start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}"

that ought to work, as should something like:
curl blah blah/update?softCommit=true=true
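
If you want to check this programmatically rather than by eye, the flags in that log message are easy to pull apart. Below is a minimal Java sketch that parses the `commit{...}` portion of a line in the format shown above and reports the `softCommit` flag. The class and method names are made up for illustration, and it assumes the log format matches the sample exactly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CommitLogLine {
    /** Parses the key=value flags between '{' and '}' of a "start commit{...}" message. */
    static Map<String, String> flags(String line) {
        int open = line.indexOf('{');
        int close = line.lastIndexOf('}');
        if (open < 0 || close < open) {
            throw new IllegalArgumentException("no commit{...} block in: " + line);
        }
        Map<String, String> out = new LinkedHashMap<>();
        for (String pair : line.substring(open + 1, close).split(",")) {
            int eq = pair.indexOf('=');
            if (eq > 0) { // skips the empty entry produced by the leading comma
                out.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return out;
    }

    /** True when the log message records softCommit=true. */
    static boolean isSoftCommit(String line) {
        return Boolean.parseBoolean(flags(line).get("softCommit"));
    }

    public static void main(String[] args) {
        String sample = "start commit{,optimize=false,openSearcher=false,waitSearcher=true,"
                + "expungeDeletes=false,softCommit=false,prepareCommit=false}";
        System.out.println(flags(sample));
        System.out.println("soft commit? " + isSoftCommit(sample));
    }
}
```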

Best,
Erick

On Thu, Oct 20, 2016 at 12:23 PM, Michal Danilák  wrote:
> Does the following command issue soft commit or hard commit?
>
> curl http://localhost:8984/solr/update?softCommit=true -H "Content-Type:
> text/xml" --data-binary ''
>
> How to find out which commit was triggered? Can I get it somewhere in logs?
>
> Thanks.

Re: Soft commit does not affecting query performance

2016-04-13 Thread Bhaumik Joshi

Hi Bill,


Please find below reference.

http://www.cloudera.com/documentation/enterprise/5-4-x/topics/search_tuning_solr.html
* "Enable soft commits and set the value to the largest value that 
meets your requirements. The default value of 1000 (1 second) is too aggressive 
for some environments."


Thanks & Regards,

Bhaumik Joshi
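
For reference, the averages in the readings quoted below can be re-derived from the totals, which also makes the comparison between the two intervals explicit. This is a small Java sketch using only the figures from the quoted test. The class name is hypothetical, and integer division is assumed because it reproduces the truncated averages in the original readings.

```java
class SoftCommitReadings {
    // Each row: {queries per sec, total queries, total Q time (ms), total client time (ms)}
    static final long[][] ONE_SEC = {
        {1, 100, 44340, 48834}, {5, 101, 128914, 143239},
        {10, 104, 295325, 330931}, {25, 102, 675319, 793874}};
    static final long[][] ONE_MIN = {
        {1, 100, 44292, 48569}, {5, 105, 131389, 147174},
        {10, 102, 299518, 337748}, {25, 108, 742639, 865222}};

    /** Average time per query in ms, truncated like the values in the readings. */
    static long avg(long totalMillis, long queries) {
        return totalMillis / queries;
    }

    public static void main(String[] args) {
        for (int i = 0; i < ONE_SEC.length; i++) {
            long sec = avg(ONE_SEC[i][2], ONE_SEC[i][1]);
            long min = avg(ONE_MIN[i][2], ONE_MIN[i][1]);
            double diffPct = (min - sec) * 100.0 / sec;
            System.out.printf("%d q/s: avg Q time %d ms (1sec) vs %d ms (1min), diff %.1f%%%n",
                    ONE_SEC[i][0], sec, min, diffPct);
        }
    }
}
```

At every load level the two averages are within a few percent of each other, which is the "no improvement" observation in the original question.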



From: billnb...@gmail.com <billnb...@gmail.com>
Sent: Monday, April 11, 2016 7:07 AM
To: solr-user@lucene.apache.org
Subject: Re: Soft commit does not affecting query performance

Why do you think it would?

Bill Bell
Sent from mobile


> On Apr 11, 2016, at 7:48 AM, Bhaumik Joshi <bjo...@asite.com> wrote:
>
> Hi All,
>
> We are doing query performance test with different soft commit intervals. In 
> the test with 1sec of soft commit interval and 1min of soft commit interval 
> we didn't notice any improvement in query timings.
>
>
>
> We did test with SolrMeter (Standalone java tool for stress tests with Solr) 
> for 1sec soft commit and 1min soft commit.
>
> Index stats of test solr cloud: 0.7 million documents and 1 GB index size.
>
> Solr cloud has 2 shard and each shard has one replica.
>
>
>
> Please find below detailed test readings: (all timings are in milliseconds)
>
>
> Soft commit - 1sec
> Queries per sec  Updates per sec  Total Queries  Total Q time  Avg Q Time  Total Client time  Avg Client time
> 1                5                100            44340         443         48834              488
> 5                5                101            128914        1276        143239             1418
> 10               5                104            295325        2839        330931             3182
> 25               5                102            675319        6620        793874             7783
>
> Soft commit - 1min
> Queries per sec  Updates per sec  Total Queries  Total Q time  Avg Q Time  Total Client time  Avg Client time
> 1                5                100            44292         442         48569              485
> 5                5                105            131389        1251        147174             1401
> 10               5                102            299518        2936        337748             3311
> 25               5                108            742639        6876        865222             8011
>
> As theory suggests soft commit affects query performance but in my case it 
> doesn't. Can you put some light on this?
> Also suggest if I am missing something here.
>
> Regards,
> Bhaumik Joshi
>
> [Asite]
>
> The Hyperloop Station Design Competition - A 48hr design collaboration, from 
> mid-day, 23rd May 2016.
> REGISTER HERE http://www.buildearthlive.com/hyperloop
>
> [Build Earth Live Hyperloop]<http://www.buildearthlive.com/hyperloop>
>
> [CC Award Winners 2015]

Re: Soft commit does not affecting query performance

2016-04-11 Thread billnbell

Why do you think it would?

Bill Bell
Sent from mobile


> On Apr 11, 2016, at 7:48 AM, Bhaumik Joshi  wrote:
> 
> Hi All,
> 
> We are doing query performance test with different soft commit intervals. In 
> the test with 1sec of soft commit interval and 1min of soft commit interval 
> we didn't notice any improvement in query timings.
> 
> 
> 
> We did test with SolrMeter (Standalone java tool for stress tests with Solr) 
> for 1sec soft commit and 1min soft commit.
> 
> Index stats of test solr cloud: 0.7 million documents and 1 GB index size.
> 
> Solr cloud has 2 shard and each shard has one replica.
> 
> 
> 
> Please find below detailed test readings: (all timings are in milliseconds)
> 
> 
> Soft commit - 1sec
> Queries per sec  Updates per sec  Total Queries  Total Q time  Avg Q Time  Total Client time  Avg Client time
> 1                5                100            44340         443         48834              488
> 5                5                101            128914        1276        143239             1418
> 10               5                104            295325        2839        330931             3182
> 25               5                102            675319        6620        793874             7783
>
> Soft commit - 1min
> Queries per sec  Updates per sec  Total Queries  Total Q time  Avg Q Time  Total Client time  Avg Client time
> 1                5                100            44292         442         48569              485
> 5                5                105            131389        1251        147174             1401
> 10               5                102            299518        2936        337748             3311
> 25               5                108            742639        6876        865222             8011
> 
> As theory suggests soft commit affects query performance but in my case it 
> doesn't. Can you put some light on this?
> Also suggest if I am missing something here.
> 
> Regards,
> Bhaumik Joshi
> 

Re: Soft commit and hard commit

2015-11-30 Thread Alessandro Benedetti

In particular, please give us additional details about your search use case.
If the master is not searched, do you mean you have a master/slave
architecture?
In that case, how is replication managed?

If you are replicating old style, you will only be able to see what is on
disk at the moment of the replication, which means only the hard-committed
segments. Is this your case?

Do you need soft commit at all?

From Erick's guide:

Heavy (bulk) indexing
> The assumption here is that you’re interested in getting lots of data to
> the index as quickly as possible for search sometime in the future. I’m
> thinking original loads of a data source etc.
>
> - Set your soft commit interval quite long. As in 10 minutes or even
>   longer (-1 for no soft commits at all). *Soft commit is about
>   visibility,* and my assumption here is that bulk indexing isn’t about
>   near real time searching so don’t do the extra work of opening any kind of
>   searcher.
>
> - Set your hard commit intervals to 15 seconds, openSearcher=false.
>   Again the assumption is that you’re going to be just blasting data at Solr.
>   The worst case here is that you restart your system and have to replay 15
>   seconds or so of data from your tlog. If your system is bouncing up and
>   down more often than that, fix the reason for that first.
>
> - Only after you’ve tried the simple things should you consider
>   refinements, they’re usually only required in unusual circumstances. But
>   they include:
>
>   - Turning off the tlog completely for the bulk-load operation
>
>   - Indexing offline with some kind of map-reduce process
>
>   - Only having a leader per shard, no replicas for the load, then
>     turning on replicas later and letting them do old-style replication to
>     catch up. Note that this is automatic, if the node discovers it is “too
>     far” out of sync with the leader, it initiates an old-style replication.
>     After it has caught up, it’ll get documents as they’re indexed to the
>     leader and keep its own tlog.
>
>   - etc.
>
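
To make the first two recommendations concrete, here is a minimal Java sketch that renders the corresponding `autoCommit` / `autoSoftCommit` elements for the update handler section of solrconfig.xml. The helper class and its parameters are hypothetical; only the element names and the values (15 seconds, openSearcher=false, -1 for no soft commits) come from the guide quoted above.

```java
class BulkCommitSettings {
    /** Renders autoCommit and autoSoftCommit elements for solrconfig.xml. */
    static String render(long hardCommitMs, boolean openSearcher, long softCommitMs) {
        return String.join("\n",
                "<autoCommit>",
                "  <maxTime>" + hardCommitMs + "</maxTime>",
                "  <openSearcher>" + openSearcher + "</openSearcher>",
                "</autoCommit>",
                "<autoSoftCommit>",
                "  <maxTime>" + softCommitMs + "</maxTime>",
                "</autoSoftCommit>");
    }

    public static void main(String[] args) {
        // Bulk-load profile from the guide: hard commit every 15 s without
        // opening a searcher, and soft commits disabled (-1).
        System.out.println(render(15_000, false, -1));
    }
}
```

Whether these values suit the 70,000 docs/hr case depends on the answers to the questions above, in particular whether anything needs near-real-time visibility.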
Cheers


On 30 November 2015 at 09:14, Ali Nazemian  wrote:

> Dear Midas,
> Hi,
> AFAIK, currently Solr uses virtual memory for storing memory maps.
> Therefore using 36GB from 48GB of ram for Java heap is not recommended. As
> a rule of thumb do not access more than 25% of your total memory to Solr
> JVM in usual situations.
> About your main question, setting softcommit and hardcommit for Solr is
> highly dependent on your application. A really nice guide for this purpose
> is presented by lucidworks, In order to find the best value for softcommit
> and hardcommit please follow this guide:
>
> http://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> Best regards.
>
> On Mon, Nov 30, 2015 at 9:48 AM, Midas A  wrote:
>
> > Machine configuration
> >
> > RAM: 48 GB
> > CPU: 8 core
> > JVM : 36 GB
> >
> > We are updating 70 , 000 docs / hr  . what should be our soft commit and
> > hard commit time  to get best results.
> >
> > Current configuration :
> >  6 false
>  > autoCommit>
> >
> >
> >  60 
> >
> > There are no read on master server.
> >
>
>
>
> --
> A.Nazemian
>



-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England

Re: Soft commit and hard commit

2015-11-30 Thread Ali Nazemian

Dear Midas,
Hi,
AFAIK, Solr currently relies on virtual memory for its memory-mapped index
files, which sit outside the Java heap. Therefore giving 36 GB of 48 GB of
RAM to the Java heap is not recommended. As a rule of thumb, do not allocate
more than 25% of your total memory to the Solr JVM in usual situations.
About your main question: the right soft commit and hard commit settings
are highly dependent on your application. Lucidworks has published a really
nice guide for this purpose; please follow it to find the best values for
soft commit and hard commit:
http://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best regards.

On Mon, Nov 30, 2015 at 9:48 AM, Midas A  wrote:

> Machine configuration
>
> RAM: 48 GB
> CPU: 8 core
> JVM : 36 GB
>
> We are updating 70 , 000 docs / hr  . what should be our soft commit and
> hard commit time  to get best results.
>
> Current configuration :
>  6 false  autoCommit>
>
>
>  60 
>
> There are no read on master server.
>

-- 
A.Nazemian

Re: soft commit through leader

2015-05-19 Thread Erick Erickson

In a word, yes. The Solr servers independently keep their own
timers, and one could trip on replica X while an update was in
transmission from the leader, say. Or any one of a zillion other timing
conditions. In fact, this is why the indexes will have different
segments on replicas in a slice: the hard commit can be triggered at
different wall-clock times.

But do note that this isn't as much of an issue as you might think. The
timer is started when the first update is sent to Solr. So, in the
scenario where you start up all your nodes, the timers start when you
send the first update, i.e. probably within a few milliseconds of
each other. This might still be an issue, but the gap isn't that wide.
Solr promises _eventual_ consistency.

If you need to control this, if you issue a soft commit from a client
(URL, SolrJ client, curl, etc) then it _is_ distributed to all
replicas in a collection at that point in time.
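
To illustrate the timing argument, here is a deliberately simplified Java model: each replica starts its own timer when its first update arrives and commits maxTime later. The replica names, arrival offsets and the class itself are invented for the example, and real Solr behaviour involves more than this.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AutoCommitTimers {
    /**
     * Simplified model: a replica's autocommit fires maxTimeMs after the
     * arrival of its first pending update. Times are in ms on a shared clock.
     */
    static Map<String, Long> commitTimes(Map<String, Long> firstUpdateArrivalMs, long maxTimeMs) {
        Map<String, Long> out = new LinkedHashMap<>();
        firstUpdateArrivalMs.forEach((replica, arrival) -> out.put(replica, arrival + maxTimeMs));
        return out;
    }

    public static void main(String[] args) {
        Map<String, Long> arrivals = new LinkedHashMap<>();
        arrivals.put("leader", 0L);    // update reaches the leader first
        arrivals.put("replica1", 3L);  // forwarded copy arrives a few ms later
        // Each node commits on its own schedule, a few ms apart.
        System.out.println(commitTimes(arrivals, 15_000));
    }
}
```

In this model the commits land only as far apart as the first updates did, which is the "gap isn't that wide" point above.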

Best,
Erick

On Tue, May 19, 2015 at 3:43 AM, Gopal Jee zgo...@gmail.com wrote:
> hi
> wanted to know, when we do soft commit through configuration in
> solrconfig.xml,  will different replicas commit at different point of time
> depending upon when the replica started...or will leader send commit to all
> replicas at same time as per commit interval set in solrconfig.
>
> thanks
> gopal

Re: soft commit and deletions

2014-11-26 Thread Shawn Heisey

On 11/26/2014 8:18 AM, Andreas Hubold wrote:
> But I'm still not totally sure. Does a soft commit also make deleted
> documents invisible?
>
> In a test with an EmbeddedSolrServer I triggered a soft commit and was
> still able to find a deleted document afterwards. Is this as expected?

All changes to the index, including deletes, are not seen by clients
until a commit with openSearcher=true is done.  A soft commit *should*
cause the deletes to take effect, along with any adds or updates done
since the last searcher was opened.

There's a problem somewhere if that's not happening, either in your
indexing code or Solr.

Thanks,
Shawn

Re: soft commit and deletions

2014-11-26 Thread Erick Erickson

As Shawn says, deletes should be
visible after a soft commit.

Let's see the code though. If you re-use a searcher that
you had open before the commit, it'll still see the old
snapshot of the index including the deleted documents.
Or if you do open a new searcher and any autowarming
hasn't completed you'll still see the snapshot before the commit.

Best,
Erick

On Wed, Nov 26, 2014 at 8:16 AM, Shawn Heisey apa...@elyograg.org wrote:
 On 11/26/2014 8:18 AM, Andreas Hubold wrote:
 But I'm still not totally sure. Does a soft commit also make deleted
 documents invisible?

 In a test with an EmbeddedSolrServer I triggered a soft commit and was
 still able to find a deleted document afterwards. Is this as expected?

 All changes to the index, including deletes, are not seen by clients
 until a commit with openSearcher=true is done.  A soft commit *should*
 cause the deletes to take effect, along with any adds or updates done
 since the last searcher was opened.

 There's a problem somewhere if that's not happening, either in your
 indexing code or Solr.

 Thanks,
 Shawn

Re: soft commit and deletions

2014-11-26 Thread Andreas Hubold


Thank you, Shawn and Erick!

With your hint about the re-used searcher I was able to find my error. I 
must wait for the newly opened searcher when calling the commit method:


solrServer.commit(false, true /*waitSearcher*/, true /*softCommit*/);

instead of

solrServer.commit(false, false, true);
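
The difference between the two calls can be shown with a toy model that does not use SolrJ at all: a commit kicks off a new "searcher" in the background, and queries keep seeing the old snapshot until it has been registered. Everything below (class, method and snapshot names) is hypothetical and only mimics the behaviour described in this thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

class WaitSearcherModel {
    // Snapshot that queries currently see; stands in for the registered searcher.
    private final AtomicReference<String> registered = new AtomicReference<>("before-delete");

    /** Starts opening a new "searcher"; it is registered only once warming is released. */
    CompletableFuture<Void> softCommit(String newSnapshot, CountDownLatch warmingDone) {
        return CompletableFuture.runAsync(() -> {
            try {
                warmingDone.await(); // simulated autowarming
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            registered.set(newSnapshot);
        });
    }

    String query() {
        return registered.get();
    }

    public static void main(String[] args) {
        WaitSearcherModel core = new WaitSearcherModel();
        CountDownLatch warming = new CountDownLatch(1);
        CompletableFuture<Void> newSearcher = core.softCommit("after-delete", warming);

        // Like waitSearcher=false: the caller carries on while warming is still running.
        System.out.println(core.query());

        warming.countDown();
        newSearcher.join(); // like waitSearcher=true: block until the new searcher is registered
        System.out.println(core.query());
    }
}
```

The first query still returns the old snapshot, which matches what I saw with the deleted document; the second one runs only after the new searcher is in place.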

Thanks,
Andreas


Erick Erickson wrote on 11/26/2014 05:35 PM:

As Shawn says, deletes should be
visible after a soft commit.

Let's see the code though. If you re-use a searcher that
you had open before the commit, it'll still see the old
snapshot of the index including the deleted documents.
Or if you do open a new searcher and any autowarming
hasn't completed you'll still see the snapshot before the commit.

Best,
Erick

On Wed, Nov 26, 2014 at 8:16 AM, Shawn Heisey apa...@elyograg.org wrote:

On 11/26/2014 8:18 AM, Andreas Hubold wrote:

But I'm still not totally sure. Does a soft commit also make deleted
documents invisible?

In a test with an EmbeddedSolrServer I triggered a soft commit and was
still able to find a deleted document afterwards. Is this as expected?

All changes to the index, including deletes, are not seen by clients
until a commit with openSearcher=true is done.  A soft commit *should*
cause the deletes to take effect, along with any adds or updates done
since the last searcher was opened.

There's a problem somewhere if that's not happening, either in your
indexing code or Solr.

Thanks,
Shawn


.

Re: soft commit and deletions

2014-11-26 Thread Erick Erickson

Thanks for closing this off. It'd have been a pretty serious
thing if soft commits weren't working.

Erick

On Wed, Nov 26, 2014 at 12:58 PM, Andreas Hubold
andreas.hub...@coremedia.com wrote:
 Thank you, Shawn and Erick!

 With your hint about the re-used searcher I was able to find my error. I
 must wait for the newly opened searcher when calling the commit method:

 solrServer.commit(false, true /*waitSearcher*/, true /*softCommit*/);

 instead of

 solrServer.commit(false, false, true);

 Thanks,
 Andreas


 Erick Erickson wrote on 11/26/2014 05:35 PM:

 As Shawn says, deletes should be
 visible after a soft commit.

 Let's see the code though. If you re-use a searcher that
 you had open before the commit, it'll still see the old
 snapshot of the index including the deleted documents.
 Or if you do open a new searcher and any autowarming
 hasn't completed you'll still see the snapshot before the commit.

 Best,
 Erick

 On Wed, Nov 26, 2014 at 8:16 AM, Shawn Heisey apa...@elyograg.org wrote:

 On 11/26/2014 8:18 AM, Andreas Hubold wrote:

 But I'm still not totally sure. Does a soft commit also make deleted
 documents invisible?

 In a test with an EmbeddedSolrServer I triggered a soft commit and was
 still able to find a deleted document afterwards. Is this as expected?

 All changes to the index, including deletes, are not seen by clients
 until a commit with openSearcher=true is done.  A soft commit *should*
 cause the deletes to take effect, along with any adds or updates done
 since the last searcher was opened.

 There's a problem somewhere if that's not happening, either in your
 indexing code or Solr.

 Thanks,
 Shawn

 .

Re: {soft}Commit and cache flusing

2013-10-10 Thread Dmitry Kan

Tim,

My suggestion was very concise, sorry for that, but it was not meant to be
rude at all. I was trying to help you.

Dmitry


On Wed, Oct 9, 2013 at 9:28 PM, Tim Vaillancourt t...@elementspace.com wrote:

 Apologies all. I think the suggestion that I was replying to get noticed
 is what erked me, otherwise I would have moved on. I'll follow this advice.

 Cheers,

 Tim


 On 9 October 2013 05:20, Erick Erickson erickerick...@gmail.com wrote:

  Tim:
 
  I think you're mis-interpreting. By replying to a post with the subject:
 
  {soft}Commit and cache flushing
 
  but going in a different direction, it's easy for people to think I'm
  not interested in that
  thread, I'll ignore it, thereby missing the fact that you're asking a
  somewhat different
  question that they might have information about. It's not about whether
  you're
  doing anything particularly wrong with the question. It's about making
  it easy for
  people to help.
 
  See http://people.apache.org/~hossman/#threadhijack
 
  Best,
  Erick
 
  On Tue, Oct 8, 2013 at 6:23 PM, Tim Vaillancourt t...@elementspace.com
  wrote:
   I have a genuine question with substance here. If anything this
   nonconstructive, rude response was to get noticed. Thanks for
   contributing to the discussion.
  
   Tim
  
  
   On 8 October 2013 05:31, Dmitry Kan solrexp...@gmail.com wrote:
  
   Tim,
   I suggest you open a new thread and not reply to this one to get
  noticed.
   Dmitry
  
  
   On Mon, Oct 7, 2013 at 9:44 PM, Tim Vaillancourt 
 t...@elementspace.com
   wrote:
  
Is there a way to make autoCommit only commit if there are pending
   changes,
ie: if there are 0 adds pending commit, don't autoCommit
  (open-a-searcher
and wipe the caches)?
   
Cheers,
   
Tim
   
   
On 2 October 2013 00:52, Dmitry Kan solrexp...@gmail.com wrote:
   
 right. We've got the autoHard commit configured only atm. The
soft-commits
 are controlled on the client. It was just easier to implement the
  first
 version of our internal commit policy that will commit to all solr
 instances at once. This is where we have noticed the reported
  behavior.


 On Wed, Oct 2, 2013 at 9:32 AM, Bram Van Dam 
 bram.van...@intix.eu
wrote:

  if there are no modifications to an index and a softCommit or
hardCommit
  issued, then solr flushes the cache.
 
 
  Indeed. The easiest way to work around this is by disabling auto
commits
  and only commit when you have to.

Re: {soft}Commit and cache flusing

2013-10-09 Thread Erick Erickson

Tim:

I think you're mis-interpreting. By replying to a post with the subject:

{soft}Commit and cache flushing

but going in a different direction, it's easy for people to think "I'm
not interested in that thread, I'll ignore it", thereby missing the fact
that you're asking a somewhat different question that they might have
information about. It's not about whether you're doing anything
particularly wrong with the question. It's about making it easy for
people to help.

See http://people.apache.org/~hossman/#threadhijack

Best,
Erick

On Tue, Oct 8, 2013 at 6:23 PM, Tim Vaillancourt t...@elementspace.com wrote:
 I have a genuine question with substance here. If anything this
 nonconstructive, rude response was to get noticed. Thanks for
 contributing to the discussion.

 Tim


 On 8 October 2013 05:31, Dmitry Kan solrexp...@gmail.com wrote:

 Tim,
 I suggest you open a new thread and not reply to this one to get noticed.
 Dmitry


 On Mon, Oct 7, 2013 at 9:44 PM, Tim Vaillancourt t...@elementspace.com
 wrote:

  Is there a way to make autoCommit only commit if there are pending
 changes,
  ie: if there are 0 adds pending commit, don't autoCommit (open-a-searcher
  and wipe the caches)?
 
  Cheers,
 
  Tim
 
 
  On 2 October 2013 00:52, Dmitry Kan solrexp...@gmail.com wrote:
 
   right. We've got the autoHard commit configured only atm. The
  soft-commits
   are controlled on the client. It was just easier to implement the first
   version of our internal commit policy that will commit to all solr
   instances at once. This is where we have noticed the reported behavior.
  
  
   On Wed, Oct 2, 2013 at 9:32 AM, Bram Van Dam bram.van...@intix.eu
  wrote:
  
if there are no modifications to an index and a softCommit or
  hardCommit
issued, then solr flushes the cache.
   
   
Indeed. The easiest way to work around this is by disabling auto
  commits
and only commit when you have to.

Re: {soft}Commit and cache flusing

2013-10-09 Thread Tim Vaillancourt

Apologies all. I think the suggestion that I was replying to get noticed
is what irked me, otherwise I would have moved on. I'll follow this advice.

Cheers,

Tim


On 9 October 2013 05:20, Erick Erickson erickerick...@gmail.com wrote:

 Tim:

 I think you're mis-interpreting. By replying to a post with the subject:

 {soft}Commit and cache flushing

 but going in a different direction, it's easy for people to think I'm
 not interested in that
 thread, I'll ignore it, thereby missing the fact that you're asking a
 somewhat different
 question that they might have information about. It's not about whether
 you're
 doing anything particularly wrong with the question. It's about making
 it easy for
 people to help.

 See http://people.apache.org/~hossman/#threadhijack

 Best,
 Erick

 On Tue, Oct 8, 2013 at 6:23 PM, Tim Vaillancourt t...@elementspace.com
 wrote:
  I have a genuine question with substance here. If anything this
  nonconstructive, rude response was to get noticed. Thanks for
  contributing to the discussion.
 
  Tim
 
 
  On 8 October 2013 05:31, Dmitry Kan solrexp...@gmail.com wrote:
 
  Tim,
  I suggest you open a new thread and not reply to this one to get
 noticed.
  Dmitry
 
 
  On Mon, Oct 7, 2013 at 9:44 PM, Tim Vaillancourt t...@elementspace.com
  wrote:
 
   Is there a way to make autoCommit only commit if there are pending
  changes,
   ie: if there are 0 adds pending commit, don't autoCommit
 (open-a-searcher
   and wipe the caches)?
  
   Cheers,
  
   Tim
  
  
   On 2 October 2013 00:52, Dmitry Kan solrexp...@gmail.com wrote:
  
right. We've got the autoHard commit configured only atm. The
   soft-commits
are controlled on the client. It was just easier to implement the
 first
version of our internal commit policy that will commit to all solr
instances at once. This is where we have noticed the reported
 behavior.
   
   
On Wed, Oct 2, 2013 at 9:32 AM, Bram Van Dam bram.van...@intix.eu
   wrote:
   
 if there are no modifications to an index and a softCommit or
   hardCommit
 issued, then solr flushes the cache.


 Indeed. The easiest way to work around this is by disabling auto
   commits
 and only commit when you have to.

Re: {soft}Commit and cache flusing

2013-10-08 Thread Dmitry Kan

Tim,
I suggest you open a new thread and not reply to this one to get noticed.
Dmitry


On Mon, Oct 7, 2013 at 9:44 PM, Tim Vaillancourt t...@elementspace.com wrote:

 Is there a way to make autoCommit only commit if there are pending changes,
 ie: if there are 0 adds pending commit, don't autoCommit (open-a-searcher
 and wipe the caches)?

 Cheers,

 Tim


 On 2 October 2013 00:52, Dmitry Kan solrexp...@gmail.com wrote:

  right. We've got the autoHard commit configured only atm. The
 soft-commits
  are controlled on the client. It was just easier to implement the first
  version of our internal commit policy that will commit to all solr
  instances at once. This is where we have noticed the reported behavior.
 
 
  On Wed, Oct 2, 2013 at 9:32 AM, Bram Van Dam bram.van...@intix.eu
 wrote:
 
   if there are no modifications to an index and a softCommit or
 hardCommit
   issued, then solr flushes the cache.
  
  
   Indeed. The easiest way to work around this is by disabling auto
 commits
   and only commit when you have to.

Re: {soft}Commit and cache flusing

2013-10-08 Thread Tim Vaillancourt

I have a genuine question with substance here. If anything this
nonconstructive, rude response was to get noticed. Thanks for
contributing to the discussion.

Tim


On 8 October 2013 05:31, Dmitry Kan solrexp...@gmail.com wrote:

 Tim,
 I suggest you open a new thread and not reply to this one to get noticed.
 Dmitry


 On Mon, Oct 7, 2013 at 9:44 PM, Tim Vaillancourt t...@elementspace.com
 wrote:

  Is there a way to make autoCommit only commit if there are pending
 changes,
  ie: if there are 0 adds pending commit, don't autoCommit (open-a-searcher
  and wipe the caches)?
 
  Cheers,
 
  Tim
 
 
  On 2 October 2013 00:52, Dmitry Kan solrexp...@gmail.com wrote:
 
   right. We've got the autoHard commit configured only atm. The
  soft-commits
   are controlled on the client. It was just easier to implement the first
   version of our internal commit policy that will commit to all solr
   instances at once. This is where we have noticed the reported behavior.
  
  
   On Wed, Oct 2, 2013 at 9:32 AM, Bram Van Dam bram.van...@intix.eu
  wrote:
  
if there are no modifications to an index and a softCommit or
  hardCommit
issued, then solr flushes the cache.
   
   
Indeed. The easiest way to work around this is by disabling auto
  commits
and only commit when you have to.

Re: Soft commit and flush

2013-10-07 Thread adfel70

I understand the bottom line: soft commits are about visibility, hard
commits are about durability. I am just trying to gain a better understanding
of what happens under the hood...
Two more related questions you made me think of:
1. Is the NRTCachingDirectoryFactory relevant for both types of commit, or
just for hard commit?
2. If soft commit does not flush, does all data stay in RAM until we call hard
commit? If so, could using soft commit without calling hard commit cause OOE
... ?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Soft-commit-and-flush-tp4091726p4093834.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Soft commit and flush

2013-10-07 Thread Erick Erickson

bq: Is the NRTCachingDirectoryFactory relevant for both types of commit, or
just for hard commit

I don't know the code deeply, but NRT == Near Real Time == soft commit, I'd guess.

bq: If soft commit does not flush...

A soft commit flushes the transaction log. On restart, if the content of the
tlog isn't in the index, it's replayed to catch the index up. OOE? Out Of
Energy? You can optionally set up soft commits to fsync the tlog if you want
to eliminate the remote possibility of an operating-system (not JVM) crash
between the time the JVM passes the write off to the operating system and the
time the operating system writes the bits to disk.

Best,
Erick

On Mon, Oct 7, 2013 at 2:57 AM, adfel70 adfe...@gmail.com wrote:
 I understand the bottom line: soft commits are about visibility, hard
 commits are about durability. I am just trying to gain a better understanding
 of what happens under the hood...
 Two more related questions you made me think of:
 1. Is the NRTCachingDirectoryFactory relevant for both types of commit, or
 just for hard commit?
 2. If soft commit does not flush, does all data stay in RAM until we call hard
 commit? If so, could using soft commit without calling hard commit cause OOE
 ... ?




Re: Soft commit and flush

2013-10-07 Thread adfel70

Sorry, by OOE I meant Out of memory exception...



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Soft-commit-and-flush-tp4091726p4093902.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Soft commit and flush

2013-10-07 Thread Guido Medina


Out of Memory Exception is well known as OOM.

Guido.

On 07/10/13 14:11, adfel70 wrote:

Sorry, by OOE I meant Out of memory exception...




Re: {soft}Commit and cache flusing

2013-10-07 Thread Tim Vaillancourt

Is there a way to make autoCommit only commit if there are pending changes,
ie: if there are 0 adds pending commit, don't autoCommit (open-a-searcher
and wipe the caches)?

Cheers,

Tim


On 2 October 2013 00:52, Dmitry Kan solrexp...@gmail.com wrote:

 right. We've got the autoHard commit configured only atm. The soft-commits
 are controlled on the client. It was just easier to implement the first
 version of our internal commit policy that will commit to all solr
 instances at once. This is where we have noticed the reported behavior.


 On Wed, Oct 2, 2013 at 9:32 AM, Bram Van Dam bram.van...@intix.eu wrote:

  if there are no modifications to an index and a softCommit or hardCommit
  issued, then solr flushes the cache.
 
 
  Indeed. The easiest way to work around this is by disabling auto commits
  and only commit when you have to.

Re: Soft commit and flush

2013-10-07 Thread Erick Erickson

bq:  If so, using soft commit without calling hard commit could cause OOM

No. Aside from anything you have configured for auto (hard) commit, the
ramBufferSizeMB setting in solrconfig.xml flushes the in-memory structures out
to segments when their size reaches this limit. It won't _close_ the
current segment, so the data won't be permanent, but it will limit memory consumption.
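
To make the memory argument concrete, here is a toy model of a size-bounded indexing buffer. It is only a sketch of the behaviour described above, not Lucene or Solr code: the class name, the 100 MB limit and the 0.5 MB document size are all assumptions chosen for illustration.

```python
class RamBufferModel:
    """Toy model of a size-bounded indexing buffer (not Lucene code)."""

    def __init__(self, ram_buffer_size_mb: float) -> None:
        self.limit_mb = ram_buffer_size_mb
        self.buffered_mb = 0.0   # data held only in memory
        self.flushed_mb = 0.0    # data written out, but not yet hard-committed
        self.peak_mb = 0.0       # largest amount ever held in memory
        self.flush_count = 0

    def add_document(self, size_mb: float) -> None:
        self.buffered_mb += size_mb
        self.peak_mb = max(self.peak_mb, self.buffered_mb)
        # Reaching the limit triggers a flush, even with no commit at all.
        if self.buffered_mb >= self.limit_mb:
            self._flush()

    def _flush(self) -> None:
        self.flushed_mb += self.buffered_mb
        self.buffered_mb = 0.0
        self.flush_count += 1


# Hypothetical workload: 1000 documents of 0.5 MB each, never hard-committed.
model = RamBufferModel(ram_buffer_size_mb=100.0)
for _ in range(1000):
    model.add_document(0.5)
```

In this model the buffered amount never exceeds the configured limit, no matter how many documents are added between hard commits.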

Best,
Erick

On Mon, Oct 7, 2013 at 9:40 AM, Guido Medina guido.med...@temetra.com wrote:
 Out of Memory Exception is well known as OOM.

 Guido.


 On 07/10/13 14:11, adfel70 wrote:

 Sorry, by OOE I meant Out of memory exception...




Re: {soft}Commit and cache flusing

2013-10-02 Thread Bram Van Dam


if there are no modifications to an index and a softCommit or hardCommit
issued, then solr flushes the cache.


Indeed. The easiest way to work around this is by disabling auto commits 
and only commit when you have to.
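
One way to read "only commit when you have to" is to track pending updates on the client and skip the commit when there are none. The sketch below assumes the client controls commits itself; `CommitGate` and `send_commit` are hypothetical names, and `send_commit` stands in for whatever call actually sends the commit to Solr.

```python
from typing import Callable


class CommitGate:
    """Client-side guard that skips commits when nothing has changed."""

    def __init__(self, send_commit: Callable[[], None]) -> None:
        self._send_commit = send_commit  # hypothetical: issues the real commit
        self._pending = 0

    def record_update(self, count: int = 1) -> None:
        # Call this after every add or delete sent to the index.
        self._pending += count

    def commit_if_needed(self) -> bool:
        # No pending changes: no commit, so no new searcher and no cache wipe.
        if self._pending == 0:
            return False
        self._send_commit()
        self._pending = 0
        return True


sent = []
gate = CommitGate(send_commit=lambda: sent.append("commit"))
gate.commit_if_needed()   # nothing pending: no commit sent
gate.record_update(3)
gate.commit_if_needed()   # one commit covers the three updates
gate.commit_if_needed()   # nothing new since the last commit
```

This only helps when a single client owns the commit policy; it cannot see updates sent by other clients.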

Re: {soft}Commit and cache flusing

2013-10-02 Thread Dmitry Kan

right. We've got the autoHard commit configured only atm. The soft-commits
are controlled on the client. It was just easier to implement the first
version of our internal commit policy that will commit to all solr
instances at once. This is where we have noticed the reported behavior.


On Wed, Oct 2, 2013 at 9:32 AM, Bram Van Dam bram.van...@intix.eu wrote:

 if there are no modifications to an index and a softCommit or hardCommit
 issued, then solr flushes the cache.


 Indeed. The easiest way to work around this is by disabling auto commits
 and only commit when you have to.

Re: {soft}Commit and cache flusing

2013-10-01 Thread Shawn Heisey

On 10/1/2013 2:48 AM, Dmitry Kan wrote:
 This is a minor thing, perhaps, but thought to ask / share:
 
 if there are no modifications to an index and a softCommit or hardCommit
 issued, then solr flushes the cache.

Any time you do a commit that opens a new Searcher object
(openSearcher=true, which is required if you want index changes to be
visible to people making queries), the caches are invalidated.  This is
because the layout of the index (and therefore the Lucene internal IDs)
can completely change with *any* commit/merge, and there is no easy and
reliable way to determine when those numbers have NOT changed.

If you have warming queries configured, those happen on the new
searcher, populating the new cache.  If you have cache autoWarming
configured, then keys from the old caches are re-queried against the new
index and used to populate the new cache.
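
The autowarming step can be pictured with a toy model. This is a sketch, not Solr's implementation: `warm_new_cache` and `run_query` are hypothetical names, the caches are plain ordered dictionaries, and picking the most recently used keys is an assumption about which entries get warmed.

```python
from collections import OrderedDict
from typing import Callable


def warm_new_cache(old_cache: OrderedDict,
                   run_query: Callable[[str], list],
                   autowarm_count: int) -> OrderedDict:
    """Build the cache for a new searcher; old values are never copied over."""
    new_cache: OrderedDict = OrderedDict()
    if autowarm_count <= 0:
        return new_cache  # no autowarming: the new searcher starts cold
    # Assumption: the most recently used keys sit at the end of old_cache.
    for key in list(old_cache.keys())[-autowarm_count:]:
        new_cache[key] = run_query(key)  # re-executed against the new index
    return new_cache


# Hypothetical old cache contents and a stand-in for the new searcher.
old_cache = OrderedDict([("type:book", [1, 2]), ("type:dvd", [3]), ("in_stock:true", [1, 3])])
new_index = {"type:book": [7, 9], "type:dvd": [8], "in_stock:true": [7, 8]}

warmed = warm_new_cache(old_cache, lambda q: new_index[q], autowarm_count=2)
cold = warm_new_cache(old_cache, lambda q: new_index[q], autowarm_count=0)
```

Only the keys survive the commit. The cached values are recomputed, which is why warming costs query time on every new searcher.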

I do not understand deep Lucene internals, but what I've seen come
through Jira activity and commits over the last year or two has been a
strong move towards per-segment thinking instead of whole-index
thinking.  If this idea becomes applicable to all aspects of Lucene,
then perhaps Solr caches can also become per-segment, and will not need
to be completely invalidated except in the case of a major merge or
forceMerge.

Thanks,
Shawn

Re: {soft}Commit and cache flusing

2013-10-01 Thread Dmitry Kan

Thanks a lot Shawn for an exhaustive reply!

Regards,
Dmitry


On Tue, Oct 1, 2013 at 5:37 PM, Shawn Heisey s...@elyograg.org wrote:

 On 10/1/2013 2:48 AM, Dmitry Kan wrote:
  This is a minor thing, perhaps, but thought to ask / share:
 
  if there are no modifications to an index and a softCommit or hardCommit
  issued, then solr flushes the cache.

 Any time you do a commit that opens a new Searcher object
 (openSearcher=true, which is required if you want index changes to be
 visible to people making queries), the caches are invalidated.  This is
 because the layout of the index (and therefore the Lucene internal IDs)
 can completely change with *any* commit/merge, and there is no easy and
 reliable way to determine when those numbers have NOT changed.

 If you have warming queries configured, those happen on the new
 searcher, populating the new cache.  If you have cache autoWarming
 configured, then keys from the old caches are re-queried against the new
 index and used to populate the new cache.

 I do not understand deep Lucene internals, but what I've seen come
 through Jira activity and commits over the last year or two has been a
 strong move towards per-segment thinking instead of whole-index
 thinking.  If this idea becomes applicable to all aspects of Lucene,
 then perhaps Solr caches can also become per-segment, and will not need
 to be completely invalidated except in the case of a major merge or
 forceMerge.

 Thanks,
 Shawn

Re: Soft commit and flush

2013-09-25 Thread Erick Erickson

Why do you care? Curiosity, or are you trying to find a
behavior you can count on?

I ask because soft commits are about visibility, hard commits are
about durability. That means you can't count on a soft commit
writing anything to disk at all. I suspect in your tests the soft
commit had nothing to do with the changes on disk; those were
just a consequence of indexing more data triggering a flush
to disk and would have happened even if you hadn't done a soft
commit.

Hard commits are what you control writes to disk with,
not soft commits.

Best,
Erick

On Tue, Sep 24, 2013 at 3:56 PM, Shawn Heisey s...@elyograg.org wrote:
 On 9/24/2013 5:51 AM, adfel70 wrote:

 My conclusion is that soft commit always flushes the data, but because of
 the implementation of NRTCachingDirectoryFactory, the data will be written
 to the disk when it's getting too big.


 The NRTCachingDirectoryFactory (which creates NRTCachingDirectory instances)
 used by default in newer Solr versions has default settings for some of its
 parameters that show up in the solr log:

 maxCacheMB=48.0 maxMergeSizeMB=4.0

 The constructor javadocs for NRTCachingDirectory show what circumstances
 will cause the directory to use RAM instead of flushing to disk:

 http://lucene.apache.org/core/4_4_0/core/org/apache/lucene/store/NRTCachingDirectory.html#NRTCachingDirectory%28org.apache.lucene.store.Directory,%20double,%20double%29

 We will cache a newly created output if 1) it's a flush or a merge and the
 estimated size of the merged segment is <= maxMergeSizeMB, and 2) the total
 cached bytes is <= maxCachedMB

 Thanks,
 Shawn

I believe data is not fsynced to disk until a hard commit (and even
then disks can lie to you and tell you data is safe even though it's
still in the disk cache waiting to really be written to the medium),
which is why you can lose it between hard commits. Soft commits just
make newly added docs visible in search results.

Otis
--
Solr ElasticSearch Support -- http://sematext.com/
Performance Monitoring -- http://sematext.com/spm

On Tue, Sep 24, 2013 at 7:51 AM, adfel70 adfe...@gmail.com wrote:
I am struggling to get a deep understanding of soft commit.
I have read Erick's post
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
which helped me a lot with when and why we should call each type of commit.
But still, I can't understand what exactly happens when we call soft commit:
is the new data flushed, fsynced, or held in RAM... ?
I tried to test it myself and I got 2 different behaviours:
a. If I had just 1 document added to the index, soft commit did not
cause index files to change.
b. If I had a big change (addition of about 100,000 docs, ~5MB tlog file),
calling the soft commit DID change the index files - so I guess that soft
commit caused an fsync.

My conclusion is that soft commit always flushes the data, but because of
the implementation of NRTCachingDirectoryFactory, the data will be written
to the disk when it's getting too big.

Can someone please correct me?

--
View this message in context:
http://lucene.472066.n3.nabble.com/Soft-commit-and-flush-tp4091726.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Soft commit and flush

2013-09-24 Thread Shawn Heisey


On 9/24/2013 5:51 AM, adfel70 wrote:

My conclusion is that soft commit always flushes the data, but because of
the implementation of NRTCachingDirectoryFactory, the data will be written
to the disk when it's getting too big.


The NRTCachingDirectoryFactory (which creates NRTCachingDirectory 
instances) used by default in newer Solr versions has default settings 
for some of its parameters that show up in the solr log:


maxCacheMB=48.0 maxMergeSizeMB=4.0

The constructor javadocs for NRTCachingDirectory show what circumstances 
will cause the directory to use RAM instead of flushing to disk:


http://lucene.apache.org/core/4_4_0/core/org/apache/lucene/store/NRTCachingDirectory.html#NRTCachingDirectory%28org.apache.lucene.store.Directory,%20double,%20double%29

We will cache a newly created output if 1) it's a flush or a merge and
the estimated size of the merged segment is <= maxMergeSizeMB, and 2)
the total cached bytes is <= maxCachedMB
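
The quoted rule can be written out as a small predicate. This is a literal reading of the javadoc sentence, with both comparisons taken as "less than or equal to", and not the Lucene source: the function name and parameter names are made up for illustration, and the defaults are the 4.0 and 48.0 values shown in the log line above.

```python
def would_cache_in_ram(is_flush_or_merge: bool,
                       estimated_size_mb: float,
                       total_cached_mb: float,
                       max_merge_size_mb: float = 4.0,
                       max_cached_mb: float = 48.0) -> bool:
    """Return True if a newly created output would be held in the RAM cache."""
    return (is_flush_or_merge
            and estimated_size_mb <= max_merge_size_mb   # condition 1: small enough
            and total_cached_mb <= max_cached_mb)        # condition 2: cache not over budget


# Hypothetical sizes, in MB.
small_flush = would_cache_in_ram(True, estimated_size_mb=1.0, total_cached_mb=10.0)
big_merge = would_cache_in_ram(True, estimated_size_mb=5.0, total_cached_mb=10.0)
cache_full = would_cache_in_ram(True, estimated_size_mb=1.0, total_cached_mb=60.0)
other_write = would_cache_in_ram(False, estimated_size_mb=1.0, total_cached_mb=10.0)
```

Under this reading a small flush stays in RAM, while anything larger than maxMergeSizeMB goes straight to the underlying directory.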


Thanks,
Shawn

Re: Soft Commit and Document Cache

2013-04-22 Thread Mark Miller

Yup - all of the top level caches are. It's a trade off - don't NRT more than 
you need to.

- Mark

On Apr 22, 2013, at 6:16 PM, Niran Fajemisin afa...@yahoo.com wrote:

 Hi all,
 
 A quick (and hopefully simple) question: Does the document cache (or any of
 the other caches for that matter), get invalidated after a soft commit has 
 been performed?
 
 Thanks,
 Niran

Re: Soft Commit and Document Cache

2013-04-22 Thread Shawn Heisey


On 4/22/2013 4:16 PM, Niran Fajemisin wrote:

A quick (and hopefully simple) question: Does the document cache (or any of the
other caches for that matter), get invalidated after a soft commit has been 
performed?


All Solr caches are invalidated when you issue a commit with 
openSearcher set to true.  There would be no reason to do a soft commit 
with openSearcher set to false.  That setting only makes sense with hard 
commits.


If you have queries defined for the newSearcher event, then they will be 
run, which can pre-populate caches.


The filterCache and queryResultCache can be autowarmed on commit - the 
most relevant autowarmCount queries in the cache from the old searcher 
are re-run against the new searcher.  The queryResultWindowSize 
parameter helps control exactly what gets cached with the queryResultCache.


The documentCache cannot be autowarmed, although I *think* that when 
entries from the queryResultCache are run, it will also populate the 
documentCache, though I could be wrong about that.


I do not know whether autowarming is done before or after newSearcher 
queries.


http://wiki.apache.org/solr/SolrCaching

Thanks,
Shawn

Re: Soft Commit and Document Cache

2013-04-22 Thread Niran Fajemisin

Thanks Shawn and Mark! That was very helpful.

-Niran

 From: Shawn Heisey s...@elyograg.org
To: solr-user@lucene.apache.org 
Sent: Monday, April 22, 2013 5:30 PM
Subject: Re: Soft Commit and Document Cache

On 4/22/2013 4:16 PM, Niran Fajemisin wrote:
 A quick (and hopefully simple) question: Does the document cache (or any of
 the other caches for that matter), get invalidated after a soft commit has 
 been performed?

All Solr caches are invalidated when you issue a commit with 
openSearcher set to true.  There would be no reason to do a soft commit 
with openSearcher set to false.  That setting only makes sense with hard 
commits.

If you have queries defined for the newSearcher event, then they will be 
run, which can pre-populate caches.

The filterCache and queryResultCache can be autowarmed on commit - the 
most relevant autowarmCount queries in the cache from the old searcher 
are re-run against the new searcher.  The queryResultWindowSize 
parameter helps control exactly what gets cached with the queryResultCache.

The documentCache cannot be autowarmed, although I *think* that when 
entries from the queryResultCache are run, it will also populate the 
documentCache, though I could be wrong about that.

I do not know whether autowarming is done before or after newSearcher 
queries.

http://wiki.apache.org/solr/SolrCaching

Thanks,
Shawn

Re: soft commit 2

2012-01-05 Thread Erick Erickson

What is your evidence that it doesn't work
when you specify it in solrconfig.xml? You
haven't provided enough information about
what you've tried to give us much to go on.

It might help to review:
http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

On Tue, Jan 3, 2012 at 8:17 AM, ramires uy...@beriltech.com wrote:
 hi

 Soft commit works with the command below but doesn't work from solrconfig.xml. What is
 wrong with the XML below?

 curl http://localhost:8984/solr/update -H "Content-Type: text/xml" \
   --data-binary '<commit softCommit="true" waitFlush="false" waitSearcher="false"/>'

  <updateHandler class="solr.DirectUpdateHandler2">
    <autoSoftCommit>
      <maxTime>1000</maxTime>
    </autoSoftCommit>
  </updateHandler>


 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/soft-commit-2-tp3628975p3628975.html
 Sent from the Solr - User mailing list archive at Nabble.com.

Re: soft commit

2012-01-03 Thread Jason Rutherglen

*Laugh*

I stand by what Mark said:

Right - in most NRT cases (very frequent soft commits), the cache should
probably be disabled.

On Mon, Jan 2, 2012 at 7:45 PM, Yonik Seeley yo...@lucidimagination.com wrote:
 On Mon, Jan 2, 2012 at 9:58 PM, Jason Rutherglen
 jason.rutherg...@gmail.com wrote:
 It still normally makes sense to have the caches enabled (esp filter and 
 document caches).

 In the NRT case that statement is completely incorrect

 *shrug*

 To each their own.  I stand by my statement.

 -Yonik
 http://www.lucidimagination.com

Re: soft commit

2012-01-03 Thread Erik Hatcher

As I understand it, the document and filter caches add value *intra* request 
such that it keeps additional work (like fetching stored fields from disk more 
than once) from occurring.

Erik

On Jan 3, 2012, at 16:26 , Jason Rutherglen wrote:

 *Laugh*
 
 I stand by what Mark said:
 
 Right - in most NRT cases (very frequent soft commits), the cache should
 probably be disabled.
 
 On Mon, Jan 2, 2012 at 7:45 PM, Yonik Seeley yo...@lucidimagination.com 
 wrote:
 On Mon, Jan 2, 2012 at 9:58 PM, Jason Rutherglen
 jason.rutherg...@gmail.com wrote:
 It still normally makes sense to have the caches enabled (esp filter and 
 document caches).
 
 In the NRT case that statement is completely incorrect
 
 *shrug*
 
 To each their own.  I stand by my statement.
 
 -Yonik
 http://www.lucidimagination.com

Re: soft commit

2012-01-03 Thread Yonik Seeley

On Tue, Jan 3, 2012 at 4:36 PM, Erik Hatcher erik.hatc...@gmail.com wrote:
 As I understand it, the document and filter caches add value *intra* request 
 such that it keeps additional work (like fetching stored fields from disk 
 more than once) from occurring.

Yep.  Highlighting, multi-select faceting, and distributed search are
just some of the scenarios where the caches are utilized in the scope
of a single request.
Please folks, don't disable your caches!

-Yonik
http://www.lucidimagination.com

Re: soft commit

2012-01-03 Thread Jason Rutherglen

 multi-select faceting

Yikes.  I'd love to see a test showing that un-inverted field cache
(which is for ALL segments as a single unit) can be used efficiently
with NRT / soft commit.

On Tue, Jan 3, 2012 at 1:50 PM, Yonik Seeley yo...@lucidimagination.com wrote:
 On Tue, Jan 3, 2012 at 4:36 PM, Erik Hatcher erik.hatc...@gmail.com wrote:
 As I understand it, the document and filter caches add value *intra* request 
 such that it keeps additional work (like fetching stored fields from disk 
 more than once) from occurring.

 Yep.  Highlighting, multi-select faceting, and distributed search are
 just some of the scenarios where the caches are utilized in the scope
 of a single request.
 Please folks, don't disable your caches!

 -Yonik
 http://www.lucidimagination.com

Re: soft commit

2012-01-03 Thread Jason Rutherglen

The main point is that Solr, unlike, for example, Elasticsearch and other
Lucene-based systems, does NOT cache filters or facets per-segment.

This is a fundamental design flaw.

On Tue, Jan 3, 2012 at 1:50 PM, Yonik Seeley yo...@lucidimagination.com wrote:
 On Tue, Jan 3, 2012 at 4:36 PM, Erik Hatcher erik.hatc...@gmail.com wrote:
 As I understand it, the document and filter caches add value *intra* request 
 such that it keeps additional work (like fetching stored fields from disk 
 more than once) from occurring.

 Yep.  Highlighting, multi-select faceting, and distributed search are
 just some of the scenarios where the caches are utilized in the scope
 of a single request.
 Please folks, don't disable your caches!

 -Yonik
 http://www.lucidimagination.com

Re: soft commit

2012-01-03 Thread Yonik Seeley

On Tue, Jan 3, 2012 at 5:03 PM, Jason Rutherglen
jason.rutherg...@gmail.com wrote:
 Yikes.  I'd love to see a test showing that un-inverted field cache
 (which is for ALL segments as a single unit) can be used efficiently
 with NRT / soft commit.

Please stop being a troll.
Solr has multiple faceting methods - only one uses un-inverted field cache.

Oh, and for the record, Solr does have a faceting method in trunk that
caches per-segment.
There are always tradeoffs though - string faceting per-segment will
always be slower than string faceting over the complete index (due to
the cost of merging per-segment counts).

Anyway, disabling any of those caches won't make anything any
faster... the data structures will still be built, they just won't be
reused.
Seems like you realized your original statement was erroneous and have
just reverted to troll state, trying to find something to pick at.

-Yonik
http://www.lucidimagination.com

Re: soft commit

2012-01-03 Thread Jason Rutherglen

Address the points I brought up or don't reply with funny name calling.

Below are two key points, reiterated and re-articulated in an easy-to-answer way:

* Multi-select faceting is per-segment (true or false)

* Filters are cached per-segment (true or false)

On Tue, Jan 3, 2012 at 2:16 PM, Yonik Seeley yo...@lucidimagination.com wrote:
 On Tue, Jan 3, 2012 at 5:03 PM, Jason Rutherglen
 jason.rutherg...@gmail.com wrote:
 Yikes.  I'd love to see a test showing that un-inverted field cache
 (which is for ALL segments as a single unit) can be used efficiently
 with NRT / soft commit.

 Please stop being a troll.
 Solr has multiple faceting methods - only one uses un-inverted field cache.

 Oh, and for the record, Solr does have a faceting method in trunk that
 caches per-segment.
 There are always tradeoffs though - string faceting per-segment will
 always be slower than string faceting over the complete index (due to
 the cost of merging per-segment counts).

 Anyway, disabling any of those caches won't make anything any
 faster... the data structures will still be built, they just won't be
 reused.
 Seems like you realized your original statement was erroneous and have
 just reverted to troll state, trying to find something to pick at.

 -Yonik
 http://www.lucidimagination.com

Re: soft commit

2012-01-02 Thread Tomás Fernández Löbbe

Yes, soft commit currently clears Solr's caches.

On Mon, Jan 2, 2012 at 12:01 PM, ramires uy...@beriltech.com wrote:

 hi

 After a soft commit with the command below, all caches are cleared. Is this normal?

 curl http://localhost:8984/solr/update -H "Content-Type: text/xml" \
   --data-binary '<commit softCommit="true" waitFlush="false" waitSearcher="false"/>'



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/soft-commit-tp3626765p3626765.html
 Sent from the Solr - User mailing list archive at Nabble.com.

Re: soft commit

2012-01-02 Thread Mark Miller

Right - in most NRT cases (very frequent soft commits), the cache should
probably be disabled.

2012/1/2 Tomás Fernández Löbbe tomasflo...@gmail.com

 Yes, soft commit currently clears Solr's caches.

 On Mon, Jan 2, 2012 at 12:01 PM, ramires uy...@beriltech.com wrote:

  hi
 
  After a soft commit with the command below, all caches are cleared. Is this normal?
 
  curl http://localhost:8984/solr/update -H "Content-Type: text/xml" \
    --data-binary '<commit softCommit="true" waitFlush="false" waitSearcher="false"/>'
 
 
 
  --
  View this message in context:
  http://lucene.472066.n3.nabble.com/soft-commit-tp3626765p3626765.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 




-- 
- Mark

http://www.lucidimagination.com

Re: soft commit

2012-01-02 Thread Yonik Seeley

On Mon, Jan 2, 2012 at 1:28 PM, Mark Miller markrmil...@gmail.com wrote:
 Right - in most NRT cases (very frequent soft commits), the cache should
 probably be disabled.

Did you mean autowarm should be disabled (as it already is in the
example config)?
It still normally makes sense to have the caches enabled (esp filter
and document caches).

-Yonik
http://www.lucidimagination.com

Re: soft commit

2012-01-02 Thread Jason Rutherglen

 It still normally makes sense to have the caches enabled (esp filter and 
 document caches).

In the NRT case that statement is completely incorrect

On Mon, Jan 2, 2012 at 5:37 PM, Yonik Seeley yo...@lucidimagination.com wrote:
 On Mon, Jan 2, 2012 at 1:28 PM, Mark Miller markrmil...@gmail.com wrote:
 Right - in most NRT cases (very frequent soft commits), the cache should
 probably be disabled.

 Did you mean autowarm should be disabled (as it already is in the
 example config)?
 It still normally makes sense to have the caches enabled (esp filter
 and document caches).

 -Yonik
 http://www.lucidimagination.com

Re: soft commit

2012-01-02 Thread Yonik Seeley

On Mon, Jan 2, 2012 at 9:58 PM, Jason Rutherglen
jason.rutherg...@gmail.com wrote:
 It still normally makes sense to have the caches enabled (esp filter and 
 document caches).

 In the NRT case that statement is completely incorrect

*shrug*

To each their own.  I stand by my statement.

-Yonik
http://www.lucidimagination.com
