Re: Solr background merge in case of pull replicas

2021-01-06 Thread kshitij tyagi
Hi,

I am not querying the tlog replicas. The Solr version is 8.6, with a setup of
2 tlog and 4 pull replicas.

Why should pull replicas be affected during background segment merges?

Regards,
kshitij

On Wed, Jan 6, 2021 at 9:48 PM Ritvik Sharma  wrote:

> Hi
> It may be caused by rebalancing, with querying unavailable on the tlog
> replicas at that moment.
> You can check the tlog and pull replica logs when you are facing this issue.
>
> May I know which version of Solr you are using, and what is the ratio of
> tlog and pull nodes?
>
> On Wed, 6 Jan 2021 at 2:46 PM, kshitij tyagi 
> wrote:
>
> > Hi,
> >
> > I have a tlog + pull replica SolrCloud setup.
> >
> > 1. I am observing that whenever a background segment merge is triggered
> > automatically, I see high response times on all of my Solr nodes.
> >
> > As far as I know, the merges should be happening on the tlog replicas,
> > hence the increased response time there; I am not able to understand why
> > my pull replicas are affected during background index merges.
> >
> > Can someone give some insights on this? What is affecting my pull
> replicas
> > during index merges?
> >
> > Regards,
> > kshitij
> >
>


Solr background merge in case of pull replicas

2021-01-06 Thread kshitij tyagi
Hi,

I have a tlog + pull replica SolrCloud setup.

1. I am observing that whenever a background segment merge is triggered
automatically, I see high response times on all of my Solr nodes.

As far as I know, the merges should be happening on the tlog replicas, hence
the increased response time there; I am not able to understand why my pull
replicas are affected during background index merges.

Can someone give some insights on this? What is affecting my pull replicas
during index merges?

Regards,
kshitij
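
For context: pull replicas fetch whole segment files from the shard leader,
much like master/slave replication, so a large background merge produces large
new segments that every pull replica must copy, which can show up as I/O and
response-time pressure on those nodes. A hedged SolrJ sketch of creating this
kind of tlog + pull layout (collection, config, and ZooKeeper host names are
made up):

import java.util.Collections;
import java.util.Optional;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class CreateTlogPullCollection {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
            // 1 shard, 0 NRT replicas, 2 tlog replicas, 4 pull replicas
            CollectionAdminRequest
                .createCollection("products", "productsConfig", 1, 0, 2, 4)
                .process(client);
        }
    }
}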


Re: solrCloud client socketTimeout initiates retries

2020-12-18 Thread kshitij tyagi
Hi erick,

Thanks. Yes, we will be upgrading to 8.8 soon.
Until we upgrade, we are increasing the socket timeout, which helps to some
extent for the time being.

regards,
kshitij

On Fri, Dec 18, 2020 at 7:48 PM Erick Erickson 
wrote:

> Right, there are several alternatives. Try going here:
> http://jirasearch.mikemccandless.com/search.py?index=jira
>
> and search for “circuit breaker” and you’ll find a bunch
> of JIRAs. Unfortunately, some are in 8.8..
>
> That said, some of the circuit breakers are in much earlier
> releases. Would it suffice until you can upgrade to set
> the circuit breakers?
>
> One problem with your solution is that the query keeps
> on running, admittedly on only one replica of each shard.
> With circuit breakers, the query itself is stopped, thus freeing
> up resources.
>
> Additionally, if you see a pattern (for instance, certain
> wildcard patterns) you could intercept that before sending.
>
> Best,
> Erick
>
> > On Dec 18, 2020, at 8:52 AM, kshitij tyagi 
> wrote:
> >
> > Hi Erick,
> >
> > I agree, but in a huge cluster the retries keep on happening; can't we
> > have this feature implemented in the client?
> > I was referring to this JIRA:
> > https://issues.apache.org/jira/browse/SOLR-10479
> > We have seen that some malicious queries come to the system which take
> > significant time, and these queries propagating to other Solr servers
> > choke the entire cluster.
> >
> > Regards,
> > kshitij
> >
> >
> >
> >
> >
> > On Fri, Dec 18, 2020 at 7:12 PM Erick Erickson 
> > wrote:
> >
> >> Why do you want to do this? This sounds like an XY problem, you
> >> think you’re going to solve some problem X by doing Y. Y in this case
> >> is setting the numServersToTry, but you haven’t explained what X,
> >> the problem you’re trying to solve is.
> >>
> >> Offhand, this seems like a terrible idea. If your requests are timing
> >> out, what purpose is served by _not_ trying the next one on the
> >> list? With, of course, a much longer timeout interval…
> >>
> >> The code is structured that way on the theory that you want the request
> >> to succeed and the system needs to be tolerant of momentary
> >> glitches due to network congestion, reading indexes into memory, etc.
> >> Bypassing that assumption needs some justification….
> >>
> >> Best,
> >> Erick
> >>
> >>> On Dec 18, 2020, at 6:23 AM, kshitij tyagi 
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> We have a SolrCloud setup and are using CloudSolrClient. What we are
> >>> seeing is that if a socket timeout occurs, the same request is sent to
> >>> another Solr server.
> >>>
> >>> So if I set socketTimeout to a very low value, say 100 ms, and my query
> >>> takes around 200 ms, then the client tries the second server, then the
> >>> next, and so on (basically all available servers get the same query).
> >>>
> >>> I see that we have *numServersToTry* in the LBSolrClient class, but I
> >>> am not able to set it through CloudSolrClient. Using it, we could
> >>> restrict the behavior above.
> >>>
> >>> Should a JIRA be created to support numServersToTry in CloudSolrClient?
> >>> Or is there any other way to control the requests to other Solr servers?
> >>>
> >>> Regards,
> >>> kshitij
> >>
> >>
>
>


Re: solrCloud client socketTimeout initiates retries

2020-12-18 Thread kshitij tyagi
Hi Erick,

I agree, but in a huge cluster the retries keep on happening; can't we have
this feature implemented in the client?
I was referring to this JIRA:
https://issues.apache.org/jira/browse/SOLR-10479
We have seen that some malicious queries come to the system which take
significant time, and these queries propagating to other Solr servers choke
the entire cluster.

Regards,
kshitij





On Fri, Dec 18, 2020 at 7:12 PM Erick Erickson 
wrote:

> Why do you want to do this? This sounds like an XY problem, you
> think you’re going to solve some problem X by doing Y. Y in this case
> is setting the numServersToTry, but you haven’t explained what X,
> the problem you’re trying to solve is.
>
> Offhand, this seems like a terrible idea. If your requests are timing
> out, what purpose is served by _not_ trying the next one on the
> list? With, of course, a much longer timeout interval…
>
> The code is structured that way on the theory that you want the request
> to succeed and the system needs to be tolerant of momentary
> glitches due to network congestion, reading indexes into memory, etc.
> Bypassing that assumption needs some justification….
>
> Best,
> Erick
>
> > On Dec 18, 2020, at 6:23 AM, kshitij tyagi 
> wrote:
> >
> > Hi,
> >
> > We have a SolrCloud setup and are using CloudSolrClient. What we are
> > seeing is that if a socket timeout occurs, the same request is sent to
> > another Solr server.
> >
> > So if I set socketTimeout to a very low value, say 100 ms, and my query
> > takes around 200 ms, then the client tries the second server, then the
> > next, and so on (basically all available servers get the same query).
> >
> > I see that we have *numServersToTry* in the LBSolrClient class, but I am
> > not able to set it through CloudSolrClient. Using it, we could restrict
> > the behavior above.
> >
> > Should a JIRA be created to support numServersToTry in CloudSolrClient?
> > Or is there any other way to control the requests to other Solr servers?
> >
> > Regards,
> > kshitij
>
>


solrCloud client socketTimeout initiates retries

2020-12-18 Thread kshitij tyagi
Hi,

We have a SolrCloud setup and are using CloudSolrClient. What we are seeing
is that if a socket timeout occurs, the same request is sent to another Solr
server.

So if I set socketTimeout to a very low value, say 100 ms, and my query takes
around 200 ms, then the client tries the second server, then the next, and so
on (basically all available servers get the same query).

I see that we have *numServersToTry* in the LBSolrClient class, but I am not
able to set it through CloudSolrClient. Using it, we could restrict the
behavior above.

Should a JIRA be created to support numServersToTry in CloudSolrClient? Or
is there any other way to control the requests to other Solr servers?

Regards,
kshitij
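
A hedged SolrJ sketch of raising the client-side socket timeout, which
lengthens the window before the load balancer moves on to the next server
(builder method names as in SolrJ 8.x; host and timeout values are
placeholders):

import java.util.Collections;
import java.util.Optional;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class BuildClient {
    public static void main(String[] args) {
        // raise the socket timeout so slow-but-legitimate queries do not
        // trigger the retry-on-the-next-server behaviour described above
        CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("zk1:2181"), Optional.empty())
            .withSocketTimeout(60000)      // ms to wait for a response
            .withConnectionTimeout(5000)   // ms to establish a connection
            .build();
    }
}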


Re: Solr Upgrade socketTimeout issue in 8.2

2020-02-19 Thread kshitij tyagi
Regards,
kshitij

On Wed, Feb 19, 2020 at 6:18 PM Erick Erickson 
wrote:

> Yogesh:
>
> Please do not hijack threads. The original poster requested information
> about
> socket timeouts. True “upgrade” was mentioned, but it was a completely
> different issue.
>
> Kshitij:
>
> There’s not much information to go on here. It’s possible you were running
> close to the timeout limit before and “something” changed just enough
> to go over that limit.
>
> I’m a bit confused though, you talk about commands like reload while
> indexing.
> What _exactly_ are you trying to do? Details matter.
>
> One thing that did change was “schemaless” became the default. This
> causes reloads when Solr is indexing docs and comes across fields
> for the first time. I personally don’t recommend “schemaless”, so what
> happens if you turn that off?
>
> If you’re manually sending reloads, you might try doing them async.
>
> That said, you must start from a fresh index with _no_ documents in it
> when you upgrade more than one major version. Did you start over?
>
> Best,
> Erick
>
> > On Feb 19, 2020, at 3:58 AM, kshitij tyagi 
> wrote:
> >
> > Hi,
> >
> > Any information on the socket timeout issue when using the Collection
> > APIs? I am observing increased response times when using the Collection
> > APIs in the upgraded version.
> >
> > On Wed, Feb 19, 2020 at 2:22 PM Jörn Franke 
> wrote:
> >
> >> Yes you need to reindex.
> >> Update solrconfig and schemas to leverage the newer features of the
> >> version (some datatypes are now more optimal, others are deprecated).
> >>
> >> Update solrconfig.xml and the schema to leverage the latest datatypes,
> >> features etc.
> >>
> >> Create new collection based on newest config.
> >> Use your regular Index process to move documents to new collection.
> >>
> >> Check if new collection works and has expected performance.
> >>
> >> Delete old collection.
> >>
> >> Test before in a test environment and not in production!
> >>
> >>> Am 19.02.2020 um 09:46 schrieb Yogesh Chaudhari
> >> :
> >>>
> >>> Hi,
> >>>
> >>> Could you please share the steps to upgrade Solr?
> >>>
> >>> Now I am using Solr Cloud 5.2.1 in production and want to upgrade to
> >>> Solr 7.7.2. I am doing this in 2 steps: Solr 5.2.1 to Solr 6.6.6, then
> >>> Solr 7.7.2.
> >>>
> >>> I have upgraded Solr but am getting issues indexing old documents. I am
> >>> badly stuck trying to get the old documents into the migrated Solr
> >>> version.
> >>>
> >>> Should I do the re-indexing? If yes, can you please share the way to
> >>> re-index?
> >>>
> >>> Can you please provide your inputs on this?
> >>>
> >>> Thanks,
> >>>
> >>> Yogesh Chaudhari
> >>>
> >>> -Original Message-
> >>> From: kshitij tyagi 
> >>> Sent: Wednesday, February 19, 2020 12:52 PM
> >>> To: solr-user@lucene.apache.org
> >>> Subject: Solr Upgrade socketTimeout issue in 8.2
> >>>
> >>> Hi,
> >>>
> >>> We have upgraded our solrCloud from version 6.6.0 to 8.2.0
> >>>
> >>> At the time of indexing, we are intermittently observing a socketTimeout
> >>> exception when using the Collection APIs, for example when we try
> >>> reloading one of the collections using the CloudSolrClient class.
> >>>
> >>> Is there any performance degradation in the SolrCloud Collection APIs?
> >>>
> >>> logs:
> >>>
> >>> IOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught
> >> end of stream exception
> >>>
> >>> EndOfStreamException: Unable to read additional data from client
> >> sessionid 0x2663e756d775747, likely client has closed socket
> >>>
> >>> at
> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> >>>
> >>> at
> >>>
> >>
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
> >>>
> >>> at java.lang.Thread.run(Unknown Source)
> >>>
> >>>
> >>> logs:
> >>>
> >>>
> >>> Exception has occured in job switch: Timeout occurred while waiting
> >>> response from server at: http://prod-t-8.net:8983/solr
> >>>
> >>>
> >>> Is anyone facing the same type of issue in SolrCloud? Any suggestions
> >>> to solve it?
> >>>
> >>>
> >>>
> >>> Regards,
> >>>
> >>> kshitij
> >>
>
>


Re: Solr Upgrade socketTimeout issue in 8.2

2020-02-19 Thread kshitij tyagi
Hi,

Any information on the socket timeout issue when using the Collection APIs? I
am observing increased response times when using the Collection APIs in the
upgraded version.

On Wed, Feb 19, 2020 at 2:22 PM Jörn Franke  wrote:

> Yes you need to reindex.
> Update solrconfig and schemas to leverage the newer features of the version
> (some datatypes are now more optimal, others are deprecated).
>
> Update solrconfig.xml and the schema to leverage the latest datatypes,
> features etc.
>
> Create new collection based on newest config.
> Use your regular Index process to move documents to new collection.
>
> Check if new collection works and has expected performance.
>
> Delete old collection.
>
> Test before in a test environment and not in production!
>
> > Am 19.02.2020 um 09:46 schrieb Yogesh Chaudhari
> :
> >
> > Hi,
> >
> > Could you please share the steps to upgrade Solr?
> >
> > Now I am using Solr Cloud 5.2.1 in production and want to upgrade to
> > Solr 7.7.2. I am doing this in 2 steps: Solr 5.2.1 to Solr 6.6.6, then
> > Solr 7.7.2.
> >
> > I have upgraded Solr but am getting issues indexing old documents. I am
> > badly stuck trying to get the old documents into the migrated Solr version.
> >
> > Should I do the re-indexing? If yes, can you please share the way to
> > re-index?
> >
> > Can you please provide your inputs on this?
> >
> > Thanks,
> >
> > Yogesh Chaudhari
> >
> > -Original Message-
> > From: kshitij tyagi 
> > Sent: Wednesday, February 19, 2020 12:52 PM
> > To: solr-user@lucene.apache.org
> > Subject: Solr Upgrade socketTimeout issue in 8.2
> >
> > Hi,
> >
> > We have upgraded our solrCloud from version 6.6.0 to 8.2.0
> >
> > At the time of indexing, we are intermittently observing a socketTimeout
> > exception when using the Collection APIs, for example when we try
> > reloading one of the collections using the CloudSolrClient class.
> >
> > Is there any performance degradation in the SolrCloud Collection APIs?
> >
> > logs:
> >
> > IOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught
> end of stream exception
> >
> > EndOfStreamException: Unable to read additional data from client
> sessionid 0x2663e756d775747, likely client has closed socket
> >
> > at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> >
> > at
> >
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
> >
> > at java.lang.Thread.run(Unknown Source)
> >
> >
> > logs:
> >
> >
> > Exception has occured in job switch: Timeout occurred while waiting
> > response from server at: http://prod-t-8.net:8983/solr
> >
> >
> > Is anyone facing the same type of issue in SolrCloud? Any suggestions to
> > solve it?
> >
> >
> >
> > Regards,
> >
> > kshitij
>


Solr Upgrade socketTimeout issue in 8.2

2020-02-18 Thread kshitij tyagi
Hi,

We have upgraded our solrCloud from version 6.6.0 to 8.2.0

At the time of indexing, we are intermittently observing a socketTimeout
exception when using the Collection APIs, for example when we try reloading
one of the collections using the CloudSolrClient class.

Is there any performance degradation in the SolrCloud Collection APIs?

logs:

IOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of
stream exception

EndOfStreamException: Unable to read additional data from client sessionid
0x2663e756d775747, likely client has closed socket

at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)

at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)

at java.lang.Thread.run(Unknown Source)


logs:


Exception has occured in job switch: Timeout occurred while waiting
response from server at:http://prod-t-8.net:8983/solr


Is anyone facing the same type of issue in SolrCloud? Any suggestions to solve it?



Regards,

kshitij
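
One workaround for reload timeouts is to run the reload asynchronously and
poll for completion instead of holding an HTTP connection open; a hedged
SolrJ sketch (method names as in recent SolrJ; 'client' is an existing
CloudSolrClient and the collection name is made up):

import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.client.solrj.response.RequestStatusState;

// fire the reload and return immediately with a request id
String asyncId = CollectionAdminRequest.reloadCollection("mycollection")
        .processAsync(client);

// poll the request status for up to 120 seconds
RequestStatusState state = CollectionAdminRequest.requestStatus(asyncId)
        .waitFor(client, 120);
System.out.println("reload finished with state: " + state);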


Re: Upgrading solr to 8.2

2020-01-15 Thread kshitij tyagi
Hi,

Any suggestions from anyone?

Regards,
kshitij

On Tue, Jan 14, 2020 at 4:11 PM Jan Høydahl  wrote:

> Please don’t cross-post, this discussion belongs in solr-user only.
>
> Jan
>
> > 14. jan. 2020 kl. 22:22 skrev kshitij tyagi  >:
> >
> > Also, Trie fields have been updated to Point fields; will that by any
> > chance degrade my response time by 50 percent?
> >
> > On Tue, Jan 14, 2020 at 1:37 PM kshitij tyagi 
> > wrote:
> >
> >> Hi Team,
> >>
> >> I am currently upgrading my system from Solr 6.6 to Solr 8.2:
> >>
> >> 1. I am observing increased search time in my queries, i.e. search
> >> response time is increasing along with CPU utilisation, although memory
> >> looks fine. On analysing heap dumps I figured out that queries are
> >> taking most of the time in DocsStreamer.java, in the method
> >> convertLuceneDocToSolrDoc.
> >> I saw a couple of Solr JIRAs regarding the same, for example SOLR-11891,
> >> SOLR-1265.
> >>
> >> Can anyone please help me out by pointing out where I need to look and
> >> what needs to be done in order to bring my response time back to what it
> >> was earlier?
> >>
> >> Regards,
> >> kshitij
> >>
>
>


Re: Upgrading solr to 8.2

2020-01-14 Thread kshitij tyagi
Also, Trie fields have been updated to Point fields; will that by any chance
degrade my response time by 50 percent?

On Tue, Jan 14, 2020 at 1:37 PM kshitij tyagi 
wrote:

> Hi Team,
>
> I am currently upgrading my system from Solr 6.6 to Solr 8.2:
>
> 1. I am observing increased search time in my queries, i.e. search response
> time is increasing along with CPU utilisation, although memory looks fine.
> On analysing heap dumps I figured out that queries are taking most of the
> time in DocsStreamer.java, in the method convertLuceneDocToSolrDoc.
> I saw a couple of Solr JIRAs regarding the same, for example SOLR-11891,
> SOLR-1265.
>
> Can anyone please help me out by pointing out where I need to look and what
> needs to be done in order to bring my response time back to what it was
> earlier?
>
> Regards,
> kshitij
>


Upgrading solr to 8.2

2020-01-14 Thread kshitij tyagi
Hi Team,

I am currently upgrading my system from Solr 6.6 to Solr 8.2:

1. I am observing increased search time in my queries, i.e. search response
time is increasing along with CPU utilisation, although memory looks fine.
On analysing heap dumps I figured out that queries are taking most of the
time in DocsStreamer.java, in the method convertLuceneDocToSolrDoc.
I saw a couple of Solr JIRAs regarding the same, for example SOLR-11891,
SOLR-1265.

Can anyone please help me out by pointing out where I need to look and what
needs to be done in order to bring my response time back to what it was
earlier?

Regards,
kshitij
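
One cheap experiment for this symptom: since convertLuceneDocToSolrDoc does
the stored-field-to-response work, its cost scales with how many fields each
hit returns, so comparing timings with a minimal fl can confirm whether
retrieval is the bottleneck. A hedged SolrJ sketch (collection and field
names are made up; 'client' is an existing SolrClient):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;

SolrQuery q = new SolrQuery("category:books");
q.setFields("id");   // return only the uniqueKey: minimal doc-conversion work
q.setRows(10);
QueryResponse rsp = client.query("mycollection", q);
System.out.println("QTime=" + rsp.getQTime());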


Re: Query on autoGeneratePhraseQueries

2019-10-15 Thread kshitij tyagi
Hi,

Try debugging your Solr query to understand how it gets parsed, e.g. by
adding "debug=true" to the request.

On Tue, Oct 15, 2019 at 12:58 PM Shubham Goswami 
wrote:

> *Hi all,*
>
> I am a beginner with the Solr framework, and I am trying to use the
> *autoGeneratePhraseQueries* property on a fieldtype of type=text_general. I
> set the property value to true and restarted the Solr server, but it is
> still not treating my two-word query (e.g. Black company) as a phrase
> without double quotes, and returns results only for Black.
>
>  Can somebody please help me to understand what I am missing?
> Following is my schema.xml code; I am using Solr version 7.5.
> <fieldType name="text_general" class="solr.TextField"
>   positionIncrementGap="100" multiValued="true"
>   autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
>       ignoreCase="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
>       ignoreCase="true"/>
>     <filter class="solr.SynonymGraphFilterFactory"
>       ignoreCase="true" synonyms="synonyms.txt"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
>
> --
> *Thanks & Regards*
> Shubham Goswami
> Enterprise Software Engineer
> *HotWax Systems*
> *Enterprise open source experts*
> cell: +91-7803886288
> office: 0731-409-3684
> http://www.hotwaxsystems.com
>


Re: Solr Repeaters/Slaves replicating at every commit on Master instead of Optimize

2019-09-01 Thread kshitij tyagi
Try changing replicateAfter from commit to
optimize.

Also, if that does not work, try removing the polling interval configuration
from the slaves.

What you are seeing is expected behaviour for Solr, and nothing is unusual.
Try out the changes, and I hope it should work fine.

On Sun, Sep 1, 2019 at 7:52 AM Monil Parikh  wrote:

> Hello Solr Users,
>
> I am trying to get Master-Repeater-Slave config to work, I am facing
> replication related issue on luceneMatchVersion 7.7.1.
>
> Posted on stack overflow with all details:
>
> https://stackoverflow.com/questions/57741934/solr-repeaters-slaves-replicating-are-every-commit-on-master-instead-of-optimize
>
> Thanks in advance!
>


Re: Solr edismax parser with multi-word synonyms

2019-07-18 Thread kshitij tyagi
Hi sunil,

1. As you have added "microwave food" as a multi-word synonym of
"frozen dinner", the edismax parser finds your synonym in the file and
treats your query as a phrase query.

This is the reason you are seeing the parsed query as +(((+title:microwave
+title:food) (+title:frozen +title:dinner))); "frozen dinner" is considered
a phrase here.

If you want a partial match on your query, then you can add "frozen dinner,
microwave food, microwave, food" to your synonym file, and you will see the
parsed query as:
"+(((+title:microwave +title:food) title:microwave title:food
(+title:frozen +title:dinner)))"
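
That is, a synonyms.txt line along these lines (a sketch of the idea; listing
the single-word terms alongside the multi-word ones keeps them as optional
alternatives in the expanded query):

frozen dinner,microwave food,microwave,food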
 Another option is to write your own custom query parser and use it as a
plugin.

Hope this helps!!

kshitij


On Thu, Jul 18, 2019 at 9:14 AM Sunil Srinivasan  wrote:

>
> I have enabled the SynonymGraphFilter in my field configuration in order
> to support multi-word synonyms (I am using Solr 7.6). Here is my field
> configuration:
> <fieldType name="text_general" class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"/>
>   </analyzer>
> </fieldType>
>
> And this is my synonyms.txt file:
> frozen dinner,microwave food
>
> Scenario 1: blue shirt (query with no synonyms)
>
> Here is my first Solr query:
>
> http://localhost:8983/solr/base/search?q=blue+shirt&df=title&defType=edismax&debug=on
>
> And this is the parsed query I see in the debug output:
> +((title:blue) (title:shirt))
>
> Scenario 2: frozen dinner (query with synonyms)
>
> Now, here is my second Solr query:
>
> http://localhost:8983/solr/base/search?q=frozen+dinner&df=title&defType=edismax&debug=on
>
> And this is the parsed query I see in the debug output:
> +(((+title:microwave +title:food) (+title:frozen +title:dinner)))
>
> I am wondering why the first query looks for documents containing at least
> one of the two query tokens, whereas the second query looks for documents
> with both of the query tokens? I would understand if it looked for both the
> tokens of the synonyms (i.e. both microwave and food) to avoid the
> sausagization problem. But I would like to get partial matches on the
> original query at least (i.e. it should also match documents containing
> just the token 'dinner').
>
> Would any one know why the behavior is different across queries with and
> without synonyms? And how could I work around this if I wanted partial
> matches on queries that also have synonyms?
>
> Ideally, I would like the parsed query in the second case to be:
> +(((+title:microwave +title:food) (title:frozen title:dinner)))
>
> I'd appreciate any help with this. Thanks!
>


Re: Solr Sudden I/O spike

2019-07-11 Thread kshitij tyagi
Hi,

Can you check whether there is any indexing going on in the core, or a merge
or an optimize triggered on it? There might be an instance of high I/O if any
background merge triggers while serving query requests.

Regards,
kshitij

On Fri, Jun 14, 2019 at 5:23 PM Sripra deep 
wrote:

> Hi,
>   Any help would be appreciated. I am using Solr 7.1.0; suddenly we got
> high I/O even with a very low request rate, and the core went down. Did
> anybody experience the same, or know the root cause of this?
>
> Below are the log error msg that we got from solr.log
>
> 2019-06-06 10:37:14.490 INFO  (qtp761960786-8618) [   ]
> o.a.s.s.HttpSolrCall Unable to write response, client closed connection or
> we are shutting down
> org.eclipse.jetty.io.EofException
>         at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:199)
>         at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:420)
>         at org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:375)
>         at org.eclipse.jetty.io.SelectChannelEndPoint$3.run(SelectChannelEndPoint.java:107)
>         at org.eclipse.jetty.io.SelectChannelEndPoint.onSelected(SelectChannelEndPoint.java:193)
>         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processSelected(ManagedSelector.java:283)
>         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:181)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceExecuteConsume(ExecuteProduceConsume.java:169)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:145)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Broken pipe
>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>         at sun.nio.ch.IOUtil.write(IOUtil.java:51)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
>         at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:177)
>         ... 12 more
> Thanks,
> Sripradeep P
>
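
One way to check for the indexing/merge activity mentioned above is the
core-level segments endpoint, which lists per-segment sizes and
deleted-document counts; a hedged SolrJ sketch (core name and URL are made
up):

import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

try (HttpSolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
    GenericSolrRequest req = new GenericSolrRequest(
            SolrRequest.METHOD.GET, "/admin/segments", new ModifiableSolrParams());
    System.out.println(req.process(client).getResponse());
}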


Re: Solr Cloud is missing in Solr installation

2018-06-23 Thread kshitij tyagi
Hi Pradeep,

Do you want your architecture to be cloud or master-slave? Which script
from bin/solr have you run to start the instance? Or share the command used
for starting the instance.

Regards,
kshitij

On Sat, Jun 23, 2018 at 3:05 PM, Pradeep Sharma 
wrote:

> Hi,
>
> I have installed Solr 7.3.1 using Bitnami on Google Cloud, but cannot find
> the SolrCloud instance in the installation.
>
> Regards
> Pradeep Sharma


Re: In-place update vs Atomic updates

2018-01-08 Thread kshitij tyagi
Hi Shawn,

Thanks for the information,

1. Does an in-place update open a new searcher by itself or not?
2. As the field's data for the entire segment is rewritten, does that mean
frequent in-place updates are expensive, since each in-place update will
rewrite it again? Correct me here if my understanding is not correct.

Thanks,
Kshitij

On Mon, Jan 8, 2018 at 9:19 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 1/8/2018 4:05 AM, kshitij tyagi wrote:
>
>> What are the major differences between atomic and in-place updates, I have
>> gone through the documentation but it does not give detail internal
>> information.
>>
>
> Atomic updates are nearly identical to simple indexing, except that the
> existing document is read from the index to populate a new document along
> with whatever updates were requested, then the new document is indexed and
> the old one is deleted.
>
> 1. Does doing in-place update prevents solr cache burst or not, what are
>> the benefits of using in-place updates?
>>
>
> In-place updates are only possible on a field where only docValues is
> true.  The settings for things like indexed and stored must be false.
>
> An in-place update finds the segment containing the document and writes a
> whole new file containing the value of every document in the segment for
> the updated field.  If the segment contains ten million documents, then
> information for ten million values will be written for a single document
> update.
>
> I want to update one of the fields of the documnet but I do not want to
>> burst my cache.
>>
>
> When the index changes for ANY reason, no matter how the change is
> accomplished, caches must be thrown away when a new searcher is built.
> Lucene and Solr have no way of knowing that a change doesn't affect some
> cache entries, so the only thing it can do is assume that all the
> information in the cache is now invalid.  What you are asking for here is
> not possible at the moment, and chances are that if code was written to do
> it, that it would be far slower than simply invalidating the caches and
> doing autowarming.
>
> Thanks,
> Shawn
>


In-place update vs Atomic updates

2018-01-08 Thread kshitij tyagi
Hi,

What are the major differences between atomic and in-place updates? I have
gone through the documentation, but it does not give detailed internal
information.

1. Does doing an in-place update prevent the Solr caches from being
invalidated or not? What are the benefits of using in-place updates?

I want to update one of the fields of the document, but I do not want to
invalidate my caches.

What is the best approach to achieve this?

Thanks,
Kshitij
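
For reference, an atomic update is posted as a map of modifier ('set', 'inc',
'add', ...) to value; Solr executes it in-place only when the target field is
docValues=true and neither indexed nor stored, per the conditions discussed in
the reply above. A hedged SolrJ sketch ('popularity' and the collection name
are made up; 'client' is an existing SolrClient):

import java.util.Collections;
import org.apache.solr.common.SolrInputDocument;

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "doc-1");
// {"set": 42} is the atomic-update syntax; in-place if the field qualifies
doc.addField("popularity", Collections.singletonMap("set", 42));
client.add("mycollection", doc);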


Re: max docs, deleted docs optimization

2017-11-01 Thread kshitij tyagi
Thanks Erick for your prompt response; it was really helpful.

On Tue, Oct 31, 2017 at 8:30 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> 1> 2 lakh at most. If the standard background merging is going on it
> may be less than that.
>
> 2> Some, but whether you notice or not is an open question. In an
> index with only 10 lakh docs, it's unlikely even having 50% deleted
> documents is going to make much of a difference.
>
> 3> Yes, the deleted docs are in segment until it's merged away. Lucene
> is very efficient (according to Mike McCandless) at skipping deleted
> docs.
>
> 4> It rewrites all segments, purging deleted documents. However, it
> has some pitfalls, see:
> https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/.
> In general it's simply not recommended to optimize. There is a Solr
> JIRA discussing this in detail, but I can't get to the site to link it
> right now.
>
> In general, as an index is updated segments are merged together and
> during that process any deleted documents are purged.
>
> Two resources:
> https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> See the third animation, TieredMergePolicy, which is the default, here:
> http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
>
> Best,
> Erick
>
> On Tue, Oct 31, 2017 at 4:40 AM, kshitij tyagi
> <kshitij.shopcl...@gmail.com> wrote:
> > Hi,
> >
> > I am using atomic update to update one of the fields, I want to know :
> >
> > 1. If total docs in the core are 10 lakh and I partially update 2 lakh
> > docs, what will be the number of deleted docs?
> >
> > 2. Does a higher number of deleted docs affect query time, i.e., does
> > query time increase if there are more deleted docs?
> >
> > 3. Are deleted docs present in the segments? Are deleted docs traversed
> > during query execution?
> >
> > 4. What does the Optimize button in the Solr admin UI do exactly?
> >
> > Help is much appreciated.
> >
> > Regards,
> > Kshitij
>


max docs, deleted docs optimization

2017-10-31 Thread kshitij tyagi
Hi,

I am using atomic update to update one of the fields, I want to know :

1. If total docs in the core are 10 lakh and I partially update 2 lakh docs,
what will be the number of deleted docs?

2. Does a higher number of deleted docs affect query time, i.e., does query
time increase if there are more deleted docs?

3. Are deleted docs present in the segments? Are deleted docs traversed
during query execution?

4. What does the Optimize button in the Solr admin UI do exactly?

Help is much appreciated.

Regards,
Kshitij


Same queries taking more time

2017-06-13 Thread kshitij tyagi
Hi,

We are using a master-slave architecture. Here are the observations:

1. Heap size and connections on the slave are increasing and leading to
higher query times.

2. We are noticing in the Solr admin UI that the segment count is huge and
also that no merging is taking place.

3. We have not made any changes; a new searcher was opened around 5 hrs ago
by Solr, and since then we are seeing these issues.

What are the aspects we should check as of now?

Help appreciated.

Regards,
Kshitij


Help with facet.limit

2017-04-26 Thread kshitij tyagi
Hi Team,

I am using a facet on a particular field along with facet.limit=500. The
problem I am facing is:

1. As there are more than 500 facet values and it gives me only 500 results,
I want particular facet values to be returned, i.e., can I tell Solr to
return 500 facet values along with the ones I require?

e.g. the facet values returned are a,b,c,d when using facet.limit=4. There
are other facet values too, but they are not returned because of the limit.

Can I specify something in the query, such as 'e', so that my facet result is
a,b,c,e?

Regards,
Kshitij
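
One common workaround is facet.query: counts for specific values are returned
regardless of facet.limit, alongside the regular facet.field buckets. A
hedged SolrJ sketch (the field name 'brand' is made up):

import org.apache.solr.client.solrj.SolrQuery;

SolrQuery q = new SolrQuery("*:*");
q.setFacet(true);
q.addFacetField("brand");      // top-500 values, as before
q.setFacetLimit(500);
q.addFacetQuery("brand:e");    // the count for 'e' is always returned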


Re: Solr Index size keeps fluctuating, becomes ~4x normal size.

2017-04-10 Thread kshitij tyagi
Hi Himanshu,

Limiting maxWarmingSearchers would break nothing in production. Whenever you
request Solr to open a new searcher, it autowarms the searcher so that it can
utilize caching. After the autowarm is complete, the new searcher is opened.

The questions you need to address here are:

1. Are you using soft commits or hard commits? If you are using hard commits
and the update frequency is high, then you need to switch to soft commits.

2. You are dealing with only 2.1 million documents; that is a small set, but
you are still facing issues. Why are you indexing all the fields in Solr?
You need to make significant changes in the schema and index only those
fields upon which you are querying, not all the fields.

3. Check your segment count configuration in solrconfig.xml; it should not
be too high or too low, as it will affect indexing speed. A high number
would give good indexing speed but poor search performance.
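
On point 1, one low-risk pattern is to stop issuing explicit commits from the
indexer and let Solr fold commits together with commitWithin; a hedged SolrJ
sketch ('doc' is a SolrInputDocument built elsewhere; the collection name is
made up):

import org.apache.solr.client.solrj.request.UpdateRequest;

UpdateRequest req = new UpdateRequest();
req.add(doc);
req.setCommitWithin(10000);    // ask Solr to make this visible within 10 s
req.process(client, "mycollection");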

Hope these things would help you to tune better.

Regards,
Kshitij

On Mon, Apr 10, 2017 at 1:27 PM, Himanshu Sachdeva 
wrote:

> Hi Toke,
>
> Thanks for your time and quick response. As you said, I changed our logging
> level from SEVERE to INFO and indeed found the performance warning
> *Overlapping
> onDeckSearchers=2* in the logs. I am considering limiting the
> *maxWarmingSearchers* count in configuration but want to be sure that
> nothing breaks in production in case simultaneous commits do happen
> afterwards.
>
> What would happen if we set *maxWarmingSearchers* count to 1 and make
> simultaneous commit from different endpoints? I understand that solr will
> prevent opening a new searcher for the second commit but is that all there
> is to it? Does it mean solr will serve stale data( i.e. send stale data to
> the slaves) ignoring the changes from the second commit? Will these changes
> reflect only when a new searcher is initialized and will they be ignored
> till
> then? Do we even need searchers on the master as we will be querying only
> the slaves? What purpose do the searchers serve exactly? Your time and
> guidance will be very much appreciated. Thank you.
>
> On Thu, Apr 6, 2017 at 6:12 PM, Toke Eskildsen  wrote:
>
> > On Thu, 2017-04-06 at 16:30 +0530, Himanshu Sachdeva wrote:
> > > We monitored the index size for a few days and found that it varies
> > > widely from 11GB to 43GB.
> >
> > Lucene/Solr indexes consists of segments, each holding a number of
> > documents. When a document is deleted, its bytes are not removed
> > immediately, only marked. When a document is updated, it is effectively
> > a delete and an add.
> >
> > If you have an index with 3 documents
> >   segment-0 (live docs [0, 1, 2], deleted docs [])
> > and update documents 0 and 1, you will have
> >   segment-0 (live docs [2], deleted docs [0, 1])
> >   segment-1 (live docs [0, 1], deleted docs [])
> > if you then update document 1 again, you will have
> >   segment-0 (live docs [2], deleted docs [0, 1])
> >   segment-1 (live docs [0], deleted docs [1])
> >   segment-2 (live docs [1], deleted docs [])
> >
> > for a total of ([2] + [0, 1]) + ([0] + [1]) + ([1] + []) = 6 documents.
> >
> > The space is reclaimed when segments are merged, but depending on your
> > setup and update pattern that may take some time. Furthermore there is a
> > temporary overhead of merging, when the merged segment is being written
> and
> > the old segments are still available. 4x the minimum size is fairly
> large,
> > but not unrealistic, with enough index-updates.
> >
> > > Recently, we started getting a lot of out of memory errors on the
> > > master. Everytime, solr becomes unresponsive and we need to restart
> > > jetty to bring it back up. At the same we observed the variation in
> > > index size. We are suspecting that these two problems may be linked.
> >
> > Quick sanity check: Look for "Overlapping onDeckSearchers" in your
> > solr.log to see if your memory problems are caused by multiple open
> > searchers:
> > https://wiki.apache.org/solr/FAQ#What_does_.22exceeded_limit_of_maxWarmingSearchers.3DX.22_mean.3F
> > --
> > Toke Eskildsen, Royal Danish Library
> >
>
>
>
> --
> Himanshu Sachdeva
>


Re: Issue with facet count

2017-04-10 Thread kshitij tyagi
Thanks Alex for taking the time to help us understand this better.

Cheers!
Kshitij

On Mon, Apr 10, 2017 at 4:00 PM, alessandro.benedetti 
wrote:

> It really depends on the schema change...
> Any addition/deletion usually means you can avoid re-indexing, if you don't
> mind that the old documents will remain outdated.
> But doing a type change, or a change to the data structures involved (such
> as enabling docValues, norms etc.) without a full re-index is a NO-GO (you
> can introduce a lot of subtle problems, not immediately visible).
>
> There have been a lot of discussions in the past to allow Solr manage
> schema
> changes on the fly ( with background jobs transparent to the
> user/administrator), but nothing concrete yet that I know.
>
> Cheers
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
>


Re: Issue with facet count

2017-04-10 Thread kshitij tyagi
Hi Alex,

After full re-indexing, things work out fine.

But is there any other way to make schema changes on the go?

Or do we have to reindex the entire data whenever a schema change is made?

We have 30-40 million documents, and reindexing is a tedious and
time-consuming task.

What other approaches are there to change the schema on the fly?

Regards,
Kshitij

On Sun, Apr 9, 2017 at 12:55 AM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> Did you do a full reindex? Try completely deleting the index and
> redoing it from scratch (at least as a test). If you have left over
> documents and changed type definitions, things may get messy. If
> that's too hard, just index a single record into a separate collection
> with matching-definition and check there to find the difference.
>
> A type change could be especially complicated if one type was defined
> (on field OR on type) with DocValues and another one without.
>
> Regards,
>Alex.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 8 April 2017 at 18:42, kshitij tyagi <kshitij.shopcl...@gmail.com>
> wrote:
> > Hi Alex,
> >
> > Thanks for the response.
> >
> > 1. Everything was working fine until I made changes to a dynamic field on
> > which facets are being created: I changed the field type to "strings";
> > earlier I was tokenizing the field based on the delimiter '_'.
> >
> > 2. When I made the changes and started indexing, the facet counts started
> > coming back as zero, though I was able to query properly on the facet
> > fields and the results were fine.
> >
> > 3. Then I reverted my changes in the schema file, but my problem was not
> > solved; it was still giving zero facet counts after reindexing.
> >
> > I am unable to understand this: as I have reverted my schema changes,
> > things should be back to normal, but the case is almost the opposite.
> >
> > Kindly help!
> > Let me know if you require any other information.
> >
> >
> > On Sat, Apr 8, 2017 at 9:00 PM, Alexandre Rafalovitch <
> arafa...@gmail.com>
> > wrote:
> >
> >> What happens when you facet on a 'all document query' (q=*:*)? Are you
> >> sure your facet fields actually have the content? If they are stored,
> >> they should be returned with the query. If they are not stored, you
> >> could see what they contain in the Schema screen of the Admin UI (load
> >> tokens checkbox).
> >>
> >> Hope that helps to narrow down the issue.
> >>
> >> Regards,
> >>Alex.
> >> P.s. I don't doubt you, but for myself, I would also do a sanity check
> >> that I AM actually getting results because I am querying on THOSE
> >> fields and not - say - on some other field and/or copyField target.
> >> Enabling debug would show you exactly what fields are being querying
> >> with what (analyzed) token.
> >> 
> >> http://www.solr-start.com/ - Resources for Solr users, new and
> experienced
> >>
> >>
> >> On 8 April 2017 at 17:53, kshitij tyagi <kshitij.shopcl...@gmail.com>
> >> wrote:
> >> > Hi,
> >> >
> >> > I am getting a zero count for all facets created using facet.field in
> >> > Solr 5.1.
> >> >
> >> > The surprising part is that I am able to query correctly on the
> >> > fields, but my facet counts are returning zero.
> >> >
> >> > Can anyone help me out here on what all I should check?
> >> >
> >> > Regards,
> >> > Kshitij
> >>
>


Re: Issue with facet count

2017-04-08 Thread kshitij tyagi
Hi Alex,

Thanks for the response.

1. Everything was working fine until I made changes to a dynamic field on
which facets are being created: I changed the field type to "strings";
earlier I was tokenizing the field based on the delimiter '_'.

2. When I made the changes and started indexing, the facet counts started
coming back as zero, though I was able to query properly on the facet fields
and the results were fine.

3. Then I reverted my changes in the schema file, but my problem was not
solved; it was still giving zero facet counts after reindexing.

I am unable to understand this: as I have reverted my schema changes, things
should be back to normal, but the case is almost the opposite.

Kindly help!
Let me know if you require any other information.


On Sat, Apr 8, 2017 at 9:00 PM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> What happens when you facet on a 'all document query' (q=*:*)? Are you
> sure your facet fields actually have the content? If they are stored,
> they should be returned with the query. If they are not stored, you
> could see what they contain in the Schema screen of the Admin UI (load
> tokens checkbox).
>
> Hope that helps to narrow down the issue.
>
> Regards,
>Alex.
> P.s. I don't doubt you, but for myself, I would also do a sanity check
> that I AM actually getting results because I am querying on THOSE
> fields and not - say - on some other field and/or copyField target.
> Enabling debug would show you exactly what fields are being querying
> with what (analyzed) token.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 8 April 2017 at 17:53, kshitij tyagi <kshitij.shopcl...@gmail.com>
> wrote:
> > Hi,
> >
> > I am getting a zero count for all facets created using facet.field in
> > Solr 5.1.
> >
> > The surprising part is that I am able to query correctly on the fields,
> > but my facet counts are returning zero.
> >
> > Can anyone help me out here on what all I should check?
> >
> > Regards,
> > Kshitij
>


Issue with facet count

2017-04-08 Thread kshitij tyagi
Hi,

I am getting a zero count for all facets created using facet.field in Solr
5.1.

The surprising part is that I am able to query correctly on the fields, but
my facet counts are returning zero.

Can anyone help me out here on what all I should check?

Regards,
Kshitij


Re: Updating 100 documents in one request

2017-03-01 Thread kshitij tyagi
Thanks everyone for your inputs; we are using Solr 5.1 as of now.

@Rick/Walter: an explanation of, or a link about, how the entire set of
loaded documents is saved as JSONL in S3 would be helpful.

Regards,
Kshitij

On Wed, Mar 1, 2017 at 10:06 PM, Walter Underwood <wun...@wunderwood.org>
wrote:

> That is exactly what we do. The entire set of loaded documents is saved as
> JSONL in S3. Very handy for loading up a prod index in test for diagnosis
> or benchmarking.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Mar 1, 2017, at 8:14 AM, Rick Leir <rl...@leirtech.com> wrote:
> >
> > And perhaps put the crawl results in JSONL, so when you get a 404 you
> > can use yesterday's document in a pinch. Cheers -- Rick
> >
> > On March 1, 2017 10:20:21 AM EST, Walter Underwood <
> wun...@wunderwood.org> wrote:
> >> Since I always need to know which document was bad, I back off to
> >> batches of one document when there is a failure.
> >>
> >> wunder
> >> Walter Underwood
> >> wun...@wunderwood.org
> >> http://observer.wunderwood.org/  (my blog)
> >>
> >>
> >>> On Mar 1, 2017, at 6:25 AM, Erick Erickson <erickerick...@gmail.com>
> >> wrote:
> >>>
> >>> What version of Solr? This was a pretty long-standing issue that was
> >>> fixed in Solr 6.1,
> >>> see: https://issues.apache.org/jira/browse/SOLR-445 Otherwise you
> >> really have to
> >>> write your code to re-transmit sub-packets, perhaps even one at a
> >> time
> >>> when a packet
> >>> fails.
> >>>
> >>> Best,
> >>> Erick
> >>>
> >>> On Wed, Mar 1, 2017 at 3:46 AM, kshitij tyagi
> >>> <kshitij.shopcl...@gmail.com> wrote:
> >>>> Hi Team,
> >>>>
> >>>> I am facing an issue when I am updating more than one document in
> >>>> Solr.
> >>>>
> >>>> 1. If any one document gives a 400 error, then my other documents are
> >>>> also not updated.
> >>>>
> >>>> How can I approach solving this? I need my other documents, which are
> >>>> not giving a 400 error, to be indexed.
> >>>>
> >>>> Help appreciated!
> >>>>
> >>>> Regards,
> >>>> Kshitij
> >
> > --
> > Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
>


Updating 100 documents in one request

2017-03-01 Thread kshitij tyagi
Hi Team,

I am facing an issue when I am updating more than one document in Solr.

1. If any one document gives a 400 error, then my other documents are also
not updated.

How can I approach solving this? I need my other documents, which are not
giving a 400 error, to be indexed.

Help appreciated!

Regards,
Kshitij
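
A hedged SolrJ sketch of the fallback approach suggested in the replies
above: send the batch, and on failure retry one document at a time so a
single bad document does not block the rest.

import java.util.List;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrInputDocument;

static void addWithFallback(SolrClient client, String coll,
                            List<SolrInputDocument> batch) {
    try {
        client.add(coll, batch);
    } catch (Exception batchFailure) {
        // the batch contained at least one bad document: retry individually
        for (SolrInputDocument doc : batch) {
            try {
                client.add(coll, doc);
            } catch (Exception docFailure) {
                System.err.println("skipping " + doc.getFieldValue("id")
                        + ": " + docFailure.getMessage());
            }
        }
    }
}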


Solr Partial Update query

2017-01-30 Thread kshitij tyagi
Hi,

I want to partially update documents in Solr. The issue I am facing is that
I want to update only those documents in Solr that are already present.

I don't want to query Solr to check whether a document is present or not; I
just want to post updates to existing documents. How can I achieve this?

Help appreciated.

Regards,
Kshitij
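
One way to get update-only-if-present semantics without a prior query is
Solr's optimistic concurrency: sending _version_=1 with the update means
"the document must already exist", and Solr rejects the update (HTTP 409
conflict) otherwise. A hedged SolrJ sketch (field and collection names are
made up; 'client' is an existing SolrClient):

import java.util.Collections;
import org.apache.solr.common.SolrInputDocument;

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "doc-1");
doc.addField("_version_", 1L);   // 1 = "must exist"; fails if doc-1 is absent
doc.addField("price", Collections.singletonMap("set", 9.99));
client.add("mycollection", doc);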


Same score listing order

2017-01-10 Thread kshitij tyagi
Hi,

I need to understand the listing order of documents returned by a query when
all documents have the same score.

Regards,
Kshitij
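
For what it's worth: when scores tie, results fall back to internal Lucene
document order, which roughly follows indexing order but is not guaranteed to
stay stable across segment merges; an explicit secondary sort makes the order
deterministic. A hedged SolrJ sketch (query and field names are made up):

import org.apache.solr.client.solrj.SolrQuery;

SolrQuery q = new SolrQuery("title:shoes");
q.addSort("score", SolrQuery.ORDER.desc);
q.addSort("id", SolrQuery.ORDER.asc);   // deterministic tie-breaker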


Solr json facet api

2017-01-05 Thread kshitij tyagi
Hi,

We were earlier using Solr 4.0 and have now moved to Solr 5.2.

I am debugging queries and seeing that most of the query time is taken by
Solr facet queries.

I have read about the Solr JSON Facet API available from Solr 5 onwards; can
anyone help me understand the difference between the two?

Will there be a significant gain in query performance and response time if I
manage to use the Solr JSON Facet API?

Kindly help me out here, as I am trying to reduce my query response time.

Regards,
Kshitij
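
For comparison, the JSON Facet API can be exercised from SolrJ by setting the
json.facet parameter directly; a hedged sketch (the field name 'category' is
made up; 'client' is an existing SolrClient):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;

SolrQuery q = new SolrQuery("*:*");
q.setRows(0);   // facet counts only, no documents
q.set("json.facet", "{categories:{type:terms,field:category,limit:10}}");
QueryResponse rsp = client.query("mycollection", q);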


Re: Queries regarding solr cache

2017-01-04 Thread kshitij tyagi
Hi Shawn,

Need your help:

I am using a master-slave architecture in my system, and here is the
replication config from solrconfig.xml:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="enable">${enable.master:false}</str>
    <str name="replicateAfter">startup</str>
    <str name="replicateAfter">commit</str>
    <str name="commitReserveDuration">00:00:10</str>
    <str name="confFiles">managed-schema</str>
  </lst>
  <lst name="slave">
    <str name="enable">${enable.slave:false}</str>
    <str name="masterUrl">http://${MASTER_CORE_URL}/${solr.core.name}</str>
    <str name="pollInterval">${POLL_TIME}</str>
  </lst>
</requestHandler>

Problem:

I am noticing that my slaves are not able to use their caches properly:

1. I am indexing on my master and committing frequently; my slaves are
installing new indexes very frequently, so the caches never get built
properly and my cache hit ratio is almost zero.

2. What changes do I need to make so that the caches build up properly even
after commits and can actually be used? This is wasting a lot of my
resources and also slowing down the queries.

On Mon, Dec 5, 2016 at 9:06 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 12/5/2016 6:44 AM, kshitij tyagi wrote:
> >   - lookups:381
> >   - hits:24
> >   - hitratio:0.06
> >   - inserts:363
> >   - evictions:0
> >   - size:345
> >   - warmupTime:2932
> >   - cumulative_lookups:294948
> >   - cumulative_hits:15840
> >   - cumulative_hitratio:0.05
> >   - cumulative_inserts:277963
> >   - cumulative_evictions:70078
> >
> >   How can I increase my hit ratio? I am not able to understand solr
> >   caching mechanism clearly. Please help.
>
> This means that out of the nearly 300,000 queries executed by that
> handler, only five percent (15840) of them were found in the cache.  The
> rest of them were not found in the cache at the moment they were made.
> Since these numbers come from the queryResultCache, this refers to the
> "q" parameter.  The filterCache handles things in the fq parameter.  The
> documentCache holds actual documents from your index and fills in stored
> data in results so the document doesn't have to be fetched from the index.
>
> Possible reasons:  1) Your users are rarely entering the same query more
> than once.  2) Your client code is adding something unique to every
> query (q parameter) so very few of them are the same.  3) You are
> committing so frequently that the cache never has a chance to get large
> enough to make a difference.
>
> Here are some queryResultCache stats from one of my indexes:
>
> class:org.apache.solr.search.FastLRUCache
> version:1.0
> description:Concurrent LRU Cache(maxSize=512, initialSize=512,
> minSize=460, acceptableSize=486, cleanupThread=true,
> autowarmCount=8,
> regenerator=org.apache.solr.search.SolrIndexSearcher$3@1d172ac0)
> src:$URL:
> https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_7/solr/core/src/java/org/apache/solr/search/FastLRUCache.java
> lookups:   3496
> hits:  3145
> hitratio:  0.9
> inserts:   335
> evictions: 0
> size:  338
> warmupTime: 2209
> cumulative_lookups:   12394606
> cumulative_hits:  11247114
> cumulative_hitratio:  0.91
> cumulative_inserts:   1110375
> cumulative_evictions: 409887
>
> These numbers indicate that 91 percent of the queries made to this
> handler were served from the cache.
>
> Thanks,
> Shawn
>
>


Re: Queries regarding solr cache

2016-12-05 Thread kshitij tyagi
Hi Shawn,

Thanks for the reply:

Here are the details for the query result cache (I am not using NOW in my
queries, and most of the queries are common):


   - class:org.apache.solr.search.LRUCache
   - version:1.0
   - description:LRU Cache(maxSize=1000, initialSize=1000,
   autowarmCount=10,
   regenerator=org.apache.solr.search.SolrIndexSearcher$3@73380510)
   - src:null
   - stats:
  - lookups:381
  - hits:24
  - hitratio:0.06
  - inserts:363
  - evictions:0
  - size:345
  - warmupTime:2932
  - cumulative_lookups:294948
  - cumulative_hits:15840
  - cumulative_hitratio:0.05
  - cumulative_inserts:277963
  - cumulative_evictions:70078

  How can I increase my hit ratio? I am not able to understand the Solr
  caching mechanism clearly. Please help.



On Thu, Dec 1, 2016 at 8:19 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 12/1/2016 4:04 AM, kshitij tyagi wrote:
> > I am using Solr and serving a huge number of requests in my application.
> >
> > I need to know how I can utilize caching in Solr.
> >
> > As of now, I am checking the stats by clicking Core Selector → [core
> > name] → Plugins / Stats.
> >
> > I am seeing my hit ratio as 0 for all the caches. What does this mean,
> > and how can this be optimized?
>
> If your hitratio is zero, then none of the queries related to that cache
> are finding matches.  This means that your client systems are never
> sending the same query twice.
>
> One possible reason for a zero hitratio is using "NOW" in date queries
> -- NOW changes every millisecond, and the actual timestamp value is what
> ends up in the cache.  This means that the same query with NOW executed
> more than once will actually be different from the cache's perspective.
> The solution is date rounding -- using things like NOW/HOUR or NOW/DAY.
> You could use NOW/MINUTE, but the window for caching would be quite small.
>
> 5000 entries for your filterCache is almost certainly too big.  Each
> filterCache entry tends to be quite large.  If the core has ten million
> documents in it, then each filterCache entry would be 1.25 million bytes
> in size -- the entry is a bitset of all documents in the core.  This
> includes deleted docs that have not yet been reclaimed by merging.  If a
> filterCache for an index that size (which is not all that big) were to
> actually fill up with 5000 entries, it would require over six gigabytes
> of memory just for the cache.
>
> The 1000 that you have on queryResultCache is also rather large, but
> probably not a problem.  There's also documentCache, which generally is
> OK to have sized at several thousand -- I have 16384 on mine.  If your
> documents are particularly large, then you probably would want to have a
> smaller number.
>
> It's good that your autowarmCount values are low.  High values here tend
> to make commits take a very long time.
>
> You do not need to send your message more than once.  The first repeat
> was after less than 40 minutes.  The second was after about two hours.
> Waiting a day or two for a response, particularly for a difficult
> problem, is not unusual for a mailing list.  I begain this reply as soon
> as I saw your message -- about 7:30 AM in my timezone.
>
> Thanks,
> Shawn
>
>


Queries regarding cache

2016-12-01 Thread kshitij tyagi
Hi All,

I am using Solr and serving a huge number of requests in my application.

I need to know how I can utilize caching in Solr.

I am seeing hit ratio as 0 for all the caches in Plugins/Stats.

My cache configurations in solrconfig.xml are:




Can someone please help me out here to understand and optimise Solr caching
in my system?

Thanks in advance.

Regards,
Kshitij


Fwd: Queries regarding solr cache

2016-12-01 Thread kshitij tyagi
-- Forwarded message --
From: kshitij tyagi <kshitij.shopcl...@gmail.com>
Date: Thu, Dec 1, 2016 at 4:34 PM
Subject: Queries regarding solr cache
To: solr-user@lucene.apache.org


Hi All,

I am using Solr and serving a huge number of requests in my application.

I need to know how I can utilize caching in Solr.

As of now, I am checking the stats by clicking Core Selector → [core name] → Plugins / Stats.

I am seeing my hit ratio as 0 for all the caches. What does this mean, and
how can this be optimized?

My current solr configurations are:





Regards,
Kshitij


Queries regarding solr cache

2016-12-01 Thread kshitij tyagi
Hi All,

I am using Solr and serving a huge number of requests in my application.

I need to know how I can utilize caching in Solr.

As of now, I am checking the stats by clicking Core Selector → [core name] → Plugins / Stats.

I am seeing my hit ratio as 0 for all the caches. What does this mean, and
how can this be optimized?

My current solr configurations are:





Regards,
Kshitij


Re: Solr slow response collection-High Load

2016-09-09 Thread kshitij tyagi
Hi Ankush,

As you are updating heavily on one of the cores, hard commits will play a
major role.

Reason: around hard commits, Solr flushes new segments and may trigger
segment merges, and this is a time-consuming process.

While segments are being merged, indexing of documents is affected, i.e., it
gets slower.

Try figuring out the right number of segments you need to have, and focus on
analysing Solr's merge process when you are updating a high amount of data.

You will need to find the correct timing for hard commits and the required
number of segments for the collection.

Hope this helps.



On Fri, Sep 9, 2016 at 2:13 PM, Ankush Khanna  wrote:

> Hello,
>
> We are running some test for improving our solr performance.
>
> We have around 15 collections on our solr cluster.
> But we are particularly interested in one collection holding a high volume of
> documents. (
> https://gist.github.com/AnkushKhanna/9a472bccc02d9859fce07cb0204862da)
>
> Issue:
> We see that there are high response times from the collection, for the same
> queries, when user load or update load is increased.
>
> What are we aiming for:
> Low response time (lower than 3 sec) in high update/traffic.
>
> Current collection, production:
> * Solr Cloud, 2 Shards 2 Replicas
> * Indexed: 5.4 million documents
> * 45 indexed fields per document
> * Soft commit: 5 seconds
> * Hard commit: 10 minutes
>
> Test Setup:
> * Indexed: 3 million documents
> * Rest is same as in production
> * Using gatling to mimic behaviour of updates and user traffic
>
> Finding:
> We see the problem occurring more often when:
> * query size is greater than 2000 characters (we can limit the search to
> 2000 characters, but is there a solution to do this without limiting the
> size)
> * there are heavy updates going on
> * high user traffic
>
> Some settings I explored:
> * 1 Shard and 3 Replicas
> * Hard commit: 5 minutes (Referencing
> https://lucidworks.com/blog/2013/08/23/understanding-
> transaction-logs-softcommit-and-commit-in-sorlcloud/
> )
>
> With both the above solutions we see some improvements, but not drastic.
> (Attach images)
>
> I would like to have more insights into the following questions:
> * Why is there an improvement when lowering the hard commit time? Would it
> be interesting to explore even lower hard commit times?
>
> Can someone provide some other pointers I could explore?
>
> Regards
> Ankush Khanna
>
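For reference, the commit cadence described above maps onto solrconfig.xml
roughly as follows; the openSearcher value is an assumption, since with
soft commits handling visibility the hard commit usually should not open a
new searcher:

  <autoCommit>
    <maxTime>600000</maxTime>          <!-- hard commit every 10 minutes -->
    <openSearcher>false</openSearcher> <!-- flush and fsync, but do not open a searcher -->
  </autoCommit>

  <autoSoftCommit>
    <maxTime>5000</maxTime>            <!-- soft commit every 5 seconds for visibility -->
  </autoSoftCommit>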


solr query time

2016-09-07 Thread kshitij tyagi
Hi,

I have 120 fields in a single document and I am indexing all of them,
i.e. indexed=true and stored=true in my schema.

I need to understand how that might be affecting my query time overall.

What is the relationship between query time and indexing all fields in the
schema?

Regards,
Kshitij
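For illustration, the two flags in question appear on each field
definition in schema.xml; the field names below are hypothetical:

  <field name="title"   type="text_general" indexed="true"  stored="true"/>
  <!-- a display-only field can skip the inverted index entirely: -->
  <field name="payload" type="string"       indexed="false" stored="true"/>

Broadly, indexed=true adds indexing cost and index size, while stored=true
mainly adds retrieval cost when the field is returned.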


Re: How using fl in query affects query time

2016-09-01 Thread kshitij tyagi
Thanks, Alex.

On Thu, Sep 1, 2016 at 6:54 PM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> I believe the enableLazyFieldLoading setting is supposed to help with the
> partial-fields use case. Not with query time itself, but with
> re-hydrating stored fields to return, which I guess is part of the
> query time from the user's point of view.
>
> https://cwiki.apache.org/confluence/display/solr/Query+
> Settings+in+SolrConfig#QuerySettingsinSolrConfig-enableLazyFieldLoading
>
> Regards,
>Alex.
> 
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
>
> On 1 September 2016 at 20:04, kshitij tyagi <kshitij.shopcl...@gmail.com>
> wrote:
> > Hi,
> >
> >
> > I am having around 100 fields in a single document. I want to know: if
> > I use fl and get only a single field from the query, will that reduce
> > query time? Or will getting all the fields through the query and
> > getting one field using fl both have the same query time?
> > query both will have same query time??
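For reference, the setting Alexandre mentions lives in the <query> section
of solrconfig.xml, and the stock example configs ship with it enabled:

  <enableLazyFieldLoading>true</enableLazyFieldLoading>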
>


How using fl in query affects query time

2016-09-01 Thread kshitij tyagi
Hi,


I am having around 100 fields in a single document. I want to know: if I
use fl and get only a single field from the query, will that reduce query
time?

Or will getting all the fields through the query and getting one field
using fl both have the same query time?
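A sketch of the comparison; the host and core name are placeholders:

  # only two fields are fetched and serialized:
  curl 'http://localhost:8983/solr/mycore/select?q=*:*&fl=id,name'

  # all ~100 stored fields are fetched and serialized:
  curl 'http://localhost:8983/solr/mycore/select?q=*:*&fl=*'

Broadly, the matching and scoring cost the same either way; fl changes how
much stored data has to be read and returned, which is where the time
difference shows up.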


Re: Need to understand solr merging and commit relationship

2016-08-20 Thread kshitij tyagi
Thanks Shawn, that was really helpful.

On Sat, Aug 20, 2016 at 3:17 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 8/16/2016 11:47 AM, kshitij tyagi wrote:
> > I need to understand clearly that is there any relationship between solr
> > merging and solr commit?
> >
> > If there is then what is it?
> >
> > Also i need to understand how both of these affect indexing speed on the
> > core?
>
> Whenever a new segment is written, the merge policy is checked to see
> whether a merge is needed.  If it is needed, then the merge is scheduled.
>
> A commit operation can (and frequently does) write a new segment, but
> that is not the only thing that can write (flush) new segments.  When
> the indexing RAM buffer fills up, a segment will be flushed, even
> without a commit.
>
> When paired with the default NRT Directory implementation, soft commits
> change the dynamics slightly, but not the way things generally operate.
> Soft commits are capable of flushing the latest segment(s) to memory,
> instead of the disk, but only if they are quite small.
>
> I would not expect commits to *directly* affect indexing speed unless
> you are doing commits extremely frequently.  Commits might indirectly
> affect indexing speed if they trigger a large merge.
>
> Merging can cause issues with indexing speed, even if it's happening in
> a different Solr core on the same machine.  This is because the system
> resources (I/O bandwidth, memory, CPU) required for a merge are also
> required to write a new segment.  Also, because flushing a new segment
> is effectively the same operation as the writing part of a merge, if too
> many merges are scheduled at once on a core, indexing on that core can
> stop entirely until the number of scheduled merges drops.
>
> Merging can also cause issues with query speed, if there is not
> sufficient memory available to the OS for effective disk caching.
>
> Thanks,
> Shawn
>
>
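For reference, the merge behaviour Shawn describes is tuned in the
<indexConfig> section of solrconfig.xml. A sketch that spells out
TieredMergePolicy, the default, with its default values; the factory
syntax shown applies to Solr 5.5 and later:

  <indexConfig>
    <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
      <int name="maxMergeAtOnce">10</int>  <!-- merge at most 10 segments at a time -->
      <int name="segmentsPerTier">10</int> <!-- allow 10 segments per tier before merging -->
    </mergePolicyFactory>
  </indexConfig>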


Re: What does refCount denotes in solr admin

2016-08-18 Thread kshitij tyagi
How do I shrink the Solr container thread pool?

On Thu, Aug 18, 2016 at 2:53 PM, kshitij tyagi <kshitij.shopcl...@gmail.com>
wrote:

> A refCount of 171 is seen when I reindex a number of documents
> simultaneously. What does this mean? I am observing that my indexing
> speed slows down when refCount increases. I am only indexing on this
> instance and no queries are running on it.
>
> Thanks for the information
>
> On Thu, Aug 18, 2016 at 2:20 PM, Mikhail Khludnev <m...@apache.org> wrote:
>
>> When the instance is idle you should see refCount=2 in solrAdmin. One
>> count comes from the CoreContainer holding the core instance until
>> reload, and the second comes from the solrAdmin request, which opens the
>> core while it renders the response. So, until you request this stat, the
>> refCount is 1, which is somehow reminiscent of quantum mechanics.
>> Seeing refCount 171 would mean 169 parallel requests, which hardly makes
>> sense, so I suggest shrinking the Solr container thread pool and/or
>> tweaking the client app. But if it keeps growing and remains >2 even
>> when the instance is idle, it means some plugin is leaking it by not
>> closing/decRef'ing the SolrCore.
>> Hope it helps.
>>
>> On Wed, Aug 17, 2016 at 5:27 PM, kshitij tyagi <
>> kshitij.shopcl...@gmail.com>
>> wrote:
>>
>> > any update??
>> >
>> > On Wed, Aug 17, 2016 at 12:47 PM, kshitij tyagi <
>> > kshitij.shopcl...@gmail.com
>> > > wrote:
>> >
>> > > Hi,
>> > >
>> > > I need to understand what is refcount in stats section of solr admin.
>> > >
>> > > I am seeing refcount: 2 on my solr cores and on one of the core i am
>> > > seeing refcount:171.
>> > >
>> > > The core with refcount  with higher number   is having very slow
>> indexing
>> > > speed?
>> > >
>> > >
>> > >
>> >
>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>>
>
>
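On the unanswered thread-pool question: in the standard Solr distribution,
Jetty's pool is defined in server/etc/jetty.xml, roughly as below; the
stock maxThreads is 10000, and lowering it caps how many requests are
handled in parallel (the exact XML varies a little between versions):

  <New id="threadPool" class="org.eclipse.jetty.util.thread.QueuedThreadPool">
    <Set name="minThreads">10</Set>
    <Set name="maxThreads">500</Set>
  </New>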


Re: What does refCount denotes in solr admin

2016-08-18 Thread kshitij tyagi
A refCount of 171 is seen when I reindex a number of documents
simultaneously. What does this mean? I am observing that my indexing speed
slows down when refCount increases. I am only indexing on this instance
and no queries are running on it.

Thanks for the information

On Thu, Aug 18, 2016 at 2:20 PM, Mikhail Khludnev <m...@apache.org> wrote:

> When the instance is idle you should see refCount=2 in solrAdmin. One
> count comes from the CoreContainer holding the core instance until
> reload, and the second comes from the solrAdmin request, which opens the
> core while it renders the response. So, until you request this stat, the
> refCount is 1, which is somehow reminiscent of quantum mechanics.
> Seeing refCount 171 would mean 169 parallel requests, which hardly makes
> sense, so I suggest shrinking the Solr container thread pool and/or
> tweaking the client app. But if it keeps growing and remains >2 even
> when the instance is idle, it means some plugin is leaking it by not
> closing/decRef'ing the SolrCore.
> Hope it helps.
>
> On Wed, Aug 17, 2016 at 5:27 PM, kshitij tyagi <
> kshitij.shopcl...@gmail.com>
> wrote:
>
> > any update??
> >
> > On Wed, Aug 17, 2016 at 12:47 PM, kshitij tyagi <
> > kshitij.shopcl...@gmail.com
> > > wrote:
> >
> > > Hi,
> > >
> > > I need to understand what is refcount in stats section of solr admin.
> > >
> > > I am seeing refcount: 2 on my solr cores and on one of the core i am
> > > seeing refcount:171.
> > >
> > > The core with refcount  with higher number   is having very slow
> indexing
> > > speed?
> > >
> > >
> > >
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


Re: What does refCount denotes in solr admin

2016-08-17 Thread kshitij tyagi
Any update?

On Wed, Aug 17, 2016 at 12:47 PM, kshitij tyagi <kshitij.shopcl...@gmail.com
> wrote:

> Hi,
>
> I need to understand what refCount is in the stats section of the Solr
> admin.
>
> I am seeing refCount: 2 on my Solr cores, and on one of the cores I am
> seeing refCount: 171.
>
> The core with the higher refCount has very slow indexing speed.
>
>
>


index size increses dramatically

2016-08-17 Thread kshitij tyagi
Hi,


Suddenly my index size just doubles and indexing slows down badly.

After some time it shrinks back to normal and indexing starts working again.

Can someone help me find out why the index size doubles abnormally?


What does refCount denotes in solr admin

2016-08-17 Thread kshitij tyagi
Hi,

I need to understand what refCount is in the stats section of the Solr
admin.

I am seeing refCount: 2 on my Solr cores, and on one of the cores I am
seeing refCount: 171.

The core with the higher refCount has very slow indexing speed.


Re: Indexing (posting document) taking a lot of time

2016-08-17 Thread kshitij tyagi
I am posting JSON using curl.
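A typical shape for that call; the host, core, and file names here are
placeholders:

  curl 'http://localhost:8983/solr/mycore/update' \
       -H 'Content-Type: application/json' \
       --data-binary @batch-of-100-docs.json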

On Wed, Aug 17, 2016 at 4:41 AM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> What format are those documents? Solr XML? Custom JSON?
>
> Or are you sending PDF/binary documents to Solr's extract handler and
> asking it to do the extraction of the useful stuff? If later, you
> could take that step out of Solr with a custom client using Tika (what
> Solr has under the hood) and only send to Solr the processed output.
>
> Regards,
>Alex.
> 
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
>
> On 16 August 2016 at 22:49, kshitij tyagi <kshitij.shopcl...@gmail.com>
> wrote:
> > 400kb is size of single document and i am sending 100 documents per
> request.
> > solr heap size is 16gb and running on multithread.
> >
> > On Tue, Aug 16, 2016 at 5:10 PM, Emir Arnautovic <
> > emir.arnauto...@sematext.com> wrote:
> >
> >> Hi,
> >>
> >> 400KB/doc * 100doc = 40MB. If you are running it single threaded, Solr
> >> will be idle while accepting relatively large request. Or is 400KB 100
> doc
> >> bulk that you are sending?
> >>
> >> What is Solr's heap size? I would try increasing number of threads and
> >> monitor Solr's heap/CPU/IO to see where is the bottleneck.
> >>
> >> How complex is fields' analysis?
> >>
> >> Regards,
> >> Emir
> >>
> >>
> >> On 16.08.2016 13:25, kshitij tyagi wrote:
> >>
> >>> hi,
> >>>
> >>> we are sending about 100 documents per request for indexing? we have
> >>> autocmmit set to false and commit only when 1 documents are
> >>> present.solr and the machine sending request are in same pool.
> >>>
> >>>
> >>>
> >>> On Tue, Aug 16, 2016 at 4:51 PM, Emir Arnautovic <
> >>> emir.arnauto...@sematext.com> wrote:
> >>>
> >>> Hi,
> >>>>
> >>>> Do you send one doc per request? How frequently do you commit? Where
> is
> >>>> Solr running? What is network connection between your machine and
> Solr?
> >>>> What are JVM settings? Is 10-30s for entire indexing or single doc?
> >>>>
> >>>> Regards,
> >>>> Emir
> >>>>
> >>>>
> >>>> On 16.08.2016 11:34, kshitij tyagi wrote:
> >>>>
> >>>> Hi alexandre,
> >>>>>
> >>>>> 1 document of 400kb size is taking approx 10-30 sec and this is
> >>>>> varying. I
> >>>>> am posting document using curl
> >>>>>
> >>>>> On Tue, Aug 16, 2016 at 2:11 PM, Alexandre Rafalovitch <
> >>>>> arafa...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>> How many records is that and what is 'slow'? Also is this standalone
> or
> >>>>>
> >>>>>> cluster setup?
> >>>>>>
> >>>>>> On 16 Aug 2016 6:33 PM, "kshitij tyagi" <
> kshitij.shopcl...@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>>> I am indexing a lot of data about 8GB, but it is taking a lot of
> >>>>>>> time. I
> >>>>>>> have read about maxBufferedDocs, ramBufferSizeMB, merge policy
> ,etc in
> >>>>>>> solrconfig file.
> >>>>>>>
> >>>>>>> It would be helpful if someone could help me out tune the segtting
> for
> >>>>>>> faster indexing speeds.
> >>>>>>>
> >>>>>>> *I have read the docs but not able to get what exactly means
> changing
> >>>>>>>
> >>>>>>> these
> >>>>>>
> >>>>>> configs.*
> >>>>>>>
> >>>>>>>
> >>>>>>> *Regards,*
> >>>>>>> *Kshitij*
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> >>>> Solr & Elasticsearch Support * http://sematext.com/
> >>>>
> >>>>
> >>>>
> >> --
> >> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> >> Solr & Elasticsearch Support * http://sematext.com/
> >>
> >>
>


Re: Need to understand solr merging and commit relationship

2016-08-16 Thread kshitij tyagi
I have 2 Solr cores on a machine with the same configs.

The problem is that I am getting faster indexing speed on core1 and slower
on core2.

Both cores have the same index size and configuration.

On Tue, Aug 16, 2016 at 11:34 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Why? What is the problem you're facing that you hope
> understanding more about these will help?
>
> Here are two places to start:
> http://blog.mikemccandless.com/2011/02/visualizing-
> lucenes-segment-merges.html
> https://lucidworks.com/blog/2013/08/23/understanding-
> transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> In general every time you do a hard commit the Lucene index is checked
> to see if there are segments that should be merged. If so, then a
> background
> thread is kicked off to start merging selected segments. Which segments
> is decided by the MergePolicy in effect (TieredMergePolicy is the default).
>
> Best,
> Erick
>
> On Tue, Aug 16, 2016 at 10:47 AM, kshitij tyagi
> <kshitij.shopcl...@gmail.com> wrote:
> > I need to understand clearly that is there any relationship between solr
> > merging and solr commit?
> >
> > If there is then what is it?
> >
> > Also i need to understand how both of these affect indexing speed on the
> > core?
>


Need to understand solr merging and commit relationship

2016-08-16 Thread kshitij tyagi
I need to understand clearly: is there any relationship between Solr
merging and Solr commit?

If there is, then what is it?

Also, I need to understand how both of these affect indexing speed on the
core.


Re: Indexing (posting document) taking a lot of time

2016-08-16 Thread kshitij tyagi
400KB is the size of a single document, and I am sending 100 documents per
request. The Solr heap size is 16GB, and indexing runs multithreaded.

On Tue, Aug 16, 2016 at 5:10 PM, Emir Arnautovic <
emir.arnauto...@sematext.com> wrote:

> Hi,
>
> 400KB/doc * 100doc = 40MB. If you are running it single threaded, Solr
> will be idle while accepting relatively large request. Or is 400KB 100 doc
> bulk that you are sending?
>
> What is Solr's heap size? I would try increasing number of threads and
> monitor Solr's heap/CPU/IO to see where is the bottleneck.
>
> How complex is fields' analysis?
>
> Regards,
> Emir
>
>
> On 16.08.2016 13:25, kshitij tyagi wrote:
>
>> hi,
>>
>> we are sending about 100 documents per request for indexing? we have
>> autocmmit set to false and commit only when 1 documents are
>> present.solr and the machine sending request are in same pool.
>>
>>
>>
>> On Tue, Aug 16, 2016 at 4:51 PM, Emir Arnautovic <
>> emir.arnauto...@sematext.com> wrote:
>>
>> Hi,
>>>
>>> Do you send one doc per request? How frequently do you commit? Where is
>>> Solr running? What is network connection between your machine and Solr?
>>> What are JVM settings? Is 10-30s for entire indexing or single doc?
>>>
>>> Regards,
>>> Emir
>>>
>>>
>>> On 16.08.2016 11:34, kshitij tyagi wrote:
>>>
>>> Hi alexandre,
>>>>
>>>> 1 document of 400kb size is taking approx 10-30 sec and this is
>>>> varying. I
>>>> am posting document using curl
>>>>
>>>> On Tue, Aug 16, 2016 at 2:11 PM, Alexandre Rafalovitch <
>>>> arafa...@gmail.com>
>>>> wrote:
>>>>
>>>> How many records is that and what is 'slow'? Also is this standalone or
>>>>
>>>>> cluster setup?
>>>>>
>>>>> On 16 Aug 2016 6:33 PM, "kshitij tyagi" <kshitij.shopcl...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>>> I am indexing a lot of data about 8GB, but it is taking a lot of
>>>>>> time. I
>>>>>> have read about maxBufferedDocs, ramBufferSizeMB, merge policy ,etc in
>>>>>> solrconfig file.
>>>>>>
>>>>>> It would be helpful if someone could help me out tune the segtting for
>>>>>> faster indexing speeds.
>>>>>>
>>>>>> *I have read the docs but not able to get what exactly means changing
>>>>>>
>>>>>> these
>>>>>
>>>>> configs.*
>>>>>>
>>>>>>
>>>>>> *Regards,*
>>>>>> *Kshitij*
>>>>>>
>>>>>>
>>>>>> --
>>> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>>> Solr & Elasticsearch Support * http://sematext.com/
>>>
>>>
>>>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>


Re: Indexing (posting document) taking a lot of time

2016-08-16 Thread kshitij tyagi
hi,

We are sending about 100 documents per request for indexing. We have
autocommit set to false and commit only when 1 documents are present.
Solr and the machine sending requests are in the same pool.



On Tue, Aug 16, 2016 at 4:51 PM, Emir Arnautovic <
emir.arnauto...@sematext.com> wrote:

> Hi,
>
> Do you send one doc per request? How frequently do you commit? Where is
> Solr running? What is network connection between your machine and Solr?
> What are JVM settings? Is 10-30s for entire indexing or single doc?
>
> Regards,
> Emir
>
>
> On 16.08.2016 11:34, kshitij tyagi wrote:
>
>> Hi alexandre,
>>
>> 1 document of 400kb size is taking approx 10-30 sec and this is varying. I
>> am posting document using curl
>>
>> On Tue, Aug 16, 2016 at 2:11 PM, Alexandre Rafalovitch <
>> arafa...@gmail.com>
>> wrote:
>>
>> How many records is that and what is 'slow'? Also is this standalone or
>>> cluster setup?
>>>
>>> On 16 Aug 2016 6:33 PM, "kshitij tyagi" <kshitij.shopcl...@gmail.com>
>>> wrote:
>>>
>>> Hi,
>>>>
>>>> I am indexing a lot of data about 8GB, but it is taking a lot of time. I
>>>> have read about maxBufferedDocs, ramBufferSizeMB, merge policy ,etc in
>>>> solrconfig file.
>>>>
>>>> It would be helpful if someone could help me out tune the segtting for
>>>> faster indexing speeds.
>>>>
>>>> *I have read the docs but not able to get what exactly means changing
>>>>
>>> these
>>>
>>>> configs.*
>>>>
>>>>
>>>> *Regards,*
>>>> *Kshitij*
>>>>
>>>>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>


Re: Indexing (posting document) taking a lot of time

2016-08-16 Thread kshitij tyagi
Hi Alexandre,

One document of 400KB is taking approximately 10-30 seconds, and this
varies. I am posting the document using curl.

On Tue, Aug 16, 2016 at 2:11 PM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> How many records is that and what is 'slow'? Also is this standalone or
> cluster setup?
>
> On 16 Aug 2016 6:33 PM, "kshitij tyagi" <kshitij.shopcl...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I am indexing a lot of data about 8GB, but it is taking a lot of time. I
> > have read about maxBufferedDocs, ramBufferSizeMB, merge policy ,etc in
> > solrconfig file.
> >
> > It would be helpful if someone could help me out tune the segtting for
> > faster indexing speeds.
> >
> > *I have read the docs but not able to get what exactly means changing
> these
> > configs.*
> >
> >
> > *Regards,*
> > *Kshitij*
> >
>


Indexing (posting document) taking a lot of time

2016-08-16 Thread kshitij tyagi
Hi,

I am indexing a lot of data, about 8GB, but it is taking a lot of time. I
have read about maxBufferedDocs, ramBufferSizeMB, merge policy, etc. in the
solrconfig file.

It would be helpful if someone could help me tune the settings for faster
indexing speeds.

*I have read the docs but am not able to understand what exactly changing
these configs does.*


*Regards,*
*Kshitij*
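For orientation, the buffer settings mentioned above live in the
<indexConfig> section of solrconfig.xml; the value below is illustrative,
not a recommendation:

  <indexConfig>
    <!-- flush the in-memory indexing buffer to a new segment once it
         reaches 256MB; the default is 100, and a larger buffer means
         fewer, larger flushes -->
    <ramBufferSizeMB>256</ramBufferSizeMB>
  </indexConfig>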


Re: Reindexing in SOlr

2016-02-24 Thread kshitij tyagi
hi

I am using the following tag:

I am able to connect but indexing is not working. Both of my Solr
instances are on the same version.
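The tag itself was lost above; for the HowToReindex approach, the entity
in db-data-config.xml is typically a SolrEntityProcessor along these
lines, with host, core, and field names as placeholders:

  <dataConfig>
    <document>
      <!-- list the stored fields explicitly in fl and leave _version_
           out; copying _version_ across is a common cause of the version
           conflict error described below -->
      <entity name="reindex"
              processor="SolrEntityProcessor"
              url="http://source-host:8983/solr/source_core"
              query="*:*"
              rows="500"
              fl="id,field_a,field_b"/>
    </document>
  </dataConfig>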


On Wed, Feb 24, 2016 at 12:48 PM, Neeraj Bhatt <neerajbhatt2...@gmail.com>
wrote:

> Hi
>
> Can you give the details of your data import tag in
> db-data-config.xml?
> Also, do your previous and new Solr instances have different versions?
>
> Thanks
>
>
>
> On Wed, Feb 24, 2016 at 12:08 PM, kshitij tyagi
> <kshitij.shopcl...@gmail.com> wrote:
> > Hi,
> >
> > I am following the following article
> > https://wiki.apache.org/solr/HowToReindex
> > to reindex the data using Solr itself as a datasource.
> >
> > Means one solr instance has all fields with stored true and
> indexed=false.
> > When I am using this instance as a datasource and indexing it on other
> > instance data is not indexing.
> >
> > Giving error of version conflict. How can i resolve it.
>


Reindexing in SOlr

2016-02-23 Thread kshitij tyagi
Hi,

I am following this article to reindex the data using Solr itself as a
datasource: https://wiki.apache.org/solr/HowToReindex

This means one Solr instance has all fields with stored=true and
indexed=false. When I use this instance as a datasource and index its data
on another instance, the data is not indexed.

I get a version conflict error. How can I resolve it?


Size of logs are high

2016-02-10 Thread kshitij tyagi
Hi,
I have migrated to Solr 5.2 and the size of the logs is high.

Can anyone help me out here with how to control this?
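In Solr 5.x the logging knobs live in server/resources/log4j.properties; a
sketch of taming the volume (the appender names match the stock file, as
far as I recall):

  # raise the root threshold from INFO to WARN
  log4j.rootLogger=WARN, file, CONSOLE
  # cap the size and count of the rolling log files
  log4j.appender.file.MaxFileSize=50MB
  log4j.appender.file.MaxBackupIndex=9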


Re: Need to move on SOlr cloud (help required)

2016-02-10 Thread kshitij tyagi
@Jack

Currently we have around 55,00,000 (5.5 million) docs.

It is not about load on one node; we have load on different nodes at
different times, as our traffic is huge, around 60k users at any given
point in time.

We want the hits on the Solr servers to be distributed, so we are planning
to move to SolrCloud, as it would be fault tolerant.



On Thu, Feb 11, 2016 at 11:10 AM, Midas A <test.mi...@gmail.com> wrote:

> hi,
> what if the master node fails, what should our failover strategy be?
>
> On Wed, Feb 10, 2016 at 9:12 PM, Jack Krupansky <jack.krupan...@gmail.com>
> wrote:
>
> > What exactly is your motivation? I mean, the primary benefit of SolrCloud
> > is better support for sharding, and you have only a single shard. If you
> > have no need for sharding and your master-slave replicated Solr has been
> > working fine, then stick with it. If only one machine is having a load
> > problem, then that one node should be replaced. There are indeed plenty
> of
> > good reasons to prefer SolrCloud over traditional master-slave
> replication,
> > but so far you haven't touched on any of them.
> >
> > How much data (number of documents) do you have?
> >
> > What is your typical query latency?
> >
> >
> > -- Jack Krupansky
> >
> > On Wed, Feb 10, 2016 at 2:15 AM, kshitij tyagi <
> > kshitij.shopcl...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > We are currently using solr 5.2 and I need to move on solr cloud
> > > architecture.
> > >
> > > As of now we are using 5 machines :
> > >
> > > 1. I am using 1 master where we are indexing ourdata.
> > > 2. I replicate my data on other machines
> > >
> > > One or the other machine keeps on showing high load so I am planning to
> > > move on solr cloud.
> > >
> > > Need help on following :
> > >
> > > 1. What should be my architecture in case of 5 machines to keep
> > (zookeeper,
> > > shards, core).
> > >
> > > 2. How to add a node.
> > >
> > > 3. what are the exact steps/process I need to follow in order to change
> > to
> > > solr cloud.
> > >
> > > 4. How indexing will work in solr cloud as of now I am using mysql
> query
> > to
> > > get the data on master and then index the same (how I need to change
> this
> > > in case of solr cloud).
> > >
> > > Regards,
> > > Kshitij
> > >
> >
>


Need to move on SOlr cloud (help required)

2016-02-10 Thread kshitij tyagi
Hi,

We are currently using Solr 5.2 and I need to move to a SolrCloud
architecture.

As of now we are using 5 machines:

1. I am using 1 master where we index our data.
2. I replicate my data onto the other machines.

One machine or another keeps showing high load, so I am planning to move
to SolrCloud.

I need help with the following:

1. What should my architecture be in the case of 5 machines (ZooKeeper,
shards, cores)?

2. How do I add a node?

3. What are the exact steps/process I need to follow in order to change to
SolrCloud?

4. How will indexing work in SolrCloud? As of now I am using a MySQL query
to get the data onto the master and then index it (how do I need to change
this in the case of SolrCloud?).

Regards,
Kshitij
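A sketch of one way to lay this out on 5 machines; the split below is an
assumption, not the only valid one:

  # run a 3-node ZooKeeper ensemble (it can be co-located with 3 of the
  # Solr machines), then start each Solr node against it:
  bin/solr start -c -z zk1:2181,zk2:2181,zk3:2181

  # create the collection via the Collections API, e.g. 2 shards with 2
  # replicas each:
  curl 'http://host1:8983/solr/admin/collections?action=CREATE&name=products&numShards=2&replicationFactor=2&maxShardsPerNode=1'

Indexing then goes through any node (or through CloudSolrClient), and
SolrCloud routes each document to its shard leader, so the MySQL-fed
indexer no longer needs to target a single master.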