Re: Force open a searcher in solr.
So to make things clear, below is what I am expecting. I have a document with a unique id field, let's say "uniqueID". This document has both stored/indexed and not stored/not indexed fields. Currently I have my pop values in external files, but I will instead define a new field in the schema (popVal) which will not be stored or indexed and will have docValues=true. I am also moving the _version_ field to indexed=false and stored=false, since I don't have any case where I retrieve it or use it for searching. I'm just hoping this doesn't cause any issues with updates in general (I read that keeping it not stored and not indexed is recommended since Solr 7).

Regards,
Akshay

On Thu, Aug 13, 2020 at 4:53 PM Erick Erickson wrote:
> Let us know how it works. I want to be sure I'm not confusing you though. There isn't a "doc ID field". The structure of an eff file is
>
> docid:value
>
> where docid is your <uniqueKey>. What updating numerics does is allow you to update a field in a doc that's identified by <uniqueKey>. That field can have any name you want as long as it's defined respecting the limitations in that link.
>
> Best,
> Erick
>
> > On Aug 13, 2020, at 6:30 AM, Akshay Murarka wrote:
> >
> > Hey Erick,
> >
> > Thanks for the information about the doc ID field. Our external file values are single float value fields and we do use them in function queries in the boost parameter, so based on the definition the above should work. We currently use Solr 5.4.0 but are in the process of upgrading our systems, so we will try out this change.
> >
> > Regards,
> > Akshay
> >
> > On Mon, Aug 10, 2020 at 10:19 PM Erick Erickson wrote:
> >
> >> Right, but you can use those with function queries. Assuming your eff entry is a doc ID plus a single numeric, I was wondering if you can accomplish what you need with function queries...
> >>
> >>> On Aug 10, 2020, at 11:30 AM, raj.yadav wrote:
> >>>
> >>> Erick Erickson wrote
> >>>> Ah, ok. That makes sense. I wonder if your use-case would be better served, though, by "in place updates", see:
> >>>> https://lucene.apache.org/solr/guide/8_1/updating-parts-of-documents.html
> >>>> This has been around since Solr 6.5…
> >>>
> >>> As per the documentation, `in place update` is only available for numeric docValues (along with a few more conditions), and here it's an external field type.
> >>>
> >>> Regards,
> >>> Raj
> >>>
> >>> --
> >>> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
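For what it's worth, the schema changes discussed in this thread might look something like the following in schema.xml. This is only a sketch based on the thread, not a definitive config: the `pfloat`/`plong` type names assume a Solr 7+ default schema, and the popVal definition would still need to be checked against the in-place update conditions in the linked guide.

```xml
<!-- Candidate for in-place updates: single-valued, non-indexed,
     non-stored, docValues-only numeric field (replaces the eff file). -->
<field name="popVal" type="pfloat" indexed="false" stored="false" docValues="true"/>

<!-- _version_ kept as docValues only, as recommended since Solr 7. -->
<field name="_version_" type="plong" indexed="false" stored="false" docValues="true"/>
```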
Re: Reaching max Filter cache limit increases the request latencies.
Hey Erick,

So I am investigating where we can limit the values that are cached using {!cache=false} (we already use it in some of our cases). In general there are 0 evictions on the filter cache side, but whenever we hit this max limit there is a spike in evictions as well (which is expected). As far as I remember we are not forcing enum on our side, but I will definitely verify that. My filter cache hit ratio remains constant at around 97.5%, and even during this eviction the hit ratio doesn't go down. Regarding other operations, there are a few cases where indexing (80 to 150 docs) also happened during that time, but there are also cases where indexing happened 5-10 min after that and the latencies remained high.

Regards,
Akshay

On Thu, Aug 13, 2020 at 5:08 PM Erick Erickson wrote:
> Well, when you hit the max capacity, cache entries get aged out and are eligible for GC, so GC activity increases. But for aging out filterCache entries to be noticeable, you have to be flushing a _lot_ of them out. Which, offhand, makes me wonder if you're using the filterCache appropriately.
>
> Here's what I'd investigate first: what kinds of fq clauses are you using, and are they making best use of the filterCache? Consider an fq clause like
>
> fq=date_field:[* TO NOW]
>
> That will consume an entry in the filterCache and never be re-used, because NOW is the epoch time and will change a millisecond later.
>
> Similarly for fq clauses that contain a lot of values that may vary, for instance
>
> fq=id:(1 2 4 86 93 …)
>
> where the list of IDs is not likely to be repeated. Or even repeated in a different order.
> If you do identify patterns that you _know_ will not be repeated, just add
>
> fq={!cache=false}your_unrepeated_pattern
>
> What I'm guessing here is that if you've correctly identified that the filterCache filling up is increasing GC activity that much, you must be evicting a _lot_ of fq entries very rapidly, which indicates you're not repeating fq's very often.
>
> I should add that the filterCache is also used for some other operations, particularly some kinds of faceting if you specify the enum method. Are you forcing that?
>
> All that said, I'm also wondering if this is coincidence and your slowdown is something else. Because given all the work a query does, the additional bookkeeping due to filterCache churn doesn't really sound like the culprit. Prior to the filterCache filling up, what's your hit ratio? The scenario I can see where the filterCache churn could cause your response times to go up is if, up until that point, you're getting a high hit ratio that goes down after the cache starts aging out entries. I find this rather unlikely, but possible.
>
> Best,
> Erick
>
> > On Aug 13, 2020, at 3:19 AM, Akshay Murarka wrote:
> >
> > Hey guys,
> >
> > For quite some time we have been facing an issue where, whenever the used filterCache value reaches the maximum configured value, we start seeing an increase in the query latencies on the Solr side. During this time we also see an increase in our garbage collection and CPU. Only when a commit happens with openSearcher=true do the latency values come back to normal.
> >
> > Is there any setting that can help us with this, or will increasing the max configured value for the filter cache help? Right now we can't increase the commit frequency.
> >
> > Thanks for the help.
> >
> > Regards,
> > Akshay
> >
> > Below is the graph for request latency
> >
> > Below is the graph for the filter cache values
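To make the quoted advice concrete, here is a small sketch of the two kinds of fq clauses as request parameters. The field names are just the examples from the thread: a date filter rounded to NOW/DAY, which produces an identical fq for every request on the same day and can therefore be served from the filterCache, next to a one-off ID list marked {!cache=false} so it never evicts useful entries.

```python
from urllib.parse import urlencode

# Reusable filter: rounding NOW to day granularity means every request
# issued on the same day produces an identical fq string, so the
# filterCache entry computed for the first request serves all the others.
cached_fq = "date_field:[* TO NOW/DAY]"

# One-off filter: an arbitrary ID list is unlikely to repeat, so telling
# Solr not to cache it avoids churning entries out of the filterCache.
uncached_fq = "{!cache=false}id:(1 2 4 86 93)"

# doseq=True emits one fq= parameter per clause, as Solr expects.
params = urlencode({"q": "*:*", "fq": [cached_fq, uncached_fq]}, doseq=True)
print(params)
```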
Re: Force open a searcher in solr.
Hey Erick,

Thanks for the information about the doc ID field. Our external file values are single float value fields, and we do use them in function queries in the boost parameter, so based on the definition the above should work. We currently use Solr 5.4.0 but are in the process of upgrading our systems, so we will try out this change.

Regards,
Akshay

On Mon, Aug 10, 2020 at 10:19 PM Erick Erickson wrote:
> Right, but you can use those with function queries. Assuming your eff entry is a doc ID plus a single numeric, I was wondering if you can accomplish what you need with function queries...
>
> > On Aug 10, 2020, at 11:30 AM, raj.yadav wrote:
> >
> > Erick Erickson wrote
> >> Ah, ok. That makes sense. I wonder if your use-case would be better served, though, by "in place updates", see:
> >> https://lucene.apache.org/solr/guide/8_1/updating-parts-of-documents.html
> >> This has been around since Solr 6.5…
> >
> > As per the documentation, `in place update` is only available for numeric docValues (along with a few more conditions), and here it's an external field type.
> >
> > Regards,
> > Raj
> >
> > --
> > Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Force open a searcher in solr.
Hey,

So I have external file fields that hold some data that gets updated regularly. Whenever those get updated we need the open-searcher operation to happen. The values in these external files are used in boosting and other function/range queries.

On Mon, Aug 10, 2020 at 5:08 PM Erick Erickson wrote:
> In a word, "no". There is explicit code to _not_ open a new searcher if the index hasn't changed, because it's an expensive operation.
>
> Could you explain _why_ you want to open a new searcher even though the index is unchanged? The reason for the check in the first place is that nothing has changed about the index, so the assumption is that there's no reason to open a new searcher.
>
> You could add at least one bogus doc on each shard, then delete them all, then issue a commit, as a rather crude way to do this. Insuring that you changed at least one doc on each shard is "an exercise for the reader"…
>
> Again, though, perhaps if you explained why you think this is necessary we could suggest another approach. At first glance, this looks like an XY problem though.
>
> Best,
> Erick
>
> > On Aug 10, 2020, at 5:49 AM, Akshay Murarka wrote:
> >
> > Hey,
> >
> > I have a use case where none of the documents in my Solr index are changing, but I still want to open a new searcher through the curl API. On executing the below curl command
> >
> > curl "XXX.XX.XX.XXX:9744/solr/mycollection/update?openSearcher=true&commit=true"
> >
> > it doesn't open a new searcher. Below is what I get in the logs:
> >
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6766) [c:mycollection s:shard1_1_1 r:core_node7 x:mycollection_shard1_1_1_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6766) [c:mycollection s:shard1_1_1 r:core_node7 x:mycollection_shard1_1_1_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
> > 2020-08-10 09:32:22.697 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
> > 2020-08-10 09:32:22.697 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
> > 2020-08-10 09:32:22.697 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
> > 2020-08-10 09:32:22.697 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.u.DirectUpdateHandler2 end_commit_flush
> > 2020-08-10 09:32:22.697 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.u.DirectUpdateHandler2 end_commit_flush
> > 2020-08-10 09:32:22.697 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.u.DirectUpdateHandler2 end_commit_flush
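Erick's bogus-document workaround from this thread can be sketched roughly as below. This is a hedged illustration, not a tested recipe: the `__force_commit__` ID is hypothetical, the bodies would be POSTed to the collection's /update endpoint, and routing one dummy doc to every shard remains "an exercise for the reader".

```python
import json

def force_new_searcher_payloads(dummy_id="__force_commit__"):
    """Build the three JSON update bodies for the crude workaround:
    add a throwaway doc, delete it, then commit with openSearcher=true
    so a new searcher is opened even though the net index is unchanged."""
    add = json.dumps({"add": {"doc": {"id": dummy_id}}})
    delete = json.dumps({"delete": {"id": dummy_id}})
    commit = json.dumps({"commit": {"openSearcher": True}})
    return [add, delete, commit]

for body in force_new_searcher_payloads():
    print(body)
```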
Force open a searcher in solr.
Hey,

I have a use case where none of the documents in my Solr index are changing, but I still want to open a new searcher through the curl API. On executing the below curl command

curl "XXX.XX.XX.XXX:9744/solr/mycollection/update?openSearcher=true&commit=true"

it doesn't open a new searcher. Below is what I get in the logs:

2020-08-10 09:32:22.696 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
2020-08-10 09:32:22.696 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
2020-08-10 09:32:22.696 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
2020-08-10 09:32:22.696 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
2020-08-10 09:32:22.696 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
2020-08-10 09:32:22.696 INFO (qtp297786644-6766) [c:mycollection s:shard1_1_1 r:core_node7 x:mycollection_shard1_1_1_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
2020-08-10 09:32:22.696 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
2020-08-10 09:32:22.696 INFO (qtp297786644-6766) [c:mycollection s:shard1_1_1 r:core_node7 x:mycollection_shard1_1_1_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
2020-08-10 09:32:22.697 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
2020-08-10 09:32:22.697 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
2020-08-10 09:32:22.697 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
2020-08-10 09:32:22.697 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.u.DirectUpdateHandler2 end_commit_flush
2020-08-10 09:32:22.697 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.u.DirectUpdateHandler2 end_commit_flush
2020-08-10 09:32:22.697 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.u.DirectUpdateHandler2 end_commit_flush

I don't want to do a complete reload of my collection. Is there any parameter that can be used to forcefully open a new searcher every time I do a commit with openSearcher=true?

Thanks in advance for the help
Re: Timeout waiting for connection from pool
Is there any way through which I can create an external plugin and update these values?

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Timeout waiting for connection from pool
I don't have an issue with increasing the request rate, but I am facing this issue when the system is going under recovery. It's not able to recover properly and is throwing this connection error.

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Timeout waiting for connection from pool
Hey,

I am currently running Solr 5.4.0 in SolrCloud mode. Everything has been working fine till now, but when I start increasing the request rate I am starting to get connection timeout errors:

Caused by: org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool

On reading more about this I found that Solr 5.4.0 has a major bug, fixed in version 5.5, related to low values for maxUpdateConnectionsPerHost. But I can't update my system to 5.5 as of now. I am not able to find where/how to add/edit the above mentioned parameter to increase its value. Any help would be highly appreciated.

Regards,
Akshay

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Learning to rank
Hi,

I am a student, and for my master's thesis I am working on Learning To Rank. While researching it I found the solution provided by Bloomberg, but with the example you have provided it always shows a Bad Request error. Do you have a running example of it, so I can adapt it to my application? I am trying to use the example that you have provided on GitHub (core: techproducts, traning_and_uploading_demo.py). It generates the training data, but I am getting a problem when uploading the model: it shows a Bad Request error (empty request body). Please help me out with this problem so I will be able to adapt it to my application.

Best Regards!

Any help would be appreciated
Solr reload process flow
Hey,

I am using Solr 5.4.0 in my production environment and am trying to automate the reload/restart process of the Solr collections based on certain specific conditions. I noticed that on Solr reload the thread count increases a lot, thereby resulting in increased latencies. So I read about the reload process and came to know that while reloading:

1) Solr creates a new core internally and then assigns this core the same name as the old core. Is this correct?
2) If the above is true, then does Solr actually create a new index internally on reload?
3) If so, then restart sounds much better than reload, or is there any better way to upload new configs to Solr?
4) Can you point me to any docs that can give me more details about this?

Any help would be appreciated, thank you.

Regards,
Akshay
Re: Example Solr Config on EC2
Yes, you can promote a slave to be master; refer to
http://wiki.apache.org/solr/SolrReplication#enable.2BAC8-disable_master.2BAC8-slave_in_a_node

In AWS one can use an elastic IP (http://aws.amazon.com/articles/1346) to refer to the master, and this can be assigned to slaves as they assume the role of master (in case of failure). All slaves will then refer to this new master and there will be no need to regenerate data. Automation of this may be possible through CloudWatch alarm actions. I don't know of any available example automation scripts.

Cheers,
Akshay

On Wed, Aug 10, 2011 at 9:08 PM, Matt Shields m...@mattshields.org wrote:

If I were to build a master with multiple slaves, is it possible to promote a slave to be the new master if the original master fails? Will all the slaves pick up right where they left off, or any time the master fails will we need to completely regenerate all the data? If this is possible, are there any examples of this being automated? Especially on Win2k3.

Matthew Shields
Owner
BeanTown Host - Web Hosting, Domain Names, Dedicated Servers, Colocation, Managed Services
www.beantownhost.com
www.sysadminvalley.com
www.jeeprally.com

On Mon, Aug 8, 2011 at 5:34 PM, mboh...@yahoo.com wrote:

Matthew,

Here's another resource:
http://www.lucidimagination.com/blog/2010/02/01/solr-shines-through-the-cloud-lucidworks-solr-on-ec2/

Michael Bohlig
Lucid Imagination

----- Original Message -----
From: Matt Shields m...@mattshields.org
To: solr-user@lucene.apache.org
Sent: Mon, August 8, 2011 2:03:20 PM
Subject: Example Solr Config on EC2

I'm looking for some examples of how to set up Solr on EC2. The configuration I'm looking for would have multiple nodes for redundancy. I've tested in-house with a single master and slave with replication running in Tomcat on Windows Server 2003, but even if I have multiple slaves, the single master is a single point of failure. Any suggestions or example configurations?

The project I'm working on is a .NET setup, so ideally I'd like to keep this search cluster on Windows Server, even though I prefer Linux.

Matthew Shields
Owner
BeanTown Host - Web Hosting, Domain Names, Dedicated Servers, Colocation, Managed Services
www.beantownhost.com
www.sysadminvalley.com
www.jeeprally.com
Auto-scaling solr setup
So I am trying to set up an auto-scaling search system of EC2 Solr slaves which scales up as the number of requests increases, and vice versa. Here is what I have:

1. A Solr master and underlying slaves (scalable), and an elastic load balancer to distribute the load.
2. The EC2 auto-scaling setup fires up nodes when traffic increases. However, the replication times (replication speed) for the index from the master vary for these newly fired nodes.
3. I want to avoid adding these nodes to the load balancer until they have completed initial replication and have a warmed-up cache.

For this I need to know a way to check if the initial replication has completed, and also a way of warming up the cache after this. I can think of doing this via a shell script/awk (checking times replicated/index size)... is there a cleaner way?

Also, on a side note, any suggestions or pointers on how one sets up a scalable Solr setup in the cloud (AWS mainly) would be helpful.

Regards,
Akshay
Re: Auto-scaling solr setup
Yes, sadly I too don't have much of a clue about AWS. The SolrReplication API doesn't give me exactly what I want. For the time being I have hacked my way into the Amazon image, bootstrapping the replication check in a shell script ((curl + awk) a very dirty way). Once the check succeeds I enable the server using the Solr healthcheck for load balancers. I was wondering if anyone has moved to the cloud, especially Amazon auto-scaling, where they don't have control over when a new node is fired. All the scenarios I encountered were people creating a node, warming up the cache, and then adding it under the HAProxy LB. I guess warmup is not that big an issue compared to an empty response. Thanks for your response :)

Regards,
Akshay

On Mon, Jun 6, 2011 at 6:33 PM, Erick Erickson erickerick...@gmail.com wrote:

The HTTP interface (http://wiki.apache.org/solr/SolrReplication#HTTP_API) can be used to control lots of parts of replication. As to warmups, I don't know of a good way to test that. I don't know whether getting the current status on the slave includes whether warmup is completed or not. At worst, after replication is complete you could wait an interval (see the warmup times on your running servers) before routing requests to the slave. I haven't any clue at all about AWS...

Best,
Erick

On Mon, Jun 6, 2011 at 9:18 AM, Akshay akm...@gmail.com wrote:

So I am trying to set up an auto-scaling search system of EC2 Solr slaves which scales up as the number of requests increases, and vice versa. Here is what I have:

1. A Solr master and underlying slaves (scalable), and an elastic load balancer to distribute the load.
2. The EC2 auto-scaling setup fires up nodes when traffic increases. However, the replication times (replication speed) for the index from the master vary for these newly fired nodes.
3. I want to avoid adding these nodes to the load balancer until they have completed initial replication and have a warmed-up cache.

For this I need to know a way to check if the initial replication has completed, and also a way of warming up the cache after this. I can think of doing this via a shell script/awk (checking times replicated/index size)... is there a cleaner way?

Also, on a side note, any suggestions or pointers on how one sets up a scalable Solr setup in the cloud (AWS mainly) would be helpful.

Regards,
Akshay
Re: replicated index files have incorrect timestamp
You need to specify the index version number for which the list of files is to be shown. The URL should be like this:

http://masterhost:port/solr/replication?command=filelist&indexversion=<index version number>

You can get the index version number from the URL:

http://masterhost:port/solr/replication?command=indexversion

On Fri, Apr 24, 2009 at 1:10 AM, Jeff Newburn jnewb...@zappos.com wrote:

We see the exact same thing. Additionally, that URL returns 404 on a multicore setup and gives an error when I add the core:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <str name="status">no indexversion specified</str>
</response>

--
Jeff Newburn
Software Engineer, Zappos.com
jnewb...@zappos.com - 702-943-7562

From: Jian Han Guo jian...@gmail.com
Reply-To: solr-user@lucene.apache.org
Date: Wed, 22 Apr 2009 23:43:02 -0700
To: solr-user@lucene.apache.org
Subject: Re: replicated index files have incorrect timestamp

I am using Mac OS 10.5. I can't access the box right now and this week. I'll do it next week and post the result then.

Thanks,
Jianhan

2009/4/22 Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com

Which OS are you using? It does not look at the timestamps to decide if the index is in sync; it looks at the index version only. BTW, can you just hit the master with the URL and paste the response here:

http://masterhost:port/solr/replication?command=filelist

On Thu, Apr 23, 2009 at 11:53 AM, Jian Han Guo jian...@gmail.com wrote:

That's right. The timestamps of the files on the slave side are all Dec 31 1969, so it looks like the timestamp was not set (and therefore it is zero). The ones on the master side are all correct. Nevertheless, Solr seems to be able to recognize that master and slave are in sync after replication; I don't know how it does that. I haven't checked if the two machines are in sync, but even if they are not, the timestamp should not be Dec 31, 1969, I think.

Thanks,
Jianhan

2009/4/22 Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com

Let me assume that you are using the built-in replication. The replication tries to set the timestamp of all the files the same as that of the files on the master. Just cross check.

On Thu, Apr 23, 2009 at 6:57 AM, Jian Han Guo jian...@gmail.com wrote:

Hi,

I am using the nightly build of 4/22/2009. Replication works fine, but the files inside the index directory on the slave side all have an old timestamp: Dec 31 1969. Is this a known issue?

Thanks,
Jianhan

--
--Noble Paul

--
--Noble Paul

--
Regards,
Akshay K. Ukey.
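The two replication URLs above can be sketched as a small helper; the host, port, and version number here are placeholders.

```python
# Step 1: ask the master for its current index version.
INDEXVERSION_URL = "http://masterhost:8983/solr/replication?command=indexversion"

# Step 2: request the file list for exactly that version.
def filelist_url(host, version):
    """Build the filelist URL for a given host:port and index version."""
    return ("http://%s/solr/replication?command=filelist&indexversion=%d"
            % (host, version))

print(filelist_url("masterhost:8983", 1240500000000))
```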
Re: multicore
On Thu, Apr 2, 2009 at 5:58 PM, Neha Bhardwaj neha_bhard...@persistent.co.in wrote:

Hi, I want to index through the command line. How do I do that?

You can use curl:
http://wiki.apache.org/solr/UpdateXmlMessages?highlight=%28curl%29#head-c614ba822059ae20dde5698290caeb851534c9de

-----Original Message-----
From: Erik Hatcher [mailto:e...@ehatchersolutions.com]
Sent: Thursday, April 02, 2009 5:42 PM
To: solr-user@lucene.apache.org
Subject: Re: multicore

On Apr 2, 2009, at 6:32 AM, Neha Bhardwaj wrote:

Also, how do I index data into a particular core? Say we have core0 and core1 in multicore. How can I specify which core I want to index data into?

You index into http://localhost:8983/solr/core0/update or http://localhost:8983/solr/core1/update

Erik

DISCLAIMER ========== This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails.

--
Regards,
Akshay K. Ukey.
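To illustrate the per-core indexing described above, here is a sketch that builds the XML update body a core-specific /update endpoint accepts. The host, port, and field values are placeholders (the sample doc borrows an ID from the techproducts example data), and the body would be POSTed with Content-Type text/xml.

```python
from xml.etree import ElementTree as ET

def add_doc_xml(fields):
    """Build the <add><doc>...</doc></add> XML body that a core's
    update handler (e.g. /solr/core0/update) accepts."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        f = ET.SubElement(doc, "field", name=name)
        f.text = str(value)
    return ET.tostring(add, encoding="unicode")

# Core-specific endpoint: core0 rather than core1.
url = "http://localhost:8983/solr/core0/update?commit=true"
body = add_doc_xml({"id": "SP2514N", "name": "Samsung SpinPoint P120"})
print(url)
print(body)
```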
Re: Times Replicated Since Startup: 109 since yesterday afternoon?
Can you post your ReplicationHandler configuration?

On Mon, Mar 30, 2009 at 8:17 PM, sunnyfr johanna...@gmail.com wrote:

Hi,

Can you explain more about this replication script in Solr 1.4? It does work, but it always replicates everything from the master, so the slave loses every cache on each replication. I don't really get how it works.

Thanks a lot,

--
View this message in context: http://www.nabble.com/Times-Replicated-Since-Startup%3A-109--since-yesterday-afternoon--tp22784943p22784943.html
Sent from the Solr - User mailing list archive at Nabble.com.

--
Regards,
Akshay K. Ukey.
Re: Trunk Replication Page Issue
Hi Jeff,

The line number from your stacktrace doesn't seem to be valid in the trunk code (of the JSP). Did you do an ant clean dist? If yes, can you send me the generated servlet for the JSP?

On Fri, Feb 27, 2009 at 10:17 PM, Jeff Newburn jnewb...@zappos.com wrote:

In trying trunk to fix the Lucene sync issue we have now encountered a severe Java exception making the replication page non-functional. Am I missing something or doing something wrong?

Info: Slave server on the replication page. Just a code dump as follows.

Feb 27, 2009 8:44:37 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.jasper.JasperException: java.lang.NullPointerException
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:418)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:337)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:266)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:630)
at org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:436)
at org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:374)
at org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:302)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
at org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:879)
at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:719)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2080)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.NullPointerException
at org.apache.jsp.admin.replication.index_jsp._jspService(index_jsp.java:294)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:374)
... 24 more

--
Jeff Newburn
Software Engineer, Zappos.com
jnewb...@zappos.com - 702-943-7562

--
Regards,
Akshay K. Ukey.
Re: CoreAdmin for replication STATUS
On Fri, Jan 16, 2009 at 4:57 AM, Jacob Singh jacobsi...@gmail.com wrote: Hi, How do I find out the status of a slave's index? I have the following scenario: 1. Boot up the slave. I give it a core name of boot-$CoreName. 2. I call boot-$CoreName/replication?command=snappull 3. I check back every minute using cron and I want to see if the slave has actually gotten the data. 4. When it gets the data I call solr/admin/cores?action=RENAME&core=boot-$CoreName&other=$CoreName. I do this because the balancer will start hitting the slave before it has the full index otherwise. Step 3 is the problem. I don't have a reliable way to know it has finished replicating, AFAIK. I see in ?action=STATUS for the CoreAdmin there is a field called current. Is this useful for this? If not, what is recommended? I could hit the admin/replication/index.jsp URL and screen-scrape the HTML, but I imagine there is a better way. From the slave you can issue an HTTP command, boot-$CoreName/replication?command=details. This returns XML containing a node isReplicating with a boolean value, which tells you whether replication is in progress or completed. Thanks, Jacob -- +1 510 277-0891 (o) +91 33 7458 (m) web: http://pajamadesign.com Skype: pajamadesign Yahoo: jacobsingh AIM: jacobsingh gTalk: jacobsi...@gmail.com -- Regards, Akshay Ukey.
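A minimal sketch of step 3 in Python: parse the XML returned by the slave's replication?command=details and check the isReplicating node before issuing the RENAME. The exact shape of the details response is an assumption here (only the isReplicating <str> node is relied on, as described above):

```python
import xml.etree.ElementTree as ET

def is_replicating(details_xml: str) -> bool:
    """Return True if the details response reports replication in progress.

    Assumes the response contains a <str name="isReplicating"> node, as
    returned by /replication?command=details on the slave.
    """
    root = ET.fromstring(details_xml)
    for node in root.iter("str"):
        if node.get("name") == "isReplicating":
            return node.text.strip().lower() == "true"
    return False

# Hypothetical fragment of a details response, for illustration only:
sample = """<response>
  <lst name="details">
    <str name="isReplicating">false</str>
  </lst>
</response>"""
print(is_replicating(sample))  # False -> safe to RENAME the core
```

A cron job could fetch the details URL, feed the body to this function, and rename the core only once it returns False.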
Re: To get all indexed records.
Use *:* as a query to get all records. Refer to http://wiki.apache.org/solr/SolrQuerySyntax for more info. On Mon, Jan 12, 2009 at 5:30 PM, Tushar_Gandhi tushar_gan...@neovasolutions.com wrote: Hi, I am using solr 1.3. I want to retrieve all records from index file. How should I write solr query so that I will get all records? Thanks, Tushar. -- View this message in context: http://www.nabble.com/To-get-all-indexed-records.-tp21413170p21413170.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey.
Re: Solr query for date
On Thu, Jan 8, 2009 at 3:38 PM, prerna07 pkhandelw...@sapient.com wrote: My requirement is to fetch records within a range of 45 days. 1) ?q=date_field:[NOW TO NOW-45DAYS] is not returning any results. This works for me if you interchange the range limits, viz. [NOW-45DAYS TO NOW]. 2) ?q=date_field:[NOW TO NOW+45DAYS] is throwing an exception. This too works for me. Have you defined the date_field field as solr.DateField type in the schema? date_field should be of type solr.DateField to use a range query. However I get correct results when I run the following query: ?q=date_field:[* TO NOW] Please suggest the correct query for a range in days. Thanks, Prerna Akshay-8 wrote: You can use DateMath as: date_field:[NOW TO NOW+45DAYS] On Wed, Jan 7, 2009 at 3:00 PM, prerna07 pkhandelw...@sapient.com wrote: Hi, what will be the syntax of this SQL query SELECT * FROM table WHERE date > SYSDATE and date < SYSDATE+45 in Solr format? I need to fetch records where date is between the current date and 45 days from today. Thanks, Prerna -- View this message in context: http://www.nabble.com/Solr-query-for-date-tp21327696p21327696.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey. Enjoy your job, make lots of money, work within the law. Choose any two. -Author Unknown. -- View this message in context: http://www.nabble.com/Solr-query-for-date-tp21327696p21349038.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey.
Re: Solr query for date
On Thu, Jan 8, 2009 at 4:46 PM, prerna07 pkhandelw...@sapient.com wrote: 1) [NOW-45 TO NOW] works for me now. 2) [NOW TO NOW+45DAYS] is still throwing the following exception: -- message org.apache.lucene.queryParser.ParseException: Cannot parse 'dateToTest_product_s:[NOW TO NOW 45DAYS]': Encountered 45DAYS at line 1, Notice in the error message that the + sign in your query string is getting converted to a space. You need to escape the + sign and properly URL-encode the query before querying Solr. See here: http://wiki.apache.org/solr/SolrQuerySyntax#urlescaping column 33. Was expecting: ] ... description The request sent by the client was syntactically incorrect (org.apache.lucene.queryParser.ParseException: Cannot parse 'dateToTest_product_s:[NOW TO NOW 45DAYS]': Encountered 45DAYS at line 1, column 33. Was expecting: ] ...). - I am running the query on the Solr admin in Internet Explorer. I have defined the date field as date in schema.xml. Akshay-8 wrote: On Thu, Jan 8, 2009 at 3:38 PM, prerna07 pkhandelw...@sapient.com wrote: My requirement is to fetch records within a range of 45 days. 1) ?q=date_field:[NOW TO NOW-45DAYS] is not returning any results. This works for me if you interchange the range limits, viz. [NOW-45DAYS TO NOW]. 2) ?q=date_field:[NOW TO NOW+45DAYS] is throwing an exception. This too works for me. Have you defined the date_field field as solr.DateField type in the schema? date_field should be of type solr.DateField to use a range query. However I get correct results when I run the following query: ?q=date_field:[* TO NOW] Please suggest the correct query for a range in days. Thanks, Prerna Akshay-8 wrote: You can use DateMath as: date_field:[NOW TO NOW+45DAYS] On Wed, Jan 7, 2009 at 3:00 PM, prerna07 pkhandelw...@sapient.com wrote: Hi, what will be the syntax of this SQL query SELECT * FROM table WHERE date > SYSDATE and date < SYSDATE+45 in Solr format? I need to fetch records where date is between the current date and 45 days from today.
Thanks, Prerna -- View this message in context: http://www.nabble.com/Solr-query-for-date-tp21327696p21327696.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey. Enjoy your job, make lots of money, work within the law. Choose any two. -Author Unknown. -- View this message in context: http://www.nabble.com/Solr-query-for-date-tp21327696p21349038.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey. -- View this message in context: http://www.nabble.com/Solr-query-for-date-tp21327696p21349994.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey.
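The escaping fix above can be sketched in Python: a literal + inside DateMath (NOW+45DAYS) must be percent-encoded as %2B, otherwise the servlet container decodes it as a space, producing exactly the "Encountered 45DAYS" parse error quoted in the thread. The field name date_field is illustrative:

```python
from urllib.parse import quote, urlencode

# Raw Solr query; '+' inside NOW+45DAYS must reach Solr literally.
raw_q = "date_field:[NOW TO NOW+45DAYS]"

# Percent-encode everything except unreserved characters:
encoded = quote(raw_q, safe="")
print(encoded)  # date_field%3A%5BNOW%20TO%20NOW%2B45DAYS%5D

# urlencode handles the same escaping when building a full query string
# (spaces become '+', the literal '+' becomes %2B):
params = urlencode({"q": raw_q})
print(params)  # q=date_field%3A%5BNOW+TO+NOW%2B45DAYS%5D
```

Pasting the raw query into a browser address bar (as in the Internet Explorer case above) skips this encoding, which is why the + was lost.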
Re: Querying Solr Index for date fields
You will have to URL-encode the string correctly and supply the date in the format Solr expects. Please check this: http://wiki.apache.org/solr/SolrQuerySyntax On Fri, Jan 9, 2009 at 12:21 PM, Rayudu avsrit2...@yahoo.co.in wrote: Hi All, I have a field which is solr.DateField in my schema file. If I want to get the docs for a given date, e.g. get all the docs whose date value is 2009-01-09, then how can I query my index? As Solr's date format is yyyy-mm-ddThh:mm:ssZ, if I give the date as 2009-01-09T00:00:00Z it is throwing an exception solr.SolrException: HTTP code=400, reason=Invalid Date String:'2009-01-09T00'. If I give the date as 2009-01-09 it is throwing an exception, solr.SolrException: HTTP code=400, reason=Invalid Date String:'2009-01-09' Thanks, Rayudu. -- View this message in context: http://www.nabble.com/Querying-Solr-Index-for-date-fields-tp21367097p21367097.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey.
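A sketch of building a "whole day" query in the full timestamp format Solr expects (the field name date_field is an assumption; note the truncated 'Invalid Date String:2009-01-09T00' in the error above is the colon inside the timestamp splitting the value, which proper URL encoding and the full format avoid):

```python
from datetime import datetime, timedelta

# All documents dated 2009-01-09: build an explicit one-day range.
day = datetime(2009, 1, 9)
fmt = "%Y-%m-%dT%H:%M:%SZ"          # Solr's yyyy-mm-ddThh:mm:ssZ format
start = day.strftime(fmt)            # 2009-01-09T00:00:00Z
end = (day + timedelta(days=1)).strftime(fmt)

q = f"date_field:[{start} TO {end}]"
print(q)  # date_field:[2009-01-09T00:00:00Z TO 2009-01-10T00:00:00Z]
```

The query string then still needs URL encoding before it is sent, as discussed above.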
Re: Solr query for date
You can use DateMath as: date_field:[NOW TO NOW+45DAYS] On Wed, Jan 7, 2009 at 3:00 PM, prerna07 pkhandelw...@sapient.com wrote: Hi, what will be the syntax of this sql query SELECT * FROM table WHERE date SYSDATE and date SYSDATE+45 in solr format ? I need to fetch records where date is between current date and 45 days from today. Thanks, Prerna -- View this message in context: http://www.nabble.com/Solr-query-for-date-tp21327696p21327696.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey. Enjoy your job, make lots of money, work within the law. Choose any two. -Author Unknown.
Re: Dynamic Boosting at query time with boost value as another fieldvalue
The colon is used to specify the value for a field. E.g. in the query box of the Solr admin you would type something like fieldName:string to search (title:Java). You can use a hyphen '-' or some other character in the field name instead of a colon. On Mon, Dec 15, 2008 at 12:11 PM, Pooja Verlani pooja.verl...@gmail.comwrote: hi, Is it possible to have a fieldname with colon for example source:site? I want to apply query time boost as per recency to this field with the recency function. Recip function with rord isn't taking my source:site fieldname, its throwing an exception. I have tried with escape characters too. Please suggest something. Thank you, Regards Pooja On Thu, Dec 11, 2008 at 7:20 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Take a look at FunctionQuery support in Solr: http://wiki.apache.org/solr/FunctionQuery http://wiki.apache.org/solr/SolrRelevancyFAQ#head-b1b1cdedcb9cd9bfd9c994709b4d7e540359b1fd On Thu, Dec 11, 2008 at 7:01 PM, Pooja Verlani pooja.verl...@gmail.com wrote: Hi all, I have a specific requirement for query time boosting. I have to boost a field on the basis of the value returned from one of the fields of the document. Basically, I have the creationDate for a document and in order to introduce a recency factor in the search, I need to give a boost to the creation field, where the boost value is something like a log(1/x) function and x is the (presentDate - creationDate). Till now what I have seen is we can give only a static boost to the documents. In case you can provide a solution to my problem.. please do reply :) Thanks a lot, Regards. Pooja -- Regards, Shalin Shekhar Mangar. -- Regards, Akshay Ukey.
Re: Dynamic Boosting at query time with boost value as another fieldvalue
On Mon, Dec 15, 2008 at 12:36 PM, Pooja Verlani pooja.verl...@gmail.comwrote: ohk.. that means I can't use colon in the fieldname ever in such a scenario ? Probably you can use a colon in the field name. Are you using the special keyword _val_ for the recip function query? http://wiki.apache.org/solr/FunctionQuery#head-df0601b9306c8f2906ce91d3904bcd9621e02c99 http://wiki.apache.org/solr/SolrQuerySyntax On Mon, Dec 15, 2008 at 12:24 PM, Akshay akshay.u...@gmail.com wrote: The colon is used to specify the value for a field. E.g. in the query box of the Solr admin you would type something like fieldName:string to search (title:Java). You can use a hyphen '-' or some other character in the field name instead of a colon. On Mon, Dec 15, 2008 at 12:11 PM, Pooja Verlani pooja.verl...@gmail.com wrote: hi, Is it possible to have a fieldname with colon for example source:site? I want to apply query time boost as per recency to this field with the recency function. Recip function with rord isn't taking my source:site fieldname, its throwing an exception. I have tried with escape characters too. Please suggest something. Thank you, Regards Pooja On Thu, Dec 11, 2008 at 7:20 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Take a look at FunctionQuery support in Solr: http://wiki.apache.org/solr/FunctionQuery http://wiki.apache.org/solr/SolrRelevancyFAQ#head-b1b1cdedcb9cd9bfd9c994709b4d7e540359b1fd On Thu, Dec 11, 2008 at 7:01 PM, Pooja Verlani pooja.verl...@gmail.com wrote: Hi all, I have a specific requirement for query time boosting. I have to boost a field on the basis of the value returned from one of the fields of the document. Basically, I have the creationDate for a document and in order to introduce a recency factor in the search, I need to give a boost to the creation field, where the boost value is something like a log(1/x) function and x is the (presentDate - creationDate). Till now what I have seen is we can give only a static boost to the documents.
In case you can provide a solution to my problem.. please do reply :) Thanks a lot, Regards. Pooja -- Regards, Shalin Shekhar Mangar. -- Regards, Akshay Ukey. -- Regards, Akshay Ukey.
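A sketch of the recency boost the thread is circling around, using the _val_ hook from the FunctionQuery wiki page linked above: recip(rord(field),m,a,b) produces a value that decays with document age. The field name creationDate and the constants 1,1000,1000 are assumptions for illustration:

```python
from urllib.parse import urlencode

# _val_ embeds a function query into the main query; recip(rord(...))
# gives newer documents (lower reverse-ordinal) a higher score.
boost = '_val_:"recip(rord(creationDate),1,1000,1000)"'
query = f"title:java AND {boost}"

# URL-encode before sending (quotes, colons and spaces all need escaping):
params = urlencode({"q": query})
print(query)
```

Note the field name inside recip() is not parsed as field:value syntax, which is one reason a plain name without a colon (as suggested above) is easier to work with than source:site.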
Re: jboss and solr
On Thu, Dec 11, 2008 at 11:21 AM, Neha Bhardwaj [EMAIL PROTECTED] wrote: I am trying to configure JBoss with Solr. As stated in the wiki docs I copied the solr.war, but there is no web-apps folder currently present in JBoss. So should I create web-apps manually and paste the war file there? For JBoss, war files are deployed to this location: $JBOSS_HOME/server/default/deploy Please look up resources on the net for more information on running applications in JBoss. I tried configuring Solr with Tomcat as well. I pasted the war file in Tomcat's webapps folder. Now when I set the system property solr.solr.home it raises a class-not-found exception. Probably something is missing in the environment settings. One way to get Solr running in Tomcat is to start the Tomcat server from the directory where the Solr home is present. E.g. if the Solr home is at location /home/users/test-solr/solr, then start the Tomcat server from the /home/users/test-solr directory. This assumes that you have $TOMCAT_HOME/bin in your PATH env variable. Can any one help me with that. DISCLAIMER == This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails. -- Regards, Akshay Ukey.
Re: Multiple indexing
Please take a look at this: http://wiki.apache.org/solr/MultipleIndexes On Mon, Dec 8, 2008 at 10:25 AM, Neha Bhardwaj [EMAIL PROTECTED] wrote: Is multiple indexing possible in solr? If yes, how? DISCLAIMER == This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails. -- Regards, Akshay K. Ukey.
Re: delta-import for XML files, Solr statistics
On Fri, Oct 24, 2008 at 6:07 PM, [EMAIL PROTECTED] wrote: Thanks for your very fast response :-) 2.) The documentation from DataImportHandler describes the index update process for SQL databases only... My scenario: - My application creates, deletes and modifies files in /tmp/files every night. - delta-import / DataImportHandler should mirror _all_ these changes to my Lucene index (= create, delete, update documents). The only EntityProcessor which supports delta is SqlEntityProcessor. The XPathEntityProcessor has not implemented it, because we do not know of a consistent way of finding deltas for XML. So, unfortunately, no delta support for XML. But that said, you can implement those methods in XPathEntityProcessor. The methods are explained in EntityProcessor.java. If you have questions specific to this I can help. Probably we can contribute it back. === Is this possible with delta-import / DataImportHandler? === If not: Do you have any suggestions on how to do this? Ok so, at the moment I have to do a full-import to update my index. What happens with (user) queries while full-import is running? Does Solr block these queries until the import is finished? Which configuration options control this behavior? No, queries to Solr are not blocked during full import. My scenario: - /tmp/files contains 682 'myDoc_.*\.xml' XML files. - Each XML file contains 12 XML elements (e.g. <title>foo</title>). - DataImportHandler transfers only 5 of these 12 elements to the Lucene index. I don't understand the output from 'solr/dataimport' (= status):

<response>
  ...
  <lst name="statusMessages">
    <str name="Total Requests made to DataSource">0</str>
    <str name="Total Rows Fetched">1363</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2008-10-24 13:19:03</str>
    <str name="">Indexing completed. Added/Updated: 681 documents. Deleted 0 documents.</str>
    <str name="Committed">2008-10-24 13:19:05</str>
    <str name="Optimized">2008-10-24 13:19:05</str>
    <str name="Time taken">0:0:2.648</str>
  </lst>
  ...
</response>

=== Why does the Added/Updated counter show 681 and not 682? Added/Updated is the no. of docs. How do you know the number is not accurate? /tmp/files$ ls myDoc_*.xml | wc -l gives 682, but Added/Updated shows 681. Does this mean that one file has an XML error? But the statistics say Total Documents Skipped = 0?! It might be the case that somewhere there is an extra line in one of the XML files, a line like <?xml version="1.0" encoding="utf-8"?> or something. 4.) And my last questions about Solr statistics/information... === Is it possible to get information (number of indexed documents, stored values from documents etc.) from the current Lucene index? === The admin web interface shows 'numDocs' and 'maxDoc' in 'statistics/core'. Is 'numDocs' the number of indexed documents? What does 'maxDoc' mean? Do you have answers for these questions too? Bye, Simon -- The GMX SmartSurfer helps you save up to 70% of your online costs! Ideal for modem and ISDN: http://www.gmx.net/de/go/smartsurfer -- Regards, Akshay Ukey.
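To track down the one file behind the 681-vs-682 discrepancy, a quick well-formedness check over the input documents is enough; the suspected stray second XML declaration mentioned above is itself a parse error. A minimal sketch (shown over in-memory strings; in practice you would read each file in /tmp/files):

```python
import xml.etree.ElementTree as ET

def invalid_docs(xml_strings):
    """Return indices of documents that fail to parse -- candidates for
    the file DataImportHandler silently dropped."""
    bad = []
    for i, text in enumerate(xml_strings):
        try:
            ET.fromstring(text)
        except ET.ParseError:
            bad.append(i)
    return bad

docs = [
    "<doc><title>ok</title></doc>",
    # a stray second XML declaration makes this one malformed:
    "<?xml version='1.0'?><?xml version='1.0'?><doc/>",
]
print(invalid_docs(docs))  # [1]
```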
Re: Special character matching 'x' ?
You need to configure Tomcat appropriately for recognizing international characters in the URI. Take a look at this to see if it helps, http://wiki.apache.org/solr/SolrTomcat#head-20147ee4d9dd5ca83ed264898280ab60457847c4 On Thu, Sep 18, 2008 at 10:53 AM, Sanjay Suri [EMAIL PROTECTED] wrote: Hi, Can someone shed some light on this? One of my field values has the name Räikkönen which contains a special characters. Strangely, as I see it anyway, it matches on the search query 'x' ? Can someone explain or point me to the solution/documentation? Any help appreciated, -Sanjay -- Sanjay Suri Videocrux Inc. http://videocrux.com +91 99102 66626 -- Regards, Akshay Ukey.
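The Tomcat setting the wiki link above describes is the URIEncoding attribute on the HTTP connector; without it, Tomcat decodes request URIs as ISO-8859-1 and multibyte characters like those in "Räikkönen" get mangled before they reach Solr. A sketch of the relevant server.xml fragment (port and protocol values are illustrative):

```xml
<!-- conf/server.xml: decode request URIs as UTF-8 so non-ASCII
     query terms reach Solr intact -->
<Connector port="8080" protocol="HTTP/1.1" URIEncoding="UTF-8" />
```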
Re: Where DATA are stored
You will find the indexed data inside the data/index directory of your Solr home. The documents are stored in the Lucene index file format, which is not human readable. To find a document or documents you have to search for them through the Solr admin web page with appropriate Lucene query syntax (http://lucene.apache.org/java/docs/queryparsersyntax.html). Duplicates will not occur if you have a uniqueKey defined in your schema; also make sure you don't have the allowDups attribute set to true in the add command when adding documents. On Mon, Jul 21, 2008 at 12:42 PM, sanraj25 [EMAIL PROTECTED] wrote: Hi, When indexing data with Solr, where are documents stored and how do I find a document in the Solr installation? And how do I avoid data duplication in the Solr database? Thanks in advance. Regards, Santhanaraj R -- View this message in context: http://www.nabble.com/Where-DATA-are-stored-tp18563280p18563280.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey.
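The deduplication behavior described above hinges on the uniqueKey declaration in schema.xml: re-adding a document with the same key replaces the old one instead of creating a duplicate. A sketch of the relevant fragment (the field name "id" is illustrative):

```xml
<!-- schema.xml: documents sharing the same id overwrite each other
     on add, preventing duplicates -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<uniqueKey>id</uniqueKey>
```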