Re: Force open a searcher in solr.
So to make things clear, below is what I am expecting. I have a document with a unique id field, let's say "uniqueID". This document has both stored/indexed and not stored/not indexed fields. Currently I have my pop values in external files, but I will instead define a new field in the schema (popVal) which will not be stored or indexed and will have docValues=true. I am also moving the _version_ field to indexed=false and stored=false, since I don't have any case where I retrieve it or use it for searching. I'm just hoping this doesn't cause any issues with updates in general (I read that keeping it not stored and not indexed is recommended since Solr 7).

Regards,
Akshay

On Thu, Aug 13, 2020 at 4:53 PM Erick Erickson wrote:
> Let us know how it works. I want to be sure I'm not confusing you though. There isn't a "doc ID field". The structure of an eff file is
>
> docid:value
>
> where docid is your <uniqueKey>. What updating numerics does is allow you to update a field in a doc that's identified by <uniqueKey>. That field can have any name you want as long as it's defined respecting the limitations in that link.
>
> Best,
> Erick
>
> > On Aug 13, 2020, at 6:30 AM, Akshay Murarka wrote:
> >
> > Hey Erick,
> >
> > Thanks for the information about the doc ID field. Our external file values are single float value fields and we do use them in function queries in the boost parameter, so based on the definition the above should work. We currently use Solr 5.4.0 but are in the process of upgrading our systems, so we will try out this change.
> >
> > Regards,
> > Akshay
> >
> > On Mon, Aug 10, 2020 at 10:19 PM Erick Erickson wrote:
> >
> >> Right, but you can use those with function queries. Assuming your eff entry is a doc ID plus a single numeric, I was wondering if you can accomplish what you need with function queries...
> >>
> >>> On Aug 10, 2020, at 11:30 AM, raj.yadav wrote:
> >>>
> >>> Erick Erickson wrote
> >>>> Ah, ok. That makes sense. I wonder if your use-case would be better served, though, by "in place updates", see:
> >>>> https://lucene.apache.org/solr/guide/8_1/updating-parts-of-documents.html
> >>>> This has been around since Solr 6.5…
> >>>
> >>> As per the documentation, `in place update` is only available for numeric docValues (along with a few more conditions), and here it's an external field type.
> >>>
> >>> Regards,
> >>> Raj
> >>>
> >>> --
> >>> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
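For what it's worth, the schema changes discussed in this thread might look something like the following in schema.xml. This is only a sketch based on the thread, not a definitive config: the `pfloat`/`plong` type names assume a Solr 7+ default schema, and the popVal definition would still need to be checked against the in-place update conditions in the linked guide.

```xml
<!-- Candidate for in-place updates: single-valued, non-indexed,
     non-stored, docValues-only numeric field (replaces the eff file). -->
<field name="popVal" type="pfloat" indexed="false" stored="false" docValues="true"/>

<!-- _version_ kept as docValues only, as recommended since Solr 7. -->
<field name="_version_" type="plong" indexed="false" stored="false" docValues="true"/>
```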
Re: Reaching max Filter cache limit increases the request latencies.
Hey Erick,

So I am investigating where we can limit the values that are cached using {!cache=false} (we already use it in some of our cases). In general there are 0 evictions on the filter cache side, but whenever we hit this max limit there is a spike in evictions as well (which is expected). As far as I remember we are not forcing enum on our side, but I will definitely verify that. My filter cache hit ratio remains constant at around 97.5%, and even during this eviction the hit ratio doesn't go down. Regarding other operations, there are a few cases where indexing (80 to 150 docs) also happened during that time, but there are also cases where indexing happened 5-10 min after that and the latencies remained high.

Regards,
Akshay

On Thu, Aug 13, 2020 at 5:08 PM Erick Erickson wrote:
> Well, when you hit the max capacity, cache entries get aged out and are eligible for GC, so GC activity increases. But for aging out filterCache entries to be noticeable, you have to be flushing a _lot_ of them out. Which, offhand, makes me wonder if you're using the filterCache appropriately.
>
> Here's what I'd investigate first: what kinds of fq clauses are you using, and are they making best use of the filterCache? Consider an fq clause like
>
> fq=date_field:[* TO NOW]
>
> That will consume an entry in the filterCache and never be re-used, because NOW is the epoch time and will change a millisecond later.
>
> Similarly for fq clauses that contain a lot of values that may vary, for instance
>
> fq=id:(1 2 4 86 93 …)
>
> where the list of IDs is not likely to be repeated. Or even repeated in a different order.
> If you do identify patterns that you _know_ will not be repeated, just add
>
> fq={!cache=false}your_unrepeated_pattern
>
> What I'm guessing here is that if you've correctly identified that the filterCache filling up is increasing GC activity that much, you must be evicting a _lot_ of fq entries very rapidly, which indicates you're not repeating fq's very often.
>
> I should add that the filterCache is also used for some other operations, particularly some kinds of faceting if you specify the enum method. Are you forcing that?
>
> All that said, I'm also wondering if this is coincidence and your slowdown is something else. Because given all the work a query does, the additional bookkeeping due to filterCache churn doesn't really sound like the culprit. Prior to the filterCache filling up, what's your hit ratio? The scenario I can see where the filterCache churn could cause your response times to go up is if, up until that point, you're getting a high hit ratio that goes down after the cache starts aging out entries. I find this rather unlikely, but possible.
>
> Best,
> Erick
>
> > On Aug 13, 2020, at 3:19 AM, Akshay Murarka wrote:
> >
> > Hey guys,
> >
> > For quite some time we have been facing an issue where, whenever the used filterCache value reaches the maximum configured value, we start seeing an increase in the query latencies on the Solr side. During this time we also see an increase in our garbage collection and CPU. Only when a commit happens with openSearcher=true do the latency values come back to normal.
> >
> > Is there any setting that can help us with this, or will increasing the max configured value for the filter cache help? Right now we can't increase the commit frequency.
> >
> > Thanks for the help.
> >
> > Regards,
> > Akshay
> >
> > Below is the graph for request latency
> >
> > Below is the graph for the filter cache values
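To make the quoted advice concrete, here is a small sketch of the two kinds of fq clauses as request parameters. The field names are just the examples from the thread: a date filter rounded to NOW/DAY, which produces an identical fq for every request on the same day and can therefore be served from the filterCache, next to a one-off ID list marked {!cache=false} so it never evicts useful entries.

```python
from urllib.parse import urlencode

# Reusable filter: rounding NOW to day granularity means every request
# issued on the same day produces an identical fq string, so the
# filterCache entry computed for the first request serves all the others.
cached_fq = "date_field:[* TO NOW/DAY]"

# One-off filter: an arbitrary ID list is unlikely to repeat, so telling
# Solr not to cache it avoids churning entries out of the filterCache.
uncached_fq = "{!cache=false}id:(1 2 4 86 93)"

# doseq=True emits one fq= parameter per clause, as Solr expects.
params = urlencode({"q": "*:*", "fq": [cached_fq, uncached_fq]}, doseq=True)
print(params)
```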
Re: Force open a searcher in solr.
Hey Erick,

Thanks for the information about the doc ID field. Our external file values are single float value fields, and we do use them in function queries in the boost parameter, so based on the definition the above should work. We currently use Solr 5.4.0 but are in the process of upgrading our systems, so we will try out this change.

Regards,
Akshay

On Mon, Aug 10, 2020 at 10:19 PM Erick Erickson wrote:
> Right, but you can use those with function queries. Assuming your eff entry is a doc ID plus a single numeric, I was wondering if you can accomplish what you need with function queries...
>
> > On Aug 10, 2020, at 11:30 AM, raj.yadav wrote:
> >
> > Erick Erickson wrote
> >> Ah, ok. That makes sense. I wonder if your use-case would be better served, though, by "in place updates", see:
> >> https://lucene.apache.org/solr/guide/8_1/updating-parts-of-documents.html
> >> This has been around since Solr 6.5…
> >
> > As per the documentation, `in place update` is only available for numeric docValues (along with a few more conditions), and here it's an external field type.
> >
> > Regards,
> > Raj
> >
> > --
> > Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Force open a searcher in solr.
Hey,

So I have external file fields that hold some data that gets updated regularly. Whenever those get updated we need the open-searcher operation to happen. The values in these external files are used in boosting and other function/range queries.

On Mon, Aug 10, 2020 at 5:08 PM Erick Erickson wrote:
> In a word, "no". There is explicit code to _not_ open a new searcher if the index hasn't changed, because it's an expensive operation.
>
> Could you explain _why_ you want to open a new searcher even though the index is unchanged? The reason for the check in the first place is that nothing has changed about the index, so the assumption is that there's no reason to open a new searcher.
>
> You could add at least one bogus doc on each shard, then delete them all, then issue a commit, as a rather crude way to do this. Insuring that you changed at least one doc on each shard is "an exercise for the reader"…
>
> Again, though, perhaps if you explained why you think this is necessary we could suggest another approach. At first glance, this looks like an XY problem though.
>
> Best,
> Erick
>
> > On Aug 10, 2020, at 5:49 AM, Akshay Murarka wrote:
> >
> > Hey,
> >
> > I have a use case where none of the documents in my Solr index are changing, but I still want to open a new searcher through the curl API. On executing the below curl command
> >
> > curl "XXX.XX.XX.XXX:9744/solr/mycollection/update?openSearcher=true&commit=true"
> >
> > it doesn't open a new searcher. Below is what I get in the logs:
> >
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6766) [c:mycollection s:shard1_1_1 r:core_node7 x:mycollection_shard1_1_1_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
> > 2020-08-10 09:32:22.696 INFO (qtp297786644-6766) [c:mycollection s:shard1_1_1 r:core_node7 x:mycollection_shard1_1_1_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
> > 2020-08-10 09:32:22.697 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
> > 2020-08-10 09:32:22.697 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
> > 2020-08-10 09:32:22.697 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
> > 2020-08-10 09:32:22.697 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.u.DirectUpdateHandler2 end_commit_flush
> > 2020-08-10 09:32:22.697 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.u.DirectUpdateHandler2 end_commit_flush
> > 2020-08-10 09:32:22.697 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.u.DirectUpdateHandler2 end_commit_flush
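Erick's bogus-document workaround from this thread can be sketched roughly as below. This is a hedged illustration, not a tested recipe: the `__force_commit__` ID is hypothetical, the bodies would be POSTed to the collection's /update endpoint, and routing one dummy doc to every shard remains "an exercise for the reader".

```python
import json

def force_new_searcher_payloads(dummy_id="__force_commit__"):
    """Build the three JSON update bodies for the crude workaround:
    add a throwaway doc, delete it, then commit with openSearcher=true
    so a new searcher is opened even though the net index is unchanged."""
    add = json.dumps({"add": {"doc": {"id": dummy_id}}})
    delete = json.dumps({"delete": {"id": dummy_id}})
    commit = json.dumps({"commit": {"openSearcher": True}})
    return [add, delete, commit]

for body in force_new_searcher_payloads():
    print(body)
```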
Force open a searcher in solr.
Hey,

I have a use case where none of the documents in my Solr index are changing, but I still want to open a new searcher through the curl API. On executing the below curl command

curl "XXX.XX.XX.XXX:9744/solr/mycollection/update?openSearcher=true&commit=true"

it doesn't open a new searcher. Below is what I get in the logs:

2020-08-10 09:32:22.696 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
2020-08-10 09:32:22.696 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
2020-08-10 09:32:22.696 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
2020-08-10 09:32:22.696 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
2020-08-10 09:32:22.696 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
2020-08-10 09:32:22.696 INFO (qtp297786644-6766) [c:mycollection s:shard1_1_1 r:core_node7 x:mycollection_shard1_1_1_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
2020-08-10 09:32:22.696 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
2020-08-10 09:32:22.696 INFO (qtp297786644-6766) [c:mycollection s:shard1_1_1 r:core_node7 x:mycollection_shard1_1_1_replica1] o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
2020-08-10 09:32:22.697 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
2020-08-10 09:32:22.697 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
2020-08-10 09:32:22.697 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
2020-08-10 09:32:22.697 INFO (qtp297786644-6824) [c:mycollection s:shard1_1_0 r:core_node6 x:mycollection_shard1_1_0_replica1] o.a.s.u.DirectUpdateHandler2 end_commit_flush
2020-08-10 09:32:22.697 INFO (qtp297786644-6819) [c:mycollection s:shard1_0_1 r:core_node5 x:mycollection_shard1_0_1_replica1] o.a.s.u.DirectUpdateHandler2 end_commit_flush
2020-08-10 09:32:22.697 INFO (qtp297786644-6829) [c:mycollection s:shard1_0_0 r:core_node4 x:mycollection_shard1_0_0_replica1] o.a.s.u.DirectUpdateHandler2 end_commit_flush

I don't want to do a complete reload of my collection. Is there any parameter that can be used to forcefully open a new searcher every time I do a commit with openSearcher=true?

Thanks in advance for the help
Re: Timeout waiting for connection from pool
Is there any way through which I can create an external plugin and update these values?

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Timeout waiting for connection from pool
I don't have an issue with increasing the request rate, but I am facing this issue when the system is going under recovery. It's not able to recover properly and is throwing this connection error.

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Timeout waiting for connection from pool
Hey,

I am currently running Solr 5.4.0 in SolrCloud mode. Everything has been working fine till now, but when I start increasing the request rate I am starting to get connection timeout errors:

Caused by: org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool

On reading more about this I found that Solr 5.4.0 has a major bug, fixed in version 5.5, related to low values for maxUpdateConnectionsPerHost. But I can't update my system to 5.5 as of now. I am not able to find where/how to add/edit the above mentioned parameter to increase its value. Any help would be highly appreciated.

Regards,
Akshay

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Learning to rank
Hi,

I am a student, and for my master's thesis I am working on Learning To Rank. While researching it I found the solution provided by Bloomberg, but with the example you have provided it always shows a Bad Request error. Do you have a running example of it, so I can adapt it to my application? I am trying to use the example that you have provided on GitHub (core: techproducts, traning_and_uploading_demo.py). It generates the training data, but I am getting a problem when uploading the model: it shows a Bad Request error (empty request body). Please help me out with this problem so I will be able to adapt it to my application.

Best Regards!

Any help would be appreciated
Solr reload process flow
Hey,

I am using Solr 5.4.0 in my production environment and am trying to automate the reload/restart process of the Solr collections based on certain specific conditions. I noticed that on Solr reload the thread count increases a lot, thereby resulting in increased latencies. So I read about the reload process and came to know that while reloading:

1) Solr creates a new core internally and then assigns this core the same name as the old core. Is this correct?
2) If the above is true, then does Solr actually create a new index internally on reload?
3) If so, then restart sounds much better than reload, or is there any better way to upload new configs to Solr?
4) Can you point me to any docs that can give me more details about this?

Any help would be appreciated, thank you.

Regards,
Akshay
Re: Example Solr Config on EC2
Yes, you can promote a slave to be master; refer to
http://wiki.apache.org/solr/SolrReplication#enable.2BAC8-disable_master.2BAC8-slave_in_a_node

In AWS one can use an elastic IP (http://aws.amazon.com/articles/1346) to refer to the master, and this can be assigned to slaves as they assume the role of master (in case of failure). All slaves will then refer to this new master and there will be no need to regenerate data. Automation of this may be possible through CloudWatch alarm actions. I don't know of any available example automation scripts.

Cheers,
Akshay

On Wed, Aug 10, 2011 at 9:08 PM, Matt Shields m...@mattshields.org wrote:

If I were to build a master with multiple slaves, is it possible to promote a slave to be the new master if the original master fails? Will all the slaves pick up right where they left off, or any time the master fails will we need to completely regenerate all the data? If this is possible, are there any examples of this being automated? Especially on Win2k3.

Matthew Shields
Owner
BeanTown Host - Web Hosting, Domain Names, Dedicated Servers, Colocation, Managed Services
www.beantownhost.com
www.sysadminvalley.com
www.jeeprally.com

On Mon, Aug 8, 2011 at 5:34 PM, mboh...@yahoo.com wrote:

Matthew,

Here's another resource:
http://www.lucidimagination.com/blog/2010/02/01/solr-shines-through-the-cloud-lucidworks-solr-on-ec2/

Michael Bohlig
Lucid Imagination

----- Original Message -----
From: Matt Shields m...@mattshields.org
To: solr-user@lucene.apache.org
Sent: Mon, August 8, 2011 2:03:20 PM
Subject: Example Solr Config on EC2

I'm looking for some examples of how to set up Solr on EC2. The configuration I'm looking for would have multiple nodes for redundancy. I've tested in-house with a single master and slave with replication running in Tomcat on Windows Server 2003, but even if I have multiple slaves, the single master is a single point of failure. Any suggestions or example configurations?

The project I'm working on is a .NET setup, so ideally I'd like to keep this search cluster on Windows Server, even though I prefer Linux.

Matthew Shields
Owner
BeanTown Host - Web Hosting, Domain Names, Dedicated Servers, Colocation, Managed Services
www.beantownhost.com
www.sysadminvalley.com
www.jeeprally.com
Auto-scaling solr setup
So I am trying to set up an auto-scaling search system of EC2 Solr slaves which scales up as the number of requests increases, and vice versa. Here is what I have:

1. A Solr master and underlying slaves (scalable), and an elastic load balancer to distribute the load.
2. The EC2 auto-scaling setup fires up nodes when traffic increases. However, the replication times (replication speed) for the index from the master vary for these newly fired nodes.
3. I want to avoid adding these nodes to the load balancer until they have completed initial replication and have a warmed-up cache.

For this I need to know a way to check if the initial replication has completed, and also a way of warming up the cache after this. I can think of doing this via a shell script/awk (checking times replicated/index size)... is there a cleaner way?

Also, on a side note, any suggestions or pointers on how one sets up a scalable Solr setup in the cloud (AWS mainly) would be helpful.

Regards,
Akshay
Re: Auto-scaling solr setup
Yes, sadly I too don't have much of a clue about AWS. The SolrReplication API doesn't give me exactly what I want. For the time being I have hacked my way into the Amazon image, bootstrapping the replication check in a shell script ((curl + awk) a very dirty way). Once the check succeeds I enable the server using the Solr healthcheck for load balancers. I was wondering if anyone has moved to the cloud, especially Amazon auto-scaling, where they don't have control over when a new node is fired. All the scenarios I encountered were people creating a node, warming up the cache, and then adding it under the HAProxy LB. I guess warmup is not that big an issue compared to an empty response. Thanks for your response :)

Regards,
Akshay

On Mon, Jun 6, 2011 at 6:33 PM, Erick Erickson erickerick...@gmail.com wrote:

The HTTP interface (http://wiki.apache.org/solr/SolrReplication#HTTP_API) can be used to control lots of parts of replication. As to warmups, I don't know of a good way to test that. I don't know whether getting the current status on the slave includes whether warmup is completed or not. At worst, after replication is complete you could wait an interval (see the warmup times on your running servers) before routing requests to the slave. I haven't any clue at all about AWS...

Best,
Erick

On Mon, Jun 6, 2011 at 9:18 AM, Akshay akm...@gmail.com wrote:

So I am trying to set up an auto-scaling search system of EC2 Solr slaves which scales up as the number of requests increases, and vice versa. Here is what I have:

1. A Solr master and underlying slaves (scalable), and an elastic load balancer to distribute the load.
2. The EC2 auto-scaling setup fires up nodes when traffic increases. However, the replication times (replication speed) for the index from the master vary for these newly fired nodes.
3. I want to avoid adding these nodes to the load balancer until they have completed initial replication and have a warmed-up cache.

For this I need to know a way to check if the initial replication has completed, and also a way of warming up the cache after this. I can think of doing this via a shell script/awk (checking times replicated/index size)... is there a cleaner way?

Also, on a side note, any suggestions or pointers on how one sets up a scalable Solr setup in the cloud (AWS mainly) would be helpful.

Regards,
Akshay
Re: replicated index files have incorrect timestamp
You need to specify the index version number for which the list of files is to be shown. The URL should be like this:

http://masterhost:port/solr/replication?command=filelist&indexversion=<index version number>

You can get the index version number from the URL:

http://masterhost:port/solr/replication?command=indexversion

On Fri, Apr 24, 2009 at 1:10 AM, Jeff Newburn jnewb...@zappos.com wrote:

We see the exact same thing. Additionally, that URL returns 404 on a multicore setup and gives an error when I add the core:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <str name="status">no indexversion specified</str>
</response>

--
Jeff Newburn
Software Engineer, Zappos.com
jnewb...@zappos.com - 702-943-7562

From: Jian Han Guo jian...@gmail.com
Reply-To: solr-user@lucene.apache.org
Date: Wed, 22 Apr 2009 23:43:02 -0700
To: solr-user@lucene.apache.org
Subject: Re: replicated index files have incorrect timestamp

I am using Mac OS 10.5. I can't access the box right now and this week. I'll do it next week and post the result then.

Thanks,
Jianhan

2009/4/22 Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com

Which OS are you using? It does not look at the timestamps to decide if the index is in sync; it looks at the index version only. BTW, can you just hit the master with the URL and paste the response here:

http://masterhost:port/solr/replication?command=filelist

On Thu, Apr 23, 2009 at 11:53 AM, Jian Han Guo jian...@gmail.com wrote:

That's right. The timestamps of the files on the slave side are all Dec 31 1969, so it looks like the timestamp was not set (and therefore it is zero). The ones on the master side are all correct. Nevertheless, Solr seems to be able to recognize that master and slave are in sync after replication; I don't know how it does that. I haven't checked if the two machines are in sync, but even if they are not, the timestamp should not be Dec 31, 1969, I think.

Thanks,
Jianhan

2009/4/22 Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com

Let me assume that you are using the built-in replication. The replication tries to set the timestamp of all the files the same as that of the files on the master. Just cross check.

On Thu, Apr 23, 2009 at 6:57 AM, Jian Han Guo jian...@gmail.com wrote:

Hi,

I am using the nightly build of 4/22/2009. Replication works fine, but the files inside the index directory on the slave side all have an old timestamp: Dec 31 1969. Is this a known issue?

Thanks,
Jianhan

--
--Noble Paul

--
--Noble Paul

--
Regards,
Akshay K. Ukey.
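The two replication URLs above can be sketched as a small helper; the host, port, and version number here are placeholders.

```python
# Step 1: ask the master for its current index version.
INDEXVERSION_URL = "http://masterhost:8983/solr/replication?command=indexversion"

# Step 2: request the file list for exactly that version.
def filelist_url(host, version):
    """Build the filelist URL for a given host:port and index version."""
    return ("http://%s/solr/replication?command=filelist&indexversion=%d"
            % (host, version))

print(filelist_url("masterhost:8983", 1240500000000))
```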
Re: multicore
On Thu, Apr 2, 2009 at 5:58 PM, Neha Bhardwaj neha_bhard...@persistent.co.in wrote:

Hi, I want to index through the command line. How do I do that?

You can use curl:
http://wiki.apache.org/solr/UpdateXmlMessages?highlight=%28curl%29#head-c614ba822059ae20dde5698290caeb851534c9de

-----Original Message-----
From: Erik Hatcher [mailto:e...@ehatchersolutions.com]
Sent: Thursday, April 02, 2009 5:42 PM
To: solr-user@lucene.apache.org
Subject: Re: multicore

On Apr 2, 2009, at 6:32 AM, Neha Bhardwaj wrote:

Also, how do I index data into a particular core? Say we have core0 and core1 in multicore. How can I specify which core I want to index data into?

You index into http://localhost:8983/solr/core0/update or http://localhost:8983/solr/core1/update

Erik

DISCLAIMER ========== This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails.

--
Regards,
Akshay K. Ukey.
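To illustrate the per-core indexing described above, here is a sketch that builds the XML update body a core-specific /update endpoint accepts. The host, port, and field values are placeholders (the sample doc borrows an ID from the techproducts example data), and the body would be POSTed with Content-Type text/xml.

```python
from xml.etree import ElementTree as ET

def add_doc_xml(fields):
    """Build the <add><doc>...</doc></add> XML body that a core's
    update handler (e.g. /solr/core0/update) accepts."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        f = ET.SubElement(doc, "field", name=name)
        f.text = str(value)
    return ET.tostring(add, encoding="unicode")

# Core-specific endpoint: core0 rather than core1.
url = "http://localhost:8983/solr/core0/update?commit=true"
body = add_doc_xml({"id": "SP2514N", "name": "Samsung SpinPoint P120"})
print(url)
print(body)
```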
Re: Times Replicated Since Startup: 109 since yesterday afternoon?
Can you post your ReplicationHandler configuration?

On Mon, Mar 30, 2009 at 8:17 PM, sunnyfr johanna...@gmail.com wrote:

Hi,

Can you explain more about this replication script in Solr 1.4? It does work, but it always replicates everything from the master, so the slave loses every cache on each replication. I don't really get how it works.

Thanks a lot,

--
View this message in context: http://www.nabble.com/Times-Replicated-Since-Startup%3A-109--since-yesterday-afternoon--tp22784943p22784943.html
Sent from the Solr - User mailing list archive at Nabble.com.

--
Regards,
Akshay K. Ukey.
Re: Trunk Replication Page Issue
Hi Jeff,

The line number from your stacktrace doesn't seem to be valid in the trunk code (of the JSP). Did you do an ant clean dist? If yes, can you send me the generated servlet for the JSP?

On Fri, Feb 27, 2009 at 10:17 PM, Jeff Newburn jnewb...@zappos.com wrote:

In trying trunk to fix the Lucene sync issue we have now encountered a severe Java exception making the replication page non-functional. Am I missing something or doing something wrong?

Info: Slave server on the replication page. Just a code dump as follows.

Feb 27, 2009 8:44:37 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.jasper.JasperException: java.lang.NullPointerException
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:418)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:337)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:266)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:630)
at org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:436)
at org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:374)
at org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:302)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
at org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:879)
at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:719)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2080)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.NullPointerException
at org.apache.jsp.admin.replication.index_jsp._jspService(index_jsp.java:294)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:374)
... 24 more

--
Jeff Newburn
Software Engineer, Zappos.com
jnewb...@zappos.com - 702-943-7562

--
Regards,
Akshay K. Ukey.
Re: CoreAdmin for replication STATUS
On Fri, Jan 16, 2009 at 4:57 AM, Jacob Singh jacobsi...@gmail.com wrote: Hi, How do I find out the status of a slave's index? I have the following scenario: 1. Boot up the slave. I give it a core name of boot-$CoreName. 2. I call boot-$CoreName/replication?command=snappull 3. I check back every minute using cron and I want to see if the slave has actually gotten the data. 4. When it gets the data I call solr/admin/cores?action=RENAME&core=boot-$CoreName&other=$CoreName. I do this because the balancer will start hitting the slave before it has the full index otherwise. Step 3 is the problem. I don't have a reliable way to know it has finished replicating, AFAIK. I see in ?action=STATUS for the CoreAdmin there is a field called current. Is this useful for this? If not, what is recommended? I could hit the admin/replication/index.jsp URL and screen-scrape the HTML, but I imagine there is a better way. From the slave you can issue an HTTP command, boot-$CoreName/replication?command=details. This returns XML containing a node isReplicating with a boolean value, which tells you whether replication is in progress or completed. Thanks, Jacob -- +1 510 277-0891 (o) +91 33 7458 (m) web: http://pajamadesign.com Skype: pajamadesign Yahoo: jacobsingh AIM: jacobsingh gTalk: jacobsi...@gmail.com -- Regards, Akshay Ukey.
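A minimal sketch of step 3 in Python: parse the XML returned by the slave's replication?command=details and check the isReplicating node before issuing the RENAME. The exact shape of the details response is an assumption here (only the isReplicating <str> node is relied on, as described above):

```python
import xml.etree.ElementTree as ET

def is_replicating(details_xml: str) -> bool:
    """Return True if the details response reports replication in progress.

    Assumes the response contains a <str name="isReplicating"> node, as
    returned by /replication?command=details on the slave.
    """
    root = ET.fromstring(details_xml)
    for node in root.iter("str"):
        if node.get("name") == "isReplicating":
            return node.text.strip().lower() == "true"
    return False

# Hypothetical fragment of a details response, for illustration only:
sample = """<response>
  <lst name="details">
    <str name="isReplicating">false</str>
  </lst>
</response>"""
print(is_replicating(sample))  # False -> safe to RENAME the core
```

A cron job could fetch the details URL, feed the body to this function, and rename the core only once it returns False.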
Re: To get all indexed records.
Use *:* as a query to get all records. Refer to http://wiki.apache.org/solr/SolrQuerySyntax for more info. On Mon, Jan 12, 2009 at 5:30 PM, Tushar_Gandhi tushar_gan...@neovasolutions.com wrote: Hi, I am using solr 1.3. I want to retrieve all records from index file. How should I write solr query so that I will get all records? Thanks, Tushar. -- View this message in context: http://www.nabble.com/To-get-all-indexed-records.-tp21413170p21413170.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey.
Re: Solr query for date
On Thu, Jan 8, 2009 at 3:38 PM, prerna07 pkhandelw...@sapient.com wrote: My requirement is to fetch records within a range of 45 days. 1) ?q=date_field:[NOW TO NOW-45DAYS] is not returning any results. This works for me if you interchange the range limits, viz. [NOW-45DAYS TO NOW]. 2) ?q=date_field:[NOW TO NOW+45DAYS] is throwing an exception. This too works for me. Have you defined the date_field field as solr.DateField type in the schema? date_field should be of type solr.DateField to use a range query. However I get correct results when I run the following query: ?q=date_field:[* TO NOW] Please suggest the correct query for a range in days. Thanks, Prerna Akshay-8 wrote: You can use DateMath as: date_field:[NOW TO NOW+45DAYS] On Wed, Jan 7, 2009 at 3:00 PM, prerna07 pkhandelw...@sapient.com wrote: Hi, what will be the syntax of this SQL query SELECT * FROM table WHERE date > SYSDATE and date < SYSDATE+45 in Solr format? I need to fetch records where date is between the current date and 45 days from today. Thanks, Prerna -- View this message in context: http://www.nabble.com/Solr-query-for-date-tp21327696p21327696.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey. Enjoy your job, make lots of money, work within the law. Choose any two. -Author Unknown. -- View this message in context: http://www.nabble.com/Solr-query-for-date-tp21327696p21349038.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey.
Re: Solr query for date
On Thu, Jan 8, 2009 at 4:46 PM, prerna07 pkhandelw...@sapient.com wrote: 1) [NOW-45 TO NOW] works for me now. 2) [NOW TO NOW+45DAYS] is still throwing the following exception: -- message org.apache.lucene.queryParser.ParseException: Cannot parse 'dateToTest_product_s:[NOW TO NOW 45DAYS]': Encountered 45DAYS at line 1, Notice in the error message that the + sign in your query string is getting converted to a space. You need to escape the + sign and properly URL-encode the query before querying Solr. See here: http://wiki.apache.org/solr/SolrQuerySyntax#urlescaping column 33. Was expecting: ] ... description The request sent by the client was syntactically incorrect (org.apache.lucene.queryParser.ParseException: Cannot parse 'dateToTest_product_s:[NOW TO NOW 45DAYS]': Encountered 45DAYS at line 1, column 33. Was expecting: ] ...). - I am running the query on the Solr admin in Internet Explorer. I have defined the date field as date in schema.xml. Akshay-8 wrote: On Thu, Jan 8, 2009 at 3:38 PM, prerna07 pkhandelw...@sapient.com wrote: My requirement is to fetch records within a range of 45 days. 1) ?q=date_field:[NOW TO NOW-45DAYS] is not returning any results. This works for me if you interchange the range limits, viz. [NOW-45DAYS TO NOW]. 2) ?q=date_field:[NOW TO NOW+45DAYS] is throwing an exception. This too works for me. Have you defined the date_field field as solr.DateField type in the schema? date_field should be of type solr.DateField to use a range query. However I get correct results when I run the following query: ?q=date_field:[* TO NOW] Please suggest the correct query for a range in days. Thanks, Prerna Akshay-8 wrote: You can use DateMath as: date_field:[NOW TO NOW+45DAYS] On Wed, Jan 7, 2009 at 3:00 PM, prerna07 pkhandelw...@sapient.com wrote: Hi, what will be the syntax of this SQL query SELECT * FROM table WHERE date > SYSDATE and date < SYSDATE+45 in Solr format? I need to fetch records where date is between the current date and 45 days from today.
Thanks, Prerna -- View this message in context: http://www.nabble.com/Solr-query-for-date-tp21327696p21327696.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey. Enjoy your job, make lots of money, work within the law. Choose any two. -Author Unknown. -- View this message in context: http://www.nabble.com/Solr-query-for-date-tp21327696p21349038.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey. -- View this message in context: http://www.nabble.com/Solr-query-for-date-tp21327696p21349994.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey.
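The escaping fix above can be sketched in Python: a literal + inside DateMath (NOW+45DAYS) must be percent-encoded as %2B, otherwise the servlet container decodes it as a space, producing exactly the "Encountered 45DAYS" parse error quoted in the thread. The field name date_field is illustrative:

```python
from urllib.parse import quote, urlencode

# Raw Solr query; '+' inside NOW+45DAYS must reach Solr literally.
raw_q = "date_field:[NOW TO NOW+45DAYS]"

# Percent-encode everything except unreserved characters:
encoded = quote(raw_q, safe="")
print(encoded)  # date_field%3A%5BNOW%20TO%20NOW%2B45DAYS%5D

# urlencode handles the same escaping when building a full query string
# (spaces become '+', the literal '+' becomes %2B):
params = urlencode({"q": raw_q})
print(params)  # q=date_field%3A%5BNOW+TO+NOW%2B45DAYS%5D
```

Pasting the raw query into a browser address bar (as in the Internet Explorer case above) skips this encoding, which is why the + was lost.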
Re: Querying Solr Index for date fields
You will have to URL-encode the string correctly and supply the date in the format Solr expects. Please check this: http://wiki.apache.org/solr/SolrQuerySyntax On Fri, Jan 9, 2009 at 12:21 PM, Rayudu avsrit2...@yahoo.co.in wrote: Hi All, I have a field which is solr.DateField in my schema file. If I want to get the docs for a given date, e.g. get all the docs whose date value is 2009-01-09, then how can I query my index? As Solr's date format is yyyy-mm-ddThh:mm:ssZ, if I give the date as 2009-01-09T00:00:00Z it is throwing an exception solr.SolrException: HTTP code=400, reason=Invalid Date String:'2009-01-09T00'. If I give the date as 2009-01-09 it is throwing an exception, solr.SolrException: HTTP code=400, reason=Invalid Date String:'2009-01-09' Thanks, Rayudu. -- View this message in context: http://www.nabble.com/Querying-Solr-Index-for-date-fields-tp21367097p21367097.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey.
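A sketch of building a "whole day" query in the full timestamp format Solr expects (the field name date_field is an assumption; note the truncated 'Invalid Date String:2009-01-09T00' in the error above is the colon inside the timestamp splitting the value, which proper URL encoding and the full format avoid):

```python
from datetime import datetime, timedelta

# All documents dated 2009-01-09: build an explicit one-day range.
day = datetime(2009, 1, 9)
fmt = "%Y-%m-%dT%H:%M:%SZ"          # Solr's yyyy-mm-ddThh:mm:ssZ format
start = day.strftime(fmt)            # 2009-01-09T00:00:00Z
end = (day + timedelta(days=1)).strftime(fmt)

q = f"date_field:[{start} TO {end}]"
print(q)  # date_field:[2009-01-09T00:00:00Z TO 2009-01-10T00:00:00Z]
```

The query string then still needs URL encoding before it is sent, as discussed above.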
Re: Solr query for date
You can use DateMath as: date_field:[NOW TO NOW+45DAYS] On Wed, Jan 7, 2009 at 3:00 PM, prerna07 pkhandelw...@sapient.com wrote: Hi, what will be the syntax of this sql query SELECT * FROM table WHERE date SYSDATE and date SYSDATE+45 in solr format ? I need to fetch records where date is between current date and 45 days from today. Thanks, Prerna -- View this message in context: http://www.nabble.com/Solr-query-for-date-tp21327696p21327696.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey. Enjoy your job, make lots of money, work within the law. Choose any two. -Author Unknown.
Re: Dynamic Boosting at query time with boost value as another fieldvalue
The colon is used to specify the value for a field. E.g. in the query box of the Solr admin you would type something like fieldName:string to search (title:Java). You can use a hyphen '-' or some other character in the field name instead of a colon. On Mon, Dec 15, 2008 at 12:11 PM, Pooja Verlani pooja.verl...@gmail.comwrote: hi, Is it possible to have a fieldname with colon for example source:site? I want to apply query time boost as per recency to this field with the recency function. Recip function with rord isn't taking my source:site fieldname, its throwing an exception. I have tried with escape characters too. Please suggest something. Thank you, Regards Pooja On Thu, Dec 11, 2008 at 7:20 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Take a look at FunctionQuery support in Solr: http://wiki.apache.org/solr/FunctionQuery http://wiki.apache.org/solr/SolrRelevancyFAQ#head-b1b1cdedcb9cd9bfd9c994709b4d7e540359b1fd On Thu, Dec 11, 2008 at 7:01 PM, Pooja Verlani pooja.verl...@gmail.com wrote: Hi all, I have a specific requirement for query time boosting. I have to boost a field on the basis of the value returned from one of the fields of the document. Basically, I have the creationDate for a document and in order to introduce a recency factor in the search, I need to give a boost to the creation field, where the boost value is something like a log(1/x) function and x is the (presentDate - creationDate). Till now what I have seen is we can give only a static boost to the documents. In case you can provide a solution to my problem.. please do reply :) Thanks a lot, Regards. Pooja -- Regards, Shalin Shekhar Mangar. -- Regards, Akshay Ukey.
Re: Dynamic Boosting at query time with boost value as another fieldvalue
On Mon, Dec 15, 2008 at 12:36 PM, Pooja Verlani pooja.verl...@gmail.comwrote: ohk.. that means I can't use colon in the fieldname ever in such a scenario ? Probably you can use a colon in the field name. Are you using the special keyword _val_ for the recip function query? http://wiki.apache.org/solr/FunctionQuery#head-df0601b9306c8f2906ce91d3904bcd9621e02c99 http://wiki.apache.org/solr/SolrQuerySyntax On Mon, Dec 15, 2008 at 12:24 PM, Akshay akshay.u...@gmail.com wrote: The colon is used to specify the value for a field. E.g. in the query box of the Solr admin you would type something like fieldName:string to search (title:Java). You can use a hyphen '-' or some other character in the field name instead of a colon. On Mon, Dec 15, 2008 at 12:11 PM, Pooja Verlani pooja.verl...@gmail.com wrote: hi, Is it possible to have a fieldname with colon for example source:site? I want to apply query time boost as per recency to this field with the recency function. Recip function with rord isn't taking my source:site fieldname, its throwing an exception. I have tried with escape characters too. Please suggest something. Thank you, Regards Pooja On Thu, Dec 11, 2008 at 7:20 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Take a look at FunctionQuery support in Solr: http://wiki.apache.org/solr/FunctionQuery http://wiki.apache.org/solr/SolrRelevancyFAQ#head-b1b1cdedcb9cd9bfd9c994709b4d7e540359b1fd On Thu, Dec 11, 2008 at 7:01 PM, Pooja Verlani pooja.verl...@gmail.com wrote: Hi all, I have a specific requirement for query time boosting. I have to boost a field on the basis of the value returned from one of the fields of the document. Basically, I have the creationDate for a document and in order to introduce a recency factor in the search, I need to give a boost to the creation field, where the boost value is something like a log(1/x) function and x is the (presentDate - creationDate). Till now what I have seen is we can give only a static boost to the documents.
In case you can provide a solution to my problem.. please do reply :) Thanks a lot, Regards. Pooja -- Regards, Shalin Shekhar Mangar. -- Regards, Akshay Ukey. -- Regards, Akshay Ukey.
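A sketch of the recency boost the thread is circling around, using the _val_ hook from the FunctionQuery wiki page linked above: recip(rord(field),m,a,b) produces a value that decays with document age. The field name creationDate and the constants 1,1000,1000 are assumptions for illustration:

```python
from urllib.parse import urlencode

# _val_ embeds a function query into the main query; recip(rord(...))
# gives newer documents (lower reverse-ordinal) a higher score.
boost = '_val_:"recip(rord(creationDate),1,1000,1000)"'
query = f"title:java AND {boost}"

# URL-encode before sending (quotes, colons and spaces all need escaping):
params = urlencode({"q": query})
print(query)
```

Note the field name inside recip() is not parsed as field:value syntax, which is one reason a plain name without a colon (as suggested above) is easier to work with than source:site.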
Re: jboss and solr
On Thu, Dec 11, 2008 at 11:21 AM, Neha Bhardwaj [EMAIL PROTECTED] wrote: I am trying to configure JBoss with Solr. As stated in the wiki docs I copied the solr.war, but there is no web-apps folder currently present in JBoss. So should I create web-apps manually and paste the war file there? For JBoss, war files are deployed to this location: $JBOSS_HOME/server/default/deploy Please look up resources on the net for more information on running applications in JBoss. I tried configuring Solr with Tomcat as well. I pasted the war file in Tomcat's webapps folder. Now when I set the system property solr.solr.home it raises a class-not-found exception. Probably something is missing in the environment settings. One way to get Solr running in Tomcat is to start the Tomcat server from the directory where the Solr home is present. E.g. if the Solr home is at location /home/users/test-solr/solr, then start the Tomcat server from the /home/users/test-solr directory. This assumes that you have $TOMCAT_HOME/bin in your PATH env variable. Can any one help me with that. DISCLAIMER == This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails. -- Regards, Akshay Ukey.
Re: Multiple indexing
Please take a look at this: http://wiki.apache.org/solr/MultipleIndexes On Mon, Dec 8, 2008 at 10:25 AM, Neha Bhardwaj [EMAIL PROTECTED] wrote: Is multiple indexing possible in solr? If yes, how? DISCLAIMER == This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails. -- Regards, Akshay K. Ukey.
Re: delta-import for XML files, Solr statistics
On Fri, Oct 24, 2008 at 6:07 PM, [EMAIL PROTECTED] wrote: Thanks for your very fast response :-) 2.) The documentation from DataImportHandler describes the index update process for SQL databases only... My scenario: - My application creates, deletes and modifies files in /tmp/files every night. - delta-import / DataImportHandler should mirror _all_ these changes to my Lucene index (= create, delete, update documents). The only EntityProcessor which supports delta is SqlEntityProcessor. The XPathEntityProcessor has not implemented it, because we do not know of a consistent way of finding deltas for XML. So, unfortunately, no delta support for XML. But that said, you can implement those methods in XPathEntityProcessor. The methods are explained in EntityProcessor.java. If you have questions specific to this I can help. Probably we can contribute it back. === Is this possible with delta-import / DataImportHandler? === If not: Do you have any suggestions on how to do this? Ok so, at the moment I have to do a full-import to update my index. What happens with (user) queries while full-import is running? Does Solr block these queries until the import is finished? Which configuration options control this behavior? No, queries to Solr are not blocked during full import. My scenario: - /tmp/files contains 682 'myDoc_.*\.xml' XML files. - Each XML file contains 12 XML elements (e.g. <title>foo</title>). - DataImportHandler transfers only 5 of these 12 elements to the Lucene index. I don't understand the output from 'solr/dataimport' (= status):

<response>
  ...
  <lst name="statusMessages">
    <str name="Total Requests made to DataSource">0</str>
    <str name="Total Rows Fetched">1363</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2008-10-24 13:19:03</str>
    <str name="">Indexing completed. Added/Updated: 681 documents. Deleted 0 documents.</str>
    <str name="Committed">2008-10-24 13:19:05</str>
    <str name="Optimized">2008-10-24 13:19:05</str>
    <str name="Time taken">0:0:2.648</str>
  </lst>
  ...
</response>

=== Why does the Added/Updated counter show 681 and not 682? Added/Updated is the no. of docs. How do you know the number is not accurate? /tmp/files$ ls myDoc_*.xml | wc -l gives 682, but Added/Updated shows 681. Does this mean that one file has an XML error? But the statistics say Total Documents Skipped = 0?! It might be the case that somewhere there is an extra line in one of the XML files, a line like <?xml version="1.0" encoding="utf-8"?> or something. 4.) And my last questions about Solr statistics/information... === Is it possible to get information (number of indexed documents, stored values from documents etc.) from the current Lucene index? === The admin web interface shows 'numDocs' and 'maxDoc' in 'statistics/core'. Is 'numDocs' the number of indexed documents? What does 'maxDoc' mean? Do you have answers for these questions too? Bye, Simon -- The GMX SmartSurfer helps you save up to 70% of your online costs! Ideal for modem and ISDN: http://www.gmx.net/de/go/smartsurfer -- Regards, Akshay Ukey.
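To track down the one file behind the 681-vs-682 discrepancy, a quick well-formedness check over the input documents is enough; the suspected stray second XML declaration mentioned above is itself a parse error. A minimal sketch (shown over in-memory strings; in practice you would read each file in /tmp/files):

```python
import xml.etree.ElementTree as ET

def invalid_docs(xml_strings):
    """Return indices of documents that fail to parse -- candidates for
    the file DataImportHandler silently dropped."""
    bad = []
    for i, text in enumerate(xml_strings):
        try:
            ET.fromstring(text)
        except ET.ParseError:
            bad.append(i)
    return bad

docs = [
    "<doc><title>ok</title></doc>",
    # a stray second XML declaration makes this one malformed:
    "<?xml version='1.0'?><?xml version='1.0'?><doc/>",
]
print(invalid_docs(docs))  # [1]
```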
Re: Special character matching 'x' ?
You need to configure Tomcat appropriately for recognizing international characters in the URI. Take a look at this to see if it helps, http://wiki.apache.org/solr/SolrTomcat#head-20147ee4d9dd5ca83ed264898280ab60457847c4 On Thu, Sep 18, 2008 at 10:53 AM, Sanjay Suri [EMAIL PROTECTED] wrote: Hi, Can someone shed some light on this? One of my field values has the name Räikkönen which contains a special characters. Strangely, as I see it anyway, it matches on the search query 'x' ? Can someone explain or point me to the solution/documentation? Any help appreciated, -Sanjay -- Sanjay Suri Videocrux Inc. http://videocrux.com +91 99102 66626 -- Regards, Akshay Ukey.
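The Tomcat setting the wiki link above describes is the URIEncoding attribute on the HTTP connector; without it, Tomcat decodes request URIs as ISO-8859-1 and multibyte characters like those in "Räikkönen" get mangled before they reach Solr. A sketch of the relevant server.xml fragment (port and protocol values are illustrative):

```xml
<!-- conf/server.xml: decode request URIs as UTF-8 so non-ASCII
     query terms reach Solr intact -->
<Connector port="8080" protocol="HTTP/1.1" URIEncoding="UTF-8" />
```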
Re: Where DATA are stored
You will find the indexed data inside the data/index directory of your Solr home. The documents are stored in the Lucene index file format, which is not human readable. To find a document or documents you have to search for them through the Solr admin web page with appropriate Lucene query syntax (http://lucene.apache.org/java/docs/queryparsersyntax.html). Duplicates will not occur if you have a uniqueKey defined in your schema; also make sure you don't have the allowDups attribute set to true in the add command when adding documents. On Mon, Jul 21, 2008 at 12:42 PM, sanraj25 [EMAIL PROTECTED] wrote: Hi, When indexing data with Solr, where are documents stored and how do I find a document in the Solr installation? And how do I avoid data duplication in the Solr database? Thanks in advance. Regards, Santhanaraj R -- View this message in context: http://www.nabble.com/Where-DATA-are-stored-tp18563280p18563280.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Akshay Ukey.
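The deduplication behavior described above hinges on the uniqueKey declaration in schema.xml: re-adding a document with the same key replaces the old one instead of creating a duplicate. A sketch of the relevant fragment (the field name "id" is illustrative):

```xml
<!-- schema.xml: documents sharing the same id overwrite each other
     on add, preventing duplicates -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<uniqueKey>id</uniqueKey>
```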