Re: Problem adding new requesthandler to solr branch_3x

2011-03-09 Thread Paul Rogers
Hoss many thanks for the reply Paul On 8 March 2011 19:45, Chris Hostetter hossman_luc...@fucit.org wrote: : 1.  Why the problem occurs (has something changed between 1.4.1 and 3x)? Various pieces of code dealing with config parsing have changed since 1.4.1 to be better about verifying

LucidGaze Monitoring tool

2011-03-09 Thread Isan Fulia
Hi all, Does anyone know what the m on the y-axis stands for in the req/sec graph for the update handler? -- Thanks Regards, Isan Fulia.

Re: NRT in Solr

2011-03-09 Thread stockii
i am using solr for NRT with this version of solr ... Solr Specification Version: 4.0.0.2010.10.26.08.43.14 Solr Implementation Version: 4.0-2010-10-26_08-05-39 1027394 - hudson - 2010-10-26 08:43:14 Lucene Specification Version: 4.0-2010-10-26_08-05-39 Lucene Implementation Version:

Solr UIMA Wiki page

2011-03-09 Thread Tommaso Teofili
Hi all, I just improved the Solr UIMA integration wiki page [1] so if anyone is using it and/or has any feedback it'd be more than welcome. Regards, Tommaso [1] : http://wiki.apache.org/solr/SolrUIMA

Re: NRT in Solr

2011-03-09 Thread stockii
Question: http://wiki.apache.org/solr/NearRealtimeSearchTuning 'PERFORMANCE WARNING: Overlapping onDeckSearchers=x' I got this message. In my solrconfig.xml: maxWarmingSearchers=4. If I set this to 1 or 2 I get an exception. With 4 I get nothing but the performance warning. The wiki article

Re: getting much double-Values from solr -- timeout

2011-03-09 Thread stockii
Are you using shards or have everything in same index? - shards == distributed search over several cores? = Yes, but not always; generally not. What problem did you experience with the StatsComponent? - If I use stats on my 34-million-document index, no matter how many docs are found, the sum takes

Re: getting much double-Values from solr -- timeout

2011-03-09 Thread stockii
i am using NRT, and the caches are not always warmed, i think this is almost a problem !? - --- System One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 1 Core with 31 Million Documents other Cores 100.000 - Solr1 for

Re: Solr UIMA Wiki page

2011-03-09 Thread Markus Jelsma
Great work! On Wednesday 09 March 2011 11:20:41 Tommaso Teofili wrote: Hi all, I just improved the Solr UIMA integration wiki page [1] so if anyone is using it and/or has any feedback it'd be more than welcome. Regards, Tommaso [1] : http://wiki.apache.org/solr/SolrUIMA -- Markus Jelsma

NRT and warmupTime of filterCache

2011-03-09 Thread stockii
I tried to set up NRT as described in the wiki, but I ran into problems with autowarming and onDeckSearchers. Every minute I start a delta import on one core, and every minute the other core commits the index to make it searchable. The wiki says ... = 1 searcher and filterCache warmupCount=3600. With this

Re: NRT in Solr

2011-03-09 Thread Markus Jelsma
maxWarmingSearcher=1 is good for current stable Solr versions where memory is important. Overlapping warming searchers can be extremely memory consuming. I don't know how cache warming behaves with NRT. On Wednesday 09 March 2011 11:27:39 stockii wrote: question:

Re: True master-master fail-over without data gaps

2011-03-09 Thread Michael Sokolov
Yes, I think this should be pushed upstream - insert a tee in the document stream so that all documents go to both masters. Then use a load balancer to make requests of the masters. The tee itself then becomes a possible single point of failure, but you didn't say anything about the

Re: Help -DIH (mail)

2011-03-09 Thread Matias Alonso
Hi Peter, When I executed the commands you mentioned, nothing happened. Below I show you the commands I executed and their responses. Sorry, but I don't know how to enable the log; my JRE uses the default settings. Remember I'm running the example-DIH (trunk\solr\example\example-DIH\solr); java

Re: NRT and warmupTime of filterCache

2011-03-09 Thread stockii
Does it make sense to update Solr to get SOLR-571? - --- System One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 1 Core with 31 Million Documents other Cores 100.000 - Solr1 for Search-Requests - commit every Minute

Re: getting much double-Values from solr -- timeout

2011-03-09 Thread Jan Høydahl
You have a large index with tough performance requirements on one server. I would analyze your system to see if it's got any bottlenecks. Watch out for auto-warming taking too long so it does not finish before next commit() Watch out for too frequent commits Monitor mem usage (JConsole or

Re: Help -DIH (mail)

2011-03-09 Thread Peter Sturge
Hi, You've included some output in your message, so I presume something *did* happen when you ran the 'status' command (but it might not be what you wanted to happen :-) If you run: http://localhost:8983/solr/mail/dataimport?command=status and you get something like this back: str

Re: NRT in Solr

2011-03-09 Thread Jason Rutherglen
Jae, NRT hasn't been implemented as of yet in Solr, I think partially because major features such as replication, caching, and uninverted faceting suddenly are no longer viable, eg, it's another round of testing etc. It's doable, however I think the best approach is a separate request call

Re: NRT and warmupTime of filterCache

2011-03-09 Thread Jason Rutherglen
I think it's best to turn the warmupCount to zero because usually there isn't time in between the creation of a new searcher to run the warmup queries, eg, it'll negatively impact the desired goal of low latency new index readers? On Wed, Mar 9, 2011 at 3:41 AM, stockii stock.jo...@googlemail.com

Re: True master-master fail-over without data gaps

2011-03-09 Thread Jason Rutherglen
If you're using the delta import handler the problem would seem to go away because you can have two separate masters running at all times, and if one fails, you can then point the slaves to the secondary master, that is guaranteed to be in sync because it's been importing from the same database?

Re: dataimport

2011-03-09 Thread Brian Lamb
This has since been fixed. The problem was that there was not enough memory on the machine. It works just fine now. On Tue, Mar 8, 2011 at 6:22 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : INFO: Creating a connection for entity id with URL: :

Re: Help -DIH (mail)

2011-03-09 Thread Matias Alonso
Peter, You're right; maybe I explained it badly because of my English. I did everything you told me. I think it cannot find the folder when indexing. What do you think? Below I show you part of the log. 09/03/2011 11:52:01 org.apache.solr.core.SolrCore execute INFO: [mail] webapp=/solr

Re: Help -DIH (mail)

2011-03-09 Thread Peter Sturge
Hi, When you ran the status command, what was the output? On Wed, Mar 9, 2011 at 2:55 PM, Matias Alonso matiasgalo...@gmail.com wrote: Peter, You're right; maybe I explained it badly because of my English. I did everything you told me. I think it cannot find the folder when indexing. What do you

Re: Help -DIH (mail)

2011-03-09 Thread Matias Alonso
Log: 09/03/2011 11:54:58 org.apache.solr.core.SolrCore execute INFO: [mail] webapp=/solr path=/dataimport params={command=status} status=0 QTime=0 XML response: <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">0</int> </lst> <lst name="initArgs"> <lst name="defaults"> <str

SolrJ and digest authentication

2011-03-09 Thread Erlend Garåsen
I'm trying to do a search with SolrJ using digest authentication, but I'm getting the following error: org.apache.solr.common.SolrException: Unauthorized I'm setting up SolrJ this way: HttpClient client = new HttpClient(); List<String> authPrefs = new ArrayList<String>();

RE: True master-master fail-over without data gaps

2011-03-09 Thread Robert Petersen
If you have a wrapper, like an indexer app which prepares solr docs and sends them into solr, then it is simple. The wrapper is your 'tee' and it can send docs to both (or N) masters. -Original Message- From: Michael Sokolov [mailto:soko...@ifactory.com] Sent: Wednesday, March 09, 2011
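
The "wrapper as tee" idea above can be sketched roughly as follows, in Python for illustration only. The names `tee_to_masters` and `send`, and the master URLs, are hypothetical and not part of Solr or of the poster's indexer; a real `send` would POST the document to each master's update handler.

```python
def tee_to_masters(doc, master_urls, send):
    """Send one document to every master and report which masters failed.

    `send(url, doc)` is a caller-supplied (hypothetical) function that
    delivers the document to a single master and raises on failure.
    """
    failed = []
    for url in master_urls:
        try:
            send(url, doc)
        except Exception:
            # Keep going so one dead master does not block the others;
            # the caller decides whether to retry or queue the document.
            failed.append(url)
    return failed
```

Returning the list of failed masters, rather than raising, leaves the retry or replay policy to the indexer app, which is where the thread suggests the logging already lives.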

Re: True master-master fail-over without data gaps

2011-03-09 Thread Otis Gospodnetic
Hi, - Original Message If you're using the delta import handler the problem would seem to go away because you can have two separate masters running at all times, and if one fails, you can then point the slaves to the secondary master, that is guaranteed to be in sync because it's

Re: True master-master fail-over without data gaps

2011-03-09 Thread Otis Gospodnetic
Hi, - Original Message From: Robert Petersen rober...@buy.com To: solr-user@lucene.apache.org Sent: Wed, March 9, 2011 11:40:56 AM Subject: RE: True master-master fail-over without data gaps If you have a wrapper, like an indexer app which prepares solr docs and sends them into

Re: True master-master fail-over without data gaps

2011-03-09 Thread Jason Rutherglen
Oh, there is no DB involved.  Think of a document stream continuously coming in, a component listening to that stream, grabbing docs, and pushing it to master(s). I don't think Solr is designed for this use case, eg, I wouldn't expect deterministic results with the current architecture as

Re: True master-master fail-over without data gaps

2011-03-09 Thread Otis Gospodnetic
Hi, - Original Message Yes, I think this should be pushed upstream - insert a tee in the document stream so that all documents go to both masters. Then use a load balancer to make requests of the masters. Hm, but this makes the tee app aware of this. What if I want to hide

Re: True master-master fail-over without data gaps

2011-03-09 Thread Otis Gospodnetic
Hi, - Original Message Oh, there is no DB involved. Think of a document stream continuously coming in, a component listening to that stream, grabbing docs, and pushing it to master(s). I don't think Solr is designed for this use case, eg, I wouldn't expect

RE: True master-master fail-over without data gaps

2011-03-09 Thread Robert Petersen
Currently I use an application connected to a queue containing incoming data which my indexer app turns into solr docs. I log everything to a log table and have never had an issue with losing anything. I can trace incoming docs exactly, and keep timing data in there also. If I added a second

Re: True master-master fail-over without data gaps

2011-03-09 Thread Otis Gospodnetic
Hi, - Original Message I'd honestly think about buffering the incoming documents in some store that's actually made for fail-over persistence reliability, maybe CouchDB or something. And then that's taking care of not losing anything, and the problem becomes how we make sure

Re: True master-master fail-over without data gaps

2011-03-09 Thread Walter Underwood
On Mar 9, 2011, at 9:02 AM, Otis Gospodnetic wrote: You mean it's not possible to have 2 masters that are in nearly real-time sync? How about with DRBD? I know people use DRBD to keep 2 Hadoop NNs (their edit logs) in sync to avoid the current NN SPOF, for example, so I'm thinking this

RE: True master-master fail-over without data gaps

2011-03-09 Thread Robert Petersen
...but the index resides on disk doesn't it??? lol -Original Message- From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] Sent: Wednesday, March 09, 2011 9:06 AM To: solr-user@lucene.apache.org Subject: Re: True master-master fail-over without data gaps Hi, - Original

Re: True master-master fail-over without data gaps

2011-03-09 Thread Otis Gospodnetic
Hi, - Original Message Currently I use an application connected to a queue containing incoming data which my indexer app turns into solr docs. I log everything to a log table and have never had an issue with losing anything. Yeah, if everything goes through some storage that

Re: True master-master fail-over without data gaps

2011-03-09 Thread Otis Gospodnetic
On disk, yes, but only indexed, and thus far enough from the original content to make storing terms in Lucene's inverted index. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Robert

Re: True master-master fail-over without data gaps

2011-03-09 Thread Markus Jelsma
RAMdisk ...but the index resides on disk doesn't it??? lol -Original Message- From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] Sent: Wednesday, March 09, 2011 9:06 AM To: solr-user@lucene.apache.org Subject: Re: True master-master fail-over without data gaps Hi,

Re: True master-master fail-over without data gaps

2011-03-09 Thread Jason Rutherglen
This is why there's block cipher cryptography. On Wed, Mar 9, 2011 at 9:11 AM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: On disk, yes, but only indexed, and thus far enough from the original content to make storing terms in Lucene's inverted index. Otis Sematext ::

Re: dataimport

2011-03-09 Thread Adam Estrada
Brian, I had the same problem a while back and set the JAVA_OPTS env variable to something my machine could handle. That may also be an option for you going forward. Adam On Wed, Mar 9, 2011 at 9:33 AM, Brian Lamb brian.l...@journalexperts.com wrote: This has since been fixed. The problem was

RE: True master-master fail-over without data gaps

2011-03-09 Thread Robert Petersen
I guess you could put a LB between slaves and masters, never thought of that! :) -Original Message- From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] Sent: Wednesday, March 09, 2011 9:10 AM To: solr-user@lucene.apache.org Subject: Re: True master-master fail-over without data

Re: True master-master fail-over without data gaps

2011-03-09 Thread Otis Gospodnetic
Right. LB VIP on both sides of master(s). Black box. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Robert Petersen rober...@buy.com To: solr-user@lucene.apache.org Sent: Wed, March

Newb query question

2011-03-09 Thread Daniel Baughman
Is there a way to perform string logic on the key field using a subquery or some other method? I.e., if the left 4 characters of the key are ABCD, then include or exclude those from the search. Here is the layman's pseudocode for what I'm wanting to do: *:* AND LEFT(KEY, 4) 'abcd'

Re: Newb query question

2011-03-09 Thread Erick Erickson
How about something like: for exclusion +*:* -KEY:abcd* for inclusion +*:* +KEY:abcd* Best Erick On Wed, Mar 9, 2011 at 12:34 PM, Daniel Baughman da...@hostworks.com wrote: Is there a way to perform string logic on the key field using a subquery or some other method. IE. If the left 4
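
A small sketch of building the two prefix queries suggested above and URL-encoding them for a request. The helper names and the base URL are assumptions for illustration; the sketch does not escape special query characters in the prefix.

```python
from urllib.parse import urlencode


def prefix_filter_query(field, prefix, include=True):
    """Build '+*:* +FIELD:prefix*' (inclusion) or '+*:* -FIELD:prefix*' (exclusion)."""
    sign = "+" if include else "-"
    return f"+*:* {sign}{field}:{prefix}*"


def select_url(base_url, query):
    """Append the query as a URL-encoded q parameter (base_url is hypothetical)."""
    return f"{base_url}/select?{urlencode({'q': query})}"
```

For example, `prefix_filter_query("KEY", "abcd", include=False)` gives the exclusion form from the reply. Encoding matters here because a literal `+` in a URL would otherwise be decoded as a space.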

Re: True master-master fail-over without data gaps

2011-03-09 Thread Jonathan Rochkind
On 3/9/2011 12:05 PM, Otis Gospodnetic wrote: But check this! In some cases one is not allowed to save content to disk (think copyrights). I'm not making this up - we actually have a customer with this cannot save to disk (but can index) requirement. Do they realize that a Solr index is on

Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Otis Gospodnetic
Hi, - Original Message From: Walter Underwood wun...@wunderwood.org On Mar 9, 2011, at 9:02 AM, Otis Gospodnetic wrote: You mean it's not possible to have 2 masters that are in nearly real-time sync? How about with DRBD? I know people use DRBD to keep 2 Hadoop NNs (their

Re: Newb query question

2011-03-09 Thread Otis Gospodnetic
Hi, It sounds like if you put those 4 chars in a separate field at index time you could apply your logic on that at search time. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Daniel

RE: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Robert Petersen
Can't you skip the SAN and keep the indexes locally? Then you would have two redundant copies of the index and no lock issues. Also, Can't master02 just be a slave to master01 (in the master farm and separate from the slave farm) until such time as master01 fails? Then master02 would start

Re: Solr Hanging all of sudden with update/csv

2011-03-09 Thread danomano
After about 4-5 hours the merge completed (ran out of heap). As you suggested, it was having memory issues. Read queries during the merge were working just fine (they were taking longer than normal, ~30-60 seconds). I think I need to do more reading on understanding the merge/optimization

how would you design schema?

2011-03-09 Thread dan whelan
Hi, I'm investigating how to set up a schema like this: I want to index accounts and the products purchased (multiValued) by that account but I also need the ability to search by the date the product was purchased. It would be easy if the purchase date wasn't part of the requirements. How

Re: how would you design schema?

2011-03-09 Thread Geert-Jan Brits
Would having a solr-document represent a 'product purchase per account' solve your problem? You could then easily link the date of purchase to the document as well as the account-number. e.g: fields: orderid (key), productid, product-characteristics, order-characteristics (including date of
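
As a rough sketch of the "one document per product purchase per account" idea, the function below flattens an account record into per-purchase documents. The input shape, the `accountid` and `purchase_date` field names, and the way `orderid` is synthesized are all assumptions for illustration; only `orderid` and `productid` come from the message above.

```python
def purchase_docs(account):
    """Flatten one account into one Solr-style document per purchase."""
    docs = []
    for n, purchase in enumerate(account["purchases"]):
        docs.append({
            # Hypothetical unique key: account id plus purchase position.
            "orderid": f"{account['account_id']}-{n}",
            "accountid": account["account_id"],
            "productid": purchase["product_id"],
            "purchase_date": purchase["date"],
        })
    return docs
```

With this layout the purchase date is an ordinary single-valued field on each document, so date filtering no longer has to be matched up with a multiValued product list.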

Sorting

2011-03-09 Thread Brian Lamb
Hi all, I know that I can add sort=score desc to the url to sort in descending order. However, I would like to sort a MoreLikeThis response which returns records like this: lst name=moreLikeThis result name=3 numFound=113611 start=0 maxScore=0.4392774 result name=2 numFound= start=0

Re: docBoost

2011-03-09 Thread Brian Lamb
Anyone have any clue on this one? On Tue, Mar 8, 2011 at 2:11 PM, Brian Lamb brian.l...@journalexperts.com wrote: Hi all, I am using dataimport to create my index and I want to use docBoost to assign some higher weights to certain docs. I understand the concept behind docBoost but I haven't

Re: docBoost

2011-03-09 Thread Jayendra Patil
You can use the ScriptTransformer to perform the boost calculation and addition. http://wiki.apache.org/solr/DataImportHandler#ScriptTransformer <dataConfig> <script><![CDATA[ function f1(row) { // Add boost row.put('$docBoost',1.5);

Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Otis Gospodnetic
Hi, Original Message From: Robert Petersen rober...@buy.com Can't you skip the SAN and keep the indexes locally? Then you would have two redundant copies of the index and no lock issues. I could, but then I'd have the issue of keeping them in sync, which seems more fragile.

Re: Solr Hanging all of sudden with update/csv

2011-03-09 Thread Otis Gospodnetic
Hi, You'll benefit from watching this segment merging video: http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html And you'll appreciate the graph at the bottom: http://code.google.com/p/zoie/wiki/ZoieMergePolicy Otis Sematext :: http://sematext.com/ :: Solr -

Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Jake Luciani
Hi Otis, Have you considered using Solandra with Quorum writes to achieve master/master with CA semantics? -Jake On Wed, Mar 9, 2011 at 2:48 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Hi, Original Message From: Robert Petersen rober...@buy.com Can't you skip

Re: Solr Hanging all of sudden with update/csv

2011-03-09 Thread Jason Rutherglen
You will need to cap the maximum segment size using LogByteSizeMergePolicy.setMaxMergeMB. As then you will only have segments that are of an optimal size, and Lucene will not try to create gigantic segments. I think though on the query side you will run out of heap space due to the terms index

Re: docBoost

2011-03-09 Thread Brian Lamb
That makes sense. As a follow up, is there a way to only conditionally use the boost score? For example, in some cases I want to use the boost score and in other cases I want all documents to be treated equally. On Wed, Mar 9, 2011 at 2:42 PM, Jayendra Patil jayendra.patil@gmail.com wrote:

Excluding results from more like this

2011-03-09 Thread Brian Lamb
Hi all, I'm using MoreLikeThis to find similar results but I'd like to exclude records by the id number. For example, I use the following URL: http://localhost:8983/solr/search/?q=id:(2 3 5)&mlt=true&mlt.fl=description,id&fl=*,score How would I exclude record 4 from the MoreLikeThis results? I

Fwd: some relational-type groupig with search

2011-03-09 Thread l . blevins
- Forwarded Message - From: l blevins l.blev...@comcast.net To: solr user mail solr-user-h...@lucene.apache.org Sent: Wednesday, March 9, 2011 4:03:06 PM Subject: some relational-type groupig with search I have a large database for which we have some good search capabilities now,

Re: Excluding results from more like this

2011-03-09 Thread Otis Gospodnetic
Brian, ...?q=id:(2 3 5) -4 Otis --- Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Brian Lamb brian.l...@journalexperts.com To: solr-user@lucene.apache.org Sent: Wed, March 9, 2011 4:05:10

Same index is ranking differently on 2 machines

2011-03-09 Thread Allistair Crossley
Hi, I am seeing an issue I do not understand and hope that someone can shed some light on this. The issue is that for a particular search we are seeing a particular result rank in position 3 on one machine and position 8 on the production machine. The position 3 is our desired and roughly

FunctionQueries and FieldCache and OOM

2011-03-09 Thread Markus Jelsma
Hi, In one of the environments i'm working on (4 Solr 1.4.1. nodes with replication, 3+ million docs, ~5.5GB index size, high commit rate (~1-2min), high query rate (~50q/s), high number of updates (~1000docs/commit)) the nodes continuously run out of memory. During development we frequently

Re: Excluding results from more like this

2011-03-09 Thread Brian Lamb
That doesn't seem to do it. Record 4 is still showing up in the MoreLikeThis results. On Wed, Mar 9, 2011 at 4:12 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Brian, ...?q=id:(2 3 5) -4 Otis --- Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem

Re: Same index is ranking differently on 2 machines

2011-03-09 Thread Jayendra Patil
queryNorm is just a normalizing factor and is the same value across all the results for a query; it only makes the scores comparable. So even if it varies between environments, you should not worry about it.

Re: Excluding results from more like this

2011-03-09 Thread Jonathan Rochkind
Yeah, that just restricts what items are in your main result set (and adding -4 has no real effect). The more like this set is constructed based on your main result set, for each document in it. As far as I can see from here: http://wiki.apache.org/solr/MoreLikeThis ..there seems to be no
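
Because the main query only restricts the main result set, one possible workaround (an assumption here, not something proposed in the thread) is to drop unwanted ids from the MoreLikeThis section on the client. The sketch assumes the response has already been parsed into a dict mapping each source document id to its list of similar docs; the actual wire format depends on the response writer and is not shown.

```python
def exclude_from_mlt(mlt, excluded_ids):
    """Return a copy of the parsed MoreLikeThis section without the excluded ids.

    `mlt` maps a source document id to a list of similar-document dicts,
    each carrying an "id" key (an assumed shape, for illustration only).
    """
    excluded = {str(i) for i in excluded_ids}
    return {
        source_id: [doc for doc in docs if str(doc["id"]) not in excluded]
        for source_id, docs in mlt.items()
    }
```

Note that filtering after the fact means a list may come back shorter than the requested number of similar documents.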

Re: Same index is ranking differently on 2 machines

2011-03-09 Thread Jonathan Rochkind
Yes, but the identical index with the identical solrconfig.xml and the identical query and the identical version of Solr on two different machines should produce identical results. So it's a legitimate question why it's not. But perhaps queryNorm isn't enough to answer that. Sorry, it's out

Re: Same index is ranking differently on 2 machines

2011-03-09 Thread Allistair Crossley
Thanks. Good to know, but even so my problem remains - the end score should not be different and is causing a dramatically different ranking of a document (3 versus 7 is dramatic for my client). This must be down to the scoring debug differences - it's the only difference I can find :( On Mar

Indexing a text string for faceting

2011-03-09 Thread Greg Georges
Hello all, I have a small problem with my faceting fields. In all I create a new faceting field which is indexed and not stored, and use copyField. The problem is I facet on category names which have examples like this Policies Documentation

Re: Same index is ranking differently on 2 machines

2011-03-09 Thread Allistair Crossley
That's what I think, glad I am not going mad. I've spent 1/2 a day comparing the config files, checking out from SVN again and ensuring the databases are identical. I cannot see what else I can do to make them equivalent. Both servers checkout directly from SVN, I am convinced the files are

Re: Same index is ranking differently on 2 machines

2011-03-09 Thread Jayendra Patil
Are you sure you have the same config ... The boost seems different for the field text - text:dubai^0.1 text:dubai -2.286596 = (MATCH) sum of: - 1.6891675 = (MATCH) sum of: -1.3198489 = (MATCH) max plus 0.01 times others of: - 0.023022119 = (MATCH) weight(text:dubai^0.1 in 1551),

Re: Same index is ranking differently on 2 machines

2011-03-09 Thread Yonik Seeley
On Wed, Mar 9, 2011 at 4:49 PM, Jayendra Patil jayendra.patil@gmail.com wrote: Are you sure you have the same config ... The boost seems different for the field text - text:dubai^0.1 text:dubai Yep... Try adding echoParams=all and see all the parameters solr is acting on.
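
Once both machines have been queried with echoParams=all, the echoed parameters can be compared mechanically. The sketch below assumes the echoed params from each response have already been parsed into plain dicts; the function name and the sample `qf` values are hypothetical.

```python
def diff_params(params_a, params_b):
    """Return {name: (value_a, value_b)} for every parameter that differs."""
    diffs = {}
    for name in sorted(set(params_a) | set(params_b)):
        value_a = params_a.get(name)  # None when the parameter is absent
        value_b = params_b.get(name)
        if value_a != value_b:
            diffs[name] = (value_a, value_b)
    return diffs
```

For instance, comparing `{"qf": "text^0.1"}` with `{"qf": "text"}` would surface exactly the kind of boost mismatch visible in the debug output quoted above.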

Re: Same index is ranking differently on 2 machines

2011-03-09 Thread Allistair Crossley
Oh wow, how did I miss that? My apologies to anyone who read this post. I should have diffed my custom dismax handler. Looks like my SVN merge didn't work properly. Embarrassing. Thanks everyone ;) On Mar 9, 2011, at 4:51 PM, Yonik Seeley wrote: On Wed, Mar 9, 2011 at 4:49 PM, Jayendra Patil

Math-generated fields during query

2011-03-09 Thread Peter Sturge
Hi, I was wondering if it is possible during a query to create a returned field 'on the fly' (like function query, but for concrete values, not score). For example, if I input this query: q=_val_:product(15,3)&fl=*,score For every returned document, I get score = 45. If I change it slightly

Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Smiley, David W.
I was just about to jump in this conversation to mention Solandra and go fig, Solandra's committer comes in. :-) It was nice to meet you at Strata, Jake. I haven't dug into the code yet but Solandra strikes me as a killer way to scale Solr. I'm looking forward to playing with it; particularly

Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Jason Rutherglen
Doesn't Solandra partition by term instead of document? On Wed, Mar 9, 2011 at 2:13 PM, Smiley, David W. dsmi...@mitre.org wrote: I was just about to jump in this conversation to mention Solandra and go fig, Solandra's committer comes in. :-)   It was nice to meet you at Strata, Jake. I

Re: Same index is ranking differently on 2 machines

2011-03-09 Thread Jonathan Rochkind
Wait, if you don't have identical indexes, then why would you expect identical results? If your indexes are different, one would expect the results for the same query to be different -- there are different documents in the index! The iDF portion of the TF/iDF type algorithm at the base of
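
To illustrate why different index contents change scores, here is a sketch of the IDF term, assuming the classic Lucene DefaultSimilarity formula of that era, idf = 1 + ln(numDocs / (docFreq + 1)). The document counts used with it are made up for illustration.

```python
import math


def classic_idf(doc_freq, num_docs):
    """IDF as in classic Lucene DefaultSimilarity: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))
```

With the same term frequency, `classic_idf(9, 1000)` and `classic_idf(9, 500)` differ, so a term that matches the same nine documents contributes a different weight once the total document count changes.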

Re: NRT in Solr

2011-03-09 Thread Smiley, David W.
Zoie adds NRT to Solr: http://snaprojects.jira.com/wiki/display/ZOIE/Zoie+Solr+Plugin I haven't tried it yet but looks cool. ~ David Smiley Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/ On Mar 9, 2011, at 9:01 AM, Jason Rutherglen wrote: Jae, NRT hasn't been

Re: NRT in Solr

2011-03-09 Thread Jonathan Rochkind
Interesting, does anyone have a summary of what techniques zoie uses to do this? I don't see any docs on the technical details. On 3/9/2011 5:29 PM, Smiley, David W. wrote: Zoie adds NRT to Solr: http://snaprojects.jira.com/wiki/display/ZOIE/Zoie+Solr+Plugin I haven't tried it yet but looks

Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Jake Luciani
Jason, Its predecessor, Lucandra, did. But Solandra is a new approach that manages shards of documents across the cluster for you and uses Solr's distributed search to query indexes. Jake On Mar 9, 2011, at 5:15 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Doesn't Solandra

Re: NRT in Solr

2011-03-09 Thread Otis Gospodnetic
Jonathan, they have a Wiki up there somewhere, including pretty diagrams. If you have Lucene in Action, Zoie is one of the case studies and is described in a lot of detail. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/

Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Otis Gospodnetic
Jake, Maybe it's time to come up with the Solandra/Solr matrix so we can see Solandra's strengths (e.g. RT, no replication) and weaknesses (e.g. I think I saw a mention of some big indices?) or missing feature (e.g. no delete by query), etc. Thanks! Otis Sematext :: http://sematext.com/

Re: Fwd: some relational-type groupig with search

2011-03-09 Thread Michael Sokolov
Probably you can just sort by date (one way and then the other) and limit your result set to a single document. That should free up enough budget for the bonuses of the highly-placed people, I think :) On 3/9/2011 4:05 PM, l.blev...@comcast.net wrote: - Forwarded Message - From: l

Re: some relational-type groupig with search

2011-03-09 Thread l . blevins
It is not just one document that would be returned; it is one document per person. That is a little trickier. - Original Message - From: Michael Sokolov soko...@ifactory.com To: solr-user@lucene.apache.org Cc: l blevins l.blev...@comcast.net Sent: Wednesday, March 9, 2011 7:46:10

Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Jake Luciani
Yeah sure. Let me update this on the Solandra wiki. I'll send across the link I think you hit the main two shortcomings atm. -Jake On Wed, Mar 9, 2011 at 6:17 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Jake, Maybe it's time to come up with the Solandra/Solr matrix so we can

java.lang.ClassCastException being thrown seemingly at random

2011-03-09 Thread harish.agarwal
Hello, I'm using a recent build of the trunk (from 3/1). I've noticed that after the index is up and running for some time I start to get intermittent errors that look like this: Mar 2, 2011 9:26:01 AM org.apache.solr.common.SolrException log SEVERE: java.lang.ClassCastException The

Caching filter question / code review

2011-03-09 Thread Mark
I created the following SearchComponent that wraps a deduplicate filter around the current query and added it to last-components. It appears to be working, but is there any way I can improve the performance? Would this be considered and added to the filtercache? Am I even caching correctly?

Re: java.lang.ClassCastException being thrown seemingly at random

2011-03-09 Thread Yonik Seeley
On Wed, Mar 9, 2011 at 8:34 PM, harish.agarwal harish.agar...@gmail.com wrote: I'm using a recent build of the trunk (from 3/1).  I've noticed that after the index is up and running for some time I start to get intermittent errors that look like this: Mar 2, 2011 9:26:01 AM

Re: NRT in Solr

2011-03-09 Thread Bill Bell
So it looks like it can handle adding new documents, and expiring old documents. Updating a document is not part of the game. This would work well for message boards or tweet type solutions. Solr can do this as well directly. Why wouldn't you just improve the document and facet caching so that when

Re: docBoost

2011-03-09 Thread Bill Bell
Yes, just add an if statement based on a field type and do a row.put() only if that field has a certain value. On 3/9/11 1:39 PM, Brian Lamb brian.l...@journalexperts.com wrote: That makes sense. As a follow up, is there a way to only conditionally use the boost score? For example, in some
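
The conditional boost described above might look like the following. In DIH the ScriptTransformer function itself is written in JavaScript; this Python sketch only mirrors the logic on a plain dict. The `doc_type` field, the `featured` value, and the default boost are hypothetical; `$docBoost` is the key used in the earlier reply in this thread.

```python
def apply_boost(row, boost_types=("featured",), boost=1.5):
    """Set '$docBoost' on the row only when its (hypothetical) doc_type qualifies."""
    if row.get("doc_type") in boost_types:
        row["$docBoost"] = boost
    # Rows that do not qualify are returned unchanged, i.e. with no boost key.
    return row
```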

Solr Cell: Content extraction problem with ContentStreamUpdateRequest and multiple files

2011-03-09 Thread Karthik Shiraly
Hi, I'm using Solr 1.4.1. The scenario involves user uploading multiple files. These have content extracted using SolrCell, then indexed by Solr along with other information about the user. ContentStreamUpdateRequest seemed like the right choice for this - use addFile() to send file data, and

Re: NRT in Solr

2011-03-09 Thread Lance Norskog
Please start new threads for new conversations. On Wed, Mar 9, 2011 at 2:27 AM, stockii stock.jo...@googlemail.com wrote: question: http://wiki.apache.org/solr/NearRealtimeSearchTuning 'PERFORMANCE WARNING: Overlapping onDeckSearchers=x i got this message. in my solrconfig.xml:

Re: Solr Cell: Content extraction problem with ContentStreamUpdateRequest and multiple files

2011-03-09 Thread Karthik Shiraly
In case the exact problem was not clear to somebody: The problem with FileUpload interpreting file data as regular form fields is that, Solr thinks there are no content streams in the request and throws a missing_content_stream exception. On Thu, Mar 10, 2011 at 10:59 AM, Karthik Shiraly