URLDecoder error message

2013-02-12 Thread o.mares
Hey, yesterday we updated from Solr 4.0 to Solr 4.1, and since then the following error pops up from time to time: {msg=URLDecoder: Invalid character encoding detected after position 160 of query string / form data (while parsing as UTF-8),code=400}: {msg=URLDecoder: Invalid character encoding
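
The message suggests that some percent-encoded bytes in the request do not form valid UTF-8. The following is a minimal Python sketch of that kind of check, meant only as an illustration and not as Solr's actual decoder; the function name `is_valid_utf8_query` and the sample query strings are assumptions.

```python
from urllib.parse import unquote_to_bytes


def is_valid_utf8_query(raw_query: str) -> bool:
    """Return True if the percent-decoded bytes of raw_query are valid UTF-8."""
    decoded = unquote_to_bytes(raw_query)
    try:
        decoded.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# "%C3%A9" is the UTF-8 encoding of "é"; a bare "%E9" is the Latin-1 byte for "é"
# and is not a complete UTF-8 sequence on its own.
print(is_valid_utf8_query("q=caf%C3%A9"))
print(is_valid_utf8_query("q=caf%E9"))
```

Running a check like this over the failing requests from the client or proxy logs may show whether some caller is sending Latin-1 encoded parameters.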

compare two shards.

2013-02-12 Thread stockii
Hello. I want to compare two shards with each other, because these shards should have the same index, but they don't =( So I want to find the documents that are missing from one of my two shards. My idea: run a distributed shard request on my nodes and fire a facet search on my unique field.

Indexed And Stored

2013-02-12 Thread anurag.jain
Hello, in my schema I have <field name="city_name" type="text_general" indexed="false" stored="true"/> and I updated 18 data. Now I need indexed=true for all the old data. I need a solution; could someone please help me out? Please reply, it is urgent! Thanks

Re: Indexed And Stored

2013-02-12 Thread Gora Mohanty
On 12 February 2013 15:49, anurag.jain anurag.k...@gmail.com wrote: hello, in my schema <field name="city_name" type="text_general" indexed="false" stored="true"/> and i updated 18 data. now i need indexed=true for all old data. i need solution [...] You have no choice but to change the

Re: replication problems with solr4.1

2013-02-12 Thread Bernd Fehling
Now this is strange: the index generation and index version are changing with replication. E.g. master has index generation 118, index version 136059533234 and slave has index generation 118, index version 136059533234; both are the same. Now add one doc to master with commit. master has index

Re: Indexed And Stored

2013-02-12 Thread Rafał Kuć
Hello! The simplest way is to update your schema.xml file, make the change that needs to be made, and fully re-index your data. Solr won't be able to automatically change a non-indexed field into an indexed one. You could also use the partial document update API of Solr if you don't have your original

Re: Maximum Number of Records In Index

2013-02-12 Thread Macroman
Our document IDs are most definitely distinct and there are partial updates to existing records. I have run SQL queries outside of SOLR to validate the records going in, and only about 1% are updates to existing records. There are no deletes underway; every day new records are added or updated. Example

Tag facet.query excludes are broken when group.facet=true - SOLR 4.1 Bug?

2013-02-12 Thread Mark Beeby
I'm trying to use facets alongside grouping, however when I ask SOLR to compute grouped facet counts (group.facet=true, see http://wiki.apache.org/solr/FieldCollapsing) it no longer honours facet.query excludes, however without this (group.facet=false) the exclude works again without any

Re: Maximum Number of Records In Index

2013-02-12 Thread Joel Bernstein
A couple of things to check. 1) Have you retained your Solr logs? If so, take a look in them for indexing errors. 2) What is the difference between maxdocs and numdocs? This will give an indication of whether a large number of records are being deleted or updated. 3) Can you explain your partial updates?
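
Point 2 comes down to one subtraction: in Lucene and Solr index statistics, maxDoc still counts documents that were deleted or overwritten but not yet merged away, while numDocs counts only live documents. A small Python sketch, with a hypothetical function name and made-up sample numbers:

```python
def deleted_doc_count(max_doc: int, num_docs: int) -> int:
    """Number of deleted or overwritten documents still held in the index segments."""
    if num_docs > max_doc:
        raise ValueError("numDocs cannot exceed maxDoc")
    return max_doc - num_docs


# Sample values are invented for illustration only.
print(deleted_doc_count(max_doc=1_050_000, num_docs=1_000_000))
```

A large gap would suggest that many incoming records replace existing ones, which is one reason an index can grow more slowly than the number of records sent to it.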

LoadBalancing while adding documents

2013-02-12 Thread J Mohamed Zahoor
Hi, I have a multi-shard replicated index spread across two machines. Once a week, I delete the entire index and create it from scratch. Today I am using ConcurrentUpdateSolrServer in SolrJ to add documents to the index. I want to add documents through both the servers.. to utilise the

Re: Indexed And Stored

2013-02-12 Thread anurag.jain
Actually the problem is that I updated the data first, and then I had to add new fields, so I made another JSON file [ { id:2131, newfield:{add:2121} }, { id:21, newfield:{add:21} } ]. Now I have two different files, so if I try to update the previous file for indexed=true, it erases the new field
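
The archive has stripped the quotes from the JSON above. As a hedged sketch, this Python snippet rebuilds a payload of the same shape; the field name `newfield` and the `add` modifier come from the post, while the helper name and the choice to treat ids and values as strings are assumptions.

```python
import json


def build_atomic_updates(values_by_id: dict) -> list:
    """Build one partial-update document per id, using the "add" modifier from the post."""
    return [
        {"id": doc_id, "newfield": {"add": value}}
        for doc_id, value in values_by_id.items()
    ]


payload = build_atomic_updates({"2131": "2121", "21": "21"})
print(json.dumps(payload, indent=2))
```

Partial updates of this kind rebuild each document from its stored fields, so resending an older full document afterwards replaces the whole document, which would match the lost field described in the post.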

solrcloud-zookeeper

2013-02-12 Thread adm1n
Hi all, the first question: is there a way to reduce the timeout when a Solr shard comes up? In the log file it looks as follows: Feb 12, 2013 1:19:08 PM org.apache.solr.cloud.ShardLeaderElectionContext waitForReplicasToComeUp INFO: Waiting until we see more replicas up: total=2 found=1 timeoutin=178992

Re: memory leak - multiple cores

2013-02-12 Thread Michael Della Bitta
Marcos, You could consider using the CoreAdminHandler instead: http://wiki.apache.org/solr/CoreAdmin#CoreAdminHandler It works extremely well. Otherwise, you should periodically restart Tomcat. I'm not sure how much memory would be leaked, but it's likely not going to have much of an impact

Re: memory leak - multiple cores

2013-02-12 Thread Michael Della Bitta
I should also say that there can easily be memory leaked from permgen space when reloading webapps in Tomcat regardless of what resources the app creates because class references from the context classloader to the parent classloader can't be collected appropriately, so restarting Tomcat

Re: memory leak - multiple cores

2013-02-12 Thread Marcos Mendez
Many thanks! I will try to use the CoreAdminHandler and see if that solves the issue! On Feb 12, 2013, at 9:05 AM, Michael Della Bitta wrote: I should also say that there can easily be memory leaked from permgen space when reloading webapps in Tomcat regardless of what resources the app

Re: solrcloud-zookeeper

2013-02-12 Thread Mark Miller
By default, on cluster startup, we wait until we see all the replicas for a shard come up. This is for safety. You may have introduced an old shard with old data or a new shard with no data, and you don't want something like that becoming the leader. If you don't want to do this wait, it's

Re: Possible issue in edismax?

2013-02-12 Thread Sandeep Mestry
Hi Felipe, Just a short note to say thanks for your valuable suggestion. I implemented that and could see the expected results. The length norm still spoils it for a few fields, but I balanced it with the boost factors accordingly. Once again, Many Thanks! Sandeep On 1 February 2013 22:53, Sandeep

Re: More Like This, finding original record

2013-02-12 Thread Daniel Rijkhof
Well, I have found the following line in MoreLikeThisHandler$MoreLikeThisHelper.getMoreLikeThis(..) // exclude current document from results realMLTQuery.add( new TermQuery

DisMax Query Field-Filters (ASCIIFolding)

2013-02-12 Thread Ralf Heyde
Hello, I am seeing an interesting behaviour. I have a FieldType Text_PL. This type is configured as: <fieldType name="text_pl" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.StopFilterFactory"

Re: SolrCloud and hardcoded 'id' field

2013-02-12 Thread Michael Della Bitta
Apparently this was a side effect of the custom sharding feature. There is a fix planned, but I don't know more about it than that. Michael Della Bitta

Benefits of Solr over Lucene?

2013-02-12 Thread JohnRodey
I know that Solr web-enables a Lucene index, but I'm trying to figure out what other things Solr offers over Lucene. On the Solr features list it says Solr uses the Lucene search library and extends it!, but what exactly are the extensions from the list and what did Lucene give you? Also if I

Re: DisMax Query Field-Filters (ASCIIFolding)

2013-02-12 Thread Ahmet Arslan
Hi Ralf, The dismax query parser does not allow fielded queries, e.g. field:something. Consider using the edismax query parser instead. Also, debugQuery=on will display informative output on how the query is parsed, analyzed, etc. ahmet --- On Tue, 2/12/13, Ralf Heyde ralf.he...@gmx.de wrote: From: Ralf Heyde

Re: Benefits of Solr over Lucene?

2013-02-12 Thread Travis Low
http://lucene.apache.org/solr/ On Tue, Feb 12, 2013 at 10:40 AM, JohnRodey timothydd...@yahoo.com wrote: I know that Solr web-enables a Lucene index, but I'm trying to figure out what other things Solr offers over Lucene. On the Solr features list it says Solr uses the Lucene search library

Solr 3.3.0 - Random CPU problem

2013-02-12 Thread federico.wachs
Hi all, I'm using Solr 3.3.0 with one master server and two slaves. The problem I'm having is that both slaves get degraded randomly but at the same time. I am completely lost as to what the cause could be, but I see that the Tomcat that runs the Solr webapp executes a Perl script that consumes

Re: DisMax Query Field-Filters (ASCIIFolding)

2013-02-12 Thread Ralf Heyde
Hi, thanks for your first answer. I don't want to have a fielded query in my DisMax query. My DisMax query looks like this: qt=dismax&q=czółenka... -- works; qt=dismax&q=czolenka... -- does not work. The accessed fields contain the ASCIIFoldingFilter for query and index. So, what I need is, that
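
To make the expectation concrete: with ASCII folding applied on both the index and the query side, both spellings should reduce to the same term. The Python sketch below is only a rough approximation of that idea and not the ASCIIFoldingFilter itself; the function name and the tiny extra mapping table are assumptions.

```python
import unicodedata

# Letters such as "ł" have no Unicode decomposition, so they need an explicit mapping.
_EXTRA_FOLDS = str.maketrans({"ł": "l", "Ł": "L"})


def fold_to_ascii(text: str) -> str:
    """Approximate ASCII folding: map special letters, then strip combining marks."""
    text = text.translate(_EXTRA_FOLDS)
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_to_ascii("czółenka"))
print(fold_to_ascii("czolenka"))
```

If both spellings fold to the same string but only one of them matches, a likely explanation is that the indexed terms were written before the filter was added to the schema.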

Re: DisMax Query Field-Filters (ASCIIFolding)

2013-02-12 Thread Jack Krupansky
1. Show us the full query request and request handler. In particular, the qf parameter. 2. Try the Solr Admin Analysis UI to check for sure how the analysis is being performed. 3. Add debugQuery=true to your query to see how it is actually parsed. 4. If there is any chance that you have
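
For step 3, a request with debugQuery=true can be assembled as in this Python sketch. The base URL and the `qf` field name `name_pl` are placeholders and not taken from the thread; `qt=dismax` mirrors the queries quoted earlier in it.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_debug_url(base_url: str, query: str, qf: str) -> str:
    """Return a select URL that asks Solr to include query-parsing debug output."""
    params = {"q": query, "qt": "dismax", "qf": qf, "debugQuery": "true"}
    # urlencode percent-encodes non-ASCII characters as UTF-8 by default.
    return base_url + "?" + urlencode(params)


# Placeholder host, core path and field name; adjust to the actual installation.
url = build_debug_url("http://localhost:8983/solr/select", "czółenka", "name_pl")
print(url)
```

The debug section of the response shows the parsed query, which should reveal whether the folded or the unfolded term is being searched.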

Re: DisMax Query Field-Filters (ASCIIFolding)

2013-02-12 Thread Ralf Heyde
I'll try to reindex - I modified the schema, but did NOT re-index the index. Damn! Original Message Date: Tue, 12 Feb 2013 11:14:04 -0500 From: Jack Krupansky j...@basetechnology.com To: solr-user@lucene.apache.org Subject: Re: DisMax Query Field-Filters (ASCIIFolding)

Re: Benefits of Solr over Lucene?

2013-02-12 Thread Walter Underwood
This is apples and pomegranates. Lucene is a library, Solr is a server. In features, they are more alike than different. wunder On Feb 12, 2013, at 7:40 AM, JohnRodey wrote: I know that Solr web-enables a Lucene index, but I'm trying to figure out what other things Solr offers over Lucene.

Re: Benefits of Solr over Lucene?

2013-02-12 Thread Jack Krupansky
Here's yet another short list of benefits of Solr over Lucene (not that any of them take away from Lucene since Solr is based on Lucene): - Multiple core index - go beyond the limits of a single lucene index - Support for multi-core or named collections - richer query parsers (e.g.,

Re: Benefits of Solr over Lucene?

2013-02-12 Thread Amit Jha
Adding to Jack's reply, Solr can also be embedded into the application and run in the same process. Solr is the server-ization of Lucene. The line is very blurred, and Solr is not a very thin wrapper around the Lucene library. Most Solr features are distinct from Lucene, like - detailed breakdown of

Re: Any inputs regarding running solr cluster on virtual machines?

2013-02-12 Thread Shawn Heisey
On 2/12/2013 12:25 AM, adfel70 wrote: I'm currently running a solr cluster on 10 physical machines. I'm considering moving to virtual machines. Any insights on this issue? Has anyone tried this? Any best practices? You'll definitely see some performance degradation. How much is very hard to

Re: URLDecoder error message

2013-02-12 Thread Shawn Heisey
On 2/12/2013 1:42 AM, o.mares wrote: yesterday we updated from solr 4.0 to solr 4.1 and since then from time to time following error pops up: {msg=URLDecoder: Invalid character encoding detected after position 160 of query string / form data (while parsing as UTF-8),code=400}: {msg=URLDecoder:

Re: Benefits of Solr over Lucene?

2013-02-12 Thread JohnRodey
So I have had a fair amount of experience using Solr. However on a separate project we are considering just using Lucene directly, which I have never done. I am trying to avoid finding out late that Lucene doesn't offer what we need and being like aw snap, it doesn't support geospatial (or

Re: Solr 3.3.0 - Random CPU problem

2013-02-12 Thread Chris Hostetter
: I'm using Solr 3.3.0 with one master server and two slaves. And the problem : I'm having is that both slaves get degraded randomly but at the same time. : I am completely lost as to what the cause could be, but I see that the : tomcat that runs Solr webapp executes a PERL script that consumes

Re: Solr 3.3.0 - Random CPU problem

2013-02-12 Thread federico.wachs
I don't know what the Perl script looks like. I can tell it's being run by Tomcat because when I do : top the owner of the process says tomcat and the CPU is at 100%. I haven't done anything weird to my Solr installation; actually it is pretty simple and is the one that used to be on the Solr website a

RE: which analyzer is used for facet.query?

2013-02-12 Thread Chris Hostetter
: So it seems that facet.query is using the analyzer of type index. : Is it a bug or is there another analyzer type for the facet query? That doesn't really make any sense ... i don't know much about setting up UIMA (or what/when it logs things) but facet.query uses the regular query parser

Re: Solr 3.3.0 - Random CPU problem

2013-02-12 Thread Chris Hostetter
: I don't know what the perl script looks like. I can tell it's being run by : tomcat because when I do : top the owner of the process says tomcat and : the CPU is at 100%. ... : Do you have any idea of how to see which PERL script is being executed or : what its content is? look at the

Re: Need to create SolrServer objects without checking server availability

2013-02-12 Thread Chris Hostetter
: The problem is at program startup -- when 'new HttpSolrServer(url)' is called, : it goes and makes sure that the server is up and responsive. If any of those : 56 object creation calls fails, then my app won't even start. What exactly is the exception you are getting? I don't think anything

Re: Benefits of Solr over Lucene?

2013-02-12 Thread Shawn Heisey
On 2/12/2013 11:19 AM, JohnRodey wrote: So I have had a fair amount of experience using Solr. However on a separate project we are considering just using Lucene directly, which I have never done. I am trying to avoid finding out late that Lucene doesn't offer what we need and being like aw

Re: Do I have to reindex when upgrading from solr 4.0 to 4.1?

2013-02-12 Thread Joel Bernstein
Michael is correct, that was what was said at the bootcamp (by me). I believe this may not be correct though. Further code review shows that Solr 4.0 was already distributing documents using the hash range technique used in 4.1. The big change in 4.1 was that a composite hash key could be used to

Re: Benefits of Solr over Lucene?

2013-02-12 Thread Glen Newton
Is there a page on the wiki that points out the use cases (or the features) that are best suited for Lucene adoption, and those best suited for SOLR adoption? -Glen On Tue, Feb 12, 2013 at 3:11 PM, Shawn Heisey s...@elyograg.org wrote: On 2/12/2013 11:19 AM, JohnRodey wrote: So I have had a

Re: Benefits of Solr over Lucene?

2013-02-12 Thread Walter Underwood
It is like deciding between a disk drive and a file server. Solr and Lucene are different kinds of things. wunder On Feb 12, 2013, at 12:26 PM, Glen Newton wrote: Is there a page on the wiki that points out the use cases (or the features) that are best suited for Lucene adoption, and those

Re: Benefits of Solr over Lucene?

2013-02-12 Thread Glen Newton
And helping people who don't know much about them decide which to use is not useful? -Glen On Tue, Feb 12, 2013 at 3:34 PM, Walter Underwood wun...@wunderwood.org wrote: It is like deciding between a disk drive and a file server. Solr and Lucene are different kinds of things.

Re: Need to create SolrServer objects without checking server availability

2013-02-12 Thread Shawn Heisey
On 2/12/2013 12:27 PM, Chris Hostetter wrote: : The problem is at program startup -- when 'new HttpSolrServer(url)' is called, : it goes and makes sure that the server is up and responsive. If any of those : 56 object creation calls fails, then my app won't even start. What exactly is the

Re: Benefits of Solr over Lucene?

2013-02-12 Thread Upayavira
Do you want to embed an index into your application, e.g. as a desktop app? Use Lucene. Is search basically the whole of your app? Perhaps use Lucene. Do you want to offer search as a service? Do you want to be able to arbitrarily scale your index (beyond the number of documents a single index

Re: Edismax and mm per field

2013-02-12 Thread Chris Hostetter
: Currently, edismax applies mm to the combination of all fields listed in qf. : : I would like to have mm applied individually to those fields instead. That doesn't really make sense if you think about how the qf is used to build the final query structure -- it is essentially producing a

DIH Delete with Full Import

2013-02-12 Thread Kiran J
Hi, I'm using this configuration: http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport The wiki says: In this case it means obviously that in case you also want to use deletedPkQuery then when running the delta-import command is still necessary. In this link:

Re: Eastings and northings support in Solr Spatial

2013-02-12 Thread Smiley, David W.
Yeah, solr.PointType. Or use solr.SpatialRecursivePrefixTree with geo=false http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4 On 2/8/13 10:38 AM, Kissue Kissue kissue...@gmail.com wrote: I can see Solr has the field type solr.LatLonType which supports spatial based on longitudes and

RE: what do you use for testing relevance?

2013-02-12 Thread Markus Jelsma
Roman, Logging clicks and their position in the result list is one useful method to measure relevance. Using the position you can calculate the mean reciprocal rank; a value near 1.0 is very good, so over time you can clearly see whether changes actually improve user
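
The mean reciprocal rank mentioned above is the average of 1/position over all logged searches. A minimal Python sketch, assuming 1-based click positions and treating searches without a click as contributing 0; the function name and the sample log are made up.

```python
from typing import Iterable, Optional


def mean_reciprocal_rank(click_positions: Iterable[Optional[int]]) -> float:
    """Average of 1/position over all searches; None means the search had no click."""
    positions = list(click_positions)
    if not positions:
        raise ValueError("need at least one logged search")
    total = 0.0
    for position in positions:
        if position is None:
            continue  # a search without a click adds nothing to the sum
        if position < 1:
            raise ValueError("positions are 1-based")
        total += 1.0 / position
    return total / len(positions)


# Three searches whose first clicked results were at positions 1, 2 and 4.
print(mean_reciprocal_rank([1, 2, 4]))
```

Tracking this value per day or per release gives a single number to compare before and after a relevance change.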

Re: what do you use for testing relevance?

2013-02-12 Thread Sebastian Saip
What do you want to achieve with these tests? Is it meant as a regression, to make sure that only the queries/boosts you changed are affected? Then you will have to implement tests that cover your specific schema/boosts. I'm not aware of any frameworks that do this - we're using Java based tests

Re: solr4.0 problem zkHost with multiple hosts throws out of range exception

2013-02-12 Thread mbennett
The suggested syntax didn't work with embedded ZooKeeper: Syntax: -DzkRun -DzkHost=nodeA:9983,nodeB:9983,nodeC:9983,nodeD:9983/solrroot -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=MyConfig Error: SEVERE: Could not start Solr. Check solr/home property and the
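
The zkHost value above has the shape of a ZooKeeper connect string: comma-separated host:port pairs with an optional chroot path appended once at the end. The Python sketch below only splits such a string for inspection; it is not how Solr parses the value, and the function name is an assumption.

```python
from typing import List, Tuple


def parse_zk_host(zk_host: str) -> Tuple[List[Tuple[str, int]], str]:
    """Split a connect string into (host, port) pairs and the trailing chroot path."""
    hosts_part, separator, chroot = zk_host.partition("/")
    hosts = []
    for entry in hosts_part.split(","):
        host, _, port = entry.strip().partition(":")
        # 2181 is ZooKeeper's default client port when none is given.
        hosts.append((host, int(port) if port else 2181))
    return hosts, ("/" + chroot if separator else "")


hosts, chroot = parse_zk_host("nodeA:9983,nodeB:9983,nodeC:9983,nodeD:9983/solrroot")
print(hosts, chroot)
```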

Re: solr4.0 problem zkHost with multiple hosts throws out of range exception

2013-02-12 Thread Upayavira
This config isn't intended for embedded zookeeper, it is for a separate zookeeper ensemble that is shared with other services. Upayavira On Tue, Feb 12, 2013, at 10:19 PM, mbennett wrote: The suggested syntax didn't work with embedded ZooKeeper: Syntax: -DzkRun

Re: How to limit queries to specific IDs

2013-02-12 Thread Erick Erickson
First, it may not be a problem assuming your other filter queries are more frequent. Second, the easiest way to keep these out of the filter cache would be just to include them as a MUST clause, like +(original query) +id:(1 2 3 4). Third possibility, see
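
The second suggestion can be sketched as plain string building in Python. The helper name is hypothetical and the sketch does no escaping, so it assumes ids that contain no query-syntax characters.

```python
from typing import Iterable


def restrict_to_ids(original_query: str, ids: Iterable) -> str:
    """Wrap the original query and the id list as two required (MUST) clauses."""
    id_clause = " ".join(str(doc_id) for doc_id in ids)
    if not id_clause:
        raise ValueError("need at least one id")
    return f"+({original_query}) +id:({id_clause})"


print(restrict_to_ids("title:solr", [1, 2, 3, 4]))
```

Because the id restriction travels in the main query instead of an fq parameter, it does not create a filter cache entry.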

Re: Reverse range query

2013-02-12 Thread Erick Erickson
Well, what does adding debug=query show you for the parsed query? What documents show up? My first guess is that since you're using exclusive rather than inclusive end points, your expectations aren't what you think. Best Erick On Mon, Feb 11, 2013 at 10:57 PM, ballusethuraman

Re: LoadBalancing while adding documents

2013-02-12 Thread Erick Erickson
Hold on here. LBHttpSolrServer should not be used for indexing in a Master/Slave setup, but in SolrCloud you may use it. Indeed, CloudSolrServer uses LBHttpSolrServer under the covers. Now, why would you want to send requests to both servers? If you're in master/slave mode (i.e. not running

Re: SolrCloud and hardcoded 'id' field

2013-02-12 Thread Shawn Heisey
On 2/11/2013 7:47 PM, Mark Miller wrote: Doesn't sound right to me. I'd guess you heard wrong. I did a search for id with the quotes throughout the branch_4x source code. After excluding test code, test files, and other things that looked like they have good reason to be hardcoded, I was

Re: SolrCloud and hardcoded 'id' field

2013-02-12 Thread Shawn Heisey
On 2/12/2013 7:54 PM, Shawn Heisey wrote: On 2/11/2013 7:47 PM, Mark Miller wrote: Doesn't sound right to me. I'd guess you heard wrong. I did a search for id with the quotes throughout the branch_4x source code. After excluding test code, test files, and other things that looked like they

Re: Benefits of Solr over Lucene?

2013-02-12 Thread Lance Norskog
Lucene and Solr have an aggressive upgrade schedule. The step from 3 to 4 was a major rewiring, and parts are orders of magnitude faster and smaller. If you code using Lucene, you will never upgrade to newer versions. (I supported Solr/Lucene customers for 3 years, and nobody ever did.) Cheers, Lance I

Re: More Like This, finding original record

2013-02-12 Thread Otis Gospodnetic
Hello, Daniel, are you looking for the original doc you used for MLT in the response? You could always and easily do this on the client side by looking at IDs of returned docs. Otis Solr ElasticSearch Support http://sematext.com/ On Feb 12, 2013 9:26 AM, Daniel Rijkhof

Re: what do you use for testing relevance?

2013-02-12 Thread Otis Gospodnetic
Hi Roman, We use our own Search Analytics service. It's free and open to anyone - see http://sematext.com/search-analytics/index.html And this post talks exactly about the topic you are asking about:

Re: How to limit queries to specific IDs

2013-02-12 Thread Isaac Hebsh
Thank you, Erick! Three great answers! On Wed, Feb 13, 2013 at 4:20 AM, Erick Erickson erickerick...@gmail.com wrote: First, it may not be a problem assuming your other filter queries are more frequent. Second, the easiest way to keep these out of the filter cache would be just to include

Re: LoadBalancing while adding documents

2013-02-12 Thread J Mohamed Zahoor
On 13-Feb-2013, at 8:11 AM, Erick Erickson erickerick...@gmail.com wrote: Hold on here. LBHttpSolrServer should not be used for indexing in a Master/Slave setup, but in SolrCloud you may use it. Indeed, CloudSolrServer uses LBHttpSolrServer under the covers. In SolrCloud mode,

Re: what do you use for testing relevance?

2013-02-12 Thread Steffen Elberg Godskesen
Hi Roman, If you're looking for regression testing then https://github.com/sul-dlss/rspec-solr might be worth looking at. If you're not a Ruby shop, doing something similar in another language shouldn't be too hard. The basic idea is that you set up a set of tests like If the query is X,

Re: LoadBalancing while adding documents

2013-02-12 Thread J Mohamed Zahoor
Ooh.. I didn't know that there is a CloudSolrServer. Thanks for the pointer. Will explore that. ./zahoor On 13-Feb-2013, at 11:49 AM, J Mohamed Zahoor zah...@indix.com wrote: On 13-Feb-2013, at 8:11 AM, Erick Erickson erickerick...@gmail.com wrote: Hold on here. LBHttpSolrServer should not