Performance degradation with distributed search

2012-02-03 Thread XJ
Hello, I am experimenting with Solr distributed search with random sharding (we currently use geo sharding), hoping to gain some performance and also scalability in the future (the index size keeps growing and geo sharding is hard to scale). However, I'm seeing worse performance with distributed search, on a testin

Re: multiple index analyzer chains on a field

2012-02-03 Thread Jamie Johnson
Looking closer I think I asked the wrong question, please disregard and I will start a new chain with that question On Friday, February 3, 2012, Jamie Johnson wrote: > Is it possible to have multiple index analysis chains on a single field?

Re: frange with multi-valued fields

2012-02-03 Thread Chris Hostetter
: Has anyone had experience using frange with multi-valued fields? In : solr 3.5 doing so results in the error: "can not use FieldCache on : multivalued field" correct. : Here's the use case. We have multiple years attached to each document : and want to be able to refine by a year range.
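For the year-range use case quoted above, one option that avoids the FieldCache is a plain range filter query instead of frange. The following is a minimal sketch, not the poster's setup: the field name `year` and the bounds are assumptions chosen for illustration, and it only builds the request parameters.

```python
from urllib.parse import urlencode


def year_range_params(field, low, high, query="*:*"):
    """Build select parameters that filter on an inclusive range of `field`.

    A standard range query (field:[low TO high]) does not go through the
    FieldCache, so it can be used on a multi-valued field.
    """
    return {"q": query, "fq": f"{field}:[{low} TO {high}]"}


# `year`, 2005 and 2010 are hypothetical values for illustration only.
params = year_range_params("year", 2005, 2010)
print(urlencode(params))
```

A document then matches if any one of its year values falls inside the range.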

ReversedWildcardFilterFactory and PorterStemFilterFactory

2012-02-03 Thread Jamie Johnson
I'd like to use both the ReversedWildcardFilterFactory and PorterStemFilterFactory on a text field that I have, I'd like to avoid stemming the reversed fields and would also like to avoid reversing the stemmed fields. My original thought was to have the ReversedWildcardFilterFactory higher in the

Re: error in indexing

2012-02-03 Thread Chris Hostetter
: Subject: Re: error in indexing FWIW: it's really crucial to state which version of Solr you are using when you have bugs with error stack traces like this -- going back through the versions i'm *guessing* that you are using Solr 1.4.1 (or possibly older), correct? Based on that assumption (

Re: SolrCloud war?

2012-02-03 Thread Mark Miller
On Feb 3, 2012, at 1:04 PM, Darren Govoni wrote: > I deployed each war app into the "/solr" context. I presume it's needed by > remote URL addressing. Yup - but you can override this by setting the hostContext in solr.xml. It defaults to "solr" as that fits the example jetty distribution. - Ma

Re: Zero Matches Weirdness

2012-02-03 Thread Dmitry Kan
Ok, thanks, Erick, good to know. Sorry for the confusion. On Fri, Feb 3, 2012 at 9:42 PM, Erik Hatcher wrote: > No, don't do that. That's definitely not good advice. If the analysis > chain is the same for both index and query, just use . > > As for Marian's issue... was there literally a + in

Re: Another zero match issue

2012-02-03 Thread Chris Hostetter
: ?q=Create+a+self+contained+Part+Module&defType=edismax&qf=location^0.9+text^0.8+fileName^8.0+title^4.0 : : I get ZERO results. : : If I remove the fileName qf parameter (an indexed but not stored field), I get 5 hits. lemme guess: fileName doesn't use stopwords but other fields do, correct?
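To compare how the query is parsed with and without fileName in qf, the two requests can be generated side by side with debugQuery=true. This is a rough sketch: the base URL is an assumed local example instance, and the helper name `edismax_url` is made up for illustration. The field names and boosts are the ones from the quoted request.

```python
from urllib.parse import urlencode

# Assumed address of a local example instance; adjust to your deployment.
BASE = "http://localhost:8983/solr/select"


def edismax_url(q, qf_boosts, debug=True):
    """Return a select URL using edismax with the given field boosts."""
    qf = " ".join(f"{field}^{boost}" for field, boost in qf_boosts.items())
    params = {"q": q, "defType": "edismax", "qf": qf}
    if debug:
        # Asks Solr to include the parsed query in the response.
        params["debugQuery"] = "true"
    return BASE + "?" + urlencode(params)


boosts = {"location": 0.9, "text": 0.8, "fileName": 8.0, "title": 4.0}
query = "Create a self contained Part Module"

with_file = edismax_url(query, boosts)
without_file = edismax_url(
    query, {f: b for f, b in boosts.items() if f != "fileName"}
)
print(with_file)
print(without_file)
```

Diffing the parsed query in the two debug outputs should show whether a term such as "a" is required on fileName while being dropped as a stopword on the other fields.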

Re: Zero Matches Weirdness

2012-02-03 Thread Marian Steinbach
It just got rid of the one field "aktenzeichen" never matching in the qf string. Now it works fine. Solved for now. Thanks!

Re: Zero Matches Weirdness

2012-02-03 Thread Marian Steinbach
2012/2/3 Erik Hatcher : > As for Marian's issue... was there literally a + in the query or was that > urlencoded?   Try debugQuery=true for both queries and see what you get for > the query parsing output. > I tested both + and %20 with and without quotes, it doesn't make a difference whether I

Setting solrj server connection timeout

2012-02-03 Thread Shawn Heisey
Is the following a reasonable approach to setting a connection timeout with SolrJ? queryCore.getHttpClient().getHttpConnectionManager().getParams() .setConnectionTimeout(15000); Right now I have all my solr server objects sharing a single HttpClient that gets created us

Another zero match issue

2012-02-03 Thread Van Tassell, Kristian
Hi everyone! I'm also having some zero match weirdness. When I execute this search: ?q=Create+a+self+contained+Part+Module&defType=edismax&qf=location^0.9+text^0.8+fileName^8.0+title^4.0 I get ZERO results. If I remove the fileName qf parameter (an indexed but not stored field), I get 5 hits.

Re: Zero Matches Weirdness

2012-02-03 Thread Erik Hatcher
No, don't do that. That's definitely not good advice. If the analysis chain is the same for both index and query, just use . As for Marian's issue... was there literally a + in the query or was that urlencoded? Try debugQuery=true for both queries and see what you get for the query parsing

Re: Zero Matches Weirdness

2012-02-03 Thread Dmitry Kan
Actually, I wouldn't count on it and just specify index and query sides explicitly. Just to play it safe. On Fri, Feb 3, 2012 at 8:34 PM, Marian Steinbach wrote: > 2012/2/3 Dmitry Kan : > > What about side of the field? > > > > It's identical. At least that's what I think, since I didn't specify

Re: Zero Matches Weirdness

2012-02-03 Thread Marian Steinbach
2012/2/3 Dmitry Kan : > What about side of the field? > It's identical. At least that's what I think, since I didn't specify the type="query" or type="index" attribute for the analyzer part. Marian

Re: SolrCloud - issues running with embedded zookeeper ensemble

2012-02-03 Thread Dipti Srivastava
Hi Mark, Thanks for looking into the issue. As for specifying the bootstrap dir for each instance with ZK, it was just a typo on my side. I went back and looked at my script on the second and 3rd nodes and it did not have the bootstrap dir, so I had specified it for only the very FIRST node that re

Re: Zero Matches Weirdness

2012-02-03 Thread Dmitry Kan
What about side of the field? On Fri, Feb 3, 2012 at 6:11 PM, Marian Steinbach wrote: > Hi! > > I am having a weird issue with a search string not producing a match > where it should. I can reproduce it with both 3.4 and 3.5. > > "Where it should" means that I am getting a hit in the "Analyse"

Re: SolrCloud war?

2012-02-03 Thread Darren Govoni
UPDATE: I set my app server[1] system property jetty.port to be equal to the app server's open port and was able to get two Solr shards to talk. The overall properties I set are: App server domain 1: bootstrap_confdir collection.configName jetty.port solr.solr.home zkRun App server domain 2:

Zero Matches Weirdness

2012-02-03 Thread Marian Steinbach
Hi! I am having a weird issue with a search string not producing a match where it should. I can reproduce it with both 3.4 and 3.5. "Where it should" means that I am getting a hit in the "Analyse" tool in the admin panel, but not in a query via /select. Now when I try select?q=Am+Heidstamm&.

Re: Solr index update approach

2012-02-03 Thread Ahmet Arslan
> external fields was one of the options, but I was not 100% > sure if it would > fit. > I will study more about this option. I was wondering if Lucene's ToChildBlockJoinQuery and/or ToParentBlockJoinQuery can be replacement for ExternalFileField. http://www.searchworkings.org/blog/-/blogs/toch

Re: Shard timeouts on large (1B docs) Solr cluster

2012-02-03 Thread Marc Sturlese
timeAllowed can be used outside distributed search. It is used by the TimeLimitingCollector. When the search time reaches timeAllowed, it will stop searching and return the results it could find until then. This can be a problem when using incremental indexing. Lucene starts searching from
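A client that sets timeAllowed will usually want to know whether a given result set was cut short. The sketch below makes two assumptions: that the response was requested as JSON (wt=json) and already parsed into a dict, and that a truncated search is flagged by a partialResults entry in responseHeader. The helper names and the sample dicts are made up for illustration; check the actual response shape against your Solr version.

```python
from urllib.parse import urlencode


def time_limited_params(q, time_allowed_ms):
    """Build select parameters with a search time budget in milliseconds."""
    return {"q": q, "timeAllowed": str(time_allowed_ms), "wt": "json"}


def is_partial(response):
    """Return True if the parsed response is flagged as partial.

    Assumes the flag lives at responseHeader.partialResults; a missing
    flag is treated as a complete result.
    """
    header = response.get("responseHeader", {})
    return bool(header.get("partialResults", False))


params = time_limited_params("*:*", 1000)
print(urlencode(params))

# Hand-written sample dicts standing in for parsed responses.
complete = {"responseHeader": {"status": 0}, "response": {"numFound": 42}}
truncated = {"responseHeader": {"status": 0, "partialResults": True}}
print(is_partial(complete), is_partial(truncated))
```

Treating a partial response as "possibly missing recent documents" rather than as an error is one way to handle the incremental-indexing concern raised above.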

Re: Solr index update approach

2012-02-03 Thread Listas Discussões
hi Mikhail external fields was one of the options, but I was not 100% sure if it would fit. I will study more about this option. thank you so much for your reply Arian 2012/2/3 Mikhail Khludnev > Hello Arian, > > Pls look into > > http://sujitpal.blogspot.com/2011/05/custom-sorting-in-solr-usi

Re: Parallel indexing in Solr

2012-02-03 Thread Erick Erickson
Unfortunately, the answer is "it depends(tm)". First question: How are you indexing things? SolrJ? post.jar? But some observations: 1> sure, using multiple cores will have some parallelism. So will using a single core but using something like SolrJ and StreamingUpdateSolrServer. Especial

Re: Which patch 236 to choose for collapse - Solr 3.5

2012-02-03 Thread preetesh dubey
Erick, yes, u r correct. But with that example I only wanted to explain Tamanjit that "matches" in solr response contains all docs which matched with group query. Tamanjit if u want the counts of docs of only first page according to "rows" parameter then that is the only way which u mentioned...it

Re: error in indexing

2012-02-03 Thread Erick Erickson
Perhaps you could review: http://wiki.apache.org/solr/UsingMailingLists You really haven't shown us what it is that you're doing that generates this error, about all you've said is "it doesn't work". I'd start with trying to index a document with only the required fields for your particular schem

Re: Which patch 236 to choose for collapse - Solr 3.5

2012-02-03 Thread Erick Erickson
Prateesh: I'm not understanding here. I believe Tamanjit is correct. Your example works if and only if *all* the groups are returned, which happens in the example case but not in the general case. Try your experiment with &rows=3 and you'll find that (trunk, example) This search: http://localhos

Re: Solr index update approach

2012-02-03 Thread Mikhail Khludnev
Hello Arian, Pls look into http://sujitpal.blogspot.com/2011/05/custom-sorting-in-solr-using-external.html it can be useful for your purpose. If you need to count facets against an external field you need to develop your own component - shouldn't be a big deal. Solr's bolts are http://lucidworks.lu

Re: Debugging on Tika

2012-02-03 Thread Ahmet Arslan
> I'm using Tika 0.10 for indexing my documents but I am not > getting the expected results when doing a search. Even after > I delete the index and started over again. > Some of the words in for example a PDF document can be found > but most of them not. It could be the maxFieldLength setting in

Solr index update approach

2012-02-03 Thread Listas Discussões
hi, I have an opinion mining application running solr that serves to retrieve documents and perform some analytics using facet queries. It works great. But I have a big issue. The document has an attribute for opinion that is automatically detected, but users can change it if it's not correct. A d

Re: Debugging on Tika

2012-02-03 Thread Oleg Tikhonov
Hi Arkadi, You can try to extract text from your documents using Tika's CLI (more details http://tika.apache.org/0.7/gettingstarted.html). If that succeeds, it means that something goes wrong during the indexing. Tika only extracts text and metadata from the documents and sends this text to

Re: error in indexing

2012-02-03 Thread leonardo2
Can someone help me? Leonardo -- View this message in context: http://lucene.472066.n3.nabble.com/error-in-indexing-tp3709686p3712495.html Sent from the Solr - User mailing list archive at Nabble.com.

Parallel indexing in Solr

2012-02-03 Thread Per Steffensen
Hi This topic has probably been covered before, but I haven't had the luck to find the answer. We are running solr instances with several cores inside. Solr running out-of-the-box on top of jetty. I believe jetty is receiving all the http-requests about indexing new documents, and forwards it

Debugging on Tika

2012-02-03 Thread Arkadi Colson
Hi I'm using Tika 0.10 for indexing my documents but I am not getting the expected results when doing a search. Even after I deleted the index and started over again. Some of the words in for example a PDF document can be found but most of them not. Is it related to some language setting perhap