Re: logical relation among filter queries

2011-03-08 Thread cyang2010
Right, I can combine that into one fq query. The only thing is that I want to reduce the cache size. I remember this is what I read on the wiki: fq=rating:R (filter query cache A) fq=rating:PG-13 (filter query cache B) fq=rating:(R OR PG-13) -- (It won't be able to leverage the

Re: StreamingUpdateSolrServer

2011-03-08 Thread Lance Norskog
Yes. Each thread uses its own connection, and each becomes a new thread in the servlet container. On Mon, Mar 7, 2011 at 2:54 AM, Isan Fulia isan.fu...@germinait.com wrote: Hi all, I am using StreamingUpdateSolrServer with queuesize = 5 and threadcount = 4. The number of connections created is the same

Synonyms question

2011-03-08 Thread Darx Oman
Hi guys How to put this in synonyms.txt US USA United States of America

Re: Synonyms question

2011-03-08 Thread Jan Høydahl
http://lmgtfy.com/?q=solr+synonym (First hit gives many examples) -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 8 March 2011, at 10:06, Darx Oman wrote: Hi guys How to put this in synonyms.txt US USA United States of America

How to Index and query URLs as fields

2011-03-08 Thread Robert Krüger
Hi, I've run into problems trying to achieve a seemingly simple thing. I'm indexing a bunch of files (local ones and potentially some accessible via other protocols like http or ftp) and have an index field with the url to the file, e.g. file:/home/foo/bar.pdf. Now I want to perform two simple

Re: Use of multiple tomcat instance and shards.

2011-03-08 Thread Tommaso Teofili
Hi, from my experience when you have to scale in the number of documents it's a good idea to use shards (so one schema and N shards, each containing (1/N)*total#docs), while if the requirement is to sustain a high query volume you could get a significant boost from replicating the same index on 2 or
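The two options in the message above can be sketched in a few lines of Python. This is only an illustration under assumptions: the host names and the `/select` path are hypothetical, and the helper names `sharded_query_url` and `round_robin` are made up for this example. The `shards` request parameter is Solr's distributed-search parameter; replicas of one index are modeled here as a simple client-side rotation.

```python
from itertools import cycle
from urllib.parse import urlencode


def sharded_query_url(coordinator, shard_hosts, query):
    """Sharding: one node receives the query and fans it out to every shard."""
    params = {
        "q": query,
        # 'shards' is a comma-separated list of host:port/path entries
        "shards": ",".join(shard_hosts),
    }
    return "http://" + coordinator + "/select?" + urlencode(params)


def round_robin(replica_hosts):
    """Replication: every host holds the full index, so queries simply rotate."""
    return cycle(replica_hosts)


# Hypothetical hosts, for illustration only.
SHARDS = ["solr1:8080/solr", "solr2:8080/solr"]
url = sharded_query_url(SHARDS[0], SHARDS, "title:lucene")
print(url)

replicas = round_robin(["solrA:8080/solr", "solrB:8080/solr"])
print([next(replicas) for _ in range(3)])
```

Sharding splits the documents so each node searches a smaller index, while replication keeps full copies so that query load is spread across nodes.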

Re: Use of multiple tomcat instance and shards.

2011-03-08 Thread Jan Høydahl
Having 2Gb physical memory on the box I would allocate -Xmx1024m to Java as a starting point. The other thing you could do is try to trim your config to use less memory. Are you using many facets? String sorts? Wildcards? Fuzzy? Storing or returning more fields than needed?

Difference between Faceting Fieldcollapsing

2011-03-08 Thread Isha Garg
Hi, Can anyone explain in which scenarios faceting and field collapsing are used? What is the difference between the two? Best Regards! Isha

Re: How to Index and query URLs as fields

2011-03-08 Thread Robert Krüger
My mistake. The error turned out to be somewhere else and the described approach seems to work. Sorry for the wasted bandwidth. On Mar 8, 2011, at 11:06 AM, Robert Krüger wrote: Hi, I've run into problems trying to achieve a seemingly simple thing. I'm indexing a bunch of files (local

Re: Difference between Faceting Fieldcollapsing

2011-03-08 Thread Jan Høydahl
Faceting is returned independently of your result set, telling you how many documents contain each facet. Field collapsing / grouping modifies your result set to roll up multiple hits sharing the same collapse key, much like Google does to hide more results from same site. You may use a field
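The distinction drawn above can be illustrated with a toy Python model. It does not call Solr; the documents, field names, and the helpers `facet_counts` and `collapse` are invented for the example. Faceting counts values across the hits and leaves the hits alone, whereas collapsing changes the hit list itself.

```python
from collections import Counter

# Made-up search hits, already in ranked order.
docs = [
    {"id": 1, "site": "a.com", "lang": "en"},
    {"id": 2, "site": "a.com", "lang": "de"},
    {"id": 3, "site": "b.org", "lang": "en"},
    {"id": 4, "site": "a.com", "lang": "en"},
]


def facet_counts(results, field):
    """Faceting: count how many hits carry each value; the hit list is unchanged."""
    return Counter(doc[field] for doc in results)


def collapse(results, field):
    """Collapsing: keep the best hit per key and note how many were rolled up."""
    groups = {}
    for doc in results:
        groups.setdefault(doc[field], []).append(doc)
    return [(hits[0], len(hits) - 1) for hits in groups.values()]


print(facet_counts(docs, "lang"))
for top_hit, hidden in collapse(docs, "site"):
    print(top_hit["id"], "plus", hidden, "more from", top_hit["site"])
```

Here the facet on `lang` still describes all four hits, while collapsing on `site` shrinks the displayed list to one entry per site.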

Re: logical relation among filter queries

2011-03-08 Thread Erick Erickson
The filter queries are interpreted to be intersection. That is, each fq clause is intersected with the result set. There's no way I know of to combine separate filter queries with an OR operator. Best Erick On Tue, Mar 8, 2011 at 2:59 AM, cyang2010 ysxsu...@hotmail.com wrote: Right, i can
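A small Python model of the intersection behaviour described above, using plain sets in place of cached filter results. The document IDs, ratings, and helper names are made up; only the `fq` parameter name comes from the thread. It shows why two separate `fq` clauses on the same field cannot act as an OR, and how the combined clause is sent as its own filter.

```python
from urllib.parse import urlencode

# Made-up index: document ID -> rating.
ratings = {1: "R", 2: "PG-13", 3: "G", 4: "R"}
all_docs = set(ratings)


def rating_filter(*accepted):
    """Stand-in for one cached filter: the set of matching document IDs."""
    return {doc for doc, value in ratings.items() if value in accepted}


def apply_filters(candidates, filters):
    """Each filter is intersected with the running result set."""
    result = set(candidates)
    for doc_set in filters:
        result &= doc_set
    return result


# fq=rating:R together with fq=rating:PG-13 -> intersection, which is empty here
two_fq = apply_filters(all_docs, [rating_filter("R"), rating_filter("PG-13")])

# fq=rating:(R OR PG-13) -> a single filter that already contains the union
one_fq = apply_filters(all_docs, [rating_filter("R", "PG-13")])

query_string = urlencode([("q", "*:*"), ("fq", "rating:(R OR PG-13)")])
print(two_fq, one_fq, query_string)
```

In this model the OR has to live inside one filter clause, which is why it ends up as a separate cache entry from the two single-value filters.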

Re: Use of multiple tomcat instance and shards.

2011-03-08 Thread rajini maski
I have considered the RAM usage points from the Solr wiki, and yes, I have many facet queries fired every time, which might be one of the reasons. I did set -Xmx1024m and the error still occurred, but only 2-3 times, after many search queries had been fired. But then the system slows down. So I needed any

Re: Use of multiple tomcat instance and shards.

2011-03-08 Thread Erick Erickson
Have you looked at your cache usage statistics from the admin page? That should give you some sense of whether your caches are experiencing evictions, which would also lead to excessive garbage collections. That should give you some additional information to work with. Also, what version of Solr

Re: Use of multiple tomcat instance and shards.

2011-03-08 Thread Tommaso Teofili
Hi Rajani, 2011/3/8 rajini maski rajinima...@gmail.com: Tommaso, Please can you share any link that explains how to enable and do load balancing on the machines that you mentioned above? If you're querying Solr via SolrJ [1] you could use the LBHttpSolrServer [2]

getting much double-Values from solr -- timeout

2011-03-08 Thread stockii
Hello. I have 34,000,000 documents in my index and each doc has a field with a double value. I want the sum of these fields. I tested it with the StatsComponent but this is not usable! So I get all my values directly from Solr, from the index, and with php-sum() I get my sum. That works fine

Re: Use of multiple tomcat instance and shards.

2011-03-08 Thread Tommaso Teofili
Just one more hint, I didn't mention it in the previous email since I imagine the scenario you explained doesn't allow it but anyways you could also check Solr Cloud and its distributed requests [1]. Cheers, Tommaso [1] : http://wiki.apache.org/solr/SolrCloud#Distributed_Requests 2011/3/8

Re: logical relation among filter queries

2011-03-08 Thread cyang2010
Erick, Thanks for the reply. Is there any way that I can instruct Solr to combine separate filter queries into a UNION result, without creating the third filter query cache entry I described above? If not, shall I give up using filter queries for such a scenario (where I query the same field with multiple value

RE: How to handle searches across traditional and simplified Chinese?

2011-03-08 Thread Burton-West, Tom
This page discusses the reasons why it's not a simple one to one mapping http://www.kanji.org/cjk/c2c/c2cbasis.htm Tom -Original Message- I have documents that contain both simplified and traditional Chinese characters. Is there any way to search across them? For example, if someone

docBoost

2011-03-08 Thread Brian Lamb
Hi all, I am using dataimport to create my index and I want to use docBoost to assign some higher weights to certain docs. I understand the concept behind docBoost but I haven't been able to find an example anywhere that shows how to implement it. Assuming the following config file: document

Re: Problem adding new requesthandler to solr branch_3x

2011-03-08 Thread Chris Hostetter
: 1.  Why the problem occurs (has something changed between 1.4.1 and 3x)? Various pieces of code dealing with config parsing have changed since 1.4.1 to be better about verifying that configs are meaningful, and reporting errors when unexpected things are encountered. I'm not sure of the

Smart Pagination queries

2011-03-08 Thread javaxmlsoapdev
e.g. There are 4,000 solr documents that were found for a particular word search. My app has entitlement rules applied to those 4,000 documents and it's quite possible that user is only eligible to view 3,000 results out of 4K. This is achieved through post filtering application logic. My

Re: -ignore words not working?

2011-03-08 Thread Chris Hostetter
: AND ((-title:men) AND (-keywords:men) AND (-description:men)) ... : As soon as I put in -field:value it yields no results... even though there : are a ton of results that match the criteria :/ You didn't add -field:value ... you added (-field:value). The parens are significant. the
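The reply is cut off here, so the following is a toy set model of the usual explanation rather than a quote from it: a negative clause removes documents from whatever the positive clauses at the same boolean level matched, and a parenthesized group that contains only negative clauses has nothing to remove from. The document IDs and the `evaluate` helper are invented for illustration.

```python
def evaluate(positive, negative):
    """One boolean level: intersect the positive clauses, then subtract the negatives."""
    if not positive:
        return set()  # a purely negative group matches nothing
    result = set.intersection(*positive)
    for excluded in negative:
        result = result - excluded
    return result


base = {1, 2, 3, 4}   # documents matched by the rest of the query
title_men = {2}       # documents matching title:men
keywords_men = {3}    # documents matching keywords:men

# base AND ((-title:men) AND (-keywords:men)): each parenthesized group is empty
nested = evaluate(
    [base, evaluate([], [title_men]), evaluate([], [keywords_men])], []
)

# base -title:men -keywords:men: the negatives subtract from the base matches
flat = evaluate([base], [title_men, keywords_men])

print(nested, flat)
```

Under this model, writing the exclusions at the same level as a positive clause is what lets them take effect.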

Re: Error during auto-warming of key

2011-03-08 Thread Markus Jelsma
Anyone here with some thoughts on this issue? Hi, Yesterday's error log contains something peculiar: ERROR [solr.search.SolrCache] - [pool-29-thread-1] - : Error during auto-warming of key:+*:* (1.0/(7.71E-8*float(ms(const(1298682616680),date(sort_date)))+1.0))^20.0:ja

Re: getting much double-Values from solr -- timeout

2011-03-08 Thread Jan Høydahl
Are you using shards or do you have everything in the same index? What problem did you experience with the StatsComponent? How did you use it? I think the right approach will be to optimize StatsComponent to do quick sum() -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 8.
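For reference, a minimal Python sketch of what a StatsComponent request for a sum might look like. The field name `amount` and both helper names are hypothetical, and the sample response is hand-written to mirror the layout I would expect from a JSON response, not captured from a real server. It does not address the performance problem raised in the thread.

```python
from urllib.parse import urlencode


def stats_sum_params(field, query="*:*"):
    """Request parameters asking the StatsComponent for statistics on one field."""
    return [
        ("q", query),
        ("rows", "0"),          # no documents needed, only the statistics
        ("stats", "true"),
        ("stats.field", field),
        ("wt", "json"),
    ]


def extract_sum(response, field):
    """Pull the sum out of a parsed response (assumed layout, see note above)."""
    return response["stats"]["stats_fields"][field]["sum"]


query_string = urlencode(stats_sum_params("amount"))
print(query_string)

# Hand-written sample, not real server output.
sample = {"stats": {"stats_fields": {"amount": {"sum": 12.5, "count": 3}}}}
print(extract_sum(sample, "amount"))
```

Compared with fetching every value and summing on the client, this keeps the aggregation on the server and returns a single number.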

Re: logical relation among filter queries

2011-03-08 Thread Erick Erickson
Can't really answer that question in the abstract. About all you can really do is monitor your caches (the admin stats page helps) and note if/when you start getting cache evictions and adjust then. I really wouldn't worry about this unless and until you start getting query slowdowns, just go

NRT in Solr

2011-03-08 Thread Jae Joo
Hi, Is NRT available in Solr 4.0 from trunk? I have checked out trunk, but could not find the configuration for NRT. Regards Jae

two QueryHandler components in one schema?

2011-03-08 Thread Paul Libbrecht
hello list, in my schema I have <searchComponent name="query" class="org.curriki.solr.handlers.CurrikiSolrQueryComponent" /> which, as I understand it, allows all requestHandlers to use my query-component. That is useful but I wonder if there's a way for me to have one

Re: two QueryHandler components in one schema?

2011-03-08 Thread Chris Hostetter
: in my schema I have First off, a bit of terminology clarification: Search Components are declared in the solrconfig.xml file. schema.xml is where you define what, inherently, the data in your index *is*. solrconfig.xml is where you define how you want people to be able to interact with

Re: two QueryHandler components in one schema?

2011-03-08 Thread Paul Libbrecht
On 8 March 2011 at 23:03, Chris Hostetter wrote: : in my schema I have First off, a bit of terminology clarification: Search Components are declared in the solrconfig.xml file. schema.xml is where you define what, inherently, the data in your index *is*. solrconfig.xml is where

Re: two QueryHandler components in one schema?

2011-03-08 Thread Markus Jelsma
A request handler can have first-components and last-components and also just plain components. List all your stuff in components and voila. Don't forget to also add debug, facet and other default components if you need them. On 8 March 2011 at 23:03, Chris Hostetter wrote: : in my schema I

Solr Hanging all of sudden with update/csv

2011-03-08 Thread danomano
Hi folks, I've been using Solr for about 3 months. Our Solr install is a single node, and we have been injecting logging data into the Solr server every couple of minutes, with each update taking a few minutes. Everything was working fine until this morning, at which point it appeared that all

Re: two QueryHandler components in one schema?

2011-03-08 Thread Chris Hostetter
: So how do I define, for a given request-handler, a special query component? : I did not find this in the schema. You mean solrconfig.xml, again. Taken directly from the SearchHandler URL I sent you... If you want to have a custom list of components (either omitting defaults or adding

Re: two QueryHandler components in one schema?

2011-03-08 Thread Paul Libbrecht
Erm, did you, Hoss, not say that components are referred to by name? How could the search result be read from the query mySpecialQueryComponent if it cannot be named? Simply through the pool of SolrParams? If yes, that's the great magic of solr. paul Le 8 mars 2011 à 23:19, Chris Hostetter a

How to intercept the http request made by solrj

2011-03-08 Thread cyang2010
Hi, does anyone know how to intercept the HTTP request made by SolrJ? I only see the URL being printed out when the request is invalid. But as part of the development/debugging process, I want to verify what HTTP request it sent out to the Solr server. Thanks. CY -- View this message in

Re: error in log INFO org.apache.solr.core.SolrCore - webapp=/solr path=/admin/ping params={} status=0 QTime=1

2011-03-08 Thread Chris Hostetter
: I am using solr under jboss, so this might be more of a jboss config : issue, not really sure. But my logs keep getting spammed, because : solr sends it as ERROR [STDERR] INFO org.apache.solr.core.SolrCore - : webapp=/solr path=/admin/ping params={} status=0 QTime=1 : : Has anyone seen this

Re: two QueryHandler components in one schema?

2011-03-08 Thread Chris Hostetter
: did you, Hoss, not say that components are referred to by name? How : could the search result be read from the query mySpecialQueryComponent : if it cannot be named? Simply through the pool of SolrParams? in the example i gave, mySpecialQueryComponent *is* the name of some component you

Re: dataimport

2011-03-08 Thread Chris Hostetter
: INFO: Creating a connection for entity id with URL: : jdbc:mysql://localhost/researchsquare_beta_library?characterEncoding=UTF8&zeroDateTimeBehavior=convertToNull : Feb 24, 2011 8:58:25 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 : call : INFO: Time taken for getConnection(): 137 :

Re: Help with explain query syntax

2011-03-08 Thread Chris Hostetter
: <str name="parsedquery"> : +DisjunctionMaxQuery((company_name:(linguajob.pl linguajob) pl)~0.01) () : </str> You can see the crux of your problem in this query string. It seems you have a query time synonym in place to *expand* linguajob.pl into [linguajob.pl] and [linguajob] [pl] but query time

Re: Solr Hanging all of sudden with update/csv

2011-03-08 Thread Jonathan Rochkind
My guess is that you're running out of RAM. Actual Java profiling is beyond me, but I have seen issues on updating that were solved by more RAM. If you are updating every few minutes, and your new index takes more than a few minutes to warm, you could be running into overlapping warming

Re: Solr Hanging all of sudden with update/csv

2011-03-08 Thread danomano
Actually this is definitely not a ram issue. I have visualVM connected and MAX Ram available for the JavaVM is ~7GB, but the system is only using ~5.5GB, with a MAX so far of 6.5GB consumed. I think..well I'm guessing the system hit a merge threshold, but I can't tell for sure..I have seen the

Re: Solr Hanging all of sudden with update/csv

2011-03-08 Thread Jason Rutherglen
The index size itself is about 270Gb, (we are hoping to support up to 500-1TB), and have supplied the system with ~3TB diskspace. That's simply massive for a single node. When the system tries to merge the segments the queries are probably not working? And the merges will take quite a while.

Re: Help with explain query syntax

2011-03-08 Thread Yonik Seeley
It's probably the WordDelimiterFilter: org.apache.solr.analysis.WordDelimiterFilterFactory args:{preserveOriginal: 1 splitOnCaseChange: 1 generateNumberParts: 1 catenateWords: 0 generateWordParts: 1 catenateAll: 0 catenateNumbers: 0 } Get rid of the preserveOriginal=1 in the query analyzer.

Custom search filters

2011-03-08 Thread Mark
Hi all, I am trying to use a custom search filter (org.apache.lucene.search.Filter) but I am unsure of where I should configure this. Would I have to create my own SearchHandler that would wrap this logic in? Any example/suggestions out there? Thanks

Re: Use of multiple tomcat instance and shards.

2011-03-08 Thread rajini maski
Thank you all. Tommaso, thanks. I will follow the links you suggested. Erick, it is Solr 1.4.1. Regards, Rajani Maski On Tue, Mar 8, 2011 at 10:16 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote: Just one more hint, I didn't mention it in the previous email since I imagine the

True master-master fail-over without data gaps

2011-03-08 Thread Otis Gospodnetic
Hello, What are some common or good ways to handle indexing (master) fail-over? Imagine you have a continuous stream of incoming documents that you have to index without losing any of them (or with losing as few of them as possible). How do you set up your masters? In other words, you can't

Re: NRT in Solr

2011-03-08 Thread Otis Gospodnetic
I think once this starts yielding matches: Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Jae Joo jaejo...@gmail.com To: solr-user@lucene.apache.org Sent: Tue, March 8, 2011 4:27:41 PM

Re: NRT in Solr

2011-03-08 Thread Otis Gospodnetic
I think once this starts yielding matches: trunk/solr$ find . -name \*java | xargs grep IndexReader | grep IndexWriter ...we'll know NRT has landed. Until then: http://wiki.apache.org/solr/NearRealtimeSearch Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem

RE: True master-master fail-over without data gaps

2011-03-08 Thread Jonathan Rochkind
I'd honestly think about buffering the incoming documents in some store that's actually made for fail-over persistence reliability, maybe CouchDB or something. And then that's taking care of not losing anything, and the problem becomes how we make sure that our Solr master indexes are kept in sync