Re: OS Cache - Solr

2011-10-20 Thread GR
I wonder how you manage 200 instances. On 20 Oct 2011, at 09:21, Sujatha Arun suja.a...@gmail.com wrote: Yes, 200 individual Solr instances, not Solr cores. We get an average response time of below 1 sec. The number of documents is not large in most of the instances; some of the instances have

Doing a search inside an UpdateRequestProcessor

2011-10-20 Thread Poulton, Gareth | Gareth | DU
Hi, I'm still fairly new to Solr, so please bear with me. I'm having some difficulty doing a search inside an update request processor (as in, solr.update.processor.UpdateRequestProcessorChain) that I'm trying to write. As far as I can see there are two main options, each of which is

Re: how was developed solr admin page and the UI part?

2011-10-20 Thread Erik Hatcher
It's certainly possible to develop a Solr admin (or search) UI using Spring MVC. But we won't be adding any Spring MVC to Solr itself; I'm not sure if you are proposing that Solr refactor the UI with that technology or not. Erik On Oct 20, 2011, at 06:26 , nagarjuna wrote: Thank u

Re: Dismax and phrases

2011-10-20 Thread Hyttinen Lauri
Thank you Otis for the answer. I've played around with the solr admin query interface and I've managed to confuse myself even more. If I query without the quotes solr seems to form two parsedqueries +((DisjunctionMaxQuery(( -first word stuff- )) DisjunctionMaxQuery(( -second word stuff- ))

Does anybody has experience in Chinese soundex(sounds like) of SOLR?

2011-10-20 Thread Floyd Wu
Hi there, There are many English soundex implementations that can be referenced, but I wonder how to build a Chinese soundex (sounds-like) filter. Any ideas? Floyd

How to return exact set of multivalue field

2011-10-20 Thread Ellery Leung
Hi all, I am using Solr 3.4 on Windows 7. Here is an example of a multivalued field: <doc> <arr name="field_name"> <str>387</str> <str>386</str> </arr> </doc> <doc> <arr name="field_name"> <str>387</str> <str>386</str> </arr> </doc> <doc> <arr name="field_name"> <str>387</str> <str>386</str> <str>385</str>

Using spellcheck component with query ( q and spellcheck.q )

2011-10-20 Thread Mark Swinson
I am using Solr 3.4.0 with the spellcheck component. When I search a pre-built spelling dictionary for a given misspelt word, it only works if I specify BOTH the q and spellcheck.q parameters. If I leave out the q parameter I get a NullPointerException. I believe I have seen reference

Re: How to return exact set of multivalue field

2011-10-20 Thread dan sutton
-field_name:[* TO 384] +field_name:[385 TO 386] -field_name:[387 TO *] On Thu, Oct 20, 2011 at 10:51 AM, Ellery Leung elleryle...@be-o.com wrote: Hi all I am using Solr 3.4 on Windows 7. Here is the example of a multivalue field: <doc> <arr name="field_name"> <str>387</str> <str>386</str>
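To illustrate the range-exclusion idea in the query above, here is a minimal Python sketch that mimics the three clauses against in-memory documents. It does not talk to Solr; the function name `matches_exact_range` and the sample documents are hypothetical, and it assumes the field values can be compared as integers.

```python
def matches_exact_range(values, low, high):
    """Mimic: -f:[* TO low-1] +f:[low TO high] -f:[high+1 TO *]."""
    has_required = any(low <= v <= high for v in values)  # the "+" clause
    has_below = any(v < low for v in values)              # first "-" clause
    has_above = any(v > high for v in values)             # second "-" clause
    return has_required and not has_below and not has_above


# Hypothetical documents, keyed by id, each with a multivalued integer field.
docs = {
    "a": [387, 386],
    "b": [386, 385],
    "c": [385],
    "d": [387, 386, 385],
}

hits = sorted(k for k, v in docs.items() if matches_exact_range(v, 385, 386))
print(hits)
```

Note that, as written, this accepts any document whose values all fall inside the range (document "c" holds only 385), so it may need tightening if every value in the set must be present.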

Re: Does anybody has experience in Chinese soundex(sounds like) of SOLR?

2011-10-20 Thread Otis Gospodnetic
Hi, Wow, interesting question.  Can soundex even be applied to a language like Chinese, which is tonal and doesn't have individual letters, but whole characters?  I'm no expert, but intuitively speaking it sounds hard or maybe even impossible...   Otis Sematext :: http://sematext.com/

Re: Error Instantiating QParserPlugin

2011-10-20 Thread Marco Martinez
It seems that the problem is the QParserPlugin2 class. Marco Martínez Bautista http://www.paradigmatecnologico.com Avenida de Europa, 26. Ática 5. 3ª Planta 28224 Pozuelo de Alarcón Tel.: 91 352 59 42 2011/10/20 karan.jindal1...@rediffmail.com hi, while to create customized query parser plugin

RE: How to return exact set of multivalue field

2011-10-20 Thread Ellery Leung
Thank you very much for your help! Follow-up question: what if it is a string instead of a number? While you can use [387 TO *] to find all numbers greater than or equal to 387, how do you find a specific set of strings? Thank you again for any help here. -Original Message- From: dan sutton

Re: Does anybody has experience in Chinese soundex(sounds like) of SOLR?

2011-10-20 Thread Ken Krugler
Wow, interesting question. Can soundex even be applied to a language like Chinese, which is tonal and doesn't have individual letters, but whole characters? I'm no expert, but intuitively speaking it sounds hard or maybe even impossible... The only two cases I can think of are: -

Re: Does anybody has experience in Chinese soundex(sounds like) of SOLR?

2011-10-20 Thread Paul Libbrecht
Wouldn't the conversion to a western writing followed by Soundex or Metaphone be the right thing to try? I thought such conversions were mainstream. paul Le 20 oct. 2011 à 12:16, Otis Gospodnetic a écrit : Hi, Wow, interesting question. Can soundex even be applied to a language like

Re: Does anybody has experience in Chinese soundex(sounds like) of SOLR?

2011-10-20 Thread Floyd Wu
Hi Ken, Indeed, I want to support something like phonetic (pinyin or zhuyin) search, not soundex (sorry, and thanks for correcting me). Any further ideas? Floyd 2011/10/20 Ken Krugler kkrugler_li...@transpac.com: Wow, interesting question.  Can soundex even be applied to a language like Chinese,

RE: how was developed solr admin page and the UI part?

2011-10-20 Thread Jaeger, Jay - DOT
It certainly is possible to develop search pages, update pages, etc. in any architecture you like: I think I'd suggest looking at SolrJ if you want to do that. http://wiki.apache.org/solr/Solrj PLEASE: Go read through the documentation and tutorial and browse through the Wiki and FAQ. It's

RE: Optimization /Commit memory

2011-10-20 Thread Jaeger, Jay - DOT
Well, since the OS RAM includes the JVM RAM, that is part of your requirement, yes? Aside from the JVM and normal OS requirements, all you need OS RAM for is file caching. Thus, for updates, the OS RAM is not a major factor. For searches, you want sufficient OS RAM to cache enough of the

RE: OS Cache - Solr

2011-10-20 Thread Jaeger, Jay - DOT
I wonder. What if, instead of 200 instances, you had one instance, but built a uniqueKey up out of whatever you have now plus whatever information currently segregates the instances. Then this would be much more manageable. In other words, what is different about each of the 200 instances?
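As a rough illustration of the composite uniqueKey idea above, the following Python sketch joins an instance identifier and a per-instance document id into one key. The separator `!`, the function names, and the sample ids are all assumptions made for this example, not anything taken from the thread.

```python
def composite_key(instance_id, doc_id, sep="!"):
    """Build one key from the instance id and the original document id."""
    if sep in instance_id:
        # Keep the first separator unambiguous so the key can be split again.
        raise ValueError("instance id must not contain the separator")
    return f"{instance_id}{sep}{doc_id}"


def split_key(key, sep="!"):
    """Recover (instance_id, doc_id) from a composite key."""
    instance_id, doc_id = key.split(sep, 1)
    return instance_id, doc_id


key = composite_key("instance042", "doc-17")
print(key)
print(split_key(key))
```

In such a setup each query would presumably also be restricted to one instance, for example with a filter on a field that stores the instance id.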

Re: Doing a search inside an UpdateRequestProcessor

2011-10-20 Thread Ahmet Arslan
I'm still fairly new to Solr, so please bear with me. I'm having some difficulty doing a search inside an update request processor (as in, solr.update.processor.UpdateRequestProcessorChain) that I'm trying to write. As far as I can see there are two main options, each of which is

Re: Question about near query order

2011-10-20 Thread Jason, Kim
Which performs better: setting inOrder=false in solrconfig.xml, or querying with "A B"~1 AND "B A"~1, if there is any performance difference? -- View this message in context: http://lucene.472066.n3.nabble.com/Question-about-near-query-order-tp3427312p3437701.html Sent from the Solr - User mailing

LUCENE-2208 (SOLR-1883) Bug with HTMLStripCharFilter, given patch in next nightly build?

2011-10-20 Thread Vadim Kisselmann
Hello folks, I have big problems with InvalidTokenOffsetExceptions when highlighting. It looks like a bug in HTMLStripCharFilter. H.Wang added a patch in LUCENE-2208, but nobody has had time to look at it. Could one of the committers please take a look at this patch and commit it or is this

Re: IndexBasedSpellChecker on multiple fields

2011-10-20 Thread Simone Tripodi
Hi James, sorry for the noise but I am not able to use the approach described; I'm sure I'm misconfiguring something. Basically, I have 2 fields, `abstract` and `subject`, and a field `master-dictionary` to which the first two have been copied. Then, in solrconfig.xml I configured the

how to handle large relational data in Solr

2011-10-20 Thread Jonathan Carothers
All, We are attempting to convert a fairly large relational database into Solr index(es). There are ~100,000 products with ~1,000,000 accessories that can be related to any number of the products. So if I include the search terms and the relationships in the same index, we're looking at a

RE: IndexBasedSpellChecker on multiple fields

2011-10-20 Thread Dyer, James
Here's approximately how I've got it set up to do essentially the same thing, in one of our production indexes: --- schema.xml has: <fieldType name="text_spelling" class="solr.TextField" positionIncrementGap="100"> { whitespaceanalyzer, stopwordfilter, worddelimiterfilter, lowercasefilter ...

RE: how to handle large relational data in Solr

2011-10-20 Thread Brandon Ramirez
I would not recommend removing your relational database altogether. You should treat that as your system of record. By replacing it, you are forcing Solr to store the unmodified value for everything even when not needed. You also lose normalization. And if you ever need to add some data to

BaseTokenFilterFactory not found in plugin

2011-10-20 Thread Michael Craig
I'm trying to write and use a custom filter for Solr. I've got a filter factory in myorg/solr/analysis/TestThingFilterFactory.java: package myorg.solr.analysis; import org.apache.lucene.analysis.TokenStream; import org.apache.solr.analysis.BaseTokenFilterFactory; import

Re: Merging Remote Solr Indexes?

2011-10-20 Thread Yury Kats
On 10/19/2011 5:15 PM, Darren Govoni wrote: Hi Otis, Yeah, I saw page, but it says for merging cores, which I presume must reside locally to the solr instance doing the merging? What I'm interested in doing is merging across solr instances running on different machines into a single

RE: how to handle large relational data in Solr

2011-10-20 Thread Jonathan Carothers
Agreed, this will just be a read only view of the existing database for search purposes. Sorry for the confusion. From: Brandon Ramirez [brandon_rami...@elementk.com] Sent: Thursday, October 20, 2011 10:50 AM To: solr-user@lucene.apache.org Subject: RE:

DocumentAnlysisRequestHandler expects a single content stream ...

2011-10-20 Thread dan whelan
Hi, I am trying to use the analysis request handler in php / curl and was wondering if anyone could help point me in the right direction. I would like to mimic this command-line example from the wiki but need to do it with a variable instead of with a file on the filesystem. curl

Issue with Shard configuration in solrconfig.xml (Solr 3.1)

2011-10-20 Thread Rahul Warawdekar
Hi, I am trying to evaluate distributed search for my project by splitting up our single index into 2 shards with Solr 3.1. When I query the first Solr server by passing the shards parameter, I get correct search results from both shards. (

Re: DocumentAnlysisRequestHandler expects a single content stream ...

2011-10-20 Thread dan whelan
I figured it out by passing this as the curl post fields: $postContent = array('wt' => 'json', 'indent' => 'true', 'stream.body' => $xmldoc); curl_setopt($ch, CURLOPT_POSTFIELDS, $postContent); On 10/20/11 8:15 AM, dan whelan wrote: Hi, I am trying to use the analysis request handler in php / curl
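For readers not using PHP, the same idea (sending the document as a `stream.body` form field instead of uploading a file) can be sketched in Python. This only builds the request body and sends nothing; the sample XML and the function name `build_analysis_post` are hypothetical. It also uses URL-encoded form data, whereas PHP given an array for CURLOPT_POSTFIELDS sends multipart form data, so treat it as an approximation.

```python
from urllib.parse import parse_qs, urlencode


def build_analysis_post(xml_doc):
    """Return a URL-encoded POST body carrying the document in stream.body."""
    fields = {
        "wt": "json",            # ask for a JSON response
        "indent": "true",        # pretty-print the response
        "stream.body": xml_doc,  # the document itself, passed as a parameter
    }
    return urlencode(fields).encode("utf-8")


# Hypothetical document held in a variable rather than a file on disk.
xml_doc = '<docs><doc><field name="id">1</field></doc></docs>'
body = build_analysis_post(xml_doc)

# Round-trip check: decoding the body gives back the original XML.
print(parse_qs(body.decode("utf-8"))["stream.body"][0] == xml_doc)
```

The resulting bytes could then be posted to the analysis handler with any HTTP client, using a form-encoded content type.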

Re: Issue with Shard configuration in solrconfig.xml (Solr 3.1)

2011-10-20 Thread Shawn Heisey
On 10/20/2011 9:33 AM, Rahul Warawdekar wrote: Hi, I am trying to evaluate distributed search for my project by splitting up our single index into 2 shards with Solr 3.1. When I query the first Solr server by passing the shards parameter, I get correct search results from both shards. (

Re: Issue with Shard configuration in solrconfig.xml (Solr 3.1)

2011-10-20 Thread Yury Kats
On 10/20/2011 11:33 AM, Rahul Warawdekar wrote: Hi, I am trying to evaluate distributed search for my project by splitting up our single index into 2 shards with Solr 3.1. When I query the first Solr server by passing the shards parameter, I get correct search results from both shards. (

Re: how to handle large relational data in Solr

2011-10-20 Thread Robert Stewart
If your documents are products, then 100,000 documents is a pretty small index for Solr. Do you know approximately how many accessories are related to each product on average? If that number is relatively small (around 100 or less), then it should be OK to create product documents with all the related

Re: Merging Remote Solr Indexes?

2011-10-20 Thread Darren Govoni
Interesting Yury. Thanks. On 10/20/2011 11:00 AM, Yury Kats wrote: On 10/19/2011 5:15 PM, Darren Govoni wrote: Hi Otis, Yeah, I saw page, but it says for merging cores, which I presume must reside locally to the solr instance doing the merging? What I'm interested in doing is merging

Query/Delete performance difference between straight HTTP and SolrJ

2011-10-20 Thread Shawn Heisey
I've got two build systems for my Solr index that I wrote. The first one is in Perl and uses GET/POST requests via HTTP, the second is in Java using SolrJ. I've noticed a performance discrepancy when processing every one of my delete records, currently about 25000 of them. It takes about 5

RE: how to handle large relational data in Solr

2011-10-20 Thread Jonathan Carothers
Actually, that's the root of my concern. It looks like each product will average ~20,000 associated accessories, still workable, but starting to look painful. Coming back the other way, I would guess each accessory would be associated with 100 products on average. Given that there would be

RE: Using spellcheck component with query ( q and spellcheck.q )

2011-10-20 Thread Dyer, James
Mark, The bug you describe looks the same as SOLR-2726 (https://issues.apache.org/jira/browse/SOLR-2726) which doesn't seem to be part of 3.4. You might want to try applying the patch, or better yet, just use a fresh check-out on the 3.x branch as the current state (for un-released 3.5)

Relevance for MoreLikeThis

2011-10-20 Thread entdeveloper
I'm using the MoreLikeThisHandler (http://wiki.apache.org/solr/MoreLikeThisHandler) to find similar documents. It doesn't immediately appear that there is any way to tweak the relevance for the similar results. By default, it sorts those by how *similar* they are to the original document. However,

inconsistent results when faceting on multivalued field

2011-10-20 Thread Alain Rogister
I am surprised by the results I am getting from a search in a Solr 3.4 index. My schema has a multivalued field of type 'string': <field name="qua_code" type="string" multiValued="true" indexed="true" stored="true"/> The field values are 7-digit or 9-digit integer numbers; this corresponds to a hierarchy.

org.apache.pdfbox.pdmodel.PDPage Error

2011-10-20 Thread MBD
Hi, I'm new to Solr and trying to get it to index PDFs. Having trouble getting started. Following examples in ExtractingRequestHandler wiki http://wiki.apache.org/solr/ExtractingRequestHandler. Got Solr running and it indexes html, xml txt files just fine...but when I try to feed it a .pdf

Stop fuzzy search

2011-10-20 Thread Andrew Clark
Hi, A Solr search for request gives me hits on documents containing requests, requesting, and requester. How can I turn this feature off so Solr will return only those documents containing request? Thanks, Andrew

Re: Stop fuzzy search

2011-10-20 Thread Otis Gospodnetic
Andrew, What you see is the result of stemming.  In Solr, certain types of fields get stemmed (e.g. text), while some do not (e.g. string, which doesn't even get analyzed). To turn off stemming, create a new field type in schema.xml and make sure not to specify any sort of stemming factory in
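To make the stemming effect described above concrete, here is a toy Python sketch. The suffix-stripping function is a deliberately naive stand-in, not the stemmer Solr uses, and the word list is just the example from the question.

```python
SUFFIXES = ("ing", "er", "s")  # toy rules only, not a real stemming algorithm


def toy_stem(word):
    """Strip one known suffix, keeping at least three leading characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


docs = ["requests", "requesting", "requester", "request"]

# With stemming, the query and the indexed words reduce to the same token.
stemmed_hits = [d for d in docs if toy_stem(d) == toy_stem("request")]

# Without stemming (as with a non-analyzed field), only the exact word matches.
exact_hits = [d for d in docs if d.lower() == "request"]

print(stemmed_hits)
print(exact_hits)
```

The first list contains all four words, the second only "request", which is the behaviour being asked for when stemming is left out of the field type.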

Re: how to handle large relational data in Solr

2011-10-20 Thread Otis Gospodnetic
Hi Jonathan, Not sure which version of Solr you are using, but look into Join functionality - hit #1: http://search-lucene.com/?q=join&fc_project=Solr Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/

Can Solr handle large text files?

2011-10-20 Thread Peter Spam
I have about 20k text files, some very small, but some up to 300MB, and would like to do text searching with highlighting. Imagine the text is the contents of your syslog. I would like to type in some terms, such as error and mail, and have Solr return the syslog lines with those terms PLUS

Highlighting misses some characters

2011-10-20 Thread docmattman
I have highlighting on in query. If I do a search for Apple, it will highlight Appl. If I do a search for deleted it will highlight delet, agreed will highlight agre. How can I get it to highlight the full term that I'm searching for and not leave off certain letters? I'm pretty new to Solr,

Re: Optimization /Commit memory

2011-10-20 Thread Sujatha Arun
Thanks, that helps. Regards Sujatha On Thu, Oct 20, 2011 at 6:23 PM, Jaeger, Jay - DOT jay.jae...@dot.wi.gov wrote: Well, since the OS RAM includes the JVM RAM, that is part of your requirement, yes? Aside from the JVM and normal OS requirements, all you need OS RAM for is file caching.

Re: OS Cache - Solr

2011-10-20 Thread Sujatha Arun
Yes, it's the same; we have a base static schema and, wherever required, we use dynamic. Regards, Sujatha On Thu, Oct 20, 2011 at 6:26 PM, Jaeger, Jay - DOT jay.jae...@dot.wi.gov wrote: I wonder. What if, instead of 200 instances, you had one instance, but built a uniqueKey up out of whatever you

Re: Optimization /Commit memory

2011-10-20 Thread Sujatha Arun
Just one more thing: when we are talking about optimization, we are referring to HD free space for replicating the index (2 or 3 times the index size). What is the role of RAM (OS) here? Regards Sujatha On Fri, Oct 21, 2011 at 10:12 AM, Sujatha Arun suja.a...@gmail.com wrote: Thanks that

hierarchical synonym

2011-10-20 Thread cmd
Does Solr support hierarchical synonyms? For example: animal -> dog cat bird. Using animal as the query term, the result set should contain animal, dog, cat, bird; using dog as the query term, the result set should contain only dog rather than the other words. Thanks. -- View this message in context:
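The behaviour asked for above is a one-way expansion: a parent term expands to its children, but a child term does not expand to its siblings or its parent. A minimal Python sketch of that behaviour follows; the `HIERARCHY` table and the function name are hypothetical, and this is not a Solr configuration.

```python
# Hypothetical parent -> children table taken from the example in the question.
HIERARCHY = {"animal": ["dog", "cat", "bird"]}


def expand_query_term(term):
    """Expand a parent term to itself plus its children; leave others alone."""
    return [term] + HIERARCHY.get(term, [])


print(expand_query_term("animal"))
print(expand_query_term("dog"))
```

Applying the expansion only to the query, and not to the indexed text, is what keeps a search for "dog" from also matching "cat" or "bird".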

Want to support did you mean xxx but is Chinese

2011-10-20 Thread Floyd Wu
Does anybody know how to implement this idea in Solr? Please kindly point me in a direction. For example, a user wants to enter the Chinese keyword 貝多芬 (this is Beethoven in Chinese) but keys in a wrong combination of characters, 背多分 (this has the same pronunciation as the previous keyword 貝多芬). There in
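One way to picture the idea above is to compare terms by a phonetic key rather than by their characters. The Python sketch below does this with a tiny hand-written, toneless pinyin table that covers only the characters in the example; the table, the function names, and the list of known terms are all assumptions, and a real system would need a full pronunciation dictionary.

```python
# Hypothetical, hand-written toneless pinyin for the example characters only.
PINYIN = {"貝": "bei", "多": "duo", "芬": "fen", "背": "bei", "分": "fen"}


def phonetic_key(text):
    """Map each character to its pinyin; unknown characters map to themselves."""
    return " ".join(PINYIN.get(ch, ch) for ch in text)


def did_you_mean(query, known_terms):
    """Suggest known terms that sound like the query but are written differently."""
    key = phonetic_key(query)
    return [t for t in known_terms if t != query and phonetic_key(t) == key]


known_terms = ["貝多芬", "莫札特"]
print(did_you_mean("背多分", known_terms))
```

Here the mistyped 背多分 and the intended 貝多芬 share the key "bei duo fen", so the intended term is offered as the suggestion.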