Multi-words synonyms matching

2012-04-10 Thread elisabeth benoit
Hello, I've read several posts on this issue, but can't find a real solution to my multi-words synonyms matching problem. I have in my synonyms.txt an entry like mairie, hotel de ville and my index time analyzer is configured as follows for synonyms. The problem I have is that now "mairie" m

Re: SOLR hangs - update timeout - please help

2012-04-10 Thread rafal.gwizd...@gmail.com
Working for a week now, no signs of fatigue. Many thanks for all the hints R -- View this message in context: http://lucene.472066.n3.nabble.com/SOLR-hangs-update-timeout-please-help-tp3863851p3899004.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Why this document does not match?

2012-04-10 Thread Alexander Ramos Jardim
Ok. But I am not querying for fifa 12. I am querying fifa12. There are no white spaces :( 2012/4/9 Chris Hostetter > > : > itemNameSearch:fifa defaultSearchField:12 > > : That's exactly what's happening! Why does this happen? > > whitespace is meaningful to the query parser: it tells the query parse

Re: SolrCloud versus a SearchComponent that rescores

2012-04-10 Thread Benson Margulies
On Mon, Apr 9, 2012 at 9:36 PM, Mark Miller wrote: > Yeah, that's how it works - it ends up hitting the select request handler > (this might be overridable with shards.qt) All the params are passed along, > so in general, it will act the same as the top level req handler - but it can > the remo

Re: SolrCloud versus a SearchComponent that rescores

2012-04-10 Thread Benson Margulies
Another thought: currently I'm using qt=ME to indicate this process. I could, in theory, use some ME=true and make my components check for it to avoid this process, but it seems kind of peculiar from an end-user standpoint.

Re: How to facet data from a multivalued field?

2012-04-10 Thread Erik Hatcher
Thiago - You'll want your series field to be of type "string". If you also need that field searchable by the words within them, you can copyField to a separate "text" (or other analyzed) field type where you search on the tokenized field but facet on the "string" one. Erik On Apr 9,
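A rough illustration of why the facet field should be an untokenized "string" type while a separate analyzed copy handles word search. This is plain Python, not Solr code; the sample values and the whitespace-plus-lowercase "analyzer" are assumptions made only for the sketch.

```python
from collections import Counter

# Hypothetical multivalued "series" values for three documents.
docs = [
    ["Star Wars", "Star Trek"],
    ["Star Wars"],
    ["Doctor Who"],
]

def facet_counts(documents, analyze):
    """Count how many documents contain each indexed term."""
    counts = Counter()
    for values in documents:
        terms = set()
        for value in values:
            terms.update(analyze(value))
        counts.update(terms)
    return counts

# "string"-like field: the whole value is indexed as one term.
string_facets = facet_counts(docs, lambda v: [v])

# "text"-like field: crude whitespace tokenization plus lowercasing.
text_facets = facet_counts(docs, lambda v: v.lower().split())
```

Faceting on the string-like field yields whole series names as buckets, while the tokenized field yields one bucket per word, which is rarely what a facet list should show.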

RE: Choosing tokenizer based on language of document

2012-04-10 Thread Prakashganesh, Prabhu
Hi Dominique, Eric, Thanks for replying. At a high level, what I am trying to work out is pros and cons of different approaches to handling multi lingual content. From what I have read on the web, the most common/recommended way seems to be to split/shard by language, then each shard/ind

SOLR issue - too many search queries

2012-04-10 Thread arunssasidhar
We have a PHP web application which is using SOLR for searching. The APP is using CURL to connect to the SOLR server and runs in a loop with thousands of predefined keywords. That will create thousands of different search queries to SOLR at a given time. My issue is that, when a single user lo

RE: SOLR issue - too many search queries

2012-04-10 Thread Darren Govoni
My first reaction to your question is why are you running thousands of queries in a loop? Immediately, I think this will not scale well and the design probably needs to be re-visited. Second, if you need that many requests, then you need to seriously consider an architecture that supports it.

Re: Question on using dynamic fields

2012-04-10 Thread Erick Erickson
Not really, XPath isn't my strong suit, I'm afraid I'll have to defer to others. Best Erick On Mon, Apr 9, 2012 at 7:30 PM, Rakesh Varna wrote: > Hi Erick, >   The schema browser says that no dynamic fields were indexed. Any idea > how do I specify dynamic fields through XPath when I only know t

Re: SOLR issue - too many search queries

2012-04-10 Thread Yonik Seeley
On Tue, Apr 10, 2012 at 8:51 AM, arunssasidhar wrote: > We have a PHP web application which is using SOLR for searching. The APP is > using CURL to connect to the SOLR server and runs in a loop with > thousands of predefined keywords. That will create thousands of different > search queries to

Re: Multi-words synonyms matching

2012-04-10 Thread Erick Erickson
Have you tried the "=>" mapping instead? Something like hotel de ville => mairie might work for you. Best Erick On Tue, Apr 10, 2012 at 1:41 AM, elisabeth benoit wrote: > Hello, > > I've read several posts on this issue, but can't find a real solution to my > multi-words synonyms matching problem

Facets involving multiple fields

2012-04-10 Thread Marc SCHNEIDER
Hi, I'd like to make a faceted search using two fields. I want to have a single result and not a result by field (like when using facet.field=f1,facet.field=f2). I don't want to use a copy field either because I want it to be dynamic at search time. As far as I know this is not possible for Solr 3

Re: Multi-words synonyms matching

2012-04-10 Thread Markus Jelsma
To map `mairie` to `hotel de ville` as a single token you must escape your white space. mairie, hotel\ de\ ville This results in a problem if your tokenizer splits on white space at query time. On Tuesday 10 April 2012 16:39:21 Erick Erickson wrote: > Have you tried the "=>" mapping instead? So
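The caveat about query-time tokenization can be sketched roughly as follows. This is plain Python and not Solr's synonym parser; both helper functions are hypothetical stand-ins, and the only input taken from the thread is the escaped synonyms line.

```python
def parse_synonym_line(line):
    """Split a comma-separated synonyms line into entries,
    turning backslash-escaped spaces into literal spaces."""
    entries = [entry.strip() for entry in line.split(",")]
    return [entry.replace("\\ ", " ") for entry in entries]

def whitespace_tokenize(text):
    """Crude stand-in for a tokenizer that splits on white space."""
    return text.split()

# The escaped entry becomes one multi-word token.
entries = parse_synonym_line(r"mairie, hotel\ de\ ville")

# A query split on white space never produces that single token.
query_tokens = whitespace_tokenize("hotel de ville")
single_token_matches = any(token in entries for token in query_tokens)
```

Under these assumptions the escaped entry is stored as the single token "hotel de ville", but the query arrives as three separate tokens, so none of them equals the stored synonym.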

Securing Solr under Tomcat - IP best way?

2012-04-10 Thread Spadez
Hi, I’m in the process of working out how to configure and secure my server running Nginx, and Nutch and Solr under Tomcat. Is the best security practice for securing Solr under Tomcat simply to allow requests only from 127.0.0.1? This way Solr isn’t exposed to the outside world and is only compr

Re: Why this document does not match?

2012-04-10 Thread Erick Erickson
Can't say why this is happening, you haven't included your fieldType definition which would help. You might want to review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Tue, Apr 10, 2012 at 3:41 AM, Alexander Ramos Jardim wrote: > Ok. But I am not querying for fifa 12. I am query

Re: Securing Solr under Tomcat - IP best way?

2012-04-10 Thread Markus Jelsma
Hi, I'd certainly add firewall rules. In some cases also HTTP Auth. Nutch can authenticate to Solr so that's no problem. Cheers On Tuesday 10 April 2012 17:10:42 Spadez wrote: > Hi, > > I’m in the process of working how to configure and secure my server running > Nginx, and Nutch and Solr unde

Moving to Maven from Ant solr.build.dir Not Found

2012-04-10 Thread Eli Finkelshteyn
Hi Folks, I've been tasked with moving a Solr project I know little about from Ant to Maven. I've found all the dependencies I need and I'm not seeing any errors in my IDE. Everything compiles and installs just fine. Problem is, when I try to start things up in Jetty, I get errors. The first ma

custom query string parsing?

2012-04-10 Thread sam ”
I would like to transform the following: /myhandler/?colors=red&colors=blue&materials=leather to a query that is similar to: /select/?fq=colors:(red OR blue)&fq=materials:leather&facet=on&facet.field= and various default query params. I tried to do this by providing a QParserPlugin:
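The parameter rewrite being asked for can be sketched outside Solr, for example in a thin client-side layer. The function name and the default q=*:* are assumptions for illustration; only the input and the target filter queries come from the message.

```python
from urllib.parse import parse_qs, urlencode

def to_select_params(query_string):
    """Group repeated parameters into one fq clause per field.
    Values are used as-is; real input would need query escaping."""
    params = [("q", "*:*")]  # assumed default query, not from the thread
    for field, values in parse_qs(query_string).items():
        if len(values) > 1:
            clause = "(" + " OR ".join(values) + ")"
        else:
            clause = values[0]
        params.append(("fq", f"{field}:{clause}"))
    params.append(("facet", "on"))
    return params

params = to_select_params("colors=red&colors=blue&materials=leather")
select_url = "/select/?" + urlencode(params)
```

Doing the same inside Solr would mean building the equivalent parameter list in a custom handler instead of a URL string.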

Preserving punctuation tokens with ICUTokenizerFactory

2012-04-10 Thread Demian Katz
It has been brought to my attention that ICUTokenizerFactory drops tokens like the ++ in "The C++ Programming Language." Is there any way to persuade it to preserve these types of tokens? thanks, Demian

Re: Moving to Maven from Ant solr.build.dir Not Found

2012-04-10 Thread Eli Finkelshteyn
Update: was able to get rid of the lack of SolrUpdateServlet by moving back to version 3.5 from 4.0-SNAPSHOT (weird-- dunno why this is missing in 4.0), but the build dir thing is still a problem. I'm really not even sure what I should set that to. Eli On 4/10/12 11:30 AM, Eli Finkelshteyn wr

Re: Securing Solr under Tomcat - IP best way?

2012-04-10 Thread Spadez
Thank you for the reply. I hate to take more of people's time but can anyone elaborate more on the kind of firewall rules I should be looking at? -- View this message in context: http://lucene.472066.n3.nabble.com/Securing-Solr-under-Tomcat-IP-best-way-tp3899929p3900040.html Sent from the Solr -

RE: Moving to Maven from Ant solr.build.dir Not Found

2012-04-10 Thread Steven A Rowe
Hi Eli, The author of the blog post you mentioned appears to be unaware of the Maven POMs that are already included in Subversion for both Lucene and Solr. See . Because of the complex nature of the Ant build, which the Maven POMs cannot enti

Re: Securing Solr under Tomcat - IP best way?

2012-04-10 Thread Markus Jelsma
Accept only what you need (ports incoming/outgoing) for specific trusted clients. Decide for protocols such as ICMP, DNS, NTP, SSH and of course HTTP and drop all other coming in and reject going out. Beyond this you can also configure some protection for bad packets. There are plenty of guide

Re: Preserving punctuation tokens with ICUTokenizerFactory

2012-04-10 Thread Robert Muir
You can actually plug in customized grammars and stuff like that, but the simplest approach is to configure a MappingCharFilter before your tokenizer, with mappings like: "c++" => "cplusplus" On Tue, Apr 10, 2012 at 11:50 AM, Demian Katz wrote: > It has been brought to my attention that ICUTokenize
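The effect of rewriting characters before tokenization can be sketched as below. This is a plain-Python simulation, not the Solr char filter itself; the tokenizer is a crude stand-in, and the upper-case "C++" entry is an added assumption so that the sample title is covered.

```python
import re

# "c++" => "cplusplus" is the mapping quoted in the reply;
# the "C++" entry is an assumption added for this example.
CHAR_MAPPINGS = {"c++": "cplusplus", "C++": "cplusplus"}

def apply_mappings(text, mappings):
    """Replace each mapped string before the text reaches the tokenizer."""
    for source, target in mappings.items():
        text = text.replace(source, target)
    return text

def tokenize(text):
    """Crude stand-in tokenizer: keeps word characters, drops punctuation."""
    return re.findall(r"\w+", text.lower())

title = "The C++ Programming Language"
without_filter = tokenize(title)
with_filter = tokenize(apply_mappings(title, CHAR_MAPPINGS))
```

Without the mapping the "++" is lost and only "c" survives; with it, the title yields a distinct "cplusplus" token. The same mapping has to be applied at query time for searches to match.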

RE: how to correctly facet clothing multiple sizes and colors?

2012-04-10 Thread Robert Petersen
Well yes but in my experience people generally search for something particular... then select colors and sizes thereafter. -Original Message- From: danjfoley [mailto:d...@micamedia.com] Sent: Monday, April 09, 2012 4:18 PM To: solr-user@lucene.apache.org Subject: Re: how to correctly face

Re: SolrCloud versus a SearchComponent that rescores

2012-04-10 Thread Benson Margulies
I've updated the doc with my findings. Thanks for the pointer.

URP's versus Cloud

2012-04-10 Thread Benson Margulies
How are URP's managed with respect to cloud deployment? Given some solrconfig.xml like the below, do I expect it to be in the chain on the leader, the shards, or both? RNI

Re: URP's versus Cloud

2012-04-10 Thread Markus Jelsma
In this case on each node, order matters. If you, for example, define a standard SignatureUpdateProcessorFactory before the DistributedUpdateProcessorFactory you will end up with multiple values for the signature field. On Tue, 10 Apr 2012 12:43:36 -0400, Benson Margulies wrote: How are URP

Re: URP's versus Cloud

2012-04-10 Thread Benson Margulies
On Tue, Apr 10, 2012 at 1:08 PM, Markus Jelsma wrote: > In this case on each node, order matters. If you, for example, define a > standard SignatureUpdateProcessorFactory before the > DistributedUpdateProcessorFactory you will end up with multiple values for > the signature field. That seems to i

copyField after analyzer

2012-04-10 Thread srinir
Hi, I want to copy/append different fields to one field, while applying a different analyzer for each field. Let's assume I have specified different analyzers/filters for each of the above source fields cat, name, manu, features and includes. I read online that copyField copies before the a

Re: copyField after analyzer

2012-04-10 Thread Rafał Kuć
Hello! It's not possible with copy fields right now. As you wrote - copy fields are copied before analysis is done. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > Hi, > I want to copy/append different fields to one field, while applying a > different analy

RE: To truncate or not to truncate (group.truncate vs. facet)

2012-04-10 Thread Young, Cody
You may be able to fake your price requirements by rounding at index time. For instance, if you wanted 10-19$, 20-29$, 30+ then you create a second price field specifically for faceting, round down to 10, 20, 30 at index time and then facet on that field. Cody -Original Message- From:
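The index-time rounding described here could look something like the following sketch. The function name, the step of 10 and the cap at 30 are taken from the example ranges in the message or assumed for illustration; the result would be stored in a second, facet-only price field.

```python
def price_facet_bucket(price, step=10, top=30):
    """Round a price down to the start of its bucket, capped at `top`.
    Prices below `step` fall into bucket 0 in this sketch."""
    bucket = int(price // step) * step
    return min(bucket, top)

sample_prices = (12.50, 19.99, 20.00, 45.00, 300.00)
buckets = [price_facet_bucket(p) for p in sample_prices]
```

Faceting on the rounded field then gives one count per bucket value, and the display layer can label 10 as "10-19$", 20 as "20-29$" and 30 as "30+".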

Re: Moving to Maven from Ant solr.build.dir Not Found

2012-04-10 Thread Eli Finkelshteyn
Hey Steve, Thanks for the help! Ok, so per those instructions, I'm using a pom to pull dependencies from http://repository.apache.org/snapshots. Nonetheless, that weird solr.build.dir error still appears. Is there some place I need to specify this that I don't know about? Should a build dir be

RE: Moving to Maven from Ant solr.build.dir Not Found

2012-04-10 Thread Steven A Rowe
Eli, Could you please more fully describe what you're doing? Are you modifying Solr sources, and then compiling & installing the resulting modifications to your local Maven repository? Or do you have a project that doesn't include any Solr sources at all, but only depends on Solr artifacts pul

Re: org.apache.solr.common.SolrException: Internal Server Error

2012-04-10 Thread Chris Hostetter
: When I execute the code, it always meet the error: : Index starting.. : org.apache.solr.common.SolrException: Internal Server Error did you look at the Solr logs? did they give any indication what the error was? -Hoss

Re: org.apache.solr.common.SolrException: parsing error

2012-04-10 Thread Chris Hostetter
We need a lot more info ... starting with what your client code looks like. : I post a *.doc file to the solr server, but I always get the error: : org.apache.solr.common.SolrException: parsing error : at : org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryRespo

Re: Moving to Maven from Ant solr.build.dir Not Found

2012-04-10 Thread Eli Finkelshteyn
Hey Steven, I'm not modifying Solr sources at all. I just have a project that's built on top of Solr using ant. I'd like to move it to use maven instead of ant. The way I was going about this was just adding in all parts of Solr that it's using as dependencies in Maven. I wasn't using a local

Default qt on SolrCloud

2012-04-10 Thread Benson Margulies
After I load documents into my cloud instance, a URL like: http://localhost:PORT/solr/query?q=*:* finds nothing. http://localhost:PORT/solr/query?q=*:*&qt=standard finds everything. My custom request handlers have 'default="false"'. What have I done?

Re: To truncate or not to truncate (group.truncate vs. facet)

2012-04-10 Thread danjfoley
Good idea. In fact you could fake anything this way. Pre-render the facet values on input. On Tue, Apr 10, 2012 at 1:58 PM, Young, Cody [via Lucene] < ml-node+s472066n3900432...@n3.nabble.com> wrote: > You may be able to fake your price requirements by rounding at index time. > > For instance, if

Re: custom query string parsing?

2012-04-10 Thread sam ”
Essentially, this is what I want to do (I'm extending SearchComponent): @Override public void process(ResponseBuilder rb) throws IOException { final SolrQueryRequest req = rb.req; final MultiMapSolrParams requestParams = SolrRequestParsers.parseQueryString(req.getParamStri

Re: Using DateMath in Facet Label

2012-04-10 Thread Chris Hostetter
: a) Last Week : b) Last Month : c) Last Year : d) 2012 : e) 2011 or earlier ... : Of course, as 2013 rolls in, then the labels for the last two buckets : should change to “2013” and “2012 or earlier”. Is there any way to have : Solr return the correct year base

term frequency outweighs exact phrase match

2012-04-10 Thread alxsss
Hello, I use solr 3.5 with edismax. I have the following issue with phrase search. For example if I have three documents with content like 1. apache apache 2. solr solr 3. apache solr then a search for apache solr displays documents in the order 1, 2, 3 instead of 3, 2, 1 because term frequency in

Re: Range Queries -sfloat

2012-04-10 Thread Chris Hostetter
: query is price: [ 1 TO 20 ] is returning values out of this range ,like : 23.00 AND 55.00 .The field type of the price field is sfloat . can you provide more details about the documents matching out of the range? are you sure this isn't a multivalued field? : When I check this form admin Deb

RE: Moving to Maven from Ant solr.build.dir Not Found

2012-04-10 Thread Steven A Rowe
You didn't answer my question about where you are running "mvn jetty:run-exploded" - is it in your own project, or from the Solr sources? Exactly which Solr Maven artifacts are you including as dependencies in your project's POM? (Can you copy/paste the section?) > Basically, I was just d

Re: Moving to Maven from Ant solr.build.dir Not Found

2012-04-10 Thread Eli Finkelshteyn
I'm running mvn jetty:run-exploded on my own project. My dependencies are: org.apache.solr solr 4.0-SNAPSHOT war org.apache.solr solr-core 4.0-SNAPSHOT org.apache.solr solr-analysis-extras 4.0-SNAPSHOT org.apache.solr solr-commons-csv 4.0-SNAPSHOT org.apache.lucene lucene-core 4.0-SNAPS

RE: Moving to Maven from Ant solr.build.dir Not Found

2012-04-10 Thread Steven A Rowe
Eli, Sorry, I don't have any experience using Solr in this way. Has anybody else here successfully run Solr when it's included as a war dependency in an external Maven-based war project, by running "mvn jetty:run-exploded" from the external project? FYI, the nightly download page I pointed yo

Securing Solr with Tomcat

2012-04-10 Thread solruser
Hi All, Our web application allows users to query directly from the browser, with Solr running as a Tomcat application, through the AJAX Solr library (using JSONP). I'm looking for ways to block internet users from directly updating the index or hitting the admin pages. I’d appreciate your input on this.

Re: Suggester not working for digit starting terms

2012-04-10 Thread jmlucjav
I have double checked and still get the same behaviour. My field is: Analysis

I've broken delete in SolrCloud and I'm a bit clueless as to how

2012-04-10 Thread Benson Margulies
In my cloud configuration, if I push *:* followed by: I get no errors, the log looks happy enough, but the documents remain in the index, visible to /query. Here's what seems my relevant bit of solrconfig.xml. My URP only implements processAdd.

Re: Securing Solr with Tomcat

2012-04-10 Thread sam ”
http://wiki.apache.org/solr/SolrSecurity Make sure you block query params such as qt= https://issues.apache.org/jira/browse/SOLR-3161 is still open. This could be useful, too: http://www.nodex.co.uk/blog/12-03-12/installing-solr-debian-squeeze On Tue, Apr 10, 2012 at 4:25 PM, solruser wrote:

Re: solr analysis-extras configuration

2012-04-10 Thread N. Tucker
I'm still not exactly clear on why this is the case, but the problem turned out to be that the extra libs needed to be in my tomcat app's WEB-INF/lib directory, rather than ${solrhome}/lib. I don't really understand the distinction between the two, especially since Solr was reporting that it was l

Re: custom query string parsing?

2012-04-10 Thread Chris Hostetter
: Essentially, this is what I want to do (I'm extending SearchComponent): The level of request manipulation you seem to be interested in strikes me as something that you should do as a custom RequestHandler -- not a SearchComponent or a QParserPlugin. You can always subclass SearchHandler, and o

using solr to do a 'match'

2012-04-10 Thread Chris Book
Hello, I have a solr index running that is working very well as a search. But I want to add the ability (if possible) to use it to do matching. The problem is that by default it is only looking for all the input terms to be present, and it doesn't give me any indication as to how many terms in th

fq doesn't return any results

2012-04-10 Thread ZHANG Liang F
Hi, I had a field defined to store the location of a file: And the return value is something like: E:\my_project\ecmkit\test ... But when I try to filter the result by using fq, none of the following return any results: 1. &fq=path%3AE%3A%5Cmy_project%5Cecmkit%5Cinfotouch (org

Re: Large Index and OutOfMemoryError: Map failed

2012-04-10 Thread Gopal Patwa
Michael, Thanks for the response. It was 65K, which as you mention is the default value for "cat /proc/sys/vm/max_map_count". How do we determine what value this should be? Is it the number of documents during a hard commit (in my case it is 15 minutes)? Or is it the number of index files or the number of documents we have in all

Re: fq doesn't return any results

2012-04-10 Thread Chris Hostetter
: 1. &fq=path%3AE%3A%5Cmy_project%5Cecmkit%5Cinfotouch : (org.apache.lucene.queryParser.ParseException: Cannot parse : 'path:E:\my_project\ecmkit\infotouch': Encountered " ":" ": "" ) : : 2. &fq=path:"E:\my_project\ecmkit\test" (return 0 result) the problem in the first example is that even t
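The reply is cut off in this listing, so the following is not necessarily what was suggested. It is one common approach, sketched under the assumption that the Lucene query parser treats the colon and the backslash as syntax: escape each special character with a backslash before building the filter query. The helper name is hypothetical, and the result would still need URL encoding before being sent.

```python
# Characters treated as query syntax by the Lucene query parser.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&')

def escape_query_value(value):
    """Prefix every query-syntax character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in value)

path = r"E:\my_project\ecmkit\test"
fq = "path:" + escape_query_value(path)
```

Here the drive colon becomes `\:` and each backslash in the path is doubled, so the parser reads the whole path as a single term for the path field.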

RE: fq doesn't return any results

2012-04-10 Thread ZHANG Liang F
Thanks a lot! You did save me a lot of time! All the solutions you provided are working perfectly fine! -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: 2012年4月11日 11:41 To: solr-user@lucene.apache.org Subject: Re: fq doesn't return any results : 1. &fq

Re: Using DateMath in Facet Label

2012-04-10 Thread Charlie Maroto
Hi Chris, c) would cover the last year to the current date; as I write this, it would be the period between Apr 11, 2011 and Apr 10, 2012. Therefore, the period's begin and end dates would each move forward by one day tomorrow. d) represents the current calendar year, thus covering Jan 1, 2012 - Ap
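The date arithmetic behind buckets c) to e) can be sketched client-side as follows. The function names are hypothetical, and computing the labels in the application rather than in Solr is an assumption of this sketch; the expected dates come from the message itself.

```python
from datetime import date, timedelta

def last_year_window(today):
    """Rolling one-year period ending today, as described for bucket c)."""
    try:
        same_day_last_year = today.replace(year=today.year - 1)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year.
        same_day_last_year = today.replace(year=today.year - 1, day=28)
    return same_day_last_year + timedelta(days=1), today

def year_bucket_labels(today):
    """Labels for buckets d) and e), which roll over with the calendar year."""
    return str(today.year), f"{today.year - 1} or earlier"

window = last_year_window(date(2012, 4, 10))
labels = year_bucket_labels(date(2012, 4, 10))
```

For Apr 10, 2012 this gives the window Apr 11, 2011 to Apr 10, 2012 and the labels "2012" and "2011 or earlier"; once the date passes into 2013 the labels become "2013" and "2012 or earlier" without any configuration change.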

multicore solr with dublin core and SolrNet

2012-04-10 Thread pkoueik
Hello there, for a while now I have managed to understand Solr in terms of configuration, schemas and indexing from a database and from PDFs using Tika. My goal is to provide an eLibrary system on the .NET platform using Solr to fetch the searched words from the database and from flat files and get the result in

Re: using solr to do a 'match'

2012-04-10 Thread Li Li
It's not possible now because Lucene doesn't support this. When doing a disjunction query, it only records how many terms match this document. I think this is a common requirement for many users. I suggest Lucene should divide the scorer into a matcher and a scorer. The matcher just returns which doc is matche

[Solr 4.0] soft commit with API of Solr 4.0

2012-04-10 Thread Lyuba Romanchuk
Hi All, Is there a way to perform a soft commit from code in Solr 4.0? Is it possible only from solrconfig.xml through enabling autoSoftCommit with maxDocs and/or maxTime attributes? Thank you in advance. Best regards, Lyuba