Re: How to disable wildcard search

2010-01-28 Thread Erik Hatcher
There's no option to do this with the standard query parser, though you can easily subclass it and override getPrefixQuery (for term* queries) and getWildcardQuery (for any other terms that have * or ? in them) to throw an exception or convert it to a different type of query. However, maybe
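
As an illustration only: the reply above describes subclassing the query parser on the server. The sketch below is a different, client-side technique that checks or escapes the wildcard characters before the query string is ever sent to Solr. The function names are hypothetical, and the regex deliberately ignores the edge case of an escaped backslash directly before a wildcard.

```python
import re

# Lucene query syntax treats '*' and '?' as wildcard characters.
# Match them only when they are not already preceded by a backslash.
_WILDCARD = re.compile(r"(?<!\\)[*?]")


def reject_wildcards(user_query):
    """Return the query unchanged, or raise if it has an unescaped wildcard."""
    if _WILDCARD.search(user_query):
        raise ValueError("wildcard queries are disabled")
    return user_query


def escape_wildcards(user_query):
    """Backslash-escape unescaped wildcards so they are matched literally."""
    return _WILDCARD.sub(lambda m: "\\" + m.group(0), user_query)
```

Rejecting mirrors the "throw an exception" option from the reply; escaping is closer to "convert it to a different type of query", since the term is then treated as literal text.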

Re: Solr + MySQL newbie question

2010-01-28 Thread Erik Hatcher
Solr has the DataImportHandler framework that allows a straightforward configuration to control indexing from any relational database (with JDBC support). See http://wiki.apache.org/solr/DataImportHandler for details. If you did go the Java route (though not recommended at this point),

Re: transformer or filter...which is better

2010-01-28 Thread Erik Hatcher
We need more details to evaluate - what exactly are you trying to accomplish? I'd tend towards the token filter approach since it would be more general purpose for other indexing routes. Erik On Jan 28, 2010, at 2:10 AM, Abin Mathew wrote: Hi When the same thing can be done

Which schema changes are incompatible?

2010-01-28 Thread Anders Melchiorsen
Hello. I read the FAQ entry about rebuilding the index, http://wiki.apache.org/solr/FAQ#How_can_I_rebuild_my_index_from_scratch_if_I_change_my_schema.3F but it is not clear about the times when this is needed. So I wonder, do I need to do it after adding a field, removing a field, changing

solr - katta integration

2010-01-28 Thread V SudershanReddy
Hi, Can we Integrate solr with katta? In order to overcome the limitations of Solr in distributed search, I need to integrate katta with solr, without losing any features of Solr. Any Suggestions? Any help appreciated. Thanks, Sudharshan

RE: solr with tomcat in cluster mode

2010-01-28 Thread ZAROGKIKAS,GIORGOS
Windows. I mentioned below that I use Windows Server 2008. -----Original Message----- From: Lance Norskog [mailto:goks...@gmail.com] Sent: Thursday, January 28, 2010 3:46 AM To: solr-user@lucene.apache.org Subject: Re: solr with tomcat in cluster mode Linux includes a load-balancer program

Re: Help using CachedSqlEntityProcessor

2010-01-28 Thread KirstyS
Okay, I changed my entity to look like this (have included my main entity as well): <document name="ArticleDocument"> <entity name="article" pk="CmsArticleId" query="Select * from vArticleSummaryDetail_SolrSearch (nolock) WHERE ArticleStatusId = 1"> <entity

index of facet fields are not same as original string

2010-01-28 Thread Solr user
Hi, I am new to Solr. I found that facet fields do not reflect the original string in the record. For example, the returned XML is: <doc> <str name="g_number">G-EUPE</str> </doc> <lst name="facet_counts"> <lst name="facet_queries"/> <lst name="facet_fields"> <lst name="g_number"> <int

Re: solr - katta integration

2010-01-28 Thread Marc Sturlese
have a look: http://issues.apache.org/jira/browse/SOLR-1395 V SudershanReddy wrote: Hi, Can we Integrate solr with katta? In order to overcome the limitations of Solr in distributed search, I need to integrate katta with solr, without losing any features of Solr.

Re: index of facet fields are not same as original string

2010-01-28 Thread Sergey Pavlikovskiy
Hi, it's probably because of stemming. If you need unstemmed text, you can use the 'textgen' data type for the field. Sergey On Thu, Jan 28, 2010 at 12:25 PM, Solr user uma.ravind...@yahoo.co.in wrote: Hi, I am new to Solr. I found facets fields does not reflect the original string in the

boost on certain keywords

2010-01-28 Thread murali k
Say I have a clothes store with ladies' clothes and men's clothes. When someone searches for clothes, I want to prioritize men's clothing results. How can I achieve this? This logic should only apply to this keyword; other keywords should work as-is. Should I be trying with something on synonyms

Re: boost on certain keywords

2010-01-28 Thread Shashi Kant
Look at Payload. On Thu, Jan 28, 2010 at 6:48 AM, murali k ilar...@gmail.com wrote: Say I have a clothes store,  i have ladies clothes, mens clothes when someone searches for clothes, i want to prioritize mens clothing results, how can I achieve this ? this logic should only apply for this

Re: boost on certain keywords

2010-01-28 Thread Shashi Kant
http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/ On Thu, Jan 28, 2010 at 6:54 AM, Shashi Kant sk...@sloan.mit.edu wrote: Look at Payload. On Thu, Jan 28, 2010 at 6:48 AM, murali k ilar...@gmail.com wrote: Say I have a clothes store,  i have ladies clothes, mens

Re: Help using CachedSqlEntityProcessor

2010-01-28 Thread Rolf Johansson
It's always a good thing if you can check the debug log (e.g. catalina.out) or run with debug/verbose to check how Solr runs through the dataconfig. You've also made a typo in the pk and query: LinkedCatAricleId is missing a t. /Rolf On 2010-01-28 11.20, KirstyS kirst...@gmail.com wrote:

Re: Help using CachedSqlEntityProcessor

2010-01-28 Thread KirstyS
Thanks, I saw that mistake and I have it working now! Thank you for all your help. Out of interest, are cacheKey and cacheLookup documented anywhere? Rolf Johansson-2 wrote: It's always a good thing if you can check the debug log (e.g. catalina.out) or run with debug/verbose to check

Re: Improvising solr queries

2010-01-28 Thread dipti khullar
Hi, I am back again with further queries. Just to check whether caching helps in rectifying our problem, we did a simple test: restarted the Solr slave and executed one of the heavy queries immediately to test the query response time. It was again high, around 700 ms, which means now no

Mavenizing solr webapp

2010-01-28 Thread Licinio Fernández Maurelo
Hi everybody. I'm trying to build apache-solr *webapp* (not the whole project) using maven . Also want to reuse the build.xml ant file. The directory structure is: +build +client +contrib . +src +webapp/src --webapp code +dist --generated artifacts by the ant script --must be copied

RE: How to Implement SpanQuery in Solr . . ?

2010-01-28 Thread Christopher Ball
Please forgive my ignorance, but I am still quite the newbie to both Lucene and Solr. I was hoping to start by getting a simple example working in SOLR and then iterate towards the more complex, given this is my first attempt at extending Solr. For my first iteration of SpanQuery in Solr I am

RE: Can Solr be forced to return all field tags for a document even if the field is empty?l

2010-01-28 Thread Turner, Robbin J
We've been building a UI which displays the results in a tabular format, so each of the fields that are available for search is presented to the user. We're still discussing where it's best to sort this out in our overall system design, such as the UI or application layer. Given our current

Re: Can Solr be forced to return all field tags for a document even if the field is empty?l

2010-01-28 Thread Erick Erickson
OK, that makes sense. I'd deal with this on the UI side personally. My reasoning is that it's poor design to make your search server jump through hoops because you want to write client-side code that can ignore edge cases. The UI changes and now you have to go back to the search code to accommodate

Re: Multiple Cores Vs. Single Core for the following use case

2010-01-28 Thread Matthieu Labour
Thanks a lot everybody for the responses ... I am going to do some practical/empirical testing and will report matt --- On Wed, 1/27/10, Tom Hill solr-l...@worldware.com wrote: From: Tom Hill solr-l...@worldware.com Subject: Re: Multiple Cores Vs. Single Core for the following use case To:

Re: Can Solr be forced to return all field tags for a document even if the field is empty?l

2010-01-28 Thread Erik Hatcher
You can use the Luke request handler to request all the fields in the index, in case that helps with your UI. Solr's Schema Browser uses that request handler to introspect. And Solr Flare also uses it to figure out which fields to show as facets. Erik On Jan 28, 2010, at 10:08
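
A minimal sketch of what a UI might do with the Luke request handler, offered as an assumption-laden illustration rather than the approach the poster took. It assumes the handler is registered at /admin/luke (as in the example configuration), that wt=json is available, and that the response carries a top-level "fields" object keyed by field name. The helper names are hypothetical.

```python
from urllib.parse import urlencode


def luke_url(solr_base, num_terms=0):
    """Build a Luke request URL; numTerms=0 skips the per-field top-terms lists."""
    query = urlencode([("numTerms", num_terms), ("wt", "json")])
    return solr_base.rstrip("/") + "/admin/luke?" + query


def field_names(luke_response):
    """Extract the sorted field names from an already-parsed Luke response.

    Assumes the response is a dict with a top-level "fields" mapping.
    """
    return sorted(luke_response.get("fields", {}))
```

With the full field list in hand, the tabular UI can render an empty cell for any field a given document does not return.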

Re: index of facet fields are not same as original string

2010-01-28 Thread Joe Calderon
Facets are based on the indexed version of your string, not the stored version. You probably have an analyzer that's removing punctuation. Most people index the same field multiple ways for different purposes: matching, sorting, faceting, etc. Index a copy of your field as string type and facet
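
To make the point above concrete, here is a toy simulation, not Solr code. The analyzer is a hypothetical stand-in for a text field's analysis chain (lowercase, split on punctuation), and the value "G-ABCD" is made-up sample data next to the "G-EUPE" value from the original question.

```python
import re
from collections import Counter


def toy_text_analyzer(value):
    # Hypothetical stand-in for a text field type: lowercase the value,
    # then split on anything that is not a letter or a digit.
    return [t for t in re.split(r"[^a-z0-9]+", value.lower()) if t]


def facet_counts(values, analyzer=None):
    """Count documents per indexed term; no analyzer mimics a string field."""
    counts = Counter()
    for value in values:
        terms = analyzer(value) if analyzer else [value]
        counts.update(set(terms))  # each document counts once per term
    return dict(counts)


docs = ["G-EUPE", "G-EUPE", "G-ABCD"]
text_facets = facet_counts(docs, toy_text_analyzer)
string_facets = facet_counts(docs)
```

The analyzed field facets on fragments such as "g" and "eupe", while the unanalyzed copy facets on the whole original value, which is why faceting on a string-typed copy of the field gives the expected labels.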

Knowledge about contents of a page

2010-01-28 Thread ram_sj
Hi, My question is about crawling. I know this is not relevant here, but I asked the Nutch people and didn't get any response, so I thought of posting here. I'm trying to crawl reviews for businesses. a. Is there any way to tell whether the content in a web page is reviews or not? Is it possible to do it

Re: Knowledge about contents of a page

2010-01-28 Thread Avlesh Singh
Classification? - http://en.wikipedia.org/wiki/Document_classification Cheers Avlesh On Fri, Jan 29, 2010 at 1:18 AM, ram_sj rpachaiyap...@gmail.com wrote: Hi, My question is about crawling, I know this is not relevant here, but I asked nutch people, didn't get any response, I just

Solr 1.4 Replication index directories

2010-01-28 Thread mark angelillo
Hi, We're using the new replication and it's working pretty well. There's one detail I'd like to get some more information about. As the replication works, it creates versions of the index in the data directory. Originally we had index/, but now there are dated versions such as

Re: solr - katta integration

2010-01-28 Thread Jason Rutherglen
Hi Reddy, What's the limitation you're running into? Jason On Thu, Jan 28, 2010 at 2:15 AM, V SudershanReddy vsre...@huawei.com wrote: Hi,  Can we Integrate solr with katta?  In order to overcome the limitations of Solr in distributed search, I need to integrate katta with solr, without

weird text stripping issue

2010-01-28 Thread javaxmlsoapdev
I am observing a very weird text stripping issue. When I search for the word "Search" I get the following: <doc> <str name="description">Issue 18 Search String</str> <int name="key">4688</int> <str name="title">Issue 18 Search String2</str> </doc> And the highlighting node: <lst name="4688"> <arr name="title"> <str>Issue 18

implementing profanity detector

2010-01-28 Thread Mike Perham
We'd like to implement a profanity detector for documents during indexing. That is, given a file of profane words, we'd like to be able to mark a document as safe or not safe if it contains any of those words so that we can have something similar to google's safe search. I'm trying to figure out

Re: Indexing TrieDateField Using Lucene

2010-01-28 Thread brad anderson
Thanks Yonik, I think committing code to Solr would not fix the problem. I don't want to have to go through the HTTP stack to create an index. I need to create various indices with different params for testing purposes. In this case, it's easiest to just use Lucene. I used the

Re: Indexing TrieDateField Using Lucene

2010-01-28 Thread Yonik Seeley
On Thu, Jan 28, 2010 at 4:58 PM, brad anderson solrinter...@gmail.com wrote: I think committing code to solr would not fix the problem. I don't want to have to go through the HTTP stack to create an index. Why not? If you use something like SolrJ, it's an implementation detail if there is a

Re: implementing profanity detector

2010-01-28 Thread Otis Gospodnetic
How about this crazy idea - a custom TokenFilter that stores the safe flag in ThreadLocal? Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Hadoop ecosystem search :: http://search-hadoop.com/ - Original Message From: Mike Perham mper...@onespot.com To:

Re: Indexing TrieDateField Using Lucene

2010-01-28 Thread brad anderson
I like the embedded solr client suggestion. I'll try that one out. I don't think sending a CSV file will work for my case, I will have to go generate a CSV then index the CSV, as opposed to indexing while generating my content. Thanks 2010/1/28 Yonik Seeley yo...@lucidimagination.com On Thu,

Re: How can I boost bq in FieldQParserPlugin?

2010-01-28 Thread Wangsheng Mei
Hi Chris, thanks for the suggestions. q=ipod&bq={!dismax qf=userId^0.5 v=$qq}&qq=12345&qt=dismax I've tried your suggested query above; unfortunately, it does not work. I glanced at the error message, and the Infinite Recursion error suggests that the dismax query parser is adding bq multiple times.

Re: How can I boost bq in FieldQParserPlugin?

2010-01-28 Thread Wangsheng Mei
Is it problematic to use dismax as an additional boost query in the bq param of an upper-level dismax query? My purpose is very simple: I want to use a dismax query to search the title and content text fields while still boosting it with another field value, which is the original intention why the bq

Re: How can I boost bq in FieldQParserPlugin?

2010-01-28 Thread Yonik Seeley
2010/1/28 Wangsheng Mei hairr...@gmail.com: q=ipod&bq={!dismax qf=userId^0.5 v=$qq}&qq=12345&qt=dismax I've tried your suggested query above, unfortunately, it does not work out. I glanced a bit on the error message, the Infinite Recursion error seems that dismax query parser are adding bq

Re: How can I boost bq in FieldQParserPlugin?

2010-01-28 Thread Wangsheng Mei
nice trick, Yonik. 2010/1/29 Yonik Seeley yo...@lucidimagination.com 2010/1/28 Wangsheng Mei hairr...@gmail.com: q=ipod&bq={!dismax qf=userId^0.5 v=$qq}&qq=12345&qt=dismax I've tried your suggested query above, unfortunately, it does not work out. I glanced a bit on the error message,

Re: Dismax Infinite recursion Error

2010-01-28 Thread Wangsheng Mei
Resolved by Yonik's suggestion of adding an empty bq inside the embedded dismax: http://myhost/solr/select?q=ipod&bq={!dismax%20qf=userId^0.5%20v=$qq%20bq=}&qq=123&qt=dismax 2010/1/29 Wangsheng Mei hairr...@gmail.com Hi, All. I have
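
The request above is easy to get wrong by hand because of the braces, caret, dollar sign, and spaces inside bq. A minimal sketch of building it with the standard library follows; the helper name and the default boost are illustrative assumptions, while the parameter names and the empty inner bq come from the message above.

```python
from urllib.parse import urlencode


def nested_dismax_url(base_url, user_query, user_id, boost=0.5):
    """Build a dismax request whose bq is itself a nested dismax query.

    The trailing empty 'bq=' local param stops the inner dismax from
    inheriting the outer bq, which is what caused the infinite recursion.
    """
    params = [
        ("q", user_query),
        ("bq", "{!dismax qf=userId^%s v=$qq bq=}" % boost),
        ("qq", str(user_id)),
        ("qt", "dismax"),
    ]
    return base_url + "?" + urlencode(params)


url = nested_dismax_url("http://myhost/solr/select", "ipod", 123)
```

urlencode percent-encodes the special characters, so the parameter boundaries stay unambiguous whatever the user typed.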

RE: weird text stripping issue

2010-01-28 Thread Ankit Bhatnagar
Check your analyzers Ankit -----Original Message----- From: javaxmlsoapdev [mailto:vika...@yahoo.com] Sent: Thursday, January 28, 2010 4:46 PM To: solr-user@lucene.apache.org Subject: weird text stripping issue I am observing very weird text stripping issue. when I search for word Search I

Re: index of facet fields are not same as original string

2010-01-28 Thread Solr user
Hi Sergey, In schema.xml, I have got by default: <!-- A general unstemmed text field - good if one does not know the language of the field --> <fieldType name="textgen" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer

Re: Solr 1.4 Replication index directories

2010-01-28 Thread mark angelillo
Thanks, Otis. Responses inline. Hi, We're using the new replication and it's working pretty well. There's one detail I'd like to get some more information about. As the replication works, it creates versions of the index in the data directory. Originally we had index/, but now there are

RE: weird text stripping issue

2010-01-28 Thread javaxmlsoapdev
Analyzers are default. Anything in particular to look for? ANKITBHATNAGAR wrote: Check your analyzers Ankit -----Original Message----- From: javaxmlsoapdev [mailto:vika...@yahoo.com] Sent: Thursday, January 28, 2010 4:46 PM To: solr-user@lucene.apache.org Subject: weird text

Re: How can I boost bq in FieldQParserPlugin?

2010-01-28 Thread Wangsheng Mei
Although the infinite recursion has disappeared, there is another problem. q=ipod&bq={!dismax qf=userId^0.5 v=$qq bq=}&qq=12345&qt=dismax&debugQuery=on I tried to debug the above query, and it turned out to be: +DisjunctionMaxQuery((content:ipod | title:ipod^4.0)~0.01) ()

Re: weird text stripping issue

2010-01-28 Thread Erick Erickson
Default doesn't tell us much, especially since you haven't told us what version you're using. Please post the relevant parts of your schema. That said, WordDelimiterFilterFactory is a popular one to mis-interpret, see: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters HTH Erick On Thu,

boosting unexpired documents

2010-01-28 Thread Andy
My documents have a field expiration that is the expiration date of that doc. I want to give a boost to all documents that haven't expired. I still want to have expired documents returned, but unexpired documents should be given priority. Ideally the boost amount for all unexpired documents

Querying for multi-term phrases only . . .

2010-01-28 Thread Christopher Ball
I am curious how I can query for multi-term phrases using the TermsComponent? The field I am searching has been shingled so it contains 2 and 3 word phrases. For example in the sample results below I want to only get back multi-word phrases such as table of contents and under the but not

Re: solr with tomcat in cluster mode

2010-01-28 Thread Lance Norskog
Ah, ok. There may be a load-balancer program for Windows. Also, the SolrJ client library includes a feature to load-balance its requests. If you write your app using this library, you're set. On Thu, Jan 28, 2010 at 2:17 AM, ZAROGKIKAS,GIORGOS g.zarogki...@multirama.gr wrote: Windows   I

Re: Improvising solr queries

2010-01-28 Thread Lance Norskog
The listener firstSearcher and newSearcher events describe queries to run when Solr starts and when it receives an updated index. When you start Solr, the firstSearcher queries are immediately run. You should put queries there that you want to warm up for your searches. For example, the date range

Re: implementing profanity detector

2010-01-28 Thread Lance Norskog
You could have a synonym file that, for each dirty word, changes the word into an impossible word: for example, xyzzy. Then, a search for clean contents is: (user search) AND NOT xyzzy A synonym filter that included payloads would be cool. On Thu, Jan 28, 2010 at 2:31 PM, Otis Gospodnetic
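
A small sketch of the two pieces this suggestion needs, under stated assumptions: the synonym file uses the explicit "a => b" mapping syntax and is applied at index time, and the marker word is searched in the same default field as the user's query. The function names are hypothetical; "xyzzy" is the marker from the message above.

```python
MARKER = "xyzzy"  # the "impossible word" every profane term is rewritten to


def synonym_lines(profane_words, marker=MARKER):
    """Produce one explicit-mapping synonym line per non-empty word."""
    return [
        "%s => %s" % (word.strip().lower(), marker)
        for word in profane_words
        if word.strip()
    ]


def safe_query(user_query, marker=MARKER):
    """Wrap the user's query so documents containing the marker are excluded."""
    return "(%s) AND NOT %s" % (user_query, marker)
```

Writing the lines from synonym_lines to the synonym file and reindexing would tag every affected document with the marker term, after which safe_query gives the "safe search" variant of any user query.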

Re: Help using CachedSqlEntityProcessor

2010-01-28 Thread Noble Paul നോബിള്‍ नोब्ळ्
Thanks for pointing this out. The wiki had a problem for a while and we could not update the documentation. It is updated here http://wiki.apache.org/solr/DataImportHandler#cached On Thu, Jan 28, 2010 at 6:31 PM, KirstyS kirst...@gmail.com wrote: Thanks, I saw that mistake and I have it

Re: index of facet fields are not same as original string

2010-01-28 Thread Lance Norskog
After you change the schema.xml file, you have to rebuild the index completely. At that point, g_number fields should not be stemmed. You can examine what these text field types do.

Re: Solr 1.4 Replication index directories

2010-01-28 Thread Noble Paul നോബിള്‍ नोब्ळ्
The index.20100127044500/ directory is a temp directory that should have been cleaned up if there was no problem in replication (see the logs to check whether there was a problem). If there is a problem, the temp directory will be used as the new index directory and the old one will no longer be used. At any given point only one

Newbie Question on Custom Query Generation

2010-01-28 Thread Abin Mathew
Hi I want to generate my own customized query from the input string entered by the user. It should look something like this *Search field : Microsoft* * Generated Query* : description:microsoft +((tags:microsoft^1.5 title:microsoft^3.0 role:microsoft requirement:microsoft company:microsoft

Re: boosting unexpired documents

2010-01-28 Thread Lance Norskog
You add a range query on the date, and boost documents within that date range. Check out the 'boost query' feature of dismax. http://www.lucidimagination.com/search/document/CDRG_ch07_7.4.2.9 It's also possible with the standard query parser but a pain in the neck: (value)^2 OR (NOT value)
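
A minimal sketch of the dismax boost-query approach described above. The field name "expiration" comes from the original question; the boost value, the helper name, and the use of qt=dismax (which assumes a request handler with that name is configured) are illustrative assumptions.

```python
from urllib.parse import urlencode


def unexpired_boost_params(user_query, field="expiration", boost=2.0):
    """Build request params that boost documents expiring now or later.

    Expired documents still match the main query; they simply do not
    receive the extra score contributed by the bq clause.
    """
    return urlencode([
        ("q", user_query),
        ("qt", "dismax"),
        ("bq", "%s:[NOW TO *]^%s" % (field, boost)),
    ])


query_string = unexpired_boost_params("concert tickets")
```

Because bq adds to the score rather than filtering, raising or lowering the boost value controls how strongly unexpired documents are preferred.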

Re: boosting unexpired documents

2010-01-28 Thread Andy
Ah, thank you! --- On Fri, 1/29/10, Lance Norskog goks...@gmail.com wrote: From: Lance Norskog goks...@gmail.com Subject: Re: boosting unexpired documents To: solr-user@lucene.apache.org Date: Friday, January 29, 2010, 12:32 AM You add a range query on the date, and boost documents