Re: Converting German special characters / umlaute

2007-09-28 Thread Thorsten Scherler
On Thu, 2007-09-27 at 13:26 -0400, J.J. Larrea wrote: At 12:13 PM -0400 9/27/07, Steven Rowe wrote: Chris Hostetter wrote: ... As for implementation, the first part could easily and flexibly be accomplished with the current PatternReplaceFilter, and I'm thinking the second could be done with

Color search

2007-09-28 Thread Guangwei Yuan
Hi, We're running an e-commerce site that provides product search. We've been able to extract colors from product images, and we think it'd be cool and useful to search products by color. A product image can have up to 5 colors (from a color space of about 100 colors), so we can implement it

Indexing without application server

2007-09-28 Thread Jae Joo
Hi, I have several million documents to index and am looking for a way to index them without a J2EE application server. It is not incremental indexing; it is a kind of index once, use forever job, all in batch mode. I guess that if there is a way to index without J2EE, it may be much faster...

Re: Color search

2007-09-28 Thread Steven Rowe
Hi Guangwei, When you index your products, you could have a single color field, and include duplicates of each color component proportional to its weight. For example, if you decide to use 10% increments, for your black dress with 70% of black, 20% of gray, 10% of brown, you would index the
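A minimal sketch of the duplication idea in the message above, assuming color weights arrive as integer percentages and that the single color field is fed a whitespace-separated string. The function name and the input dictionary are hypothetical, not part of Solr:

```python
def weighted_color_tokens(color_percents, step=10):
    """Repeat each color name once per `step` percent of its weight."""
    tokens = []
    for color, percent in color_percents.items():
        # Integer round-half-up, so 70% with step 10 gives exactly 7 copies.
        repeats = (percent + step // 2) // step
        tokens.extend([color] * repeats)
    return " ".join(tokens)


# The black dress from the example: 70% black, 20% gray, 10% brown.
dress = {"black": 70, "gray": 20, "brown": 10}
print(weighted_color_tokens(dress))
```

With 10% increments the field value contains "black" seven times, "gray" twice and "brown" once, so term frequency stands in for the color weight.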

Re: Color search

2007-09-28 Thread Grant Ingersoll
Another option would be to extend Solr (and donate back) to incorporate Lucene's payload functionality, in which case you could associate the percentile of the color as a payload and use the BoostingTermQuery... :-) If you're interested in this, a discussion on solr-dev is probably

locallucene former custom-sort thread

2007-09-28 Thread patrick o'leary
Changing thread name; Are you using local lucene or local solr, and which version? P [EMAIL PROTECTED] wrote: i have been testing locallucene with our data for the last couple of days. one issue i faced when using geo sorting is that it seems to eat up all the

Re: Color search

2007-09-28 Thread Yonik Seeley
If it were just a couple of colors, you could have a separate field for each color and then index the percent in that field. black:70 grey:20 and then you could use a function query to influence the score (or you could sort by the color percent). However, this doesn't scale well to a large

RE: custom sorting

2007-09-28 Thread Sandeep Shetty
i have been testing locallucene with our data for the last couple of days. one issue i faced when using geo sorting is that it seems to eat up all the memory, however big, and becomes progressively slower; finally, after several requests (10 or so in my case), it throws up a

RE: locallucene former custom-sort thread

2007-09-28 Thread Sandeep Shetty
Hi, i'm using local lucene, downloaded the latest zip file solr-example_s1.3_ls0.2.tgz is there a newer version available? Thanks! Sandeep -Original Message- From: patrick o'leary [mailto:[EMAIL PROTECTED] Sent: 28 September 2007 16:08 To: solr-user@lucene.apache.org Subject:

RE: Color search

2007-09-28 Thread Renaud Waldura
Here's another idea: encode color mixes as one RGB value (32 bits) and sort according to those values. To find the closest color is like finding the closest points in the color space. It would be like a distance search. 70% black #000000 = #000000 20% gray #f0f0f0 = #303030 10% brown #8b4513 = #0e0702

Re: locallucene former custom-sort thread

2007-09-28 Thread patrick o'leary
That's the latest. I was experimenting with caching, which might be the problem. I'll have a look, could you give me an idea of how large the radius was and how many results were coming back. Thanks P Sandeep Shetty wrote: Hi, i'm using local lucene, downloaded the latest zip file

Re: custom sorting

2007-09-28 Thread Narayanan Palasseri
Hi all, Regarding this issue, we tried using a custom request handler which in turn uses the CustomCompartor. But this has a memory leak, and we are almost stuck at that point. As somebody mentioned, we are thinking of moving towards a function query to achieve the same. Please let me know

Re: Indexing without application server

2007-09-28 Thread Walter Underwood
I do not think it will be much faster. The data transfer time is small compared to the indexing time. The indexing will probably take less than a day, so if you spend more than 30 minutes coding a faster method, the project will take longer. wunder On 9/28/07 6:06 AM, Jae Joo [EMAIL PROTECTED]

one query or multiple queries

2007-09-28 Thread Xuesong Luo
Hi, there, I have a user index(each user has a unique index record). If I want to search 10 users, should I run 10 queries or 1 query with multiple user ids? Is there any performance difference? Thanks Xuesong

RE: locallucene former custom-sort thread

2007-09-28 Thread Sandeep Shetty
also probably a point to consider, the index has about 2.9 million records in total -Original Message- From: Sandeep Shetty Sent: 28 September 2007 17:15 To: 'solr-user@lucene.apache.org' Subject: RE: locallucene former custom-sort thread yes i was thinking about the same. i was

Re: custom sorting

2007-09-28 Thread Jon Pierce
Is the machinery in place to do this now (hook up a function query to be used in sorting)? I'm trying to figure out what's the best way to do a distance sort: custom comparator or function query. Using a custom comparator seems straightforward and reusable across both the standard and dismax

RE: locallucene former custom-sort thread

2007-09-28 Thread Sandeep Shetty
yes i was thinking about the same. i was searching for a radius of 25 miles. we get about 2500 results back for the search. it seems like it's storing all those geo results in cache and it keeps on adding to it each time a geo request is made... thanks for looking into it! Sandeep -Original

Dismax and Grouping query

2007-09-28 Thread Ty Hahn
Hi, I've tried to use a grouping query on DisMaxRequestHandler without success. When I sent a grouping query in Solr Admin, I could see the parens of the query escaped in the 'querystring' line with debugQuery on. Is this the cause of the failure? e.g. When I send a query like +(lucene solr), I can see the following

Re: searching remote indexes

2007-09-28 Thread Venkatraman S
resending due to lack of response : [We are using embedded solr 1.2 ] I need a mechanism by which i can search over 3 remote indexes? Can i use the Lucene remote apis to access the index created via Embedded solr? -Venkat On 9/4/07, Venkatraman S [EMAIL PROTECTED] wrote: Hi, [I am new to

Re: Color search

2007-09-28 Thread Steven Rowe
Hi Renaud, I think your method will produce strange results, probably in most cases, e.g. 33% red #FF0000 = #550000 33% green #00FF00 = #005500 33% blue #0000FF = #000055 = #555555 Thus, red, green and blue dress would score well against a search for medium gray. Not good. Steve Renaud
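The objection above can be checked with a few lines of arithmetic. This is only a sketch of the weighted-RGB encoding as described in the thread, assuming each color is scaled by its share and the channels are summed; the function name is hypothetical and "33%" is taken to mean one third:

```python
def blend(components):
    """Weighted per-channel sum of (hex_color, weight) pairs, as a hex string."""
    totals = [0.0, 0.0, 0.0]
    for hex_color, weight in components:
        value = hex_color.lstrip("#")
        for i in range(3):
            # Parse one two-digit channel (R, G or B) and scale it by the weight.
            channel = int(value[2 * i:2 * i + 2], 16)
            totals[i] += channel * weight
    return "#" + "".join(f"{min(255, round(t)):02x}" for t in totals)


third = 1 / 3
tricolor = blend([("#ff0000", third), ("#00ff00", third), ("#0000ff", third)])
gray = blend([("#555555", 1.0)])
print(tricolor, gray, tricolor == gray)
```

A dress that is one third each of pure red, green and blue encodes to the same value as a solid medium-gray dress, so the two cannot be told apart after encoding.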

Re: one query or multiple queries

2007-09-28 Thread Ian Lea
I'd guess the latter would be faster, but who knows? Try it both ways. -- Ian. On 9/28/07, Xuesong Luo [EMAIL PROTECTED] wrote: Hi, there, I have a user index(each user has a unique index record). If I want to search 10 users, should I run 10 queries or 1 query with multiple user ids? Is
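For the single-query option, a rough sketch of how the ten lookups could be folded into one query string using the standard Lucene `field:(a OR b)` grouping syntax. The field name `id` and the helper are assumptions for illustration, and the IDs are assumed to contain no query-syntax special characters:

```python
def build_id_query(user_ids, field="id"):
    """Combine several ID lookups into one boolean OR query string."""
    if not user_ids:
        raise ValueError("at least one user id is required")
    clause = " OR ".join(user_ids)
    return f"{field}:({clause})"


print(build_id_query(["u1", "u2", "u3"]))
```

Timing this against separate queries on real data, as suggested above, is still the only way to know which is faster for a given index.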

Re: Color search

2007-09-28 Thread Matthew Runo
This discussion is incredibly interesting to me! We solved this simply by indexing the color names, and faceting on that. Not a very elegant solution, to be sure - but it works. If people search for a green running shoe they get -green- running shoes. I would be very very interested in

Index multiple languages with multiple analyzers with the same field

2007-09-28 Thread Wu, Daniel
Hi, I know this probably has been asked before, but I was not able to find it in the mailing list. So forgive me if I repeated the same question. We are trying to build a search application to support multiple languages. Users can potentially query with any language. First thought come to

Re: Index multiple languages with multiple analyzers with the same field

2007-09-28 Thread Mike Klaas
On 28-Sep-07, at 11:13 AM, Wu, Daniel wrote: Hi, I know this probably has been asked before, but I was not able to find it in the mailing list. So forgive me if I repeated the same question. This thread hashes out the issues in quite a lot of detail:

Re: searching remote indexes

2007-09-28 Thread Mike Klaas
Solr's main interface is http, so you can connect to that remotely. Query each machine and combine the results using your own business logic. Alternatively, you can try out the query distribution code being developed in http://issues.apache.org/jira/browse/SOLR-303 -Mike On 28-Sep-07, at
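One possible shape for the "combine the results" step, as a sketch only. It assumes each remote index has already been queried over HTTP and has returned `(doc_id, score)` pairs sorted by descending score; the function name and data layout are hypothetical. Scores from separate indexes are generally not directly comparable, so treating them as such is a simplification:

```python
import heapq
from itertools import islice


def merge_results(result_lists, limit=10):
    """Merge per-index hit lists (each sorted by descending score) into one ranking."""
    merged = heapq.merge(*result_lists, key=lambda hit: hit[1], reverse=True)
    return list(islice(merged, limit))


index_a = [("a1", 0.9), ("a2", 0.4)]
index_b = [("b1", 0.7)]
index_c = [("c1", 0.8), ("c2", 0.1)]
print(merge_results([index_a, index_b, index_c], limit=3))
```

Because `heapq.merge` is lazy, only as many hits as `limit` requires are pulled from each list.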

Re: Index multiple languages with multiple analyzers with the same field

2007-09-28 Thread Thom Nelson
I had the same problem, but never found a good solution. The best solution is to have a more dynamic way of determining which analyzer to return, such as having some kind of conditional expression evaluation in the fieldType/analyzer element, where either the document or the query request

Re: Color search

2007-09-28 Thread Chris Hostetter
: useful to search products by color. A product image can have up to 5 colors : (from a color space of about 100 colors), so we can implement it easily with : Solr's facet search (thanks all who've developed Solr). : : The problem arises when we try to sort the results by the color relevancy. :

Re: Request for graphics

2007-09-28 Thread Chris Hostetter
: I am trying to make a presentation on SOLR and have been unable to find the : SOLR graphic in high quality. Could someone point me in the right direction : or provide the graphics? you're right -- i can't find the original source files for it in subversion. I think i know who made it (here

Schema version question

2007-09-28 Thread Robert Purdy
I was wondering if anyone could help me, I just completed a full index of my data (about 4 million documents) and noticed that when I was first setting up the schema I set the version number to 1.2 thinking that solr 1.2 uses schema version 1.2... ugh... so I am wondering if I can just set the

Re: Color search

2007-09-28 Thread Mike Klaas
On 28-Sep-07, at 6:31 AM, Grant Ingersoll wrote: Another option would be to extend Solr (and donate back) to incorporate Lucene's payload functionality, in which case you could associate the percentile of the color as a payload and use the BoostingTermQuery... :-) If you're interested in

Re: small rsync index question

2007-09-28 Thread Yonik Seeley
On 9/28/07, Brian Whitman [EMAIL PROTECTED] wrote: For some reason sending a <commit/> is not refreshing the index It should... are there any errors in the logs? do you see the commit in the logs? Check the stats page to see info about when the current searcher was last opened too. -Yonik

Re: Schema version question

2007-09-28 Thread Yonik Seeley
On 9/28/07, Robert Purdy [EMAIL PROTECTED] wrote: I was wondering if anyone could help me, I just completed a full index of my data (about 4 million documents) and noticed that when I was first setting up the schema I set the version number to 1.2 thinking that solr 1.2 uses schema version

Re: Request for graphics

2007-09-28 Thread Yonik Seeley
On 9/28/07, Clay Webster [EMAIL PROTECTED] wrote: i'm late for dinner out, so i'm just attaching it here. Most attachments are stripped :-) -Yonik

RE: Index multiple languages with multiple analyzers with the same field

2007-09-28 Thread Lance Norskog
Other people custom-create a separate dynamic field for each language they want to support. The spellchecker in Solr 1.2 wants just one field to use as its word source, so this fits. We have a more complex version of this problem: we have content with both English and other languages. Searching

Re: custom sorting

2007-09-28 Thread Chris Hostetter
: Using something like this, how would the custom SortComparatorSource : get a parameter from the request to use in sorting calculations? in general: you wouldn't. you would have to specify all options as init params for the FieldType -- which makes it pretty horrible for distance

Re: Color search

2007-09-28 Thread Guangwei Yuan
Thanks for all the replies. I think creating 10 fields and feeding each field with a color's value for 10% from that color is a reasonable approach, and easy to implement too. One problem though, is that not all products have a total of 100% colors (due to various reasons including our color

Re: Dismax and Grouping query

2007-09-28 Thread Chris Hostetter
: I've tried to use grouping query on DisMaxRequestHandler without success. : e.g. : When I send query like +(lucene solr), : I can see following line in the result page. : <str name="querystring">+\(lucene solr\)</str> the dismax handler does not consider parens to be special characters. if it

Re: custom sorting

2007-09-28 Thread Chris Hostetter
: leaks, etc.). (Speaking of which, could anyone with more Lucene/Solr : experience than I comment on the performance characteristics of the : locallucene implementation mentioned on the list recently? I've taken : a first look and it seems reasonable to me.) i can't speak for anyone else, but

Re: small rsync index question

2007-09-28 Thread Chris Hostetter
: To completely remove the window of inconsistency, comment out the : post-commit hook in solrconfig.xml that takes a snapshot, then send a : commit to get a new snapshot and rsync from that. i think yonik meant UN-comment the postCommit hook in the example solrconfig.xml. -Hoss