Fuzzy searching documents over multiple fields using Solr

2013-05-09 Thread britske
Not sure if this has ever come up (or perhaps has even been implemented without my knowing), but I'm interested in doing fuzzy search over multiple fields using Solr. What I mean is the ability to return documents based on some 'distance calculation', without documents having to match the query 100%.
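A minimal SolrJ sketch of the idea (core name and field names are assumptions; the standard parser's per-term fuzzy operator ~ is the closest built-in):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class FuzzyMultiField {
        public static void main(String[] args) throws Exception {
            HttpSolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/hotels").build();
            // '~1' allows one edit per term; OR-ing the fields approximates a
            // multi-field fuzzy match, scoring on distance instead of requiring
            // an exact match to the query.
            SolrQuery q = new SolrQuery("name:amsteldam~1 OR city:amsteldam~1");
            QueryResponse rsp = solr.query(q);
            System.out.println("matches: " + rsp.getResults().getNumFound());
            solr.close();
        }
    }

This only gives per-term edit distance, not a true document-level distance, but it does return documents that don't match exactly.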

modeling prices based on daterange using multipoints

2012-12-11 Thread britske
Hi all, based on some good discussion in Modeling openinghours using multipoints http://lucene.472066.n3.nabble.com/Modeling-openinghours-using-multipoints-tp4025336p4025683.html I was prompted to revisit an old pain point of mine: modeling pricing availability of hotels which

Re: modeling prices based on daterange using multipoints

2012-12-11 Thread britske
Geert-Jan 2012/12/11 David Smiley (@MITRE.org) [via Lucene] ml-node+s472066n4026151...@n3.nabble.com Hi Britske, This is a very interesting question! britske wrote ... I realize the new spatial-stuff in Solr 4 is no magic bullet, but I'm wondering if I could model multiple prices per

Modeling openinghours using multipoints

2012-12-08 Thread britske
Hi all, Over a year ago I posted a use case to the (in this context familiar) issue SOLR-2155 about modelling openinghours using multivalued points.

Re: Modeling openinghours using multipoints

2012-12-08 Thread britske
, David Smiley (@MITRE.org) [via Lucene] ml-node+s472066n4025434...@n3.nabble.com wrote: britske wrote: That's seriously awesome! Some change in the query though: you described "To query for a business that is open during at least some part of a given time duration"; I want "To query
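The trick discussed in this thread, as a hedged sketch (the field name hours and the lat,lon-style point/box syntax are assumptions; the exact syntax depends on the spatial field type and Solr version): index each opening interval as a 2-D point with x = opening time and y = closing time, then express "open during at least some part of [s,e]" as a rectangle query.

    // Each interval is one point on the same multivalued spatial field,
    // encoded here as "close,open" (y,x).
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "bar-1");
    doc.addField("hours", "23,18"); // open 18:00-23:00
    doc.addField("hours", "14,11"); // open 11:00-14:00

    // Open during at least some part of [19:00, 21:00] means
    // open <= 21 AND close >= 19, i.e. the box x in [0,21], y in [19,24].
    SolrQuery q = new SolrQuery("*:*");
    q.addFilterQuery("hours:[19,0 TO 24,21]");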

multiple dateranges/timeslots per doc: modeling openinghours.

2011-09-26 Thread britske
Sorry for the somewhat lengthy post; I want to make clear that I've covered my bases here, and am looking for an alternative solution because the more trivial solutions don't seem to work for my use-case. Consider bars, museums, etc. These places have multiple openinghours that can depend on:

Universal DataImport(AndExport)Handler

2010-06-08 Thread britske
Recently I looked a bit at DataImportHandler and I'm really impressed with the flexibility of the transform/import options. Especially with the integrations with Solr Cell / Tika this has become a great data importer. Besides some use-cases that import to Solr (which I plan to migrate to DIH asap),

manually creating indices to speed up indexing with app-knowledge

2009-11-02 Thread Britske
This may seem like a strange question, but here it goes anyway. I'm considering the possibility of low-level constructing indices for about 20.000 indexed fields (type sInt), if at all possible. (With 'indices' in this context I mean the inverted indices from term to documentId, just to be 100%

If field A is empty take field B. Functionality available?

2009-08-28 Thread Britske
I have 2 fields: realprice, avgprice. I'd like to be able to take the contents of avgprice if realprice is not available; due to the design, the average price cannot be encoded in the 'realprice'-field. Since I need to be able to filter, sort and facet on these fields, it would be really nice to
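Later Solr releases cover this use case with the def(field,fallback) function query, which can drive sorting, filtering and faceting; a SolrJ sketch:

    SolrQuery q = new SolrQuery("*:*");
    // Sort on realprice, falling back to avgprice where realprice is missing.
    q.setSort("def(realprice, avgprice)", SolrQuery.ORDER.asc);
    // Filter and facet on the same derived value via a function range query.
    q.addFilterQuery("{!frange l=0 u=50}def(realprice, avgprice)");
    q.setFacet(true);
    q.addFacetQuery("{!frange l=0 u=50}def(realprice, avgprice)");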

Re: If field A is empty take field B. Functionality available?

2009-08-28 Thread Britske
in the same field. On Aug 28, 2009, at 1:16 PM, Britske wrote: I have 2 fields: realprice avgprice I'd like to be able to take the contents of avgprice if realprice is not available. due to design the average price cannot be encoded in the 'realprice'-field. Since I need

Re: solr 1.4: extending StatsComponent to recognize localparam {!ex}

2009-08-26 Thread Britske
Thanks for that, it works now ;-) Erik Hatcher-4 wrote: On Aug 25, 2009, at 6:35 PM, Britske wrote: Moreover, I can't seem to find the actual code in FacetComponent, or anywhere else for that matter, where the {!ex}-param case is treated. I assume it's

solr 1.4: extending StatsComponent to recognize localparam {!ex}

2009-08-25 Thread Britske
hi, I'm looking for a way to extend StatsComponent to recognize localparams, especially the {!ex}-param. To my knowledge this isn't implemented in the current trunk. One of my use-cases for this is to be able to have a javascript price-slider, where the user can operate the slider and thus set
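Tag-and-exclude support for stats did land in later Solr releases (the 5.x-era StatsComponent); a hedged sketch of the price-slider pattern (assumes a SolrClient named solr):

    SolrQuery q = new SolrQuery("*:*");
    q.addFilterQuery("{!tag=pricefq}price:[25 TO 75]"); // the slider's own filter
    q.set("stats", true);
    q.set("stats.field", "{!ex=pricefq}price");         // stats ignore that filter
    QueryResponse rsp = solr.query(q);
    FieldStatsInfo priceStats = rsp.getFieldStatsInfo().get("price");
    System.out.println(priceStats.getMin() + " .. " + priceStats.getMax());

This way the slider's min/max stay computed over the unfiltered price range, while the result set still honours the filter.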

highlighting on edgeGramTokenized field -- highlighting incorrect bc. position not incremented..

2009-06-12 Thread Britske

Re: highlighting on edgeGramTokenized field -- highlighting incorrect bc. position not incremented..

2009-06-12 Thread Britske
Thanks, I'll check it out. Otis Gospodnetic wrote: Britske, I'd have to dig, but there are a couple of JIRA issues in Lucene's JIRA (the actual ngram code is part of Lucene) that have to do with ngram positions. I have a feeling that may be the problem. Otis -- Sematext

correct? impossible to filter / facet on ExternalFileField

2009-06-11 Thread Britske
if not possible, is it on the roadmap? Thanks, Britske

how to get to highlighting results using SolrJ

2009-06-11 Thread Britske
._highlightingInfo with contents: {1-4167147={prefix1=[<em>Orl</em>ando Verenigde Staten]},} which is exactly what I need. However there is no (public) method QueryResponse.getHighlightingInfo()! What am I missing? thanks, Britske
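The accessor that did ship in SolrJ is QueryResponse.getHighlighting(), which returns exactly the structure shown above; a sketch (assumes rsp is the QueryResponse of the highlighting query):

    // docId -> (field name -> highlighted snippets)
    Map<String, Map<String, List<String>>> hl = rsp.getHighlighting();
    for (String snippet : hl.get("1-4167147").get("prefix1")) {
        System.out.println(snippet); // e.g. "<em>Orl</em>ando Verenigde Staten"
    }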

Re: how to get to highlighting results using SolrJ

2009-06-11 Thread Britske
only need to refer to the annotated field name... Britske wrote: first time I'm using highlighting and the results work ok. I'm using it for an auto-suggest function. For reference I used the following query: http://localhost:8983/solr/autocompleteCore/select?fl=name_display,importance,score

speeding up indexing with a LOT of indexed fields

2009-03-25 Thread Britske
it to here, and hoping to receive some valuable info, Cheers, Britske

Re: speeding up indexing with a LOT of indexed fields

2009-03-25 Thread Britske
all the time, before this will make any difference I guess. Thanks, and please keep the suggestions coming. Britske. Otis Gospodnetic wrote: Britske, Here are a few quick ones: - Does that machine really have 10 CPU cores? If it has significantly fewer, you may be beyond the indexing

solr 1.4: multi-select for statscomponent

2009-02-25 Thread Britske
. Is there any (undocumented) feature that makes this possible? If not, would it be easy to add? Thanks, Britske

solr on raid 0 -- no performance gain while indexing?

2008-10-15 Thread Britske
such a load between physical disks that the normal write scenario (of software raid 0) of writing sequential chunks in round-robin fashion to all the disks in the array no longer holds? Does this seem logical or does someone know another reason? Thanks, Britske

Re: DataImportHandler: way to merge multiple db-rows to 1 doc using transformer?

2008-09-29 Thread Britske
or less working home-grown solution, but I would like to be able to set it up with DataImportHandler. thanks for your help, Britske Noble Paul നോബിള്‍ नोब्ळ् wrote: What is the basis on which you merge rows? Then I may be able to suggest an easy way of doing that On Sun, Sep 28, 2008 at 3

DataImportHandler: way to merge multiple db-rows to 1 doc using transformer?

2008-09-27 Thread Britske
multiple db-rows and merge it to a single solr-row/document. If so, how? Thanks, Britske
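For reference, DIH's Transformer hook looks like the sketch below (class and field names are hypothetical). A transformer only sees one row at a time, so genuine merging usually belongs in a sub-entity in data-config.xml or a custom EntityProcessor; returning null from transformRow merely drops a row.

    import java.util.Map;
    import org.apache.solr.handler.dataimport.Context;
    import org.apache.solr.handler.dataimport.Transformer;

    public class RowMergeTransformer extends Transformer {
        private Object previousId;

        @Override
        public Object transformRow(Map<String, Object> row, Context context) {
            Object id = row.get("id");
            if (id != null && id.equals(previousId)) {
                return null; // drop consecutive rows sharing an id
            }
            previousId = id;
            return row;
        }
    }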

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-31 Thread Britske
these criteria pinpoint a specific field / column to use and the difference should be clear. regards, Britske Funtick wrote: Yes, it should be extremely simple! I simply can't understand how you describe it: Britske wrote: Rows in solr represent productcategories. I will have up to 100k

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Britske
Currently, I can't say what the data actually represents, but the analogy of t Mike Klaas wrote: On 28-Jul-08, at 11:16 PM, Britske wrote: That sounds interesting. Let me explain my situation, which may be a variant of what you are proposing. My documents contain more than 10.000

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Britske
Hi Fuad, Funtick wrote: Britske wrote: When performing these queries I notice a big difference between qTime (which is mostly in the 15-30 ms range due to caching) and total time taken to return the response (measured through SolrJ's elapsedTime), which takes between 500-1600 ms

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Britske
Funtick wrote: Britske wrote: - Rows in solr represent productcategories. I will have up to 100k of them. - Each product category can have 10k products each. These are encoded as the 10k columns / fields (all 10k fields are int values) You are using multivalued fields, you

big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
15-30 ms. Doesn't this seem strange, since to me it would seem logical that the discrepancy would be at least 1/10th of fetching 100 documents? Hmm, hope you can shine some light on this. Thanks a lot, Britske Yonik Seeley wrote: That's a bit too tight to have *all* of the index cached

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
Thanks for clearing that up for me. I'm going to investigate some more... Yonik Seeley wrote: On Mon, Jul 28, 2008 at 4:53 PM, Britske [EMAIL PROTECTED] wrote: Each query requests at most 20 stored fields. Why doesn't help lazyfieldloading in this situation? It's the disk seek

reusing docset to limit new query

2008-04-16 Thread Britske
SolrIndexSearcher.cacheDocSet(..) but am not entirely sure what it does (side effects?). Can someone please elaborate on this? Britske

indexing slow, IO-bound?

2008-04-05 Thread Britske
Hi, I have a schema with a lot of (about 10.000) non-stored indexed fields, which I use for sorting. (no really, that is needed). Moreover I have about 30 stored fields. Indexing of these documents takes a long time. Because of the size of the documents (because of the indexed fields) I am

batch indexing takes more time than shown on SOLR output -- something to do with IO?

2008-01-14 Thread Britske
I have a batch program which inserts items in a solr/lucene index. all is going fine and I get update messages in the console like: 14-jan-2008 16:40:52 org.apache.solr.update.processor.LogUpdateProcessor finish INFO: {add=[10485, 10488, 10489, 10490, 10491, 10495, 10497, 10498, ...(42 more)

how to intersect a doclist with a docset and get a doclist back?

2007-12-14 Thread Britske
Is there a way to get a doclist based on intersecting an existing doclist with a docset? Doing doclist.intersection(docset) returns a docset, however. Is there something I'm missing here? I figured this must be possible since the order of the returned doclist is the same as the order of the
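One way around it, as a sketch against Solr's internal org.apache.solr.search API (assumes a DocList docList and DocSet docSet in scope): DocList iteration is ordered, so walk it and test membership in the DocSet.

    List<Integer> orderedIds = new ArrayList<>();
    DocIterator it = docList.iterator();
    while (it.hasNext()) {
        int id = it.nextDoc();
        if (docSet.exists(id)) { // keep only ids also present in the DocSet
            orderedIds.add(id);
        }
    }

This preserves the DocList's sort order, which a plain DocSet intersection loses.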

how to do most efficiently: collapsing facets into top-N results

2007-12-13 Thread Britske
I've subclassed StandardRequestHandler to be able to show top-N results for some of the facet-values that I'm interested in. The functionality resembles the solr-236 field collapsing a bit, with the difference that I can arbitrarily specify which facet-query to collapse and to what extent.

Re: possible to set mincount on facetquery?

2007-12-05 Thread Britske
It seemed handy in the mentioned case, where it's not certain that there are products in each of the budget categories, so you simply ask for them all and only get back the categories which contain at least 1 product. From a functional perspective that's kind of on par with doing facet.mincount=1

possible to set mincount on facetquery?

2007-12-05 Thread Britske
is it possible to set a mincount on a facetquery as well as on a facetfield? I have a situation in which I want to group facetqueries (price-ranges), but I obviously don't want to show ranges with 0 results. I tried things like: f.price:[0 TO 50].facet.mincount=1 and f.price:[0 TO
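Since facet.mincount applies only to facet.field, the usual workaround for facet queries is to drop zero-count buckets client-side; a SolrJ sketch (a SolrClient named solr is assumed):

    SolrQuery q = new SolrQuery("*:*");
    q.addFacetQuery("price:[0 TO 50]");
    q.addFacetQuery("price:[50 TO 100]");
    q.addFacetQuery("price:[100 TO *]");
    QueryResponse rsp = solr.query(q);
    rsp.getFacetQuery().forEach((range, count) -> {
        if (count > 0) System.out.println(range + " -> " + count); // mincount=1 by hand
    });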

how to load custom valuesource as plugin

2007-11-14 Thread Britske
I've created a simple valueSource which is supposed to calculate a weighted sum over a list of supplied valuesources. How can I let Solr recognise this valuesource? I tried to simply upload it as a plugin and reference it by its name (wsum) in a functionquery, but got an Unknown function wsum

Re: how to load custom valuesource as plugin

2007-11-14 Thread Britske
Yonik Seeley wrote: Unfortunately, the function query parser isn't currently pluggable. -Yonik On Nov 14, 2007 2:02 PM, Britske [EMAIL PROTECTED] wrote: I've created a simple valueSource which is supposed to calculate a weighted sum over a list of supplied valuesources. How can I let
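Pluggability arrived in later releases: a custom function can be registered in solrconfig.xml with <valueSourceParser name="wsum" class="com.example.WSumParser"/> and implemented roughly as below (WSumFunction, the ValueSource doing the weighted sum, is hypothetical):

    import java.util.List;
    import org.apache.lucene.queries.function.ValueSource;
    import org.apache.solr.search.FunctionQParser;
    import org.apache.solr.search.SyntaxError;
    import org.apache.solr.search.ValueSourceParser;

    public class WSumParser extends ValueSourceParser {
        @Override
        public ValueSource parse(FunctionQParser fp) throws SyntaxError {
            // wsum(a, b, c, ...) -> parse each argument as a value source
            List<ValueSource> sources = fp.parseValueSourceList();
            return new WSumFunction(sources);
        }
    }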

Re: where to hook in to SOLR to read field-label from functionquery

2007-11-10 Thread Britske
hossman wrote: : Say I have a custom functionquery MinFloatFunction which takes as its : arguments an array of valuesources. : : MinFloatFunction(ValueSource[] sources) : : In my case all these valuesources are the values of a collection of fields. a ValueSource isn't required

where to hook in to SOLR to read field-label from functionquery

2007-11-05 Thread Britske
My question sounds strange, I know, but I'll try to explain: say I have a custom functionquery MinFloatFunction which takes as its arguments an array of valuesources: MinFloatFunction(ValueSource[] sources). In my case all these valuesources are the values of a collection of fields. What I need
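For context, modern Lucene ships this very function as min(); the shape of such a ValueSource, as a hedged sketch against the current API:

    import java.io.IOException;
    import org.apache.lucene.queries.function.FunctionValues;
    import org.apache.lucene.queries.function.ValueSource;
    import org.apache.lucene.queries.function.valuesource.MultiFloatFunction;

    public class MinFloatFunction extends MultiFloatFunction {
        public MinFloatFunction(ValueSource[] sources) {
            super(sources);
        }

        @Override
        protected String name() {
            return "min"; // label used when the function is printed
        }

        @Override
        protected float func(int doc, FunctionValues[] valsArr) throws IOException {
            float min = Float.POSITIVE_INFINITY;
            for (FunctionValues vals : valsArr) {
                min = Math.min(min, vals.floatVal(doc));
            }
            return min;
        }
    }

Note that the per-field labels the thread asks about are not exposed here: a ValueSource only sees values, so field names would have to be passed in alongside the sources.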

SOLR 1.3: defaultOperator always defaults to OR although AND is specified.

2007-11-01 Thread Britske
experimenting with SOLR 1.3 and discovered that although I specified <solrQueryParser defaultOperator="AND"/> in schema.xml, q=a+b behaves as q=a OR b instead of q=a AND b. Obviously this is not correct. I used the nightly of 29 oct. Cheers, Geert-Jan

Solr-J: automatic url-escaping gives invalid uri exception. How to workaround?

2007-11-01 Thread Britske
I have a custom requesthandler which does some very basic dynamic parameter substitution. Dynamic params are params which are enclosed in braces ({}). So this means I can do something like this: q={order}... where {order} is substituted by the name of an existing order-column. Now this all

Re: Solr-J: automatic url-escaping gives invalid uri exception. How to workaround?

2007-11-01 Thread Britske
I replaced { and } with (( resp. )). Not ideal (I like braces...) but it suffices for now. Still, if someone knows a general solution to the URL-escaping issue with Solr-J I'd love to hear it. Cheers, Geert-Jan Britske wrote: I have a custom requesthandler which does some very basic dynamic
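A more general workaround is to send the request as POST, which keeps parameter values out of the URI entirely; SolrJ supports this directly (sketch, assuming a SolrClient named solr):

    SolrQuery q = new SolrQuery("{order}"); // braces survive in the POST body
    // METHOD.POST moves all parameters into the request body, sidestepping
    // URI escaping of characters like { and }.
    QueryResponse rsp = solr.query(q, SolrRequest.METHOD.POST);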

solr-139: support for adding fields which are not known at design-time?

2007-10-26 Thread Britske
is it / will it be possible to add previously non-existing fields to a document with the upcoming solr-139? for instance, would something like this work? <add mode="scorex=OVERWRITE"> <doc> <field name="id">1318127</field> <field name="scorex">12</field> </doc> </add> with schema.xml: ... <fields> <field
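SOLR-139 eventually shipped in Solr 4 as atomic updates; the sketched XML corresponds roughly to this SolrJ call (assumes a SolrClient named solr, and that scorex is covered by the schema, e.g. via a dynamic field):

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "1318127");
    doc.addField("scorex", Collections.singletonMap("set", 12)); // "set" = overwrite
    solr.add(doc);
    solr.commit();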

Re: quickie: do facetfields use same cached items in field cache as FQ-param?

2007-10-12 Thread Britske
Yeah I meant filter-cache, thanks. It seemed that the particular field (cityname) was using a keywordtokenizer (which doesn't show at the front), which is why I missed it I guess :-S. This means the term field is tokenized, so the termEnums-approach is used. This results in about 10.000 inserts on

Re: quickie: do facetfields use same cached items in field cache as FQ-param?

2007-10-12 Thread Britske
as a related question: is there a way to inspect the queries currently in the filtercache? Britske wrote: Yeah I meant filter-cache, thanks. It seemed that the particular field (cityname) was using a keywordtokenizer (which doesn't show at the front) which is why I missed it I guess :-S

implemented StandardRequestHandler to show top-results per facet-value. Is this the fastest way?

2007-10-11 Thread Britske
Since the title of my original post may not have been so clear, here is a repost. //Geert-Jan Britske wrote: First of all, I just wanted to say that I just started working with Solr and really like the results I'm getting from Solr (in terms of performance, flexibility) as well as the good

Re: showing results per facet-value efficiently

2007-10-11 Thread Britske
yup that clarifies things a lot, thanks. Mike Klaas wrote: On 10-Oct-07, at 4:16 AM, Britske wrote: However, I realized that for calculating the count for each of the facetvalues, the original standardrequesthandler already loops the doclist to check for matches. Therefore my

quickie: do facetfields use same cached items in field cache as FQ-param?

2007-10-11 Thread Britske
say I have the following (partial) querystring: ...facet=true&facet.field=country. Field 'country' is not tokenized, not multi-valued, and not boolean, so the field-cache approach is used. Moreover, the following (partial) querystring is used as well: ...fq=country:france. Do these queries share

Re: extending StandardRequestHandler gives ClassCastException

2007-10-10 Thread Britske
Thanks, that was the problem! I mistakenly thought the lib-folder containing the jetty.jar etc. was the folder to put the plugins into. After adding a lib-folder to solr-home everything is resolved. Geert-Jan hossman wrote: : SEVERE: java.lang.ClassCastException: :

Re: how to make sure a particular query is ALWAYS cached

2007-10-09 Thread Britske
separating requests over 2 ports is a nice solution when having multiple user-types. I like that, although I don't think I need it for this case. I'm just going to go the 'normal' caching-route and see where that takes me, instead of thinking it can't be done upfront :-) Thanks! hossman

Re: extending StandardRequestHandler gives ClassCastException

2007-10-09 Thread Britske
On Oct 9, 2007, at 9:04 AM, Britske wrote: I'm trying to add a new requestHandler-plugin to Solr by extending StandardRequestHandler. However, when starting the solr-server after configuration I get a ClassCastException: SEVERE: java.lang.ClassCastException

Re: extending StandardRequestHandler gives ClassCastException

2007-10-09 Thread Britske
Thanks, but I'm using the updated o.a.s.handler.StandardRequestHandler. I'm going to try on 1.2 instead to see if it changes things. Geert-Jan ryantxu wrote: It still seems odd that I have to include the jar, since the StandardRequestHandler should be picked up in the war right? Is

how to make sure a particular query is ALWAYS cached

2007-10-04 Thread Britske
I want a couple of costly queries to be cached at all times in the queryResultCache (unless I have a new searcher, of course). As far as I know the only parameters to be supplied to the LRU-implementation of the queryResultCache are size-related, which doesn't give me this guarantee. what

Re: how to make sure a particular query stays cached (and is not overwritten)

2007-10-04 Thread Britske
the title of my original post was misguided. // Geert-Jan Britske wrote: I want a couple of costly queries to be cached at all times in the queryResultCache. (unless I have a new searcher of course) As far as I know the only parameters to be supplied to the LRU-implementation

Re: how to make sure a particular query is ALWAYS cached

2007-10-04 Thread Britske
hossman wrote: : I want a couple of costly queries to be cached at all times in the : queryResultCache. (unless I have a new searcher of course) first off: you can ensure that certain queries are in the cache, even if there is a newSearcher, just configure a newSearcher Event

RE: how to make sure a particular query is ALWAYS cached

2007-10-04 Thread Britske
the filter query. field:[* TO *] will do nicely. Cheers, Lance Norskog -Original Message- From: Britske [mailto:[EMAIL PROTECTED] Sent: Thursday, October 04, 2007 1:38 PM To: solr-user@lucene.apache.org Subject: Re: how to make sure a particular query is ALWAYS cached