Re: Exception using distributed field-collapsing

2012-06-20 Thread Martijn v Groningen
Hi Bryan, What is the fieldtype of the groupField? You can only group by a field of type string, as described in the wiki: http://wiki.apache.org/solr/FieldCollapsing#Request_Parameters When you group by another field type, an HTTP 400 should be returned instead of this error. At least

Re: Issue with field collapsing in solr 4 while performing distributed search

2012-06-11 Thread Martijn v Groningen
The ngroups value is the number of groups that matched the query. However, if you want ngroups to be correct in a distributed environment, you need to put documents belonging to the same group into the same shard. Groups can't cross shard boundaries. I guess you need to do some manual
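
A minimal sketch of why the group count goes wrong when a group is spread over shards. It assumes the merged count is simply the sum of each shard's own distinct-group count; the `summed_ngroups` helper and the sample documents are hypothetical and only model the arithmetic, not Solr's code.

```python
def summed_ngroups(shards):
    """Add up each shard's distinct group count, with no cross-shard deduplication."""
    return sum(len({doc["group"] for doc in shard}) for shard in shards)

# Group "a" is split across two shards, so it is counted once per shard.
split = [[{"group": "a"}, {"group": "b"}], [{"group": "a"}]]
# Same documents, but each group lives on exactly one shard.
colocated = [[{"group": "a"}, {"group": "a"}], [{"group": "b"}]]

split_count = summed_ngroups(split)          # larger than the true number of groups
colocated_count = summed_ngroups(colocated)  # equals the true number of groups
```

With two real groups in both layouts, only the co-located layout yields a sum that matches.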

Re: Grouping ngroups count

2012-05-03 Thread Martijn v Groningen
Hi Francois, The issue you describe looks like a similar issue we have fixed before with matches count. Open an issue and we can look into it. Martijn On 1 May 2012 20:14, Francois Perron francois.per...@wantedanalytics.com wrote: Thanks for your response Cody,  First, I used distributed

Re: Solr Cloud vs sharding vs grouping

2012-04-20 Thread Martijn v Groningen
Hi Jean-Sebastien, For some grouping features (like total group count and grouped faceting), the distributed grouping requires you to partition your documents into the right shard. Basically groups can't cross shards. Otherwise the group counts or grouped facet counts may not be correct. If you
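
One way to keep groups from crossing shards is to choose the target shard from the group value at indexing time. The sketch below is an assumption about how a client could do that; `shard_for_group`, the `product_group` field and the shard count of 4 are illustrative names and values, not something Solr provides.

```python
import hashlib

def shard_for_group(group_value: str, num_shards: int) -> int:
    """Map a group value to a shard index so that equal values share a shard."""
    digest = hashlib.md5(group_value.encode("utf-8")).digest()
    # Use the first four bytes as a stable integer; Python's built-in hash()
    # is salted per process for strings and would not be stable across runs.
    return int.from_bytes(digest[:4], "big") % num_shards

docs = [
    {"id": "1", "product_group": "shoes"},
    {"id": "2", "product_group": "hats"},
    {"id": "3", "product_group": "shoes"},
]
# Document id -> shard index; documents 1 and 3 share a group, hence a shard.
placement = {d["id"]: shard_for_group(d["product_group"], 4) for d in docs}
```

Changing the number of shards changes the mapping, so a scheme like this implies reindexing when shards are added.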

Re: To truncate or not to truncate (group.truncate vs. facet)

2012-04-09 Thread Martijn v Groningen
The group.facet option only works for field facets (facet.field). Other facet types (query, range and pivot) aren't supported yet. The group.facet option works for both single and multivalued fields specified in the facet.field parameter. Martijn On 9 April 2012 20:58, danjfoley d...@micamedia.com

Re: Distributed grouping issue

2012-04-02 Thread Martijn v Groningen
The matches element in the response should return the number of documents that matched the query and not the number of groups. Did you also encounter this issue with other Solr versions (3.5 or another nightly build)? Martijn On 2 April 2012 09:41, fbrisbart fbrisb...@bestofmedia.com

Re: Distributed grouping issue

2012-04-02 Thread Martijn v Groningen
All documents of a group exist on a single shard, there are no cross-shard groups. You only have to partition documents by group when the groupCount and some other features need to be accurate. For the matches this is not necessary. The matches are summed up during merging the shard

Re: Distributed grouping issue

2012-04-02 Thread Martijn v Groningen
know if you have any luck reproducing. Thanks, Cody -Original Message- From: martijn.is.h...@gmail.com [mailto:martijn.is.h...@gmail.com] On Behalf Of Martijn v Groningen Sent: Monday, April 02, 2012 1:48 PM To: solr-user@lucene.apache.org Subject: Re: Distributed grouping issue

Re: Grouping queries

2012-03-23 Thread Martijn v Groningen
On 22 March 2012 03:10, Jamie Johnson jej2...@gmail.com wrote: I need to apologize I believe that in my example I have too grossly over simplified the problem and it's not clear what I am trying to do, so I'll try again. I have a situation where I have a set of access controls say user,

Re: Grouping queries

2012-03-23 Thread Martijn v Groningen
Where is Join documented? I looked at http://wiki.apache.org/solr/Join and see no reference to fromIndex. Also does this work in a distributed environment? The fromIndex isn't documented in the wiki. It is mentioned in the issue and you can find it in the Solr code:

Re: Grouping queries

2012-03-21 Thread Martijn v Groningen
I'm not sure if grouping is the right feature to use for your requirements... Grouping does have an impact on performance which you need to take into account. Depending on what grouping features you're going to use (grouped facets, ngroups), grouping performs well on large indices if you use

Re: Solr group witch minimum count in each group

2012-03-21 Thread Martijn v Groningen
Filtering results based on group count isn't supported yet. There is already an issue created for this feature: https://issues.apache.org/jira/browse/SOLR-3152 Martijn On 21 March 2012 11:52, ViruS svi...@gmail.com wrote: Hi, I try to get all duplicated documents in my index. I have

Re: To truncate or not to truncate (group.truncate vs. facet)

2012-03-19 Thread Martijn v Groningen
Hi Rasmus, You might want to use the group.facet parameter: http://wiki.apache.org/solr/FieldCollapsing#Request_Parameters I think that will give you the right facet counts with faceting. The parameter is not available in Solr 3.x, so you'll need to use a 4.0 nightly build. Martijn On 19 March

Re: JoinQuery and document score problem

2012-03-06 Thread Martijn v Groningen
Hi Stefan, The score isn't moved from the from side to the to side and as far as I know there isn't a way to configure the scoring of the joined documents. The Solr join query isn't a real join (like in SQL) and should be used as a filtering mechanism. The best way to achieve that is to put the

Re: Multiple Sort for Group/Folding

2012-01-11 Thread Martijn v Groningen
Hi Mauro, During the first pass search the sort param is used to determine the top N groups. Then during the second pass search the documents inside the top N groups are sorted using the group.sort parameter. The group.sort doesn't change how the groups themselves are sorted. Martijn On 11
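
The two passes can be modelled in memory. This is only an illustrative sketch of the behaviour described above, not Solr's implementation: `two_pass_grouping`, the `brand`, `score` and `price` fields and the sample documents are all made up, the main sort is taken as "higher is better" and the group sort as ascending.

```python
def two_pass_grouping(docs, group_field, sort_score, group_sort_key, top_n):
    # First pass: remember the best document per group under the main sort
    # (higher sort_score is better) and keep only the top N groups.
    best = {}
    for doc in docs:
        group = doc[group_field]
        if group not in best or sort_score(doc) > sort_score(best[group]):
            best[group] = doc
    top_groups = sorted(best, key=lambda g: sort_score(best[g]), reverse=True)[:top_n]

    # Second pass: order the documents inside each selected group by the
    # group sort (ascending). The order of top_groups itself is not touched.
    result = []
    for group in top_groups:
        members = sorted((d for d in docs if d[group_field] == group), key=group_sort_key)
        result.append((group, [d["id"] for d in members]))
    return result

docs = [
    {"id": "a", "brand": "x", "score": 3.0, "price": 20},
    {"id": "b", "brand": "x", "score": 1.0, "price": 5},
    {"id": "c", "brand": "y", "score": 2.0, "price": 7},
    {"id": "d", "brand": "z", "score": 0.5, "price": 1},
]
grouped = two_pass_grouping(docs, "brand", lambda d: d["score"], lambda d: d["price"], top_n=2)
```

Group `x` stays first because of document `a`, even though inside the group the cheaper document `b` is listed before it.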

Re: Setting group.ngroups=true considerable slows down queries

2011-12-12 Thread Martijn v Groningen
quite some heap space (also without group.ngroups=true). Martijn On 9 December 2011 23:08, Michael Jakl jakl.mich...@gmail.com wrote: Hi! On Fri, Dec 9, 2011 at 17:41, Martijn v Groningen martijn.v.gronin...@gmail.com wrote: On what field type are you grouping and what version of Solr

Re: Setting group.ngroups=true considerable slows down queries

2011-12-12 Thread Martijn v Groningen
of unique groups. Martijn On 12 December 2011 14:32, Michael Jakl jakl.mich...@gmail.com wrote: Hi! On Mon, Dec 12, 2011 at 13:57, Martijn v Groningen martijn.v.gronin...@gmail.com wrote: As far as I know currently there isn't another way. Unfortunately the performance degrades badly when having

Re: Field collapsing results caching

2011-12-09 Thread Martijn v Groningen
There is no cross query cache for result grouping. The only caching option out there is the group.cache.percent option: http://wiki.apache.org/solr/FieldCollapsing#Request_Parameters Martijn On 8 December 2011 14:29, Kissue Kissue kissue...@gmail.com wrote: Hi, I was just testing field

Re: Setting group.ngroups=true considerable slows down queries

2011-12-09 Thread Martijn v Groningen
Hi Michael, On what field type are you grouping and what version of Solr are you using? Grouping by string field is faster. Martijn On 9 December 2011 12:46, Michael Jakl jakl.mich...@gmail.com wrote: Hi, I'm using the grouping feature of Solr to return a list of unique documents together

Re: Group.ngroup parameter memory consumption

2011-11-12 Thread Martijn v Groningen
BTW this applies for 4.0-dev. In 3x the String instance from a StringIndex is directly used, this is then put into a list. So there is no extra object instance created per group matching the query. Martijn On 12 November 2011 08:49, Rafał Kuć r@solr.pl wrote: Hello! Thanks, that's what I

Re: Group.ngroup parameter memory consumption

2011-11-11 Thread Martijn v Groningen
The ngroups option collects per search the number of unique groups matching the query. Based on the collected groups it returns the count. So it depends on the number of groups matching the query. To get more in detail: per unique group a BytesRef instance is created to represent a group and this

Re: Selective Result Grouping

2011-11-03 Thread Martijn v Groningen
open an issue for this? Martijn On 1 November 2011 19:58, entdeveloper cameron.develo...@gmail.com wrote: Martijn v Groningen-2 wrote: When using the group.field option values must be the same otherwise they don't get grouped together. Maybe fuzzy grouping would be nice. Grouping videos

Re: facet with group by (or field collapsing)

2011-11-03 Thread Martijn v Groningen
collapse.facet=after doesn't exist in Solr 3.3. This parameter exists in the SOLR-236 patches and is implemented differently in the released versions of Solr. From Solr 3.4 you can use group.truncate. The facet counts are then computed based on the most relevant documents per group. Martijn On

Re: UnInvertedField vs FieldCache for facets for single-token text fields

2011-11-03 Thread Martijn v Groningen
Hi Micheal, The FieldCache is a simpler data structure and easier to create, so I also expect it to be faster. Unfortunately for TextField UnInvertedField is always used even if you have one token per document. I think overriding the multiValuedFieldCache method and returning false would work. If

Re: joins and filter queries effecting scoring

2011-10-28 Thread Martijn v Groningen
Have you tried using the join in the fq instead of the q? Like this (assuming user_id_i is a field in the post document type and self_id_i a field in the user document type): q=posts_text:hello&fq={!join from=self_id_i to=user_id_i}is_active_boolean:true In this example the fq produces a docset
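
A small sketch of assembling that request on the client, so the q and fq parameters stay separate and the join local params are URL-encoded. The field names come from the message above; the `join_filter` helper is a hypothetical convenience, not part of any Solr client.

```python
from urllib.parse import urlencode, parse_qs

def join_filter(from_field, to_field, inner_query):
    """Build a {!join} filter query string from its three parts."""
    return "{!join from=%s to=%s}%s" % (from_field, to_field, inner_query)

params = [
    ("q", "posts_text:hello"),
    ("fq", join_filter("self_id_i", "user_id_i", "is_active_boolean:true")),
]
query_string = urlencode(params)  # safe to append after "/select?"
decoded = parse_qs(query_string)  # round-trip to check nothing was lost
```

Keeping the join in fq means it only filters the result set, while scoring comes from q alone.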

Re: Solr 3.4 group.truncate does not work with facet queries

2011-10-28 Thread Martijn v Groningen
Hi Ian, I think this is a bug. After looking into the code the facet.query feature doesn't take into account the group.truncate option. This needs to be fixed. You can open a new issue in Jira if you want to. Martijn On 28 October 2011 12:09, Ian Grainger i...@isfluent.com wrote: Hi, I'm using

Re: About the indexing process

2011-10-25 Thread Martijn v Groningen
Hi Amos, How are you currently indexing files? Are you indexing Solr input documents or just regular files? You can use Solr cell to index binary files: http://wiki.apache.org/solr/ExtractingRequestHandler Martijn On 25 October 2011 10:21, 刘浪 liu.l...@eisoo.com wrote: Hi,     I appreciate

Re: Selective Result Grouping

2011-10-23 Thread Martijn v Groningen
The current grouping functionality using group.field is basically all-or-nothing: all documents will be grouped by the field value or none will. So there would be no way to, for example, collapse just the videos or images like they do in google. When using the group.field option values must be

Re: Solr 3.4 Grouping group.main=true results in java.lang.NoClassDefFound

2011-09-28 Thread Martijn v Groningen
Hi Frank, How is Solr deployed? And how did you upgrade? The commons-lang library (containing ArrayUtils) is included in the Solr war file. Martijn On 28 September 2011 09:16, Frank Romweber fr...@romweber.de wrote: I use drupal for accessing the solr search engine. After updating an creating

Re: Problem with SolrJ and Grouping

2011-09-12 Thread Martijn v Groningen
Also, the error you described when using wt=xml with SolrJ is fixed in 3.4 (and in trunk / branch3x). You can wait for the 3.4 release or use a nightly 3x build. Martijn On 12 September 2011 12:41, Sanal K Stephen sanalkstep...@gmail.com wrote: Kirill, Parsing the grouped result

Re: Nested documents

2011-09-12 Thread Martijn v Groningen
To support this, we also need to implement indexing block of documents in Solr. Basically the UpdateHandler should also use this method: IndexWriter#addDocuments(Collection documents) On 12 September 2011 01:01, Michael McCandless luc...@mikemccandless.com wrote: Even if it applies, this is for

Re: Problem with SolrJ and Grouping

2011-09-12 Thread Martijn v Groningen
, 2011 at 5:45 PM, Martijn v Groningen martijn.v.gronin...@gmail.com wrote: Also, the error you described when using wt=xml with SolrJ is fixed in 3.4 (and in trunk / branch3x). You can wait for the 3.4 release or use a nightly 3x build. Martijn On 12 September 2011 12:41, Sanal K Stephen

Re: Sorting groups by numFound group size

2011-09-10 Thread Martijn v Groningen
Not yet. If you want you can create an issue for sorting groups by numFound. On 9 September 2011 18:49, O. Klein kl...@octoweb.nl wrote: I am also looking for way to sort on numFound. Has an issue been created? -- View this message in context:

Re: TermsComponent from deleted document

2011-09-10 Thread Martijn v Groningen
I'd use the suggester: http://wiki.apache.org/solr/Suggester The suggester can give a collation. The TermsComponent can't do that. The suggester builds on top of the spellchecking infrastructure, so should be easy to use if you're familiar with that. Martijn On 10 September 2011 08:37, Manish

Re: Sorting groups by numFound group size

2011-09-08 Thread Martijn v Groningen
No, as far as I know sorting by group count isn't planned. You can create an issue in Jira where future development of this feature can be tracked. On 7 September 2011 23:54, bobsolr xbvbvccvb...@hotmail.com wrote: Hi Martijn, Thanks for the reply. Unfortunately I can't reference the group

Re: Sorting groups by numFound group size

2011-09-07 Thread Martijn v Groningen
Sorting groups by numFound isn't possible. You can sort groups by specifying a function or a field (from your schema) in the sort parameter. The numFound isn't a field, so that is why you can't sort on it. Martijn On 7 September 2011 08:17, bobsolr xbvbvccvb...@hotmail.com wrote: Hi, I'm

Re: Reading results from FieldCollapsing

2011-08-31 Thread Martijn v Groningen
The CollapseComponent was never committed. This class exists in the SOLR-236 patches. You don't need to change the configuration in order to use grouping. The blog you mentioned is based on the SOLR-236 patches. The current grouping in Solr 3.3 has superseded these patches. From Solr 3.4 (not yet

Re: Grouping and performing statistics per group

2011-08-25 Thread Martijn v Groningen
Hi Omri, I think you can achieve that with grouping and the Solr StatsComponent ( http://wiki.apache.org/solr/StatsComponent). In order to compute statistics on groups you must set the option group.truncate=true An example query:

Re: Grouping and performing statistics per group

2011-08-25 Thread Martijn v Groningen
Or if you don't care about grouped results you can also add the following option: stats.facet=gender On 25 August 2011 14:40, Martijn v Groningen martijn.v.gronin...@gmail.com wrote: Hi Omri, I think you can achieve that with grouping and the Solr StatsComponent ( http://wiki.apache.org/solr

Re: Grouping and performing statistics per group

2011-08-25 Thread Martijn v Groningen
If you take this query from the wiki: http://localhost:8983/solr/select?q=*:*&stats=true&stats.field=price&stats.field=popularity&stats.twopass=true&rows=0&indent=true&stats.facet=inStock In this case you get stats about the popularity per inStock value (true / false). Replacing these values with weight
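
For reference, a hedged sketch of building the same stats request programmatically, which avoids losing the separators between parameters. The `stats_url` helper is hypothetical; the parameter names and values are those of the wiki query quoted above.

```python
from urllib.parse import urlencode, parse_qs

def stats_url(base, stats_fields, facet_field):
    """Build a StatsComponent request with one stats.field entry per field."""
    params = {
        "q": "*:*",
        "stats": "true",
        "stats.field": stats_fields,  # a list is expanded into repeated parameters
        "stats.twopass": "true",
        "rows": "0",
        "indent": "true",
        "stats.facet": facet_field,
    }
    return base + "?" + urlencode(params, doseq=True)

url = stats_url("http://localhost:8983/solr/select", ["price", "popularity"], "inStock")
decoded = parse_qs(url.split("?", 1)[1])
```

Swapping the field lists (for example a weight field faceted by gender, as discussed in this thread) only changes the two arguments.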

Re: Results Group-By using SolrJ

2011-08-14 Thread Martijn v Groningen
Hi Omri, SOLR-2637 was concerned with adding grouped response parsing. There is no convenience method for grouping, but you can use the normal SolrQuery#set(...) methods to enable grouping. The following code should enable grouping via SolrJ api: SolrQuery query = new SolrQuery();

Re: SOLR 3.3.0 multivalued field sort problem

2011-08-13 Thread Martijn v Groningen
The first solution would make sense to me. Some kind of a strategy mechanism for this would allow anyone to define their own rules. Duplicating results would be confusing to me. On 13 August 2011 18:39, Michael Lackhoff mich...@lackhoff.de wrote: On 13.08.2011 18:03 Erick Erickson wrote: The

Re: SOLR 3.3.0 multivalued field sort problem

2011-08-12 Thread Martijn v Groningen
Hi Johnny, Sorting on a multivalued field has never really worked in Solr. Solr versions up to and including 1.4.1 allowed it, but there was a chance that an error occurred and that the sorting might not be what you expect. From Solr 3.1 and up sorting on a multivalued field isn't allowed and an HTTP 400 is returned.

Re: SOLR Support for Lucene Nested Documents

2011-08-05 Thread Martijn v Groningen
Hi Josh, Solr doesn't expose this Lucene feature yet. To support this Solr needs to be able to index documents in a single block. Also the BlockJoinQuery needs to be exposed to Solr (this can easily happen via a QParserPlugin). Martijn On 5 August 2011 00:00, Joshua Harness

Re: Minimum Score

2011-08-05 Thread Martijn v Groningen
As far as I know there is no built-in solution for this like there is for max score. An alternative approach to the one already mentioned is to send a second request with rows=1 and sort=score asc This will return the lowest scoring document and you can then retrieve the score from that document
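
A sketch of deriving that second request from the parameters of the first one. The `lowest_score_params` helper is hypothetical, and asking only for the score via `fl=score` and dropping any paging offset are assumptions added here, not something the message states.

```python
def lowest_score_params(original_params):
    """Copy the first request's parameters and turn them into a minimum-score probe."""
    params = dict(original_params)  # keep q, fq, etc. so the same documents match
    params.update({
        "rows": "1",           # only the single lowest-scoring document is needed
        "sort": "score asc",   # lowest score first
        "fl": "score",         # assumption: only the score value is of interest
    })
    params.pop("start", None)  # assumption: paging offset must not skip the first hit
    return params

first_request = {"q": "title:solr", "fq": "inStock:true", "rows": "10",
                 "start": "20", "sort": "score desc"}
second_request = lowest_score_params(first_request)
```

The second request has to reuse the same q and fq, otherwise the minimum would be taken over a different result set.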

Re: A rant about field collapsing

2011-08-04 Thread Martijn v Groningen
The development of the field collapse feature is a long and confusing story. The main point is that SOLR-236 was never going to scale and the performance in general was bad. A new approach was needed. This was implemented in SOLR-1682 and added to the trunk (4.0-dev) around September last year.

Re: A rant about field collapsing

2011-08-04 Thread Martijn v Groningen
Well, the original page moved to: http://wiki.apache.org/solr/FieldCollapsingUncommitted Assuming that you're using Solr 3.3 you can't get the grouped result (lst name=grouped) with SolrJ. I added grouping support to SolrJ some time ago and it will be in Solr 3.4. You can use a nightly 3.x build to

Re: ideas for versioning query?

2011-08-01 Thread Martijn v Groningen
Hi Mike, how many docs and groups do you have in your index? I think the group.sort option fits your requirements. If I remember correctly group.ngroups=true adds something like 30% extra time on top of the search request with grouping, but that was on my local test dataset (~30M docs, ~8000

Re: SolrJ and class versions

2011-07-26 Thread Martijn v Groningen
Were you upgrading from Solr 1.4? SolrJ by default uses the javabin format for querying (wt parameter). The javabin format is not compatible between 1.4 and 3.1 and above. So if your clients were running with SolrJ 1.4 versions I would expect errors to occur. Martijn On 25 July 2011 12:15,

Re: SolrJ and class versions

2011-07-26 Thread Martijn v Groningen
...@scanmine.com wrote: On 07/26/2011 09:26 AM, Martijn v Groningen wrote: Were you upgrading from Solr 1.4? Yep. SolrJ by default uses the javabin format for querying (wt parameter). The javabin format is not compatible between 1.4 and 3.1 and above. So if your clients were running with SolrJ 1.4

Re: Possible bug in Solr 3.3 grouping

2011-07-12 Thread Martijn v Groningen
Hi Nikhil, Thanks for raising this issue. I checked this particular issue in a test case and I ran into the same error, so this is indeed a bug. I've fixed this issue for 3x in revision 1145748. So checking out the latest 3x branch and building Solr yourself should give you this bug fix. Or you

Re: SolrJ and Range Faceting

2011-06-13 Thread Martijn v Groningen
wrote: Martjin, I had not considered doing something like manufacturedate_dt:[2007-02-13T15:26:37Z TO 2007-02-13T15:26:37Z+1YEAR] does this work? If so that completely eliminates the need to use the date math parsers right? On Sun, Jun 12, 2011 at 9:10 AM, Martijn v Groningen
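
A minimal sketch of producing such a range filter as a string, with the end bound written as the start value plus a date-math gap. The `date_range_filter` helper is hypothetical; the field name and timestamp are taken from the question above, and whether this form fits the poster's use case is what the thread itself is asking.

```python
from urllib.parse import quote

def date_range_filter(field, start, gap):
    """Return field:[start TO start<gap>], e.g. with gap '+1YEAR'."""
    return f"{field}:[{start} TO {start}{gap}]"

fq = date_range_filter("manufacturedate_dt", "2007-02-13T15:26:37Z", "+1YEAR")

# When the filter is sent in a URL, the plus sign must be percent-encoded,
# because a literal '+' in a query string is decoded as a space.
encoded_fq = quote(fq, safe="")
```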

Re: SolrJ and Range Faceting

2011-06-12 Thread Martijn v Groningen
version. Again thanks! On Sat, Jun 11, 2011 at 8:15 AM, Martijn v Groningen martijn.is.h...@gmail.com wrote: Hi James, Good idea! I'll add a getAsFilterQuery method to the patch. Martijn On 6 June 2011 19:32, Jamie Johnson jej2...@gmail.com wrote: Small error, shouldn't be using

Re: SolrJ and Range Faceting

2011-06-11 Thread Martijn v Groningen
+ TO + endStr; return facetField.getName() + :[ + label + ]); } On Fri, Jun 3, 2011 at 7:05 AM, Martijn v Groningen martijn.is.h...@gmail.com wrote: Hi Jamie, I don't know why range facets didn't make it into SolrJ. But I've recently opened an issue

Re: SolrJ and Range Faceting

2011-06-03 Thread Martijn v Groningen
Hi Jamie, I don't know why range facets didn't make it into SolrJ. But I've recently opened an issue for this: https://issues.apache.org/jira/browse/SOLR-2523 I hope this will be committed soon. Check the patch out and see if you like it. Martijn On 2 June 2011 18:22, Jamie Johnson

Re: Result Grouping always returns grouped output

2011-06-02 Thread Martijn v Groningen
Hi Karel, group.main=true should do the trick. When that is set to true the group.format is always simple. Martijn On 27 May 2011 19:13, kare...@gmail.com kare...@gmail.com wrote: Hello, I am using the latest nightly build of Solr 4.0 and I would like to use grouping/field collapsing while

Re: [POLL] How do you (like to) do logging with Solr

2011-05-16 Thread Martijn v Groningen
[ ] I always use the JDK logging as bundled in solr.war, that's perfect [ ] I sometimes use log4j or another framework and am happy with re-packaging solr.war [X] Give me solr.war WITHOUT an slf4j logger binding, so I can choose at deploy time [ ] Let me choose whether to bundle a binding or

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread Martijn v Groningen
[] ASF Mirrors (linked in our release announcements or via the Lucene website) [ X ] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) [ X ] I/we build them from source via an SVN/Git checkout. [] Other (someone in your company mirrors them internally or via a downstream project)

Re: Field Collapse question

2010-07-04 Thread Martijn v Groningen
Hi Ken, Not collapsing on null field values is not possible in the patch. However, if you want, you can fix this in the patch; it is a really small change. Assuming that you're using the default collapsing algorithm you can add the following piece of code in the NonAdjacentDocumentCollapser.java

Re: SOLR-236 Patch

2010-06-25 Thread Martijn v Groningen
Hi Sam, It seems that the patch is out of sync again with the trunk. Can you try patching with revision 955615? I'll update the patch shortly. Martijn On 24 June 2010 09:49, Amdebirhan, Samson, VF-Group samson.amdebir...@vodafone.com wrote: Hi Trying to apply the SOLR-236 patch to a trunk

Re: Field Collapsing SOLR-236

2010-06-22 Thread Martijn v Groningen
in this thread (i.e. rev 955615) but cannot find it in the repository. (it has revision 955569 followed by revision 955785). Any pointers?? Regards Raakhi On Tue, Jun 22, 2010 at 2:03 AM, Martijn v Groningen martijn.is.h...@gmail.com wrote: Oh in that case is the code stable enough to use

Re: collapse exception

2010-06-22 Thread Martijn v Groningen
: I don't know because it's patched by someone else but I can't get his help. When this component become a contrib? Using patch is so annoying 2010/6/22 Martijn v Groningen martijn.is.h...@gmail.com: What version of Solr and which patch are you using? On 21 June 2010 11:46, Li Li fancye

Re: Field Collapsing SOLR-236

2010-06-21 Thread Martijn v Groningen
but then i am not able to sort on any other field. is there any workaround to support this feature?? Regards, Raakhi On Fri, Jun 18, 2010 at 6:14 PM, Martijn v Groningen martijn.is.h...@gmail.com wrote: Hi Rakhi, The patch is not compatible with 1.4. If you want to work with the trunk. I'll

Re: collapse exception

2010-06-21 Thread Martijn v Groningen
What version of Solr and which patch are you using? On 21 June 2010 11:46, Li Li fancye...@gmail.com wrote: it says  Either filter or filterList may be set in the QueryCommand, but not both. I am newbie of solr and have no idea of the exception. What's wrong with it? thank you.

Re: Field Collapsing SOLR-236

2010-06-18 Thread Martijn v Groningen
at 1:24 AM, Moazzam Khan moazz...@gmail.com wrote: I knew it wasn't me! :) I found the patch just before I read this and applied it to the trunk and it works! Thanks Mark and martijn for all your help! - Moazzam On Thu, Jun 17, 2010 at 2:16 PM, Martijn v Groningen martijn.is.h

Re: Field Collapsing SOLR-236

2010-06-17 Thread Martijn v Groningen
I've added a new patch to the issue, so building the trunk (rev 955615) with the latest patch should not be a problem. Due to recent changes in the Lucene trunk the patch was not compatible. On 17 June 2010 20:20, Erik Hatcher erik.hatc...@gmail.com wrote: On Jun 16, 2010, at 7:31 PM, Mark

Re: question about the fieldCollapseCache

2010-06-09 Thread Martijn v Groningen
The fieldCollapseCache should not be used as it is now, it uses too much memory. It stores any information relevant for a field collapse search. Like document collapse counts, collapsed document ids / fields, collapsed docset and uncollapsed docset (everything per unique search). So the memory

Re: question about the fieldCollapseCache

2010-06-09 Thread Martijn v Groningen
I agree. I'll add this information to the wiki. On 9 June 2010 14:32, Jean-Sebastien Vachon js.vac...@videotron.ca wrote: ok great. I believe this should be mentioned in the wiki. Later On 2010-06-09, at 4:06 AM, Martijn v Groningen wrote: The fieldCollapseCache should not be used

Re: Applying collapse patch

2010-05-28 Thread Martijn v Groningen
The trunk should work with the latest patch (SOLR-236-trunk.patch). Did the patching succeed? What compilation errors do you get? On 28 May 2010 11:10, Sophie M. sop...@beezik.com wrote: Ok I will have a look on the comments and I will post if necessary. Thanks ^^ -- View this message in

Re: Applying collapse patch

2010-05-28 Thread Martijn v Groningen
Have you executed: ant example after building? (Assuming that this is the example solr) On 28 May 2010 12:17, Sophie M. sop...@beezik.com wrote: It is ok for applying the patch, thanks Martin. When I start Solr I get this logs in my console :

Re: Field Collapsing SOLR-236

2010-03-25 Thread Martijn v Groningen
Hi Blargy, The latest patch is not compatible with 1.4. I believe that the latest field-collapse-5.patch file is compatible with 1.4. The file should at least compile with 1.4 trunk. I'm not sure how the performance is. Martijn On 25 March 2010 01:49, Dennis Gearon gear...@sbcglobal.net wrote:

Re: Dedupe of document results at query-time

2010-01-23 Thread Martijn v Groningen
This manner of detecting duplicates at query time does really match with what field collapsing does. So I suggest you look into that. As far as I know there isn't any function query that does something you have described in your example. Cheers, Martijn On 23 January 2010 12:31, Peter S

Re: Solr 1.4 Field collapsing - What are the steps for applying the SOLR-236 patch?

2010-01-12 Thread Martijn v Groningen
I wouldn't use the patches of the sub issues right now as they are still under development (they are currently a POC). I also think that the latest patch in SOLR-236 is currently the best option. There are some memory related problems with the patch that have to do with caching. The

Re: Field Collapsing - disable cache

2009-12-23 Thread Martijn v Groningen
what i did last time to get field-collapse-5.patch working successfully. On Tue 22/12/09 22:43 , Lance Norskog  wrote: To avoid this possible bug, you could change the cache to only have a few entries. On Tue, Dec 22, 2009 at 6:34 AM, Martijn v Groningen wrote: In the latest patch some

Re: Field Collapsing - disable cache

2009-12-23 Thread Martijn v Groningen
(ThreadPool.java:685) at java.lang.Thread.run(Thread.java:636) On Wed 23/12/09 13:26 , r...@intelcompute.com wrote: Thanks, that latest update to the patch works fine now. On Wed 23/12/09 13:13 , Martijn v Groningen  wrote: Latest SOLR-236.patch is for the trunk, if have updated

Re: Field Collapsing - disable cache

2009-12-22 Thread Martijn v Groningen
Hi Rob, What patch are you actually using from SOLR-236? Martijn 2009/12/22 r...@intelcompute.com: I've tried both, the whole fieldCollapsing tag, and just the fieldCollapseCache tag inside it.        both cause error.        I guess I can just set size, initialSize, and autowarmCount to 0

Re: Field Collapsing - disable cache

2009-12-22 Thread Martijn v Groningen
In the latest patch some changes were made on the configuration side, but if you add the CollapseComponent to the conf no field collapse cache should be enabled. If not let me know. Martijn 2009/12/22 r...@intelcompute.com: On Tue 22/12/09 12:28 , Martijn v Groningen martijn.is.h

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

2009-12-13 Thread Martijn v Groningen
have applied the patch on the Solr 1.4 build. I am not using the latest solr nightly build. Can that cause any problem? -- Thanks Varun Gupta On Fri, Dec 11, 2009 at 3:44 AM, Martijn v Groningen martijn.is.h...@gmail.com wrote: I tried to reproduce a similar situation here, but I got

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

2009-12-10 Thread Martijn v Groningen
Hi Varun, Can you send the whole requests (with params), that you send to Solr for both queries? In your situation the collapse parameters only have to be used for the first query and not the second query. Martijn 2009/12/10 Varun Gupta varun.vgu...@gmail.com: Hi, I have documents under 6

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

2009-12-10 Thread Martijn v Groningen
On Thu, Dec 10, 2009 at 5:58 PM, Martijn v Groningen martijn.is.h...@gmail.com wrote: Hi Varun, Can you send the whole requests (with params), that you send to Solr for both queries? In your situation the collapse parameters only have to be used for the first query and not the second

Re: Grouping

2009-12-06 Thread Martijn v Groningen
Field collapsing has some aggregation functions like sum() and avg(), but the statistics are computed based on collapse groups instead of all documents with the same field value. A collapse group contains documents that were not relevant enough to end up (collapsed documents) in the search result

Re: Deduplication in 1.4

2009-11-26 Thread Martijn v Groningen
Field collapsing has been used by many in their production environment. Over the last few months the stability of the patch grew as quite some bugs were fixed. The only big feature missing currently is caching of the collapsing algorithm. I'm currently working on that and I will put it in a new patch

Re: Deduplication in 1.4

2009-11-26 Thread Martijn v Groningen
From: Martijn v Groningen martijn.is.h...@gmail.com To: solr-user@lucene.apache.org Sent: Thu, November 26, 2009 3:19:40 AM Subject: Re: Deduplication in 1.4 Field collapsing has been used by many in their production environment. Got any pointers to public sites you know use it?  I

Re: question about collapse.type = adjacent

2009-11-02 Thread Martijn v Groningen
Hi Micheal, Field collapsing is basically done in two steps. The first step is to get the uncollapsed sorted (whether it is score or a field value) documents and the second step is to apply the collapse algorithm on the uncollapsed documents. So yes, when specifying collapse.type=adjacent the
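
A rough in-memory model of the second step for the adjacent case: the input is the already sorted, uncollapsed list from the first step, and only runs of neighbouring documents with the same field value are folded together. This is an illustration under the assumption of a threshold of one document per run, not the SOLR-236 code; `collapse_adjacent` and the `site` field are made up.

```python
def collapse_adjacent(sorted_docs, field):
    """Keep the first document of every run of equal field values.

    Returns a list of dicts holding the kept document and the number of
    neighbouring documents that were folded into it.
    """
    result = []
    for doc in sorted_docs:
        if result and result[-1]["doc"][field] == doc[field]:
            # Same value as the previously kept document: collapse it.
            result[-1]["collapse_count"] += 1
        else:
            result.append({"doc": doc, "collapse_count": 0})
    return result

# Output of the first step: documents already in their final sort order.
uncollapsed = [
    {"id": 1, "site": "a"},
    {"id": 2, "site": "a"},
    {"id": 3, "site": "b"},
    {"id": 4, "site": "a"},
]
collapsed = collapse_adjacent(uncollapsed, "site")
kept_ids = [entry["doc"]["id"] for entry in collapsed]
counts = [entry["collapse_count"] for entry in collapsed]
```

Document 4 is kept although its value already occurred, because it is not next to the earlier run; a non-adjacent algorithm would fold it into document 1.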

Re: weird problem with letters S and T

2009-10-28 Thread Martijn v Groningen
I think that is not a problem, because you are only storing one character per field. There are other text field types that do not have the stop word filter, so give your first letter field that field type. In this way the stopword filter is only disabled for searches on the first letter

Re: field collapsing bug (java.lang.ArrayIndexOutOfBoundsException)

2009-10-25 Thread Martijn v Groningen
Hi Joe, Can you give a bit more context info? Like the exact search and the field types you are using for example. Also are you doing a lot of frequent updates to the index? Cheers, Martijn 2009/10/23 Joe Calderon calderon@gmail.com: seems to happen when sort on anything besides strictly

Re: field collapsing bug (java.lang.ArrayIndexOutOfBoundsException)

2009-10-25 Thread Martijn v Groningen
/10/25 Martijn v Groningen martijn.is.h...@gmail.com: Hi Joe, Can you give a bit more context info? Like the exact search and the field types you are using for example. Also are you doing a lot of frequent updates to the index? Cheers, Martijn 2009/10/23 Joe Calderon calderon@gmail.com

Re: Collapse with multiple fields

2009-10-23 Thread Martijn v Groningen
No, this is actually not supported at the moment. If you really need to collapse on two different fields you can concatenate the two fields together in another field while indexing and then collapse on that field. Martijn 2009/10/23 Thijs vonk.th...@gmail.com: I haven't had time to actually ask this
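
A sketch of that workaround on the indexing side: before a document is sent to Solr, the two values are joined into one extra field, and collapsing then uses that field. The `with_collapse_key` helper, the `collapse_key` field name and the `|` separator are assumptions for illustration; the combined field would also need to exist in the schema.

```python
def with_collapse_key(doc, fields, target="collapse_key", sep="|"):
    """Return a copy of doc with the given field values joined into one field."""
    enriched = dict(doc)  # leave the caller's document untouched
    enriched[target] = sep.join(str(doc[name]) for name in fields)
    return enriched

doc = {"id": "1", "brand": "acme", "color": "red"}
indexed_doc = with_collapse_key(doc, ["brand", "color"])
```

The separator should be a character that cannot occur in the values, so that different value pairs never produce the same key.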

Re: JVM OOM when using field collapse component

2009-10-02 Thread Martijn v Groningen
No, I have not encountered an OOM exception yet with the current field collapse patch. How large is your configured JVM heap space (-Xmx)? Field collapsing requires more memory than regular searches. Does Solr run out of memory during the first search(es) or does it run out of memory after a while when

Re: field collapsing sums

2009-10-02 Thread Martijn v Groningen
Well that is odd. How have you configured field collapsing with the dismax request handler? The collapse counts should be X - 1 (if collapse.threshold=1). Martijn 2009/10/1 Joe Calderon calderon@gmail.com: thx for the reply, i just want the number of dupes in the query result, but it seems i

Re: field collapsing sums

2009-10-01 Thread Martijn v Groningen
Hi Joe, Currently the patch does not do that, but you can do something else that might help you in getting your summed stock. In the latest patch you can include fields of collapsed documents in the result per distinct field value. If you specify collapse.includeCollapseDocs.fl=num_in_stock in

Re: field collapsing sums

2009-10-01 Thread Martijn v Groningen
only dupes of records on the first page are returned or is there a setting im missing? currently im only sending, collapse.field=brand and collapse.includeCollapseDocs.fl=num_in_stock --joe On Thu, Oct 1, 2009 at 1:14 AM, Martijn v Groningen martijn.is.h...@gmail.com wrote: Hi Joe

Re: Mapping SolrDoc to SolrInputDoc

2009-09-16 Thread Martijn v Groningen
Hi Licinio, You can use ClientUtils.toSolrInputDocument(...), that converts a SolrDocument to a SolrInputDocument. Martijn 2009/9/16 Licinio Fernández Maurelo licinio.fernan...@gmail.com: Hi there, currently i'm working on a small app which creates an Embedded Solr Server, reads all

Re: Monitoring split time for fq queries when filter cache is used

2009-09-01 Thread Martijn v Groningen
Hi Rahul, Yes, your understanding is correct, but it is not possible to monitor these actions separately with Solr. Martijn 2009/9/1 Rahul R rahul.s...@gmail.com: Hello, I am trying to measure the benefit that I am getting out of using the filter cache. As I understand, there are two major

Re: Problems importing HTML content contained within XML document

2009-08-19 Thread Martijn v Groningen
Hi Venn, I think what is happening when the BODY element is being processed by the XPath expression (/document/category/BODY) is that it does not retrieve the text content from the P elements inside the BODY element. The expression will only retrieve text content that is directly a child of the BODY
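
The difference between an element's own text and the text of its descendants can be seen with any XML parser. The sketch below uses Python's ElementTree on a made-up document with the same element names; it only illustrates the distinction and is not the data import handler's XPath processing.

```python
import xml.etree.ElementTree as ET

xml = (
    "<document><category>"
    "<BODY>Intro <P>first</P><P>second</P></BODY>"
    "</category></document>"
)
root = ET.fromstring(xml)           # root is the <document> element
body = root.find("./category/BODY")

# Only the text that is a direct child of BODY, before its first child element.
direct_text = body.text

# All text below BODY, including the text inside the nested P elements.
full_text = "".join(body.itertext())
```

Here the direct text is just the leading "Intro " fragment, while the full text also contains the paragraph contents.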