Re: Solr faceting vs. Lucene faceting

2012-12-14 Thread Upayavira
Solr has its PathHierarchyTokenFilter, which can tokenise /books/computers/programming into /books, /books/computers and /books/computers/programming. You can facet on that. Of course, part of the work is done at index time, which appears to be no different to the Lucene faceting method, at least for
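
The tokenisation described in this message can be illustrated with a small sketch. It is plain Python that mimics the output given above, not Lucene or Solr code; the function name `path_hierarchy_tokens`, the `delimiter` parameter and the sample documents are made up for illustration, and paths are assumed to start with the delimiter. (In Lucene's analysis module the class that behaves this way is named PathHierarchyTokenizer.)

```python
from collections import Counter


def path_hierarchy_tokens(path, delimiter="/"):
    """Return every ancestor prefix of `path`, shortest first.

    Illustrative only: assumes `path` starts with `delimiter`.
    """
    tokens = []
    prefix = ""
    for part in path.split(delimiter):
        if not part:
            continue  # skip the empty piece produced by the leading delimiter
        prefix = prefix + delimiter + part
        tokens.append(prefix)
    return tokens


# Hypothetical documents, each tagged with one category path.
docs = [
    "/books/computers/programming",
    "/books/computers/networking",
    "/books/cooking",
]

# Faceting on the tokenised field amounts to counting each prefix once per document.
facet_counts = Counter(tok for path in docs for tok in path_hierarchy_tokens(path))

print(path_hierarchy_tokens("/books/computers/programming"))
print(facet_counts["/books"], facet_counts["/books/computers"])
```

Because every ancestor prefix is indexed as its own token, a count for /books already includes documents filed under any of its descendants, which is why part of the work has moved to index time.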

Re: Solr faceting vs. Lucene faceting

2012-12-14 Thread Yonik Seeley
On Fri, Dec 14, 2012 at 4:27 AM, Upayavira u...@odoko.co.uk wrote: Solr has its PathHierarchyTokenFilter, which can tokenise /books/computers/programming into /books, /books/computers and /books/computers/programming. You can facet on that. Of course, part of the work is done at index time,

Re: Solr faceting vs. Lucene faceting

2012-12-13 Thread Robert Muir
Even as a step, it would be nice to have Lucene's faceting exposed to Solr in a way that only works with a single node. Because it supports NRT, doesn't need to build up massive top-level data structures and so on, many people that currently need multiple nodes might be able to work just fine with a

Re: Solr faceting vs. Lucene faceting

2012-12-13 Thread Shai Erera
As I said, if someone volunteers to do some work on the Solr side, I will gladly participate in that effort. I just don't even know where to start w/ Solr :). One thing that would be really great is if we can build an adapter (I think someone mentioned that word here) which supports basic facets

Re: Solr faceting vs. Lucene faceting

2012-12-13 Thread Adrien Grand
Hi Shai, On Thu, Dec 13, 2012 at 12:21 PM, Shai Erera ser...@gmail.com wrote: As I said, if someone volunteers to do some work on the Solr side, I will gladly participate in that effort. I just don't even know where to start w/ Solr :). The entry point for Solr facets is

Re: Solr faceting vs. Lucene faceting

2012-12-13 Thread Jack Krupansky
to the developers of higher-level search platforms such as Solr and ElasticSearch as opposed to the users of those platforms? -- Jack Krupansky -Original Message- From: Adrien Grand Sent: Thursday, December 13, 2012 7:03 AM To: dev@lucene.apache.org Subject: Re: Solr faceting vs. Lucene

Re: Solr faceting vs. Lucene faceting

2012-12-13 Thread Shai Erera
faceting vs. Lucene faceting Hi Shai, On Thu, Dec 13, 2012 at 12:21 PM, Shai Erera ser...@gmail.com wrote: As I said, if someone volunteers to do some work on the Solr side, I will gladly participate in that effort. I just don't even know where to start w/ Solr :). The entry point for Solr

Re: Solr faceting vs. Lucene faceting

2012-12-13 Thread Shai Erera
Hi Adrien, "the lucene module requires users to decide at indexing time what and how to facet whereas Solr does everything at searching time" True, that's one difference between the two implementations today, even though I think that we can create a specialized path (under LUCENE-4619) for

Re: Solr faceting vs. Lucene faceting

2012-12-13 Thread Adrien Grand
Hi Shai, Thanks for your answers! On Thu, Dec 13, 2012 at 5:05 PM, Shai Erera ser...@gmail.com wrote: the lucene module requires users to decide at indexing time what and how to facet whereas Solr does everything at searching time True, that's one difference between the two implementations

Re: Solr faceting vs. Lucene faceting

2012-12-13 Thread Jack Krupansky
Thanks. Now back to thinking about Lucene vs. Solr facets in Solr. -- Jack Krupansky From: Shai Erera Sent: Thursday, December 13, 2012 10:45 AM To: dev@lucene.apache.org Subject: Re: Solr faceting vs. Lucene faceting Hi Jack, Are Lucene facets static in some/any sense? Lucene facets

Re: Solr faceting vs. Lucene faceting

2012-12-13 Thread Smiley, David W.
I second this use-case. This is my only concern with Solr faceting: Solr's UnInvertedField on the search index to discover frequently used words. It doesn't scale well. Shai, do you think this would scale? FWIW, one of my indexes with only 300k docs has ~3.1M terms, not a lot, but it's a

Re: Solr faceting vs. Lucene faceting

2012-12-13 Thread Shai Erera
Computing facets on the fly is interesting. Indeed, if you want to use the taxonomy index, you have to plan for this in advance by, say, adding each term to the taxonomy under '/' and asking to count '/'. If your index is static, then not being able to delete from the taxonomy index won't be a

Re: Solr faceting vs. Lucene faceting

2012-12-11 Thread Yonik Seeley
On Tue, Dec 11, 2012 at 2:06 AM, Shai Erera ser...@gmail.com wrote: The taxonomy manages the global ordinals for categories. I wonder if there's a way to do global ordinals w/ a codec instead of a sidecar index? -Yonik http://lucidworks.com

Re: Solr faceting vs. Lucene faceting

2012-12-11 Thread Robert Muir
On Tue, Dec 11, 2012 at 11:02 AM, Yonik Seeley yo...@lucidworks.com wrote: On Tue, Dec 11, 2012 at 2:06 AM, Shai Erera ser...@gmail.com wrote: The taxonomy manages the global ordinals for categories. I wonder if there's a way to do global ordinals w/ a codec instead of a sidecar index? I'm

Re: Solr faceting vs. Lucene faceting

2012-12-11 Thread Tommaso Teofili
2012/12/11 Robert Muir rcm...@gmail.com On Tue, Dec 11, 2012 at 11:02 AM, Yonik Seeley yo...@lucidworks.com wrote: On Tue, Dec 11, 2012 at 2:06 AM, Shai Erera ser...@gmail.com wrote: The taxonomy manages the global ordinals for categories. I wonder if there's a way to do global ordinals

Re: Solr faceting vs. Lucene faceting

2012-12-11 Thread Lukáš Vlček
Hi Shai, thanks for your blog, I am looking forward to your future posts! Just two questions: you mentioned that you have been running this in production in distributed mode. If I understand it correctly the idea is there is only a single taxonomy index even if the distributed mode means that

Re: Solr faceting vs. Lucene faceting

2012-12-11 Thread Shai Erera
There are two ways you can work with the taxonomy index in a distributed environment (at least, these are the things that we've tried): (1) replicate the taxonomy to all shards, so that each shard sees the entire global taxonomy (2) each shard maintains its own taxonomy. (1) only makes sense when
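
One consequence of option (2), as far as I understand it: each taxonomy assigns its own ordinals, so an ordinal from one shard means nothing on another, and per-shard counts would have to be merged by category label rather than by ordinal. A rough sketch of that merge in plain Python follows. It is not Lucene code; the function name `merge_shard_facets` and the shard data are hypothetical.

```python
from collections import Counter


def merge_shard_facets(shard_results, top_n=10):
    """Merge per-shard facet counts by category label.

    Each item in `shard_results` is a pair:
      (taxonomy, counts_by_ordinal)
    where `taxonomy` maps that shard's local ordinal -> category label and
    `counts_by_ordinal` maps local ordinal -> count. Hypothetical shapes.
    """
    merged = Counter()
    for taxonomy, counts_by_ordinal in shard_results:
        for ordinal, count in counts_by_ordinal.items():
            # Translate the shard-local ordinal to its label before adding.
            merged[taxonomy[ordinal]] += count
    return merged.most_common(top_n)


# Two shards that assigned different ordinals to the same categories.
shard_a = ({1: "/books", 2: "/books/computers"}, {1: 5, 2: 3})
shard_b = ({1: "/books/computers", 2: "/books"}, {1: 4, 2: 6})

merged = merge_shard_facets([shard_a, shard_b])
print(merged)
```

With option (1), where every shard sees the same global taxonomy, the ordinals agree across shards and this label lookup would not be needed before summing.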

Re: Solr faceting vs. Lucene faceting

2012-12-10 Thread Otis Gospodnetic
Thanks Yonik. Would it also make sense to add Solr's faceting method to Lucene's faceting module? Thanks, Otis On Sun, Dec 9, 2012 at 6:38 PM, Yonik Seeley yo...@lucidworks.com wrote: On Sun, Dec 9, 2012 at 5:55 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Are there plans to

Re: Solr faceting vs. Lucene faceting

2012-12-10 Thread Shai Erera
Hi, the faceting module in Lucene is very generic and extensible in many ways. From the little I read about Solr facets, I think that all of its features can be implemented on top of Lucene facets. Some directly with the code that exists today, some by writing a few extension points. I don't

Re: Solr faceting vs. Lucene faceting

2012-12-10 Thread David Smiley (@MITRE.org)
Shai Erera wrote: Yonik, unlike Solr facets (which manage everything in the search index), the Lucene module comes with a sidecar taxonomy index, so e.g. when Solr replicates shards, it will need to replicate the files of one other index. That's the big difference, the rest are minuscule I think. And

Re: Solr faceting vs. Lucene faceting

2012-12-10 Thread Shai Erera
You're right, the sidecar index does bring some challenges into the picture, but we've been using it like that for many years, in distributed mode too, and so far it hasn't been an issue. I opened LUCENE-3786 to create SearcherTaxoManager, which lets you manage IndexSearcher and TaxonomyReader pairs, like

Re: Solr faceting vs. Lucene faceting

2012-12-09 Thread Yonik Seeley
On Sun, Dec 9, 2012 at 5:55 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Are there plans to switch Solr to Lucene's faceting? Nope. There is no one best algorithm - different approaches work best in different circumstances. We've added faceting implementations to Solr over time, and