RE: Ebay Kleinanzeigen and Auto Suggest

2011-05-03 Thread Charton, Andre
Hi,

yes we do. 

If you have a limited number of categories (around 100), you can use dynamic
fields with the TermsComponent, choosing a category-specific field prefix, like:

{schema.xml}
...
<dynamicField name="*_suggestion" type="textAS" indexed="true" stored="false"
              multiValued="true" omitNorms="true"/>
...
{schema.xml}

And within the DataImportHandler we script the prefixed field name from the given category:

{data-config.xml}
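function setCatPrefixFields(row) {
    var catId = row.get('category');
    var title = row.get('freetext');
    // Build the per-category field name, e.g. "c42_suggestion".
    var cat_prefix = 'c' + catId + '_suggestion';
    // Assumed step (not shown in the original mail): copy the title into that field.
    row.put(cat_prefix, title);
    return row;
}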
{data-config.xml}

Then we map requests to these fields in our application layer, through a
specific request handler, based on this prefix.
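
A minimal sketch of what such an application-layer step could look like, assuming the standard TermsComponent parameters (terms.fl, terms.prefix, terms.sort, terms.limit). The helper names, the base URL and the /terms handler path are assumptions for illustration, not our actual code:

```javascript
// Hypothetical helpers; only the 'c' + catId + '_suggestion' naming comes from the mail.
function suggestionField(catId) {
  // Mirrors the field name built in the DIH script.
  return 'c' + catId + '_suggestion';
}

function buildTermsUrl(baseUrl, catId, prefix, limit) {
  var params = new URLSearchParams();
  params.set('terms', 'true');
  params.set('terms.fl', suggestionField(catId)); // category-specific dynamic field
  params.set('terms.prefix', prefix);             // what the user has typed so far
  params.set('terms.sort', 'count');              // most frequent terms first
  params.set('terms.limit', String(limit));
  return baseUrl + '/terms?' + params.toString();
}

console.log(buildTermsUrl('http://localhost:8983/solr', 42, 'audi', 10));
// http://localhost:8983/solr/terms?terms=true&terms.fl=c42_suggestion&terms.prefix=audi&terms.sort=count&terms.limit=10
```

The category filter is expressed purely through the choice of field, so no filter query is needed on the terms request.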

Pro:
- works fine for a limited number of categories

Con:
- the index gets bigger; we measured an increase of ~40 percent

Regards

André Charton




RE: Ebay Kleinanzeigen and Auto Suggest

2011-05-03 Thread Andy

--- On Tue, 5/3/11, Charton, Andre achar...@ebay-kleinanzeigen.de wrote:

 Con:
     - index is getting bigger, we measure increasing by ~40 percent


Very interesting.

Why did the index get bigger? You're still indexing the same title, just to 
different dynamic fields, right? So the total amount of data indexed should 
still be the same. Adding dynamic fields shouldn't increase the index size. 
What am I missing?

Andy
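
One possible contributor, sketched here under two assumptions: Lucene keeps terms per field, so the same word indexed under two category fields is stored as two distinct terms, and nothing else about the setup changed. The toy titles and counts below are made up for illustration and are not measurements from this thread:

```javascript
// Toy data (hypothetical): three titles in three different categories.
var docs = [
  { category: 1, title: 'audi a4' },
  { category: 2, title: 'audi poster' },
  { category: 3, title: 'audi keyring' }
];

// Count distinct (field, term) pairs, since a term belongs to a field.
function countDistinctTerms(docs, fieldFor) {
  var seen = {};
  docs.forEach(function (doc) {
    doc.title.split(/\s+/).forEach(function (term) {
      seen[fieldFor(doc) + ':' + term] = true;
    });
  });
  return Object.keys(seen).length;
}

// Everything in one shared field versus one field per category.
var single = countDistinctTerms(docs, function () { return 'suggest_text'; });
var perCategory = countDistinctTerms(docs, function (doc) {
  return 'c' + doc.category + '_suggestion';
});

console.log(single, perCategory); // 4 6
```

The amount of text is the same in both cases, but "audi" is counted once in the shared field and three times across the per-category fields. Whether this accounts for the ~40 percent reported above is not something the thread confirms.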


Re: Ebay Kleinanzeigen and Auto Suggest

2011-04-27 Thread Eric Grobler
Thanks for the links Otis,

I will have a look.

Regards
Ericz




Re: Ebay Kleinanzeigen and Auto Suggest

2011-04-27 Thread Eric Grobler
Hi Otis,

The new Solr 3.1 Suggester also does not support filter queries.

Is anyone using shingles with faceting on large data?

Regards
Ericz
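
For reference, a rough sketch of the faceting variant being asked about: facet on the shingle field with facet.prefix and restrict by category with fq. The parameters are standard Solr request parameters; the field names suggest_text and category, the /select path and the helper name are assumptions, and this says nothing about how it behaves on large data:

```javascript
// Hypothetical helper; field names and handler path are assumed, not confirmed.
function buildFacetSuggestUrl(baseUrl, category, prefix, limit) {
  var params = new URLSearchParams();
  params.set('q', '*:*');
  params.set('rows', '0');                    // only facet counts are needed
  params.set('fq', 'category:' + category);   // the category filter
  params.set('facet', 'true');
  params.set('facet.field', 'suggest_text');  // shingle field
  params.set('facet.prefix', prefix);         // what the user has typed so far
  params.set('facet.limit', String(limit));
  params.set('facet.mincount', '1');
  return baseUrl + '/select?' + params.toString();
}

console.log(buildFacetSuggestUrl('http://localhost:8983/solr', 'cars', 'audi', 10));
// http://localhost:8983/solr/select?q=*%3A*&rows=0&fq=category%3Acars&facet=true&facet.field=suggest_text&facet.prefix=audi&facet.limit=10&facet.mincount=1
```

Unlike a plain terms request, facet counts are computed over the documents matching q and fq, which is what makes the category filter possible here.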




Ebay Kleinanzeigen and Auto Suggest

2011-04-26 Thread Eric Grobler
Hi

Someone told me that ebay is using solr.
I was looking at their Auto Suggest implementation and I guess they are
using Shingles and the TermsComponent.

I managed to get a satisfactory implementation but I have a problem with
category specific filtering.
Ebay suggestions are sensitive to categories like Cars and Pets.

As far as I understand it is not possible to use filters with a terms
query.
Unless one uses multiple fields or special prefixes for the words to index I
cannot think how to implement this.

Is there perhaps a workaround for this limitation?

Best Regards
EricZ

---

I have a shingle type like:
<fieldType name="shingle_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="4"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>



and a query like
http://localhost:8983/solr/terms?q=*%3A*&terms.fl=suggest_text&terms.sort=count&terms.prefix=audi
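
To make the shingle_text analysis concrete, here is a rough approximation of the terms it would index for one title and which of them a terms.prefix of "audi" would match. Whitespace splitting stands in for StandardTokenizerFactory (an assumption), and the function name and sample title are made up:

```javascript
// Approximates tokenizer + ShingleFilterFactory + LowerCaseFilterFactory.
function shingles(text, minSize, maxSize) {
  var tokens = text.toLowerCase().split(/\s+/).filter(Boolean);
  // ShingleFilterFactory also emits the single tokens by default (outputUnigrams=true).
  var out = tokens.slice();
  for (var start = 0; start < tokens.length; start++) {
    for (var size = minSize; size <= maxSize && start + size <= tokens.length; size++) {
      out.push(tokens.slice(start, start + size).join(' '));
    }
  }
  return out;
}

var terms = shingles('Audi A4 Avant', 2, 4);
console.log(terms);
// [ 'audi', 'a4', 'avant', 'audi a4', 'audi a4 avant', 'a4 avant' ]

// Terms a prefix lookup for "audi" would match.
console.log(terms.filter(function (t) { return t.indexOf('audi') === 0; }));
// [ 'audi', 'audi a4', 'audi a4 avant' ]
```

The prefix lookup only sees terms, not documents, which is why a category filter cannot be applied to it directly.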


Re: Ebay Kleinanzeigen and Auto Suggest

2011-04-26 Thread Otis Gospodnetic
Hi Eric,

Before using the terms component, allow me to point out:

* http://sematext.com/products/autocomplete/index.html (used on 
http://search-lucene.com/ for example)

* http://wiki.apache.org/solr/Suggester


Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


