copyField - copy only specific words

2013-01-25 Thread b.riez...@pixel-ink.de
Hi,

I'd like to copy specific words from the keywords field to another field.
Because the data I get is all in one field, I'd like to extract the cities
(they are fixed, so I'll know them in advance) and put them in a separate field.

Can I generate a whitelist file and tell the copyField to check this file and
only copy matching words to a new field?

Thanks for your help
Ben


Re: copyField - copy only specific words

2013-01-25 Thread Tomás Fernández Löbbe
I think the best way would be to pre-process the document (or use a custom
UpdateRequestProcessor). Another option, if you'll only use the cities
field for faceting/sorting/searching (i.e. you don't need the stored content),
would be to use a regular copyField with a KeepWordFilter on the
cities field. However, with this approach it will be difficult to handle
multi-word cities like New York or Buenos Aires.

Tomás
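
The pre-processing route mentioned above could look roughly like the following
Python sketch, run on each document before it is sent to Solr. It is only an
illustration under assumptions: the field names "keywords" and "cities", the
sample whitelist, and all function names are hypothetical, and the whitelist
would normally be loaded from a file. Matching whole phrases against the raw
text is what lets multi-word cities survive.

```python
import re

# Hypothetical whitelist; in practice this could be read from a file,
# one city per line.
CITIES = ["New York", "Buenos Aires", "Berlin", "Paris"]


def build_city_pattern(cities):
    """Compile one case-insensitive regex matching any whitelisted city."""
    # Longest names first so a longer name wins over a shorter prefix.
    ordered = sorted(cities, key=len, reverse=True)
    alternatives = "|".join(re.escape(c) for c in ordered)
    # The lookarounds restrict matches to whole words.
    return re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)


def extract_cities(keywords, cities=CITIES):
    """Return whitelisted cities found in `keywords`, in order, deduplicated."""
    canonical = {c.lower(): c for c in cities}
    found = []
    for match in build_city_pattern(cities).finditer(keywords):
        city = canonical[match.group(0).lower()]
        if city not in found:
            found.append(city)
    return found


def add_cities_field(doc, source="keywords", target="cities"):
    """Return a copy of `doc` with `target` filled from `source`."""
    enriched = dict(doc)
    enriched[target] = extract_cities(doc.get(source, ""))
    return enriched
```

Because the extracted values are written into the document itself, the cities
field can be stored as well as indexed, which the copyField plus
KeepWordFilter variant does not give you.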





RE: copyField - copy only specific words

2013-01-25 Thread Markus Jelsma
Hi

Use the KeepWordFilter on the destination field:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.KeepWordFilterFactory

Cheers
 
 
 


Re: copyField - copy only specific words

2013-01-25 Thread Alexandre Rafalovitch
Possibly with shingles before the KeepWordFilter to deal with multi-word
cities (though I am not sure whether KeepWordFilter allows space-separated
tokens in the file): http://stackoverflow.com/questions/14479473/

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)
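
To make the shingles-then-keep-words idea concrete, here is a small Python
sketch that only mimics the token flow; it is not Solr code. The function
names and the whitespace tokenization are assumptions for illustration, and
whether KeepWordFilter itself accepts multi-word entries in its word file
remains the open question raised above.

```python
def shingles(tokens, max_size=2):
    """Yield every run of 1..max_size adjacent tokens, joined by a space."""
    for start in range(len(tokens)):
        for size in range(1, max_size + 1):
            if start + size <= len(tokens):
                yield " ".join(tokens[start:start + size])


def keep_words(tokens, whitelist):
    """Keep only tokens present in the whitelist (case-insensitive)."""
    allowed = {w.lower() for w in whitelist}
    return [t for t in tokens if t.lower() in allowed]


def city_tokens(text, whitelist, max_size=2):
    """Whitespace-tokenize, shingle, then filter against the whitelist."""
    return keep_words(list(shingles(text.split(), max_size)), whitelist)
```

The point of the ordering is that "New York" only exists as a single token
after shingling, so the whitelist filter has to come second.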

