Synonyms for AND/OR/NOT operators

2004-12-21 Thread Sanyi
Hi!

What is the simplest way to add synonyms for AND/OR/NOT operators?
I'd like to support two sets of operator words, so people can use either the
original English operators or my custom ones for our local language.

Thank you for your attention!
Sanyi




-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Synonyms for AND/OR/NOT operators

2004-12-21 Thread Erik Hatcher
On Dec 21, 2004, at 3:04 AM, Sanyi wrote:
> What is the simplest way to add synonyms for AND/OR/NOT operators?
> I'd like to support two sets of operator words, so people can use
> either the original English operators or my custom ones for our
> local language.

There are two options that I know of: 1) add synonyms during indexing 
and 2) add synonyms during querying.  Generally this would be done 
using a custom analyzer.

If the synonym mappings are static and you don't mind a larger index, 
adding them during indexing avoids the complexity of rewriting the 
query.  Injecting synonyms during querying allows the synonym mappings 
to change dynamically, though it does produce more complex queries.
Here's an example you'll find with the source code distribution of 
Lucene in Action which uses WordNet to look up synonyms.

Erik
p.s. I'm sensitive to over-marketing Lucene in Action in this forum as 
it would bother me to constantly see an advertisement.  You can be sure 
that any mentions of it from me will coincide with concrete examples 
(which are freely available) that are directly related to questions 
being asked.

% ant -emacs SynonymAnalyzerViewer
Buildfile: build.xml
check-environment:
compile:
build-test-index:
build-perf-index:
prepare:
SynonymAnalyzerViewer:
  Using a custom SynonymAnalyzer, two fixed strings are
  analyzed with the results displayed.  Synonyms, from the
  WordNet database, are injected into the same positions
  as the original words.
  See the Analysis chapter for more on synonym injection and
  position increments.  The Tools and extensions chapter covers
  the WordNet feature found in the Lucene sandbox.
Press return to continue...
Running lia.analysis.synonym.SynonymAnalyzerViewer...
1: [quick] [warm] [straightaway] [spry] [speedy] [ready] [quickly] 
[promptly] [prompt] [nimble] [immediate] [flying] [fast] [agile]
2: [brown] [brownness] [brownish]
3: [fox] [trick] [throw] [slyboots] [fuddle] [fob] [dodger] 
[discombobulate] [confuse] [confound] [befuddle] [bedevil]
4: [jumps]
5: [over] [o] [across]
6: [lazy] [faineant] [indolent] [otiose] [slothful]
7: [dogs]

1: [oh]
2: [we]
3: [get] [acquire] [aim] [amaze] [arrest] [arrive] [baffle] [beat] 
[become] [beget] [begin] [bewilder] [bring] [can] [capture] [catch] 
[cause] [come] [commence] [contract] [convey] [develop] [draw] [drive] 
[dumbfound] [engender] [experience] [father] [fetch] [find] [fix] 
[flummox] [generate] [go] [gravel] [grow] [have] [incur] [induce] [let] 
[make] [may] [mother] [mystify] [nonplus] [obtain] [perplex] [produce] 
[puzzle] [receive] [scram] [sire] [start] [stimulate] [stupefy] 
[stupify] [suffer] [sustain] [take] [trounce] [undergo]
4: [both]
5: [kinds]
6: [country] [state] [nationality] [nation] [land] [commonwealth] [area]
7: [western] [westerly]
8: [bb]

BUILD SUCCESSFUL
Total time: 10 seconds
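
To make the "same positions" idea in the output above concrete, here is a minimal plain-Java sketch. It does not use the Lucene API and is not the SynonymAnalyzer from the book: the names SynonymInjectionSketch, Token, inject and render are made up for illustration, and the small synonym table is copied from the viewer output. The only assumption carried over is that a synonym is emitted with a position increment of 0, so it shares the position of the word it follows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of synonym injection; hypothetical names, no Lucene classes. */
final class SynonymInjectionSketch {

    /** A term plus its position increment relative to the previous token. */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Emits each word with increment 1, then its synonyms with increment 0. */
    static List<Token> inject(List<String> words, Map<String, List<String>> synonyms) {
        List<Token> out = new ArrayList<>();
        for (String word : words) {
            out.add(new Token(word, 1));
            for (String synonym : synonyms.getOrDefault(word, List.of())) {
                out.add(new Token(synonym, 0)); // same position as the original word
            }
        }
        return out;
    }

    /** Formats tokens one line per position, in the style of the viewer output. */
    static String render(List<Token> tokens) {
        StringBuilder sb = new StringBuilder();
        int position = 0;
        for (Token t : tokens) {
            if (t.positionIncrement > 0) {
                position += t.positionIncrement;
                if (sb.length() > 0) {
                    sb.append('\n');
                }
                sb.append(position).append(':');
            }
            sb.append(" [").append(t.term).append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> synonyms = Map.of(
                "brown", List.of("brownness", "brownish"),
                "over", List.of("o", "across"));
        System.out.println(render(inject(List.of("brown", "jumps", "over"), synonyms)));
        // prints:
        // 1: [brown] [brownness] [brownish]
        // 2: [jumps]
        // 3: [over] [o] [across]
    }
}
```

Because the synonyms occupy the same position as the original word, a phrase query that spans the position can match either the word or one of its synonyms.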


Re: Synonyms for AND/OR/NOT operators

2004-12-21 Thread Sanyi
Hi!

I think we're talking about different things.
My question is about using synonyms for the AND/OR/NOT operators, not about
synonyms of words in the index.
For example, in some language: AND = AANNDD; OR = OORR; NOT = NNOOTT

So, the user can enter either:

(cat OR kitty) AND black AND tail

or:

(cat OORR kitty) AANNDD black AANNDD tail

Both sets of operators must work.
This must be some kind of query parser modification or parameterization, so it
has nothing to do with the index.

I hope I was more specific now ;)

Thanx,
Sanyi







Re: Synonyms for AND/OR/NOT operators

2004-12-21 Thread Morus Walter
Erik Hatcher writes:
> On Dec 21, 2004, at 3:04 AM, Sanyi wrote:
> > What is the simplest way to add synonyms for AND/OR/NOT operators?
> > I'd like to support two sets of operator words, so people can use
> > either the original English operators or my custom ones for our
> > local language.
>
> There are two options that I know of: 1) add synonyms during indexing
> and 2) add synonyms during querying.  Generally this would be done
> using a custom analyzer.

I guess you misunderstood the question.

I think he wants to know how to create a query parser that understands
something like 'a UND b' as well as 'a AND b', to support localized
operator names (German in this case).

AFAIK that can only be done by copying the query parser's JavaCC source and
adding the operators there.
It shouldn't be difficult, though it's a bit ugly since it implies code
duplication. And there will be no way of choosing the operators dynamically
at runtime. One will need to have different query parsers for different
languages.

Morus




Re: Synonyms for AND/OR/NOT operators

2004-12-21 Thread Erik Hatcher
Wow, I really did misunderstand.  My apologies.
Yes, you will need to fork QueryParser.jj and install JavaCC to build 
your custom parser.  It should be pretty trivial to add alternatives to 
AND(+)/OR/NOT(-).

Erik


Re: Synonyms for AND/OR/NOT operators

2004-12-21 Thread Sanyi
Well, I guess I'd better recognize the operator synonyms and replace them with
their original forms before passing the query to QueryParser. I don't feel
comfortable tampering with Lucene's source code.

Anyway, thanx for the answers.

Sanyi




Re: Synonyms for AND/OR/NOT operators

2004-12-21 Thread Morus Walter
Sanyi writes:
> Well, I guess I'd better recognize the operator synonyms and replace them
> with their original forms before passing the query to QueryParser. I don't
> feel comfortable tampering with Lucene's source code.
 
Apart from knowing how to compile Lucene (including the JavaCC code
generation), you should only need to change

<DEFAULT> TOKEN : {
  <AND:   ("AND" | "&&") >
| <OR:    ("OR" | "||") >
| <NOT:   ("NOT" | "!") >

to

<DEFAULT> TOKEN : {
  <AND:   ("AND" | "insert your version of and here" | "&&") >
| <OR:    ("OR" | "insert your version of or here" | "||") >
| <NOT:   ("NOT" | "insert your version of not here" | "!") >

in jakarta-lucene/src/java/org/apache/lucene/queryParser/QueryParser.jj

Replacing the operators before parsing might be hard to do correctly if you
want to handle cases like »"a AND b" OR c«, which is a query for the
phrase "a AND b" or the token c.

Morus
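
For the pre-processing route Sanyi mentions, the quoted-phrase caveat can be handled by scanning the query string and rewriting only bare words that sit outside double quotes. The following is a minimal sketch under stated assumptions, not part of Lucene: the class OperatorSynonyms and its normalize method are hypothetical, the operator words AANNDD/OORR/NNOOTT are the placeholders from earlier in the thread, a bare word is taken to be any run of characters delimited by whitespace, parentheses or quotes, and only the simple backslash-quote escape is recognized.

```java
import java.util.Map;

/** Sketch: map localized operator words to AND/OR/NOT outside quoted phrases. */
final class OperatorSynonyms {

    /** Returns the query with operator synonyms replaced; phrases are left untouched. */
    static String normalize(String query, Map<String, String> synonyms) {
        StringBuilder out = new StringBuilder();
        StringBuilder word = new StringBuilder();
        boolean inPhrase = false;
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (!inPhrase && isWordChar(c)) {
                word.append(c); // still collecting a bare word
                continue;
            }
            flush(word, out, synonyms); // a bare word (if any) has just ended
            if (c == '"' && (i == 0 || query.charAt(i - 1) != '\\')) {
                inPhrase = !inPhrase; // an unescaped quote opens or closes a phrase
            }
            out.append(c); // delimiters and phrase contents are copied verbatim
        }
        flush(word, out, synonyms);
        return out.toString();
    }

    private static boolean isWordChar(char c) {
        return !Character.isWhitespace(c) && c != '(' && c != ')' && c != '"';
    }

    /** Appends the pending word, replaced only if it is exactly a known synonym. */
    private static void flush(StringBuilder word, StringBuilder out,
                              Map<String, String> synonyms) {
        if (word.length() == 0) {
            return;
        }
        String w = word.toString();
        out.append(synonyms.getOrDefault(w, w));
        word.setLength(0);
    }

    public static void main(String[] args) {
        Map<String, String> ops = Map.of("AANNDD", "AND", "OORR", "OR", "NNOOTT", "NOT");
        System.out.println(normalize("(cat OORR kitty) AANNDD black AANNDD tail", ops));
        // prints: (cat OR kitty) AND black AND tail
        System.out.println(normalize("\"a AANNDD b\" OORR c", ops));
        // prints: "a AANNDD b" OR c
    }
}
```

The result would then be handed to the unmodified QueryParser. Matching is exact and case-sensitive, so a term such as title:OORR or OORR* is left alone because the whole delimited word is compared, not a substring. The synonym map can be chosen per request, which gives the runtime selection that the grammar change does not.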


