Re: Can I Do Reverse Search?

2005-10-23 Thread Erik Hatcher
Sam - I'm not quite sure I follow you, but let's see if this fits... you want to have a document and see if a query matches it? Please elaborate more on what you're after. Maybe what you're looking for is the contrib/memory and the MemoryIndex within that Subversion area. Erik On 22

Classifier4J and Lucene

2005-10-23 Thread msftblows
Hey- I have an indexer at my company that I wrote a while back that indexes database content (users and their profiles)...one of the next requirements of the project is to avoid 'spam' in hits. For example, if I do a search for oracle, and oracle is in 25 places in someone's bio field...and another person

Re: Classifier4J and Lucene

2005-10-23 Thread Jeff Rodenburg
Sounds like you might have to consider both, if the first one doesn't solve your issue. A company field sounds like it's a single entry, i.e. one that can't be "spammed up" with multiple terms such as "Oracle Oracle Oracle". It also sounds as if you're searching multiple fields, and that some fields

Re: Can I Do Reverse Search?

2005-10-23 Thread Sam Lee
OK, I am implementing a Google AdSense/AdWords-like system. For example, a website has the keywords "nike red shoe", so it can match a text ad with the keywords "nike shoe -blue". Of course, I can always use the text ad keywords to match the website's keywords. But it will take too many resources to hav

Re: Can I Do Reverse Search?

2005-10-23 Thread Stefan Groschupf
Index the keywords of your ads with Lucene. Extract all words from your page (ajax), remove stop words, build a query from the page words by connecting the words with OR, and you will find the best matching ad. You may need to limit the words per page or set the maximum clauses to a much higher
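
A minimal sketch of the query-building step described above, in plain Java with no Lucene dependency. The class name `PageQueryBuilder`, the method `buildOrQuery`, and the `maxTerms` cap are hypothetical names chosen for illustration; the tokenization (lowercase, split on anything that is not a letter or digit) is an assumption, not what a Lucene analyzer would necessarily do. The resulting string would still have to be handed to a query parser.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

class PageQueryBuilder {
    /**
     * Builds an OR query string from the words of a page.
     * Stop words are dropped, duplicates are removed, and at most
     * maxTerms distinct words are kept (hypothetical cap, to keep the
     * number of clauses bounded).
     */
    static String buildOrQuery(String pageText, Set<String> stopWords, int maxTerms) {
        // LinkedHashSet keeps first-seen order and removes duplicates.
        Set<String> terms = new LinkedHashSet<>();
        for (String token : pageText.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty() || stopWords.contains(token)) {
                continue;
            }
            if (terms.size() >= maxTerms) {
                break;
            }
            terms.add(token);
        }
        return String.join(" OR ", terms);
    }
}
```

The cap is applied after stop-word removal, so a page full of stop words does not use up the term budget.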

Re: Classifier4J and Lucene

2005-10-23 Thread Chris Hostetter
: Not sure if this makes sense...but curious if anyone has ideas, or has : done something like this. I have a few ideas, none of which are mutually exclusive... 1) look at the Explain output for the various queries you are generating to help you understand why your boosts aren't having as much

Re: Can I Do Reverse Search?

2005-10-23 Thread Sam Lee
Yes, I thought of that. But since the ads have negative keywords, it's very possible for the webpage to match the ads but not the other way around because of the negative keywords. So the system cannot be sure that the ads match the webpage until it uses ads' keyword and negative keywords to rema

Re: Can I Do Reverse Search?

2005-10-23 Thread Stefan Groschupf
Use two document fields, one named positive and one named negative. Your query then has to look something like this: positive:(keyword1 keywordN) AND NOT negative:(keyword1 keywordN) On 23.10.2005 at 20:50, Sam Lee wrote: Yes, I thought of that. But since the ads have negative keywords, it's very possible f
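
The query shape suggested above can be assembled from the page words as follows. This is only a sketch: the field names `positive` and `negative` come from the message, while the class `AdQueryBuilder` and the method `buildAdQuery` are hypothetical. It produces a query string and does not talk to Lucene; the words are assumed to be plain terms with no query-syntax characters that would need escaping.

```java
import java.util.List;

class AdQueryBuilder {
    /**
     * Builds a query that selects ads whose "positive" field contains
     * at least one page word and whose "negative" field contains none
     * of them (assuming the parser's default operator is OR).
     */
    static String buildAdQuery(List<String> pageWords) {
        if (pageWords.isEmpty()) {
            // An empty group "()" would not be a valid query.
            throw new IllegalArgumentException("pageWords must not be empty");
        }
        String terms = String.join(" ", pageWords);
        return "positive:(" + terms + ") AND NOT negative:(" + terms + ")";
    }
}
```

With the page words `nike`, `red`, `shoe`, the same word list is used on both sides: once to find candidate ads and once to exclude ads that list any page word as a negative keyword.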

Re: Can I Do Reverse Search?

2005-10-23 Thread Sam Lee
Yes, but I will have to do it for each ad, as I stated previously. webpage www.mysite.com --match--> ad1...ad101 > > > > Then I match each ad with the webpage. > > But due to negative keywords: > > ad1...ad100 --NOT match--> www.mysite.com > > ad101 --match--> www.mysite.com > > > > # of queries

Re: Classifier4J and Lucene

2005-10-23 Thread msftblows
interesting information you have here...I will look into this and let you know what I come up with. Thanks! -Original Message- From: Chris Hostetter <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Sun, 23 Oct 2005 10:14:13 -0700 (PDT) Subject: Re: Classifier4J and Lucene

Re: Can I Do Reverse Search?

2005-10-23 Thread Sam Lee
How fast is MemoryIndex? For example, I have a webpage indexed, and I have 10 queries with negative keywords to match against this webpage. How much faster is it compared to using the normal method to match the same 10 queries against this webpage? --- Erik Hatcher <[EMAIL PROTECTED]> wrote: >

How Fast is MemoryIndex? How Much Resource Does It Use?

2005-10-23 Thread Sam Lee
Hi, Someone suggested that I should use MemoryIndex to match content to a large # of queries. e.g. "nike red shoes" --match--> "nike shoes -blue" and --match--> "nike shoes -black"... What if I have 10 of these queries for each content? And there may be 100 of these contents. But how f
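
To make the matching rule in the example above concrete, here is a plain-Java sketch of matching one piece of content against many keyword queries. It is not MemoryIndex and says nothing about MemoryIndex's speed or memory use. The class `ReverseMatcher` and its methods are hypothetical, and the semantics are an assumption read off the example: every plain keyword must appear in the content, and no keyword prefixed with `-` may appear.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class ReverseMatcher {
    /** Splits text on whitespace into a set of lowercase words. */
    static Set<String> tokenize(String text) {
        Set<String> words = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) {
                words.add(t);
            }
        }
        return words;
    }

    /** True if all plain terms are present and no "-term" is present. */
    static boolean matches(Set<String> contentWords, String query) {
        for (String term : query.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (term.isEmpty()) {
                continue;
            }
            if (term.startsWith("-")) {
                // Negative keyword: its presence in the content rejects the query.
                if (contentWords.contains(term.substring(1))) {
                    return false;
                }
            } else if (!contentWords.contains(term)) {
                return false;
            }
        }
        return true;
    }

    /** Tokenizes the content once, then tests every query against it. */
    static List<String> matchingQueries(String content, List<String> queries) {
        Set<String> words = tokenize(content);
        List<String> hits = new ArrayList<>();
        for (String q : queries) {
            if (matches(words, q)) {
                hits.add(q);
            }
        }
        return hits;
    }
}
```

The content is tokenized once and reused for every query, so the cost per query is proportional to the number of terms in that query rather than to the size of the content.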