Thanks Sami,

This is closer from an initial look - does this do anything on the 
backend (i.e. defining the data flags so we can get a match) as well, or 
do we need to build that..??
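
To make the question concrete, here is a rough sketch of the kind of backend lookup I have in mind: a small table of purchased keywords that can be updated at any time without a re-index, with multi-word queries ranked by how many purchased keywords they hit. This is only an illustration under my own assumptions. The names (`KeymatchTable`, `SponsoredLink`, `upsert`, `match`) are made up, not taken from the Nutch web-keymatch plugin or Keymatch.java, and the simple overlap count is a stand-in, not Nutch's scoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SponsoredLink:
    """One paid link (hypothetical structure, for illustration only)."""
    url: str
    title: str
    keywords: frozenset  # lowercased keywords the advertiser bought


class KeymatchTable:
    """In-memory keyword table; entries change without touching the index."""

    def __init__(self):
        self._links = {}  # url -> SponsoredLink

    def upsert(self, url, title, keywords):
        """Insert a new sponsored link or replace an existing one."""
        bought = frozenset(k.lower() for k in keywords)
        self._links[url] = SponsoredLink(url, title, bought)

    def delete(self, url):
        """Remove a sponsored link; unknown URLs are ignored."""
        self._links.pop(url, None)

    def match(self, query, limit=3):
        """Return links ordered by number of query terms they paid for."""
        terms = set(query.lower().split())
        scored = []
        for link in self._links.values():
            hits = len(terms & link.keywords)
            if hits:
                scored.append((hits, link))
        # Most matching keywords first; URL breaks ties deterministically.
        scored.sort(key=lambda pair: (-pair[0], pair[1].url))
        return [link for _, link in scored[:limit]]
```

With a "Chicago pizza" entry and a "beer" entry in the table, a search for "pizza beer Chicago" would put the pizza link first (two keyword hits) and the beer link below it (one hit), which is the ordering I described in my earlier mail.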

Sami Siren wrote:
> Are you looking for something like the Google KeyMatch as described in 
> [1],
> which was then more or less mimicked in the Nutch web2 module [2],
> and has since also been released, at least as a lookalike, on Google Code [3]?
>
> -- 
> Sami Siren
>
> [1] http://www.google.com/enterprise/mini/end_user_features.html
> [2] http://svn.apache.org/viewvc/lucene/nutch/trunk/contrib/web2/plugins/web-keymatch/
>
> [3] http://custom-keymatch-onebox.googlecode.com/svn/trunk/Keymatch.java
>
> 2006/12/19, RP <[EMAIL PROTECTED]>:
>>
>> Let me qualify this - ad banner rotation is dealt with - I'm looking for
>> something that will use our Nutch engine to serve up relevant links from
>> people who pay for that privilege.  We do not want to serve up ads from
>> someone else's system i.e. the big G or Y, but use our own Nutch search
>> results to serve up relevant paying links that we have sold and
>> maintain.   In a simple relational SQL world we would add a flag and
>> another table with the links and scores and look that up and pass back
>> when needed.  Problem with that is that we lose the whole multi word
>> scoring capability in Nutch i.e. pizza beer Chicago, should serve up a
>> Chicago pizza ad first and beer ads further down, just like our search
>> results have relevancy (not a great example but you get the idea).
>> Re-writing a scoring engine to do that in SQL seems like a waste when
>> Nutch already does it just fine.
>>
>> So in a nutshell - we need to do what the big G and Y and others do when
>> serving up key word based sponsor links.  My thought - automate the
>> build of a dummy page with the key words bought that would be indexed
>> and served up just like regular crawled and indexed pages, using the
>> scoring to rank them in terms of relevancy and placement - I have not
>> seen any snippets of code to do simple insert/update/delete operations
>> on a Nutch segment or index however....
>>
>> This is the idea gathering phase - think like a school/college search
>> engine with local paying advertisers - we want to serve those links up
>> to the searchers to help offset the cost of the service and serve up or
>> flag links that rank first because of payment followed by normal search
>> link results....
>>
>> rp
>>
>> Sean Dean wrote:
>> > I might be totally off base with what you're asking to do, but take a look
>> > at this open source project: http://phpadsnew.com/two/.
>> >
>> > It's basically an advertising engine, built on PHP. Integration within
>> > any application is a breeze, and it supports external advertising such as
>> > Google Ads.
>> >
>> > Sean
>> >
>> > ----- Original Message ----
>> > From: RP <[EMAIL PROTECTED]>
>> > To: [email protected]
>> > Sent: Tuesday, December 19, 2006 10:52:56 AM
>> > Subject: How best to add "sponsored link" support..??
>> >
>> >
>> > Hi all,
>> >
>> > I've been tasked with looking into this and am not a coder - that said,
>> > Nutch  is doing great and the bean counters have asked me to look into
>> > adding sponsored link results and I'm wondering how best to add this.
>> >
>> > It would be nice to utilize the Nutch engine to come up with the pages
>> > versus just doing a lookup on words and results in a flat file but the
>> > key word data could change daily (hourly) and would need to be able to
>> > be hand entered (or automated) as people sign up (re-index is not really
>> > an option).  I'm not sure this would fly within the main Nutch segments
>> > and index, but I could see maybe a separate index or possibly adding a
>> > flag to the existing data but I've not seen any easy to use tools to
>> > change/update/insert records into what is already there (yes Luke on the
>> > index but that does not touch the segment data, right?).  I don't want
>> > to change existing searched data and I don't see an issue with having
>> > duplicate results (sponsored up top and existing entry down below
>> > somewhere) but it would be more elegant to not have that occur.  I also
>> > see issues in a simple flat file look up as a multiple word search is
>> > best handled inside Nutch to "score" the results versus having to do
>> > something similar in the sponsored results.  I can see the need to
>> > control the summary text displayed and also pass thru any codes in the
>> > URL which are currently being stripped during the main crawl/index
>> > cycle.  I also see issues with seriously customizing the internals as
>> > they would have to be maintained as Nutch itself is updated....
>> >
>> > If anyone has looked at this and has at least some ideas on how best to
>> > do this let me know.  I need to come up with a preliminary estimate
>> > before I can engage and pay the coders to make this happen so if there
>> > are any easy or "best practices" ways on doing this any help/pointers
>> > would be appreciated....
>> >
>> >
>>
>>
>

-- 
rp



_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
