In this case it sounds like the RSS feed URL is your natural primary key. You 
could add an untokenized 'id' field to your documents and then retrieve and 
update them using the URLs as keys. You can even use a more natural field 
name if you create the index with the optional :id_field parameter.

Example:

url = 'http://feeds.feedburner.com/RidingRails'

index = Ferret::Index::Index.new(:path => "#{RAILS_ROOT}/db/ferret", 
:id_field => 'url')

document = Ferret::Document::Document.new

document << Ferret::Document::Field.new('url', url, 
Ferret::Document::Field::Store::YES, 
Ferret::Document::Field::Index::UNTOKENIZED)

document << Ferret::Document::Field.new('content', 'Rails are great!', 
Ferret::Document::Field::Store::YES, 
Ferret::Document::Field::Index::TOKENIZED)

index << document

document = index[url]

puts document['url'] == url # true

document['content'] = 'I agree'

index.update(url, document)

index[url]['content'] == 'I agree' # true

index.size == 1 # true
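
If it helps to see why the size stays at 1, here is a rough plain-Ruby sketch of the "update by key" idea. KeyedIndex is a hypothetical stand-in I made up for illustration; it is not Ferret's API and only assumes that an update replaces the document stored under the same key:

```ruby
# Hypothetical stand-in for an index keyed by an id field (NOT Ferret's API).
class KeyedIndex
  def initialize(id_field)
    @id_field = id_field
    @docs = {}
  end

  # Store a document under the value of its id field.
  def <<(doc)
    @docs[doc.fetch(@id_field)] = doc
    self
  end

  # Look a document up by its key.
  def [](key)
    @docs[key]
  end

  # Update = drop the old document for this key, then add the new one.
  def update(key, doc)
    @docs.delete(key)
    self << doc
  end

  def size
    @docs.size
  end
end

url = 'http://feeds.feedburner.com/RidingRails'
index = KeyedIndex.new('url')
index << { 'url' => url, 'content' => 'Rails are great!' }
index.update(url, { 'url' => url, 'content' => 'I agree' })
```

Because the URL is the key, a nightly run can call update for each feed without looking up any internal document number first.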

--
Sergei Serdyuk
Red Leaf Software LLC
web: http://redleafsoft.com


> Hi All,
> 
> I have a Ferret index containing some cached RSS feeds.
> 
> I have a nightly cron script to cache the feeds, and I'd like to update
> the index with the latest feeds.
> 
> I see the Index class has an update method, but I can't work out how to
> get the id of the relevant document to pass in.
> 
> Let's say I have a file called "google_news.xml"
> 
> I want to go:
>     my_index.update(google_id, google_doc)
> 
> I'm sure this is way too easy and I'm being massively dumb, but any
> hints/advice gratefully received.
> 
> Many Thanks,
> Steven


-- 
Posted via http://www.ruby-forum.com/.
_______________________________________________
Ferret-talk mailing list
http://rubyforge.org/mailman/listinfo/ferret-talk
