Re: keeping the table ordered

Nurullah Akkaya Tue, 06 Feb 2007 09:14:50 -0800

It is not quite clear to me what you are trying to achieve. Why do you want a sequential read? Scanning the entire table of 100 million records should take longer than looking up a record using an index on wordid. Have you retrieved the query plan and made sure the index on wordid is used? Or are you talking about doing a lookup of many different wordids in sorted order?

I did not mean sequential scanning of the whole table; I meant disk I/O (the bottom paragraph explains it). Yes, I checked the query plan: Derby uses the index to look up records, and the index lookup checks only two index pages. So I came to the conclusion that most of the time is lost making random I/O requests for the data rows. That is why I am trying to keep the table sorted, since sequential hard disk access is much faster than random I/O.
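To make the random I/O argument concrete, here is a rough sketch that counts how many distinct data pages hold the rows for one wordId when rows sit in arrival order versus when they are kept sorted by wordId. It is only a toy model: the page capacity, the row counts, and the helper names (`pages_touched`, `layout_positions`) are assumptions for illustration and have nothing to do with Derby's actual storage format.

```python
import random

ROWS_PER_PAGE = 100  # assumed page capacity, purely illustrative


def pages_touched(row_positions, rows_per_page=ROWS_PER_PAGE):
    """Count the distinct data pages that hold the given row positions."""
    return len({pos // rows_per_page for pos in row_positions})


def layout_positions(word_ids, clustered, seed=0):
    """Return {word_id: [row positions]} for an unsorted or a sorted layout."""
    order = list(word_ids)
    if clustered:
        order.sort()  # rows with the same wordId become neighbours
    else:
        random.Random(seed).shuffle(order)  # arrival order, effectively random
    positions = {}
    for pos, wid in enumerate(order):
        positions.setdefault(wid, []).append(pos)
    return positions


# 50 hypothetical words with 200 postings each: 10,000 rows on 100 pages
postings = [wid for wid in range(50) for _ in range(200)]
unsorted_table = layout_positions(postings, clustered=False)
sorted_table = layout_positions(postings, clustered=True)

print("unsorted pages for wordId 7:", pages_touched(unsorted_table[7]))
print("sorted pages for wordId 7:  ", pages_touched(sorted_table[7]))
```

In the sorted layout the 200 rows for a word occupy neighbouring pages, so one lookup turns into a short sequential read; in the unsorted layout almost every row can land on a different page, and each page is a separate seek.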




On Feb 6, 2007, at 8:09 AM, Michael Segel wrote:

What exactly are you trying to do?
Based on the little snippet, it looks like this is an exercise to create a
"Google-like" search on a series of documents.
The problem is that your wordID, while an integer, is not going to be unique
enough.

wordId isn't unique at all. Each word in a document gets a corresponding posting entry. I look up the wordId for the word, then select all docIds containing that wordId. The posting list is basically a big inverted list. What I am trying to do is keep the table sorted by wordId, so that instead of being scattered randomly on disk the rows are written sequentially to the file, and instead of doing random I/O I just do a sequential read from the hard drive. I don't want sequential scanning of the whole table.
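The lookup described above can be sketched in memory as follows. This is a minimal illustration, not the actual schema: the sample documents and the function names (`build_postings`, `docs_for_word`) are made up, and repeated occurrences of a word within one document are collapsed into a single posting, which is an assumption. The point is that once postings are sorted by (wordId, docId), all entries for a word form one contiguous range that can be found by binary search and read in one pass.

```python
from bisect import bisect_left


def build_postings(docs):
    """Build a word -> wordId map and a posting list sorted by (wordId, docId)."""
    word_ids = {}
    postings = set()  # assumption: one posting per (word, document) pair
    for doc_id, text in docs.items():
        for word in text.lower().split():
            wid = word_ids.setdefault(word, len(word_ids))
            postings.add((wid, doc_id))
    return word_ids, sorted(postings)


def docs_for_word(word, word_ids, postings):
    """Return the docIds for a word by slicing one contiguous range."""
    wid = word_ids.get(word.lower())
    if wid is None:
        return []
    lo = bisect_left(postings, (wid,))      # first entry with this wordId
    hi = bisect_left(postings, (wid + 1,))  # first entry of the next wordId
    return [doc_id for _, doc_id in postings[lo:hi]]


# Hypothetical sample documents
docs = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "a quick dog",
}
word_ids, postings = build_postings(docs)
print(docs_for_word("the", word_ids, postings))
print(docs_for_word("quick", word_ids, postings))
```

A very common word such as "the" still produces a long range, but it is read as one sequential run rather than one seek per posting.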

For example, search your documents where the wordID is the integer lookup for
the word "the".

Do you see the problem?

--
Michael Segel
Principal
Michael Segel Consulting Corp.
derby [EMAIL PROTECTED]
(312) 952-8175 [mobile]
