It is not quite clear to me what you are trying to achieve. Why do
you want a sequential read? Scanning the entire table of 100
million records should take longer than looking up a record
using an index on wordid. Have you retrieved the query plan and
made sure the index on wordid is used? Or are you talking about
doing a lookup of many different wordids in sorted order?
I did not mean sequential scanning of the whole table; I meant disk
I/O (the bottom paragraph explains it).
Yes, I checked the query plan. Derby uses the index to look up records,
and the index lookup checks only two index pages. So I came to the
conclusion that most of the time is lost making random I/O requests
for the data. That is why I am trying to keep the table sorted, since
sequential hard disk access is much faster than random I/O.
On Feb 6, 2007, at 8:09 AM, Michael Segel wrote:
What exactly are you trying to do?
Based on the little snippet, it looks like this is an exercise to
create a "google like" search on a series of documents.
The problem is that your wordID, while an integer, is not going to
be unique enough.
wordId isn't unique at all. Each word in a document gets a
corresponding posting entry. I look up the wordId for the word, then
select all docIds containing that wordId. The posting list is basically
a big inverted list. What I am trying to do is keep the table sorted
by wordId, so that instead of being scattered randomly on disk the
values are written sequentially to the file, and instead of doing
random I/O I just do a sequential read from the hard drive. I don't
want sequential scanning of the whole table.
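
To make the idea concrete, here is a minimal in-memory sketch of what I
mean by keeping the postings sorted by wordId. It is not Derby code and
says nothing about how Derby lays out pages on disk; the class and
method names (PostingListSketch, Posting, sortByWord, docsFor) are made
up for illustration. Once the postings are sorted, all entries for one
wordId sit next to each other, so a lookup is one binary search followed
by a contiguous read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only: an in-memory stand-in for a posting table.
final class PostingListSketch {

    // One posting entry: the word with this wordId occurs in document docId.
    static final class Posting {
        final int wordId;
        final int docId;

        Posting(int wordId, int docId) {
            this.wordId = wordId;
            this.docId = docId;
        }
    }

    // Sort by wordId, then docId, so postings of the same word are adjacent.
    static Posting[] sortByWord(List<Posting> postings) {
        Posting[] sorted = postings.toArray(new Posting[0]);
        Arrays.sort(sorted,
                Comparator.comparingInt((Posting p) -> p.wordId)
                          .thenComparingInt(p -> p.docId));
        return sorted;
    }

    // Binary search for the first index whose wordId is >= the target.
    static int lowerBound(Posting[] sorted, int wordId) {
        int lo = 0;
        int hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid].wordId < wordId) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Collect the docIds for one word by walking the contiguous run of entries.
    static List<Integer> docsFor(Posting[] sorted, int wordId) {
        List<Integer> docs = new ArrayList<>();
        for (int i = lowerBound(sorted, wordId);
             i < sorted.length && sorted[i].wordId == wordId; i++) {
            docs.add(sorted[i].docId);
        }
        return docs;
    }
}
```

Calling sortByWord once and then docsFor(sorted, someWordId) touches only
the neighbouring entries for that word, which is the access pattern I am
hoping to get from the table on disk.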
For example, search your documents where the wordID is the integer
lookup for the word "the".
Do you see the problem?
--
Michael Segel
Principal
Michael Segel Consulting Corp.
derby [EMAIL PROTECTED]
(312) 952-8175 [mobile]