Sorry for the typo; it should have been 15 bytes, not KB. I won't be
reading the same records over and over again: once I read them I will
calculate and put the result in a cache, so I won't read the same
records again until the next reboot. What I want to do is make that
initial read as fast as I can. I migrated my application from MySQL; I
did not have any index trouble, and Derby was faster out of the box,
so I assumed it was using the index. But I will check it.
Nurullah Akkaya
[EMAIL PROTECTED]
Blooby.com
Tel: +1 (256) 270 4091
On Jan 24, 2007, at 4:42 PM, Stanley Bradbury wrote:
Nurullah Akkaya wrote:
my application uses a large table (more than 200 million rows of 15k
each) with an integer index, used for simple selects with no
sorting, but i do have to read 1 to 2 million records at a time. i
read in the tuning manual that if we have large tables we should use
a large pageSize, but it also says that if queries use an index this
could degrade performance. what would be the right pageSize for an
application of this type?
Only testing can determine the optimal pageSize for an application,
but there are some basic principles that can be used as a guide if
testing is undesirable.
For read intensive operations it is usually best to grab as many
*qualifying* rows as possible with a single read - this indicates
that a large pageSize will probably help. With an increased
pageSize you may need to adjust the java maxHeap size to avoid
OutOfMemory exceptions. Increasing the Derby pageCache size might
also help if you
1) will be reading the same records over and over
2) can cache most or all of the pages containing the records
Again, you will need to adjust the java maxHeap size to avoid
OutOfMemory exceptions.
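(If you go down that road, note that pageCacheSize is counted in pages, not bytes, so the heap it consumes scales with your pageSize. The figures below are illustrative only, not recommendations:)

```properties
# derby.properties -- the page cache is sized in PAGES, not bytes.
# With 32 KB pages, 4000 pages is roughly 125 MB of heap for the cache.
derby.storage.pageCacheSize=4000
```

Remember to raise the JVM heap to match, e.g. starting the application with -Xmx512m.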
Are you saying your rowsize is 15K? So at the maximum pagesize of
32K you would get 2 rows per page? This is certainly better than
using a 4K page and forcing Derby to allocate 3 overflow pages for
each record inserted. So do use the maximum pageSize, because each
record is so large.
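(For reference, the pageSize property is read when a table or index is created, so it needs to be in place before you create the large table; 32768 is Derby's documented maximum:)

```properties
# derby.properties -- applies to tables and indexes created after boot
derby.storage.pageSize=32768
```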
And you have to read between 50% and 100% of the rows each time?
With 2 million rows of 15K records you probably won't have enough
physical memory to cache a large portion of the data, but look at
the size of the index records and see whether you can allocate
enough memory to keep the index in cache.
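As a back-of-the-envelope check (the 16-byte index-entry figure below is an assumption for an integer key plus row pointer and overhead, not a measured Derby value):

```python
# Rough sizing sketch for caching the index of one 2-million-row scan.
rows = 2_000_000          # rows read per query
index_entry = 16          # ASSUMED bytes per index entry (int key + pointer + overhead)
page_size = 32 * 1024     # 32 KB pages

entries_per_page = page_size // index_entry
index_pages = -(-rows // entries_per_page)   # ceiling division
cache_bytes = index_pages * page_size

print(f"{index_pages} index pages, ~{cache_bytes // (1024 * 1024)} MB of page cache")
```

Even under these rough assumptions the index for the scanned range fits in a few tens of MB, which is far more cacheable than the data pages themselves.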
To avoid reading such large rows just to qualify records I would
think using the index would be best. Use
derby.language.logQueryPlan or runTimeStatistics to ensure that
Derby is using the index. If it is not, use DERBY-PROPERTIES (added
in version 10.2) to force the use of the index, then test to see if
performance improves.
(for more info see the manual sections: Tuning Guide: Index use
and access paths
and Tuning Guide: Optimizer overrides)
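(The check-then-override steps might look like this in ij; MY_TABLE and MY_IDX are placeholders, and the override comment syntax follows the Tuning Guide's optimizer-overrides section:)

```sql
-- turn on query-plan logging (plans are written to derby.log)
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY(
    'derby.language.logQueryPlan', 'true');

-- force use of a specific index (Derby 10.2+); the override must be
-- a "--DERBY-PROPERTIES" comment ending the FROM-clause line
SELECT *
FROM MY_TABLE --DERBY-PROPERTIES index=MY_IDX
WHERE ID BETWEEN 1000000 AND 3000000;
```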
HTH, Stan