Thanks for the comments. Here is some more info.
I attach the DDL for the table concerned, the simple test program I use and the execution strategy with and without an index.

Some additional points:

1. the query returning 7% of the table is certainly not an extreme case. The exact query criteria are specified by the user in the UI, and can be much worse than this case. I have no control over the natur eof the query that the user specifies.

2. Yes, if the query is much more selective the index can be a help.

3. The biggest data file in seg0 is 1452572672 bytes in size (e.g. 1.4GB).

4. the index was added after the rows were added.

5. making the index also have the pk_column as the second indexed field makes it go like lightning! search runs in 2 secs, about 14x faster.


So in summary it seems like an index will be of no help to me in this situation, unless I make it an index of both columns.

Many thanks

Tim





Tim Dudgeon wrote:
I found an interesting preformace problem that I'd welcome some help in inderstanding.

I have a table of 4,000,000 rows that has a DOUBLE column containing floating point numbers.
I run a query like this:

select pk_column from the_table where the_double_column > 300 and the_double_column < 320

271136 rows are returned.
I then go through the ResultSet and extract all the id_column values.
All of this is done using standard JDBC.

When I do this it takes 23 seconds, which I though was not unreasonable as a full table scan was involved and the table was pretty big.

But of course I thought an index would help, so I added an index to the the_double_column and repeated the process. It took 720 seconds, 31x slower! I thought this was strange, but thought it might be because I was using standard settings and the 4MB page cache was much too small to hold the index.

So I increased the page cache size (derby.storage.pageCacheSize property) to 10x the size (10,000) and repeated the process. There was only a very minor improvement in speed.

In all cases the memory usage, as reported by:
Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()
really no differnt, and the used memory was much less that the maximum available specified by the -Xmx setting.


Any ideas what to do?

Tim



-- ----------------------------------------------
-- DDL Statements for tables
-- ----------------------------------------------

CREATE TABLE "APP"."NCI_AUG00_1M" ("CD_ID" INTEGER NOT NULL GENERATED BY 
DEFAULT AS IDENTITY (START WITH 1, INCREMENT BY 1), "CD_STRUCTURE" 
BLOB(2147483647) NOT NULL, "CD_SMILES" VARCHAR(1000), "CD_FORMULA" 
VARCHAR(100), "CD_SORTABLE_FORMULA" VARCHAR(255), "CD_MOLWEIGHT" DOUBLE, 
"CD_HASH" INTEGER NOT NULL, "CD_FLAGS" VARCHAR(20), "CD_TIMESTAMP" TIMESTAMP 
NOT NULL, "CD_FP1" INTEGER NOT NULL, "CD_FP2" INTEGER NOT NULL, "CD_FP3" 
INTEGER NOT NULL, "CD_FP4" INTEGER NOT NULL, "CD_FP5" INTEGER NOT NULL, 
"CD_FP6" INTEGER NOT NULL, "CD_FP7" INTEGER NOT NULL, "CD_FP8" INTEGER NOT 
NULL, "CD_FP9" INTEGER NOT NULL, "CD_FP10" INTEGER NOT NULL, "CD_FP11" INTEGER 
NOT NULL, "CD_FP12" INTEGER NOT NULL, "CD_FP13" INTEGER NOT NULL, "CD_FP14" 
INTEGER NOT NULL, "CD_FP15" INTEGER NOT NULL, "CD_FP16" INTEGER NOT NULL, 
"FIELD_0" INTEGER);

-- ----------------------------------------------
-- DDL Statements for indexes
-- ----------------------------------------------

CREATE INDEX "APP"."IDX_MOL_WEIGHT" ON "APP"."NCI_AUG00_1M" ("CD_MOLWEIGHT");

CREATE INDEX "APP"."NCI_AUG00_1M_FX" ON "APP"."NCI_AUG00_1M" 
("CD_SORTABLE_FORMULA");

CREATE INDEX "APP"."NCI_AUG00_1M_HX" ON "APP"."NCI_AUG00_1M" ("CD_HASH");

-- ----------------------------------------------
-- DDL Statements for keys
-- ----------------------------------------------

-- primary/unique
ALTER TABLE "APP"."NCI_AUG00_1M" ADD CONSTRAINT "SQL090324105153810" PRIMARY 
KEY ("CD_ID");

/*
 * To change this template, choose Tools | Templates
 * and open the template in the editor.
 */
package javaapplication1;

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.Types;
import java.util.ArrayList;
import java.util.List;

/**
 *
 * @author timbo
 */
public class Main {

    String URL = 
"jdbc:derby:/home/timbo/IJCProjects/ijc-project-0000/.config/localdb/db";
    String DRIVER = "";
    Connection con = null;
    String sql = "select distinct cd_id from NCI_AUG00_1M where cd_molweight > 
300 and cd_molweight < 320";

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) throws Exception {

        Class.forName("org.apache.derby.jdbc.EmbeddedDriver");

        Main m = new Main();

        //m.setPageData();
        //m.getPageData();
        m.runMWSearch();
    }

    Connection getConnection() throws SQLException {
        if (con == null) {
            con = DriverManager.getConnection(URL);
        }
        return con;
    }

    void runMWSearch() throws SQLException {

        
        Statement stmt = getConnection().createStatement();
        long max = Runtime.getRuntime().maxMemory();
        System.out.println("Max memory = " + max);
        long t0 = System.currentTimeMillis();
        ResultSet rs = stmt.executeQuery(sql);
        long t1 = System.currentTimeMillis();
        List vals = new ArrayList();
        long total = 0;
        long free = 0;
        int count = 0;
        while (rs.next()) {
            count++;
            int id = rs.getInt(1);
            vals.add(id);
            if (count % 1000 == 0) {
                //Runtime.getRuntime().gc();
                total = Runtime.getRuntime().totalMemory();
                free = Runtime.getRuntime().freeMemory();
                System.out.println("  " + count + " => " + (total - free));
            }

        }
        rs.close();
        stmt.close();
        long t2 = System.currentTimeMillis();
        Runtime.getRuntime().gc();
        total = Runtime.getRuntime().totalMemory();
        free = Runtime.getRuntime().freeMemory();
        System.out.println("Number of hits: " + vals.size());
        System.out.println("Query took: " + (t1 - t0));
        System.out.println("Retrieve took: " + (t2 - t1));
        System.out.println("Memory: " + (total - free));



    }

   
    void setPageData() throws SQLException {

        String val = "1000";

        CallableStatement cs = getConnection().prepareCall("CALL 
SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY(?, ?)"); // NOI18N
        cs.setString(1, "derby.storage.pageCacheSize"); // NOI18N
        cs.setString(2, val); // NOI18N
        //cs.setNull(2, Types.VARCHAR);
        cs.execute();
        cs.close();

        System.out.println("derby.storage.pageCacheSize set to " + val);

    }

    void getPageData() throws SQLException {

        String sql = "VALUES 
SYSCS_UTIL.SYSCS_GET_DATABASE_PROPERTY('derby.storage.pageCacheSize')";
        Statement stmt = getConnection().createStatement();
        ResultSet rs = stmt.executeQuery(sql);
        String val = null;
        if (rs.next()) {
            val = rs.getString(1);
        }
        rs.close();
        stmt.close();
        System.out.println("derby.storage.pageCacheSize=" + val);
    }
}
2009-03-26 11:55:48.894 GMT Thread[main,5,main] (XID = 9540524), (SESSIONID = 
0), select distinct cd_id from NCI_AUG00_1M where cd_molweight > 300 and 
cd_molweight < 320 ******* Project-Restrict ResultSet (2):
Number of opens = 1
Rows seen = 271136
Rows filtered = 0
restriction = false
projection = true
        constructor time (milliseconds) = 0
        open time (milliseconds) = 0
        next time (milliseconds) = 0
        close time (milliseconds) = 0
        restriction time (milliseconds) = 0
        projection time (milliseconds) = 0
        optimizer estimated row count:       436037.89
        optimizer estimated cost:      6339476.19

Source result set:
        Table Scan ResultSet for NCI_AUG00_1M at read committed isolation level 
using instantaneous share row locking chosen by the optimizer
        Number of opens = 1
        Rows seen = 271136
        Rows filtered = 0
        Fetch Size = 16
                constructor time (milliseconds) = 0
                open time (milliseconds) = 0
                next time (milliseconds) = 0
                close time (milliseconds) = 0
                next time in milliseconds/row = 0

        scan information: 
                Bit set of columns fetched={0, 5}
                Number of columns fetched=2
                Number of pages visited=44327
                Number of rows qualified=271136
                Number of rows visited=4004016
                Scan type=heap
                start position: 
null            stop position: 
null            qualifiers:
Column[0][0] Id: 5
Operator: <
Ordered nulls: false
Unknown return value: false
Negate comparison result: false
Column[0][1] Id: 5
Operator: <=
Ordered nulls: false
Unknown return value: true
Negate comparison result: true

                optimizer estimated row count:       436037.89
                optimizer estimated cost:      6339476.19

2009-03-26 12:19:39.988 GMT Thread[main,5,main] (XID = 9621712), (SESSIONID = 
0), select distinct cd_id from NCI_AUG00_1M where cd_molweight > 300 and 
cd_molweight < 320 ******* Project-Restrict ResultSet (3):
Number of opens = 1
Rows seen = 271136
Rows filtered = 0
restriction = false
projection = true
        constructor time (milliseconds) = 0
        open time (milliseconds) = 0
        next time (milliseconds) = 0
        close time (milliseconds) = 0
        restriction time (milliseconds) = 0
        projection time (milliseconds) = 0
        optimizer estimated row count:       203425.00
        optimizer estimated cost:       644900.84

Source result set:
        Index Row to Base Row ResultSet for NCI_AUG00_1M:
        Number of opens = 1
        Rows seen = 271136
        Columns accessed from heap = {0, 5}
                constructor time (milliseconds) = 0
                open time (milliseconds) = 0
                next time (milliseconds) = 0
                close time (milliseconds) = 0
                optimizer estimated row count:       203425.00
                optimizer estimated cost:       644900.84

                Index Scan ResultSet for NCI_AUG00_1M using index 
IDX_MOL_WEIGHT at read committed isolation level using instantaneous share row 
locking chosen by the optimizer
                Number of opens = 1
                Rows seen = 271136
                Rows filtered = 0
                Fetch Size = 16
                        constructor time (milliseconds) = 0
                        open time (milliseconds) = 0
                        next time (milliseconds) = 0
                        close time (milliseconds) = 0
                        next time in milliseconds/row = 0

                scan information: 
                        Bit set of columns fetched=All
                        Number of columns fetched=2
                        Number of deleted rows visited=0
                        Number of pages visited=1816
                        Number of rows qualified=271136
                        Number of rows visited=271137
                        Scan type=btree
                        Tree height=-1
                        start position: 
        > on first 1 column(s).
        Ordered null semantics on the following columns: 

                        stop position: 
        >= on first 1 column(s).
        Ordered null semantics on the following columns: 

                        qualifiers:
None
                        optimizer estimated row count:       203425.00
                        optimizer estimated cost:       644900.84


Reply via email to