Hi Kristian,

The entire code is available at http://code.google.com/p/jamun/.
It is written in scala, which might not be familiar to all. I am posting below 
some snippets that might help.

Kristian Waagan wrote:

> I think you'll have a better chance of getting any answers if you
> provide more information, for instance;
>   o data volume; number of rows in table(s)

Number of rows in the concerned table = ~ 50k

>   o the SQL query, and possibly the Java code executing it (a little
> later, the query plan might come in handy too)

This is the prepared statement:

conn.prepareStatement ( 
  "select name,version,release,time from PKG where PKG.id in (select id from 
PROVIDES where name = ?)",
  ResultSet.FETCH_FORWARD,ResultSet.TYPE_FORWARD_ONLY )


This is the collector function (it is in scala, but I hope it's logic would be 
apparent):

def extractPkgResult(repo:Repo, resultSet:ResultSet) = {
     if (resultSet.next) {
       var pkgList:List[MngdPackage] = Nil
       var keepGoing = true
       while (keepGoing) {
         pkgList ::= MngdPackage(resultSet.getString(1), 
resultSet.getString(2), resultSet.getString(3), "s umm", "", Nil, Nil, 
Some(repo), resultSet.getLong(4))

         keepGoing = resultSet.next
       }
       pkgList
     } else Nil
   }


>   o database schema (including indexes)
create table PKG  (
                         id integer primary key,
                         name character varying (60),
                         version character varying (40),
                         release character varying (40),
                         time bigint)
create table PROVIDES (
                         id integer, foreign key (id) references PKG(id),
                         name character varying (256),
                         version character varying (40))

create index PROVIDES_NAME_IDX on PROVIDES (name)

> 
> Since you're searching for a string inside a VARCHAR, Derby has to check
> all rows in the table. Are you using LIKE in the query?

No, it's a "where x = y"

> Is there a way that you can reduce the number of rows that qualify?
No. It is already down to 1 (99% of the time)

> Are you seeing the long wait on the first rs.next() ?

Yes! I tried more detailed profiling now and this is indeed true.

> What about the subsequent rs.next() calls (if more than one row in the
> result)?

In case of empty result, first rs.next consumes big time.
In case of non-empty result, the rs.next at the end of the result (when it 
returns false) consumes little time; less than 1 ms. All other rs.next calls 
consume big time (150 to 200 ms)

> 
> 
> There are also some tips here:
> http://wiki.apache.org/db-derby/PerformanceDiagnosisTips

I am attaching the query plan.

thanks,
Harshad
2008-12-15 18:39:13.375 GMT Thread[main,5,main] (XID = 4663), (SESSIONID = 0), 
select name,version,release,time from PKG where PKG.id in (select id from 
PROVIDES where name = ?) ******* Project-Restrict ResultSet (6):
Number of opens = 1
Rows seen = 1
Rows filtered = 0
restriction = false
projection = true
        constructor time (milliseconds) = 0
        open time (milliseconds) = 0
        next time (milliseconds) = 0
        close time (milliseconds) = 0
        restriction time (milliseconds) = 0
        projection time (milliseconds) = 0
        optimizer estimated row count:          576.85
        optimizer estimated cost:        40545.65

Source result set:
        Nested Loop Exists Join ResultSet:
        Number of opens = 1
        Rows seen from the left = 5586
        Rows seen from the right = 1
        Rows filtered = 0
        Rows returned = 1
                constructor time (milliseconds) = 0
                open time (milliseconds) = 0
                next time (milliseconds) = 0
                close time (milliseconds) = 0
                optimizer estimated row count:          576.85
                optimizer estimated cost:        40545.65

        Left result set:
                Table Scan ResultSet for PKG at read committed isolation level 
using instantaneous share row locking chosen by the optimizer (Actual locking 
used: table level locking.)
                Number of opens = 1
                Rows seen = 5586
                Rows filtered = 0
                Fetch Size = 16
                        constructor time (milliseconds) = 0
                        open time (milliseconds) = 0
                        next time (milliseconds) = 0
                        close time (milliseconds) = 0
                        next time in milliseconds/row = 0

                scan information: 
                        Bit set of columns fetched=All
                        Number of columns fetched=5
                        Number of pages visited=105
                        Number of rows qualified=5586
                        Number of rows visited=5586
                        Scan type=heap
                        start position: 
null                    stop position: 
null                    qualifiers:
None
                        optimizer estimated row count:         5589.00
                        optimizer estimated cost:         2536.99

        Right result set:
                Project-Restrict ResultSet (5):
                Number of opens = 5586
                Rows seen = 5586
                Rows filtered = 5585
                restriction = true
                projection = true
                        constructor time (milliseconds) = 0
                        open time (milliseconds) = 0
                        next time (milliseconds) = 0
                        close time (milliseconds) = 0
                        restriction time (milliseconds) = 0
                        projection time (milliseconds) = 0
                        optimizer estimated row count:          576.85
                        optimizer estimated cost:        38008.66

                Source result set:
                        Index Row to Base Row ResultSet for PROVIDES:
                        Number of opens = 5586
                        Rows seen = 5586
                        Columns accessed from heap = {0}
                                constructor time (milliseconds) = 0
                                open time (milliseconds) = 0
                                next time (milliseconds) = 0
                                close time (milliseconds) = 0
                                optimizer estimated row count:          576.85
                                optimizer estimated cost:        38008.66

                                Index Scan ResultSet for PROVIDES using index 
PROVIDES_NAME_IDX at read committed isolation level using share row locking 
chosen by the optimizer (Actual locking used: table level locking.)
                                Number of opens = 5586
                                Rows seen = 5586
                                Rows filtered = 0
                                Fetch Size = 1
                                        constructor time (milliseconds) = 0
                                        open time (milliseconds) = 0
                                        next time (milliseconds) = 0
                                        close time (milliseconds) = 0
                                        next time in milliseconds/row = 0

                                scan information: 
                                        Bit set of columns fetched=All
                                        Number of columns fetched=2
                                        Number of deleted rows visited=0
                                        Number of pages visited=16758
                                        Number of rows qualified=5586
                                        Number of rows visited=11171
                                        Scan type=btree
                                        Tree height=3
                                        start position: 
        >= on first 1 column(s).
        Ordered null semantics on the following columns: 

                                        stop position: 
        > on first 1 column(s).
        Ordered null semantics on the following columns: 

                                        qualifiers:
None
                                        optimizer estimated row count:          
576.85
                                        optimizer estimated cost:        
38008.66



Reply via email to