derby.drda.startNetworkServer property
Hi there, I have a question about the derby.drda.startNetworkServer property. I've read through the doc and it says the following.

--- Use the derby.drda.startNetworkServer property to simplify embedding the Network Server in your application. When you set derby.drda.startNetworkServer, the Network Server will automatically start when you start Derby. Only one Network Server can be started in a JVM. ---

Does it mean that even though I start Derby using Class.forName("org.apache.derby.jdbc.EmbeddedDriver");, the Network Server will start if I set derby.drda.startNetworkServer to true?

Regards, Wolfgang

__ For All Sports Fans! http://pr.mail.yahoo.co.jp/yells/
RE: derby performance and 'order by'
So for the second query:

select * from orders where time > '10/01/2002' and time < '11/30/2002' order by order_id;

the query plan shows that the index IX_ORDERS_TIME is used to filter the result set by time. The order by step does not use the primary key index to sort the results after the filter step. My questions:

--Is it correct that the sort step does not use the primary key index in this case?
--Why is it not possible to use the index on order_id to sort after the filter has happened?

Here is the query plan:

Statement Name: null
Statement Text: select * from orders where time > '10/01/2002' and time < '11/30/2002' order by order_id
Parse Time: 0
Bind Time: 0
Optimize Time: 0
Generate Time: 0
Compile Time: 0
Execute Time: 14329
Begin Compilation Timestamp : null
End Compilation Timestamp : null
Begin Execution Timestamp : 2005-09-19 09:20:06.171
End Execution Timestamp : 2005-09-19 09:20:20.5
Statement Execution Plan Text:
Sort ResultSet:
Number of opens = 1
Rows input = 166333
Rows returned = 1000
Eliminate duplicates = false
In sorted order = false
Sort information:
  Number of merge runs=1
  Number of rows input=166333
  Number of rows output=166333
  Size of merge runs=[93695]
  Sort type=external
constructor time (milliseconds) = 0
open time (milliseconds) = 14297
next time (milliseconds) = 32
close time (milliseconds) = 0
optimizer estimated row count: 78377.51
optimizer estimated cost: 166745.12
Source result set:
  Index Row to Base Row ResultSet for ORDERS:
  Number of opens = 1
  Rows seen = 166333
  Columns accessed from heap = {0, 1, 2, 3, 4, 5, 6}
  constructor time (milliseconds) = 0
  open time (milliseconds) = 0
  next time (milliseconds) = 10488
  close time (milliseconds) = 0
  optimizer estimated row count: 78377.51
  optimizer estimated cost: 166745.12
  Index Scan ResultSet for ORDERS using index IX_ORDERS_TIME at read committed isolation level using instantaneous share row locking chosen by the optimizer
    Number of opens = 1
    Rows seen = 166333
    Rows filtered = 0
    Fetch Size = 16
    constructor time (milliseconds) = 0
    open time (milliseconds) = 0
    next time (milliseconds) = 3438
    close time (milliseconds) = 0
    next time in milliseconds/row = 0
    scan information:
      Bit set of columns fetched=All
      Number of columns fetched=2
      Number of deleted rows visited=0
      Number of pages visited=887
      Number of rows qualified=166333
      Number of rows visited=166333
      Scan type=btree
      Tree height=3
      start position: on first 1 column(s). Ordered null semantics on the following columns:
      stop position: = on first 1 column(s). Ordered null semantics on the following columns:
      qualifiers: None
    optimizer estimated row count: 78377.51
    optimizer estimated cost: 166745.12

--scott

-Original Message-
From: Sunitha Kambhampati [mailto:[EMAIL PROTECTED]
Sent: Friday, September 16, 2005 5:55 PM
To: Derby Discussion
Subject: Re: derby performance and 'order by'

Scott Ogden wrote:

I have observed some interesting query performance behavior and am hoping someone here can explain. In my scenario, it appears that an existing index is not being used for the 'order by' part of the operation and as a result the performance of certain queries is suffering. Can someone explain if this is supposed to be what is happening and why? Please see below for the specific queries and their performance characteristics.

Here are the particulars:

create table orders(
  order_id varchar(50) NOT NULL CONSTRAINT ORDERS_PK PRIMARY KEY,
  amount numeric(31,2),
  time date,
  inv_num varchar(50),
  line_num varchar(50),
  phone varchar(50),
  prod_num varchar(50));

--Load a large amount of data (720,000 records) into the 'orders' table
--Create an index on the time column as that will be used in the 'where' clause.

create index IX_ORDERS_TIME on orders(time);

--When I run a query against this table returning top 1,000 records, this query returns very quickly, consistently less than .010 seconds.

select * from orders where time > '10/01/2002' and time < '11/30/2002' order by time;

--Now run a similar query against the same table, returning the top 1,000 records.
RE: derby performance and 'order by'
The test was set up and run using the SQuirreL client, not ij. All 3 of the queries return the top 1000 rows, and the times I reported are to return these top 1000 rows, not just the first row.

From: Craig Russell [mailto:[EMAIL PROTECTED]
Sent: Saturday, September 17, 2005 2:35 PM
To: Derby Discussion
Subject: Re: derby performance and 'order by'

Hi Scott,

How have you set up the test? Are you using ij and displaying all of the data, or using jdbc to access the data? What do you do in 0.010 seconds? Do you read all of the rows into memory, or just record the time until you get the first row? Are you measuring the time taken to return all the rows or just the first row?

Another reader has already commented on the fact that the second query is doing a lot more work than the first. The second query must sort the results after filtering the data, whereas the first and third queries can simply use the indexes and filter on the fly. I'm a little suspicious of the third query returning 720,000 results in 0.010 seconds.

Craig

On Sep 16, 2005, at 4:42 PM, Scott Ogden wrote:

I have observed some interesting query performance behavior and am hoping someone here can explain. In my scenario, it appears that an existing index is not being used for the order by part of the operation and as a result the performance of certain queries is suffering. Can someone explain if this is supposed to be what is happening and why? Please see below for the specific queries and their performance characteristics.

Here are the particulars:

create table orders(
  order_id varchar(50) NOT NULL CONSTRAINT ORDERS_PK PRIMARY KEY,
  amount numeric(31,2),
  time date,
  inv_num varchar(50),
  line_num varchar(50),
  phone varchar(50),
  prod_num varchar(50));

--Load a large amount of data (720,000 records) into the orders table
--Create an index on the time column as that will be used in the where clause.
create index IX_ORDERS_TIME on orders(time);

--When I run a query against this table returning top 1,000 records, this query returns very quickly, consistently less than .010 seconds.

select * from orders where time > '10/01/2002' and time < '11/30/2002' order by time;

--Now run a similar query against the same table, returning the top 1,000 records.
--The difference is that the results are now sorted by the primary key (order_id) rather than time.
--This query returns slowly, approximately 15 seconds. Why??

select * from orders where time > '10/01/2002' and time < '11/30/2002' order by order_id;

--Now run a third query against the same orders table, removing the where clause
--This query returns quickly, around .010 seconds.

select * from orders order by order_id;

Craig Russell
Architect, Sun Java Enterprise System http://java.sun.com/products/jdo
408 276-5638 mailto:[EMAIL PROTECTED]
P.S. A good JDO? O, Gasp!
Re: derby.drda.startNetworkServer property
[EMAIL PROTECTED] writes:

Hi there, I have a question about the derby.drda.startNetworkServer property. I've read through the doc and it says the following. --- Use the derby.drda.startNetworkServer property to simplify embedding the Network Server in your application. When you set derby.drda.startNetworkServer, the Network Server will automatically start when you start Derby. Only one Network Server can be started in a JVM. --- Does it mean that even though I start Derby using Class.forName("org.apache.derby.jdbc.EmbeddedDriver");, the Network Server will start if I set derby.drda.startNetworkServer to true?

Yes, that's what it means. When the embedded driver is loaded, it will check whether that property is set, and start the network server if it is true.

-- Knut Anders
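A minimal sketch of the behavior Knut describes (class and property names are from the thread; derby.jar must be on the classpath for the driver class to load, so this sketch tolerates its absence rather than assuming a Derby installation):

```java
public class StartNetworkServerDemo {
    // Sets the property and then loads the embedded driver. With the property
    // set before the driver loads, Derby starts the Network Server in the same
    // JVM as a side effect of booting the embedded engine.
    public static boolean startEmbedded() {
        System.setProperty("derby.drda.startNetworkServer", "true");
        try {
            Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
            return true;
        } catch (ClassNotFoundException e) {
            return false; // derby.jar is not on the classpath
        }
    }

    public static void main(String[] args) {
        boolean loaded = startEmbedded();
        System.out.println("driver loaded: " + loaded
                + ", derby.drda.startNetworkServer = "
                + System.getProperty("derby.drda.startNetworkServer"));
    }
}
```

The ordering matters: the property must be set before the embedded driver is loaded, since the check happens at driver load time.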
Re: derby.drda.startNetworkServer property
[EMAIL PROTECTED] wrote:

Hi there, I have a question about the derby.drda.startNetworkServer property. I've read through the doc and it says the following. --- Use the derby.drda.startNetworkServer property to simplify embedding the Network Server in your application. When you set derby.drda.startNetworkServer, the Network Server will automatically start when you start Derby. Only one Network Server can be started in a JVM. --- Does it mean that even though I start Derby using Class.forName("org.apache.derby.jdbc.EmbeddedDriver");, the Network Server will start if I set derby.drda.startNetworkServer to true? Regards, Wolfgang

Maybe that explains why you were seeing two attempts to start the Network Server in your derby.log, from your earlier mail (?).

-Rajesh
Re: derby.drda.startNetworkServer property
Hi Knut,

Thanks for your quick response. Then, how can I verify whether Derby, started using derby.drda.startNetworkServer=true and org.apache.derby.jdbc.EmbeddedDriver, is ready to accept clients, for example via NetworkServerControl#ping()?

Regards, Wolfgang

Hi there, I have a question about the derby.drda.startNetworkServer property. I've read through the doc and it says the following. --- Use the derby.drda.startNetworkServer property to simplify embedding the Network Server in your application. When you set derby.drda.startNetworkServer, the Network Server will automatically start when you start Derby. Only one Network Server can be started in a JVM. --- Does it mean that even though I start Derby using Class.forName("org.apache.derby.jdbc.EmbeddedDriver");, the Network Server will start if I set derby.drda.startNetworkServer to true?

Yes, that's what it means. When the embedded driver is loaded, it will check whether that property is set, and start the network server if it is true.

-- Knut Anders
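NetworkServerControl#ping() is the API the question names, and it is typically called in a retry loop until it stops throwing. As a library-free approximation (a sketch, assuming the default Network Server port 1527; a raw TCP connect only shows that something is listening, whereas ping() actually exercises the server's protocol), one can poll the port:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class WaitForServer {
    // Polls host:port until a TCP connection succeeds or timeoutMillis elapses.
    // Approximates the retry-until-ping-succeeds pattern without needing
    // derby.jar on the classpath.
    public static boolean waitForPort(String host, int port, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress(host, port), 250);
                return true; // something accepted the connection
            } catch (IOException e) {
                try {
                    Thread.sleep(100); // not up yet; back off briefly and retry
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // 1527 is Derby's default Network Server port.
        System.out.println("ready: " + waitForPort("localhost", 1527, 5000));
    }
}
```

With derby.jar available, the body of the loop would instead call new NetworkServerControl().ping(), which throws until the server is ready.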
Re: derby performance and 'order by'
Hi Scott,

From the query plan it appears that your filter selects 166,333 rows, of which you want the first 1000 according to the ordering of the order_id column. You can see that this is an effective strategy because Number of rows qualified=166333 and Number of rows visited=166333. There's no time lost visiting rows that don't qualify.

The database has to sort the 166,333 rows because the results are ordered according to the index scan column "time", not according to the order_id column. All of the rows need to be sorted even though you only want the first 1000 rows. I'd guess that the sorting of the 166,333 rows is what accounts for the 15 second delay you are experiencing.

The index on order_id doesn't do you any good because you have a result that isn't indexed on order_id. If this isn't obvious, try to think of an algorithm that would use the order_id index on the result set.

Craig

On Sep 19, 2005, at 9:29 AM, scotto wrote:

So for the second query:

select * from orders where time > '10/01/2002' and time < '11/30/2002' order by order_id;

the query plan shows that the index IX_ORDERS_TIME is used to filter the result set by time. The order by step does not use the primary key index to sort the results after the filter step. My questions:

--Is it correct that the sort step does not use the primary key index in this case?
--Why is it not possible to use the index on order_id to sort after the filter has happened?
Here is the query plan:

Statement Name: null
Statement Text: select * from orders where time > '10/01/2002' and time < '11/30/2002' order by order_id
Parse Time: 0
Bind Time: 0
Optimize Time: 0
Generate Time: 0
Compile Time: 0
Execute Time: 14329
Begin Compilation Timestamp : null
End Compilation Timestamp : null
Begin Execution Timestamp : 2005-09-19 09:20:06.171
End Execution Timestamp : 2005-09-19 09:20:20.5
Statement Execution Plan Text:
Sort ResultSet:
Number of opens = 1
Rows input = 166333
Rows returned = 1000
Eliminate duplicates = false
In sorted order = false
Sort information:
  Number of merge runs=1
  Number of rows input=166333
  Number of rows output=166333
  Size of merge runs=[93695]
  Sort type=external
constructor time (milliseconds) = 0
open time (milliseconds) = 14297
next time (milliseconds) = 32
close time (milliseconds) = 0
optimizer estimated row count: 78377.51
optimizer estimated cost: 166745.12
Source result set:
  Index Row to Base Row ResultSet for ORDERS:
  Number of opens = 1
  Rows seen = 166333
  Columns accessed from heap = {0, 1, 2, 3, 4, 5, 6}
  constructor time (milliseconds) = 0
  open time (milliseconds) = 0
  next time (milliseconds) = 10488
  close time (milliseconds) = 0
  optimizer estimated row count: 78377.51
  optimizer estimated cost: 166745.12
  Index Scan ResultSet for ORDERS using index IX_ORDERS_TIME at read committed isolation level using instantaneous share row locking chosen by the optimizer
    Number of opens = 1
    Rows seen = 166333
    Rows filtered = 0
    Fetch Size = 16
    constructor time (milliseconds) = 0
    open time (milliseconds) = 0
    next time (milliseconds) = 3438
    close time (milliseconds) = 0
    next time in milliseconds/row = 0
    scan information:
      Bit set of columns fetched=All
      Number of columns fetched=2
      Number of deleted rows visited=0
      Number of pages visited=887
      Number of rows qualified=166333
      Number of rows visited=166333
      Scan type=btree
      Tree height=3
      start position: on first 1 column(s). Ordered null semantics on the following columns:
      stop position: = on first 1 column(s). Ordered null semantics on the following columns:
      qualifiers: None
    optimizer estimated row count: 78377.51
    optimizer estimated cost: 166745.12

--scott

-Original Message-
From: Sunitha Kambhampati [mailto:[EMAIL PROTECTED]]
Sent: Friday, September 16, 2005 5:55 PM
To: Derby Discussion
Subject: Re: derby performance and 'order by'

Scott Ogden wrote:

I have observed some interesting query performance behavior and am hoping someone here can explain. In my scenario, it appears that an existing index is not being used for the 'order by' part of the operation and as a result the performance of certain queries is suffering. Can someone explain if this is supposed to be what is happening and why? Please see below for the specific queries and their performance characteristics.

Here are the particulars:

create table orders(
  order_id varchar(50) NOT NULL CONSTRAINT ORDERS_PK PRIMARY KEY,
  amount numeric(31,2),
  time date,
  inv_num varchar(50),
  line_num varchar(50),
  phone varchar(50),
  prod_num varchar(50));

--Load a large amount of data (720,000 records) into the 'orders' table
--Create an index on the time column as that will be used in the 'where'
Re: derby performance and 'order by'
Actually, it sounds like the problem of finding the top 1000 rows out of 166333 rows is different from sorting 166333 rows, and maybe it could be optimized. There is no need to sort all 166333, but the information that we are only looking for 1000 rows would have to be passed all the way down to the point where Derby decides to sort. I have not thought through the details of an algorithm, but when the nRows we want is substantially smaller than TotalRows, I just feel there should be a better way to pick those nRows. For example, if nRows were 1, then all we had to do would be 1 single pass on 166333 rows to find the max. That is quite different than sorting all, and this idea should be possible to generalize on 1 <= nRows < TotalRows.

Ali

Craig Russell [EMAIL PROTECTED] wrote:

Hi Scott,

From the query plan it appears that your filter selects 166,333 rows, of which you want the first 1000 according to the ordering of the order_id column. You can see that this is an effective strategy because Number of rows qualified=166333 and Number of rows visited=166333. There's no time lost visiting rows that don't qualify.

The database has to sort the 166,333 rows because the results are ordered according to the index scan column "time", not according to the order_id column. All of the rows need to be sorted even though you only want the first 1000 rows. I'd guess that the sorting of the 166,333 rows is what accounts for the 15 second delay you are experiencing.

The index on order_id doesn't do you any good because you have a result that isn't indexed on order_id. If this isn't obvious, try to think of an algorithm that would use the order_id index on the result set.

Craig

On Sep 19, 2005, at 9:29 AM, scotto wrote:

So for the second query:

select * from orders where time > '10/01/2002' and time < '11/30/2002' order by order_id;

the query plan shows that the index IX_ORDERS_TIME is used to filter the result set by time.
The order by step does not use the primary key index to sort the results after the filter step. My questions:

--Is it correct that the sort step does not use the primary key index in this case?
--Why is it not possible to use the index on order_id to sort after the filter has happened?

Here is the query plan:

Statement Name: null
Statement Text: select * from orders where time > '10/01/2002' and time < '11/30/2002' order by order_id
Parse Time: 0
Bind Time: 0
Optimize Time: 0
Generate Time: 0
Compile Time: 0
Execute Time: 14329
Begin Compilation Timestamp : null
End Compilation Timestamp : null
Begin Execution Timestamp : 2005-09-19 09:20:06.171
End Execution Timestamp : 2005-09-19 09:20:20.5
Statement Execution Plan Text:
Sort ResultSet:
Number of opens = 1
Rows input = 166333
Rows returned = 1000
Eliminate duplicates = false
In sorted order = false
Sort information:
  Number of merge runs=1
  Number of rows input=166333
  Number of rows output=166333
  Size of merge runs=[93695]
  Sort type=external
constructor time (milliseconds) = 0
open time (milliseconds) = 14297
next time (milliseconds) = 32
close time (milliseconds) = 0
optimizer estimated row count: 78377.51
optimizer estimated cost: 166745.12
Source result set:
  Index Row to Base Row ResultSet for ORDERS:
  Number of opens = 1
  Rows seen = 166333
  Columns accessed from heap = {0, 1, 2, 3, 4, 5, 6}
  constructor time (milliseconds) = 0
  open time (milliseconds) = 0
  next time (milliseconds) = 10488
  close time (milliseconds) = 0
  optimizer estimated row count: 78377.51
  optimizer estimated cost: 166745.12
  Index Scan ResultSet for ORDERS using index IX_ORDERS_TIME at read committed isolation level using instantaneous share row locking chosen by the optimizer
    Number of opens = 1
    Rows seen = 166333
    Rows filtered = 0
    Fetch Size = 16
    constructor time (milliseconds) = 0
    open time (milliseconds) = 0
    next time (milliseconds) = 3438
    close time (milliseconds) = 0
    next time in milliseconds/row = 0
    scan information:
      Bit set of columns fetched=All
      Number of columns fetched=2
      Number of deleted rows visited=0
      Number of pages visited=887
      Number of rows qualified=166333
      Number of rows visited=166333
      Scan type=btree
      Tree height=3
      start position: on first 1 column(s). Ordered null semantics on the following columns:
      stop position: = on first 1 column(s). Ordered null semantics on the following columns:
      qualifiers: None
    optimizer estimated row count: 78377.51
    optimizer estimated cost: 166745.12

--scott

-Original Message-
From: Sunitha Kambhampati [mailto:[EMAIL PROTECTED]]
Sent: Friday, September 16, 2005 5:55 PM
To: Derby Discussion
Subject: Re: derby performance and 'order by'

Scott Ogden wrote:

I have observed some interesting query performance behavior and am hoping someone here can explain. In my scenario, it appears that an existing index is not being used for
Re: derby performance and 'order by'
Suavi Ali Demir wrote:

Actually, it sounds like the problem of finding the top 1000 rows out of 166333 rows is different from sorting 166333 rows, and maybe it could be optimized. There is no need to sort all 166333, but the information that we are only looking for 1000 rows would have to be passed all the way down to the point where Derby decides to sort. I have not thought through the details of an algorithm, but when the nRows we want is substantially smaller than TotalRows, I just feel there should be a better way to pick those nRows. For example, if nRows were 1, then all we had to do would be 1 single pass on 166333 rows to find the max. That is quite different than sorting all, and this idea should be possible to generalize on 1 <= nRows < TotalRows.

One optimization would be to pass the 1,000 down from Statement.setMaxRows to the sorter. Then the sorter could keep the sort set at 1000 rows, discarding any rows moved to or inserted at the end. This would most likely enable the sorted set to remain in memory, rather than spilling to disk. Even more likely if the application sets max rows to a reasonable number to be viewed in a single on-screen page (e.g. 20-50). Or an even wilder idea would be to have a predicate that is modified on the fly to represent the maximum once the sorted set reaches 1,000. E.g. an additional predicate of <= MAX_INT for an INT column.

Dan.
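Dan's "keep the sort set at 1000 rows, discarding any rows inserted at the end" idea can be sketched with a bounded heap (a hypothetical illustration, not Derby's actual sorter): keep the n best keys seen so far in a max-heap, so each incoming row is either rejected in O(1) or replaces the current worst candidate, giving O(rows x log n) time and O(n) memory instead of a full external sort.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class TopN {
    // Returns the n smallest keys from the input in ascending order, keeping
    // at most n entries in memory -- the effect of pushing setMaxRows down to
    // the sorter instead of sorting all 166,333 qualifying rows.
    public static List<String> smallestN(Iterable<String> keys, int n) {
        // Max-heap over the candidate set: the root is the largest of the
        // n smallest seen so far, i.e. the next row to evict.
        PriorityQueue<String> heap = new PriorityQueue<>(Collections.reverseOrder());
        for (String k : keys) {
            if (heap.size() < n) {
                heap.add(k);
            } else if (k.compareTo(heap.peek()) < 0) {
                heap.poll();   // discard the current worst candidate
                heap.add(k);
            } // else: k cannot be among the n smallest, reject in O(1)
        }
        List<String> out = new ArrayList<>(heap);
        Collections.sort(out); // final ordering of just n entries
        return out;
    }
}
```

For n = 1000 and 166,333 input rows this sorts only 1000 entries at the end, which is why such a set would comfortably stay in memory.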
Re: derby performance and 'order by'
Hi Ali,

On Sep 19, 2005, at 10:26 AM, Suavi Ali Demir wrote:

Actually, it sounds like the problem of finding the top 1000 rows out of 166333 rows is different from sorting 166333 rows, and maybe it could be optimized.

Indeed.

There is no need to sort all 166333, but the information that we are only looking for 1000 rows would have to be passed all the way down to the point where Derby decides to sort. I have not thought through the details of an algorithm, but when the nRows we want is substantially smaller than TotalRows, I just feel there should be a better way to pick those nRows. For example, if nRows were 1, then all we had to do would be 1 single pass on 166333 rows to find the max. That is quite different than sorting all, and this idea should be possible to generalize on 1 <= nRows < TotalRows.

I agree that this would be a useful improvement. Now, how do we tell the back end that we want only the first 1000 rows? Or more generally (as found in competitive products): How do we tell the back end that we want to skip the first N and return the next M rows?

Craig

Ali

Craig Russell [EMAIL PROTECTED] wrote:

Hi Scott,

From the query plan it appears that your filter selects 166,333 rows, of which you want the first 1000 according to the ordering of the order_id column. You can see that this is an effective strategy because Number of rows qualified=166333 and Number of rows visited=166333. There's no time lost visiting rows that don't qualify. The database has to sort the 166,333 rows because the results are ordered according to the index scan column "time", not according to the order_id column. All of the rows need to be sorted even though you only want the first 1000 rows. I'd guess that the sorting of the 166,333 rows is what accounts for the 15 second delay you are experiencing. The index on order_id doesn't do you any good because you have a result that isn't indexed on order_id. If this isn't obvious, try to think of an algorithm that would use the order_id index on the result set.
Craig

On Sep 19, 2005, at 9:29 AM, scotto wrote:

So for the second query:

select * from orders where time > '10/01/2002' and time < '11/30/2002' order by order_id;

the query plan shows that the index IX_ORDERS_TIME is used to filter the result set by time. The order by step does not use the primary key index to sort the results after the filter step. My questions:

--Is it correct that the sort step does not use the primary key index in this case?
--Why is it not possible to use the index on order_id to sort after the filter has happened?

Here is the query plan:

Statement Name: null
Statement Text: select * from orders where time > '10/01/2002' and time < '11/30/2002' order by order_id
Parse Time: 0
Bind Time: 0
Optimize Time: 0
Generate Time: 0
Compile Time: 0
Execute Time: 14329
Begin Compilation Timestamp : null
End Compilation Timestamp : null
Begin Execution Timestamp : 2005-09-19 09:20:06.171
End Execution Timestamp : 2005-09-19 09:20:20.5
Statement Execution Plan Text:
Sort ResultSet:
Number of opens = 1
Rows input = 166333
Rows returned = 1000
Eliminate duplicates = false
In sorted order = false
Sort information:
  Number of merge runs=1
  Number of rows input=166333
  Number of rows output=166333
  Size of merge runs=[93695]
  Sort type=external
constructor time (milliseconds) = 0
open time (milliseconds) = 14297
next time (milliseconds) = 32
close time (milliseconds) = 0
optimizer estimated row count: 78377.51
optimizer estimated cost: 166745.12
Source result set:
  Index Row to Base Row ResultSet for ORDERS:
  Number of opens = 1
  Rows seen = 166333
  Columns accessed from heap = {0, 1, 2, 3, 4, 5, 6}
  constructor time (milliseconds) = 0
  open time (milliseconds) = 0
  next time (milliseconds) = 10488
  close time (milliseconds) = 0
  optimizer estimated row count: 78377.51
  optimizer estimated cost: 166745.12
  Index Scan ResultSet for ORDERS using index IX_ORDERS_TIME at read committed isolation level using instantaneous share row locking chosen by the optimizer
    Number of opens = 1
    Rows seen = 166333
    Rows filtered = 0
    Fetch Size = 16
    constructor time (milliseconds) = 0
    open time (milliseconds) = 0
    next time (milliseconds) = 3438
    close time (milliseconds) = 0
    next time in milliseconds/row = 0
    scan information:
      Bit set of columns fetched=All
      Number of columns fetched=2
      Number of deleted rows visited=0
      Number of pages visited=887
      Number of rows qualified=166333
      Number of rows visited=166333
      Scan type=btree
      Tree height=3
      start position: on first 1 column(s). Ordered null semantics on the following columns:
      stop position: = on first 1 column(s). Ordered
Re: FYI: Derby 'getting started' article in Sep. issue of Linux Magazine
Stanley Bradbury wrote:

Anyone just getting started with Derby can find some useful information in the September 2005 issue of Linux Magazine, now available at newsstands. The article is titled "Derby: the Java Relational Database" and addresses the differences between programming using the Derby embedded driver vs. using the Derby Network Client driver. It is expected that the September issue will be made available online in about 60 days. The Linux Magazine website is: http://www.linux-mag.com/. Please post to this mail thread any feedback on how the article might be improved or what other 'getting started' topics would be helpful to have documented.

Stan's article is now available on-line: http://www.linux-mag.com/content/view/2134/

-jean
Re: derby performance and 'order by'
How about this: Make 1 pass through the big chunk, which is 166333 rows (or could be millions). For each row, decide whether or not it belongs to the final 1000 chunk. To do this efficiently, the tricky part needs to be on the 1000-chunk side. Steps:

1. Keep and maintain max-min values for this chunk (the 1000-row result) at all times.
2. If the value is above max, accept right away, add to the top. If required (when the chunk is already full): drop the last value from the bottom.
3. If the value is below min and we already have 1000 in hand, reject right away. If we have less than 1000: add this value to the bottom.
4. If the value falls in between max and min: interpolate (or simple binary search, or some kind of a hashtable that keeps values sorted) to find the exact location for this value in our 1000 chunk. Insert the value in the proper location. If required: drop the last value from the bottom and adjust min-max.
5. When we are done scanning through the 166333 rows, we will have our 1000 chunk in hand, ready to return.

This looks more scalable than sorting 100s of thousands of rows when Derby returns small chunks out of big big tables. It may be slower when the chunk size is bigger. If hash sort etc. (constant time) is used to decide the position of a value in the final chunk, then even if slower, it will still be scalable (it should not be grossly bad even if the optimizer picks this algorithm over a normal sort when it should not have done so).

For parallelism, it gives the opportunity (simple enough to outline here on the fly) to divide the whole task (divide 166333 into chunks of 16K rows for 10 threads to work on, for example) into multiple threads, where the min-max, insert, and drop-from-bottom kinds of operations need to be protected, and modifications on the final chunk structure need to be communicated to other threads. Assuming reading a row from the 166333 result needs IO, when one thread is doing IO, another thread will be trying to decide where to insert its newly found value, and it *might* bring performance gains.
Since the final chunk needs to be synchronized at various points, two threads cannot insert at the same time, and one thread cannot read the chunk structure while another is inserting a value into it... Come to think of it, it might run faster in the single-threaded version when synchronization is not involved.

Regards, Ali

Daniel John Debrunner [EMAIL PROTECTED] wrote:

Suavi Ali Demir wrote:

Actually, it sounds like the problem of finding the top 1000 rows out of 166333 rows is different from sorting 166333 rows, and maybe it could be optimized. There is no need to sort all 166333, but the information that we are only looking for 1000 rows would have to be passed all the way down to the point where Derby decides to sort. I have not thought through the details of an algorithm, but when the nRows we want is substantially smaller than TotalRows, I just feel there should be a better way to pick those nRows. For example, if nRows were 1, then all we had to do would be 1 single pass on 166333 rows to find the max. That is quite different than sorting all, and this idea should be possible to generalize on 1 <= nRows
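The numbered steps above can be sketched as follows (a hypothetical illustration, not Derby code; a plain sorted list stands in for the "hashtable that keeps values sorted", so each accepted insert costs O(chunk size) to shift elements, which a balanced structure would avoid):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SingleFileTopChunk {
    // One pass over the input, maintaining the chunk sorted ascending:
    // reject below-min values outright (step 3), binary-search the insertion
    // point otherwise (step 4), and drop from the bottom when full (step 2).
    public static List<Integer> largestN(Iterable<Integer> rows, int n) {
        List<Integer> chunk = new ArrayList<>(); // kept sorted ascending
        for (Integer v : rows) {
            if (chunk.size() == n && v <= chunk.get(0)) {
                continue; // below the chunk's min and chunk is full: reject
            }
            int pos = Collections.binarySearch(chunk, v);
            if (pos < 0) pos = -pos - 1; // convert to insertion point
            chunk.add(pos, v);
            if (chunk.size() > n) {
                chunk.remove(0); // drop the smallest from the bottom
            }
        }
        return chunk; // step 5: ascending; reverse for "top first"
    }
}
```

As the mail notes, most rows in a large scan fail the min test and are rejected in O(1), so the per-row cost stays low even though inserts into the chunk itself are linear.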
Re: GUI for Derby
Mamta Satoor wrote:

It will be nice to put these GUI interface options in the Derby FAQ.

Initially, I'm just trying to clean up the GUI tools links at http://db.apache.org/derby/integrate/misc.html#Products+by+Type . Then I'll add a link to that from the Derby FAQ. Right now the summary page lists 3 products for GUI tools:

1) Eclipse
The Eclipse link is way old (Oct 2004). Since then, the Derby Plug-ins have been added. Also the Web Tools Project http://www.eclipse.org/webtools/ bundles derby (albeit 10.0) and even provides a tutorial that includes derby at http://www.eclipse.org/webtools/wst/components/rdb/RDBTutorial.html . Anything more to say here? I'd like to add something about Eclipse + Hibernate, since Charlie raves about that (see http://mail-archives.apache.org/mod_mbox/db-derby-user/200509.mbox/[EMAIL PROTECTED] ), but only if somebody has an URL handy that shows how to configure that combination.

2) iSQL-Viewer
I haven't seen lots of posts about this on derby-user.

3) SQuirreL SQL
Lots of posts to derby-user indicate this works well.

anything else?

thanks, -jean
Re: GUI for Derby
Jean T. Anderson wrote: ... Also the Web Tools Project http://www.eclipse.org/webtools/ bundles derby (albeit 10.0) and even provides a tutorial that includes derby at http://www.eclipse.org/webtools/wst/components/rdb/RDBTutorial.html . I misspoke on the bundling -- wtp does not include the derby jars. It just includes the support for creating a 10.0 connection (which is easily pointed at the 10.1 jars). sorry for any confusion, -jean
Wish List addition
I would like to add 'full text indexing' as a wishlist item. There have been some good tutorials with Lucene, but I prefer an integrated solution. Does anyone have some schema or triggers to get this done?