Re: Puzzling error
The problem is creating a tmp file for query execution; it is probably not a problem with the file holding the actual data. I think Derby does this when it needs to do a deferred update, and also for some deferred constraint checking. As others have said, check permissions/existence on /Users/dmclean/IdeaProjects/aoiIteration16/testPersStore/tmp

On 8/4/2016 7:32 AM, Donald McLean wrote:

I am running Derby 10.12.1.1 under Java 1.8.0u102. I am getting an exception trying to delete a record that I added not long before (as part of a unit test) (see stack trace below). Any thoughts, hints, or ideas as to what I'm doing wrong would really be appreciated.

Thank you,

Donald

Caused by: java.sql.SQLException: Exception during creation of file /Users/dmclean/IdeaProjects/aoiIteration16/testPersStore/tmp/T1470320730508.tmp for container
 at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeLargeUpdate(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(Unknown Source) ~[derby.jar:na]
 at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.executeUpdate(ResultSetReturnImpl.java:204) ~[hibernate-core-5.0.5.Final.jar:5.0.5.Final]
 ... 36 common frames omitted
Caused by: org.apache.derby.iapi.error.StandardException: Exception during creation of file /Users/dmclean/IdeaProjects/aoiIteration16/testPersStore/tmp/T1470320730508.tmp for container
 at org.apache.derby.iapi.error.StandardException.newException(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown Source) ~[derby.jar:na]
 ... 48 common frames omitted
Caused by: java.io.FileNotFoundException: /Users/dmclean/IdeaProjects/aoiIteration16/testPersStore/tmp/T1470320730508.tmp (No such file or directory)
 at java.io.RandomAccessFile.open0(Native Method) ~[na:1.8.0_102]
 at java.io.RandomAccessFile.open(RandomAccessFile.java:316) ~[na:1.8.0_102]
 at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243) ~[na:1.8.0_102]
 at org.apache.derby.impl.io.DirRandomAccessFile.<init>(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.io.DirFile.getRandomAccessFile(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.store.raw.data.RAFContainer.run(Unknown Source) ~[derby.jar:na]
 at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_102]
 at org.apache.derby.impl.store.raw.data.RAFContainer.createContainer(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.store.raw.data.FileContainer.createIdent(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.store.raw.data.FileContainer.createIdentity(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.services.cache.ConcurrentCache.create(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.addContainer(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.store.raw.xact.Xact.addContainer(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.store.access.heap.Heap.create(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.store.access.heap.HeapConglomerateFactory.createConglomerate(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.store.access.RAMTransaction.createConglomerate(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.sql.execute.TemporaryRowHolderImpl.insert(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.sql.execute.UpdateResultSet.collectAffectedRows(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.sql.execute.UpdateResultSet.open(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(Unknown Source) ~[derby.jar:na]
 at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown Source) ~[derby.jar:na]
 ... 41 common frames omitted

--
email: Mike Matrigali - mikema...@gmail.com
linkedin: https://www.linkedin.com/in/MikeMatrigali
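A quick way to rule out the permissions/existence question before the test run is a check along these lines (a minimal sketch; the path is the one from the stack trace, and nothing here is Derby API):

import java.io.File;

public class CheckTmpDir {
    public static void main(String[] args) {
        // Sanity-check the Derby database's tmp directory before the tests run.
        File tmp = new File("/Users/dmclean/IdeaProjects/aoiIteration16/testPersStore/tmp");
        if (!tmp.exists()) {
            System.out.println("tmp directory is missing: " + tmp);
        } else if (!tmp.canWrite()) {
            System.out.println("tmp directory is not writable: " + tmp);
        } else {
            System.out.println("tmp directory looks fine");
        }
    }
}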
Re: OutOfMemoryError
Did you get the error when adding the primary key? If you got the error on a different query, could you post that query, and possibly the query plan being executed? A possible workaround might be to see if you could create a different index such that the system could pick a query plan where the sort is not needed.

As you mention, Derby does try to estimate the amount of memory ahead of time, but it has a number of problems:
1) The Java interfaces do not work well for estimating the amount of memory available in the future.
2) Derby's estimates for blob/clob size are often wrong. I believe XML uses the clob implementation internally.

I would guess the technical problem here is that Derby underestimated the sort size. Providing a reproducible test case that you can check into JIRA will increase the likelihood that a developer will look at the issue.

/mikem

On 6/3/2015 11:39 PM, Mattias Andersson wrote:

We got a 'java.lang.OutOfMemoryError: Java heap space' when using a Derby disk database with the following schema:

CREATE TABLE TEST (ID BIGINT NOT NULL, NEXT_ID BIGINT, PREV_ID BIGINT, TEST_XML XML);
ALTER TABLE TEST ADD PRIMARY KEY (ID);

I opened the heap dump with Eclipse Memory Analyzer and found that three instances of org.apache.derby.impl.store.access.sort.MergeInserter occupy 290 MB of memory (that is 66% of the total heap). The problem seems to be that MergeInserter holds a sort buffer that contains complete rows, and the TEST_XML values are about 160 kB each. MergeInserter has an advanced algorithm to calculate the allowed size of the buffer, but is it necessary to store the complete row in the buffer, i.e. the non-indexed values (TEST_XML in this case)? Is there some property I can set to limit the buffer size, or is it possible to rewrite the SQL schema some way to exclude the XML type from the buffer?
Br Mattias

Following is the dominator tree from Eclipse Memory Analyzer:

org.apache.derby.impl.store.access.sort.MergeInserter @ 0xe3f72e20 | 72 | 127 533 080 | 30,76%
|- org.apache.derby.impl.store.access.sort.SortBuffer @ 0xe3f73018 | 48 | 127 532 792 | 30,76%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fed5e0 | 40 | 164 384 | 0,04%
| | '- org.apache.derby.iapi.types.DataValueDescriptor[4] @ 0xee3da280 | 32 | 164 344 | 0,04%
| |    |- org.apache.derby.iapi.types.SQLClob @ 0xee3da2a0 | 56 | 164 240 | 0,04%
| |    |  |- java.lang.String @ 0xee3da2d8  <?xml version="1.0"?>... | 24 | 164 160 | 0,04%
| |    |  |- char[1][] @ 0xee3da2f0 | 24 | 24 | 0,00%
| |    |  '- Total: 2 entries
| |    |- org.apache.derby.iapi.types.SQLLongint @ 0xee3da308 | 24 | 24 | 0,00%
| |    |- org.apache.derby.iapi.types.SQLLongint @ 0xee3da320 | 24 | 24 | 0,00%
| |    |- org.apache.derby.iapi.types.SQLLongint @ 0xee3da338 | 24 | 24 | 0,00%
| |    '- Total: 4 entries
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fed608 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fed630 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fed658 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fed680 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fed6a8 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fed6d0 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fed6f8 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fed720 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fedd50 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fedd78 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fedda0 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3feddc8 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fedff8 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fef5c8 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fef5f0 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3fef618 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3ff01f8 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3ff04c8 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3ff0608 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3ff0630 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3ff0658 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3ff0680 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3ff06a8 | 40 | 164 384 | 0,04%
| |- org.apache.derby.impl.store.access.sort.Node @ 0xe3ff06d0 | 40 | 164 384 | 0,04%
| '- Total: 25 of 780 entries; 755

--
email: Mike Matrigali - mikema...@gmail.com
linkedin: https://www.linkedin.com/in/MikeMatrigali
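One shape the schema rewrite floated above could take (a hedged sketch, not a confirmed fix; the TEST_XML_DATA side table and the database URL are invented): keep the wide XML payload out of the table whose primary-key build drives the sort, so only the narrow BIGINT columns pass through the sort buffer.

import java.sql.*;

public class SplitXmlColumn {
    public static void main(String[] args) throws SQLException {
        try (Connection c = DriverManager.getConnection("jdbc:derby:testdb;create=true");
             Statement s = c.createStatement()) {
            // Narrow table: only the columns the primary-key sort needs.
            s.executeUpdate("CREATE TABLE TEST (ID BIGINT NOT NULL, NEXT_ID BIGINT, PREV_ID BIGINT)");
            s.executeUpdate("ALTER TABLE TEST ADD PRIMARY KEY (ID)");
            // The ~160 kB XML values live in a side table joined by ID,
            // so they never enter the sort buffer for the index build.
            s.executeUpdate("CREATE TABLE TEST_XML_DATA "
                          + "(ID BIGINT NOT NULL REFERENCES TEST (ID), TEST_XML XML)");
        }
    }
}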
Re: Copying encrypted DB?
This is better; I did not realize Derby supported changing the encryption key.
https://db.apache.org/derby/docs/10.10/devguide/tdevcsecurenewkeyoverview.html

On 4/15/2015 10:57 AM, Peter Ondruška wrote:

Unless I missed something, why not just create a backup and then open the backup copy and change the encryption key?

On Wednesday, 15 April 2015, John English john.fore...@gmail.com wrote:

I have a DB which is encrypted with one password, and I want to generate an identical copy of it which will be encrypted using a different password (so that I can provide copies of the same DB for two different customers without exposing one customer's password to the other). I thought at first I could create the tables from a script and then run lots of "insert into foo (select * from bar)" queries, but this won't work unless the auto-generated columns are allocated the same sequence numbers so that the foreign key references will match up. And in some cases the keys are not sequential, due to deletions. Is there an easy way to do this?

TIA,
--
John English

--
Peter Ondruška

--
email: Mike Matrigali - mikema...@gmail.com
linkedin: https://www.linkedin.com/in/MikeMatrigali
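In JDBC terms, the key change on the backup copy boils down to a pair of connection URLs, per the linked overview (a minimal sketch; database name and passwords are placeholders):

import java.sql.*;

public class ChangeBootPassword {
    public static void main(String[] args) throws SQLException {
        // Boot the copied database with its current boot password and ask
        // Derby to re-encrypt it under a new one.
        DriverManager.getConnection(
            "jdbc:derby:customerCopyDb;bootPassword=oldSecret;newBootPassword=newSecret")
            .close();
        // Shut the database down cleanly before shipping the copy on.
        try {
            DriverManager.getConnection("jdbc:derby:customerCopyDb;shutdown=true");
        } catch (SQLException expected) {
            // Derby signals a successful single-database shutdown with an
            // SQLException (SQLState 08006).
        }
    }
}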
Re: Copying encrypted DB?
I suggest trying the following. Shut down the db completely; do not do anything below while the system is running.

o Do a full backup of the database to another location. You can use the Derby tools for this, or copy it yourself if you know what you are doing.
o Unencrypt the 2nd database: http://db.apache.org/derby/docs/10.10/devguide/cdevcsecuredecryptdb.html
o Reencrypt the 2nd database: http://db.apache.org/derby/docs/10.10/devguide/tdevcsecureunencrypteddb.html (a sketch of these last two steps follows this message)

On 4/15/2015 9:58 AM, John English wrote:

I have a DB which is encrypted with one password, and I want to generate an identical copy of it which will be encrypted using a different password (so that I can provide copies of the same DB for two different customers without exposing one customer's password to the other). I thought at first I could create the tables from a script and then run lots of "insert into foo (select * from bar)" queries, but this won't work unless the auto-generated columns are allocated the same sequence numbers so that the foreign key references will match up. And in some cases the keys are not sequential, due to deletions. Is there an easy way to do this?

TIA,

--
email: Mike Matrigali - mikema...@gmail.com
linkedin: https://www.linkedin.com/in/MikeMatrigali
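A hedged sketch of the decrypt/re-encrypt steps, using the decryptDatabase and dataEncryption connection attributes described on the two pages linked above (database name and passwords are placeholders):

import java.sql.*;

public class ReencryptBackup {
    public static void main(String[] args) throws SQLException {
        // Step 2: boot the backup copy with the old password and strip encryption.
        DriverManager.getConnection(
            "jdbc:derby:backupDb;bootPassword=oldSecret;decryptDatabase=true").close();
        shutdown("backupDb");
        // Step 3: boot it again and encrypt under the new password.
        DriverManager.getConnection(
            "jdbc:derby:backupDb;dataEncryption=true;bootPassword=newSecret").close();
        shutdown("backupDb");
    }

    private static void shutdown(String db) {
        try {
            DriverManager.getConnection("jdbc:derby:" + db + ";shutdown=true");
        } catch (SQLException expected) {
            // a successful shutdown surfaces as an SQLException (SQLState 08006)
        }
    }
}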
Re: Performance degradation with Derby Embedded under Tomcat vs Jetty
Do many of the queries share a similar structure with the query you describe in point 5? Can you give an order of magnitude on the number of expected rows in t2 from that query, and maybe an estimate of the overall size of the database tables involved in all the queries? Is the target system running embedded or client/server? Derby should perform best embedded, though the number of rows being returned is pretty small, so it may not matter.

Have you, or are you willing to, look at query plans: http://wiki.apache.org/db-derby/PerformanceDiagnosisTips
If you are able to post a query plan for that query, I would be willing to look at it to see if Derby is doing a scan or if the optimizer is using the expected multi-probe join strategy.

Is it an option to change the query? Derby has 2 known issues when it comes to queries of this form:
1) Derby compile time is often longer than other systems'. Priority was given to optimizing execution, on the hope that the usual case was either a prepared statement or an internally cached statement plan. I think logStatementText or RUNTIMESTATISTICS can provide compile vs execution time in Derby.
2) I have seen issues with large in-lists (which should also equate to a large number of ORs as described below), where compilation cost is high and the optimizer chooses a scan vs the probes.

I don't know if it is an option for you, but I have seen others change queries of this form into a join on either a real or temp table, depending on whether the values are actually variable or not (see the sketch after this message).

On 4/13/2015 12:54 PM, Øyvind Harboe wrote:

5. There are some queries that stick out like a sore thumb below, e.g. the one which returns 1292 rows in table 1 below. I'm loath to reproduce the query here because it is *LONG*. It takes a few hundred arguments to match against a list of primary keys. The SQL statement contains a few hundred of these:

.. OR (t2.field = ?) OR (t2.field = ?) OR (t2.field = ?)

Stored queries are not an option here because the query is generated by Java code at the Cayenne ORM level. This query can't be simplified because the report is checking whether the records are in one out of 322 specific categories, where each category is a 'field' record as alluded to above.

Table 1.

Rows   Jetty time/ms   Tomcat time/ms
13     0      1
13     0      0
11     0      0
13     1      1
13     1      0
0      1      1
1292   2662   2625
757    262    240
100    188    193
121    163    152
818    190    211
85     164    174
41     132    133
17     140    139
118    134    133
21     144    143
58     273    131
1      133    136
1340   292    323
226    157    161
7      0      1
34     1      1
0      0      0
0      1      0
4809   50     52
91     2      1
61     1      1
0      0      0
0      1      0
1292   2684   2572
757    253    242
100    190    196
121    161    153
818    209    200
85     167    184
41     131    128
17     141    147
118    135    142
21     144    151
58     133    133
1      133    139
1340   294    311
226    156    161
7      1      0
34     0      0
0      0      0
0      0      0
4809   67     51
91     1      1
61     1      0
0      0      0
13     1      5
11     1      1

Øyvind Harboe - Can Zylin Consulting help on your project? http://www.zylin.com/

--
email: Mike Matrigali - mikema...@gmail.com
linkedin: https://www.linkedin.com/in/MikeMatrigali
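A hedged sketch of that temp-table rewrite (table and column names beyond t2.field are invented; the real query would keep its other joins): load the ~322 category keys into a session temporary table once, then join instead of ORing hundreds of parameters.

import java.sql.*;
import java.util.List;

public class InListToJoin {
    // Instead of "... OR (t2.field = ?)" repeated hundreds of times, insert
    // the category keys into a temp table and join against it.
    static ResultSet runReport(Connection c, List<Long> categoryIds) throws SQLException {
        try (Statement s = c.createStatement()) {
            s.executeUpdate("DECLARE GLOBAL TEMPORARY TABLE SESSION.CATS (ID BIGINT) "
                          + "ON COMMIT PRESERVE ROWS NOT LOGGED");
        }
        try (PreparedStatement ins =
                 c.prepareStatement("INSERT INTO SESSION.CATS VALUES (?)")) {
            for (Long id : categoryIds) {
                ins.setLong(1, id);
                ins.addBatch();
            }
            ins.executeBatch();
        }
        return c.prepareStatement(
            "SELECT t2.* FROM t2 JOIN SESSION.CATS c ON t2.field = c.id").executeQuery();
    }
}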
Open letter to the Derby community
Hello Derby users,

Those of you who also monitor postings on the derby-dev mailing list may have noticed a dramatic decline in activity over the past several months. That is because Derby's two corporate sponsors, Oracle and IBM, have significantly reduced their investment in Derby.

Contributors do continue to monitor and field questions posted to the Derby mailing lists. Committers, in their spare time, do continue to help polish and commit patches which other contributors submit. Committers do continue to support contributions through the Google Summer of Code program. However, subsidized feature development and intensive bug-fixing have tapered off.

Some committers would be happy to form a company which sells Derby support. Naturally, this depends on whether there is sufficient interest on the part of Derby users. If you are interested, then please contact Mike Matrigali and Rick Hillegas (see the email addresses cc'd above). Feel free to contact us even if your support requirements are tentative, future needs rather than immediate ones.

We are sorry to be the bearers of this bad news.

Best regards,
Mike Matrigali (mikema...@gmail.com) and Rick Hillegas (rick.hille...@gmail.com)

--
email: Mike Matrigali - mikema...@gmail.com
linkedin: https://www.linkedin.com/in/MikeMatrigali
Re: RawStoreDaemon thread in Derby Embedded Database
On 2/24/2015 4:31 AM, Vivek Choudhary wrote:

Please can someone provide me information regarding the use of the RawStoreDaemon thread in the Derby database?

2 main uses off the top of my head:

1) Following events that result in rows being deleted from a page (like deletes, or aborted inserts), Derby will post work to the daemon thread to be executed after the current transaction commits. This work involves reading the page, getting locks on rows, changing the page if it finds the deleted rows are committed, and finally committing a separate transaction that will eventually result in writing that page to disk plus the related log I/O.

2) Consistency points. Derby will schedule a consistency point after a configurable amount of log I/O has completed (see the sketch after this message). At that point it schedules work to the daemon thread to cycle through every page in the cache and write those that are dirty. It also writes a log record to the transaction log. Once that work is done, the system knows that dirty pages associated with transactions that committed before that point in the log have been written, and that those log records are no longer needed for system reboot recovery.

I am pretty new to the Derby database and was looking for some information regarding this. Also, since this is a daemon thread, does it impact CPU usage during application runtime?

Yes, though most of the impact is I/O-related rather than CPU-related.

Thanks.
Vivek Choudhary

--
email: Mike Matrigali - mikema...@gmail.com
linkedin: https://www.linkedin.com/in/MikeMatrigali
available for Derby related contract projects.
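If I read the tuning guide right, the "configurable amount of log I/O" in point 2 is the derby.storage.checkpointInterval property (an assumption worth verifying against your Derby version's docs); raising it would look roughly like this, with the 20 MB value purely illustrative:

public class CheckpointTuning {
    public static void main(String[] args) throws Exception {
        // Bytes of transaction log written between checkpoints; shown as a
        // JVM system property set before the engine boots, but it can also
        // go in derby.properties.
        System.setProperty("derby.storage.checkpointInterval", "20971520"); // 20 MB
        java.sql.DriverManager.getConnection("jdbc:derby:mydb;create=true").close();
    }
}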
Re: Derby with Indic UTF8 languages
On 2/13/2015 4:04 AM, BalNagendra wrote:

Hi, I am writing an Indic application which uses Sanskrit, Hindi and Telugu as languages. These are supported by UTF-8. When I try to insert these rows in Derby, everything comes out as '?'. After tooling around some, I realized maybe it is a Windows cmd.com problem. So I upgraded to Console2, which shows these characters well. But still, whatever I insert comes back as '?'.

http://apache-database.10148.n7.nabble.com/file/n143790/Indic.jpg

Is there something wrong in what I am trying? 1. I learnt somewhere that Derby supports only the following languages, and Indic is not one of them. Elsewhere it is written that it can support any language which Java Unicode has support for. I tested my Java program and it displays Indic well. So what is wrong?

http://apache-database.10148.n7.nabble.com/file/n143790/DerbySupportedLanguages.jpg

I am using Derby 10.11 on Windows XP and Java 1.7. Thanks in advance.

This is likely a tools issue rather than a database issue. As noted, the Derby database can store any string that is Unicode. What you may be seeing is an issue with ij displaying the result. Given that in your example ij is printing English prompts, it is likely interpreting the returned string as English and, since it is not, it can't print it. See the following about using the Derby tools and language and codeset issues: http://db.apache.org/derby/docs/10.11/tools/derbytools.pdf

Derby does have a limited number of languages that various messages can be printed in, so it is unlikely that will work for you. I don't have much experience with non-English environments, but I can verify that Derby should have no problems storing and retrieving any Unicode-based string. Maybe try writing a simple Java/JDBC program rather than using ij, and see if that gives you the results you expect.
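A minimal sketch of such a test program (the database, table and sample Devanagari string are invented). It forces UTF-8 on the output stream, so the console encoding, not Derby, is what is being tested:

import java.io.PrintStream;
import java.sql.*;

public class IndicRoundTrip {
    public static void main(String[] args) throws Exception {
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        try (Connection c = DriverManager.getConnection("jdbc:derby:indicdb;create=true");
             Statement s = c.createStatement()) {
            s.executeUpdate("CREATE TABLE WORDS (W VARCHAR(100))");
            try (PreparedStatement ps = c.prepareStatement("INSERT INTO WORDS VALUES (?)")) {
                // "Sanskrit" written in Devanagari, as Unicode escapes
                ps.setString(1, "\u0938\u0902\u0938\u094D\u0915\u0943\u0924");
                ps.executeUpdate();
            }
            try (ResultSet rs = s.executeQuery("SELECT W FROM WORDS")) {
                while (rs.next()) {
                    out.println(rs.getString(1)); // round-trips intact if the console can render it
                }
            }
        }
    }
}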
Re: Getting fail to start database and unknown page format Exception
On 1/14/2015 6:34 AM, Bryan Pendleton wrote:

It looks like the derby database is corrupt, and it is causing a critical problem in our end derby.log http://apache-database.10148.n7.nabble.com/file/n143610/derby.log and with huge business impact. And the path where it says derby was not found is a valid and correct path. Still, why did derby suddenly get corrupted, and what are the reasons for the corruption?

There are various possibilities:

1) You experienced a hard-disk failure on the disk holding /usr/ironhide/var/db
2) or, you experienced a power-failure which caused the operating system to write incorrect data to the filesystem, or fail to flush its buffers properly, perhaps because write-caching was enabled on the filesystem hardware
3) or, you filled up the filesystem /usr/ironhide/var/db, and all the databases and their redo logs were on the same filesystem and all were damaged simultaneously
4) or, some other piece of software, not Derby, wrote updates to the Derby files directly

There are certainly other possibilities. Have you inspected your system logs and talked to your system administrators for more information about what was going on at the moment that you noticed the problems? Do you have backups of these databases? Can you print the *full* exception logs from your attempts to access the databases (http://wiki.apache.org/db-derby/UnwindExceptionChain)?

thanks,
bryan

I agree with bryan's suggestions. A full derby.log may help. The following also may be helpful: https://wiki.apache.org/db-derby/DatabaseCorruption
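For the "full exception logs" request, the linked wiki page boils down to walking the SQLException chain; a minimal version, assuming nothing beyond java.sql:

import java.sql.SQLException;

public class UnwindExceptionChain {
    // Print every nested SQLException, not just the outermost one.
    public static void printFullChain(SQLException e) {
        while (e != null) {
            System.err.println("SQLState: " + e.getSQLState()
                             + "  ErrorCode: " + e.getErrorCode()
                             + "  Message: " + e.getMessage());
            e = e.getNextException();
        }
    }
}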
Re: How to get expected time or processing percentage of Backup | Restore | a long running query
On 1/8/2015 11:16 PM, kosurusekhar wrote:

Hi folks, we have an application using a derby database, which has backup | restore | cleanup (delete processed records from DB) options. The users of our application are asking us to show a processing percentage or estimated time, so they know whether the process is working or hung. Is there any kind of feature or script to get this? Please let me know the possibilities.

I am not aware of any Derby features that will give you this immediately. Some suggestions that you could work on:

backup: Almost all of derby backup time is about copying the files. Derby provides a mechanism for you to implement your own backup, by simply calling an interface to tell derby when you are starting and when you are stopping. You could do this and then, depending on what mechanism you are using, compare % progress vs total db size. The following may depend on OS permissions: you could use the derby default backup and, in a separate thread, monitor source size vs destination size (see the sketch after this message). The caveat here is to make sure your monitoring does not get in the way of the I/O that derby is doing and make the backup fail.

It would be interesting to file a JIRA for this request and explain what kind of interface you are looking for. The backup part of derby is very self-contained, and this could be a good project for someone new looking to contribute. At the lowest level, derby-implemented backup just does a file by file and page by page copy of the database, so it should not be too hard to implement logic to understand how big the db is to start, how much has been copied so far, and how much time it has taken so far. I am not sure of the best way to communicate this to the caller. Anyone know how other products deal with this?

restore: Restore has 2 parts. The first, again, is just an I/O bound phase of copying the whole db (if that is the restore mode); this would lend itself to the ideas above. The second is some database-log-level restore code, which would need to be handled at the low level of the log. Since restore is a boot time operation, it is much harder to communicate to the calling routine, as there is no connection yet.

cleanup: I am not sure what you mean by this one. Is it the progress of a single delete statement?

Thanks in advance.
Regards
Sekhar.
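A rough sketch of the separate-thread monitoring idea (paths are placeholders, and as noted above the monitoring must stay out of the way of the backup's own I/O): while SYSCS_UTIL.SYSCS_BACKUP_DATABASE runs elsewhere, compare bytes copied into the backup target against the size of the source database directory.

import java.io.File;

public class BackupProgress {
    static long dirSize(File dir) {
        long total = 0;
        File[] files = dir.listFiles();
        if (files != null) {
            for (File f : files) {
                total += f.isDirectory() ? dirSize(f) : f.length();
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        File source = new File("/data/mydb");     // the live database directory
        File dest = new File("/backups/mydb");    // the backup target
        long sourceSize = dirSize(source);
        long copied;
        do {
            copied = dirSize(dest);
            long pct = sourceSize == 0 ? 0 : 100 * copied / sourceSize;
            System.out.println("backup roughly " + pct + "% done");
            Thread.sleep(5000);
        } while (copied < sourceSize);
    }
}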
Re: Index for TIMESTAMP COLUMN
I don't think there is anything specifically bad about a timestamp index vs another index. All indexes add extra overhead at insert/delete/update time, as they are implemented as a separate file maintaining a btree index on disk. For your application it would depend on whether the benefit to queries that use the timestamp column outweighs the overhead of maintaining the index.

Providing performance advice is hard. About the only generic thing I can say about insert performance is to do as many inserts as possible in a single transaction, so as to avoid I/O-bound performance on the synchronous write of the commit record. Also use, and reuse, prepared statements where possible (see the sketch after this message). There are a lot of successful derby applications doing large numbers of inserts, judging from past postings on the derby user and derby developer lists.

To get more help, I suggest you post more details to the list for those users to help. Best would be to include a reproducible test case of the problem area, including source code; that makes it most likely you will get useful feedback. Also including some basic performance goals would be good - i.e. hoping to get N inserts of rows with the following DDL in M seconds, made by P concurrent threads, on X hardware with Y disk technology.

On 1/8/2015 11:34 PM, kosurusekhar wrote:

Hi folks, we have three tables which will get a lot of insertions per minute. In all three tables we have TIMESTAMP columns, and we created indexes on these timestamp columns. The timestamp value is always the system time when the row is inserted. My doubt is whether maintaining the index tree for this kind of data (timestamp) will be overhead for the derby database. Because a timestamp includes seconds as well, I personally feel that maintaining this index structure could be a little overhead for the derby server. Please correct me if I am wrong. In this scenario, how do we improve performance while huge insertions are happening into these tables?

Thanks in Advance.
Regards
Sekhar.
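The two generic tips above, in code (a sketch; the table name, column and batch size are invented): one prepared statement reused for every row, and many rows per commit so the synchronous write of the commit log record is amortized.

import java.sql.*;

public class BatchedInserts {
    static void load(Connection c, int rows) throws SQLException {
        c.setAutoCommit(false);
        try (PreparedStatement ps =
                 c.prepareStatement("INSERT INTO EVENTS (CREATED) VALUES (?)")) {
            for (int i = 1; i <= rows; i++) {
                ps.setTimestamp(1, new Timestamp(System.currentTimeMillis()));
                ps.executeUpdate();
                if (i % 1000 == 0) {
                    c.commit(); // one commit per 1000 inserts, not per row
                }
            }
            c.commit();
        }
    }
}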
Re: Primary key auto increment sometimes fails
You should include the exact DDL you used to create the table's primary key, and the version of derby you used to create the table. I believe that this is expected behavior, but I will be sure once you post the DDL. I think all derby guarantees is that the primary keys will be unique, not that they will have no holes. I believe this is SQL standard compliant.

The underlying reason for the implementation is multi-user performance. If the system needed to guarantee no holes, then it would logically have to lock out inserts by all other transactions while a single transaction had inserts outstanding, to make sure those inserts did not abort - thus stopping all work on the table by all other transactions until this transaction committed or aborted.

/mikem

On 12/13/2014 10:24 AM, pzsolt wrote:

Hi! I have noticed that when there is a table with an auto-generated primary key, the auto-incrementation on INSERT fails. Instead of incrementing the value by 1, sometimes Derby increments the primary key by 100 or 1000 or another random value. I can't reproduce it, because it is random. For example, I have a table named 'INVOICE', I have inserted rows, and I got the following auto-generated keys:

1. INSERT: auto generated primary key: 806
2. INSERT: auto generated primary key: 807
3. INSERT: auto generated primary key: 904
4. INSERT: auto generated primary key: 1004
5. INSERT: auto generated primary key: 1005

It should be incremented by 1. The expected sequence should be:

1. INSERT: auto generated primary key: 806
2. INSERT: auto generated primary key: 807
3. INSERT: auto generated primary key: 808
4. INSERT: auto generated primary key: 809
5. INSERT: auto generated primary key: 810

Ca. 188 companies are using my Derby based software and I don't know what to do with this random error. And I know nobody who could help me. Has anybody met with this strange error? Do you have any suggestions on how to start to debug it? I can't reproduce it.

Best regards,
Zsolt Pocze
Re: Blob column behaviour, when we dont have data when having less data
On 12/4/2014 9:57 PM, kosurusekhar wrote:

Thanks Mike for the quick reply.

2) whether derby will occupy the complete 6MB space if I am trying to insert small-size files like 512KB or 1MB?

*space used will be that of the size of the inserted column plus some metadata overhead/page overhead.*

This means that in a 6MB column, if I save a 512KB content file, then it will occupy 6MB + metadata size. Right?

No, it will be 512KB plus a relatively small overhead.

Is there any provision to save space in this kind of scenario?

Thanks
Sekhar.
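To illustrate the point (a sketch; the DOCS table is invented): a 512 KB file stored in a BLOB(6M) column occupies roughly 512 KB plus small per-page overhead, not the declared 6 MB maximum.

import java.io.*;
import java.sql.*;

public class InsertBlob {
    static void insert(Connection c, File file) throws Exception {
        // assumes: CREATE TABLE DOCS (DATA BLOB(6M))
        try (PreparedStatement ps =
                 c.prepareStatement("INSERT INTO DOCS (DATA) VALUES (?)");
             InputStream in = new FileInputStream(file)) {
            // stream the file in; only its actual length is stored
            ps.setBinaryStream(1, in, (int) file.length());
            ps.executeUpdate();
        }
    }
}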
Re: JSR310 and JDBC 4.2
Please repost the problem code; for some reason it did not show up in your email.

On 10/26/2014 2:13 PM, gquintana wrote:

From the documentation, Derby is supposed to implement JDBC 4.2, which introduces JSR 310 (Date and Time API) support into JDBC (among other things). Reading http://openjdk.java.net/jeps/170, I would have thought the following code would work, but it doesn't:

[the code did not come through]

What's the supposed way of reading/writing JSR 310 types from/into a JDBC database?
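For reference, the pattern JEP 170 describes looks like the following. This is a sketch only - whether a given Derby version accepts these conversions is exactly what this thread is asking, and the table and column names are invented:

import java.sql.*;
import java.time.LocalDateTime;

public class Jsr310ReadWrite {
    static void roundTrip(Connection c) throws SQLException {
        try (PreparedStatement ps =
                 c.prepareStatement("INSERT INTO T (TS) VALUES (?)")) {
            // JDBC 4.2 style: pass a java.time value straight through setObject
            ps.setObject(1, LocalDateTime.now());
            ps.executeUpdate();
        }
        try (Statement s = c.createStatement();
             ResultSet rs = s.executeQuery("SELECT TS FROM T")) {
            while (rs.next()) {
                // ...and read it back with the typed getObject overload
                LocalDateTime ts = rs.getObject(1, LocalDateTime.class);
                System.out.println(ts);
            }
        }
    }
}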
Re: Multiple open/close of a derby database
Or run all applications in the same JVM, i.e. the 3 applications in an application server.

On 9/25/2014 7:09 PM, George Sexton wrote:

You're trying to run derby in embedded mode, which is exclusive to that JVM. You need to start an instance of derby in server mode and have the applications connect to it.

Sent from my iPhone

On Sep 25, 2014, at 18:14, Bob M rgmatth...@orcon.net.nz wrote:

Hi, I have an application which opens a derby database and then does some things with table A. I have now cloned this application to give me three applications which are 98% identical:

App 1 opens the database and actions table 1
App 2 opens the same database and actions table 2
App 3 opens the same database and actions table 3

I wish to run the three apps simultaneously, and here I run into the problem "another instance of Derby may have already booted the database". If I wish to retain the current setup of just one database with various tables, what changes do I need to make to the current coding?

Bob M
Dunedin, New Zealand
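A minimal sketch of the client/server alternative (host, port and database name are placeholders): one Network Server owns the database, and all three applications connect through the client driver instead of embedding the engine.

import java.sql.*;

public class SharedDbClient {
    public static void main(String[] args) throws SQLException {
        // Needs derbyclient.jar on the classpath and a running server, e.g.:
        //   java -jar derbyrun.jar server start
        // All three apps can use this URL concurrently.
        try (Connection c = DriverManager.getConnection(
                 "jdbc:derby://localhost:1527/mydb")) {
            System.out.println("connected: " + !c.isClosed());
        }
    }
}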
Re: Unexpected behavior for concurrent selection of an uncommitted record inserted in a different thread
If possible, you should post a reproducible case directly against derby. Derby does use locking for its concurrency control, which will have different behavior than versioning, if those other products use that. The expected behavior for read committed for what you describe would be for derby to use the index to get directly to key 123 and never see key 122, so it would not lock it. It seems like either T1 is locking more than is described, T2 is not using a plan that uses an index, or something else is going on.

I tried what you describe in the simplest case, and derby acts as expected for a table with only 1 row in it. So there is something else going on that is not in your description; for instance, maybe whatever else T1 is doing after the insert is the issue. The second connection uses the index and returns 0 rows when looking for a key not in the table, and blocks only when looking for the key belonging to the uncommitted insert:

ij> create table foo (a int not null primary key);
0 rows inserted/updated/deleted
ij> autocommit off;
ij> insert into foo values (1);
1 row inserted/updated/deleted
ij(CONNECTION1)> connect 'jdbc:derby:_db/wombat' as local2;
ij(LOCAL2)> select * from foo where a = 2;
A
-----------

0 rows selected
ij(LOCAL2)> select * from foo where a = 1;
A
-----------
[hangs on the lock held by the first, uncommitted transaction]

Sometimes this problem is with the query plan chosen by the select in T2. If it does not choose to use the index for some reason, then you will do a table scan and block.

Here is a start at info on debugging locking behavior in derby: http://wiki.apache.org/db-derby/LockDebugging

On 2/7/2014 9:59 AM, Jacopo Cappellato wrote:

Hi all! While I was writing some unit tests for the Apache OFBiz project (which by default runs on Derby) I noticed a behavior of Derby that I didn't expect, and I would love to get your opinion. Here is my use case:

* Derby 10.10.1.1
* there are two concurrent transactions T1 and T2
* isolation level is Read Committed
* in transaction T1 a record with primary key 123 is inserted in a table; then other long running tasks are executed (i.e. the transaction is not immediately committed)
* in the meantime T2 attempts to select from the same table the record with primary key 123

Behavior: T2 blocks on the select statement waiting for transaction T1 to release the write lock; this can cause a lock wait timeout.

Expected behavior: since T1 is not committed, T2 should not be able to select the record. I was expecting that the select statement in T2 would return an empty result set rather than blocking waiting for the lock held by T1 to be released; in fact this is what we get with MySQL and Postgres.

What do you think?

Thanks,
Jacopo Cappellato
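The linked LockDebugging page centers on a couple of engine properties; a minimal way to switch them on before booting the engine (they can equally go in derby.properties; the database path is the one from the ij session above):

public class LockDebugSetup {
    public static void main(String[] args) throws Exception {
        System.setProperty("derby.locks.monitor", "true");        // log lock timeouts
        System.setProperty("derby.locks.deadlockTrace", "true");  // also dump the lock table
        java.sql.DriverManager.getConnection("jdbc:derby:_db/wombat").close();
    }
}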
Re: Issue with large delete in derby
On 1/29/2014 10:46 PM, Amitava Kundu1 wrote:

Hi, we are using embedded derby 10.5.1.1 in our product. This derby database is used as a regular RDBMS where a lot of inserts, deletes and selects happen. There are business entities, each occurrence of which could be of size 10 GB and upward, e.g. huge log file data. In our application we use cascade delete and also have referential integrity constraints ON. The application runs on 64-bit Linux with 8 GB RAM allocated to the JVM; similar times are observed on our development Windows box. It takes more than 3 hours to delete those entities. During this time all the relevant tables stay locked and no other operation is feasible. We'd like to know what options/strategies could be adopted for:

o Speeding up the delete process
o The ability to do other database activities in parallel

Thanks
Amitava Kundu

Here is a link about diagnosing performance issues that may help: http://wiki.apache.org/db-derby/PerformanceDiagnosisTips
Re: Setting derby.storage.pageSize
Check out the following to see the page size of a table and its indexes: http://db.apache.org/derby/docs/10.10/ref/rrefsyscsdiagspacetable.html (see the sketch after this message)

On 1/9/2014 8:40 PM, Melissa Mifsud wrote:

Version: 10.10.1.1 (Embedded)

I have been trying to set derby.storage.pageSize on an index according to http://db.apache.org/derby/docs/10.10/ref/rrefproper40688.html in the hope that increasing the pageSize will help with a large insert/update. The value doesn't seem to be changing. The relevant part of my script is:

call SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.storage.pageSize', '16384');
-- CREATE TABLE DataSource...
CREATE INDEX dataSource_path_idx ON DataSource(connectorId, path);
select p, syscs_util.syscs_get_database_property(p)
  from (values 'derby.storage.pageCacheSize', 'derby.storage.pageSize') props(p);
call SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.storage.pageSize', '4096');
select p, syscs_util.syscs_get_database_property(p)
  from (values 'derby.storage.pageCacheSize', 'derby.storage.pageSize') props(p);

Both selects give the same output:

P                           |2
--------------------------------
derby.storage.pageCacheSize |4000
derby.storage.pageSize      |NULL

So I have (earlier in the script) changed the pageCacheSize property using SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY, but the pageSize does not change. Running

select pagesize from table (syscs_diag.space_table('APP', 'DataSource')) s;

returns no rows.

1. Is there a way to check what the pageSize of an index is?
2. Is there some other way I should be setting the pageSize property?

Thanks,
Melissa
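A sketch of question 1 answered with the SPACE_TABLE diagnostic from the link; it reports the page size of the table and of each index on it. One detail worth noting: unquoted identifiers are stored upper-cased, so a table created as DataSource must be looked up as 'DATASOURCE' (a lower-cased argument returning no rows is consistent with the symptom above).

import java.sql.*;

public class ShowPageSizes {
    static void show(Connection c) throws SQLException {
        try (PreparedStatement ps = c.prepareStatement(
                 "SELECT CONGLOMERATENAME, ISINDEX, PAGESIZE "
               + "FROM TABLE (SYSCS_DIAG.SPACE_TABLE('APP', 'DATASOURCE')) T");
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                System.out.println(rs.getString(1)
                                 + " isindex=" + rs.getInt(2)
                                 + " pagesize=" + rs.getInt(3));
            }
        }
    }
}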
Re: Upgrade from 10.8.2.3 to 10.10.1.3
I also don't know much about replication, so I apologize if the following does not make sense. Have you tried decoupling the replication, doing the upgrade, and then re-establishing the replication, making sure the latest software is on all systems?

Also, for a problem report it may help to state what "upgrade" means to you. For instance, one of the following:

o I just updated jars everywhere
o I want to run in soft upgrade mode everywhere after updating jars
o I want to hard upgrade the master

Not sure if it will help with not being able to upgrade in general, but it may help in understanding the problem. And as Kathey mentioned, filing a JIRA with a reproducible test case is likely the best way to get help. Unfortunately the community members that implemented replication are not active, and the expertise on the feature is limited in the community.

/mikem

On 11/5/2013 9:32 AM, Oskar Zinger wrote:

Hi Kathey, nope, that does not seem to help.

Thanks,
Oskar

On Tuesday, November 5, 2013 11:54 AM, Katherine Marsden kmarsdende...@sbcglobal.net wrote:

On 11/5/2013 8:46 AM, Oskar Zinger wrote:

Hello, I'm in the process of upgrading 10.8.2.3 to 10.10.1.3. Once I did the upgrade, Derby replication no longer worked. It now complains about the databases not being in synch. This same code was working just fine in 10.8.2.3, but does not work in 10.10.1.3. Has anyone experienced this? This is the error I'm getting:

java.sql.SQLNonTransientConnectionException: DERBY SQL error: ERRORCODE: 4, SQLSTATE: XRE05, SQLERRMC: The log files on the master and slave are not in synch for replicated database 'ImpactDB'. The master log instant is 1:980936, whereas the slave log instant is 1:846980. This is fatal for replication - replication will be stopped.

I am not familiar with replication, but wonder if it is a requirement of upgrade to do a clean shutdown of the database before upgrading the jars, or if that would at least help you work around the problem. I couldn't find anything in the documentation, though, requiring special procedures for upgrade with replication.

Kathey
Re: no records in sys.sysstatistics
On 8/8/2013 6:19 PM, Katherine Marsden wrote:

On 8/8/2013 3:13 PM, Jasmeet Bhatia (jasmbhat) wrote:

I have an in-memory 10.7.1.1 derby. I am trying to verify whether indexes are being used in my query, but I notice that there are no statistics in the sys statistics table. Any ideas what is going on?

Jasmeet Bhatia

With 10.8, statistics update is automatic.

Do note that for the automatic update to kick in, you need to run a query on the table, and the statistics will then show up later. So if you load your db from scratch for testing and look immediately, there may be no stats initially. If you are able to control the order in your application, it is best to load the data first and then create the indexes and primary keys; the stats are then automatically created as part of creating the supporting indexes, and will be there for your initial query. Of course this does not work if your app needs to start with empty tables and the data comes in later.

With 10.7 you can create/update statistics with the SYSCS_UTIL.SYSCS_UPDATE_STATISTICS system procedure (see the sketch after this message): http://db.apache.org/derby/docs/10.7/ref/rrefupdatestatsproc.html or by compressing the tables: http://db.apache.org/derby/docs/10.7/ref/rrefaltertablecompress.html

Thanks
Kathey
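A sketch of the 10.7 route via the procedure from the first link (schema and table names are placeholders):

import java.sql.*;

public class UpdateStats {
    static void refresh(Connection c) throws SQLException {
        try (CallableStatement cs = c.prepareCall(
                 "CALL SYSCS_UTIL.SYSCS_UPDATE_STATISTICS(?, ?, ?)")) {
            cs.setString(1, "APP");
            cs.setString(2, "MYTABLE");
            cs.setNull(3, Types.VARCHAR); // null index name = all indexes on the table
            cs.execute();
        }
    }
}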
Re: Vulnerability in API documentation (javadoc) bundled with Apache Derby
Do you happen to know if 10.1 is affected or not?

On 6/21/2013 5:07 AM, Knut Anders Hatlen wrote:

Hi all,

Some of you may already have noticed that Oracle's latest security update release of Java SE included a fix for a vulnerability in the javadoc tool (CVE-2013-1571). The javadocs included in all versions of Derby from 10.2.1.6 up to 10.10.1.1 were built with versions of the javadoc tool that had this vulnerability.

If you publish javadocs from Derby (or from any other project for that matter) on a public-facing web server, we strongly recommend that you read Oracle's security advisory - http://www.oracle.com/technetwork/topics/security/javacpujun2013-1899847.html - and follow the steps to remove the vulnerability from the javadoc output.

Thanks,
Re: issue with using optimizer hints in jdbc
On 6/6/2013 8:33 AM, Martin Serrano wrote:

Hi, I am using version 10.9.1.0 with the embedded driver, Spring, and the BasicDataSource. I am trying to issue this query:

org.springframework.jdbc.BadSqlGrammarException: StatementCallback; bad SQL grammar [DELETE FROM metric_data md --derby-properties index=metric_data_ts WHERE metric_instance_id IN (SELECT metric_instance.metric_instance_id FROM projects, metrics, metric_instance WHERE projects.project_id = metrics.project_id AND metrics.metric_id = metric_instance.metric_id AND projects.online = true) AND md.timestamp <= 1369884755355]; nested exception is java.sql.SQLSyntaxErrorException: Syntax error: DERBY-PROPERTIES.

Caused by: ERROR 42X01: Syntax error: DERBY-PROPERTIES.
 at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
 at org.apache.derby.impl.sql.compile.SQLParser.checkInternalFeature(Unknown Source)
 at org.apache.derby.impl.sql.compile.SQLParser.propertyList(Unknown Source)
 at org.apache.derby.impl.sql.compile.SQLParser.deleteBody(Unknown Source)
 at org.apache.derby.impl.sql.compile.SQLParser.preparableDeleteStatement(Unknown Source)
 at org.apache.derby.impl.sql.compile.SQLParser.preparableSQLDataStatement(Unknown Source)
 at org.apache.derby.impl.sql.compile.SQLParser.StatementPart(Unknown Source)
 at org.apache.derby.impl.sql.compile.SQLParser.Statement(Unknown Source)
 at org.apache.derby.impl.sql.compile.ParserImpl.parseStatement(Unknown Source)
 at org.apache.derby.impl.sql.GenericStatement.prepMinion(Unknown Source)
 at org.apache.derby.impl.sql.GenericStatement.prepare(Unknown Source)
 at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.prepareInternalStatement(Unknown Source)

The line of code where I issue it is like this:

jdbc.update("DELETE FROM metric_data md --derby-properties index=metric_data_ts \r"
    + "WHERE metric_instance_id IN (SELECT metric_instance.metric_instance_id FROM projects, metrics, metric_instance WHERE projects.project_id = metrics.project_id AND metrics.metric_id = metric_instance.metric_id AND projects.online = true) AND md.timestamp <= 19292929292");

From the tests that derby executes, here is one example for a delete. All our tests seem to use \n rather than \r, so maybe try that:

lang/TriggerTest.java:
s.executeUpdate("CREATE TRIGGER APP.MYTRIG AFTER DELETE ON APP.TAB REFERENCING OLD_TABLE AS OLDROWS FOR EACH STATEMENT INSERT INTO APP.LOG(i,name,deltime) SELECT OLDROWS.I, NAMES.NAME, CURRENT_TIMESTAMP FROM --DERBY-PROPERTIES joinOrder=FIXED\n NAMES, OLDROWS --DERBY-PROPERTIES joinStrategy = NESTEDLOOP\n WHERE (OLDROWS.i = NAMES.ID) AND (1 = 1)");

What do I need to do to use optimizer hints in this setup?

Thanks,
-Martin
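The suggestion above, applied to the failing statement (a sketch only - the WHERE clause is trimmed down, and 'jdbc' is the JdbcTemplate from the original post): end the --derby-properties comment line with \n instead of \r.

import org.springframework.jdbc.core.JdbcTemplate;

public class DeleteWithHint {
    static int deleteOldData(JdbcTemplate jdbc, long cutoff) {
        // \n terminates the optimizer-override comment line
        return jdbc.update(
            "DELETE FROM metric_data md --DERBY-PROPERTIES index=metric_data_ts\n"
          + "WHERE md.timestamp <= ?", cutoff);
    }
}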
Re: Peculiar sorting behaviour?
Need the DDL for both the item_usage (I think you posted this in the 1st posting) and tests (I don't see this one) tables to read the query plan. I read plans bottom up, so the first thing is to see what ordering is expected from the index scan - hence I'm looking for the TESTS ddl and the TESTS_1 constraint:

Index Scan ResultSet for TESTS using constraint TESTS_1 at read uncommitted isolation level using share row locking chosen by the optimizer

On 4/2/2013 11:39 AM, John English wrote:

On 02/04/2013 20:05, Katherine Marsden wrote:

In derby.properties set:
derby.language.logStatementText=true
derby.language.logQueryPlan=true
If you can post the two plans, someone might be able to give you the optimizer overrides to force the plan with incorrect sorting.

Here it is (long!). It starts with the incorrect query; I then change the ORDER BY clause and reload the page, which then gives the correct results. There are some other queries mixed in, but I wasn't sure what would be relevant so I left it uncut.

Tue Apr 02 21:14:23 IDT 2013 Thread[qtp31568925-37,5,main] (XID = 7222148), (SESSIONID = 26), (DATABASE = ../db/db_copy), (DRDAID = null), Executing prepared statement: SELECT tests.id,tests.item,title FROM tests,item_usage WHERE username=? AND user_role=3 AND item_usage.item=tests.item ORDER BY tests.item,title :End prepared statement with 1 parameters begin parameter #1: TAMMY :end parameter

Tue Apr 02 21:14:23 IDT 2013 Thread[qtp31568925-37,5,main] (XID = 7222148), (SESSIONID = 26), SELECT tests.id,tests.item,title FROM tests,item_usage WHERE username=? AND user_role=3 AND item_usage.item=tests.item ORDER BY tests.item,title *** Scroll Insensitive ResultSet:
Number of opens = 1
Rows seen = 30
Number of reads from hash table = 30
Number of writes to hash table = 30
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
optimizer estimated row count: 19.34
optimizer estimated cost: 5186.92
Source result set:
  Project-Restrict ResultSet (7):
  Number of opens = 1
  Rows seen = 30
  Rows filtered = 0
  restriction = false
  projection = true
  constructor time (milliseconds) = 0
  open time (milliseconds) = 0
  next time (milliseconds) = 0
  close time (milliseconds) = 0
  restriction time (milliseconds) = 0
  projection time (milliseconds) = 0
  optimizer estimated row count: 19.34
  optimizer estimated cost: 5186.92
  Source result set:
    Nested Loop Join ResultSet:
    Number of opens = 1
    Rows seen from the left = 5
    Rows seen from the right = 30
    Rows filtered = 0
    Rows returned = 30
    constructor time (milliseconds) = 0
    open time (milliseconds) = 0
    next time (milliseconds) = 0
    close time (milliseconds) = 0
    optimizer estimated row count: 19.34
    optimizer estimated cost: 5186.92
    Left result set:
      Project-Restrict ResultSet (4):
      Number of opens = 1
      Rows seen = 5
      Rows filtered = 0
      restriction = true
      projection = true
      constructor time (milliseconds) = 0
      open time (milliseconds) = 0
      next time (milliseconds) = 0
      close time (milliseconds) = 0
      restriction time (milliseconds) = 0
      projection time (milliseconds) = 0
      optimizer estimated row count: 1.80
      optimizer estimated cost: 17.01
      Source result set:
        Index Row to Base Row ResultSet for ITEM_USAGE:
        Number of opens = 1
        Rows seen = 5
        Columns accessed from heap = {0, 1, 2}
        constructor time (milliseconds) = 0
        open time (milliseconds) = 0
        next time (milliseconds) = 0
        close time (milliseconds) = 0
        optimizer estimated row count: 1.80
        optimizer estimated cost: 17.01
        Index Scan ResultSet for ITEM_USAGE using constraint ITEM_USAGE_1 at read uncommitted isolation level using share row locking chosen by the optimizer
        Number of opens = 1
        Rows seen = 5
        Rows filtered = 0
        Fetch Size = 1
        constructor time (milliseconds) = 0
        open time (milliseconds) = 0
        next time (milliseconds) = 0
        close time (milliseconds) = 0
        next time in milliseconds/row = 0
        scan information:
        Bit set of columns fetched=All
        Number of columns fetched=2
        Number of deleted rows visited=0
        Number of pages visited=2
        Number of rows qualified=5
        Number of rows visited=6
        Scan type=btree
        Tree height=2
        start position:
        >= on first 1 column(s).
        Ordered null semantics on the following columns:
        stop position:
        > on first 1 column(s).
        Ordered null semantics on the following columns:
        qualifiers:
        None
Re: Derby Store Question
Files starting with 'd' are usually associated with tables, indexes, and/or constraints that have been dropped. In that case derby renames the original cXXX.dat file to dXXX.dat, and will eventually delete the dXXX.dat file later. It does this so that, if the drop is part of a transaction that aborts, it can recover from the drop and rename the original file back. By "later" I think it is the next checkpoint of the system at which the transaction that did the drop has been committed or aborted.

There are a number of other operations which internally look like a drop to the storage system. I believe they include:

o offline compress
o some set of add column calls
o some set of change column type/length calls
o not sure about drop column

What you are describing seems likely to be a bug in SYSCS_UTIL.SYSCS_BACKUP_DATABASE_NOWAIT(). It seems like this routine should either disallow the 'd' files from going away if the backup really needs them, or handle them disappearing if the backup does not actually need them. I thought it tried to make ddl type operations block during backup, but I am not sure. It could be an edge case that is not handled; if you can write a repro test case, please log a JIRA and include the test case. I wonder if the issue is a dropped file that was committed before the backup but had not yet been cleaned up, and then gets cleaned up during the backup?

On 3/11/2013 10:20 AM, Brandon L. Duncan wrote:

This is more of a general Store question, precipitated by a specific error I've come across during execution of the backup procedure. Typically, I know that the seg0 folder maintains a file per conglomerate in the format of cXXX.dat, where XXX is the hex value of the conglomerate id (thank you, Mr. Pendleton's blog). In what case would Derby expect or create a .dat file starting with 'd'? The instance of Derby in question is running fine under normal operation; however, when running a backup, the procedure fails due to a missing .dat file that happens to begin with 'd' (exception posted below). Is there any way to tell from the system tables why derby believes this file should be there, or more specifically, what derby thinks should be stored in this file? Is it another conglomerate that is perhaps corrupted? This is from 10.8.3.0.

Mon Mar 11 12:59:26 EDT 2013 Thread[DRDAConnThread_5,5,main] (XID = 4666255288), (SESSIONID = 1), (DATABASE = db), (DRDAID = .??-593347887885188381{1}), Cleanup action starting
Mon Mar 11 12:59:26 EDT 2013 Thread[DRDAConnThread_5,5,main] (XID = 4666255288), (SESSIONID = 1), (DATABASE = db), (DRDAID = .??-593347887885188381{1}), Failed Statement is: CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE_NOWAIT('./backup')
ERROR XSRS5: Error copying file (during backup) from /derby/db/seg0/D2f410.DAT to ./backup/db/seg0/D2f410.DAT.
 at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:303)
 at org.apache.derby.impl.store.raw.data.RAFContainer.privBackupContainer(RAFContainer.java:1071)
 at org.apache.derby.impl.store.raw.data.RAFContainer.run(RAFContainer.java:1650)
 at java.security.AccessController.doPrivileged(Native Method)
 at org.apache.derby.impl.store.raw.data.RAFContainer.backupContainer(RAFContainer.java:980)
 at org.apache.derby.impl.store.raw.data.BaseContainerHandle.backupContainer(BaseContainerHandle.java:1031)
 at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.backupDataFiles(BaseDataFileFactory.java:2551)
 at org.apache.derby.impl.store.raw.RawStore.backup(RawStore.java:944)
 at org.apache.derby.impl.store.raw.RawStore.backup(RawStore.java:653)
 at org.apache.derby.impl.store.access.RAMAccessManager.backup(RAMAccessManager.java:964)
 at org.apache.derby.impl.db.BasicDatabase.backup(BasicDatabase.java:419)
 at org.apache.derby.catalog.SystemProcedures.SYSCS_BACKUP_DATABASE_NOWAIT(SystemProcedures.java:875)
 at org.apache.derby.exe.acf81e0010x013dx5a5fxee17x105f54e52.g0(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.derby.impl.services.reflect.ReflectMethod.invoke(ReflectMethod.java:46)
 at org.apache.derby.impl.sql.execute.CallStatementResultSet.open(CallStatementResultSet.java:75)
 at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(GenericPreparedStatement.java:438)
 at org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:319)
 at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1242)
 at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(EmbedPreparedStatement.java:1692)
 at org.apache.derby.impl.jdbc.EmbedCallableStatement.executeStatement(EmbedCallableStatement.java:117)
 at
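On the "tell from the system tables" question, one thing to try is looking up the conglomerate number behind the file name in SYS.SYSCONGLOMERATES. A sketch - the exact mapping between the hex digits in the file name and CONGLOMERATENUMBER is an assumption here; the blog post mentioned above has the details:

import java.sql.*;

public class FindConglomerate {
    static void find(Connection c, String hexFromFileName) throws SQLException {
        // e.g. "2f410" taken from D2f410.DAT; adjust if the encoding differs
        long id = Long.parseLong(hexFromFileName, 16);
        try (PreparedStatement ps = c.prepareStatement(
                 "SELECT CONGLOMERATENAME, ISINDEX FROM SYS.SYSCONGLOMERATES "
               + "WHERE CONGLOMERATENUMBER = ?")) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    System.out.println(rs.getString(1) + " isindex=" + rs.getBoolean(2));
                } else {
                    System.out.println("no conglomerate with number " + id
                        + " - possibly a dropped conglomerate awaiting cleanup");
                }
            }
        }
    }
}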
Re: NPE when compressing large table
I agree you should log a JIRA; then all the information and suggestions can go there. Having a line number would help a lot in pinning down the exact cause of the null pointer. Historically SYSCS_COMPRESS_TABLE has been the more stable of the 2 compression techniques, and it is the one I most recommend.

The lowest part of the stack looks kind of weird, but it is hard to know exactly what is going on, as part of the stack is missing. It sort of looks like it is failing while setting up a connection rather than in your code. Can you tell from the line numbers where in your code it is failing? Also include the full derby.log; sometimes that can give more information.

I can't tell from your program fragment: is the connection's autocommit on or off? If it is on, there might be a problem with your resultset (see the sketch after this message).

On 9/4/2012 7:39 PM, Brett Wooldridge wrote:

This looks like a bug to me; I recommend you open one. You might also try SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE. If you can copy the DB to your dev environment, reproduce the issue, and run a derby 10.8.2.2 built with line numbers, it would probably help the devs pinpoint the issue.

Brett

On Wed, Sep 5, 2012 at 5:11 AM, Erick Lichtas elich...@lichtasweb.com wrote:

Hi Everyone,

I am having an issue with the SYSCS_UTIL.SYSCS_COMPRESS_TABLE operation on a large table in Derby 10.8.2.2.

Statement stmt = null;
CallableStatement cs = null;
try {
    String sql = "select schemaname, tablename from sys.sysschemas s, "
        + "sys.systables t where s.schemaid=t.schemaid and t.tabletype='T'";
    stmt = con.createStatement();
    ResultSet rs = stmt.executeQuery(sql);
    cs = con.prepareCall("call SYSCS_UTIL.SYSCS_COMPRESS_TABLE(?, ?, ?)");
    while (rs.next()) {
        String schema = rs.getString(1).trim();
        String table = rs.getString(2).trim();
        log("Compressing " + schema + "." + table + "...");
        cs.setString(1, schema);
        cs.setString(2, table);
        cs.setShort(3, (short) 1);
        cs.execute();
    }
} finally {
    if (stmt != null) { stmt.close(); }
    if (cs != null) { cs.close(); }
}

The above code successfully runs over 3 of the 20-some tables, then hits the 4th table, which has over 2 million records. After working on this table for about 5 minutes, the process terminates with an SQLException wrapping an NPE. This happened in a production instance and I have not yet set up a reproduction in a development environment. I'm wondering if anyone has any thoughts based on the stacktrace below?

com.linoma.gaservices.upgrader.UpgradeException: Java exception: ': java.lang.NullPointerException'.
 at com.linoma.gaservices.upgrader.DefaultUpgrader.upgrade(DefaultUpgrader.java:227)
 at com.linoma.gaservices.upgrader.UnixUpgrader.upgrade(UnixUpgrader.java:32)
 at com.linoma.gaservices.upgrader.UpgradeStarter.startUpgrade(UpgradeStarter.java:25)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:79)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:618)
 at my.app.upgrader.Startup.main(Startup.java:72)
Caused by: java.sql.SQLException: Java exception: ': java.lang.NullPointerException'.
 at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
 at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
 at org.apache.derby.impl.jdbc.Util.javaException(Unknown Source)
 at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
 at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
 at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
 at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
 at org.apache.derby.jdbc.Driver30.getNewEmbedConnection(Unknown Source)
 at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
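One way to sidestep the ResultSet/autocommit interaction raised above (a sketch of a restructuring, not a confirmed fix for the NPE): read all the table names into a list first, close the result set, then run the compress calls.

import java.sql.*;
import java.util.*;

public class CompressAllTables {
    static void compressAll(Connection con) throws SQLException {
        // Collect the names first so no result set is open while compressing.
        List<String[]> tables = new ArrayList<>();
        try (Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery(
                 "select schemaname, tablename from sys.sysschemas s, sys.systables t "
               + "where s.schemaid = t.schemaid and t.tabletype = 'T'")) {
            while (rs.next()) {
                tables.add(new String[] { rs.getString(1).trim(), rs.getString(2).trim() });
            }
        }
        try (CallableStatement cs =
                 con.prepareCall("call SYSCS_UTIL.SYSCS_COMPRESS_TABLE(?, ?, ?)")) {
            for (String[] t : tables) {
                cs.setString(1, t[0]);
                cs.setString(2, t[1]);
                cs.setShort(3, (short) 1);
                cs.execute();
            }
        }
    }
}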
Re: Cache Hit/Miss rates for Derby pageCache
On 5/17/2012 10:07 AM, David Zanter wrote:

Does anyone know if there is a way to get derby statistics from a running system regarding the hit/miss rates for the Derby page cache? i.e., statistics that would help determine whether derby.storage.pageCacheSize is being effective or needs to be tweaked.

~David

Cloudscape used to have this functionality, but I don't think it was exposed in Derby, as it depended on non-standard VTIs. Does anyone know if the JMX work provided this functionality?
Re: CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE
Matthew McCawley wrote: Mike Matrigali wrote: You see the same kind of repeated stack in the error? This loop looks strange to me, and I don't think it should be related to the size of the tables: at org.apache.derby.iapi.store.raw.xact.RawTransaction.notifyObservers(Unknown Source) at org.apache.derby.impl.store.raw.data.DropOnCommit.update(Unknown Source) at java.util.Observable.notifyObservers(Observable.java:142)

I see exactly this loop of repeated errors. http://old.nabble.com/file/p33357142/derby.log derby.log

Mike Matrigali wrote: There are some reported problems with the amount of memory in general that compress table uses, which are likely to be a different issue. For these memory issues it is helpful to post exactly what JVM you are using, what OS, and what flags you are giving the JVM. And how much memory is on your machine.

I'm running on Windows 7 64bit with 8 GB of RAM and am not passing any arguments to the JVM. It fails unless I pass in -Xss2048k (I think the default is 512k).

I have reported DERBY-5624 to track this issue. I think I understand the problem, but would feel much better with a reproducible test case I could run. Feel free to add your information to DERBY-5624.
Re: CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE
Matthew McCawley wrote: Mike Matrigali wrote: I have reported DERBY-5624 to track this issue. I think I understand the problem, but would feel much better with a reproducible test case I could run. Feel free to add your information to DERBY-5624.

In our case, we just have a single table with about 5 million rows of essentially junk data. I delete some portion of the data that's older than some margin (half, a single day's worth, etc.) and run compression. I also tried another table that's about twice as big, but it required an 8 MB stack size. I've run out of heap space a few times as well, but I'm still working on reproducing it.

At this point, could you move the discussion to the JIRA issue (just add comments; I don't think you have to be authorized to do so): https://issues.apache.org/jira/browse/DERBY-5624
Re: CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE
Matthew McCawley wrote: I've run into the same issue as Adriano when running on a single, large table about 1.4 GB in size. I enable autocommit before the compress statement and disable it after. I have encountered the error when deleting portions of the data as well as all of it. I also found that the compression would succeed if I used a stack size of 2 MB and a maximum heap size of 1 GB (-Xss2048k -Xmx1g). I'll be working with this more next week, so I'll see if anything changes when working with a larger dataset.

You see the same kind of repeated stack in the error? This loop looks strange to me, and I don't think it should be related to the size of the tables:

at org.apache.derby.iapi.store.raw.xact.RawTransaction.notifyObservers(Unknown Source)
at org.apache.derby.impl.store.raw.data.DropOnCommit.update(Unknown Source)
at java.util.Observable.notifyObservers(Observable.java:142)
at org.apache.derby.iapi.store.raw.xact.RawTransaction.notifyObservers(Unknown Source)
at org.apache.derby.impl.store.raw.data.DropOnCommit.update(Unknown Source)
at java.util.Observable.notifyObservers(Observable.java:142)
at org.apache.derby.iapi.store.raw.xact.RawTransaction.notifyObservers(Unknown Source)
at org.apache.derby.impl.store.raw.data.DropOnCommit.update(Unknown Source)
at java.util.Observable.notifyObservers(Observable.java:142)
at org.apache.derby.iapi.store.raw.xact.RawTransaction.notifyObservers(Unknown Source)
at org.apache.derby.impl.store.raw.data.DropOnCommit.update(Unknown Source)
at java.util.Observable.notifyObservers(Observable.java:142)
at org.apache.derby.iapi.store.raw.xact.RawTransaction.notifyObservers(Unknown Source)

There are some reported problems with the amount of memory in general that compress table uses, which are likely to be a different issue. For these memory issues it is helpful to post exactly what JVM you are using, what OS, and what flags you are giving the JVM (an example invocation follows after this message). And how much memory is on your machine. Derby was not originally created with VLDB (very large databases) in mind, so multi-gigabyte tables could very well be exercising new code paths. Derby definitely has the ability to perform index creations/sorts on tables bigger than memory size, but there are some reported problems in its estimates of how much memory it should use to do so. These estimates can definitely be affected by JVM startup flags.
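For reference, a hedged example of the kind of JVM invocation being discussed, using the CompressAll application from the DatabaseConsistencyCheck wiki page mentioned elsewhere in this thread (the classpath entries are illustrative):

java -Xss2048k -Xmx1g -cp derby.jar:. CompressAll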
Re: CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE
Adriano Oliveira wrote: Hi, Could you please help me with compress? Why do I always get a StackOverflowError exception when I try to run CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE()? I have about 8 tables (consuming 3 GB of disk space) and compress never succeeds in compressing all of them. I'm using the java application CompressAll listed in this wiki http://wiki.apache.org/db-derby/DatabaseConsistencyCheck Thanks, --Adriano Java exception: ': java.lang.StackOverflowError'. Caused by: StackOverflowError java.lang.RuntimeException: Java exception: ': java.lang.StackOverflowError'. at ... removed stack

Could you post a full copy of the derby.log containing the error? Sometimes that log has more information, or previous info in it is useful. Best case would be to log a JIRA as Bryan suggests and put all info into it.

As a workaround you might try doing one table at a time and committing, then going on to the next table. It should be easy to alter the example java code (see the sketch after this message); let us know if you need help with that. This may also reduce the total amount of disk space needed for the operation. This operation basically creates a new table and indexes, and it can not remove the files associated with the old tables and indexes until a commit happens.

Also, is it at all possible that you are doing other work in the same transaction? The DropOnCommit calls are going to come 1 for each associated object that has a file in derby. This will include 1 for each table, index, constraint, and foreign key. You say you have about 8 tables; do you have an unusual number of objects associated with these tables?

Without a repro it is hard to say what is going on. If I had the db, the first thing I would try is to see if the problem still exists when there is no data in the tables, to see if the problem is related to the size of the tables or to the DDL of the tables. My first question would be whether the drop on commit calls are just too many, or if there is a buggy loop somewhere.
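A minimal sketch of the one-table-per-transaction workaround, assuming an open embedded connection con with autocommit off (error handling and logging omitted; this is an illustration, not the wiki code):

import java.sql.*;
import java.util.*;

// Collect the table names first, then compress one table per transaction,
// so the old conglomerate files can be removed at each commit.
List<String[]> tables = new ArrayList<String[]>();
Statement s = con.createStatement();
ResultSet rs = s.executeQuery(
    "select schemaname, tablename from sys.sysschemas s, sys.systables t"
    + " where s.schemaid = t.schemaid and t.tabletype = 'T'");
while (rs.next()) {
    tables.add(new String[] { rs.getString(1).trim(), rs.getString(2).trim() });
}
rs.close();
s.close();
con.commit(); // end the catalog scan before compressing

CallableStatement cs =
    con.prepareCall("call SYSCS_UTIL.SYSCS_COMPRESS_TABLE(?, ?, ?)");
for (String[] t : tables) {
    cs.setString(1, t[0]);
    cs.setString(2, t[1]);
    cs.setShort(3, (short) 1);
    cs.execute();
    con.commit(); // release the old table/index files after each table
}
cs.close();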
Re: SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE question
I am thinking about this issue, thank you for reporting.

Sundar Narayanaswamy wrote: I have posted the issue to DERBY-5487. I have also attached the Java test program. The test rows do insert at one end of the primary key and delete at the other end. Interestingly, I noticed that primary key space is reclaimed if I reuse the primary keys across the insert-delete loops. But, my application requires me to use continuously increasing primary keys (not reuse them).

This is not surprising; it confirms that in general the reclaim-space-at-split-time approach works well for uniform index distributions. Your application is the 2nd worst case for Derby. I don't know if we can fix it at the row level, but there may be some extra work we can do to try to get table level locks and do page merging more often and/or in inplace compress. For your specific application, would it work if inplace compress got table level locks during the purge phase? The worst case for Derby would be a data distribution of an index which resulted in one row on each leaf. There is no support for merging non-empty leaf pages other than full offline compress. Anyone know if this case is handled in other databases?

Mike Matrigali wrote: Posting your test to a JIRA issue would be best. It would be interesting to post the space table results after each insert/delete/compress iteration (or every 10, ...). When do you commit (every row or every 1)? Is it multi-threaded? Does your test always insert rows at one end of the index and delete them from the other end? If so it may be DERBY-5473 (a runtime issue, not a compress table issue).
Re: SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE question
Posting your test to a JIRA issue would be best. It would be interesting to post the space table results after each insert/delete/compress iteration (or every 10, ...). When do you commit (every row or every 1)? Is it multi-threaded? Does your test always insert rows at one end of the index and delete them from the other end? If so it may be DERBY-5473 (a runtime issue, not a compress table issue).

Inplace compress as currently implemented does not do much for indexes. Originally the problem it was solving was row level space reclamation in base tables; indexes were not an issue. Lately multiple reports of index space issues have been coming up, so more work would be profitable in this area. Currently its only use is to give back space to the OS if possible, and that depends on the highest page number of a used page. Here is a discussion of the 3 phases:

1) It does not do purging of index rows. The original reason for this is that there is no space savings from purging alone in indexes; half-filled pages can't be used as they are in base tables. Each leaf page owns a range of keys, and when a key is inserted in this range and no space is found on the page, it automatically at that point purges the deleted rows before splitting. Given DERBY-5473 it may make sense to implement purging and merging to handle cases where we miss reclaiming fully empty pages.

2) It does not do defragmenting of indexes. Especially without a table level lock this is very complicated. Moving rows is not possible, as they can only go where they are. It could be possible to move pages, but then all pointers would also have to be fixed up. Currently only code exists to do splits under row level locking; this operation would be much more complicated than a split. The process is even more complicated in that one can not look at a page in an index and get all the necessary pointers that need to be updated from that page (there are no parent pointers on children).

3) It does try to truncate the file of an index if possible, but it does no work to manipulate the pages such that the last used page in the file is moved to earlier in the file.

Sundar Narayanaswamy wrote: Hi, I am trying to use the Derby database in embedded mode in an application. I wrote a test program where I have a table (named LOCATION) with 4 columns as below:

create table location(id int, num int, addr varchar(40), zip int, primary key(id, zip))
create index loc_index on location (num)

I insert rows into the table, then delete all of those rows. I then call SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE with just the option to purge rows so that the space left behind by deleted rows can be reused for future inserts. I have the inserts, deletes and compress in a loop. After running through the loop a few times, I am noticing that the number of allocated and unfilled pages for the primary key keeps growing (causing the database size to grow over time). The longer I run the loop, the larger these numbers are. That does not happen for the table or the index files though:

CONGLOMERATE        NumAllocatedPages NumFreePages NumUnFilledPages
LOCATION            1 831 0
SQL111027234806120  1342294 594
LOC_INDEX           1 521 1

The primary key space continues to grow even when I include the options to defragment_rows and truncate_end in the above function.

CONGLOMERATE        NumAllocatedPages NumFreePages NumUnFilledPages
LOCATION            1 0 0
SQL111027233119770  167447 704
LOC_INDEX           13357 3

The SYSCS_UTIL.SYSCS_COMPRESS_TABLE function works well and leaves no free/unfilled pages as expected.
However, I am concerned with the efficiency (time taken) of using the compress function, since there could be tens of millions of rows in a production setup. It seems that the SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE function with just the purge_rows option would work best for my situation. Is this function not expected to free up deleted space in primary key files? Thanks very much for your help, Sundar.
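For anyone wanting to reproduce the space figures quoted above, they come from the space diagnostic table; a query along these lines should work in recent 10.x releases (schema and table names follow the example above):

SELECT CONGLOMERATENAME, ISINDEX, NUMALLOCATEDPAGES, NUMFREEPAGES, NUMUNFILLEDPAGES
FROM TABLE(SYSCS_DIAG.SPACE_TABLE('APP', 'LOCATION')) T;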
Re: Database size larger than expected
There are a number of factors:

1) An empty derby database takes up 770k in 10.9. This is 71 system catalogs and indexes on those system catalogs. Even empty tables still have initial empty pages allocated and written; by default they take at least 8k each. Each file in seg0 represents either a table or an index. I assume you meant you have a Derby database with 2 tables, so only 2 of the files in seg0 have data for your tables. They are likely the 2 biggest files in seg0.

2) By default the system also reserves empty space in pages for future updates. I think the default is 20% reserved. There are properties to override this (an example follows after this message).

3) Each record has overhead, and each field has overhead that is variable.

4) The tables will begin allocating 8 pages at a time when they need a new empty page. So the worst case could be 7 empty pages at the end.

anthonyri wrote: David, No indexes I am afraid, should have mentioned that, sorry. Anthony

David Zanter wrote: Do you have Indexes on those Tables?

On Fri, Oct 21, 2011 at 5:29 PM, anthonyri anthony.r...@inbox.com wrote: Hi, I have a Derby database with two tables, the first has 5000 records (of int, double, double, double, double - so approx 34 bytes per record), the second has 1000 records (of int, varchar(10), varchar(40) - so approx 102 bytes per record). 5000 * 34 = 166k 1000 * 102 = 100k Total = 266k (if it was in a text file) What I don't understand is why the database (seg0 folder) is 1.2MB? I have compacted the database using:

call SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'mapObject', 1, 1, 1);
call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP', 'MAPOBJECT', 1);

Any thoughts please? Anthony
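As a hedged example for point 2, the page reserve can be tuned with a database property; note that this affects conglomerates created after the property is set, so it may need to be combined with a compress of existing tables:

CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.storage.pageReservedSpace', '0');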
Re: derby 10.5.1 - embedded - crash recovery and logging
Knut Anders Hatlen wrote: Paul French paul.fre...@kirona.com writes: Hello All, we have an issue where derby takes a long time to start up, anything from minutes to hours. We are running derby on J9 on Windows Mobile 6.5. It seems to be due to the fact that devices are being powered off, so derby is not shut down cleanly. Question is, what is derby doing to recover? Why so long? Even when we start the application cleanly, do nothing, power off the device and then on again, it can take a very long time to make the first database connection via the embedded driver.

Hi Paul, During recovery, Derby replays all write operations performed after the last checkpoint and finally undoes all the operations performed by transactions that hadn't been committed at the time the database was taken down. If the database is shut down cleanly, a checkpoint will be invoked, so that there should be nothing to do in the recovery phase on the next start-up. If shutting down the database cleanly isn't an option, one way to reduce the recovery time is to increase the checkpoint frequency. Then there should be fewer operations to replay on the next start-up. This can be achieved by setting the (undocumented) database property derby.storage.checkpointInterval, which specifies the maximum number of bytes of transaction log before a checkpoint is triggered (an example follows after this message). The default is 10MB. The minimum value accepted is 10 bytes.

However, you say that you see that recovery takes a very long time even if no operations were performed the last time the database was running, in which case there should be no operations to replay. Does the database's log directory contain many and/or big files? The files in that directory contain the transaction logs that need to be replayed on recovery. If there are many files there, it might be that you are running with log archive mode enabled (typically used when taking backups), which prevents old log from being deleted. You can check if that's the case by looking at the service.properties file in the database directory and seeing if it contains derby.storage.logArchiveMode=true.

To be clear, running with logArchiveMode=true should not increase your recovery time. There will be many log files, but the system knows where the last checkpoint was and only replays the necessary files. So the amount of work done should be the same whether logArchiveMode=true or not. The point is that with logArchiveMode=true it is not as easy to tell, just from looking at the files, whether there is a lot of work to do.
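A hedged example of raising the checkpoint frequency as described above, here setting the interval to 1MB (the value is in bytes of transaction log, set as a database property):

CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.storage.checkpointInterval', '1048576');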
Re: Derby secure by default
I am not sure how it applies to all of these points, but I am wondering if secure by default should be implemented on a per-database basis rather than a system level basis? It seems weird that security could change based on how the next embedded startup set a flag. What about having the property be part of what the user requests at database creation time? And maybe allow some secure way to either disable or enable it. The discussion could continue on what the default for a newly created database would be. At least for points 1-4 this seems to make more sense; I'm not sure about 5 and 6.

I personally think many of these points make most sense for derby network server, while many possibly get in the way for many zero-admin embedded applications. Since we have, for the most part, one codeline for both, it is hard to have one default.

Rick Hillegas wrote: The Derby developers are considering introducing a single master security property. Turning this property on will enable most Derby security mechanisms:

1) Authentication - Will be on, requiring username/password credentials at connection time. Derby will supply a default authentication mechanism.
2) SQL authorization - Will be on, hiding a user's data from other people. In addition, Derby will support more SQL Standard protections for Java routines.
3) File permissions - Will be tightened as described by DERBY-5363.
4) PUBLIC - This keyword will not be allowed as a user name.
5) SSL/TLS encryption - Will shield client/server traffic.
6) Server administration - Will require credentials.

When the property is off, Derby will behave as it does today: Authentication, authorization, and network encryption will be off, file permissions will inherit defaults from the account which runs the VM, PUBLIC will be a legal user name, and server administration won't need credentials. This new master property will make it easier to configure a more secure application. We want to introduce the property in an upcoming 10.x release, where it will default to being off. That means that it won't cause compatibility problems. Later on, we might change the default for this property so that it would normally be turned on. This would make Derby more secure out of the box at the cost of breaking existing applications. Many applications would need to explicitly turn the property off in order to run as they did previously. Release notes would document this behavioral change and we would bump the major release id from 10 to 11 in order to call attention to the change. We would like your feedback on this trade-off between security out of the box versus disruption. Should this extra security be enabled by default? Thanks, -Rick
Re: Derby secure by default
Rick Hillegas wrote: I'm also concerned about the embedded database on a USB stick. I could argue that it is more vulnerable than the server-side database locked up in a machine room.

Derby has one answer to this issue, and that is encrypted databases (see the connection URL sketch after this message). I don't think anything other than that is secure, because if you lose ownership of the files, anyone can use a version of the open source software to look at the data no matter what you do with authentication or authorization. I sort of doubt even file permissions would help if someone has physical control of the device and can mount it anywhere. Having said that, I don't want to also enable encryption by default for embedded derby. It is unnecessary for many applications and imposes performance degradation and application complexity.

Historically Derby was never meant to be secure. It was designed to be embedded, with the owning applications handling security. Early on, the issue of losing ownership of the files was considered, and so encryption was added. Eventually client/server was added, and even later SQL authorization, but it was not expected to be used in general for embedded applications.

While many possibly get in the way for many zero-admin embedded applications.

I'm imagining that this change may be fairly unobtrusive. For an embedded database which has only one user (the dbo), the big change is that the dbo has to specify a username and password. There won't be any need to GRANT access to other users, so (2) won't be noticed. Items (3) and (4) won't burden most applications. (5) and (6) are only issues for client/server usage.

In many cases I don't think this is true. I know users I work with who currently run an embedded derby application as a number of different users, and their current application does not use authentication or authorization. As with other files used by the application, they manage access to the db files as appropriate for their application.

Since we have one codeline for the most part for both it is hard to have one default.

I agree that a common default would be best. It will make it easier to reason about Derby's behavior and simplify our user guides. Thanks, -Rick

Rick Hillegas wrote: The Derby developers are considering introducing a single master security property. Turning this property on will enable most Derby security mechanisms: 1) Authentication - Will be on, requiring username/password credentials at connection time. Derby will supply a default authentication mechanism. 2) SQL authorization - Will be on, hiding a user's data from other people. In addition, Derby will support more SQL Standard protections for Java routines. 3) File permissions - Will be tightened as described by DERBY-5363. 4) PUBLIC - This keyword will not be allowed as a user name. 5) SSL/TLS encryption - Will shield client/server traffic. 6) Server administration - Will require credentials. When the property is off, Derby will behave as it does today: Authentication, authorization, and network encryption will be off, file permissions will inherit defaults from the account which runs the VM, PUBLIC will be a legal user name, and server administration won't need credentials. This new master property will make it easier to configure a more secure application. We want to introduce the property in an upcoming 10.x release, where it will default to being off. That means that it won't cause compatibility problems. Later on, we might change the default for this property so that it would normally be turned on.
This would make Derby more secure out of the box at the cost of breaking existing applications. Many applications would need to explicitly turn the property off in order to run as they did previously. Release notes would document this behavioral change and we would bump the major release id from 10 to 11 in order to call attention to the change. We would like your feedback on this trade-off between security out of the box versus disruption. Should this extra security be enabled by default? Thanks, -Rick
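For reference, the encrypted-database option mentioned at the top of this message is enabled through connection URL attributes at creation time; a minimal sketch (database name and password are placeholders, and the boot password must be at least eight characters):

jdbc:derby:myEncryptedDb;create=true;dataEncryption=true;bootPassword=myBootPassword

Subsequent boots of the database must supply the same bootPassword attribute.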
Re: Derby secure by default
Rick Hillegas wrote: Hi Mike, Some comments inline... On 9/19/11 10:38 AM, Mike Matrigali wrote: I am not sure how it applies to all of these points, but I am wondering if secure by default should be implemented on a per-database basis rather than a system level basis? It seems weird that security could change based on how the next embedded startup set a flag.

I think that it should behave like derby.database.sqlAuthorization: once it's been turned on it is stored in the database and you can't turn it off at the system level. I agree that it would be weird to let the next user subvert the security of your database by flipping a command line switch.

I am trying to understand what happens when we change the default and a user upgrades to 11 and starts up on their existing database that has no authentication or authorization enabled. What is the proposed soft upgrade behavior? What is the proposed hard upgrade behavior? This is a development detail, but what is proposed for the existing derby set of tests, which I have to assume are about 99% not authenticated or authorized? Would we implement a way to run them both ways? Convert them all, or most, to run under the new default?
Re: Derby secure by default
roy.mi...@comcast.net wrote: Installing a new version should always be backward compatible and not break anything in existing applications. If things don't work this way, it's bound to be (unnecessarily) disruptive to some, and especially those less sophisticated and less able to figure out and fix problems. I see no reason why anyone who wishes to utilize the new capabilities would have any problem with setting the new property when they are ready to do so. Security is obviously important, especially for networked applications. I think it's also important not to do anything that interferes with embedded applications designed to require ZERO administration.

+1 I think the new features that Rick is proposing are valuable to some set of Derby users, and I welcome their inclusion in Derby. Having one flag to enable a set of security-oriented features also seems reasonable, as long as users can still pick and choose if they don't want the complete set. I don't think they should be made the default in the current release or a future release. I believe zero-admin upgrade/backward compatibility has been a great feature for Derby so far, and we should do whatever we can to not break it going forward. It seems like applications that need this feature set can set the appropriate flag going forward and then do the work to properly configure authentication and authorization, SSL encryption passwords, and other server administration. Since there is extra work necessary to use these features, it seems reasonable to put the work of setting the flag on those applications, rather than on the zero-admin applications that do not need these features. By default Derby should be zero-admin and thus default to not requiring this extra administration.

From: Rick Hillegas rick.hille...@oracle.com To: Derby Discussion derby-user@db.apache.org Sent: Monday, September 19, 2011 12:39:07 PM Subject: Derby secure by default

The Derby developers are considering introducing a single master security property. Turning this property on will enable most Derby security mechanisms: 1) Authentication - Will be on, requiring username/password credentials at connection time. Derby will supply a default authentication mechanism. 2) SQL authorization - Will be on, hiding a user's data from other people. In addition, Derby will support more SQL Standard protections for Java routines. 3) File permissions - Will be tightened as described by DERBY-5363. 4) PUBLIC - This keyword will not be allowed as a user name. 5) SSL/TLS encryption - Will shield client/server traffic. 6) Server administration - Will require credentials. When the property is off, Derby will behave as it does today: Authentication, authorization, and network encryption will be off, file permissions will inherit defaults from the account which runs the VM, PUBLIC will be a legal user name, and server administration won't need credentials. This new master property will make it easier to configure a more secure application. We want to introduce the property in an upcoming 10.x release, where it will default to being off. That means that it won't cause compatibility problems. Later on, we might change the default for this property so that it would normally be turned on. This would make Derby more secure out of the box at the cost of breaking existing applications. Many applications would need to explicitly turn the property off in order to run as they did previously.
Release notes would document this behavioral change and we would bump the major release id from 10 to 11 in order to call attention to the change. We would like your feedback on this trade-off between security out of the box versus disruption. Should this extra security be enabled by default? Thanks, -Rick
Re: Does anyone want to run Derby 10.9 on JVM 1.4 or on CDC/FP 1.1?
I think it would be ok to drop testing/support for JDK 1.4. But I think it continues to be valuable to support running derby on small devices/set-top devices through support of the CDC/FP 1.1 platform. Derby has an architecture which allows developers to add optional features that depend on newer JDK features; this has been used in the past and continues to be supported.

Rick Hillegas wrote: The 1.4 JVM has not been supported as a free platform for some time (although I believe you can buy a support contract for 1.4 if you need to). Does anyone plan to run Derby 10.9 on this platform? Does anyone plan to run Derby 10.9 on the related small device CDC/FP 1.1 platform? For the next feature release, we are considering dropping support for one or both of these platforms. Your responses will help us make a decision. Thanks, -Rick
Re: SYSCS_UTIL.SYSCS_BACKUP_DATABASE failing
There should be no size limits on the backup database command other than those imposed by the destination disk/filesystem. Are you sure it is hanging, rather than going very slowly? Basically, is anything showing up on the destination disk? Is there anything in derby.log when you encounter the problem? Are you trying to back up the database while work is being done in the database? If so, derby does have to wait until current transactions finish before it can get started.

I would not be surprised if a backup done by derby is slower than one would expect; it has not been optimized for very large databases. It is basic. The expectation is that if this solution is not fast enough, then the user can code their own backup and properly interact with derby by using the method outlined in the documentation, using freeze and unfreeze (a sketch follows after this message): http://db.apache.org/derby/docs/10.8/adminguide/cadminhubbkup98797.html http://db.apache.org/derby/docs/10.8/adminguide/cadminhubbkup75469.html

Vijender Devakari wrote: Hi Team, we are using the below proc for backing up the database, but this call is hanging when the database size is larger, like 3GB, 4GB, ... Can you let me know in which cases we can use this, and also what else we need to use to back up the database in case the size is larger. CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE(?) Best Regards, Vijender D, Deutsche Bank Ph: +65 6423 8530 Mobile: +65 94517994
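A minimal sketch of the freeze/unfreeze pattern from those documentation pages, assuming an open connection conn (the copy step is whatever OS-level tool you prefer):

Statement s = conn.createStatement();
s.executeUpdate("CALL SYSCS_UTIL.SYSCS_FREEZE_DATABASE()");
try {
    // Copy the database directory here with an OS tool (cp -r, robocopy, ...).
    // The files should be stable for copying while the database is frozen.
} finally {
    s.executeUpdate("CALL SYSCS_UTIL.SYSCS_UNFREEZE_DATABASE()");
    s.close();
}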
Re: Indexing speed in presence of BLOB columns
Trejkaz wrote: Hi all. I have been doing some performance testing of the time it takes to create an index on a table with BLOB data. The schema:

CREATE TABLE binary ( id INTEGER NOT NULL, binary BLOB(1G) NOT NULL )

Each time, 10,000 rows are inserted with data of a given size and then an index is created at the end:

CREATE INDEX binary_id ON binary (id)

Times are an average of three runs after some warmup runs which are not counted.

1 kB blobs: Index: 567 ms Insert: 78 ms
10 kB blobs: Insert: 3954 ms Index: 515 ms
100 kB blobs: Insert: 56307 ms Index: 20591 ms
1 MB blobs: Insert: 521904 ms Index: 122527 ms

Surprisingly, the larger the data which is present in the BLOB column, the longer it takes to index. Since the indexing process shouldn't need to read the BLOB data in order to index the row, I am surprised that it slows down when more data is present. What is going on here exactly? Is Derby physically copying the BLOB data to a new location and then deleting the original copy instead of performing some kind of cheap move operation? Is there some way to avoid this (which won't require changing the schema)? Daniel

This is the expected behavior. Derby base tables are very basic, and thus the need for indexes. In order to do a scan of the entire base table to build the index, every page in the base table needs to be read from disk. Even though we don't actually read the blob data for creating the index, the system still needs to bring each page in from disk to memory to see if it has any rows. Basically the scan of the base table is a loop from page 1 to the last page, checking whether each page is a head page with rows and reading those rows. The blob data is stored in the same file as the main pages.
Re: NFS and Derby
Kathey Marsden wrote: I have always told users they have to have their databases on a local disk to ensure data integrity, and that a system crash for an NFS mounted database could cause fatal corruption, but I had a user this morning take me to task on this and ask me to explain exactly why. I gave my general response about not being able to guarantee a sync to disk over the network, but I want a more authoritative reference for why you cannot count on an NFS mounted disk, although I did find several places where the sync option "favors" data integrity, which certainly doesn't sound like a guarantee. Does anyone know a good general reference I can use on this topic to support my "you gotta use a local disk" mantra?

The problem is one of documentation and implementation of NFS. I don't think there is just one NFS out there, and there are definitely all sorts of other remote mounting options. Some of the problems that can arise, which are avoided on local disk, and thus why to be safe we have documented that we can't guarantee support, include:

1) We may not be able to prevent dual booting, and thus the db may get corrupted. All of our algorithms for preventing dual booting rely on the JVMs that are accessing the database being on the same machine. Once 2 machines can access the same file, we have no way to prevent corruption.

2) Derby depends on synchronous write behavior when requested. Basically, at certain times Derby asks the JVM to guarantee that data to a table or recovery log file has been written and forced to disk before returning. If this syncing is not correct, a number of database problems can happen, such as: 1) we tell the user a transaction was committed because we believe the log was forced, but the NFS was caching the result and crashes. Now the committed xact is not there. 2) we want to remove some recovery log, so we force data to disk, wait for it to hit disk and then delete the log file for those disk updates. But the data was actually cached and is lost, and now we have old data in the db and no log files to recover it from.

When this was first documented, I don't believe any JVM implementation on top of NFS could guarantee a completed synchronous write. It may be the case that certain remote file system implementations now can guarantee this, and it may be the case that the JVM implementations make the right calls to the NFS file system to do this - but I believe it is a support nightmare to try to support this. A quick google of NFS topics seems to indicate that there may be some versions of NFS that do support write sync. I believe this because most of the hits that I got were descriptions of how to disable the syncing to get better performance, indicating that many of the NFS installations that might support write sync actually have it disabled. I did not see any way that a java program could find out if the required syncing was being enforced.

Note that we also can not guarantee recovery on disks with the write cache enabled, which I believe many users have set. Many may not even know it, as I believe it is the default for some disk installations.

Also I think our documentation on this topic should be a bit stronger. Currently we just say it may not work, and it probably should be clearer that data corruption could occur. I will file an issue to beef up the language based on the conversation in this thread. http://db.apache.org/derby/docs/10.5/devguide/cdevdvlp40350.html Thanks Kathey
Re: NFS and Derby
And for some really ancient history (at least 10 years ago), I believe this bit of documentation actually resulted from one of the developers accidentally running the set of tests in their home directory on NFS and getting errors. So at least at that time it didn't even take a crash to make something fail across NFS vs local disk. I don't think we have done any testing on remote file systems on purpose since then.

Kathey Marsden wrote: I have always told users they have to have their databases on a local disk to ensure data integrity, and that a system crash for an NFS mounted database could cause fatal corruption, but I had a user this morning take me to task on this and ask me to explain exactly why. I gave my general response about not being able to guarantee a sync to disk over the network, but I want a more authoritative reference for why you cannot count on an NFS mounted disk, although I did find several places where the sync option "favors" data integrity, which certainly doesn't sound like a guarantee. Does anyone know a good general reference I can use on this topic to support my "you gotta use a local disk" mantra? Also I think our documentation on this topic should be a bit stronger. Currently we just say it may not work, and it probably should be clearer that data corruption could occur. I will file an issue to beef up the language based on the conversation in this thread. http://db.apache.org/derby/docs/10.5/devguide/cdevdvlp40350.html Thanks Kathey
Re: Illegal Seek during SELECT
That is strange; if the table or index was actually bad, I would expect check table to find this kind of problem. From your description it has the feel of something affecting the I/O system in memory and causing the later I/O to fail. You included a snip of the derby.log. Is there anything of interest that happened between the last boot and when you got this error? Sometimes derby gets errors it considers system level, and it basically shuts down the I/O system to prevent db corruption until the system is rebooted.

The error you got is from an index. You can rebuild all the indexes on this table by running the offline compress system procedure on the table (see the example after the stack trace below). This is a very old version of the software; if possible you might consider upgrading. 10.6 is the latest release at this point.

Brandon L. Duncan wrote: Hi all - I was wondering if anyone has any insight into the exception below. I performed a SYSCS_CHECK_TABLE on all tables after restarting Derby and all returned a 1. I also performed disk checking on the physical disk which houses the database, which was clean as well. This query has run successfully after the restart as well, and I'm looking to see what may have caused this and how to possibly prevent a similar occurrence in the future. This is version 10.2.2.0. Thanks

2010-08-14 07:46:26.047 GMT Thread[DRDAConnThread_13346,5,derby.daemons] (XID = 79141229328), (SESSIONID = 12194), (DATABASE = ../../database/xdb1), (DRDAID = NF01.P02E-953354465271340036{12196}), Cleanup action starting 2010-08-14 07:46:26.047 GMT Thread[DRDAConnThread_13346,5,derby.daemons] (XID = 79141229328), (SESSIONID = 12194), (DATABASE = ../../database/xdb1), (DRDAID = NF01.P02E-953354465271340036{12196}), Failed Statement is: SELECT * FROM EBI.BPScriptCategories WHERE ID = ? ERROR XSDG0: Page Page(1,Container(0, 1601)) could not be read from disk.
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source) at org.apache.derby.impl.store.raw.data.CachedPage.readPage(Unknown Source) at org.apache.derby.impl.store.raw.data.CachedPage.setIdentity(Unknown Source) at org.apache.derby.impl.services.cache.CachedItem.takeOnIdentity(Unknown Source) at org.apache.derby.impl.services.cache.Clock.addEntry(Unknown Source) at org.apache.derby.impl.services.cache.Clock.find(Unknown Source) at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(Unknown Source) at org.apache.derby.impl.store.raw.data.FileContainer.getPage(Unknown Source) at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPage(Unknown Source) at org.apache.derby.impl.store.access.btree.ControlRow.Get(Unknown Source) at org.apache.derby.impl.store.access.btree.ControlRow.Get(Unknown Source) at org.apache.derby.impl.store.access.btree.BTreeScan.positionAtStartForForwardScan(Unknown Source) at org.apache.derby.impl.store.access.btree.BTreeForwardScan.positionAtStartPosition(Unknown Source) at org.apache.derby.impl.store.access.btree.BTreeForwardScan.fetchRows(Unknown Source) at org.apache.derby.impl.store.access.btree.BTreeScan.fetchNextGroup(Unknown Source) at org.apache.derby.impl.sql.execute.BulkTableScanResultSet.reloadArray(Unknown Source) at org.apache.derby.impl.sql.execute.BulkTableScanResultSet.getNextRowCore(Unknown Source) at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.getNextRow(Unknown Source) at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(Unknown Source) at org.apache.derby.impl.jdbc.EmbedResultSet.next(Unknown Source) at org.apache.derby.impl.drda.DRDAConnThread.writeFDODTA(Unknown Source) at org.apache.derby.impl.drda.DRDAConnThread.writeQRYDTA(Unknown Source) at org.apache.derby.impl.drda.DRDAConnThread.processCommands(Unknown Source) at org.apache.derby.impl.drda.DRDAConnThread.run(Unknown Source) = begin nested exception, level (1) === java.io.IOException: Illegal seek at java.io.RandomAccessFile.seek(Native Method) at org.apache.derby.impl.store.raw.data.RAFContainer.readPage(Unknown Source) at org.apache.derby.impl.store.raw.data.CachedPage.readPage(Unknown Source) at org.apache.derby.impl.store.raw.data.CachedPage.setIdentity(Unknown Source) at org.apache.derby.impl.services.cache.CachedItem.takeOnIdentity(Unknown Source) at org.apache.derby.impl.services.cache.Clock.addEntry(Unknown Source) at org.apache.derby.impl.services.cache.Clock.find(Unknown Source) at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(Unknown Source) at org.apache.derby.impl.store.raw.data.FileContainer.getPage(Unknown Source) at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPage(Unknown Source) at org.apache.derby.impl.store.access.btree.ControlRow.Get(Unknown Source) at org.apache.derby.impl.store.access.btree.ControlRow.Get(Unknown Source) at
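A hedged example of the offline compress suggested above, using the schema and table from the failed statement (identifiers are stored upper-cased in the catalogs unless they were created as quoted names):

CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE('EBI', 'BPSCRIPTCATEGORIES', 1);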
Re: Fluctuating disk overhead
David Van Couvering wrote: It will probably be a lot easier to explain if you du the log directory and the seg0 directory separately each time. And for more detail, do an ls -l of the log directory. My guess is that it is due to logging and incremental log file cleanup, but I would need more data to confirm.

Sorry about that, new mouse and I did a premature send... As I was saying... Here is the table definition:

CREATE TABLE BLOBTEST(ID INTEGER PRIMARY KEY, BLOBCOL BLOB)

I insert 100 rows, and then do a loop updating the 100 rows over and over again. When I watch the disk usage of the database directory, I see this pattern repeated over and over again. I tried adding a call to compress the table after every 100 updates, but this had no impact. Can you explain what is going on and what, if anything, I can do to keep the size small? Thanks, David

$ du -s -h BLOBTEST
3.3M    BLOBTEST
$ du -s -h BLOBTEST
3.4M    BLOBTEST
$ du -s -h BLOBTEST
4.9M    BLOBTEST
$ du -s -h BLOBTEST
5.9M    BLOBTEST
$ du -s -h BLOBTEST
6.5M    BLOBTEST
$ du -s -h BLOBTEST
7.0M    BLOBTEST
$ du -s -h BLOBTEST
8.6M    BLOBTEST
$ du -s -h BLOBTEST
8.7M    BLOBTEST
$ du -s -h BLOBTEST
8.7M    BLOBTEST
$ du -s -h BLOBTEST
9.7M    BLOBTEST
$ du -s -h BLOBTEST
11M     BLOBTEST
$ du -s -h BLOBTEST
12M     BLOBTEST
$ du -s -h BLOBTEST
11M     BLOBTEST
$ du -s -h BLOBTEST
13M     BLOBTEST
$ du -s -h BLOBTEST
14M     BLOBTEST
$ du -s -h BLOBTEST
4.3M    BLOBTEST

On Mon, Aug 23, 2010 at 10:49 AM, David Van Couvering da...@vancouvering.com wrote: Hi, all. I have a test that updates 100 rows, each of which contains an int primary key and a 300-byte BLOB: -- David W. Van Couvering http://www.linkedin.com/in/davidvc http://davidvancouvering.blogspot.com http://twitter.com/dcouvering -- David W. Van Couvering http://www.linkedin.com/in/davidvc http://davidvancouvering.blogspot.com http://twitter.com/dcouvering
Re: Execution Plan Changes Dramatically between Executions
Do any of the tables in the select have very few rows? If so, a few updates/inserts could change the statistics a lot. I would suggest running an offline compress on all the tables in the query and seeing if the issue still happens. This will update all the statistics that the optimizer uses when estimating, as well as optimizing the indexes on all the tables. In the worst case, if this does not get rid of the problem, as a workaround you could use the optimizer hints to force the use of indexes to match the good plan (see the sketch after this message).

Simon wrote: Hello all, I have an application which executes the following query:

select tp.ID, tp.NAME, tp.DESCRIPTION, (select count(*) from test_plan rtp, test_plan_instance rtpi, script rs where tp.id = rtp.id and rs.TEST_PLAN_INSTANCE_ID = rtpi.ID and rtpi.TEST_PLAN_ID = rtp.ID and rs.STATUS = 'Running' ) as running_count, (select count(*) from test_plan rtp, test_plan_instance rtpi, script rs where tp.id = rtp.id and rs.TEST_PLAN_INSTANCE_ID = rtpi.ID and rtpi.TEST_PLAN_ID = rtp.ID and rs.STATUS = 'Pending' ) as pending_count from test_plan tp

Most of the time it works great: indexes get used and it is fast enough (it is a report, so it does not need to be instant). However, sometimes it runs really slowly. It goes from taking 140ms to taking 147 seconds! I tried logging the query plans and I see that on the times when it is running slowly it is using a completely different execution plan. I don't fully understand the plans, but it is very clear that certain indexes just stop getting used and it does scans instead, e.g.:

scan information: Bit set of columns fetched={0, 1} Number of columns fetched=2 Number of pages visited=726 Number of rows qualified=0 Number of rows visited=102870 Scan type=heap

Since the data set does not change in any significant way between executions (maybe a few updates, an insert or two) and the query is static, I'm really curious why it would change its query plan, especially to something that is so dramatically worse. Any hints on how I can avoid this? NB: Using Derby 10.5.3. You can see the full execution plans: http://badboysoftware.biz/test/fast.log http://badboysoftware.biz/test/slow.log Thanks for any help! Simon
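If the optimizer-hint workaround becomes necessary, Derby accepts per-table overrides as end-of-line comments in the FROM clause; a hedged sketch based on the posted query (the index name here is hypothetical, and the override must be the last thing on its line):

select count(*)
from script rs -- DERBY-PROPERTIES index=SCRIPT_STATUS_IDX
where rs.STATUS = 'Running';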
Re: Derbus interruptus == Derbus corruptus?
The db should never be corrupted by this. It is sometimes confusing, as many scary errors come back, and often the error results in all subsequent operations failing until the database is completely shut down and rebooted. It is hard to say exactly what happens, mostly because of the way interrupts work in java. Basically the interrupt is not noticed until a java call is made that checks for interrupts. The usual case in derby is that this is noticed when an I/O happens. Depending on what I/O is interrupted, different things can happen. If it is a log I/O, then it is a server level error and the system disables all future I/Os, causing those to return errors, and eventually the system will come down. A read error I think is just a statement error, and subsequent queries may work. In all cases the system should recover fine after a subsequent reboot, backing out any transaction that was in the middle of its work when the I/O failed.

David Van Couvering wrote: Hi, all. I am using Derby embedded in a server process. The server process can receive a request to shut down, in which case it does an orderly shutdown of all its services. This uses ExecutorService.shutdownNow(), which sends an interrupt to any task currently running. I am looking at changing this to using shutdown() with a timeout before reverting to shutdownNow(), but I still want to discuss with you what happens. It *appears* that as a result of receiving this interrupt, Derby receives a java.nio.channels.ClosedByInterruptException, which it then rethrows as a SQLException. That's fine, I can dig through the layers of causes to find that it's a ClosedByInterruptException. But my question to you is, when Derby gets interrupted like this, what does it do? Does it correctly roll back the current transaction? Does it ignore the interrupt until it can finish processing the transaction? I am particularly concerned that this could cause some kind of inconsistency or corruption in the database. For your edification, here is the stack trace. I am not sure if it always happens when trying to open a connection, or if it also happens in the middle of an operation; I'm going to look for more examples of the stack trace, and will send them your way. Thanks! David

com.vontu.itemcatalog.api.ItemCatalogException: java.sql.SQLException: Failed to start database 'C:\Vontu\Protect\scan\incremental_index/MONITOR_TARGET_4', see the next exception for details.
at com.vontu.itemcatalog.derby.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:71) at com.vontu.itemcatalog.derby.SQLCallable.call(SQLCallable.java:53) at com.vontu.itemcatalog.derby.SQLCallable.call(SQLCallable.java:44) at com.vontu.itemcatalog.derby.PendingDeliveriesPersister.createTablesAsNeeded(PendingDeliveriesPersister.java:47) at com.vontu.itemcatalog.derby.PendingDeliveriesPersister.init(PendingDeliveriesPersister.java:42) at com.vontu.itemcatalog.derby.PendingDeliveriesPersister.init(PendingDeliveriesPersister.java:29) at com.vontu.itemcatalog.derby.PendingDeliveriesPersisterProvider.getPersister(PendingDeliveriesPersisterProvider.java:25) at com.vontu.itemcatalog.marshall.BloomFilterBatchMarshallableProvider.getNextMarshallable(BloomFilterBatchMarshallableProvider.java:89) at com.vontu.itemcatalog.distribution.ItemSetSender.execute(ItemSetSender.java:121) at com.vontu.itemcatalog.distribution.ItemSetSender.run(ItemSetSender.java:104) at com.vontu.discover.incremental.ItemSetSenderDispatcher.run(ItemSetSenderDispatcher.java:83) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Caused by: java.sql.SQLException: Failed to start database 'C:\Vontu\Protect\scan\incremental_index/MONITOR_TARGET_4', see the next exception for details. at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source) at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source) at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown
Re: Very bad disk space leak in Derby
What are the names of the new files after the transaction that tried to create the existing table then commits? If they are still c*.dat, then there is definitely something wrong. If they start with a different letter, then it could be that derby is acting as expected and just has not gotten around to cleaning them up yet; I believe it does this at checkpoint time. The assumption being that normally this case is just drop table and does not need to be optimized.

I have not looked at the code, but my guess is that create table is counting on a unique key violation to tell whether a table exists or not. To do this it has to do an insert, and to do the insert it needs the name of the file, which only exists after the file is created. (A sketch of an existence check that avoids this path entirely follows after this message.)

David Van Couvering wrote: Yes, it does sound like a bug. I'll log a JIRA

On Fri, Apr 30, 2010 at 6:50 PM, Bryan Pendleton bpendleton.de...@gmail.com wrote: As I mentioned to Knut, I try to issue a CREATE TABLE each time I connect, and ignore the exception saying it already exists if the table is already there. If this is leaking a conglomerate each time (creating a .dat file but never deleting it), that seems like a bug to me. thanks, bryan

-- David W. Van Couvering http://www.linkedin.com/in/davidvc http://davidvancouvering.blogspot.com http://twitter.com/dcouvering
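One way to avoid relying on the CREATE TABLE failure path is to ask the catalogs first; a minimal sketch, assuming an open connection con (schema and table names are placeholders):

DatabaseMetaData dmd = con.getMetaData();
ResultSet rs = dmd.getTables(null, "APP", "MYTABLE", new String[] { "TABLE" });
boolean exists = rs.next(); // any row means the table is already there
rs.close();
if (!exists) {
    Statement s = con.createStatement();
    s.executeUpdate("CREATE TABLE APP.MYTABLE (ID INT PRIMARY KEY)");
    s.close();
}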
Re: Performance of in-place upgrade
The answer would of course depend on the level of the current db and the level of the software you are moving to. But as a general rule I would expect soft upgrades to take almost no time at all, as the purpose of a soft upgrade is to not change anything, just to allow new software to work on old dbs while still allowing the old software to work. Since all this happens at boot time, it is hard to tell the difference from the other boot time work the db might have to do, like recovery, which is much harder to estimate as it depends on what the db was doing and how it was shut down.

So far I don't think even hard upgrades take very much time, as I don't think any derby release has had a hard upgrade that required work at upgrade time on a per row/column basis. Most hard upgrade work has had to do with DDL, so it is more dependent on the number of DDL objects, like tables, indexes, statements, and the like. This does not mean it won't happen in the future, but it would be a last choice in my opinion. In general we have rather looked at supporting both new and old types rather than paying a convert cost at upgrade time. /mikem

David Van Couvering wrote: Hi, all. Yes, I'm now using Derby within the product I'm working on in my new job! :) So, I have some questions. I'll send separate emails for each one to keep the threads simpler. I could have quite a few rows in my table - perhaps up to 100 million. The table will have about 5 columns. Do you have a sense of how long it will take to do a soft upgrade of a database of this size? Is it in the 1-9 minute range, 10-20 minutes, 1 hour, multiple hours? I can test this myself, but I was wondering if anyone could give me a quick ballpark. Thanks, David -- David W. Van Couvering http://www.linkedin.com/in/davidvc http://davidvancouvering.blogspot.com http://twitter.com/dcouvering
Re: Does a database shrink?
A derby table/index will never automatically shrink. Deleted row space can be used automatically by subsequent inserts or updates to the same table or index. Space is returned to the operating system when a table or index is dropped. One needs to call the compress routines by hand to get a db to shrink (examples follow after this message).

David Van Couvering wrote: If I delete a bunch of rows from my database, does the size of the database shrink, or does it keep the space it has allocated once it grows to a certain size? Thanks! David -- David W. Van Couvering http://www.linkedin.com/in/davidvc http://davidvancouvering.blogspot.com http://twitter.com/dcouvering
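For reference, the two by-hand compress routines (schema and table names here are placeholders; the in-place variant's three flags are purge, defragment, and truncate):

CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP', 'MYTABLE', 1);
CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'MYTABLE', 1, 1, 1);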
Re: Order of Select when Identity column is defined
Iris Eiron wrote: Hello all, I have a table E defined like this:

CREATE TABLE E (
    ID VARCHAR(30) NOT NULL,
    DOMAIN VARCHAR(10) NOT NULL,
    DATA BLOB NOT NULL,
    INSERT_ORDER INTEGER NOT NULL GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1),
    LAST_UPDATE TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY ( ID , DOMAIN )
)

My question is: in what order will the rows be returned if I run the following statement on E:

SELECT DATA, INSERT_ORDER FROM E WHERE DOMAIN=? FOR UPDATE OF DATA

I cannot specify an ORDER BY clause since Derby does not support SELECT FOR UPDATE with ORDER BY. Is there a guarantee that the rows will be returned in the order of the identity column INSERT_ORDER? I was not able to find documentation supporting this.

There is no guarantee of the order of rows for any select query unless there is an ORDER BY. Even if there is an index, the plan may choose not to use that index.

The answer to this question is the key to improving the performance of a very expensive operation, which without order takes days to execute, and with order we expect to complete in a few hours. Thanks for your help! - Iris
Re: Corrupted database
Along these lines, if you don't want to hack the code, I would try the following (note I have never tried this, so I have no idea if it will work - I thought about it for a while, looked at the headers, and could not come up with a reason why it would not work). For this you will need to know the seg0 file associated with the table you are trying to recover and the seg0 file you will create for a new dummy table.

o Shut down the db you are working with cleanly with shutdown=true, make an offline copy of it, and only work on the copy.

o In another db, create a new table that has the same DDL as the original table - I will call this the dummy table. The most important part of this is that the new table has the same page size as the table you are trying to recover. If you use the same DDL and you didn't set any page size properties when you created the original table, then the page sizes should match. The size and structure of the allocation part of page 0 is different for each of the 4 supported page sizes (2k, 4k, 8k, 32k) - basically there is a fixed header and the rest is used for the allocation page. Thus the bigger the page size, the more pages the allocation bitmap on page 0 controls.

o Now insert as many rows as necessary into the dummy table such that it is as big as the table you are trying to recover. The goal here is to get the allocation map in page 0 to mark all the pages as allocated and in use.

o Now shut down cleanly (i.e. shutdown=true) - if you don't do this, then the changes may only be in the log and not in the seg0 file.

o Now, with both dbs shut down cleanly, copy page 0 from the dummy file over page 0 of the copied database table that you are trying to recover, and try booting and checking the table. On a unix system I think this can easily be done with one or two dd commands (a sketch follows after this message); let me know if you need more info. Again, do this only while the dbs are shut down cleanly, otherwise all sorts of recovery problems may happen.

o And of course after you do this you should run the consistency checker to see what else may be corrupted; note the consistency checker does not check everything. So I usually recommend that the only safe thing to do when doing this kind of data corruption mining is to select the recovered data out of the bad db and then insert it into a newly created good db, otherwise the corruptions may lurk around and bite you later. The checker is not perfect; it mostly does a good job of checking that the index trees are consistent internally and that they are consistent with the base tables. For instance, I don't think it even reads data in the base table that is not needed to check the indexes. /mikem

Bryan Pendleton wrote: Anyone adventurous enough to help me with this problem? Even a yes/no to the question of whether it's possible, and maybe some pointers on how to reconstruct page 0, would be a lot of help I think.

Well, anything's possible, since it's open source software and so you can change it and make it do what you need. However, this doesn't sound like a very easy thing to do. You've definitely exhausted all possible sources of backups? Here's some fairly high-level information about page formats: http://db.apache.org/derby/papers/pageformats.html My first thought would be that if you could make page 0 appear to look as though ALL other pages in the conglomerate were in use, so that it seemed to have no pages marked available, then you could try opening the database and reading all the data out. So you'd like the FreePages bitmap to be empty, for this recovery scenario. thanks, bryan
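A hedged sketch of the page-0 copy step (the file names and 4k page size are assumptions for illustration; both databases must be shut down cleanly first, and bs must equal the table's actual page size):

# copy page 0 (the allocation page) of the dummy table's file over
# page 0 of the table being recovered, without truncating the target
dd if=dummydb/seg0/cNNN.dat of=copydb/seg0/cMMM.dat bs=4096 count=1 conv=notrunc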
Re: Repairing a corrupted database with invalid checksum on page
Unfortunately the problem you have is during redo recovery, so not only is the page you see in the log bad, you also don't have access to the database to find out what else is bad. And since redo recovery is failing, the work after that log record that would bring your database back to a consistent state is not being done. Do you have the ability to change/build derby source to produce your own version? If so I would try hacking in a change to redo recovery to just skip this page in this container, and see if the rest of redo recovery works. Then you would have a booted db that you could run the consistency checker on. Then you could figure out how important container 1105 is. The best case is if 1105 is an index; then one need only drop the index and recreate it. From the conglomerate id I think it is an index:

 * The conglomerate number has 2 parts. The low 4 bits are used to
 * encode the factory which owns the conglomerate. The high 60 bits
 * are used as a normal unique id mechanism.
 *
 * So for example if the next id to assign is 0x54 the following will
 * be the conglomid:
 * if a HEAP (factory 0) - 0x540
 * if a BTREE (factory 1) - 0x541
 *
 * And the next id assigned will be:
 * if a HEAP (factory 0) - 0x550
 * if a BTREE (factory 1) - 0x551

David Sitsky wrote: I have a customer (unfortunately with confidential data) who has a Derby database with corruption. I'm not sure of the details - somehow an eSata disk was powered off, and perhaps it had write-caching enabled, I'm not sure. I've read in some Derby docs online that: "In some cases one may recover data from the existing database, depending on the extent of the corruption, but this will require by-hand data recovery. Depending on the type of corruption this may be successful or not. One should consult the Derby list if attempting this recovery - no automatic software solution to this recovery exists." So I guess that is what I am doing... I've included the error below as to what the database has reported. The customer has 7 other databases on the same disk which are fine, so perhaps the corruption is limited. Apart from using backups (which are unfortunately old), are there any options to get this database back into any kind of shape? The thought of fixing the checksum on this page and praying for the best seems dangerous... but they don't have many other options. Thanks for any advice.

2009-06-09 02:04:52.245 GMT: Booting Derby version The Apache Software Foundation - Apache Derby - 10.4.1.3 - (exported): instance 40348015-0121-c2c2-c064-afcf10ad on database directory X:\Baulderstone\Stores\Complete-bea2c17c113d42519b6ff5aa5d9b0141\Stores\EvidenceDatabase BEGIN SHUTDOWN ERROR STACK - ERROR XSDG2: Invalid checksum on Page Page(3586,Container(0, 1105)), expected=1,077,491,282, on-disk version=1,792, page dump follows (hex dump elided: offsets 0000-01d0, nearly all zeros apart from 0075 at offset 0000, 0006 at 0010, and 0001 at 0020)
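The factory decoding quoted above is easy to check in a few lines of Java; for container 1105 (0x451) the low 4 bits come out as 1, i.e. a BTREE. This is an illustrative sketch, not Derby's own code:

    public class ConglomId {
        public static void main(String[] args) {
            long conglomId = 1105;            // 0x451, from the error message
            long factory = conglomId & 0xF;   // low 4 bits: 0 = heap, 1 = btree
            long uniqueId = conglomId >>> 4;  // high bits: the unique id (0x45)
            System.out.println("factory=" + factory
                + " uniqueId=0x" + Long.toHexString(uniqueId));
            System.out.println(factory == 1 ? "BTREE (index)" : "HEAP (base table)");
        }
    }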
Re: Increasing the length of a blob column
I logged JIRA DERBY-4256, as unfortunately derby does not currently support altering a table to expand the length of blob or clob fields. I included in it the work necessary to make derby support such a feature. Without doing the development work, I think you are stuck with the kind of process you describe below. If you can come up with the space to complete it, I think you can get your db size back down if you add a compress table call at the end of your work. But the db is definitely going to grow in the intermediate steps, since rows temporarily hold multiple copies of each blob. A lot of logging will also happen, so wherever your log directory is will grow during this process. Log space can't be reclaimed until transactions are committed, so you may be growing space there as well. /mikem

Evan Leonard wrote: Hello again, As a corollary to my upgrade question, I have another issue I would like to get some input on. Several old databases in production were created with blob columns at the then-default blob size of 1mb. How can I go about upgrading these columns to be longer? This is the best approach I've come up with so far:

ALTER TABLE binarydata ADD COLUMN data2 blob(128M);
UPDATE binarydata SET data2 = data;
ALTER TABLE binarydata DROP COLUMN data RESTRICT;
RENAME COLUMN binarydata.data2 TO data;

The issue with this approach is that some deployed databases are nearly 1GB in size, with a large portion of that being in this table I'm trying to adjust. When I ran this as a test on such a database, the size of the db on disk ballooned to over twice its starting size, and then the test failed because I was running it on a temp drive without enough storage to complete, so I haven't been able to fully verify that this will work. Is there a good reason why blob columns can't have their size adjusted directly? I've tried:

ALTER TABLE binarydata ALTER COLUMN data SET DATA TYPE blob(128M);

but this failed with an error. Any ideas? Evan
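Putting Evan's statements and the suggested compress call together, a rough JDBC sketch of the whole migration (table and column names are from the example above; the 'APP' schema name and the database URL are assumptions):

    import java.sql.*;

    public class WidenBlob {
        public static void main(String[] args) throws SQLException {
            try (Connection c = DriverManager.getConnection("jdbc:derby:mydb");
                 Statement s = c.createStatement()) {
                s.executeUpdate("ALTER TABLE binarydata ADD COLUMN data2 BLOB(128M)");
                s.executeUpdate("UPDATE binarydata SET data2 = data");
                s.executeUpdate("ALTER TABLE binarydata DROP COLUMN data RESTRICT");
                s.executeUpdate("RENAME COLUMN binarydata.data2 TO data");
                // reclaim the space taken by the intermediate copies
                s.execute("CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE('APP', 'BINARYDATA', 0)");
            }
        }
    }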
Re: syscs_compress_table deadlock
Is there any chance you can post a reproducible test case? The part of this I am having a hard time understanding is how the transaction doing the inserts has gotten a table level read lock on SYSCONGLOMERATES, as indicated by the following line from the error:

Granted XID : {27040324, IS} , {27043904, S} , {27043904, IS} , {27058697, S}

Transaction 27043904 has an S (shared) table lock on SYSCONGLOMERATES, which is a system table. The key is figuring out how that happened. Compress is doing row locking as indicated by it requesting a (IX), intended share table lock. Any chance the inserting threads are either doing direct queries on SYSCONGLOMERATES or doing database metadata queries? Setting derby.language.logStatementText=true may help track down what the insert threads are doing to get this unexpected lock. It is probably prudent to commit after any database metadata query, to release any locks that may have been requested on system catalogs, unless you require that info for a consistent transaction. Also search the documentation for derby.locks.monitor=true for ways to get the system to print more information when it gets a deadlock.

T K wrote: While we call this stored proc, INSERTs keep coming in from other threads, resulting in the deadlock exception at the bottom. I assume this is expected, although I did not see anything about it in the documentation. Can someone please confirm the proper conditions for calling this stored proc? Derby 10.3.3.0. tia

com.XXX.pub.database.DatabaseException: code: -1 SQLState: 38000 Message: The exception 'java.sql.SQLException: A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
Lock : TABLE, SYSCONGLOMERATES, Tablelock
  Waiting XID : {27040324, IX} , MYSCHEMA, alter table MYSCHEMA.SOMETABLE compress sequential
  Granted XID : {27040324, IS} , {27043904, S} , {27043904, IS} , {27058697, S}
Lock : TABLE, SOMETABLE, Tablelock
  Waiting XID : {27058697, IX} , MYSCHEMA, insert into SOMETABLE (COLUMN1) values (?)
  Granted XID : {27040324, X}
. The selected victim is XID : 27040324.' was thrown while evaluating an expression. From: XXX SQL: {call syscs_util.syscs_compress_table('MYSCHEMA','SOMETABLE',1)}
SQL Inserts: Caused by SQL Problems.
Problem #1, SQLState 38000, Error code -1: java.sql.SQLException: (same deadlock cycle as above) was thrown while evaluating an expression.
Problem #2, SQLState 40001, Error code 9: java.sql.SQLNonTransientConnectionException: A lock could not be obtained due to a deadlock (same cycle as above). The selected victim is XID : 27040324. at
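For reference, one way to switch on the diagnostics mentioned above before the engine boots; the same keys can also go in derby.properties in the system directory (derby.locks.deadlockTrace is a related property that, to my understanding, pairs with the monitor setting):

    public class DerbyDiagnostics {
        public static void main(String[] args) throws Exception {
            System.setProperty("derby.language.logStatementText", "true"); // log each statement to derby.log
            System.setProperty("derby.locks.monitor", "true");             // log deadlock details
            System.setProperty("derby.locks.deadlockTrace", "true");       // stacks of the waiting threads
            Class.forName("org.apache.derby.jdbc.EmbeddedDriver");         // boot after properties are set
        }
    }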
Re: syscs_compress_table deadlock
The table locking nomenclature is somewhat confusing. Let me try to explain. Derby always uses 2-level locking so that it can support both table level and row level locking. What people usually refer to as table locking are the cases of S and X locks on TABLE locks. An S table lock logically means a shared lock on every row in the table. An X table lock logically means an exclusive lock on every row in the table. Derby update row locking always first gets an IX table lock (intent to get X row locks), and read row locking gets an IS table lock (intent to get S row locks). In derby the only purpose of these intent locks is to block the S and X table locks as appropriate. So an X table lock is not compatible with any other table lock. An S table lock is not compatible with X or IX; it is compatible with IS.

transaction 27040324 (compress transaction):
o is waiting to get an IX table lock on SYSCONGLOMERATES
o owns an X table lock on SOMETABLE

transaction 27058697 (insert transaction):
o is waiting on an IX lock on SOMETABLE (which is not compatible with the above X lock held by 27040324)
o is granted an S table lock on SYSCONGLOMERATES (which is not compatible with the above IX table lock request on SYSCONGLOMERATES)

More on Derby isolation and locking can be found in the docs: http://db.apache.org/derby/docs/10.5/devguide/cdevconcepts30291.html

I assume the inserts are part of a multi-statement transaction, is that true? Does the transaction include the creation of the table?

T K wrote: Hmm. My reading of the situation is a bit different... You said: "Compress is doing row locking as indicated by it requesting a (IX), intended share table lock." To me IX is Intended Exclusive, and the fact that some other transaction has a shared lock (S) on SOMETABLE shouldn't really matter. So going back to the error report - and for a deadlock to exist - I would expect to see some transaction A holding exclusive lock L1 while requesting exclusive lock L2, and another transaction B holding L2 while requesting L1. This means I would expect to see two X locks in the report between two tables, but I don't - there is just one X lock... Here's the report again for clarity:

'java.sql.SQLException: A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
Lock : TABLE, SYSCONGLOMERATES, Tablelock
  Waiting XID : {27040324, IX} , MYSCHEMA, alter table MYSCHEMA.SOMETABLE compress sequential
  Granted XID : {27040324, IS} , {27043904, S} , {27043904, IS} , {27058697, S}
Lock : TABLE, SOMETABLE, Tablelock
  Waiting XID : {27058697, IX} , MYSCHEMA, insert into SOMETABLE (COLUMN1) values (?)
  Granted XID : {27040324, X}
. The selected victim is XID : 27040324.' was thrown while evaluating an expression. From: XXX SQL: {call syscs_util.syscs_compress_table('MYSCHEMA','SOMETABLE',1)}

If my reading of this is accurate, what I see is that 27040324 has an X lock on SOMETABLE and 27058697 is waiting to obtain a lock on the same table (IX), while 27058697 is not holding any other X locks for it to engage in a deadlock. At the same time 27040324 is intending to obtain an exclusive lock on SYSCONGLOMERATES, but the report is not telling who's holding an X lock on it. Basically in this report, and for a deadlock to occur, I am expecting to see that 27058697 also has an X lock on SYSCONGLOMERATES, but I don't.
Therefore, there is either some sort of bug here, or I have to assume there must be a third transaction (not reported here) that holds this exclusive lock, and this would then appear to be a 3-way deadlock, but the report isn't telling me this, and detecting a 3-way deadlock is a very difficult thing to begin with... so from this I conclude that we are dealing with some sort of bug here... The code is not accessing SYSCONGLOMERATES directly or doing metadata queries - threads are simply INSERTing while another thread runs stats periodically. Thanks

*From:* Mike Matrigali mikem_...@sbcglobal.net
*To:* Derby Discussion derby-user@db.apache.org
*Sent:* Wednesday, May 27, 2009 12:28:50 PM
*Subject:* Re: syscs_compress_table deadlock

Is there any chance you can post a reproducible test case? [snip]
Re: Clean derby shutdown - shutdown a db so no recovery needed on startup
Does the following mean that you plan on having multiple machines access a shared on-disk image of derby? Derby can't support this configuration, as it depends on the JVM coordinating access, and in this case one JVM is not going to know that another JVM on a different machine is accessing the db. So derby cannot prevent concurrent access to the db, and if concurrent access is allowed it is very easy to end up with a corrupt db. This can happen even in cases where the access performs no update operation. Derby has definitely not been designed to perform well for a single operation per boot. The assumption is boot once, and then single connections come and go, with many derby deployments doing a single operation per connection (best with connection pooling). In this case a cache is left around so that each query does not have to pay the I/O for the various system catalog lookups necessary for the queries.

Nick Puz wrote: Hi Jørgen, Thanks for the quick response. Due to our planned derby usage pattern (open derby, do something for a user request, close derby) this is more of an issue. This usage is done so that any mid-tier box can handle client requests and access the derby db on a nas/filer. I notice the following writes/fsyncs in a read-only test; are they all due to the commit log record, or is there another cause of the writes to the log ctrl files:
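As a reference point, a minimal sketch of the clean-shutdown sequence the subject line asks about. Derby signals a successful engine shutdown with an SQLException whose SQLState is XJ015 (08006 when shutting down a single database), after which no recovery is needed on the next boot:

    import java.sql.*;

    public class CleanShutdown {
        public static void main(String[] args) throws SQLException {
            try {
                DriverManager.getConnection("jdbc:derby:;shutdown=true"); // whole engine
            } catch (SQLException se) {
                if (!"XJ015".equals(se.getSQLState())) throw se; // XJ015 means a clean shutdown
            }
        }
    }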
Re: URGENT!!! JDBC SQL query taking long time for large IN clause
are getting parsed with much lower memory footprints. Best regards, Arindam.

On Wed, Apr 8, 2009 at 2:39 AM, de...@segel.com wrote:

-----Original Message----- From: Mike Matrigali [mailto:mikem_...@sbcglobal.net] Sent: Tuesday, April 07, 2009 2:05 PM To: Derby Discussion Subject: Re: URGENT!!! JDBC SQL query taking long time for large IN clause

It is impossible to say what the performance of the query can be without knowing the exact values in the IN list. But it is possible to get some idea assuming some worst case behavior, and from that I am going to guess you will never come close to 100ms with an uncached database, on hardware using some sort of standard disk based hard drive. I do think the query may go faster with index and query tweaking, but 100ms against an uncached db with non-clustered unique values in that IN list is never going to happen. Adding up just what is posted, it looks like this is a 1.2 gig db.

Drop the unnecessary indexes and you'll see the database size shrink fast. Also note that he's running this on a Windows XP laptop. Depending on the model of the laptop, you will have not only CPU issues but also disk i/o issues (5400 rpm IDE as an example). However, it is possible for the OP to get better performance, if not realistically 100ms performance. (BTW where did 100ms come from? I'm sorry, but this really sounds like a class project...)

You posted the space for the tables and indexes. The interesting ones are the big ones. You have 5 tables or indexes over 1000 pages big. If in the worst case your 1000-value IN list happens to land on 1000 different pages, then Derby is going to need to do at least 1000 i/o's to get to them - I usually use a back-of-envelope figure of at most 100 i/o's per second (even if your disk has specs that claim a higher rate, this I/O is not going to get streamed as fast as possible by this query; it is going to ask for a page, process it, do some join work, then later ask for another page, ...). The space figures (name, isindex, allocated pages, free pages, unfilled pages, page size, estimated space saving):

CATEGORY_MASTER                0  103     0  0    4096  0
SQL090406091302600             1  55      0  0    4096  0
SQL090406091302601             1  160     0  1    4096  0
SQL090406091302730             1  1       0  1    4096  0
OBJECT_MASTER                  0  10497   0  0    4096  0
SQL090406091302760             1  5340    0  1    4096  0
SQL090406091302761             1  16708   0  410  4096  0
OBJECT_CATEGORY_MAPPING        0  150794  0  0    4096  0
OBJECT_CATEGORY_MAPPING_INDEX  1  112177  0  57   4096  0

Mike, I think a lot of this information is a bit skewed. Outside of the primary index, the indexes he created included the varchar field. Not sure why he did this, except under the impression that he'd only have to hit the index and not the underlying table. While there is some potential merit to this, I think there are things he can do to improve performance. (Hence my post about reworking the query itself and using a temp table.) Drop those indexes and you'll see a big change in database size.

There was work done in 10.3 on IN-LISTS, making them perform more like unions - see DERBY-47. So if you have a choice of releases I would suggest you move to 10.4 and post the query plan and results against that. The basic idea of that change was to allow the system to do 1 probe into an index for each value in the IN-LIST; before this change Derby could only sort the values in the IN list and then limit an index scan to the range between the smallest and biggest values in the list. So for instance for OBJECT_CATEGORY_MAPPING_INDEX, worst case it might have to scan 112177 pages to find the 1000 rows, where worst case for probing would be 1000 pages (plus btree parent index pages, but those are much more likely cached).
The problem is that there is definitely overhead for probing one at a time; scans go much faster - so there is a crossover point. I.e. I would guess it would likely be better to scan all 112177 pages than to do 100,000 probes.

I believe it was already recommended that he do just that. There are two ways he could use the temp table: as a sub-select statement, or as part of the table join (see the sketch below). I think this would bypass the whole use of the IN list. I'm still not 100% sure why there are 100+ values coming from an outside source. Based on his query below it looks like the object_ids in the IN clause are not unique... It's kind of hard trying
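A sketch of the keys-table approach being suggested, using the DUMMY table the original poster describes later in the thread. The OBJECT_MASTER/OBJECT_ID names are taken from the posted figures and discussion; the join itself is illustrative, not the OP's actual query:

    import java.sql.*;

    public class InListAsJoin {
        public static void main(String[] args) throws SQLException {
            int[] ids = {101, 102, 103}; // the values that would have gone in the IN list
            try (Connection c = DriverManager.getConnection("jdbc:derby:mydb")) {
                c.setAutoCommit(false);
                try (PreparedStatement ins =
                         c.prepareStatement("INSERT INTO DUMMY (ID) VALUES (?)")) {
                    for (int id : ids) { ins.setInt(1, id); ins.addBatch(); }
                    ins.executeBatch(); // batched inserts were fastest for the OP
                }
                try (Statement s = c.createStatement();
                     ResultSet rs = s.executeQuery(
                         "SELECT m.* FROM OBJECT_MASTER m JOIN DUMMY d ON m.OBJECT_ID = d.ID")) {
                    while (rs.next()) { /* process rows */ }
                }
                c.commit();
            }
        }
    }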
Re: URGENT!!! JDBC SQL query taking long time for large IN clause
removing the preserve rows from the declare temp table clause - and that didn't help me either.

c) The database which I have created is just a sample database which vaguely represents the load we will see if we implement our existing feature using Derby. It doesn't contain REAL data.

d) The entire database was created in one shot, as I have stated earlier, and the records are populated in one go. That was the actual problem - thanks to Knut for indicating that as a possibility. When I compress the tables, the queries start performing a LOT faster! 1 million rows in object master, 20 million in object category mapping, 10K in category master and 1000 in the IN clause gives me 1100 millis now - down from 30 seconds. That probably explains why the object master index scans were taking nearly 30 seconds in the runtimestats which I posted. Which is slow, but still manageable, since the 100- and 500-value cases are much faster. However, that is still a lot slower than the other databases we are testing. Since Derby is embedded we expected it to be much faster, since no IPC is required between the client driver and the network server.

e) IN doesn't function properly beyond 1000 object ids, so we used a regular table: CREATE TABLE DUMMY (ID INTEGER PRIMARY KEY). Insertion of 5000 records into this happened in 100 millis (after compression of the tables) and the query took 4 seconds. With executeUpdate() or execute() (as suggested), insertion was at least 200% slower, so execute batch works better for the inserts into the table for the payloads we require. The temp table query took 10 MINS to execute (this time I waited patiently for it to finish - there was nothing wrong there - the query just takes too much time, and I didn't have the patience to wait for it the first time :) ), while the regular table with a primary key took just 4 seconds.

f) The stack overflow exception for select ... values ... union was much less of a problem in 10.4 compared to 10.1, since the memory usage tapers off to the -mx size (in megs) set for the JVM. But since it cannot be parameterized it isn't of much use to us. IN clause based queries are getting parsed with much lower memory footprints.

Best regards, Arindam.

On Wed, Apr 8, 2009 at 2:39 AM, de...@segel.com wrote: [snip]
Re: Index performance
The technique of adding extra columns to indexes works well for derby if it matches your application needs. The docs usually refer to this as a covering index, and the optimizer is pretty good at looking for cases where it can use a covering index and avoid going to the base table. Hopefully this will help your application.

From the info posted, the query with the index is definitely doing an index-to-base-row lookup for every qualifying row. It looks like, for this data distribution and number of qualifying rows, this is a worse plan than just doing a full table scan. The optimizer should have picked the base table scan rather than the index-to-base-row plan, given how the two performed. I think this is another case showing that the optimizer's cost model needs to be updated to reflect current technology. There has been a lot of work to make scans go fast, and that is not reflected in the current costing. The optimizer estimated 203425 rows for the index qualification and got 271136, which seems not too bad (5% vs. 6.8%). This info comes from the with-index.txt query plan. The estimate assumes an equal distribution across all values, so maybe this range was a little hotter than others. Since the row count estimate looks close, I lean toward the base costing as the problem.

It would be interesting to know how much of the index/base-row performance issue is cache misses vs. the cpu overhead of just doing the base row lookup for every row. For this db it would take a 50,000 page cache just to hold the base rows, plus whatever it takes to cache the index. For this kind of distribution I have seen databases gather all the row pointers from the index and then do ordered probes into the base table; this ensures good caching for the lookups. Derby does not have this technique available.

Tim Dudgeon wrote: Thanks for the comments. Here is some more info. I attach the DDL for the table concerned, the simple test program I use, and the execution strategy with and without an index. Some additional points:
1. the query returning 7% of the table is certainly not an extreme case. The exact query criteria are specified by the user in the UI, and can be much worse than this case. I have no control over the nature of the query that the user specifies.
2. Yes, if the query is much more selective the index can be a help.
3. The biggest data file in seg0 is 1452572672 bytes in size (i.e. 1.4GB).
4. the index was added after the rows were added.
5. making the index also have the pk_column as the second indexed field makes it go like lightning! The search runs in 2 secs, about 14x faster.
So in summary it seems like an index will be of no help to me in this situation, unless I make it an index on both columns. Many thanks, Tim

Tim Dudgeon wrote: I found an interesting performance problem that I'd welcome some help in understanding. I have a table of 4,000,000 rows that has a DOUBLE column containing floating point numbers. I run a query like this:

select pk_column from the_table where the_double_column > 300 and the_double_column < 320

271136 rows are returned. I then go through the ResultSet and extract all the id_column values. All of this is done using standard JDBC. When I do this it takes 23 seconds, which I thought was not unreasonable, as a full table scan was involved and the table was pretty big. But of course I thought an index would help, so I added an index on the_double_column and repeated the process. It took 720 seconds, 31x slower!
I thought this was strange, but suspected it might be because I was using standard settings and the 4MB page cache was much too small to hold the index. So I increased the page cache size (derby.storage.pageCacheSize property) to 10x the size (10,000 pages) and repeated the process. There was only a very minor improvement in speed. In all cases the memory usage, as reported by Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory(), was really no different, and the used memory was much less than the maximum available specified by the -Xmx setting. Any ideas what to do? Tim
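Tim's point 5 is the covering-index technique from the top of the thread. As a sketch against his example schema (index name and URL are made up; the index holds both the qualified column and the selected column, so the scan never visits the base table):

    import java.sql.*;

    public class CoveringIndex {
        public static void main(String[] args) throws SQLException {
            try (Connection c = DriverManager.getConnection("jdbc:derby:mydb");
                 Statement s = c.createStatement()) {
                // second column makes the index "cover" the query below
                s.executeUpdate(
                    "CREATE INDEX IX_DOUBLE_PK ON the_table(the_double_column, pk_column)");
                try (ResultSet rs = s.executeQuery(
                         "SELECT pk_column FROM the_table " +
                         "WHERE the_double_column > 300 AND the_double_column < 320")) {
                    while (rs.next()) { rs.getObject(1); /* consume rows */ }
                }
            }
        }
    }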
Re: initial page allocation versus incremental allocation
Whoever decides to work on this should do more than just document, as these params are not tested. They were added in the past for ad-hoc testing and performance debugging. As has been suggested, they were never fully implemented/documented, as they did not fit in with the original goal of the product - ie. be a zero-admin embedded db for smaller devices. The idea, if they were used at all, was for the software to automatically determine the best settings and bump them up dynamically, rather than force/allow the user to do so. As I remember, the defaults have something to do with the following:

o Start with no preallocation, as originally users actually would notice if a 2k-page table started out at 16k vs 2k for an empty table.

o Default to 8, as again the standard user was space limited and did not want to waste space.

o 1000 is just a round number, but if it actually gets documented I would check whether the code can actually support a number that is more than the number of pages that can be tracked by a single allocation page (or maybe 2). This number depends on page size, so the worst case would be allocation on a 2k-page table. Some testing should be added if this is to be supported.

Knut Anders Hatlen wrote: Brian Peterson publicay...@verizon.net writes: Won't this still limit the setting to MAX_PRE_ALLOC_SIZE? Why have the maximum setting? Or, at least, why limit it to just 8 pages? MAX_PRE_ALLOC_SIZE is 1000; it's the default that is just 8 pages, as far as I can see. (And no, I don't know why it is limited to 1000 pages, but I wouldn't expect many objections if a request to increase the limit was filed in JIRA - https://issues.apache.org/jira/browse/DERBY)
Re: Recovering a database corrupted by DERBY-3347
What state is your db currently in - will it not boot at all?

Daniel Tripp wrote: Does anyone know of any way to recover a database that has been corrupted by the bug known as DERBY-3347? (https://issues.apache.org/jira/browse/DERBY-3347) The Derby 10.3.3.0 release announcement states: “In some cases one may recover data from the existing database, depending on the extent of the corruption, but will require by hand data recovery.” and that “one should consult the Derby list if attempting this recovery.” But I could find no details anywhere of how to even begin such a recovery. There are many tables in this corrupted database of mine, but even if I could recover just a few tables (or even get a row count on those tables) that would be very helpful to me. Any info on this issue would be much appreciated.
Re: Safe backups
At a very high level, here is the problem with backing up an active derby database without using the provided procedures to properly coordinate with Derby. In an active derby database, the database state is a coordinated combination of the database files on disk, the updated pages of those database files still in the internal derby cache, and the transaction log files. Derby is carefully coded to be able to handle a crash that loses the updates in cache -- but only if the database files on disk and the transaction log files are exactly in the coordinated state that Derby expects. Copying files by hand from outside derby while the system is running loses this careful coordination. This is why derby provides a number of supported ways to back up an active db.

Derby uses write-ahead logging for transactional consistency, which means it guarantees that log records are forced to disk before the associated data file updates. So take 2 simple cases of un-coordinated backup:

1) You copy the log before the data. Now you may have inconsistent data, as you may miss log updates associated with the data that you copied - which may lead to a corrupt db.

2) You copy the data before the log. Derby may, after guaranteeing that all data associated with a particular logged transaction has been forced to disk, delete that portion of the log as unneeded. But your copy of the data may not include those updates, and when you go to copy the log you are now missing log records associated with the pages of the data you have copied.

This is just one of the problems you may encounter.

Raymond Kroeker wrote: If you are looking to back up your database in realtime, take a look at the SYSCS_BACKUP_DATABASE stored procedure. It works well. http://db.apache.org/derby/docs/10.4/adminguide Raymond

On Fri, Jul 18, 2008 at 09:28, Bryan Pendleton [EMAIL PROTECTED] wrote: Could someone explain the mechanism(s) for getting a corrupt database copy? I'm using transactions, so isn't there protection against this sort of problem anyway? How long does it take to commit a transaction? Yes, Derby uses transactions to protect the database integrity, and committing an individual transaction is quite speedy. You can ensure that your database copy is valid by following the techniques described here: http://db.apache.org/derby/docs/10.4/adminguide/cadminhubbkup98797.html thanks, bryan
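As a sketch of one of the supported ways - the online backup procedure Raymond mentions, which copies data and log in the coordinated order described above (the database URL and target directory are placeholders):

    import java.sql.*;

    public class OnlineBackup {
        public static void main(String[] args) throws SQLException {
            try (Connection c = DriverManager.getConnection("jdbc:derby:mydb");
                 CallableStatement cs =
                     c.prepareCall("CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE(?)")) {
                cs.setString(1, "/backupdir"); // a consistent copy of the db lands under this directory
                cs.execute();
            }
        }
    }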
Re: Questions about compress table
Kathey Marsden wrote: http://db.apache.org/derby/docs/dev/ref/rrefaltertablecompress.html says a non-sequential compress uses an amount equal to approximately two times the used space plus the unused, allocated space, and it says SEQUENTIAL uses less space. How much space does SEQUENTIAL use? Also, is there a general guideline for how much longer SEQUENTIAL will take? Thanks, Kathey

It is hard to give absolutes. Non-sequential builds everything at once, so the space used is:

1) all the original space of the base table + all the original space of all the indexes
2) all the space of the new base table + all the new space of all the indexes
3) all the space that has been overflowed to disk as part of the sorts used to build the indexes. In the worst case this will be close to the space used by the new indexes in #2.

Sequential is still going to need 1+2. But because it builds one index at a time, the only additional space used is the sort needed for the index it is currently building. This is in temp space and can be deleted as soon as each index is built.

The extra time basically depends on the size of the base table and how long it takes to do a sequential scan of it. If it takes N to do a sequential scan of the base table and there are Y indexes, then sequential will take approximately N * (Y-1) more time than non-sequential. Note that, no matter what, the system is going to hold an exclusive lock on the whole table for the entire compress, whether it is sequential or non-sequential.
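For reference, both variants are driven by the third argument of the system procedure; schema and table names here are placeholders:

    import java.sql.*;

    public class CompressTable {
        public static void main(String[] args) throws SQLException {
            try (Connection c = DriverManager.getConnection("jdbc:derby:mydb");
                 CallableStatement cs =
                     c.prepareCall("CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE(?, ?, ?)")) {
                cs.setString(1, "APP");
                cs.setString(2, "MYTABLE");
                cs.setShort(3, (short) 1); // non-zero = SEQUENTIAL: less space, roughly N*(Y-1) more time
                cs.execute();
            }
        }
    }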
Re: Pathologically poor query performance - advice required
Matt Doran wrote: Army wrote: Matt Doran wrote: Both of these queries only return fields from 2 of the joined tables. The slow one selects 5 fields from each of these tables. The fast one selects 5 fields from one table and 4 fields from the other. Strange indeed!! Thanks for the great description of the problem, and for taking the time to narrow down the good vs bad scenarios. No worries ... I hoped that if I put a lot of effort into describing/investigating the problem ... someone would help. And you did. Thanks for looking at this! I thought that, due to the complexity of what's going on, I might get no answers. It's very much appreciated.

So does anybody understand what might be happening here? I took a (very) quick look at the good query plan vs the bad query plan, and the thing that first jumped out at me is that for the fast query the optimizer is choosing to do two Hash Joins, but for the slow query the optimizer is choosing Nested Loop joins everywhere. I don't know if that alone is the difference, but *if* it is, one possible explanation is that the optimizer will deliberately reject Hash Joins if it (the optimizer) thinks that the resultant in-memory hash table will take too much of the available JVM memory. In that case the optimizer will skip the hash join and use a nested loop join instead. Having said that, the calculation of what "too much memory" is depends on several things, one of which is the estimated memory usage for a single row of the result set. So it's *possible* that the addition of the extra column to the SELECT result column list is causing the estimated per-row memory usage to increase such that the optimizer's calculation of "too much memory" changes, ultimately leading to rejection of the hash join.

How could I go about improving the performance? Another factor in the calculation of "too much memory" is a Derby property called derby.language.maxMemoryPerTable. For details see DERBY-1397 (https://issues.apache.org/jira/browse/DERBY-1397); an extract from that issue is as follows:

Wow! This worked. Thank you very very much. Upping the max memory to 10MB made my modified query run fast (but not the original). Upping it to 20MB made the original query run fast too. Thank you. We embed derby in our commercial product, where we have minimal control of memory. We'll have to make a compromise here. We can dynamically change this property at startup based on the amount of memory available. I think the optimizer is being too conservative in switching back to nested loop joins, and I'd be interested in understanding why, with the possibility of improving the optimizer's behavior. Unfortunately it seems I can't enable optimizer tracing without building my own derby (I might try that later). The tracing would allow me to see which memory check (there are a number of them throughout the optimizer) is triggering the behaviour.

In this particular case there is an index on the large main table which limits the results returned dramatically - I was only returning 4 of the 130,000 rows. The index was being used in the fast version, but was not being used in the slow version. So with the slow version it looks like it's doing the join on all these rows ... even though the index could have dramatically reduced its work. That's a pretty serious issue when this index is *so* useful in this case. I'm no DB expert (by any stretch of the imagination), but if I were a derby developer I'd be asking:

1. In estimating the memory used, why aren't we making effective use of the index cardinality/stats to make an accurate estimate of the rows returned? If the optimizer knew that only a small percentage of the rows would be returned (and that minimal memory would be used), it could have used the hash join approach. Do derby index statistics give you enough info to do this type of thing?

2. When we fell back to the nested loop join, why weren't we making any use of the obvious index, and ordering the join appropriately? If it used that index in the first step, the result should have been much faster.

But as I said, I'm no expert ... and achieving those could be extremely difficult. Thanks again - at least this derby.language.maxMemoryPerTable setting gives us a way forward. This setting doesn't seem to be documented in the tuning guide ... DERBY-1397 talks about documenting it, but I couldn't find it. Regards, Matt

To get feedback from people it may also help to post the ddl of the tables in the query, including the indexes/constraints. I know you provided the db, so with work it is available - but the more up-front info you provide, the more help is possible. One other suggestion might be to try the query once with no optimizer timeout to see if it would have found a better plan:
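For reference, a sketch of the knob discussed above. Per DERBY-1397, derby.language.maxMemoryPerTable is specified in kilobytes with a default of 1024 (1 MB), so the values Matt tried look like this (must be set before the engine boots):

    public class OptimizerMemory {
        public static void main(String[] args) throws Exception {
            // 10240 KB = 10 MB; 20480 KB = 20 MB made the original query fast
            System.setProperty("derby.language.maxMemoryPerTable", "10240");
            Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
        }
    }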
Re: Thread Interruption in large Database
PhilCope wrote: I have a large cloudscape 5 database (over 5 million records), and I have found that when a ResultSet includes all or many of these records, the initial call to ResultSet.next() takes a very long (but finite) time (I would estimate about 10-15 mins). I can (indeed have) arranged the code so that this call occurs in a separate java thread which can be interrupted from the UI thread. BUT, as you may know, java threads that are marked as interrupted continue to execute until either application code or some system call actually checks the interrupted state of the running thread. So, given this background info, my questions on Derby are: 1. Have any significant performance improvements been made such that, for databases of this size, migrating from cloudscape to Derby would provide a significantly better response time? 2. If not, are there any improvements to the responsiveness of the time-consuming call to .next() with respect to the setting of the interrupted flag on the current thread in Derby? Thanks, Phil Cope

The issue is not really the time-consuming call to .next(), nor necessarily the size of the result set; it is just that when you call next() it has to wait until query processing is ready to return the first row. In some cases derby can return the first row before completing processing of the entire query. For instance, I believe if you just did a simple select of all the rows from your 5 million row table, you would see that the first row comes back very quickly. In other cases it may do a lot of processing before it even gets to the first row (imagine a query with no key that required the db to process every row in the db where only the last row in the table actually qualified). In other cases the semantics of the query require the db to process the whole result before returning the first row.

Can you post the query? It may help people to give you suggestions. Where possible derby tries to stream results out as it gets them, but there are queries where all the rows have to be seen and processed before the first row can be returned. The simplest example is a query with an order by at the end. If there is no index that provides the ordering of the order by, then derby will process the whole query, throw all the rows into the sorter, sort them all, and only then give you the first row back. Sometimes this order by behavior can be worked around by creating an index on the exact keys, in the same order as the order by. Also note that, while not strictly necessary, the current derby/cloudscape sorter will not return the first row of the sort before it has finished sorting all the rows. As queries get more complicated, it may be harder and harder for derby to return a row early.
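A sketch of the order-by workaround described above - with a matching index, Derby can stream rows in key order instead of sorting the whole result first. Table, column, and index names here are made up for illustration:

    import java.sql.*;

    public class StreamingOrderBy {
        public static void main(String[] args) throws SQLException {
            try (Connection c = DriverManager.getConnection("jdbc:derby:mydb");
                 Statement s = c.createStatement()) {
                // index on the exact order-by keys, in the same order
                s.executeUpdate("CREATE INDEX BIG_K ON BIG_TABLE(K)");
                try (ResultSet rs = s.executeQuery("SELECT K, V FROM BIG_TABLE ORDER BY K")) {
                    rs.next(); // first row can now come back without waiting for a full sort
                }
            }
        }
    }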
Re: Speed of using Derby DB
vodarus vodarus wrote: Hello. I tried to use Derby and compare it with Oracle. I thought that Derby could have the same performance as Oracle on simple procedures. Purpose of the benchmark test: use Derby as a local db and get better performance for local data manipulation. DB schema:

create table TESTBIG (
    CLIENT int not null,
    ORDER_ID int not null,
    ORDER_AMOUNT int not null
);
alter table TESTBIG add constraint TESTBIG_PK primary key (CLIENT, ORDER_ID);

create table TESTTOTALS (
    CLIENT int not null,
    CLIENT_TOTAL int
);
alter table TESTTOTALS add constraint TESTTOTALS_PK primary key (CLIENT);

We populate the TESTBIG table with 1,000,000 rows, then a stored procedure calculates TESTTOTALS: CLIENT_TOTAL is the sum of all ORDER_AMOUNT values for that CLIENT. I wrote the stored procedure for Derby in Java:

static public void calculateTotal() {
    int totalAmount = 0;
    int lastClient = 0;
    try {
        Connection connection = DriverManager.getConnection("jdbc:default:connection");
        Statement s = connection.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                                 ResultSet.CONCUR_READ_ONLY);
        ResultSet rs = s.executeQuery("SELECT CLIENT, ORDER_ID, ORDER_AMOUNT FROM TESTBIG");
        PreparedStatement updateData = connection.prepareStatement(
            "UPDATE testtotals SET client_total = client_total + ? WHERE client = ?");
        PreparedStatement insertData = connection.prepareStatement(
            "INSERT INTO testtotals (client, client_total) VALUES (?, ?)");
        while (rs.next()) {
            int client = rs.getInt(1);
            int order_amount = rs.getInt(3);
            if (lastClient == 0) {
                lastClient = client;
                totalAmount = 0;
            }
            if (lastClient != client) {
                // System.out.println("MERGE amount " + lastClient + ": " + totalAmount);
                updateData.setInt(1, totalAmount);
                updateData.setInt(2, lastClient);
                int sqlRowCount = updateData.executeUpdate();
                if (sqlRowCount == 0) {
                    insertData.setInt(1, lastClient);
                    insertData.setInt(2, totalAmount);
                    sqlRowCount = insertData.executeUpdate();
                }
                lastClient = client;
                totalAmount = order_amount;
            } else {
                totalAmount = totalAmount + order_amount;
            }
        }
        updateData.setInt(1, totalAmount);
        updateData.setInt(2, lastClient);
        int sqlRowCount = updateData.executeUpdate();
        if (sqlRowCount == 0) {
            insertData.setInt(1, lastClient);
            insertData.setInt(2, totalAmount);
            sqlRowCount = insertData.executeUpdate();
        }
        rs.close();
        s.close();
        connection.commit();
        connection.close();
    } catch (SQLException ex) {
        Logger.getLogger(CalculateTotal.class.getName()).log(Level.SEVERE, null, ex);
    }
}

It takes up to 12 seconds to calculate TESTTOTALS. An Oracle PL/SQL procedure with the same algorithm needs 1.5 seconds. How can I improve performance? Or is Derby so slow because of Java / JVM issues? Thanks

For derby the default is autocommit=true, which means it will do a synchronous log write to commit every statement. My reading of the app, and the call to commit at the end, seems to indicate you want to commit only at the end. Depending on the data distribution (ie. how many times you call executeUpdate) this may be a big issue. When you run the test, do you see a lot of idle time, or is 1 cpu at 100% for the 12-14 secs? The suggestion to use aggregates seems like a better way to write the app. What kind of page cache does oracle have when you compare? Are you interested in the performance of a query when no page is in cache, or when all pages are in cache? This table is bigger than the default derby page cache (1000 pages), so this basically is going to be an I/O test of how fast stuff can be read from disk each time.
If you want to try a cached test, try setting the derby page cache bigger - 10,000 pages will probably fit that table (probably fewer would do; I didn't do the exact math).
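Putting the thread's suggestions together - one aggregate statement instead of the row-at-a-time loop, a single commit, and a bigger page cache - a sketch against the posted schema (the database URL and cache size are assumptions):

    import java.sql.*;

    public class SetBasedTotals {
        public static void main(String[] args) throws Exception {
            System.setProperty("derby.storage.pageCacheSize", "10000"); // default is 1000 pages
            Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
            try (Connection c = DriverManager.getConnection("jdbc:derby:mydb")) {
                c.setAutoCommit(false); // one commit, one synchronous log write
                try (Statement s = c.createStatement()) {
                    s.executeUpdate("DELETE FROM TESTTOTALS");
                    s.executeUpdate("INSERT INTO TESTTOTALS (CLIENT, CLIENT_TOTAL) " +
                                    "SELECT CLIENT, SUM(ORDER_AMOUNT) FROM TESTBIG GROUP BY CLIENT");
                }
                c.commit();
            }
        }
    }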
Re: SELECT places shared locks in READ_UNCOMMITTED insolation level?
I am not sure if derby documents the meaning of the line you are referring to in the query plan: "at read uncommitted isolation level using share row locking chosen by the optimizer". I think this is what it always says for read uncommitted, as it is up to a lower part of the code not to actually get the read locks. Could you describe the behavior you are seeing, or give a small test case that could be run? Note that Derby's implementation of read uncommitted does get table level intent locks, so if you try operations that depend on table level locking you will see blocking. The system is set up such that the top level passes the isolation level, and then the information about row vs. table and share vs. exclusive locking, down to a lower level. At the lower level, if things are working right, the code makes a share row lock call that goes into the read uncommitted module, and then no actual read lock is requested. There have been significant improvements to IN handling in 10.3/10.4, so you may want to try that.

Bogdan Calmac wrote: Derby documentation specifies that SELECT statements do not place any locks in the READ_UNCOMMITTED isolation level. However, the query plan shows that it uses shared locks (which is also confirmed by application behaviour). So is the documentation wrong, or is this a bug? This is derby 10.2.2.0; see the query plan below:

2008-05-07 14:21:39.990 GMT Thread[ExportWorker 2,5,ArchiveGroup] (XID = 583431), (SESSIONID = 17), select ID, AGENT_SESSION_ID, SEQUENCE_NO, SCRIPT_SESSION_ID, EVENT_TIME, EVENT_TIME_SRV, L1, L2, L3, L4, L5, TOPIC_ID, OBJECT_ID, TEXT, ATTRIBUTES from TRACK_EVENT where AGENT_SESSION_ID in (43909,43914,43921,43922,43923,43924,43925,43926,43927,43930,43931,43932,43933,43934,43935,43936,43937,43938,43939,43940) ***

Project-Restrict ResultSet (3):
Number of opens = 1
Rows seen = 63645
Rows filtered = 0
restriction = true
projection = true
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
restriction time (milliseconds) = 0
projection time (milliseconds) = 0
optimizer estimated row count: 16824.30
optimizer estimated cost: 127904.69
Source result set:
  Index Row to Base Row ResultSet for TRACK_EVENT:
  Number of opens = 1
  Rows seen = 63645
  Columns accessed from heap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}
  constructor time (milliseconds) = 0
  open time (milliseconds) = 0
  next time (milliseconds) = 0
  close time (milliseconds) = 0
  optimizer estimated row count: 16824.30
  optimizer estimated cost: 127904.69
    Index Scan ResultSet for TRACK_EVENT using index TE_IDX_SESSION at read uncommitted isolation level using share row locking chosen by the optimizer
    Number of opens = 1
    Rows seen = 63645
    Rows filtered = 0
    Fetch Size = 16
    constructor time (milliseconds) = 0
    open time (milliseconds) = 0
    next time (milliseconds) = 0
    close time (milliseconds) = 0
    next time in milliseconds/row = 0
    scan information:
      Bit set of columns fetched={0, 2}
      Number of columns fetched=2
      Number of deleted rows visited=0
      Number of pages visited=512
      Number of rows qualified=63645
      Number of rows visited=63646
      Scan type=btree
      Tree height=4
      start position: >= on first 1 column(s). Ordered null semantics on the following columns: 0
      stop position: > on first 1 column(s). Ordered null semantics on the following columns: 0
      qualifiers: None
    optimizer estimated row count: 16824.30
    optimizer estimated cost: 127904.69
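For completeness, the JDBC side of running at this isolation level (standard java.sql API, nothing Derby-specific; the URL and query are illustrative):

    import java.sql.*;

    public class DirtyRead {
        public static void main(String[] args) throws SQLException {
            try (Connection c = DriverManager.getConnection("jdbc:derby:mydb")) {
                c.setTransactionIsolation(Connection.TRANSACTION_READ_UNCOMMITTED);
                try (Statement s = c.createStatement();
                     ResultSet rs = s.executeQuery("SELECT ID FROM TRACK_EVENT")) {
                    while (rs.next()) { rs.getLong(1); /* no row read locks taken */ }
                }
            }
        }
    }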
Re: Derby get-together at Java One
Knut Anders Hatlen wrote: Rick Hillegas [EMAIL PROTECTED] writes: A number of Derby people have expressed interest in getting together the week of Java One. I'm thinking that people may be busy attending sessions during Java One itself, so it might be better to meet the first day, before the Java One sessions start. That would be the Community One day (Monday, May 5). However, that may be an awkward day for some people for other reasons. I'm leaning toward Monday, May 5. I'm also leaning toward a bring-your-own-lunch format, as we've done before with Derby developer get-togethers. Let me know if lunchtime (say 11:30 onward) that day sounds good to you. Hi Rick, I think that'll work fine for me.

I plan on taking advantage of the free Community One offer (no money in my budget to go to javaone), so depending on the offerings, lunch may not work for me. Community One day is free, and at least last year they offered free beer (they even had Anchor Steam on tap) for an hour and a half after work - would anyone like to meet up then, and maybe head to a different location from there? It looks like the reception is 6-7:30pm. You do have to sign up. The link is easily findable; I would post it, but I don't know the rules on advertising on the list. I am not an employee of Sun. I did find the event useful, especially given the price - free. I probably would not travel to go to it, but since I am in the neighborhood.
Re: Derby get-together at Java One
Rick Hillegas wrote: Mike Matrigali wrote: [snip]

Not conflicting with Community One sessions sounds good, as does meeting after 6:00 pm for free beer. I didn't attend the reception last year. Do you have any advice about how we would raft together at the reception? Does anyone have a Derby sign that we could wave around?

Last year they just held it in the area outside where the javaone vendor pavilion is, down the escalator where they usually have a bunch of bean bag seating and the wall of video. People hung out and talked in that area; it wasn't so big or crowded that you would miss a sign if you walked around a bit. But of course I have no idea what they will do this year.

Thanks, -Rick
Re: NullPointerException, possibly in getControlRowForPage?
I second that - could you report this as a new issue? If possible, can you try with a different JVM / with the jit disabled, to try to get a line number for the null pointer? If it reproduces in your environment, can you turn on statement logging to derby.log, so that maybe we can see what is causing the null pointer? Do you do anything in the long transaction that may affect the state of the indexes involved - dropping indexes, compressing tables, other sorts of ddl? Of course, if you can provide a standalone repro, that would be the easiest way to allow someone to help with the problem.

Bryan Pendleton wrote: James A. Shepherd wrote: Hi, I have a long transaction that has suddenly started throwing a NPE. Did you recently upgrade your Derby installation? What version are you running? It seems remotely possible that you might be seeing problems related to this recent change: https://issues.apache.org/jira/browse/DERBY-3216 If you didn't recently upgrade, then perhaps you have found a new bug in Derby. Either way, if you can gather whatever information you have, and record it in an issue in the Jira system, that would be nice! http://db.apache.org/derby/DerbyBugGuidelines.html thanks, bryan
Re: FW: Advice on *very* badly performing query
Most of the stats information that derby uses is automatically kept up to date as part of the underlying index and table maintenance. This info includes the count of rows in the table and the data distribution of data in indexes. The one piece of info that is not kept up to date is the average number of duplicates for the columns of an index. This stat is given a default and then is updated whenever you create an index, run the discussed compress option, and as a side effect of some alter table commands. I don't remember what the default is - something like 10%. /mikem

Kim Haase wrote: Do you think it might also help to put the information into the Tuning Guide under "Performance tips and tricks"? http://db.apache.org/derby/docs/dev/tuning/tuning-single.html#ctunperf22457 Thanks, Kim Haase

Matt Doran wrote: [EMAIL PROTECTED] wrote: Matt Doran [EMAIL PROTECTED] writes: I had no idea that derby didn't keep any stats up to date without performing that operation explicitly. Ideally it would keep them up to date itself. The sys.sysstatistics table didn't have any rows in it until I ran the compress table operation.

Agreed, but strangely I cannot recall many users actually requesting this. Maybe people just suffer silently?

We have hundreds if not thousands of customers using our product, and not many of them have seen this pathological performance problem. So maybe the optimizer does a good enough job in 90% of cases; we just happened to hit an extremely bad case. So maybe it's just not something that people notice often. Or they just think "oh, it's an embedded java database, it probably doesn't perform that well. Let's just upgrade to a real database." That's what we did, and it's what other people probably do. The beauty of the embedded DB is that it is self-maintaining. I suspect that if it maintained the statistics by itself, and there were therefore performance benefits, it would improve people's perception that it performs well.

Anyway, thank you for what I would call an exemplary bug report/question! Even though you use Hibernate, you took the time to identify the actual SQL causing the problem, identified a minimal repro, and provided query plans.

Thanks. I had trouble understanding the behaviour ... so I thought that nobody would believe me unless I provided enough evidence.

It really needs to be made more prominent in the documentation, i.e. once your database is loaded with representative data, perform the compress op for optimal performance.

Agreed. Any thoughts on where it would be good to mention it? If you want, you can file a Jira issue about this.

I'm not sure. No-one is ever going to read a whole manual. But I had read the ApacheCon performance presentations, and I don't remember them ever mentioning this. I think those presentations would be one of the first places people start when they have performance problems. I know you can't change these retrospectively ... but maybe making this clear in the wiki would be a start. I'm not sure if this is the appropriate page, but it was the only thing that looked relevant to performance (http://wiki.apache.org/db-derby/PerformanceDiagnosisTips). It doesn't mention the stats/compress. Maybe some of the tips in those presentations should be distilled into some performance tips wiki pages ... and it should also be made clear that stats need to be updated. Regards, Matt
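A quick way to see whether any cardinality statistics exist yet, as Matt did, is to query the catalog directly. A sketch, assuming only that SYS.SYSSTATISTICS is readable as in the thread:

    import java.sql.*;

    public class CheckStats {
        public static void main(String[] args) throws SQLException {
            try (Connection c = DriverManager.getConnection("jdbc:derby:mydb");
                 Statement s = c.createStatement();
                 ResultSet rs = s.executeQuery("SELECT * FROM SYS.SYSSTATISTICS")) {
                boolean any = false;
                while (rs.next()) { any = true; }
                // empty until an index is created or the table is compressed
                System.out.println(any ? "statistics present" : "no statistics yet");
            }
        }
    }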
Re: Can we create BUFFERPOOL and TABLESPACE in derby
Derby does not have tablespaces. All tables are in one directory and thus must be on one logical disk. One could use the OS to spread that disk across multiple disks. One can also place the derby recovery log on a separate device. Derby's control over the buffer pool is by setting the maximum number of pages. It defaults to 1000 pages, and the pages can range from 4k to 32k. To set these, see 'setting the page cache size' in the Tuning Guide: http://db.apache.org/derby/docs/10.3/tuning/ctunperf54492.html Suman N wrote: Hi, Is it possible to create a BUFFERPOOL and TABLESPACE in Derby? If it is possible how should I do that? Thanks, Suman
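As a concrete sketch of the buffer pool controls described above - derby.storage.pageCacheSize and derby.storage.pageSize are the documented properties; the values and database name are illustrative:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class PageCacheExample {
        public static void main(String[] args) throws SQLException {
            // Both properties must be set before the Derby engine boots;
            // pageSize applies to tables/indexes created afterwards.
            System.setProperty("derby.storage.pageCacheSize", "4000"); // pages, default 1000
            System.setProperty("derby.storage.pageSize", "8192");      // bytes, 4k to 32k
            try (Connection conn = DriverManager.getConnection("jdbc:derby:myDB;create=true")) {
                // ... create tables and run the workload ...
            }
        }
    }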
Re: Memory Leak or Expected Behavior - BtreePostCommit
gregsmit wrote: Hi, We have an application that is using embedded Derby (10.3), where we do a lot of adds and deletes. When we run this application under our stress scenarios, we see a memory leak in Derby. When we look at our heap dumps, what we see is an ever growing number of these classes: org.apache.derby.impl.services.daemon.ServiceRecord org.apache.derby.impl.store.access.btree.BtreePostCommit I found some other documentation that said that the BtreePostCommit is a job that runs after a delete is committed, that frees space in a Btree, and that it requires a table lock. What I think is happening is that because we are running at a constant heavily stressed state, our deletes are occurring, then this job is being placed on a queue after the commit. But because we are in a heavily stressed state, these jobs never run, so the queue grows larger and larger. Does this theory sound right, or does anyone have a better explanation? OK, assuming it's right -- Is this the correct behavior? Should these Post Commit jobs continue to get queued, and never run? Is there a way we can force them to grab the locks and complete? Maybe something is wrong with the way that we are committing, that doesn't allow them to run? We do not have a lot of experience with Derby, so we may be doing something wrong. Thanks for any help, Greg There are a few things that you can make sure of in your application. Make sure to commit transactions; a single transaction without a commit that holds a lock on the btree will cause this problem. Does your stress application happen to make the table average around 0 rows (ie. add rows and then delete them all, ...)? This is likely the worst case for derby. Just for information, do you see the memory go down at the end of your stress test after you stop the active deletes? I have seen this behavior in stress tests where there are lots of users vs. the single post commit thread. Basically the post commit thread only gets a chance to run 1/number of threads amount of time, and if during that time any of the transactions are holding the lock then the work can't be done. Historically I have not seen this be a problem in a real customer application, only in stress tests. Usually in most applications there is some downtime where the post commit gets to catch up and then these items can be run and memory released. But of course there could be a real app that requires this constant throughput with no chance of time for the post commit. A couple of improvements could be made to derby in this area. The current post commit architecture was developed when the code was targeted at embedded applications likely to be running on small single processor machines with a likely small number of threads running. Now that cheap very fast dual cores are the norm for even the smallest laptop/deskside machine it might make sense to update the post commit thread code to recognize when it is falling behind and somehow increase its throughput (ie. either add more threads, maybe async execution, or perform more chunks of work before giving up its time ...).
Re: Memory Leak or Expected Behavior - BtreePostCommit
gregsmit wrote: Hi Mike, OK, sounds like our stress test fits in one of those worst-case-for-derby categories: Does your stress application happen to make the table average around 0 rows (ie. add rows and then delete them all, ...) Yes, this is exactly what our stress application does. It loops through 1) Add Entry, 2) List all Entries, 3) Delete Entry just added. It does this on 25 threads with no think time or pauses. Could you expand a little more on your test case (of course posting it would be best, but I understand usually in these stress things there is a lot more than just derby stuff going on)? o can you post ddl of table/indexes and maybe some description of usual data for keys in the btree. o what version of derby are you using? If I can check some intermediate changes into the trunk, can you test them? o where are the commits in the test with respect to the above tasks? o what is the isolation level of the above tasks. o could you describe exactly what jdbc you do for list all entries - is it a normal jdbc select loop off of select * from table? Just for information do you see the memory go down at the end of your stress test after you stop the active deletes? I don't know about this -- Unfortunately we usually have stopped everything after we start to see memory increase. I'll have to do some other tests to see if the memory drops back down if the stress stops. I'll change our test to put some pauses in there or something, so that these get a chance to run. Unfortunately, we need to go through all of these paths over and over to make sure we don't have a leak somewhere else. The more pauses we take, the longer we need to run to consider our code properly tested. I do agree with you -- It seems like there should be some way to force the post commit queue to be active if it gets too large, to prevent these it-looks-like-a-leak situations. I think this is a pretty common stress-testing scenario. We are running on a 2-way, and still have hit this, I guess you could call it, Post Commit Queue Starvation problem. Thanks for the help, Greg
Re: Memory Leak or Expected Behavior - BtreePostCommit
gregsmit wrote: Hi Mike, Sure -- be happy to help with this issue. The database and the queries are very simple. The reason is that we are not really testing a real app, but rather a web framework that we are building as a platform for other apps. So, our Demo App is the one that is being stress tested, so all of the interactions with it are very simple. o can you post ddl of table/indexes and maybe some description of usual data for keys in the btree. Here's what we use to create the database table: CREATE TABLE employees ( username varchar(32) NOT NULL, firstname varchar(16) NOT NULL, lastname varchar(16) NOT NULL, location varchar(64) NOT NULL, phonenumber varchar(16) NOT NULL, PRIMARY KEY (username) ); o what version of derby are you using? If I can check some intermediate changes into the trunk, can you test them? 10.3.1.4 Yes, I would be happy to test changes. It may take some time (to change our tests, and then it takes quite a while for the tests to run before we start to notice this leak) -- but absolutely. o where are the commits in the test with respect to the above tasks? There is a commit after each task. Each represents a REST call to a web app (a POST, a GET, and a DELETE). o what is the isolation level of the above tasks. Completely isolated, each being driven by a web app testing harness in another process. I meant the SQL isolation level. Derby's default isolation level is read committed, but it can be changed by either the application or possibly the framework you are running in. o could you describe exactly what jdbc you do for list all entries - is it a normal jdbc select loop off of select * from table? SELECT * FROM employees Let me know if there is anything else about our app that I can tell you. Also, just let me know, and I can try out fixes you put in trunk. Cheers, Greg
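For reference, a minimal sketch of the add/list/delete cycle described above against the employees table, with an explicit commit after each step so the post-commit reclaim work is never blocked by an open transaction; the database name and loop count are illustrative:

    import java.sql.*;

    public class AddListDeleteLoop {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection("jdbc:derby:testDB")) {
                conn.setAutoCommit(false);
                try (PreparedStatement ins = conn.prepareStatement(
                         "INSERT INTO employees VALUES (?, ?, ?, ?, ?)");
                     PreparedStatement del = conn.prepareStatement(
                         "DELETE FROM employees WHERE username = ?");
                     Statement list = conn.createStatement()) {
                    for (int i = 0; i < 1000; i++) {
                        String user = "user" + i;
                        ins.setString(1, user);
                        ins.setString(2, "First");
                        ins.setString(3, "Last");
                        ins.setString(4, "Somewhere");
                        ins.setString(5, "555-0100");
                        ins.executeUpdate();
                        conn.commit();            // commit the add
                        try (ResultSet rs = list.executeQuery("SELECT * FROM employees")) {
                            while (rs.next()) { /* read every row */ }
                        }
                        conn.commit();            // end the read transaction as well
                        del.setString(1, user);
                        del.executeUpdate();
                        conn.commit();            // this commit queues the btree reclaim work
                    }
                }
            }
        }
    }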
Re: Secure Delete
Ruben Misrahi wrote: We want to get security-certified and we need to know if deleting a record in Derby actually eliminates or physically overwrites that space, and/or if there is a way to do that. And more specifically, if I have a 200 byte field and I overwrite it with a 2 character field, can someone still get the data that was there? You ask about deleting, but your example is an update. Do you care about just updating, or both? Also, is your question about what happens after the delete/update operation commits, or after the statement but possibly before the commit? Once any operation happens there is no way through derby interfaces to get the old value. If you are concerned about bits/bytes left on pages that is a different story. For instance, in the case of a delete of a record: when the statement happens a lock is granted on the row and the row is marked deleted, but the data is left on the page; finally, after commit, the bytes are available for reuse sometime in the future but are not guaranteed to be overwritten/reused. Updates are more complicated depending on whether after the update the row is bigger or smaller and whether there was any internal reserved space associated with the row. Again, different stuff happens before commit and after. We're willing to pay for an answer. Please email to: [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] Thanks -- Ruben Misrahi
Re: Derby and character set encodings
This is mixing a lot of things up. I also may use the wrong terminology here. Character set encodings really only come into play with tools like ij and import, when getting strings from the environment into derby. The more standard interaction is using jdbc to load a java string into derby. At that level we don't do anything with encodings. We happen to use a modified utf8 to store stuff to disk, and this is not configurable. But no user interface should depend on this encoding, and Derby could change this storage in the future. Logically all strings at runtime are converted to standard java chars. Before 10.3 we always used standard java string comparison, which did a numerical comparison of the unicode values of chars to arrive at an ordering. That is still the default. In 10.3 an option was added to set territory-based collation when the database is created, such that comparison is dependent on the territory of the database. For this, standard java rule-based Collator interfaces are used. This is documented in the latest derby release. David Van Couvering wrote: Hi, all. I am getting some questions from Ken Frank of the NetBeans internationalization quality team about Java DB and character set encodings. Rather than try and play go-between, I'm including him here so he can directly ask any follow-on questions. Ken would like to understand how Derby makes use of character encodings, and how it is affected by various settings. How does Derby handle things if the encoding is set to something different from our default of UTF-8? Are we impacted, or do we rely on Java routines such as the Collator and Comparator classes to handle this? Sorry if I'm talking out my ear, i18n is not one of my fortes. Thanks, David
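A minimal sketch of the 10.3 option mentioned above; collation can only be chosen when the database is created, and the territory value and database name are illustrative:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class TerritoryCollationExample {
        public static void main(String[] args) throws SQLException {
            // territory plus collation=TERRITORY_BASED makes string comparisons
            // use locale-sensitive collation rules instead of the default
            // numerical comparison of unicode char values.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:derby:collDB;create=true;territory=fr_FR;collation=TERRITORY_BASED")) {
                // ... create tables; comparisons now follow fr_FR collation rules ...
            }
        }
    }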
Re: Any problems to run derby in USB drive ?
This leads me to believe that these drives are not correctly supporting sync-to-disk as required by derby. Derby uses java to sync the data and log files when necessary. If the devices don't correctly respond to these requests then a number of different types of corruption can result: checksum errors; mismatched index/base tables because log and data syncing is not correctly coordinated; ... My guess is that the device driver for the usb device responds to a write sync immediately, before actually forcing the data to the device. This is similar to the problem of setting write cache enabled on windows. Derby cannot guarantee a consistent database under these conditions. /mikem Francois Orsini wrote: Ivan, If you decide to run your application from a USB drive, try to make sure it is *unmounted* (on Unix) or safely removed in software on Windows, to ensure all data has been flushed to the USB drive. I have seen file checksum/corruption issues occasionally when removing a drive right after completing writes on it... Maybe something you can test as part of your application and on the targeted platforms. --francois On 7/6/07, *Ivan Ooi* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Hi all, I wanted to deploy my App with Derby on a USB drive but I'm not sure whether there are any restrictions or problems? Thanks
Re: Adding order by clause eliminates results
you may want to look at this recent discussion about updating an open cursor, in the development discussion group. http://thread.gmane.org/gmane.comp.apache.db.derby.devel/44225/focus=44299 Elliotte Harold wrote: Interesting. That does indeed fix it, and it does make some sense. What's really surprising here is that the SELECT before the INSERT ever worked. I'm still not sure if this qualifies as a Derby bug or not, but it does feel weird that adding ORDER BY to the query can change the result set. -- Elliotte Rusty Harold [EMAIL PROTECTED] mailto:[EMAIL PROTECTED]
Re: java.sql.SQLException: An internal error was identified by RawStore module.
NEVER, NEVER, NEVER delete transaction log files from a database that you ever hope to get a consistent view of again. If you are really getting a recovery error when trying to boot the database you may have to resort to extreme measures if you have no backup to go to, but anytime you delete a file from the derby data or log directory you have almost always ensured that the database will be corrupt. Sometimes you can get lucky and some of the tables are accessible and you can copy them out to a non-corrupt database. Hopefully you were only acting on a copy of the real database, and were just experimenting. Were you getting recovery errors on boot when attempting to connect to the database? Also, the best way to get more info is to post complete dumps of derby.log for the various errors you are encountering. Complete stack traces, with logStatementText enabled, give a pretty good idea what is going on. Note that the 'An internal error was identified by RawStore module.' message is almost always not the real error - some error before it caused this. I think fixes for this went into 10.2. BALDWIN, ALAN J [AG-Contractor/1000] wrote: Hi, I have a client's database that I cannot connect to. I assumed that it was a result of a transaction log corruption or some such thing since we have seen that in the past. As an attempt to salvage it, I tried deleting the two transaction logs in the log folder. Now, I can open it, but when I run a statement like SELECT * FROM Dealer, I first get a NullPointerException. Then if I execute that or any select statement again, I get this: java.sql.SQLException: An internal error was identified by RawStore module. Is there any way I'm going to be able to salvage this database? Does anyone have any tricks that I could try? Regards, -Alan Baldwin-
Re: java.sql.SQLException: An internal error was identified by RawStore module.
This derby.log is indeed a recovery boot error. It has the feel of the kind of error one would get if someone had deleted a log file from the log directory. I've also seen this kind of error if someone copies an active derby database without using derby backup to ensure a consistent set of database and log files. What is happening is that recovery is booting, trying to REDO work in the log. The following error is the key one: ERROR XSDB4: Page Page(1,Container(0, 560)) is at version 575, the log file contains change version 1,052, either there are log records of this page missing, or this page did not get written out to disk properly. It has found a log record with version 1,052 of the page but the page on disk is only at version 575, so there are 477 changes to the page missing from the recovery log files. Given the number of missing records for this page, there are going to be big problems with any attempt to get this table back. You may want to try using 10.2.2 or even the latest build of the 10.2 branch, but I would not expect it to fix this recovery error. It may be better at trying to access the copy of the corrupt db with log files deleted; I thought fixes had gone in to solve the bad error handling for null pointers which sometimes led to the RawStore module error. The intended solution to problems of this type is to recover from backup. But I assume since you are here, there is no backup. You can try the path you are already on and try to access a copy of the db to data mine the corrupt db, if there is no other way to get the data. You may want to post the derby.log from when you tried to do that. BALDWIN, ALAN J [AG-Contractor/1000] wrote: Yes, I have the original database with transaction logs intact. Attached is derby.log. The error is what happens when I try to connect using the original database with no modifications. I'm using cloudscape workbench to try to connect. Also, the version of Derby I'm using is 10.2.1.6, not 10.1.x as I mentioned before. That was a typo. Thanks, -Alan-
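Since recovery from backup is the intended solution, here is a minimal sketch of Derby's documented online backup procedure; the database name and destination directory are illustrative:

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class BackupExample {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection("jdbc:derby:myDB");
                 CallableStatement cs = conn.prepareCall(
                     "CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE(?)")) {
                // Writes a consistent copy of the data and log files that can be
                // booted directly if the original is ever corrupted.
                cs.setString(1, "/backups/derby");
                cs.execute();
            }
        }
    }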
Re: tuning/storage questions
Jim Newsham wrote: I'm reading the tuning guide and trying to eke out a little more performance for our app by tweaking storage parameters. The properties our database exhibits which seem potentially important to storage tuning are: I'll answer below, but what I usually say in these cases is that you are starting too low. I would not expect more than a 10% change from any of this low-level stuff. I would look more at the queries that are going too slowly and see if the right indexes exist for those queries. Often you can get orders-of-magnitude changes in this area. Also, you don't say if you are using embedded vs. network. Returning large result sets is likely to go much faster in embedded vs. network. - tables with small records (typically 3 fields; usually fixed size, but some have varchar) - few, but large (number of records per) tables - frequent inserts - once inserted, data will likely not change (it's possible for an update operation, but in practice may never happen) - several indexes per table - queries usually involve joins and are usually selective - some queries return somewhat large result sets - queries currently take longer than we would like 1. Because updates will be very rare, I think it makes sense to set pageReservedSpace to 0%. If an update occurs which expands the size of a row (unlikely but possible), there will be a performance hit (I assume during both the update and subsequent reads of this record, because the documentation implies that part of the row will be written to an overflow page). But my pages will hold 25% more data, which will be a big win for the normal case. Reasonable? Should be completely reasonable for tables with only char columns, and probably reasonable for varchar columns. Your interpretation of the impact is right on. Derby, as zero-admin, should already optimize the space for all fixed-length-column tables, but it doesn't. I just filed DERBY-2746 for this feature - you may want to vote on it or help do the work to implement it if you think it would be useful. 2. Page size is a little less straightforward. The following imply page size should be small: - not storing large objects - small columns/records - selective queries The following imply page size should be large: - tables with large numbers of records - inserts only (?? The documentation says read-only apps should use a large page size. I'm not quite sure of the reasoning for this. Would this also apply to an app such as mine which only performs inserts but not updates?) The documentation says a large page size for indexes is a big performance gain. I was wondering if it makes sense to have small pages for all tables, and large pages for all indexes. Does having heterogeneous page sizes have any adverse effects on the page cache? Does having mismatched page sizes between a table and its indexes have any negative impact? There may be a garbage collection impact on the cache. The current cache handles heterogeneous page sizes fine, but when replacing a page where the size is the same it reuses a page-size array, while if it is different it throws away the old array and allocates a new one. 3. What is the page cache eviction policy? Is it least-frequently-used? Do indexes get preferential treatment? Since some queries pull a large amount of data, I think this could potentially repopulate the entire cache with data which may not be used again. Does this mean it would be wasteful to increase the page cache size?
It is basically LRU, so if your queries are using indexes the index pages probably stay in the cache. But a scan that does not use the indexes and is bigger than the cache may throw out everything. Thanks, Jim Newsham
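To make the discussion concrete, a minimal sketch of the storage properties in question; they take effect for tables and indexes created after they are set, and every value, table, and index name here is illustrative:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class StorageTuningExample {
        public static void main(String[] args) throws SQLException {
            // Point 1 above: 0% reserved space packs more rows per page, at the
            // cost of overflow pages if a rare update ever expands a row.
            System.setProperty("derby.storage.pageReservedSpace", "0");
            System.setProperty("derby.storage.pageSize", "4096"); // small pages for tables
            try (Connection conn = DriverManager.getConnection("jdbc:derby:myDB;create=true");
                 Statement s = conn.createStatement()) {
                s.executeUpdate("CREATE TABLE readings (id INT, t BIGINT, v DOUBLE)");
                // Point 2 above: switch to large pages before creating the index.
                s.execute("CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY(" +
                          "'derby.storage.pageSize', '32768')");
                s.executeUpdate("CREATE INDEX readings_t ON readings(t)");
            }
        }
    }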
Re: Derby db - need to disable improved performance
Is your app single-threaded? If so, group commit is not the issue. What is your OS? What is your JVM? Derby may use different syncing algorithms depending on the JVM version. How did you measure synchronous writes, ie. did you write a java program and execute it against the same JVM as derby is running in? The disk that contains the log directory is the one of interest. Each transaction is made up of a number of log records. From your description each transaction will have the following: begin log record, insert log record for the row into the base table, insert log record for the row into the primary key index, commit log record. yarono wrote: Hello, I'm working on a simple db. Each record is composed of 3 long values. The first two are the primary key. I have to measure the performance of the insertions. Each insertion is wrapped in a transaction, which is committed having only one insertion in it. I've measured both berkeley db performance and postgres and got about 110-115 insertions per second. Now in derby db (both in embedded mode and server mode) I get better performance: about 250-300 insertions per second. This obviously results from some kind of a group commit, although I get these results both when auto-committing and when manually committing after each insertion. I've performed a simple test of synchronously writing 24 bytes (3 * 8 bytes) to the disk. It measured 117 writes per second, and I believe this is the upper bound of any db performance. So, I don't understand why I get such good performance, although I commit after each insertion. I examined the .dat files in both /log and /seg0 folders. None of them increase in 24 byte segments, but rather bigger segments. So, my questions are: 1. Which log file in /log or /seg0 should I examine to analyze the number of bytes written on each write to disk? 2. How do I disable the group commit or whatever attribute that causes this communal write? How do I make each transaction be written on its own to the disk? Thanks in advance, Yaron
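For comparison, a minimal sketch of the kind of synchronous-write test described above: a RandomAccessFile opened in "rws" mode forces each write through to the device; the file name and iteration count are illustrative:

    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class SyncWriteTest {
        public static void main(String[] args) throws IOException {
            byte[] record = new byte[24]; // 3 * 8-byte longs, as in the schema above
            int n = 1000;
            try (RandomAccessFile raf = new RandomAccessFile("sync.dat", "rws")) {
                long start = System.currentTimeMillis();
                for (int i = 0; i < n; i++) {
                    raf.write(record); // "rws" syncs content and metadata per write
                }
                long ms = System.currentTimeMillis() - start;
                System.out.println((n * 1000.0 / ms) + " synced writes/sec");
            }
        }
    }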
Re: Embedded thread taking 100% cpu
Are you saying that if you stop your loop and do nothing, you still see the cpu spinning at 100% forever? The stack traces are what I would expect, and they may spin at 100% cpu for a short time depending on the size of the table and the cache size, as they may require no I/O. Anytime you delete all rows from a page, as it seems your step 4 does, work is queued to be activated when that transaction commits. At step 5 the work is processed; it basically reclaims the space freed by the committed deletes and makes it available for future inserts into the table. Adam Bovill wrote: Hello, I seem to be having a problem with my embedded derby derby.rawStoreDaemon thread where it takes up 100% of my CPU. Everything else is still responsive, but this thread seems to be looping through the following stack traces. Also, as soon as I take any additional action on the Database, it returns to normal, so it seems that I have done something to get it into a strange state. The following stack traces were taken by halting that thread multiple times; they are not in any order. So this happens after repeating the following multiple times: 1. Add items to two tables. (3 items into one, 15 into the other) 2. Set autocommit = false 3. force commit 4. attempt to delete all the items (and they are successfully deleted) 5. force commit 6. set autocommit = true I set the autocommit to false so that I can roll back the changes if the delete fails. It seems to just sit there and spin doing nothing. Have I done something wrong with my commit logic? Thanks, Adam Thread [derby.rawStoreDaemon] (Suspended) LockControl.addLock(LockSet, Object, Object) line: not available LockSet.lockObject(Object, Lockable, Object, int, Latch) line: not available SinglePool.lockAnObject(Object, Object, Lockable, Object, int, Latch) line: not available SinglePool.lockObject(Object, Object, Lockable, Object, int) line: not available ContainerLocking3.lockContainer(Transaction, ContainerHandle, boolean, boolean) line: not available BaseContainerHandle.useContainer(boolean, boolean) line: not available BaseDataFileFactory.openContainer(RawTransaction, ContainerKey, LockingPolicy, int, boolean) line: not available BaseDataFileFactory.openContainer(RawTransaction, ContainerKey, LockingPolicy, int) line: not available InternalXact(Xact).openContainer(ContainerKey, LockingPolicy, int) line: not available OpenHeap(OpenConglomerate).init(ContainerHandle, Conglomerate, int[], TransactionManager, Transaction, boolean, int, int, LockingPolicy, DynamicCompiledOpenConglomInfo) line: not available Heap.open(TransactionManager, Transaction, boolean, int, int, LockingPolicy, StaticCompiledOpenConglomInfo, DynamicCompiledOpenConglomInfo) line: not available RAMTransaction.openConglomerate(Conglomerate, boolean, int, int, int, StaticCompiledOpenConglomInfo, DynamicCompiledOpenConglomInfo) line: not available RAMTransaction.openConglomerate(long, boolean, int, int, int) line: not available B2I.lockTable(TransactionManager, int, int, int) line: not available BTreePostCommit.performWork(ContextManager) line: not available BasicDaemon.serviceClient(ServiceRecord) line: not available BasicDaemon.work(boolean) line: not available BasicDaemon.run() line: not available Thread.run() line: not available Thread [derby.rawStoreDaemon] (Suspended) OpenHeap(OpenConglomerate).init(ContainerHandle, Conglomerate, int[], TransactionManager, Transaction, boolean, int, int, LockingPolicy, DynamicCompiledOpenConglomInfo) line: not available Heap.open(TransactionManager, Transaction,
boolean, int, int, LockingPolicy, StaticCompiledOpenConglomInfo, DynamicCompiledOpenConglomInfo) line: not available RAMTransaction.openConglomerate(Conglomerate, boolean, int, int, int, StaticCompiledOpenConglomInfo, DynamicCompiledOpenConglomInfo) line: not available RAMTransaction.openConglomerate(long, boolean, int, int, int) line: not available B2I.lockTable(TransactionManager, int, int, int) line: not available BTreePostCommit.performWork(ContextManager) line: not available BasicDaemon.serviceClient(ServiceRecord) line: not available BasicDaemon.work(boolean) line: not available BasicDaemon.run() line: not available Thread.run() line: not available Thread [derby.rawStoreDaemon] (Suspended) BaseContainerHandle.useContainer(boolean, boolean) line: not available
Re: Embedded thread taking 100% cpu
Another question. Is there a chance that your application is still holding a lock on the table after you stop the loop, maybe a missed commit? Adam Bovill wrote: Well, I cannot attest to forever, but they have been running at 100% for the last 8 hours. It happens after repeating these steps for 2-4 hours. Is there a way to trace all of the actual commands that are sent to the db? I can also provide my schema if that helps. These are two separate tables, with no foreign keys. Thanks, Adam
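On tracing the actual commands: a minimal sketch using the statement-logging property mentioned in these threads. It must be set before the engine boots, the output goes to derby.log, and the database name is illustrative:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class StatementLoggingExample {
        public static void main(String[] args) throws SQLException {
            // Each executed statement is then written to derby.log with its
            // transaction id, which helps spot a missed commit.
            System.setProperty("derby.language.logStatementText", "true");
            try (Connection conn = DriverManager.getConnection("jdbc:derby:myDB")) {
                // ... run the workload under observation ...
            }
        }
    }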
Re: Exception getting columns from table
As with all problem reports, could you say what exact version of Derby you are using? There were 2 different 10.2 releases; the latest was 10.2.2.0. Also, just to verify, do you expect your database to be read-only? There have been problems in the past with soft upgrade and stored prepared statements - DERBY-541. Stanley Bradbury wrote: Andy Stewart wrote: Hi, can anyone help? We were using derby 10.1 but have since upgraded to 10.2. Any databases created against version 10.1 work fine in our app against version 10.2 of derby. However when we try and connect to a database that has been created in 10.2 we get the following exception. Does anyone know why this is and what the solution is? Cheers, Andy Caused by: java.sql.SQLException: Container was opened in read-only mode. at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source) at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source) at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source) at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source) at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedPreparedStatement.init(Unknown Source) at org.apache.derby.impl.jdbc.EmbedPreparedStatement20.init(Unknown Source) at org.apache.derby.impl.jdbc.EmbedPreparedStatement30.init(Unknown Source) at org.apache.derby.jdbc.Driver30.newEmbedPreparedStatement(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.prepareMetaDataStatement(Unknown Source) at org.apache.derby.impl.jdbc.EmbedDatabaseMetaData.prepareSPS(Unknown Source) at org.apache.derby.impl.jdbc.EmbedDatabaseMetaData.getPreparedQueryUsingSystemTables(Unknown Source) at org.apache.derby.impl.jdbc.EmbedDatabaseMetaData.getPreparedQuery(Unknown Source) at org.apache.derby.impl.jdbc.EmbedDatabaseMetaData.getPreparedQuery(Unknown Source) at org.apache.derby.impl.jdbc.EmbedDatabaseMetaData.doGetCols(Unknown Source) at org.apache.derby.impl.jdbc.EmbedDatabaseMetaData.getColumns(Unknown Source) Hi Andy - Derby will open a database as read-only when it can't create the db.lck file when the database is booted. Check for ownership and permission problems with the 10.2 database directories. Does this message show up in the derby.log file as well?
Re: SQL Exception: Container xxx not found
Are the two threads sharing the same Connection object (and thus the same transaction)? If not, I also would have expected a lock wait call. There may be a problem with dependency management. Are you using bulk load to load an empty table? /mikem Parker, Patricia (LNG-CON) wrote: Hi Stanley, I disabled the cache and did more analysis of the actual timing of the events, and it appears that the problem is that the bulk load CALL SYSCS_UTIL.SYSCS_IMPORT_DATA (?,?,?,?,?,?,?,?,?) is not entirely finishing before I try to query the table it is populating. I would expect to get a locking exception, or something of that nature, but the SQL error code that gets returned is either XSCB1 or XSCH1, both of which come with the message 'container containerName not found'. I think you're right that it's a Windows caching or heap issue. I am getting around the problem by trapping for those two error codes, putting my thread to sleep for a while, then retrying. It seems to work fine that way. An improvement I would suggest is having SYSCS_UTIL.SYSCS_IMPORT_DATA indicate when it is done loading. As it stands now, the thread I have running it finishes before the utility finishes, and I have no elegant way of knowing when the load is done, other than trying to use the table and trapping for errors, as stated above. Thanks for your help! Patty -Original Message- From: Stanley Bradbury [mailto:[EMAIL PROTECTED] Sent: Thursday, May 10, 2007 3:18 PM To: Derby Discussion Subject: Re: SQL Exception: Container xxx not found Hi Patty - I fear this will be a hard one to solve without a reproducible test case. Before going that route (I assume it will be difficult to provide a test case) I suggest seeing if this might be system-related. The oddest thing is that the Exception is not in the derby.log along with the SQLState and stack trace. Missing data (be it exception information or data containers) could be caused by in-memory problems or possibly I/O caching. Windows write caching is enabled by default and can cause problems in I/O-intensive applications. Please disable this and see if the problem persists. This can be disabled on XP from the disk drive/properties/hardware tab/policy tab - more information can be found, along with a warning, at: http://support.microsoft.com/kb/259716 If the problem still happens, would you run your tests using a different JVM and/or machine to see if this corrects or affects the problem. If none of these actions resolve the problem then please post the problem to derby-dev for additional ideas. Parker, Patricia (LNG-CON) wrote: Hi Stanley, Thanks for your reply. I am running Derby 10.2.2 in embedded mode on the Windows XP platform in Eclipse. I have attached both the derby.log file and a snapshot of my Eclipse console, since the actual SQL error shows up there on line 231. The corresponding messages in derby.log occur around line 185, when the rollback starts to happen. The only mention I have seen of a similar problem in Derby was in the mail archives here... http://mail-archives.apache.org/mod_mbox/db-derby-user/200702.mbox/%3C4 [EMAIL PROTECTED] The response to the above mail seemed to imply that Derby may lose track of tables, so I have been trying different timings of when I define my tables in relation to when I do the inserts. Here is more detail on my process. I am running a graph process, where a node process runs on its own thread and bulk loads a bunch of data from a flat file into a table that it defines on-the-fly using some XML that describes the data in the flat file.
After the node process thread completes, a connector thread runs that applies a filter to all the data that was just bulk loaded and inserts the filtered data into another brand new, defined on-the-fly table. This is done by performing an INSERT INTO new connector table SELECT * FROM table created for node WHERE some filter statement. Since this is a graph process, many node threads may be processing at once, and many connector threads may also be processing concurrently. I sometimes have noticed some deadlocking between connectors, but it seems to recover and move on ok. My problem happens whether or not a deadlock message occurs. My bulk loads have been working fine in the nodes, but I believe the problem may occur when my program goes to access the bulk loaded data to use as data to do the inserts into my new tables in the connectors. Right now, I define my tables right before loading/inserting into them. I have tried defining all my tables up front before running the bulk loads and inserts. I have also tried sharing instances of the DriverManager, but that did not seem to help. I now get a new DriverManager each time I do a bulk load or insert, as the log will show. I am currently using the isolation TRANSACTION_READ_COMMITTED, but I have had the same issue with TRANSACTION_SERIALIZABLE. Thanks again for
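For reference, a minimal sketch of the nine-argument import call quoted above, committed on the loading connection once the call returns; the schema, table, and file names are illustrative, and nulls select the documented defaults:

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class ImportExample {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection("jdbc:derby:graphDB")) {
                conn.setAutoCommit(false);
                try (CallableStatement cs = conn.prepareCall(
                        "CALL SYSCS_UTIL.SYSCS_IMPORT_DATA(?,?,?,?,?,?,?,?,?)")) {
                    cs.setString(1, "APP");             // schema name
                    cs.setString(2, "NODE_DATA_1");     // target table (illustrative)
                    cs.setString(3, null);              // insert column list, null = all
                    cs.setString(4, null);              // column indexes, null = all
                    cs.setString(5, "/data/node1.dat"); // source flat file
                    cs.setString(6, null);              // column delimiter, null = ','
                    cs.setString(7, null);              // character delimiter, null = '"'
                    cs.setString(8, null);              // codeset, null = platform default
                    cs.setShort(9, (short) 0);          // 0 = append, non-zero = replace
                    cs.execute();
                }
                conn.commit(); // make the loaded rows visible to other transactions
            }
        }
    }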
Re: Derby Errors with Apache Jackrabbit
The problem looks like Derby is encountering a bad pointer while traversing a long column chain - likely a blob or clob column. The thing to narrow down is whether this is a static problem in the existing db, or some runtime issue with multiple threads of control accessing (read/write) the data. I have seen these types of issues in multi-threaded apps where multiple threads use the same connection (ie. same transaction) and one thread is in a select loop on a large object while another thread updates the same object, breaking the linkage in the middle. Locking will prevent this if the threads use different transactions, but not when they use the same transaction. Also, are you using the embedded or network version? Can you reproduce the error, single threaded, on a copy of the db? You can use logStatementText to get the exact value of ? in that statement if you have some jackrabbit script that reproduces the problem. Does anyone know how hard it would be to get the error statement to log the parameters of the statement to derby.log? The derby code should be checking for the -1 return, but I am not sure that will help you as I think it doesn't expect it. Steven Huey wrote: Hi, I'm using Derby 10.2.1.6 with Apache Jackrabbit 1.2.3 and just started getting the following error in our derby.log file, which is rendering our Jackrabbit repository inaccessible. To date we've had no problems with Derby and it's been maintenance-free, so I'm not sure how to proceed to resolve the errors; any help is greatly appreciated. Database Class Loader started - derby.database.classpath='' 2007-05-02 17:41:12.805 GMT: Booting Derby version The Apache Software Foundation - Apache Derby - 10.2.1.6 - (452058): instance 6074401f-0112-4de1-629c-00150808 on database directory /opt/jcr/repository/workspaces/default/db Database Class Loader started - derby.database.classpath='' 2007-05-02 17:41:59.626 GMT Thread [DefaultQuartzScheduler_Worker-1,5,main] (XID = 28872), (SESSIONID = 0), (DATABASE = /opt/jcr/repository/workspaces/default/db), (DRDAID = null), Cleanup action starting 2007-05-02 17:41:59.626 GMT Thread [DefaultQuartzScheduler_Worker-1,5,main] (XID = 28872), (SESSIONID = 0), (DATABASE = /opt/jcr/repository/workspaces/default/db), (DRDAID = null), Failed Statement is: select NODE_DATA from DEFAULT_NODE where NODE_ID = ?
java.lang.ArrayIndexOutOfBoundsException: -1 at org.apache.derby.impl.store.raw.data.BasePage.getHeaderAtSlot(Unknown Source) at org.apache.derby.impl.store.raw.data.StoredPage.restorePortionLongColumn(Unknown Source) at org.apache.derby.impl.store.raw.data.OverflowInputStream.fillByteHolder(Unknown Source) at org.apache.derby.impl.store.raw.data.OverflowInputStream.init(Unknown Source) at org.apache.derby.impl.store.raw.data.StoredPage.readRecordFromArray(Unknown Source) at org.apache.derby.impl.store.raw.data.StoredPage.restoreLongRecordFromSlot(Unknown Source) at org.apache.derby.impl.store.raw.data.StoredPage.restoreRecordFromSlot(Unknown Source) at org.apache.derby.impl.store.raw.data.BasePage.fetchFromSlot(Unknown Source) at org.apache.derby.impl.store.access.conglomerate.GenericConglomerateController.fetch(Unknown Source) at org.apache.derby.impl.sql.execute.IndexRowToBaseRowResultSet.getNextRowCore(Unknown Source) at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(Unknown Source) at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.getNextRow(Unknown Source) at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(Unknown Source) at org.apache.derby.impl.jdbc.EmbedResultSet.next(Unknown Source) at org.apache.jackrabbit.core.persistence.db.DatabasePersistenceManager.load(DatabasePersistenceManager.java:339) at org.apache.jackrabbit.core.state.SharedItemStateManager.loadItemState(SharedItemStateManager.java:1101) at org.apache.jackrabbit.core.state.SharedItemStateManager.getNonVirtualItemState(SharedItemStateManager.java:1028) at org.apache.jackrabbit.core.state.SharedItemStateManager.getItemState(SharedItemStateManager.java:250) at org.apache.jackrabbit.core.state.LocalItemStateManager.getNodeState(LocalItemStateManager.java:95) at org.apache.jackrabbit.core.state.LocalItemStateManager.getItemState(LocalItemStateManager.java:150) at org.apache.jackrabbit.core.state.XAItemStateManager.getItemState(XAItemStateManager.java:226) at org.apache.jackrabbit.core.state.SessionItemStateManager.getItemState(SessionItemStateManager.java:177) at org.apache.jackrabbit.core.HierarchyManagerImpl.getItemState(HierarchyManagerImpl.java:120) at org.apache.jackrabbit.core.HierarchyManagerImpl.resolvePath(HierarchyManagerImpl.java:267) at org.apache.jackrabbit.core.CachingHierarchyManager.resolvePath
Re: Derby Errors with Apache Jackrabbit
Mike Matrigali wrote: The problem looks like Derby is encountering a bad pointer while traversing a long column chain - likely a blob or clob column. ... You can use logStatementText to get the exact value of ? in that statement if you have some jackrabbit script that reproduces the problem. If you can't repro single threaded on a copy of the db, it would be interesting to know what isolation level the fetch is happening in. The default is read committed. I would have to think about whether read uncommitted could cause a problem in multi-user mode.
Re: commit then read immediately
frederic barachant wrote: Craig L Russell wrote: Hi, Are the reader and writer in different processes? Then caching is definitely on your list of things to be investigated. If there's no caching, I can't think of anything that might cause this. Another possibility is that you are not managing relationships correctly (on both sides of the relationship), and the access pattern relies on navigating relationships instead of querying or finding instances by key. I do not think that is the case. Remember that a simple sleep after commit changes/'corrects' the behavior. No other code change is needed. Is there a way to prevent this if it is a known behavior? It's unlikely that this is a Derby issue. That was my first guess, as no one would use it if it were the case. :) I also can't think of any way that this can be a Derby issue. Once Derby executes a change to the data we actually have no previous copy of the data around (we have no versioning support to get at previous copies of the data). This sounds like some cache in the software between derby and your application. Have you tried multiple JPA implementations? Not yet. Have you asked this same question on your JPA implementation's forum? Not yet. I wanted to start at the lowest level first. This will be my next step. Good luck, Thank you, i will need it.
Re: Drop Table
tbrazao wrote: Hi, I'm a new user of derby. My question is about the sql idea of 'drop table if exists'. I read some explanations in this forum, but I didn't find answers that work in the sql command line. I know that derby doesn't support this, but I want to ensure that if a table exists it is dropped. So I think we can use something like... select count(*) from SYS.SYSTABLES where TABLENAME = tableName, where tableName could be an arg for a procedure, but I don't know how to relate this with a drop call. It would be simple if we had an if clause. The problem is that I need to do this without java syntax; I know that seems strange, but life is strange and nobody cares about it... If anyone could help, thanks! And sorry for my English! Why not just drop the table and let derby return an error if it did not exist?
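Following the suggestion above, a minimal sketch that attempts the drop and ignores only the table-does-not-exist error; the helper name is hypothetical and 42Y55 is the SQLState Derby returns when the table is missing:

    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class DropIfExists {
        public static void dropTableIfExists(Connection conn, String table) throws SQLException {
            try (Statement s = conn.createStatement()) {
                s.executeUpdate("DROP TABLE " + table); // table name assumed trusted
            } catch (SQLException e) {
                if (!"42Y55".equals(e.getSQLState())) {
                    throw e; // a real failure, not just a missing table
                }
            }
        }
    }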
Re: Store very large Strings
Ruben Fonseca wrote: On Fri, 2007-04-13 at 11:31 -0700, Raymond Kroeker wrote: Hi Ruben, The connection object has a setAutoCommit(boolean) api. If you can easily reference it and are not running within a transaction manager framework, set it to false for the method call. Ok, this really solved my problem. THANK YOU. Now my second question is: is this the expected behaviour? Is it really necessary to disable AutoCommit when dealing with large BLOBs in Derby? No, but jdbc and derby only guarantee the lifetime of a blob for the current transaction, so if your application processes the whole blob before the end of the transaction then it should work. The problem is that many applications tend to execute another statement on the same connection before finishing with the blob; this other statement causes the connection to commit under autocommit and leads to inconsistent results, as the application is now accessing an invalid blob. Ruben
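A minimal sketch of the pattern discussed above - autocommit off while the large value is streamed, committing only afterwards; the table and column names are illustrative:

    import java.io.InputStream;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class BlobReadExample {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:derby:myDB")) {
                conn.setAutoCommit(false); // keep the transaction open while reading
                try (Statement s = conn.createStatement();
                     ResultSet rs = s.executeQuery("SELECT data FROM documents")) {
                    while (rs.next()) {
                        try (InputStream in = rs.getBinaryStream(1)) {
                            byte[] buf = new byte[8192];
                            while (in.read(buf) != -1) { /* process each chunk */ }
                        }
                    }
                }
                conn.commit(); // end the transaction only after the blobs are consumed
            }
        }
    }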
Re: Asychronous Replication
Chabot, Jerry wrote: I'm considering options for asynchronous replication. A low-cost option is to freeze the source and target databases, synchronize the files using rsync (one way), and unfreeze the databases. Freezing is acceptable given the expected usage. My concern is that existing clients of the target database may see stale data. Is there a need to flag the target clients that the files have changed? Does anyone see any technical issues with this approach? -Jerry
It won't work. From the limited description I am assuming you are hoping to allow both the source and target to remain open but inactive during the process. The state of an open database is maintained by its in-memory cache, data files, and log files. If you just copy the data and log from source to target and then unfreeze the target, you have clobbered the association between the target's in-memory cache and the data/log files. If freezing the target could include shutting down all connections, then a cheap solution would be to use the Derby-approved backup mechanism to back up the data files and the logs to the target, and then connect to the backup at the target using a normal connection. Since the database would have been closed beforehand, this allows a normal boot, which applies the log and brings up a consistent database. This will of course overwrite everything that was at the target.
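For the "Derby-approved backup mechanism" mentioned above, a minimal sketch using Derby's built-in online backup procedure; the backup directory is whatever target path you choose:

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.SQLException;

    public class DerbyBackup {
        // Takes a consistent online backup of the source database into
        // backupDir. The copy can then be booted normally at the target,
        // which replays the log and comes up consistent.
        static void backup(Connection conn, String backupDir) throws SQLException {
            try (CallableStatement cs = conn.prepareCall(
                    "CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE(?)")) {
                cs.setString(1, backupDir);
                cs.execute();
            }
        }
    }

Unlike a raw rsync of live files, the backup procedure blocks writes for the duration of the copy, so the data and log it produces are mutually consistent.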
Re: [n00b] Recovering from program crashes
Stanley Bradbury wrote: Hi John - When using a 1.4.2 or higher JVM and the default Derby property set, you should not need to manually remove the db.lck file. To reiterate: if your JVM is 1.4.2 or higher, none of the below should apply. Unless there is a bug, Derby will only lock you out if an existing JVM still has a lock on that file. There should be no OS/JVM/machine variations if you have version 1.4.2 or higher and are seeing this message.
Can you describe what it means that your program crashed? Did the JVM that was running it go down? Are you using the network server? If so, is it possible your client program crashed but left the network server up and still connected to the database?
The message you are getting indicates that the db.lck file exists AND a process has the file locked. Generally, when an embedded Derby application aborts, the JVM aborts too, thus releasing the file lock. The next time database access is attempted, the db.lck file is found but is not locked, so Derby removes the old file, creates a new one, and boots the database. The property derby.database.forceDatabaseLock changes this behavior, so unset it if it is set [on some platforms, if set to true, it prevents Derby from booting a database if a db.lck file is present in the database directory]. Otherwise, check whether a JVM process is still running after the application crashes. If so, you need to either reattach to that process or kill it so that the file lock is released. HTH
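A minimal sketch of clearing the property mentioned above before the engine boots; it can equally be set or removed in derby.properties under derby.system.home:

    public class ForceLockCheck {
        public static void main(String[] args) {
            // Make sure nothing has turned on strict lock-file enforcement;
            // on some platforms a leftover db.lck would then block booting
            // even after a clean crash.
            System.clearProperty("derby.database.forceDatabaseLock");
            // ... load org.apache.derby.jdbc.EmbeddedDriver and connect as usual ...
        }
    }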
Re: Alter Table ADD Column throws column number exception
add constraint FK_ORDERDETAIL_ORDERLINETYPE foreign key (ORDERLINETYPEID) references ORDERLINETYPE (ORDERLINETYPEID); alter table ORDERDETAIL add constraint FK_ORDERDETAIL_PRODUCT foreign key (PRODUCTUPC, SEEDYEAR) references PRODUCT (PRODUCTUPC, SEEDYEAR); alter table ORDERDETAIL add constraint FK_ORDERDETAIL_PRODUCTORDER foreign key (ORDERID) references PRODUCTORDER (ORDERID); alter table ORDERDETAIL add constraint FK_ORDERDETAIL_SEEDPROC foreign key (PROCESSORID) references SEEDPROCESSORROLE (SEEDPROCESSORROLEID); alter table ORDERDISCOUNT add constraint FK_ORDERDISCOUNT_DISCOUNT foreign key (DISCOUNTID) references DISCOUNT (DISCOUNTID); alter table ORDERDISCOUNT add constraint FK_ORDERDISCOUNT_PRODUCTORDER foreign key (ORDERID) references PRODUCTORDER (ORDERID); alter table ORDERLINEDELIVERY add constraint FK_ORDERLINEDELIVERY_DEALER foreign key (AFFILIATEIRDACCOUNTID) references DEALER (IRDACCOUNTID); alter table ORDERLINEDELIVERY add constraint FK_ORDERLINEDELIV_ORDERDETAIL foreign key (ORDERITEMNUMBER, ORDERID) references ORDERDETAIL (ORDERITEMNUMBER, ORDERID); alter table ORDERPAYMENT add constraint FK_ORDERPAYMENT_PAYMENT foreign key (PAYMENTID) references PAYMENT (PAYMENTID); alter table ORDERPAYMENT add constraint FK_ORDERPAYMENT_PRODUCTORDER foreign key (ORDERID) references PRODUCTORDER (ORDERID); alter table ORDERSHARE add constraint FK_ORDERSHARE_CUSTOMER foreign key (CUSTOMERID) references CUSTOMER (CUSTOMERID); alter table ORDERSHARE add constraint FK_ORDERSHARE_PRODUCTORDER foreign key (ORDERID) references PRODUCTORDER (ORDERID); alter table PAYMENT add constraint FK_PAYMENT_CUSTOMER foreign key (CUSTOMERID) references CUSTOMER (CUSTOMERID); alter table POSQUEUE add constraint FK_POSQUEUE_DATAEXCHTYPE foreign key (EXCHANGETYPERESOURCEID) references DATAEXCHANGETYPE (EXCHANGETYPERESOURCEID); alter table POSQUEUE add constraint FK_POSQUEUE_ORDRDETAIL foreign key (ORDERITEMNUMBER, ORDERID) references ORDERDETAIL (ORDERITEMNUMBER, ORDERID); alter table POSTRANSACTIONHISTORY add constraint FK_POSTRANSHIST_DATAEXCHTYP foreign key (EXCHANGETYPERESOURCEID) references DATAEXCHANGETYPE (EXCHANGETYPERESOURCEID); alter table PRICE add constraint FK_PRICE_PRICEAREA foreign key (PRICEAREANAME) references PRICEAREA (PRICEAREANAME); alter table PRICE add constraint FK_PRICE_PRODUCT foreign key (PRODUCTUPC, SEEDYEAR) references PRODUCT (PRODUCTUPC, SEEDYEAR); alter table PRICECOUNTY add constraint FK_PRICECOUNTY_PRICEAREA foreign key (PRICEAREANAME) references PRICEAREA (PRICEAREANAME); alter table PRICECOUNTY add constraint FK_PRICECOUNTY_STATE foreign key (STATECODE) references STATE (STATECODE); alter table PRODUCT add constraint FK_PRODUCT_CROP foreign key (CROPID) references CROP (CROPID); alter table PRODUCT add constraint FK_PRODUCT_PRODBRANDS foreign key (BRANDNAME) references PRODUCTBRANDS (BRANDNAME); alter table PRODUCT add constraint FK_PRODUCT_UNITOFMEASURE foreign key (UNITOFMEASUREID) references UNITOFMEASURE (UNITOFMEASUREID); alter table PRODUCTORDER add constraint FK_PRODORDER_CUSTDELIVCONTACT foreign key (DELIVERYCUSTOMERID, DELIVERYCONTACTID) references CUSTOMERDELIVERYCONTACT (CUSTOMERID, CONTACTID); alter table PRODUCTORDER add constraint FK_PRODORDER_FARMMGR foreign key (FARMMANAGERID) references FARMMANAGER (FARMMANAGERID); alter table PRODUCTORDER add constraint FK_PRODUCTORDER_SELLER foreign key (SELLERIRDACCOUNTID) references SELLER (SELLERID); alter table SEEDPROCESSORROLE add constraint FK_SEEDPROCROLE_SEEDPROC foreign key (PROCESSORID) references SEEDPROCESSOR 
(PROCESSORID); alter table SUPPORTEDBRANDS add constraint FK_SUPPORTEDBRANDS_DEALER foreign key (DEALERID) references DEALER (IRDACCOUNTID); alter table SUPPORTEDBRANDS add constraint FK_SUPPORTEDBRANDS_PRODBRANDS foreign key (BRANDNAME) references PRODUCTBRANDS (BRANDNAME); alter table SYSTEMPARAMS add constraint FK_SYSTEMPARAMS_DEALER foreign key (DEALERID) references DEALER (IRDACCOUNTID);
-Original Message- From: Mike Matrigali [mailto:[EMAIL PROTECTED] Sent: Thursday, March 29, 2007 4:30 PM To: Derby Discussion Subject: Re: Alter Table ADD Column throws column number exception
Can you provide a self-contained repro for this? At a minimum, telling us the current DDL of the table would be interesting. This would let us know if it is a runtime multi-user DDL issue, or something else. Do all of your alter tables fail, or just this one? Does this one fail all the time or just some of the time?
BALDWIN, ALAN J [AG-Contractor/1000] wrote: Hello all, When I execute the following SQL statement: ALTER TABLE OrderDetail ADD COLUMN blah varchar(15) I get this exception: java.sql.SQLException: In a base table there was a mismatch between the requested column number 16 and the maximum number of columns 20. Does anyone have an idea how I can fix this, or at least find a workaround? Thanks, -Alan-
Re: Alter Table ADD Column throws column number exception
Can you provide a self-contained repro for this? At a minimum, telling us the current DDL of the table would be interesting. This would let us know if it is a runtime multi-user DDL issue, or something else. Do all of your alter tables fail, or just this one? Does this one fail all the time or just some of the time?
BALDWIN, ALAN J [AG-Contractor/1000] wrote: Hello all, When I execute the following SQL statement: ALTER TABLE OrderDetail ADD COLUMN blah varchar(15) I get this exception: java.sql.SQLException: In a base table there was a mismatch between the requested column number 16 and the maximum number of columns 20. Does anyone have an idea how I can fix this, or at least find a workaround? Thanks, -Alan-
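To help gather the kind of information asked for here, a hedged sketch that dumps the column numbering Derby's catalogs currently hold for a table — a mismatch between this list and what the statement expects is what the error message is describing. The helper name is made up; SYS.SYSCOLUMNS and SYS.SYSTABLES are Derby's real system catalogs:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class ShowColumns {
        // Prints each column's catalog position for the named table.
        static void dumpColumns(Connection conn, String tableName)
                throws SQLException {
            String sql =
                "SELECT c.COLUMNNUMBER, c.COLUMNNAME " +
                "FROM SYS.SYSCOLUMNS c JOIN SYS.SYSTABLES t " +
                "ON c.REFERENCEID = t.TABLEID " +
                "WHERE t.TABLENAME = ? ORDER BY c.COLUMNNUMBER";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                // Unquoted identifiers are stored upper-case in the catalogs.
                ps.setString(1, tableName.toUpperCase());
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getInt(1) + "  " + rs.getString(2));
                    }
                }
            }
        }
    }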
Re: Queries satisfiable from indexes
I agree with army: Derby can and often does satisfy queries from indexes alone (I usually refer to this as a covered query). If you can't provide a reproducible case for us to run, some of the following may help us guess what is going on:
1) the exact DDL of the tables;
2) a size estimate for any variable-length data columns (i.e. most likely the varchar fields) — from this we can roughly estimate the size of the keys and see if 161,000 rows in 500 leaf pages makes sense. Assuming a 4k index page, this would be about 12 bytes a key, which seems reasonable for a 3-part key plus a pointer to the base table;
3) anything you can tell us about the data distribution;
4) the page size of the tables/indexes.
My reading of this is that the query, for whatever value you put in for ?, returned ~141,000 rows matching the mailbox_id you supplied, and there are only 3 distinct (mailbox_id, tags) values in the dataset. It actually had to look through 161,970 rows, including the deleted rows that have not been reclaimed yet.
Dan Karp wrote: I've got a few queries that should be satisfiable from an index without any access to the underlying rows. Can Derby do this, or does it load the rows regardless? In this case, would it help things if I'd used DISTINCT(tags) instead of a GROUP BY clause?
QUERY: SELECT tags FROM mboxgroup1.mail_item WHERE mailbox_id = ? GROUP BY mailbox_id, tags;
INDEX: CREATE INDEX ${DATABASE_NAME}.i_mail_item_tags_date ON mail_item(mailbox_id, tags, date);
PLAN:
Number of opens = 1
Rows seen = 3
Rows filtered = 0
restriction = false
projection = true
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
restriction time (milliseconds) = 0
projection time (milliseconds) = 0
optimizer estimated row count: 14134.80
optimizer estimated cost: 5049.44
Source result set:
  Sort ResultSet:
  Number of opens = 1
  Rows input = 141512
  Rows returned = 3
  Eliminate duplicates = true
  In sorted order = true
  constructor time (milliseconds) = 0
  open time (milliseconds) = 0
  next time (milliseconds) = 0
  close time (milliseconds) = 0
  optimizer estimated row count: 14134.80
  optimizer estimated cost: 5049.44
  Source result set:
    Project-Restrict ResultSet (3):
    Number of opens = 1
    Rows seen = 141512
    Rows filtered = 0
    restriction = false
    projection = true
    constructor time (milliseconds) = 0
    open time (milliseconds) = 0
    next time (milliseconds) = 0
    close time (milliseconds) = 0
    restriction time (milliseconds) = 0
    projection time (milliseconds) = 0
    optimizer estimated row count: 14134.80
    optimizer estimated cost: 5049.44
    Source result set:
      Index Scan ResultSet for MAIL_ITEM using index I_MAIL_ITEM_TAGS_DATE at read committed isolation level using share row locking chosen by the optimizer
      Number of opens = 1
      Rows seen = 141512
      Rows filtered = 0
      Fetch Size = 1
      constructor time (milliseconds) = 0
      open time (milliseconds) = 0
      next time (milliseconds) = 0
      close time (milliseconds) = 0
      next time in milliseconds/row = 0
      scan information:
        Bit set of columns fetched={0, 1}
        Number of columns fetched=2
        Number of deleted rows visited=20458
        Number of pages visited=501
        Number of rows qualified=141512
        Number of rows visited=161970
        Scan type=btree
        Tree height=3
        start position: = on first 1 column(s). Ordered null semantics on the following columns:
        stop position: on first 1 column(s). Ordered null semantics on the following columns:
        qualifiers: None
      optimizer estimated row count: 14134.80
      optimizer estimated cost: 5049.44
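For reference, the 12-bytes-a-key figure above is just the scan statistics divided out. This is a back-of-the-envelope check assuming the default 4 KB index page, not exact accounting — the 501 pages visited include branch pages, and page overhead is ignored:

    500 leaf pages x 4096 bytes/page  = 2,048,000 bytes
    2,048,000 bytes / 161,970 entries = ~12.6 bytes per index entry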
Re: Queries satisfiable from indexes
Dan Karp wrote:
In the case of the query plan you presented, it seems to me that Derby is in fact using the index, and not the base table; that's what I believe this line is saying: Index Scan ResultSet for MAIL_ITEM using index I_MAIL_ITEM_TAGS_DATE
I believe that's included for every query satisfied using an index. But the number of pages accessed indicates that it actually loaded all the rows from the table:
scan information:
Bit set of columns fetched={0, 1}
Number of columns fetched=2
Number of deleted rows visited=20458
Number of pages visited=501
Number of rows qualified=141512
Number of rows visited=161970
I think that's *way* too many pages hit if we're just using the index. I've had a query on this table that sorted 33,000 rows without an index, and it visited only 300+ pages.
The page sizes of indexes and base tables do not have to be the same. If you have not set the page size yourself, give us your DDL and we can tell you the default page size Derby will pick.
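If page size does turn out to matter here, a minimal sketch of setting it for subsequently created tables and indexes; 8192 is just an example value (Derby's usual default is 4096):

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.SQLException;

    public class PageSizeExample {
        // Sets the page size used for conglomerates created after this call;
        // existing tables and indexes keep the page size they were built with.
        static void useLargerPages(Connection conn) throws SQLException {
            try (CallableStatement cs = conn.prepareCall(
                    "CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY(?, ?)")) {
                cs.setString(1, "derby.storage.pageSize");
                cs.setString(2, "8192");
                cs.execute();
            }
        }
    }

To apply a new page size to an existing index, it would have to be dropped and recreated after the property is set.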