>> Patches are welcome of course, but I would like to keep it simple if
possible.
>
> I'll have a second look. There's always 1-2 things to be found when
profiling these cases. I'll report them here, in case I should find
something.
OK, so this is the place where all the CPU power gets wasted:
MetaTable.checkIndex():
Value v = ValueString.get(value);
if (indexFrom != null && db.compare(v, indexFrom) < 0) {
return false;
}
if (indexTo != null && db.compare(v, indexTo) > 0) {
return false;
}
In fact, ValueString.get() is a very expensive operation if called 100M
times, given the time it takes to always compare strings and cache them,
evict objects from the cache, etc. After playing around with a lot of
little tweaks, I was finally thinking that this checkIndex() method always
gets an "MetaTable.identifier(Table.getName())" argument for "String
value". So instead of re-wrapping this value into a ValueString several
millions of times, it would be much better to simply hold a constant
reference to a ValueString (and a ValueStringIgnoreCase) inside of
DbObjectBase. I think the H2 database can live with the slight overhead of
holding this reference.
The effect is massive:
A 40s execution of my previous query against an INFORMATION_SCHEMA with 10k
tables, 10k primary keys and 10k foreign keys speeds up to a mere 5-6ms!!
Please find attached a patch including the relevant changes. Note:
- You may find it more appropriate to push down my new
DbObjectBase.getIdentifier() method to Table, or to cache these values on a
per-query basis. I don't know H2 well enough to assess, whether adding
cache values to DbObjectBase is a good idea.
- MetaTable.identifier() has now gotten somewhat obsolete, if all
DbObjectBase objects can provide a cached identifier. To keep the change
slim, I avoided a refactoring to remove MetaTable.identifier(). The code
duplication is acceptable, I think
Let me know if you need anything else
Cheers
Lukas
Am Dienstag, 19. Februar 2013 11:56:01 UTC+1 schrieb Lukas Eder:
>
> Hi Thomas,
>
> > I thought about changing it, by using a 'before select' trigger, so
> that the metadata tables are materialized.
>
> Yes, I had thought about that, too. Materialising views might be the path
> of least resistance here, specifically because all sorts of regular indexes
> can then be applied, and execution plans would be more sophisticated.
> Probably, Oracle-style materialised views would be quite a feature anyway
> for H2, in the long run.
>
> > Patches are welcome of course, but I would like to keep it simple if
> possible.
>
> I'll have a second look. There's always 1-2 things to be found when
> profiling these cases. I'll report them here, in case I should find
> something.
>
> > I'm not convinced that 15000 tables is a good use of a database :-)
>
> I wouldn't be, either...
> I hardly ever have good and non-academic uses of a database in my set of
> use cases for jOOQ ;-)
>
> So I'll see you in 2 weeks! :-)
>
> Cheers
> Lukas
>
> Am Montag, 18. Februar 2013 21:18:56 UTC+1 schrieb Thomas Mueller:
>>
>> Hi,
>>
>> Yes, the current metadata implementation does not scale very well. It is
>> made for about 100 tables, not 15000. For example, the complete schema
>> is always fully in memory at the moment.
>>
>> > Or is this a "bug" in H2
>>
>> I would say it's a limitation. I will document it.
>>
>> I thought about changing it, by using a 'before select' trigger, so that
>> the metadata tables are materialized. Before selecting from the metadata
>> tables, the 'before select' trigger re-builds them. That way the current
>> (slow) algorithms are still used, but then the result is cached. That would
>> improve performance a lot, unless if there are many metadata changes. I
>> never had the time to implement it, and so far only one person complained
>> (that was many years ago, using millions of tables as far as I remember).
>>
>> Patches are welcome of course, but I would like to keep it simple if
>> possible. I'm not convinced that 15000 tables is a good use of a database
>> :-)
>>
>> Regards,
>> Thomas
>>
>>
>>
>>
>>
>>
>>
>> On Mon, Feb 18, 2013 at 7:21 PM, Lukas Eder <[email protected]> wrote:
>>
>>> Here's a bit of a corner-case, showing a prohibitive amount of calls to
>>> MetaTable.checkIndex() and other methods within MetaTable.generateRows()
>>>
>>> 1. I've created a database schema with 15k tables (and 15k primary keys,
>>> 15k foreign keys) of this form:
>>>
>>> DROP SCHEMA IF EXISTS large;
>>> CREATE SCHEMA large;
>>> CREATE TABLE large.t00000 (id INT, CONSTRAINT pk_00000
>>> PRIMARY KEY (id));
>>> CREATE TABLE large.t00001 (id INT, prev_id INT, CONSTRAINT pk_00001
>>> PRIMARY KEY (id), CONSTRAINT fk_00001 FOREIGN KEY (prev_id) REFERENCES
>>> t00000(id));
>>> CREATE TABLE large.t00002 (id INT, prev_id INT, CONSTRAINT pk_00002
>>> PRIMARY KEY (id), CONSTRAINT fk_00002 FOREIGN KEY (prev_id) REFERENCES
>>> t00001(id));
>>> CREATE TABLE large.t00003 (id INT, prev_id INT, CONSTRAINT pk_00003
>>> PRIMARY KEY (id), CONSTRAINT fk_00003 FOREIGN KEY (prev_id) REFERENCES
>>> t00002(id));
>>> -- etc, you get the idea
>>>
>>>
>>> ... Yes. A corner case ;-) I can provide you with the full script, if
>>> you want.
>>> 2. I've run the following query here. This query runs forever:
>>>
>>> SELECT "INFORMATION_SCHEMA"."CROSS_REFERENCES"."FK_NAME",
>>> "INFORMATION_SCHEMA"."CROSS_REFERENCES"."FKTABLE_NAME",
>>> "INFORMATION_SCHEMA"."CROSS_REFERENCES"."FKTABLE_SCHEMA",
>>> "INFORMATION_SCHEMA"."CROSS_REFERENCES"."FKCOLUMN_NAME",
>>> "INFORMATION_SCHEMA"."CONSTRAINTS"."CONSTRAINT_NAME",
>>> "INFORMATION_SCHEMA"."CONSTRAINTS"."CONSTRAINT_SCHEMA"
>>> FROM "INFORMATION_SCHEMA"."CROSS_REFERENCES"
>>> JOIN "INFORMATION_SCHEMA"."CONSTRAINTS"
>>> ON ("INFORMATION_SCHEMA"."CROSS_REFERENCES"."PK_NAME" =
>>> "INFORMATION_SCHEMA"."CONSTRAINTS"."UNIQUE_INDEX_NAME"
>>> AND "INFORMATION_SCHEMA"."CROSS_REFERENCES"."PKTABLE_NAME" =
>>> "INFORMATION_SCHEMA"."CONSTRAINTS"."TABLE_NAME"
>>> AND "INFORMATION_SCHEMA"."CROSS_REFERENCES"."PKTABLE_SCHEMA" =
>>> "INFORMATION_SCHEMA"."CONSTRAINTS"."TABLE_SCHEMA")
>>> WHERE "INFORMATION_SCHEMA"."CROSS_REFERENCES"."FKTABLE_SCHEMA" IN
>>> ('LARGE')
>>> ORDER BY "INFORMATION_SCHEMA"."CROSS_REFERENCES"."FKTABLE_SCHEMA" ASC,
>>> "INFORMATION_SCHEMA"."CROSS_REFERENCES"."FK_NAME" ASC,
>>> "INFORMATION_SCHEMA"."CROSS_REFERENCES"."ORDINAL_POSITION" ASC
>>>
>>>
>>> 3. Profiling the above query seems to indicate, that the query results
>>> in a cartesian product, even if join criteria might be useful for
>>> optimisation:
>>>
>>> [image: Inline-Bild 1]
>>>
>>>
>>> http://i.imgur.com/8gh6ld0.png
>>>
>>> The time column is not accurate, as it includes profiling side-effects.
>>> Also, the above data does not include the complete execution time.
>>>
>>> But as you can see, the invocation count goes up sky-high within
>>> generateRows(), which leads to a lot of overhead, mostly from within
>>> checkIndex(). Is there any way that such queries can be optimised by
>>> adding more selective criteria to avoid cartesian products? Or is this a
>>> "bug" in H2's handling of INFORMATION_SCHEMA data?
>>>
>>> Also, did I understand this correctly: The INFORMATION_SCHEMA is
>>> implemented entirely in Java code, producing ad-hoc in-memory tables. There
>>> are no physical tables with actual indexes that can be queried in a much
>>> faster fashion...?
>>>
>>> Cheers
>>> Lukas
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "H2 Database" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/h2-database?hl=en.
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>>
>>>
>>
>>
--
You received this message because you are subscribed to the Google Groups "H2
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/h2-database?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
Index: src/docsrc/html/changelog.html
===================================================================
--- src/docsrc/html/changelog.html (revision 4679)
+++ src/docsrc/html/changelog.html (working copy)
@@ -67,6 +67,7 @@
</li><li>Issue 404: SHOW COLUMNS FROM tableName does not work with ALLOW_LITERALS=NUMBERS.
</li><li>Throw an explicit error to make it clear we don't support the TRIGGER combination of SELECT and FOR EACH ROW.
</li><li>Issue 439: Utils.sortTopN does not handle single-element arrays.
+</li><li>Significant performance improvement for queries against the INFORMATION_SCHEMA, thanks to a patch from Lukas Eder
</li></ul>
<h2>Version 1.3.170 (2012-11-30)</h2>
Index: src/main/org/h2/engine/DbObjectBase.java
===================================================================
--- src/main/org/h2/engine/DbObjectBase.java (revision 4679)
+++ src/main/org/h2/engine/DbObjectBase.java (working copy)
@@ -7,8 +7,12 @@
package org.h2.engine;
import java.util.ArrayList;
+
import org.h2.command.Parser;
import org.h2.message.Trace;
+import org.h2.util.StringUtils;
+import org.h2.value.ValueString;
+import org.h2.value.ValueStringIgnoreCase;
/**
* The base class for all database objects.
@@ -32,6 +36,8 @@
private int id;
private String objectName;
+ private ValueString identifier;
+ private ValueStringIgnoreCase identifierIgnoreCase;
private long modificationId;
private boolean temporary;
@@ -47,7 +53,7 @@
this.database = db;
this.trace = db.getTrace(traceModule);
this.id = objectId;
- this.objectName = name;
+ setObjectName(name);
this.modificationId = db.getModificationMetaId();
}
@@ -91,6 +97,8 @@
protected void setObjectName(String name) {
objectName = name;
+ identifier = name == null ? null : ValueString.get(identifier(name, false));
+ identifierIgnoreCase = name == null ? null : ValueStringIgnoreCase.get(identifier(name, true));
}
public String getSQL() {
@@ -113,7 +121,11 @@
return objectName;
}
- /**
+ public ValueString getIdentifier(boolean ignoreCase) {
+ return ignoreCase ? identifierIgnoreCase : identifier;
+ }
+
+ /**
* Set the main attributes to null to make sure the object is no longer
* used.
*/
@@ -122,12 +134,12 @@
id = -1;
database = null;
trace = null;
- objectName = null;
+ setObjectName(null);
}
public void rename(String newName) {
checkRename();
- objectName = newName;
+ setObjectName(newName);
setModified();
}
@@ -151,4 +163,10 @@
return objectName + ":" + id + ":" + super.toString();
}
+ private String identifier(String s, boolean lowerCaseIdentifiers) {
+ if (lowerCaseIdentifiers) {
+ s = s == null ? null : StringUtils.toLowerEnglish(s);
+ }
+ return s;
+ }
}
Index: src/main/org/h2/table/MetaTable.java
===================================================================
--- src/main/org/h2/table/MetaTable.java (revision 4679)
+++ src/main/org/h2/table/MetaTable.java (working copy)
@@ -60,7 +60,6 @@
import org.h2.value.Value;
import org.h2.value.ValueNull;
import org.h2.value.ValueString;
-import org.h2.value.ValueStringIgnoreCase;
/**
* This class is responsible to build the database meta data pseudo tables.
@@ -577,18 +576,18 @@
return tables;
}
- private boolean checkIndex(Session session, String value, Value indexFrom, Value indexTo) {
- if (value == null || (indexFrom == null && indexTo == null)) {
+ private boolean checkIndex(Session session, Table table, Value indexFrom, Value indexTo) {
+ if (table == null || (indexFrom == null && indexTo == null)) {
return true;
}
Database db = session.getDatabase();
if (database.getMode().lowerCaseIdentifiers) {
- Value v = ValueStringIgnoreCase.get(value);
+ Value v = table.getIdentifier(true);
if (indexFrom.equals(indexTo) && db.compare(v, indexFrom) != 0) {
return false;
}
} else {
- Value v = ValueString.get(value);
+ Value v = table.getIdentifier(false);
if (indexFrom != null && db.compare(v, indexFrom) < 0) {
return false;
}
@@ -635,7 +634,7 @@
case TABLES: {
for (Table table : getAllTables(session)) {
String tableName = identifier(table.getName());
- if (!checkIndex(session, tableName, indexFrom, indexTo)) {
+ if (!checkIndex(session, table, indexFrom, indexTo)) {
continue;
}
if (hideTable(table, session)) {
@@ -681,7 +680,7 @@
case COLUMNS: {
for (Table table : getAllTables(session)) {
String tableName = identifier(table.getName());
- if (!checkIndex(session, tableName, indexFrom, indexTo)) {
+ if (!checkIndex(session, table, indexFrom, indexTo)) {
continue;
}
if (hideTable(table, session)) {
@@ -747,7 +746,7 @@
case INDEXES: {
for (Table table : getAllTables(session)) {
String tableName = identifier(table.getName());
- if (!checkIndex(session, tableName, indexFrom, indexTo)) {
+ if (!checkIndex(session, table, indexFrom, indexTo)) {
continue;
}
if (hideTable(table, session)) {
@@ -1029,8 +1028,7 @@
String rightType = grantee.getType() == DbObject.USER ? "USER" : "ROLE";
if (role == null) {
Table granted = r.getGrantedTable();
- String tableName = identifier(granted.getName());
- if (!checkIndex(session, tableName, indexFrom, indexTo)) {
+ if (!checkIndex(session, granted, indexFrom, indexTo)) {
continue;
}
add(rows,
@@ -1223,8 +1221,7 @@
if (table == null || hideTable(table, session)) {
continue;
}
- String tableName = identifier(table.getName());
- if (!checkIndex(session, tableName, indexFrom, indexTo)) {
+ if (!checkIndex(session, table, indexFrom, indexTo)) {
continue;
}
addPrivileges(rows, r.getGrantee(), catalog, table, null, r.getRightMask());
@@ -1237,8 +1234,7 @@
if (table == null || hideTable(table, session)) {
continue;
}
- String tableName = identifier(table.getName());
- if (!checkIndex(session, tableName, indexFrom, indexTo)) {
+ if (!checkIndex(session, table, indexFrom, indexTo)) {
continue;
}
DbObject grantee = r.getGrantee();
@@ -1266,7 +1262,7 @@
continue;
}
String tableName = identifier(table.getName());
- if (!checkIndex(session, tableName, indexFrom, indexTo)) {
+ if (!checkIndex(session, table, indexFrom, indexTo)) {
continue;
}
TableView view = (TableView) table;
@@ -1318,8 +1314,7 @@
IndexColumn[] refCols = ref.getRefColumns();
Table tab = ref.getTable();
Table refTab = ref.getRefTable();
- String tableName = identifier(refTab.getName());
- if (!checkIndex(session, tableName, indexFrom, indexTo)) {
+ if (!checkIndex(session, refTab, indexFrom, indexTo)) {
continue;
}
int update = getRefAction(ref.getUpdateAction());
@@ -1375,7 +1370,7 @@
uniqueIndexName = index.getName();
}
String tableName = identifier(table.getName());
- if (!checkIndex(session, tableName, indexFrom, indexTo)) {
+ if (!checkIndex(session, table, indexFrom, indexTo)) {
continue;
}
if (constraintType.equals(Constraint.CHECK)) {