Suresh Thalamati wrote:
Kristian Waagan wrote:
Hello,
For a database population program I run, I have observed that the disk
space allocation is larger when populating the database with multiple
concurrent connections.
For a specific configuration, the database ends up at 642 MB with a
single connection, whereas it ends up at 1.3 GB when using multiple
connections. The raw data volume is at about 215 MB, there are 20
tables and between 20 and 30 indexes (didn't take the time to figure
out which indexes are composite, 'show indexes' doesn't give this
information but returns 31 rows).
Is this to be expected?
No. The Difference in size seems to be too high with just inserts. Are
you doing inserts into the same table in parallel threads ?. Is it one
particular tables/index that is becoming too big or all the
tables/indexes ?
Thanks Suresh and Bryan for your replies.
You both ask similar questions, so here's a common reply.
The "big" database is the original one, created with 25 insert threads.
The "small" database is the original one after running compress on it.
I also tried inserting with only one thread.
Database size on disk, allocated space:
25 threads 1282 MB
25 threads after compress 642 MB
1 thread 643 MB
As can be seen, running compress on the original database and inserting
with only one thread gave almost identical results.
The database in question has a total of 103 conglomerates, including
system tables. Out of these, 34 has changed size.
The increase/decrease from the compressed database to the original one
varied between -50% to 196%. If we ignore conglomerates smaller then 10
MB, we have variation from -8%/+2 to 196%, but there is only one
conglomerate where the size decreased.
I have attached the sizes below.
So the answer to your question is that some tables become to big, some
do not. I am not able to see a pattern based on the current data, but
maybe you are?
I have also attached the program I used to output space diag
information. It is far from perfect, but it does get the information out.
thanks,
--
Kristian
Second and third column are size of conglomerate files on disk in KB.
Increase/decrease in number of allocated pages is the same as the file size.
Type Compres Orig Change
User table 115404 217848 89%
User index (backing) 8324 16036 93%
User table 220664 653696 196%
User index (backing) 63848 115148 80%
User index 63848 115500 81%
User table 12868 14180 10%
User index (backing) 3972 3204 -19%
User table 10180 10660 5%
User index (backing) 6084 6628 9%
User table 5924 6148 4%
User index (backing) 3972 3364 -15%
User table 11908 12612 6%
User index (backing) 3972 3396 -15%
User table 580 676 17%
User index (backing) 132 196 48%
User table 868 2148 147%
User index (backing) 228 324 42%
User index 132 196 48%
User table 11108 11300 2%
User index (backing) 3108 5892 90%
User table 3204 3236 1%
User index (backing) 836 1444 73%
System table 68 36 -47%
System index 16 20 25%
System index 16 8 -50%
System index 16 8 -50%
System table 16 12 -25%
System table 328 296 -10%
User table 10052 11652 16%
User index (backing) 3844 3076 -20%
User table 836 868 4%
User index (backing) 164 292 78%
User table 42344 47624 12%
User index (backing) 45992 42184 - 8%
Total 656760 1311812 100%
Note that for the ones with negative change, the conglomerates grew when
running compress.
Program used to extract space diagnostics information.
Invoke like this: java -classpath .:path-to-derby-stuff
DerbyDiskSpaceDiag /tmp/myDatabase [myschema]
Thanks
-suresh
import java.io.File;
import java.sql.*;
import java.util.Arrays;
/**
* Extract disk space diagnostics for a Derby database.
* This program *WILL NOT WORK* for other databases!
*/
public class DerbyDiskSpaceDiag {
/** Newline character(s). */
private static final String SEP = System.getProperty("line.separator");
/** Name of the diagnostics table. */
private static final String DIAGTABLE =
"new org.apache.derby.diag.SpaceTable";
/**
* Pads string with spaces on the left side.
*/
private static String padLeft(String text, int length) {
StringBuffer str = new StringBuffer(text);
str.ensureCapacity(length);
while (str.length() < length) {
str.insert(0, ' ');
}
return str.toString();
}
/**
* Pads string with spaces on the right side.
*/
private static String padRight(String text, int length) {
StringBuffer str = new StringBuffer(text);
str.ensureCapacity(length);
while (str.length() < length) {
str.append(' ');
}
return str.toString();
}
/**
* Load Derby driver.
* Will load the client driver if the database name starts with '//',
* the embedded driver otherwise.
*/
private static void loadDriver(String dbName) {
String driver = "org.apache.derby.jdbc.EmbeddedDriver";
if (dbName.startsWith("//")) {
driver = "org.apache.derby.jdbc.ClientDriver";
}
try {
Class.forName(driver);
} catch (ClassNotFoundException cnfe) {
cnfe.printStackTrace();
System.err.println("!! Unable to load driver class for Derby: " +
driver + SEP + "!! Please verify your classpath.");
System.exit(1);
}
}
/**
* Print usage message/help.
*/
private static void usage() {
System.out.println("Print disk usage statistics for a Derby database.");
System.out.println(SEP + "Usage:");
System.out.println("\tDerbyDiskSpaceDiag DBNAME [SCHEMA]");
System.out.println();
}
/** SQL used to extract conglomerate number from conglomerate name. */
private static final String CONGLOM_NUMBER =
"SELECT CONGLOMERATENUMBER FROM SYS.SYSCONGLOMERATES c " +
"WHERE c.CONGLOMERATENAME = ?";
/** SQL used to extract table id from table name. */
private static final String TABLEID =
"SELECT TABLEID FROM SYS.SYSTABLES t " +
"WHERE t.TABLENAME = ?";
private static final String HORIZONTAL_LINE =
"- - - - - - - - - - - - - - - - - - - - - - - " +
"- - - - - - - - - - - - - - - -";
/**
* Name of the database to connect to.
* Note that the JDBC protocol prefix is not supposed to be included.
*/
private final String dbName;
/** Tell if we are connect through the embedded driver. */
private final boolean embedded;
/** SQL to extract space diagnostcs. */
private final String diagQuery;
/**
* Total counts for various space diagnostics.
* Currently: allocated, free, save, filesizes.
*/
private long[] totalCounts = new long[4];
/**
* Get space diagnostics for the specified database.
*/
public DerbyDiskSpaceDiag(String dbName) {
this.dbName = dbName;
this.embedded = !dbName.startsWith("//");
this.diagQuery =
"SELECT spaceinfo.* FROM " +
"SYS.SYSSCHEMAS s, SYS.SYSTABLES t, " +
DIAGTABLE + "(SCHEMANAME,TABLENAME) spaceinfo " +
"WHERE s.SCHEMAID = t.SCHEMAID";
}
/**
* Get space diagnostics for the specified schema in the given database.
*/
public DerbyDiskSpaceDiag(String dbName, String schema) {
this.dbName = dbName;
this.embedded = !dbName.startsWith("//");
this.diagQuery =
"SELECT spaceinfo.* FROM " +
"SYS.SYSSCHEMAS s, SYS.SYSTABLES t, " +
DIAGTABLE + "('" + schema + "',TABLENAME) spaceinfo " +
"WHERE s.SCHEMAID = t.SCHEMAID AND " +
"s.SCHEMANAME = '" + schema + "'";
}
/**
* Print disk space statistics.
*/
public void printInfo()
throws SQLException {
Arrays.fill(this.totalCounts, 0L);
Connection con = DriverManager.getConnection(
"jdbc:derby:" + this.dbName);
PreparedStatement conglomNrStmt = con.prepareStatement(CONGLOM_NUMBER);
PreparedStatement tableIdStmt = con.prepareStatement(TABLEID);
Statement diagStmt = con.createStatement();
ResultSet diagRs = diagStmt.executeQuery(diagQuery);
ResultSet fileRs = null;
System.out.println(
padRight("Tablename", 25) + " " +
padRight("Type", 5) + " " +
padLeft("ALLOC", 6) + " " +
padLeft("FREE", 6) + " " +
padLeft("PSIZE", 5) + " " +
padLeft("SAVE", 6) + " " +
padRight("File", 8) + " " +
padLeft("Size KB", 8) + " ");
System.out.println(HORIZONTAL_LINE);
String conglomName = null;
boolean index = false;
Long tmpLong;
// Let's hope the order in the resultset doesn't change.
while (diagRs.next()) {
conglomName = diagRs.getString(1);
index = diagRs.getInt(2) > 0;
if (index) {
System.out.print(" " + padRight(conglomName, 23) + " ");
} else {
System.out.print(padRight(conglomName, 25) + " ");
}
System.out.print((index == true ? "INDEX" : "TABLE") + " ");
tmpLong = diagRs.getLong(3);
this.totalCounts[0] += tmpLong.longValue();
// Allocated
System.out.print(padLeft(tmpLong.toString(), 6) + " ");
tmpLong = diagRs.getLong(4);
this.totalCounts[1] += tmpLong.longValue();
// Free
System.out.print(padLeft(tmpLong.toString(), 6) + " ");
tmpLong = diagRs.getLong(6);
// Page size
System.out.print(padLeft(tmpLong.toString(), 5) + " ");
tmpLong = diagRs.getLong(7);
this.totalCounts[2] += tmpLong.longValue();
// Save
System.out.print(padLeft(tmpLong.toString(), 6) + " ");
if (embedded) {
if (index) {
conglomNrStmt.setString(1, conglomName);
fileRs = conglomNrStmt.executeQuery();
} else {
// Must get table id fidiagRst.
tableIdStmt.setString(1, conglomName);
fileRs = tableIdStmt.executeQuery();
fileRs.next();
String tableId = fileRs.getString(1);
fileRs.close();
conglomNrStmt.setString(1, tableId);
fileRs = conglomNrStmt.executeQuery();
}
if (fileRs.next()) {
printFileInfo(dbName, fileRs.getInt(1));
fileRs.close();
} else {
// Assume system table, which have '_HEAP' as postfix.
conglomNrStmt.setString(1, conglomName + "_HEAP");
fileRs = conglomNrStmt.executeQuery();
if (fileRs.next()) {
printFileInfo(dbName, fileRs.getInt(1));
fileRs.close();
} else {
System.out.print("unable to retrieve");
}
}
} else {
System.out.print("use embedded driver!");
}
// End line
System.out.println();
}
System.out.println(HORIZONTAL_LINE);
System.out.println(
padRight("Total", 25) + " " +
padRight("", 5) + " " +
padLeft(new Long(totalCounts[0]).toString(), 6) + " " +
padLeft(new Long(totalCounts[1]).toString(), 6) + " " +
padLeft("n/a", 5) + " " +
padLeft(new Long(totalCounts[2]).toString(), 6) + " " +
padRight("KB n/a", 8) + " " +
padLeft(new Long(totalCounts[3]).toString(), 8) + " ");
System.out.println(HORIZONTAL_LINE);
con.close();
}
/**
* Print file information.
* This will only work if connection is embedded.
*/
private void printFileInfo(String dbName, int conglomNr)
throws SQLException {
File f = new File(
new File(dbName, "seg0"),
"c" + Integer.toHexString(conglomNr) + ".dat");
System.out.print(padRight(f.getName(), 8) + " ");
System.out.print(padLeft(new Long(f.length() / 1024).toString(), 8));
this.totalCounts[3] += (f.length() / 1024);
}
/**
* Get disk space diagnostics table for a database.
*
* Arguments:
* DBNAME : name/path of the database /required/
* SCHEMA : name of the schema /optional/
*/
public static void main(String[] args)
throws SQLException {
// If no arguments are given, print usage and exit.
if (args.length < 1) {
usage();
System.exit(1);
}
final String dbName = args[0];
loadDriver(dbName);
DerbyDiskSpaceDiag diag = null;
if (args.length > 1) {
// Limit extraction to the specified schema.
diag = new DerbyDiskSpaceDiag(dbName, args[1]);
} else {
diag = new DerbyDiskSpaceDiag(dbName);
}
diag.printInfo();
}
} // End class DerbyDIskSpaceDiag