Suresh Thalamati wrote:
Kristian Waagan wrote:
Hello,

For a database population program I run, I have observed that the disk space allocation is larger when populating the database with multiple concurrent connections. For a specific configuration, the database ends up at 642 MB with a single connection, whereas it ends up at 1.3 GB when using multiple connections. The raw data volume is at about 215 MB, there are 20 tables and between 20 and 30 indexes (didn't take the time to figure out which indexes are composite, 'show indexes' doesn't give this information but returns 31 rows).

Is this to be expected?


No. The Difference in size seems to be too high with just inserts. Are you doing inserts into the same table in parallel threads ?. Is it one particular tables/index that is becoming too big or all the tables/indexes ?

Thanks Suresh and Bryan for your replies.

You both ask similar questions, so here's a common reply.

The "big" database is the original one, created with 25 insert threads.
The "small" database is the original one after running compress on it.
I also tried inserting with only one thread.

Database size on disk, allocated space:
25 threads                      1282 MB
25 threads after compress 642 MB 1 thread 643 MB

As can be seen, running compress on the original database and inserting with only one thread gave almost identical results.

The database in question has a total of 103 conglomerates, including system tables. Out of these, 34 has changed size. The increase/decrease from the compressed database to the original one varied between -50% to 196%. If we ignore conglomerates smaller then 10 MB, we have variation from -8%/+2 to 196%, but there is only one conglomerate where the size decreased.
I have attached the sizes below.

So the answer to your question is that some tables become to big, some do not. I am not able to see a pattern based on the current data, but maybe you are?


I have also attached the program I used to output space diag information. It is far from perfect, but it does get the information out.



thanks,
--
Kristian

Second and third column are size of conglomerate files on disk in KB.
Increase/decrease in number of allocated pages is the same as the file size.

Type                    Compres Orig    Change
User table              115404  217848   89%
User index (backing)    8324    16036    93%
User table              220664  653696  196%
User index (backing)    63848   115148   80%
User index              63848   115500   81%
User table              12868   14180    10%
User index (backing)    3972    3204    -19%
User table              10180   10660     5%
User index (backing)    6084    6628      9%
User table              5924    6148      4%
User index (backing)    3972    3364    -15%
User table              11908   12612     6%
User index (backing)    3972    3396    -15%
User table              580     676      17%
User index (backing)    132     196      48%
User table              868     2148    147%
User index (backing)    228     324      42%
User index              132     196      48%
User table              11108   11300     2%
User index (backing)    3108    5892     90%
User table              3204    3236      1%
User index (backing)    836     1444     73%
System table            68      36      -47%
System index            16      20       25%
System index            16      8       -50%
System index            16      8       -50%
System table            16      12      -25%
System table            328     296     -10%
User table              10052   11652    16%
User index (backing)    3844    3076    -20%
User table              836     868       4%
User index (backing)    164     292      78%
User table              42344   47624    12%
User index (backing)    45992   42184   - 8%
Total                   656760  1311812 100%

Note that for the ones with negative change, the conglomerates grew when running compress.


Program used to extract space diagnostics information.
Invoke like this: java -classpath .:path-to-derby-stuff DerbyDiskSpaceDiag /tmp/myDatabase [myschema]




Thanks
-suresh



import java.io.File;
import java.sql.*;
import java.util.Arrays;

/**
 * Extract disk space diagnostics for a Derby database.
 * This program *WILL NOT WORK* for other databases!
 */
public class DerbyDiskSpaceDiag {

    /** Newline character(s). */
    private static final String SEP = System.getProperty("line.separator");

    /** Name of the diagnostics table. */
    private static final String DIAGTABLE = 
        "new org.apache.derby.diag.SpaceTable";

    /**
     * Pads string with spaces on the left side.
     */
    private static String padLeft(String text, int length) {
        StringBuffer str = new StringBuffer(text);
        str.ensureCapacity(length);
        while (str.length() < length) {
            str.insert(0, ' ');
        }
        return str.toString();
    }

    /**
     * Pads string with spaces on the right side.
     */
    private static String padRight(String text, int length) {
        StringBuffer str = new StringBuffer(text);
        str.ensureCapacity(length);
        while (str.length() < length) {
            str.append(' ');
        }
        return str.toString();
    }

    /**
     * Load Derby driver.
     * Will load the client driver if the database name starts with '//',
     * the embedded driver otherwise.
     */
    private static void loadDriver(String dbName) {
        String driver = "org.apache.derby.jdbc.EmbeddedDriver";
        if (dbName.startsWith("//")) {
            driver = "org.apache.derby.jdbc.ClientDriver";
        }
        try {
            Class.forName(driver);
        } catch (ClassNotFoundException cnfe) {
            cnfe.printStackTrace();
            System.err.println("!! Unable to load driver class for Derby: " +
                    driver + SEP + "!! Please verify your classpath.");
            System.exit(1);
        }
    }

    /**
     * Print usage message/help.
     */
    private static void usage() {
        System.out.println("Print disk usage statistics for a Derby database.");
        System.out.println(SEP + "Usage:");
        System.out.println("\tDerbyDiskSpaceDiag DBNAME [SCHEMA]");
        System.out.println();
    }

    /** SQL used to extract conglomerate number from conglomerate name. */
    private static final String CONGLOM_NUMBER =
        "SELECT CONGLOMERATENUMBER FROM SYS.SYSCONGLOMERATES c " +
        "WHERE c.CONGLOMERATENAME = ?";

    /** SQL used to extract table id from table name. */
    private static final String TABLEID =
        "SELECT TABLEID FROM SYS.SYSTABLES t " +
        "WHERE t.TABLENAME = ?";

    private static final String HORIZONTAL_LINE =
        "- - - - - - - - - - - - - - - - - - - - - - - " +
                "- - - - - - - - - - - - - - - -";

    /**
     * Name of the database to connect to.
     * Note that the JDBC protocol prefix is not supposed to be included.
     */
    private final String dbName;
    /** Tell if we are connect through the embedded driver. */
    private final boolean embedded;
    /** SQL to extract space diagnostcs. */
    private final String diagQuery;
    /**
     * Total counts for various space diagnostics.
     * Currently: allocated, free, save, filesizes.
     */
    private long[] totalCounts = new long[4];

    /**
     * Get space diagnostics for the specified database.
     */
    public DerbyDiskSpaceDiag(String dbName) {
        this.dbName = dbName;
        this.embedded = !dbName.startsWith("//");
        this.diagQuery = 
            "SELECT spaceinfo.* FROM " +
            "SYS.SYSSCHEMAS s, SYS.SYSTABLES t, " +
            DIAGTABLE + "(SCHEMANAME,TABLENAME) spaceinfo " +
            "WHERE s.SCHEMAID = t.SCHEMAID";
    }

    /**
     * Get space diagnostics for the specified schema in the given database.
     */
    public DerbyDiskSpaceDiag(String dbName, String schema) {
        this.dbName = dbName;
        this.embedded = !dbName.startsWith("//");
        this.diagQuery = 
            "SELECT spaceinfo.* FROM " +
            "SYS.SYSSCHEMAS s, SYS.SYSTABLES t, " +
            DIAGTABLE + "('" + schema + "',TABLENAME) spaceinfo " +
            "WHERE s.SCHEMAID = t.SCHEMAID AND " +
            "s.SCHEMANAME = '" + schema + "'";
    }

    /**
     * Print disk space statistics.
     */
    public void printInfo()
            throws SQLException {
        Arrays.fill(this.totalCounts, 0L);
        Connection con = DriverManager.getConnection(
                "jdbc:derby:" + this.dbName);
        PreparedStatement conglomNrStmt = con.prepareStatement(CONGLOM_NUMBER);
        PreparedStatement tableIdStmt = con.prepareStatement(TABLEID); 
        Statement diagStmt = con.createStatement();
        ResultSet diagRs = diagStmt.executeQuery(diagQuery);
        ResultSet fileRs = null;
        System.out.println(
                padRight("Tablename", 25) + " " +
                padRight("Type", 5) + " " +
                padLeft("ALLOC", 6) + " " +
                padLeft("FREE", 6) + " " +
                padLeft("PSIZE", 5) + " " +
                padLeft("SAVE", 6) + " " +
                padRight("File", 8) + " " +
                padLeft("Size KB", 8) + " ");
        System.out.println(HORIZONTAL_LINE);
        String conglomName = null;
        boolean index = false;
        Long tmpLong;
        // Let's hope the order in the resultset doesn't change.
        while (diagRs.next()) {
            conglomName = diagRs.getString(1);
            index = diagRs.getInt(2) > 0;
            if (index) {
                System.out.print("  " + padRight(conglomName, 23) + " ");
            } else {
                System.out.print(padRight(conglomName, 25) + " ");
            }
            System.out.print((index == true ? "INDEX" : "TABLE") + " ");
            tmpLong = diagRs.getLong(3);
            this.totalCounts[0] += tmpLong.longValue();
            // Allocated
            System.out.print(padLeft(tmpLong.toString(), 6) + " ");
            tmpLong = diagRs.getLong(4);
            this.totalCounts[1] += tmpLong.longValue();
            // Free
            System.out.print(padLeft(tmpLong.toString(), 6) + " ");
            tmpLong = diagRs.getLong(6);
            // Page size
            System.out.print(padLeft(tmpLong.toString(), 5) + " ");
            tmpLong = diagRs.getLong(7);
            this.totalCounts[2] += tmpLong.longValue();
            // Save
            System.out.print(padLeft(tmpLong.toString(), 6) + " ");
            if (embedded) {
                if (index) {
                    conglomNrStmt.setString(1, conglomName);
                    fileRs = conglomNrStmt.executeQuery();
                } else {
                    // Must get table id fidiagRst.
                    tableIdStmt.setString(1, conglomName);
                    fileRs = tableIdStmt.executeQuery();
                    fileRs.next();
                    String tableId = fileRs.getString(1);
                    fileRs.close(); 
                    conglomNrStmt.setString(1, tableId);
                    fileRs = conglomNrStmt.executeQuery();
                }
                if (fileRs.next()) {
                    printFileInfo(dbName, fileRs.getInt(1));
                    fileRs.close();
                } else {
                    // Assume system table, which have '_HEAP' as postfix.
                    conglomNrStmt.setString(1, conglomName + "_HEAP");
                    fileRs = conglomNrStmt.executeQuery();
                    if (fileRs.next()) {
                        printFileInfo(dbName, fileRs.getInt(1));
                        fileRs.close();
                    } else {
                        System.out.print("unable to retrieve");
                    }
                }
            } else {
                System.out.print("use embedded driver!");

            }
            // End line
            System.out.println();
        }
        System.out.println(HORIZONTAL_LINE);
        System.out.println(
                padRight("Total", 25) + " " +
                padRight("", 5) + " " +
                padLeft(new Long(totalCounts[0]).toString(), 6) + " " +
                padLeft(new Long(totalCounts[1]).toString(), 6) + " " +
                padLeft("n/a", 5) + " " +
                padLeft(new Long(totalCounts[2]).toString(), 6) + " " +
                padRight("KB n/a", 8) + " " +
                padLeft(new Long(totalCounts[3]).toString(), 8) + " ");
        System.out.println(HORIZONTAL_LINE);
        con.close();
    }

    /**
     * Print file information.
     * This will only work if connection is embedded.
     */
    private void printFileInfo(String dbName, int conglomNr)
            throws SQLException {
        File f = new File(
                    new File(dbName, "seg0"),
                    "c" + Integer.toHexString(conglomNr) + ".dat");
        System.out.print(padRight(f.getName(), 8) + " ");
        System.out.print(padLeft(new Long(f.length() / 1024).toString(), 8));
        this.totalCounts[3] += (f.length() / 1024);
    }

    /**
     * Get disk space diagnostics table for a database.
     *
     * Arguments:
     *  DBNAME : name/path of the database  /required/
     *  SCHEMA : name of the schema         /optional/
     */
    public static void main(String[] args)
            throws SQLException {
        // If no arguments are given, print usage and exit.
        if (args.length < 1) {
            usage();
            System.exit(1);
        }
        final String dbName = args[0];
        loadDriver(dbName);
        DerbyDiskSpaceDiag diag = null;
        if (args.length > 1) {
            // Limit extraction to the specified schema.
            diag = new DerbyDiskSpaceDiag(dbName, args[1]);
        } else {
            diag = new DerbyDiskSpaceDiag(dbName);
        }
        diag.printInfo();
    }
} // End class DerbyDIskSpaceDiag

Reply via email to