Information to and problems with LOBs not in DB

SME Wed, 30 May 2012 03:15:51 -0700

Hi,
 
I'm currently evaluating H2 for using it mainly as a key - value storage, 
which involves storing LOBs in the range in between 1KB and 10MB. The main 
features why I am looking at H2 are transactions, transparent storage of 
external LOBs in the case they are too large and transparent file 
encryption. All of this works quite fine for me. However, I ran into some 
'problems', which someone can hopefully clarify for me.
 
0. My Setup:
H2 Version 1.3.166
System.setProperty("h2.lobInDatabase", "false"); 
DB setting: SET MAX_LENGTH_INPLACE_LOB=512
connection: jdbc:h2:split:28:fooDB;CIPHER=AES
 
DROP TABLES IF EXISTS documents;
CREATE TABLE documents(id INT PRIMARY KEY, name VARCHAR, data BLOB);
CREATE INDEX docNameIDX ON documents(name);
 
Finally, I insert 2 million documents, which results in an id range from 1 
to 2,000,000.
 
 
1. Searches are usually done against name. Hence, I thought about speeding 
up those searches with a temporary memory table, which fails:
CREATE MEMORY TEMP TABLE IF NOT EXISTS memTempDocs AS SELECT * FROM 
documents;
Errormessage:
Eingabe/Ausgabe: "java.io.FileNotFoundException: 
X:\tmpDB\fooDB.lobs.db\1.lobs.db\1.lobs.db\1.lobs.db\16843009.t24.lob.db
Indeed, this file does not exist on the HDD. However, if I restrict the 
above select statement to AS SELECT * FROM documents WHERE id=x
and insert for x different values like 1 or 10,000 or 1,000,000 or whatever 
id exists, I get exactly the same error message.
 
If I omit the BLOBs it works:
CREATE MEMORY TEMP TABLE IF NOT EXISTS memTempDocs AS SELECT id,name FROM 
documents;
 
My question is whether it is possible to create a temporary memory table 
that still contains the BLOBs.
 
2. For the 2 Million documents, query time highly differs depending on the 
column queried. Is there an implementation reason for it?
a) about 60ms for SELECT * FROM documents WHERE id=someid; 
--> returns at most 1 row
b) about 550ms for SELECT * FROM documents WHERE name=SELECT name FROM 
documents where id=someid; 
--> possibly returns many elements because of similar names (on average 40 
rows). I know the SELECT name statement takes time equal to a). However, in 
the index, the returned rows should be near each other in the search tree 
because of equality. Right now, I cannot use the column name as primary key 
as the names are not unique.
 
3. If I query the documents table with a prepared statement, is the binary 
data directly fetched from HDD or at the time when I access the binary 
stream from the returned resultset?
Example, which should use streams instead of a simple byte array (but for 
my tests ok):
String query = "SELECT * FROM documents WHERE id=" + index + ";";
ps = conn.prepareStatement(query);
rs = ps.executeQuery();
if(rs.first()) { //at most one element because of unique id!
  String name = rs.getString("name");
  Blob data = rs.getBlob("data");
  if(retrieveData) {
    byteData = data.getBytes(0,(int)data.length());
    this.fetchedDataSize += byteData.length;
  }
}
 
4. H2 manages the storage of external BLOBs transparently. I wonder whether 
I can get the explicit information of the BLOB's name. The use case here is 
twofold:
a) get statistical information about the occupied HDD-Space vs. 
Memory-Space (e.g. compressed files or encrypted vs. uncompressed files)
b) check whether some hero deleted BLOB-files to save some space on HDD ;-) 
--> I guess I can surround the fetch statement with a try/catch block and 
mark rows with a missing file as invalid. However, I have the impression 
this is a workaround and a misuse of the java.io.FileNotFoundException.
 
 
Many thanks in advance,
 
Sascha


-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/h2-database/-/foyKqmGVSlMJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/h2-database?hl=en.

Information to and problems with LOBs not in DB

Reply via email to