On Wednesday, October 9, 2019 at 6:23:09 AM UTC-4, Tim Fielder wrote:
>
> My suspicion is that nesting a SELECT inside any other function is going
> to cause these kinds of errors, and I will need to instead retrieve a
> ResultSet and use a PreparedStatement to batch-INSERT into tables in chunks
> of ~10,000 or so.
>
Well, that doesn't seem to be the case. I modified the code to:
Statement docType = gConn.createStatement();
docType.execute("ALTER TABLE doctype ADD PRIMARY KEY (abbrev);");
docType.close();
Statement refDoc = gConn.createStatement();
refDoc.execute(
"CREATE TABLE indexDoc ("
+ " docid varchar,"
+ " title varchar,"
+ " content varchar,"
+ " sourceTitle varchar,"
+ " startPage varchar,"
+ " endPage varchar,"
+ " year varchar,"
+ " indexed bit NOT NULL DEFAULT 0 );");
refDoc.executeUpdate(
"INSERT INTO indexDoc (docid, indexed) ( SELECT docid, 1 FROM document );");
refDoc.executeUpdate(
"INSERT INTO indexDoc (docid, title, content, sourceTitle, startPage, endPage, year) ("
+ " SELECT refid, MAX(title), MAX(content), MAX(sourceTitle), MAX(startPage), MAX(endPage), MAX(year)"
+ " FROM refdoc"
+ " GROUP BY refid );");
ResultSet refDocs =
refDoc.executeQuery(
"SELECT docid, MAX(title), MAX(content), MAX(sourceTitle), MAX(startPage), MAX(endPage), MAX(year)"
+ " FROM indexDoc"
+ " GROUP BY docid"
+ " HAVING MAX(indexed) = 0;");
PreparedStatement insert =
gConn.prepareStatement(
"INSERT INTO document (docid, title, sourceTitle, startPage, endPage, year)"
+ " VALUES (?, ?, ?, ?, ?, ?);");
int count = 0;
while (refDocs.next())
{
Document doc = new Document();
doc.add(new Field("ID", refDocs.getString(1), FieldFactory.field(false)));
doc.add(new Field("title", refDocs.getString(2), FieldFactory.field(false)));
doc.add(new Field("content", refDocs.getString(3), FieldFactory.field(true)));
writer.addDocument(doc);
insert.setString(1, refDocs.getString(1));
insert.setString(2, refDocs.getString(2));
insert.setString(3, refDocs.getString(4));
insert.setString(4, refDocs.getString(5));
insert.setString(5, refDocs.getString(6));
insert.setString(6, refDocs.getString(7));
insert.addBatch();
count++;
if (count >= 10000)
{
insert.executeBatch();
count = 0; // reset so the next chunk accumulates from zero
}
}
refDocs.close();
if (count > 0)
{
insert.executeBatch();
}
insert.close();
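As an aside, the flush-every-N pattern I'm aiming for above (flush each full chunk, then flush once more at the end for the remainder) can be sketched without any JDBC at all. ChunkedFlusher below is a hypothetical helper, not part of any API; a list of lists stands in for the statement batch:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the JDBC batching loop: items accumulate until the chunk
// size is reached, then the pending "batch" is flushed and the counter
// implicitly resets (pending.clear()). A final flush() picks up the
// remainder, like the trailing executeBatch() after the while loop.
class ChunkedFlusher<T> {
    private final int chunkSize;
    private final List<T> pending = new ArrayList<>();
    private final List<List<T>> flushed = new ArrayList<>(); // stands in for executeBatch()

    ChunkedFlusher(int chunkSize) {
        this.chunkSize = chunkSize;
    }

    void add(T item) {
        pending.add(item);
        if (pending.size() >= chunkSize) {
            flush();
        }
    }

    void flush() {
        if (!pending.isEmpty()) {
            flushed.add(new ArrayList<>(pending));
            pending.clear();
        }
    }

    List<List<T>> batches() {
        return flushed;
    }
}
```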
refDoc.execute("DROP TABLE indexDoc;");
refDoc.execute("DROP TABLE refdoc;");
refDoc.close();
Statement doc = gConn.createStatement();
doc.execute("ALTER TABLE document ADD PRIMARY KEY (docid);");
doc.execute("ALTER TABLE document ADD FOREIGN KEY (docType) REFERENCES doctype(abbrev);");
doc.close();
Statement ref = gConn.createStatement();
ref.execute("CREATE TABLE reference AS ( SELECT DISTINCT docid, refid FROM tempref );");
ref.execute("DROP TABLE tempref;");
ref.execute("ALTER TABLE reference MODIFY COLUMN docid varchar NOT NULL;");
ref.execute("ALTER TABLE reference MODIFY COLUMN refid varchar NOT NULL;");
ref.execute("ALTER TABLE reference ADD PRIMARY KEY (docid, refid);");
ref.execute("ALTER TABLE reference ADD FOREIGN KEY (docid) REFERENCES document(docid);");
ref.execute("ALTER TABLE reference ADD FOREIGN KEY (refid) REFERENCES document(docid);");
ref.close();
I still get an out-of-memory exception on the query
SELECT docid, MAX(title), MAX(content), MAX(sourceTitle), MAX(startPage),
MAX(endPage), MAX(year)
FROM indexDoc
GROUP BY docid
HAVING MAX(indexed) = 0;
Is it just the case that any SELECT with grouping or a HAVING condition will
fail once a table reaches a certain size?
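In case it helps, the workaround I'd try next is keyset-paging the query so only one chunk of groups is materialized per round trip. This is just a sketch against the same table and columns as above (untested, and the GROUP BY still has to be evaluated server-side; LIMIT only caps what gets buffered into the result set):

```java
// Hypothetical keyset-paged variant of the failing query. Each execution
// fetches at most `chunk` groups, resuming after the last docid returned
// by the previous page (bind "" for the first page), so the whole result
// never has to be held at once.
class PagedQueryBuilder {
    static String pagedQuery(int chunk) {
        return "SELECT docid, MAX(title), MAX(content), MAX(sourceTitle),"
            + " MAX(startPage), MAX(endPage), MAX(year)"
            + " FROM indexDoc"
            + " WHERE docid > ?"       // bind the last docid seen so far
            + " GROUP BY docid"
            + " HAVING MAX(indexed) = 0"
            + " ORDER BY docid"
            + " LIMIT " + chunk + ";";
    }
}
```

The driver loop would re-run the statement with the last docid from the previous page until an empty result comes back. If that still blows up, I may also experiment with H2's MAX_MEMORY_ROWS setting, which (if I read the docs correctly) makes large intermediate results spill to disk instead of staying on the heap.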
--
You received this message because you are subscribed to the Google Groups "H2
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/h2-database/d981171f-55cb-4df4-86ac-890a5037fd06%40googlegroups.com.