Re: [h2] Group BY on "large" tables from file-system causes Out of Memory Error

2020-04-21 Thread Noel Grandin
TBH, retrieving super large result sets is not something we optimise for.

If you really need that, you can try turning on the LAZY_FETCH feature.
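
For what it's worth, a minimal sketch of enabling it, assuming the setting behind that feature is LAZY_QUERY_EXECUTION (the per-session setting available in 1.4.x); the file path is a placeholder:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LazyFetchSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder path; point this at your own database file.
        Connection con = DriverManager.getConnection(
                "jdbc:h2:file:/path/to/some/dir/group_by_5000000");
        try (Statement st = con.createStatement()) {
            // Ask H2 to produce result rows lazily, instead of
            // materializing the whole result before returning it.
            st.execute("SET LAZY_QUERY_EXECUTION 1");
            try (ResultSet rs = st.executeQuery(
                    "select id, sum(phone) from result group by id")) {
                while (rs.next()) {
                    // stream the rows one at a time
                }
            }
        }
        con.close();
    }
}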



Re: [h2] Group BY on "large" tables from file-system causes Out of Memory Error

2020-04-21 Thread Evgenij Ryazanov
H2 doesn't need a lot of memory for plain queries without aggregate and 
window functions; large results are stored on disk automatically. But 
queries with aggregate or window functions currently need to load the whole 
result into memory; the only exception is the previously mentioned 
optimization for group-sorted queries in the presence of a compatible index.
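
For the archive: a sketch of what that optimization needs, using the schema from the repro further down the thread. With an index on the GROUP BY column, H2 can read the groups in index order and aggregate them one at a time instead of holding the whole result in memory:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class GroupSortedIndexSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder path matching the repro below.
        Connection con = DriverManager.getConnection(
                "jdbc:h2:file:/path/to/some/dir/group_by_5000000");
        try (Statement st = con.createStatement()) {
            // Index on the GROUP BY column; this is the "compatible
            // index" that enables the group-sorted path.
            st.execute("create index idx_result_id on result(id)");
        }
        con.close();
    }
}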



Re: [h2] Group BY on "large" tables from file-system causes Out of Memory Error

2020-04-21 Thread MacMahon McCallister


On Tuesday, 21 April 2020 13:12:01 UTC+3, MacMahon McCallister wrote:
>
> On Tuesday, 21 April 2020 11:18:02 UTC+3, Noel Grandin wrote:
>>
>> Which version is this?
>>
>> And what happens when you remove the dangerous options? (LOG and UNDO_LOG)
>>
>
> Version: 1.4.200.
> Nothing changes if I remove the options. I actually tried fiddling with 
> the options earlier, but it always fails on 5M rows.
>
> On Tuesday, 21 April 2020 11:30:20 UTC+3, Evgenij Ryazanov wrote:
>>
>> If you don't have an index on the GROUP BY column, you need a lot of 
>> memory for such queries in H2.
>>
> This does kind of make sense, but still not in this test scenario. How 
> come the previous test cases (up to 1M rows, without an index) run fine, 
> even with memory as low as -Xmx256m:
> Executing with size: 1000
> Processed 1000, time 30 ms
> Executing with size: 10000
> Processed 10000, time 50 ms
> Executing with size: 100000
> Processed 100000, time 241 ms
> Executing with size: 1000000
> Processed 1000000, time 1925 ms
>
>
To respond to myself: the -Xmx setting didn't apply properly, and therefore 
(as suggested earlier) the H2 operation ran out of memory.
With a heap setting of -Xmx1024m, the unindexed query on the table with 5M 
rows executes within 10 seconds.
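
For anyone hitting the same thing, a one-liner to confirm the -Xmx value actually took effect, using the standard Runtime API:

public class HeapCheck {
    public static void main(String[] args) {
        // Prints the maximum heap the JVM will try to use, in MB;
        // with -Xmx1024m this should print roughly 1024.
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap: " + maxMb + " MB");
    }
}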

But a follow-up question: for these unindexed GROUP BY scenarios, does H2 
have to read the whole result set into memory?
And besides indexing the table (which I probably cannot do), are there any 
other optimizations to consider?




Re: [h2] Group BY on "large" tables from file-system causes Out of Memory Error

2020-04-21 Thread MacMahon McCallister


On Tuesday, 21 April 2020 11:18:02 UTC+3, Noel Grandin wrote:
>
> Which version is this?
>
> And what happens when you remove the dangerous options? (LOG and UNDO_LOG)
>

Version: 1.4.200.
Nothing changes if I remove the options. I actually tried fiddling with the 
options earlier, but it always fails on 5M rows.

On Tuesday, 21 April 2020 11:30:20 UTC+3, Evgenij Ryazanov wrote:
>
> If you don't have an index on the GROUP BY column, you need a lot of 
> memory for such queries in H2.
>
This does kind of make sense, but still not in this test scenario. How come 
the previous test cases (up to 1M rows, without an index) run fine, even with 
memory as low as -Xmx256m:
Executing with size: 1000
Processed 1000, time 30 ms
Executing with size: 10000
Processed 10000, time 50 ms
Executing with size: 100000
Processed 100000, time 241 ms
Executing with size: 1000000
Processed 1000000, time 1925 ms



Re: [h2] Group BY on "large" tables from file-system causes Out of Memory Error

2020-04-21 Thread Noel Grandin
Which version is this?

And what happens when you remove the dangerous options? (LOG and UNDO_LOG)
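
That is, a connection URL like the repro's, just without those two options:

// The repro's URL with the dangerous options (LOG=0, UNDO_LOG=0) dropped.
String h2Url = "jdbc:h2:file:/path/to/some/dir/group_by_5000000"
        + ";DB_CLOSE_ON_EXIT=FALSE"
        + ";AUTO_RECONNECT=TRUE"
        + ";FILE_LOCK=NO"
        + ";TRACE_LEVEL_FILE=0"
        + ";TRACE_LEVEL_SYSTEM_OUT=0"
        + ";CACHE_SIZE=65000";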



[h2] Group BY on "large" tables from file-system causes Out of Memory Error

2020-04-21 Thread MacMahon McCallister
Hello, I wrote a simple test case to reproduce the problem (it doesn't do 
any cleanup).

In this test scenario, the OOM always happens when H2 runs the GROUP BY 
query on the table with 5 million rows.
What could be the cause? The JVM memory options during my testing were 
-Xmx1024m.





import java.io.StringReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import org.h2.tools.RunScript;

void createTablesForGroupByScenario() throws SQLException {
    String dir = "/path/to/some/dir";
    Integer[] sizes = new Integer[] { 1000, 10_000, 100_000, 1_000_000, 5_000_000 };

    for (Integer size : sizes) {
        System.out.println("Creating table with size: " + size);

        String name = "group_by_" + size;
        String h2Url = "jdbc:h2:file:" + dir + "/" + name
                + ";DB_CLOSE_ON_EXIT=FALSE"
                + ";AUTO_RECONNECT=TRUE"
                + ";FILE_LOCK=NO"
                + ";TRACE_LEVEL_FILE=0"
                + ";TRACE_LEVEL_SYSTEM_OUT=0"
                + ";LOG=0"
                + ";UNDO_LOG=0"
                + ";CACHE_SIZE=" + 65000;

        Connection con = DriverManager.getConnection(h2Url);

        String initSql = "create table result(id bigint, name varchar, phone int);\n";
        RunScript.execute(con, new StringReader(initSql));
        con.commit();

        // Insert the rows in batches of 500 to keep client-side memory low.
        PreparedStatement st = con.prepareStatement("insert into result values (?, ?, ?)");
        for (int i = 0; i < size; i++) {
            st.setLong(1, i);
            st.setString(2, "name_" + i);
            st.setInt(3, i);
            st.addBatch();
            if (i % 500 == 0) {
                st.executeBatch();
                con.commit();
            }
        }
        st.executeBatch();
        con.commit();
        con.close();
    }
}


void forEveryDbCreatedRunGroupByQuery() throws SQLException {
    String dir = "/path/to/some/dir";
    Integer[] sizes = new Integer[] { 1000, 10_000, 100_000, 1_000_000, 5_000_000 };

    for (Integer size : sizes) {
        System.out.println("Running query for table with size: " + size);

        String name = "group_by_" + size;
        String h2Url = "jdbc:h2:file:" + dir + "/" + name
                + ";DB_CLOSE_ON_EXIT=FALSE"
                + ";AUTO_RECONNECT=TRUE"
                + ";FILE_LOCK=NO"
                + ";TRACE_LEVEL_FILE=0"
                + ";TRACE_LEVEL_SYSTEM_OUT=0"
                + ";LOG=0"
                + ";UNDO_LOG=0"
                + ";CACHE_SIZE=" + 65000;

        Connection con = DriverManager.getConnection(h2Url);

        String sql = "select id, sum(phone) from result group by id;\n";

        long start = System.currentTimeMillis();
        Statement st = con.createStatement();
        ResultSet rs = st.executeQuery(sql);

        int processed = 0;
        while (rs.next()) {
            // 'fake' result-set processing by just counting the rows
            processed++;
        }
        con.close();
        long time = System.currentTimeMillis() - start;
        System.out.println(String.format("Processed %s, time %s ms", processed, time));
    }
}





