Hi Val, The reason for the execution time difference that you see between full scan of the original table (LAZY_QUERY_EXECUTION, 20min) vs. scan of the intermediate temp table (6 min), could be related to the fact that database may be very sparsely populated (as a result of inserts/updates during it's construction), whereas temp table supposedly is pretty dense (built in special "append" mode). I wonder, what is the size of that temp file? It would show you amount of real data you have, and I suspect it's somewhat smaller than 40+GB in you database. You can try do "SHUTDOWN COMPACT" (assuming you are using v.1.4.200 and have a backup), and most likely size of the db and it's scan time will go down after that (to ~6 min for lazy exec?), unless I am missing something and lazy execution has some significant overhead.
On Wednesday, November 11, 2020 at 12:14:38 AM UTC-5 val wrote: > If I understand what Evgenij is saying, that temp file is how the > h2-db-engine behaves, afterall, it is NOT the result set, but is a internal > temporary H2 structure. > > I still dont understand why ResultSet iteration so much faster from this > temporary structure, compared to directly from the database using > LAZY_QUERY_EXECUTION. We are talking 6 minutes vs 20 minutes. I can > understand if there were other requirements like sorting, filtering or > joining; then I could see how that would require a temporary structure. But > in this case the query is a basic SELECT * FROM TABLE, why would the > LAZY_QUERY_EXECUTION approach not be able to iterate all the rows in the > same 6 minutes as with the temporary structure. > > Perhaps the rows in the database are stored differently in the db than in > the temporary structure that makes them easier/cheaper to iterate? > > Separately; should we use LAZY_QUERY_EXECUTION=TRUE by default? If it can > already detect if it needs to use a temporary structure or not; is there > any drawback? Its a bit of an odd programming pattern to have to setup > different connection urls depending on what you are trying to select. > On Tuesday, November 10, 2020 at 8:13:24 PM UTC-5 Evgenij Ryazanov wrote: > >> Hello. >> >> There is nothing to do with temporary results in H2. If you have large >> queries and they may return parts of results early, use >> `LAZY_QUERY_EXECUTION=TRUE`. If some query doesn't support lazy execution, >> there is nothing to do with it in H2 itself. With >> `LAZY_QUERY_EXECUTION=TRUE` H2 uses LocalResult if and only if rows need a >> post-processing, such as sorting, distinct filtration, or OFFSET / FETCH >> filtration and these operations can't be performed early, for example, >> because there are no compatible indexes or some previous step needs a >> complete set of rows. In all these cases rows passed to LocalResult are not >> the same rows as will be returned by ResultSet. >> > -- You received this message because you are subscribed to the Google Groups "H2 Database" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/h2-database/99dbd6af-7705-4f25-a44d-6d45e11f86b2n%40googlegroups.com.
