I haven't traced it through all the way. *org.h2.result.LocalResultImpl#addRowsToDisk* is a good starting point.
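For anyone who wants to actually step through that path: lowering MAX_MEMORY_ROWS should (if I remember that setting right) force even a modest result set onto disk, so a breakpoint in *addRowsToDisk* should get hit. A rough, untested sketch; the database name, table and the 100-row threshold are all made up:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SpillToDiskDemo {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:./spilldemo");
             Statement st = conn.createStatement()) {

            st.execute("DROP TABLE IF EXISTS t");
            st.execute("CREATE TABLE t(id INT PRIMARY KEY, payload VARCHAR(100))");
            st.execute("INSERT INTO t SELECT x, 'row-' || x FROM SYSTEM_RANGE(1, 100000)");

            // keep only 100 rows of any result set in memory; the rest should
            // be buffered to disk, which is where addRowsToDisk comes in
            st.execute("SET MAX_MEMORY_ROWS 100");

            // ORDER BY on a non-indexed column so the full result has to be built
            try (ResultSet rs = st.executeQuery("SELECT * FROM t ORDER BY payload")) {
                long count = 0;
                while (rs.next()) {
                    count++;
                }
                System.out.println("rows read: " + count);
            }
        }
    }
}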
By the way, I was using the word serialize 'loosely'; it's not using Java serialization to serialize the result set. From a quick code review, it's placing the result sets into *org.h2.mvstore.db.MVPlainTempResult* / *org.h2.mvstore.db.MVSortedTempResult*, so it's using these MVStore classes to temporarily persist the ResultSet to disk.

Based on the information here http://www.h2database.com/html/mvstore.html it may be possible to start returning rows from the MVStore while the full content is still being written (rough sketch of the MVStore API at the end of this message). That's probably easier than replacing this whole implementation with a new mechanism like CSV or ASCII, as it already contains the ResultSet metadata like types, sizes, etc., which would all need a brand-new custom encoding for CSV/ASCII.

On Tuesday, November 10, 2020 at 3:03:23 AM UTC-5 [email protected] wrote:
> Noel and Val, good morning.
>
> Would you like to give us a hint, where exactly this
> "temporally serialization of a resultset" happens in the code?
> We could try to modify it so it writes flat ASCII files or CSV files
> instead of binary blobs (on demand, per hint), which would enable us to
> read line by line and access the first rows while the full content is
> still being written.
>
> Best regards
> Andreas
>
> On Tue, 2020-11-10 at 08:52 +0200, Noel Grandin wrote:
> > On Tuesday, November 10, 2020 at 12:03:30 AM UTC-5 val wrote:
> > > I'm guessing this is how the h2 client/driver (not the db engine
> > > itself) behaves to handle a ResultSet. It will fully serialize the
> > > contents of the ResultSet to disk first and then serve that.
> > > Perhaps this is necessary to
> >
> > No, it's just the easiest way to handle multiple result sets performantly,
> > because then the engine doesn't have to remember any state about the
> > result set.
> >
> > On 2020/11/10 7:31 am, val wrote:
> > > Adding *LAZY_QUERY_EXECUTION=1* to the connection url is closer to what
> > > I was looking for.
> >
> > Yeah, that is currently our best answer for extremely large result sets.
> > Note that it only works for relatively simple queries, and it will likely
> > be quite bad performance-wise if you have more than one connection
> > attempting to hit the same table.
> >
> > The alternative is to grab chunks of the resultset at a time, using
> > OFFSET...LIMIT....
> >
> > > I wonder where the bottleneck is. What would it take to bring this 20
> > > minutes total iteration time down to the 6 minutes it takes if you
> > > prebuild the ResultSet first.
> >
> > You'd have to run a profiler, either our simple one (see our performance
> > page), or one of the other ones, to see where the bottleneck is.
> >
> > TBH, H2 is mostly optimised for small quick queries, not really large ones.
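P.S. To make the MVStore idea above a bit more concrete, here is roughly what writing rows into a temporary MVStore map and reading them back looks like, using only the public MVStore API from the page linked above. This is not what *MVPlainTempResult* / *MVSortedTempResult* actually do internally; the file name, map name and Object[] row layout are made up, and it's kept single-threaded just to show the API. The MVStore page talks about concurrent read/write access, which is what makes me think returning rows before the writer finishes might be feasible.

import org.h2.mvstore.MVMap;
import org.h2.mvstore.MVStore;

public class TempResultSketch {
    public static void main(String[] args) {
        // temporary store backing the "spilled" result set (file name made up)
        MVStore store = new MVStore.Builder()
                .fileName("/tmp/temp-result.mv")
                .open();
        try {
            // rows keyed by their position in the result set
            MVMap<Long, Object[]> rows = store.openMap("tempResult");

            // writer side: append rows as they are produced by the query
            for (long i = 0; i < 1000; i++) {
                rows.put(i, new Object[] { i, "value-" + i });
            }

            // reader side: rows that are already in the map can be handed to
            // the caller; the open question is whether H2's temp-result
            // classes could expose them before the writer is finished
            long available = rows.sizeAsLong();
            for (long i = 0; i < available; i++) {
                Object[] row = rows.get(i);
                System.out.println(row[0] + " -> " + row[1]);
            }
        } finally {
            store.close();
        }
    }
}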
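P.P.S. For anyone landing on this thread later, here are the two workarounds Noel mentions above as plain JDBC. LAZY_QUERY_EXECUTION goes into the connection URL, and the chunked variant just re-runs the query with LIMIT/OFFSET. Database, table and column names are made up, and the chunked loop assumes a stable ORDER BY:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class LargeResultWorkarounds {

    // Option 1: lazy execution -- rows are produced while you iterate,
    // but only for relatively simple queries (see Noel's caveats above)
    static void lazy() throws Exception {
        String url = "jdbc:h2:./demo;LAZY_QUERY_EXECUTION=1";
        try (Connection conn = DriverManager.getConnection(url);
             PreparedStatement ps = conn.prepareStatement("SELECT id, payload FROM t");
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                process(rs.getLong(1), rs.getString(2));
            }
        }
    }

    // Option 2: read the result in fixed-size chunks with LIMIT/OFFSET
    static void chunked() throws Exception {
        int chunk = 10000;
        try (Connection conn = DriverManager.getConnection("jdbc:h2:./demo");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT id, payload FROM t ORDER BY id LIMIT ? OFFSET ?")) {
            for (long offset = 0; ; offset += chunk) {
                ps.setInt(1, chunk);
                ps.setLong(2, offset);
                int read = 0;
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        process(rs.getLong(1), rs.getString(2));
                        read++;
                    }
                }
                if (read < chunk) {
                    break; // last (possibly partial) chunk
                }
            }
        }
    }

    static void process(long id, String payload) {
        // placeholder for whatever the application does with each row
    }
}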
