Database servers often host a lot of databases, so I don't think having 1000s of independent DBs is beyond the design boundary. With proper cache management, I don't think we'll ever have all 1000s of them open at the same time, and in the few cases where we do, a few GB of heap isn't the end of the world.
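To make "proper cache management" a bit more concrete, here is a minimal sketch of an LRU cache that caps how many per-job databases stay open at once. Everything in it is hypothetical: the class name `OpenDbCache`, the `maxOpen` limit, and the `opener` callback are made up for illustration and are not an existing Jenkins or H2 API. It only relies on `java.util.LinkedHashMap` in access order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical LRU cache of open per-job database handles (illustration only). */
public class OpenDbCache<H extends AutoCloseable> {
    private final int maxOpen;
    private final LinkedHashMap<String, H> open;

    public OpenDbCache(int maxOpen) {
        this.maxOpen = maxOpen;
        // accessOrder=true keeps entries ordered from least to most recently used
        this.open = new LinkedHashMap<String, H>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, H> eldest) {
                if (size() <= OpenDbCache.this.maxOpen) {
                    return false;
                }
                try {
                    // Close the least recently used handle to release its heap
                    eldest.getValue().close();
                } catch (Exception e) {
                    // Sketch only: a real implementation would log this
                }
                return true;
            }
        };
    }

    /** Returns the open handle for a job, opening it via the callback if needed. */
    public synchronized H get(String jobName, Function<String, H> opener) {
        H handle = open.get(jobName);
        if (handle == null) {
            handle = opener.apply(jobName);
            open.put(jobName, handle); // may evict and close the eldest handle
        }
        return handle;
    }

    public synchronized boolean isOpen(String jobName) {
        return open.containsKey(jobName);
    }

    public synchronized int openCount() {
        return open.size();
    }
}
```

With `maxOpen` set to, say, a few hundred, the heap cost stays bounded no matter how many jobs exist. One caveat of this sketch: it closes an evicted handle even if a caller is still using it, so a real version would need reference counting or something similar.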
That said, we can and should try a few others, such as HSQLDB, to see if they have different characteristics. It might also be possible to make some small improvements to H2 for quick gains, since H2 probably isn't designed for 1000s of independent DBs in one JVM.

One benefit of a SQL DB is its popularity and the vast number of people who are familiar with it, including users. For example, having test data in a DB would allow users and other devs to come up with their own queries. It'll also make it easier for existing plugin devs to use it.

MapDB is a nice library on its own, but a Map is by definition a single index, so I'm not sure it works for many typical use cases. Take test reports: we need to be able to query all the failing ones for a given build, as well as all the past executions of a specific test case.

2012/12/5 Jesse Glick <[email protected]>

> On 12/05/2012 06:37 AM, Kohsuke Kawaguchi wrote:
>
>> H2 database, when opened, takes up about 1MB in heap.
>
> Seems excessive when typical jobs will have much less data than this that
> needs to be stored.
>
> I am still not convinced that using a SQL database for this kind of thing
> is appropriate.
>
> 1. Portability of SQL is a bit of a red herring because once you start
> using, say, H2 to store per-job data, you cannot casually switch to another
> DB without losing historical build records; and Jenkins would have to ship
> with _some_ DB plugin, or all plugins using the DB API would be broken. And
> for per-job DBs we are narrowing the field to those that are embeddable,
> which probably means just H2 in practice.
>
> 2. SQL databases are generally optimized for one slow-to-start instance, a
> small number of expensive connections, and maybe dozens of tables with lots
> of data. Whereas we need thousands of extremely cheap instances, each with
> one immediately available connection and a few tables with usually not so
> much data. The closer we can get to java.io.RandomAccessFile.<init> the
> better.
>
> Is there any fully free (so not BDB-JE) DB which is pure Java, embeddable,
> supports some kind of indices, supports compact binary schemas (so not e.g.
> Lucene or the current wave of JSON DBs), and has a very simple client API
> once the connection is set up; while being openable from one or two disk
> files in say under a millisecond with no significant penalty beyond the
> file descriptor? MapDB [1] looks most promising so far.
>
> [1] https://github.com/jankotek/mapdb

-- 
Kohsuke Kawaguchi
