Hi Yanni,

> On 12.08.2022 at 07:25, Yanni Chiu <yannix...@gmail.com> wrote:
>
> Sounds good. Using Fuel to serialize/deserialize should be fine, but I had
> an unhappy experience with early versions where a newer Fuel version could
> not read content written by earlier Fuel versions.

That is indeed the weakest part of Fuel that I know of, and something I need to take care of. I will use Fuel without the part of the header that complains about versions. What needs to be done is to see what changed between versions and provide a way to migrate. For this, the database files will most likely carry a version that is tied to the Fuel version being used.

> In my mind, a disk-based B-tree is the fundamental data structure for any
> database (e.g. GemStone, Omnibase, Relational DBs all have it), but I don’t
> see it mentioned so far. Everything else about a "database" is optional, IMHO.

Ah, right, I didn't write about it, but it is essential. My plan is to also have a disk-based B-tree dictionary that can keep parts like the keys (maybe even incrementally) in memory. But I'm not an expert in B-trees, so I need to figure out the gap between the requirements of a B-tree and my plans ;) Also, the B-tree will come later, once you are able to write and read something reliably.

> For example, a query language is not needed at all, if the data items are
> equivalent to objects running in a Smalltalk VM - the query language is
> Smalltalk itself. It “just” has to mesh well with the indexing.

I think it needs a query DSL. Using blocks has serious limitations for doing a proper query. The usage of instVars needs to be well known in order to analyze a query for index usage. Furthermore, the problem with something like #collect:thenSelect: is even more severe in a database. So for indexes and query optimizations it should be a query DSL that is more like

    aCollection select firstName = 'Norbert'

I will experiment a bit. Combining multiple queries in Smalltalk while using binary messages is a problem of its own.

> Another feature is MVCC. How much concurrency is needed, and how it is
> implemented, is an implementation choice. I’d mentioned LMDB on Discord. It
> has single writer, many readers. Choosing to have multiple simultaneous
> writers has a cost.
> When there is one or more big machines, it seems sensible to
> have the big machines use multiple writers. But when these databases scale
> up, the database ends up sharded, with parts of the data spread over multiple
> machines. Why not just start there, with many small machines owning a small
> part of the database, and use just a single writer process. The application
> can then deal with concurrency, conflicts and assembling results, in whatever
> manner best suits.

Almost all databases have only a single writer for one set of data. Sharding is a way to have virtual partitions of your data where you can have one writer per virtual partition, which removes the potential for conflicts. MVCC is not about having multiple writers, as that is hardly possible. The benefits of MVCC are that data in a transaction is fully isolated and writers do not block readers. I personally like that an MVCC database is an append-only store for object data. You have to have one concurrency approach, and MVCC is not really hard to implement.

In ApptiveGrid we have a fully partitioned model per user, where inter-user references use URNs as content address notation. This way we have hundreds of databases, each with a single writer. This is my main focus, but I think we are on the same page regarding that. If nothing changes, my talk at ESUG will be about this.

> So I see the end goal to be a set of frameworks, and some “recipes” for
> putting the frameworks together, to solve the “persistence” problem. These
> are just half-baked ideas of course. I’ll follow the project, and hope to
> contribute.

Thanks. I don't know how far this project will get. But I have a need and ideas. Just need time :) Let's see where the journey goes.
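
To make the MVCC point above a bit more concrete, here is a minimal sketch in Python (not Pharo, and not how Soil is or will be implemented). All names (`AppendOnlyStore`, `Snapshot`, `commit`, `snapshot`) are made up for illustration. It assumes a purely in-memory store in which every commit appends new versions instead of overwriting old ones, a lock serializes the single writer, and a reader is pinned to the last commit it saw when it started.

```python
import threading
from typing import Any, Dict, List, Optional, Tuple


class AppendOnlyStore:
    """Hypothetical in-memory MVCC store: versions are appended, never overwritten."""

    def __init__(self) -> None:
        # key -> list of (commit_id, value), in commit order
        self._versions: Dict[str, List[Tuple[int, Any]]] = {}
        self._last_commit = 0
        # Only one writer at a time; readers never take this lock.
        self._write_lock = threading.Lock()

    def snapshot(self) -> "Snapshot":
        """Start a read transaction pinned to the latest published commit."""
        return Snapshot(self, self._last_commit)

    def commit(self, changes: Dict[str, Any]) -> int:
        """Append one new version per changed key and publish the commit."""
        with self._write_lock:
            commit_id = self._last_commit + 1
            for key, value in changes.items():
                self._versions.setdefault(key, []).append((commit_id, value))
            # Publish last, so snapshots never see a half-written commit.
            self._last_commit = commit_id
            return commit_id

    def read(self, key: str, as_of: int) -> Optional[Any]:
        """Return the newest version of key that is not newer than as_of."""
        for commit_id, value in reversed(self._versions.get(key, [])):
            if commit_id <= as_of:
                return value
        return None


class Snapshot:
    """Read-only view of the store as of one commit."""

    def __init__(self, store: AppendOnlyStore, as_of: int) -> None:
        self._store = store
        self._as_of = as_of

    def get(self, key: str) -> Optional[Any]:
        return self._store.read(key, self._as_of)


store = AppendOnlyStore()
store.commit({"user:1": "Norbert"})
reader = store.snapshot()           # pinned to commit 1
store.commit({"user:1": "Yanni"})   # the writer does not disturb the reader
old_value = reader.get("user:1")            # "Norbert"
new_value = store.snapshot().get("user:1")  # "Yanni"
```

The reader keeps seeing the state of the commit it started with, while a later snapshot sees the new version. A real store would additionally need durable storage and garbage collection of old versions, both of which are left out here.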

Norbert

> Yanni
>
> On Wed, Aug 10, 2022 at 7:08 AM Norbert Hartl <norb...@hartl.name> wrote:
> FYI
>
> As the license situation around Omnibase is unlikely to change and such a
> license is not an enabler for collaboration, I came to the conclusion that
> there is no other way than to start a new OO database (what a surprise! :) )
>
> If you are interested or want to see what will happen in the next months you
> can have a look at
>
> https://github.com/ApptiveGrid/Soil
>
> After half a day of work it can already serialize and partition a simple
> graph and store it to disk in a way it can be read back even when
> partitioned. Thanks to having Fuel we do not need to reinvent the wheel and
> have serialization/materialization done.
> Of course this is super simple and does not come close to anything usable,
> but it is a good start. I've also added documentation to the GitHub repo that
> describes the parts most important to me that I will target.
>
> Questions, critiques and laughs are appreciated,
>
> Norbert
>
>> On 08.08.2022 at 15:18, Norbert Hartl <norb...@hartl.name> wrote:
>>
>> To all Omnibase and Monibase users.
>>
>> It turned out that neither of those is open source. The author of the
>> database contacted me to clarify the situation: he holds the copyright and
>> never released anything as open source. This means that I will remove the
>> Omnibase repositories in a few weeks from
>>
>> https://github.com/ApptiveGrid/MoniBase
>>
>> and
>>
>> https://github.com/pharo-nosql/OmniBase
>>
>> I'm very sorry about that, but someone just took the code 9 years ago,
>> copied it to GitHub and illegally put an MIT license on the repository. We
>> only want free software in our repositories and hence the above will go away.
>>
>> As we see it as essential to have a good OO database in Pharo, we will see
>> how much effort it will be to build a small and simple OO database that can
>> replace Omnibase.
>>
>> regards,
>>
>> Norbert