Hi Yanni,

> On 12.08.2022 at 07:25, Yanni Chiu <yannix...@gmail.com> wrote:
> 
> Sounds good. Using Fuel to serialize/deserialize should be fine, but I had 
> unhappy experience with early versions where a newer Fuel version could not 
> read content written by earlier Fuel versions.

That is indeed the weakest part of Fuel that I know of, and something I need 
to take care of. I will use Fuel without the part of the header that complains 
about version mismatches. What needs to be done is to see what changed between 
versions and build a way to migrate. For this, the database files will most 
likely carry a version number tied to the Fuel version being used.
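
To make the versioning idea concrete, here is a minimal sketch of a file 
header that carries a format version, plus a chain of migration steps applied 
on read. It is Python rather than Smalltalk, only to keep it short and 
runnable, and it is not Soil's or Fuel's actual format: the magic bytes, the 
version numbers and the migration functions are all made up for illustration.

```python
import io
import struct

MAGIC = b"SOIL"                  # hypothetical file magic, not a real format
HEADER = struct.Struct(">4sH")   # magic + unsigned 16-bit format version
CURRENT_VERSION = 3              # assumed version of the running code

# One migration step per old version: payload bytes in, payload bytes out.
# Real steps would rewrite the serialized graph; these only tag the payload.
MIGRATIONS = {
    1: lambda payload: payload + b"|v2",
    2: lambda payload: payload + b"|v3",
}


def write_record(stream, payload, version=CURRENT_VERSION):
    """Write the version header followed by the serialized payload."""
    stream.write(HEADER.pack(MAGIC, version))
    stream.write(payload)


def read_record(stream):
    """Read a record and migrate it step by step to CURRENT_VERSION."""
    magic, version = HEADER.unpack(stream.read(HEADER.size))
    if magic != MAGIC:
        raise ValueError("not a database file")
    if version > CURRENT_VERSION:
        raise ValueError("file was written by a newer version")
    payload = stream.read()
    while version < CURRENT_VERSION:
        payload = MIGRATIONS[version](payload)
        version += 1
    return payload


# Usage: a file written with version 1 is upgraded on read.
old_file = io.BytesIO()
write_record(old_file, b"data", version=1)
old_file.seek(0)
migrated = read_record(old_file)
```

The point of the design is that the reader never refuses an older file; it 
only needs one small migration per version step.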

> 
> In my mind, a disk-based B-tree is the fundamental data structure for any 
> database (e.g. GemStone, Omnibase, Relational DBs all have it), but I don’t 
> see it mentioned so far. Everything else about a "database" is optional, IMHO.
> 

Ah, right, I didn't write about it, but it is essential. My plan would be to 
also have a disk-based B-tree dictionary that can keep parts like the keys 
(maybe even incrementally) in memory. But I'm not an expert in B-trees, so I 
need to figure out what the gap is between the requirements of a B-tree and my 
plans ;) Also, the B-tree will come later, once it is possible to write and 
read something reliably.
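
For readers who have not seen one, here is a minimal sketch of the read path 
of a disk-based B-tree: one node per fixed-size page, and a lookup that walks 
from the root page down to a leaf. It is Python rather than Smalltalk, only to 
keep it runnable, and the page layout (4096-byte pages, 64-bit integer keys 
and pointers) is an assumption for illustration, not a plan for Soil. 
Insertion with node splitting is left out.

```python
import bisect
import io
import struct

PAGE_SIZE = 4096                # assumed page size
HEADER = struct.Struct(">BH")   # is_leaf flag, number of keys


def encode_node(is_leaf, keys, pointers):
    """Pack one node into a fixed-size page.

    Leaf node:  pointers[i] is the value offset stored for keys[i].
    Inner node: pointers holds len(keys) + 1 child page numbers.
    """
    body = HEADER.pack(int(is_leaf), len(keys))
    body += struct.pack(f">{len(keys)}Q", *keys)
    body += struct.pack(f">{len(pointers)}Q", *pointers)
    if len(body) > PAGE_SIZE:
        raise ValueError("node does not fit into one page")
    return body.ljust(PAGE_SIZE, b"\0")


def decode_node(page):
    """Unpack a page into (is_leaf, keys, pointers)."""
    is_leaf, count = HEADER.unpack_from(page, 0)
    keys = struct.unpack_from(f">{count}Q", page, HEADER.size)
    n_pointers = count if is_leaf else count + 1
    pointers = struct.unpack_from(
        f">{n_pointers}Q", page, HEADER.size + 8 * count
    )
    return bool(is_leaf), keys, pointers


def read_page(f, page_no):
    f.seek(page_no * PAGE_SIZE)
    return f.read(PAGE_SIZE)


def lookup(f, key, root_page=0):
    """Return the value offset stored for key, or None if it is absent."""
    page_no = root_page
    while True:
        is_leaf, keys, pointers = decode_node(read_page(f, page_no))
        if is_leaf:
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return pointers[i]
            return None
        # Child i holds keys < keys[i]; child i + 1 holds keys >= keys[i].
        page_no = pointers[bisect.bisect_right(keys, key)]


# Usage: a two-level tree with one inner root page and two leaf pages.
tree = io.BytesIO()
tree.write(encode_node(False, [10], [1, 2]))             # page 0: root
tree.write(encode_node(True, [1, 5], [100, 500]))        # page 1: keys < 10
tree.write(encode_node(True, [10, 20], [1000, 2000]))    # page 2: keys >= 10
offset = lookup(tree, 5)
```

Each lookup reads one page per tree level, which is why the upper levels (or 
just their keys) are worth keeping in memory.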

> For example, a query language is not needed at all, if the data items are 
> equivalent to objects running in a Smalltalk VM - the query language is 
> Smalltalk itself. It “just” has to mesh well with the indexing.
> 
I think it needs a query DSL. Using blocks has serious limitations for doing a 
proper query. The instVars a query uses need to be known in order to analyze 
the query for index usage. Furthermore, problems like #collect:thenSelect: are 
even more severe in a database. So for indexes and query optimization it 
should be a query DSL that looks more like 

aCollection select firstName = 'Norbert'

I will experiment a bit. Combining multiple queries in Smalltalk while using 
binary messages is a problem of its own.
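
A minimal sketch of why a DSL helps, in Python rather than Smalltalk only to 
keep it runnable: the comparison does not run a test, it builds a small query 
object that the database can inspect before executing. A block (or lambda) is 
opaque, so the field names it touches cannot be matched against an index. All 
names here (Field, Eq, And, select, the index layout) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Eq:
    """Query term: the named field equals the given value."""
    name: str
    value: object

    def __and__(self, other):
        return And(self, other)

    def matches(self, obj):
        return getattr(obj, self.name) == self.value

    def terms(self):
        return [self]


@dataclass(frozen=True)
class And:
    """Conjunction of two queries."""
    left: object
    right: object

    def __and__(self, other):
        return And(self, other)

    def matches(self, obj):
        return self.left.matches(obj) and self.right.matches(obj)

    def terms(self):
        return self.left.terms() + self.right.terms()


class Field:
    """Placeholder for an instance variable; == builds a query term."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        return Eq(self.name, value)


def select(objects, query, indexes=None):
    """Evaluate query; indexes maps field name -> {value: [objects]}."""
    candidates = objects
    for term in query.terms():
        index = (indexes or {}).get(term.name)
        if index is not None:
            # Every term of a conjunction must hold, so one indexed term
            # is enough to narrow the candidates before the full check.
            candidates = index.get(term.value, [])
            break
    return [obj for obj in candidates if query.matches(obj)]


@dataclass
class Person:
    firstName: str
    lastName: str


people = [Person("Norbert", "Hartl"), Person("Yanni", "Chiu")]
first_name_index = {"firstName": {}}
for person in people:
    first_name_index["firstName"].setdefault(person.firstName, []).append(person)

query = (Field("firstName") == "Norbert") & (Field("lastName") == "Hartl")
result = select(people, query, first_name_index)
```

Note the parentheses in the combined query: in Python `&` binds tighter than 
`==`. Smalltalk evaluates binary messages strictly left to right, so combining 
terms there needs similar care.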

> Another feature is MVCC. How much concurrency is needed, and how it is 
> implemented, is an implementation choice. I’d mentioned LMDB on Discord. It 
> has single writer, many readers. Choosing to have multiple simultaneous 
> writers has a cost. When there is one or more big machines, it seems sensible 
> to have the big machines use multiple writers. But when these databases scale 
> up, the database ends up sharded, with parts of the data spread over multiple 
> machines. Why not just start there, with many small machines owning a small 
> part of the database, and use just a single writer process. The application 
> can then deal with concurrency, conflicts and assembling results, in whatever 
> manner best suits.
> 
Almost all databases have only a single writer for one set of data. Sharding is 
a way to have virtual partitions of your data where you can have a writer per 
virtual partition, removing the potential for conflicts. MVCC is not about 
having multiple writers, as that is hardly possible. The benefits of MVCC are 
that data in a transaction is fully isolated and writers do not block readers. 
I personally like that an MVCC database is an append-only store for object 
data. You have to have one concurrency approach, and MVCC is not really hard to 
implement.
In ApptiveGrid we have a fully partitioned model per user, where 
inter-user references use URNs as content address notation. This way we have 
hundreds of databases where each has a single writer. This is my main focus, 
but I think we are on the same page regarding that. If nothing changes, my talk 
at ESUG will cover this.
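
The MVCC properties mentioned above (isolated snapshots, writers not blocking 
readers, an append-only store) fit in a small sketch. It is Python rather than 
Smalltalk only to keep it runnable, everything lives in memory, and all names 
are hypothetical. It assumes commits are serialized by a single writer, so no 
locking is shown.

```python
class Store:
    """Append-only version log, ordered by commit id."""

    def __init__(self):
        self.log = []          # entries are (commit_id, key, value)
        self.last_commit = 0

    def begin(self):
        # A transaction sees the state as of the last commit at start time.
        return Transaction(self, self.last_commit)


class Transaction:
    def __init__(self, store, snapshot):
        self.store = store
        self.snapshot = snapshot
        self.writes = {}       # private until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        # Newest version that is not newer than our snapshot.
        for commit_id, k, value in reversed(self.store.log):
            if k == key and commit_id <= self.snapshot:
                return value
        return None

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Conflict if a key we wrote was committed after our snapshot.
        for commit_id, k, _ in self.store.log:
            if commit_id > self.snapshot and k in self.writes:
                raise RuntimeError(f"conflict on {k!r}")
        self.store.last_commit += 1
        for k, value in self.writes.items():
            # Old versions are never overwritten, only appended to.
            self.store.log.append((self.store.last_commit, k, value))


# Usage: a reader keeps its snapshot while a later transaction commits.
store = Store()
setup = store.begin()
setup.write("name", "old")
setup.commit()

reader = store.begin()
writer = store.begin()
writer.write("name", "new")
writer.commit()

seen_by_reader = reader.read("name")       # still the old version
seen_afterwards = store.begin().read("name")
```

Because old versions stay in the log, the reader never has to wait for the 
writer; cleaning up versions nobody can see anymore is a separate concern not 
shown here.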

> So I see the end goal to be a set of frameworks, and some “recipes” for 
> putting the frameworks together, to solve the “persistence” problem. These 
> are just half-baked ideas of course. I’ll follow the project, and hope to 
> contribute.

Thanks. I don't know how far this project will get. But I have a need and 
ideas. Just need time :)

Let's see where the journey goes.

Norbert

> 
> Yanni
> 
> On Wed, Aug 10, 2022 at 7:08 AM Norbert Hartl <norb...@hartl.name 
> <mailto:norb...@hartl.name>> wrote:
> FYI
> 
> As the license situation around Omnibase is unlikely to change and such a 
> license is not an enabler for collaboration I came to the conclusion that 
> there is no other way than to start a new OO database (what a surprise! :) )
> 
> If you are interested or want to see what will happen in the next months you 
> can have a look at 
> 
> https://github.com/ApptiveGrid/Soil <https://github.com/ApptiveGrid/Soil>
> 
> After half a day of work it can already serialize and partition a simple 
> graph and store it to disk in a way it can be read back even when 
> partitioned. Thanks to having Fuel we do not need to reinvent the wheel and 
> have serialization/materialization done. 
> Of course this is super simple and does not come close to anything usable, 
> but it is a good start. I've also added documentation to the github repo to 
> describe the parts that are most important to me and that I will target.
> 
> Questions, critiques and laughs are appreciated,
> 
> Norbert
> 
>> On 08.08.2022 at 15:18, Norbert Hartl <norb...@hartl.name 
>> <mailto:norb...@hartl.name>> wrote:
>> 
>> To all Omnibase and Monibase users. 
>> 
>> It turned out that neither of those is open source. The author of the 
>> database contacted me, clarifying that he holds the copyright and never 
>> released anything as open source. This means that I will remove the 
>> Omnibase repositories in a few weeks from 
>> 
>> https://github.com/ApptiveGrid/MoniBase 
>> <https://github.com/ApptiveGrid/MoniBase>
>> 
>> and 
>> 
>> https://github.com/pharo-nosql/OmniBase 
>> <https://github.com/pharo-nosql/OmniBase>
>> 
>> I'm very sorry about that, but someone just took the code 9 years ago, 
>> copied it to github and illegally put an MIT license on the repository. We 
>> only want free software in our repositories and hence the above will go away.
>> 
>> As we see it as essential to have a good OO database in Pharo, we will see 
>> how much effort it will be to build a small and simple OO database that can 
>> replace Omnibase. 
>> 
>> regards,
>> 
>> Norbert
> 
