Re: Pluggable backends for refs,wip
On Thu, Aug 7, 2014 at 5:57 AM, Michael Haggerty mhag...@alum.mit.edu wrote:
> On 08/05/2014 02:40 PM, Ronnie Sahlberg wrote:
> > Please see https://github.com/rsahlberg/git/tree/backend-struct-db-2 for an example of a pluggable backend for refs storage. This series contains changes to make it possible to add new backends for handling/storage of refs and implements one new backend: refs-be-db.c. This new backend offloads the actual refs handling to a small database daemon with which it talks via a very simple RPC protocol. That daemon in turn connects to the datastore and reads/writes the values to it.
> > [...]
>
> Ronnie,
>
> This is awesome! Congratulations on your progress. I'm still on vacation and haven't yet looked at the code. I will be back next week and hope to find time to check it out, and also to do some more review of the code that you have already submitted to git core.

Thanks!

> Have you thought about how to test alternate reference backends? This will be very important to getting one or more of them accepted into git core (not to mention giving people confidence to actually *use* them!)

I have thought about it and also done some experiments.

For the initial git support, I think we should first try to get the pluggable backend support into git, and also the work to change the current files backend into a built-in pluggable backend. I.e. get everything in the https://github.com/rsahlberg/git/tree/backend-struct-db-2 branch except the last three patches. That brings us to a stage where we have pluggable backend support and we have one backend, the files backend, that works just like today. The last three patches in that series are then just confirmation that the pluggable backend approach works, and we can add them a little later once we finish tests and other things.

For tests there is the issue of git-clone and git-init requiring two additional arguments in order to set up and initialize a repository to use the database daemon backend. Other future backends, I would imagine, would have similar needs.

The way I handled this in the experiments I did was to use two new environment variables, GIT_INIT and GIT_CLONE, that would default to git-init and git-clone respectively, and then just override them with

    GIT_INIT="git-init --db-repo-name=ROCKy --db-socket=/tmp/refsd.socket"

when I wanted the tests to initialize a database backend repository. This required some updates to test-lib.sh and test-lib-functions.sh, as well as to the tests themselves, to use ${GIT_INIT} instead of git-init directly.

I am not sure what the best approach here is and would love it if you could help out with this once we get the basic pluggable backend stuff in.

> It seems to me that a few steps are needed:
>
> * Each backend would need a suite of backend-aware tests that verify proper operation *within* the backend. These tests would mostly use low-level plumbing commands like update-ref to create/modify/delete references, and would be allowed to grub around in the filesystem, talk directly with the database, etc. to make sure that the commands have the correct effects. For example, for the traditional filesystem backend, these tests would be the ones to check that creating a reference causes a file to spring into existence under $GIT_DIR/refs.

Yes. Quite a few tests do muck around with the files directly. Some do so for good reasons, but I think there are a lot of cases where the tests do it just out of convenience.

For this we will need to convert the tests that don't strictly need to muck around with the files to use a backend-agnostic method to do the same checks.

For the tests that are truly testing the backend itself, such as a hypothetical test to check that a symbolic link to a ref behaves as it should, we will need a mechanism where we can make the tests conditional on what the current backend is. So lots of "if backend == database then skip this test".

>   The tests for pack-refs, and all tests that care about the distinction between packed and loose refs, would become part of the backend-aware tests for the filesystem backend. All of the backend-aware tests should be run every time the test suite is run (provided, of course, that the correct prerequisites are available, and subject to being turned off manually).
>
> * The rest of the test suite has to be made backend-agnostic. For example, such tests should *not* be allowed to look under $GIT_DIR for the existence/absence of loose reference files [1] but would rather have to inquire about references via git commands.
>
> * It should be possible for the developer to choose easily which reference backend to use when running the agnostic part of the test suite. The chosen backend should be used to run *all* backend-agnostic tests.

Agree. It would be great if we could work on this together.

> A database-backed backend might even want to be testable in two modes: one with the DB daemon running constantly, and one where the daemon is
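The GIT_INIT override and the per-backend skip described in this message could be sketched roughly as follows. This is an illustration in Python, not the shell code of test-lib.sh; the helper names `init_command` and `should_skip` are hypothetical, and only the GIT_INIT variable, its git-init default and the two --db-* options come from the message.

```python
import os
import shlex


def init_command(env=None):
    """Return the argv used to create a test repository.

    GIT_INIT falls back to plain git-init when it is unset, so an
    unconfigured run still creates a files-backend repository.
    """
    env = os.environ if env is None else env
    return shlex.split(env.get("GIT_INIT", "git-init"))


def should_skip(required_backend, current_backend):
    """Skip a test that is tied to one backend when another is in use.

    required_backend is None for tests that do not depend on how refs
    are stored (hypothetical convention for this sketch).
    """
    return required_backend is not None and required_backend != current_backend
```

With GIT_INIT unset, `init_command({})` yields the plain git-init command; with the override from the message it yields git-init followed by the two database options, so a test that calls the helper never needs to know which backend is active.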
Re: Pluggable backends for refs,wip
On 08/05/2014 02:40 PM, Ronnie Sahlberg wrote:
> Please see https://github.com/rsahlberg/git/tree/backend-struct-db-2 for an example of a pluggable backend for refs storage. This series contains changes to make it possible to add new backends for handling/storage of refs and implements one new backend: refs-be-db.c. This new backend offloads the actual refs handling to a small database daemon with which it talks via a very simple RPC protocol. That daemon in turn connects to the datastore and reads/writes the values to it.
> [...]

Ronnie,

This is awesome! Congratulations on your progress. I'm still on vacation and haven't yet looked at the code. I will be back next week and hope to find time to check it out, and also to do some more review of the code that you have already submitted to git core.

Have you thought about how to test alternate reference backends? This will be very important to getting one or more of them accepted into git core (not to mention giving people confidence to actually *use* them!)

It seems to me that a few steps are needed:

* Each backend would need a suite of backend-aware tests that verify proper operation *within* the backend. These tests would mostly use low-level plumbing commands like update-ref to create/modify/delete references, and would be allowed to grub around in the filesystem, talk directly with the database, etc. to make sure that the commands have the correct effects. For example, for the traditional filesystem backend, these tests would be the ones to check that creating a reference causes a file to spring into existence under $GIT_DIR/refs.

  The tests for pack-refs, and all tests that care about the distinction between packed and loose refs, would become part of the backend-aware tests for the filesystem backend. All of the backend-aware tests should be run every time the test suite is run (provided, of course, that the correct prerequisites are available, and subject to being turned off manually).

* The rest of the test suite has to be made backend-agnostic. For example, such tests should *not* be allowed to look under $GIT_DIR for the existence/absence of loose reference files [1] but would rather have to inquire about references via git commands.

* It should be possible for the developer to choose easily which reference backend to use when running the agnostic part of the test suite. The chosen backend should be used to run *all* backend-agnostic tests.

A database-backed backend might even want to be testable in two modes: one with the DB daemon running constantly, and one where the daemon is stopped and started between each pair of Git commands.

So after the changes, a single run of the test suite should run the backend-aware tests for *all* known backends followed by the backend-agnostic tests for a single selected backend.

Michael

[1] When I was working on my quagga-reference spike [2] I found that a lot of the test suite uses knowledge about how references and reflogs are stored by the filesystem backend and just grabs at the files rather than accessing the references using git commands. It will take some work to clean this up.

[2] http://thread.gmane.org/gmane.comp.version-control.git/243726

--
Michael Haggerty
mhag...@alum.mit.edu
--
To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
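The run order proposed in this message (backend-aware tests for every known backend, then the backend-agnostic tests against one selected backend) can be written down as a small planning function. This is only a sketch in Python under stated assumptions: the `Script` record, the `plan_run` function and the script names in the usage note are all hypothetical and do not correspond to anything in git's test suite.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Script:
    """One test script; backend is None when the script is backend-agnostic."""
    name: str
    backend: Optional[str] = None


def plan_run(scripts: List[Script], known_backends: List[str],
             selected: str) -> List[Tuple[str, str]]:
    """Return (script name, backend to run it against) pairs in run order."""
    if selected not in known_backends:
        raise ValueError(f"unknown backend: {selected}")
    # Backend-aware scripts always run, each against its own backend.
    aware = [(s.name, b) for b in known_backends
             for s in scripts if s.backend == b]
    # Every agnostic script runs once, against the selected backend only.
    agnostic = [(s.name, selected) for s in scripts if s.backend is None]
    return aware + agnostic
```

For example, with one agnostic script "branch-create", one files-only script "files-pack-refs" and one database-only script "db-rpc", selecting the "db" backend schedules both backend-aware scripts first and then runs "branch-create" against "db".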
Re: Pluggable backends for refs,wip
Personally (a user of, not a maintainer of, git) I really want some alternative backends. In particular I'm after something like Fossil's use of SQLite3; I want a SQLite3 backend for several reasons, not the least of which is the power of SQL for looking at history.

I'm not sure that I necessarily want a daemon/background process. I get the appeal (add inotify and bingo, very fast git status, always), but it seems likely to add obnoxious failure modes.

As to a SQLite3-type backend, I am of two minds: either add it as a bolt-on to the builtin backend, or add it as a first-class backend that replaces the builtin one. The former is nice because the SQLite3 DB becomes more of a cache/index and query engine than a store, and can be used without migrating any repos, but the latter is also nice because SQLite3 provides strong ACID transactional semantics on local filesystems.

Nico
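To make the point about transactional semantics concrete, here is a minimal sketch of an all-or-nothing ref update on top of SQLite3, using Python's standard sqlite3 module. The one-table schema and the `open_store` and `update_refs` helpers are assumptions made for illustration; nothing here reflects an existing git or Fossil design.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical refs table: one row per ref."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS refs (name TEXT PRIMARY KEY, sha TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def update_refs(conn, updates):
    """Apply (name, expected_old, new) updates atomically.

    Either every update is stored or none is. Returns True on success
    and False when any ref did not hold its expected old value.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for name, old, new in updates:
                row = conn.execute(
                    "SELECT sha FROM refs WHERE name = ?", (name,)
                ).fetchone()
                current = row[0] if row else None
                if current != old:
                    raise ValueError(f"{name}: expected {old!r}, found {current!r}")
                conn.execute(
                    "INSERT OR REPLACE INTO refs (name, sha) VALUES (?, ?)",
                    (name, new),
                )
    except ValueError:
        return False
    return True
```

If a batch touches two refs and the second one fails its old-value check, the write to the first is rolled back as well, which is the behaviour a first-class store would rely on.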
Re: Pluggable backends for refs,wip
On Tue, Aug 5, 2014 at 2:56 PM, Nico Williams n...@cryptonector.com wrote:
> Personally (a user of, not a maintainer of, git) I really want some alternative backends. In particular I'm after something like Fossil's use of SQLite3; I want a SQLite3 backend for several reasons, not the least of which is the power of SQL for looking at history.
>
> I'm not sure that I necessarily want a daemon/background process. I get the appeal (add inotify and bingo, very fast git status, always), but it seems likely to add obnoxious failure modes.
>
> As to a SQLite3-type backend, I am of two minds: either add it as a bolt-on to the builtin backend, or add it as a first-class backend that replaces the builtin one. The former is nice because the SQLite3 DB becomes more of a cache/index and query engine than a store, and can be used without migrating any repos, but the latter is also nice because SQLite3 provides strong ACID transactional semantics on local filesystems.

This will allow you to do either or both, depending on what you want.

I am adding one new first-class backend, refs-be-db.c, which talks to a separate daemon, refsd-tdb.c. refsd-tdb.c is 7 RPCs and ~500 lines of code for a naive implementation of a standalone daemon.

If you would rather have a new first-class backend built into git itself instead of a separate daemon, that will be possible too. It just means that you will have to base the work on refs-be-db.c, which is a much larger and more complex code base than refsd-tdb.c.

But yeah, once this work is finished, you will be able to build new first-class ref backends if you so wish. Please see refs-be-db.c: that is the file, and those are the methods, you will need to implement in order to have a first-class SQL* backend.

regards
ronnie sahlberg
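The general shape of a pluggable backend, a fixed set of methods that every backend implements plus a way to pick one by name, might look like the sketch below. It is written in Python purely for illustration; the method names, the `MemoryBackend` class and the `BACKENDS` registry are hypothetical and are not the functions that refs-be-db.c actually defines.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, Optional, Tuple, Type


class RefBackend(ABC):
    """Hypothetical set of operations a ref backend must provide."""

    @abstractmethod
    def read_ref(self, name: str) -> Optional[str]: ...

    @abstractmethod
    def update_ref(self, name: str, sha: str) -> None: ...

    @abstractmethod
    def delete_ref(self, name: str) -> bool: ...

    @abstractmethod
    def iter_refs(self, prefix: str = "") -> Iterator[Tuple[str, str]]: ...


class MemoryBackend(RefBackend):
    """Toy backend that keeps refs in a dict; stands in for a real store."""

    def __init__(self) -> None:
        self._refs: Dict[str, str] = {}

    def read_ref(self, name: str) -> Optional[str]:
        return self._refs.get(name)

    def update_ref(self, name: str, sha: str) -> None:
        self._refs[name] = sha

    def delete_ref(self, name: str) -> bool:
        # True if the ref existed and was removed, False otherwise.
        return self._refs.pop(name, None) is not None

    def iter_refs(self, prefix: str = "") -> Iterator[Tuple[str, str]]:
        for name in sorted(self._refs):
            if name.startswith(prefix):
                yield name, self._refs[name]


# Registry mapping a backend name to its implementation.
BACKENDS: Dict[str, Type[RefBackend]] = {"memory": MemoryBackend}


def get_backend(name: str) -> RefBackend:
    """Instantiate the backend registered under the given name."""
    return BACKENDS[name]()
```

A SQL-backed store would then be one more class registered in `BACKENDS`, and callers that only use the `RefBackend` methods would not need to change.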
Re: Pluggable backends for refs,wip
Excellent. Thanks!