[email protected] (Ludovic Courtès) writes: > Hi reepca! > > I gave this patch set a try and looked at the code, and it looks very > good to me! > > “make check TESTS=tests/store.scm” passes now with the fixes you posted > today (‘getenv’ & co.)
I end up with 2 failed tests, but neither of them are the register-path one. > For deduplication, you already know the (guix hash) API that will allow > you to do that I guess. The “deduplication database” is simply the > /gnu/store/.links directory. Each entry there is a hard link to a file; > the name of the entry is the hex representation of the sha256 hash of > that file. So when inserting a file in the store, all you need is to > look up its hash in /gnu/store/.links and hard-link from there. See > what I mean? And if it isn't in /gnu/store/.links, put a hardlink there? Also, when you say "when inserting a file", what do you mean by that? I was under the impression that the paths given to register-path were already in the store but simply weren't in the database or otherwise "officially" in the store. Is my assumption incorrect? Less on-topic, I don't really know how sha256 works, but if it is a hashing function, there must be a possibility (albeit small) of collisions, right? Is it worth developing a strategy for handling them? > Some minor suggestions about the code: Implemented all of these except maybe half of the commit log one... I'm not sure what the commit message headline should be, so I sort of guessed based on some of the ones in recent history, and I'm not sure I've got the formatting quite right yet, but it should be a lot closer. > What about pushing your changes to a WIP branch on Savannah or > elsewhere? (If you have an account on Savannah we can give you access.) I just made an account today as "reepca". Should I submit a "request for inclusion"? In the meantime, here's a much less fixup-riddled patch set that I think is just about feature-complete as far as register-path is concerned:
>From ce4a322446d1865791686b1e4573973573bdcdfc Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt <[email protected]> Date: Sat, 3 Jun 2017 02:26:05 -0500 Subject: [PATCH 1/7] guix: register-path: Implement prototype in scheme. * guix/store.scm (register-path): reimplement in scheme. * guix/sql.scm: New file. --- gnu/packages/package-management.scm | 3 +- guix/sql.scm | 224 ++++++++++++++++++++++++++++++++++++ guix/store.scm | 78 ++++++++++--- 3 files changed, 286 insertions(+), 19 deletions(-) create mode 100644 guix/sql.scm diff --git a/gnu/packages/package-management.scm b/gnu/packages/package-management.scm index af91ec1d7..50be3a23f 100644 --- a/gnu/packages/package-management.scm +++ b/gnu/packages/package-management.scm @@ -250,7 +250,8 @@ (propagated-inputs `(("gnutls" ,gnutls/guile-2.2) ;for 'guix download' & co. ("guile-json" ,guile-json) - ("guile-ssh" ,guile-ssh))) + ("guile-ssh" ,guile-ssh) + ("guile-sqlite3" ,guile-sqlite3))) (home-page "https://www.gnu.org/software/guix/") (synopsis "Functional package manager for installed software packages and versions") diff --git a/guix/sql.scm b/guix/sql.scm new file mode 100644 index 000000000..b1e0c0aa4 --- /dev/null +++ b/guix/sql.scm @@ -0,0 +1,224 @@ +(define-module (guix sql) + #:use-module (sqlite3) + #:use-module (system foreign) + #:use-module (rnrs bytevectors) + #:use-module (srfi srfi-9) + #:export (sqlite-register)) + +;; Miscellaneous SQL stuff, currently just setup for sqlite-register. Mostly +;; macros. + +;; This really belongs in guile-sqlite3, as can be seen from the @@s. +(define sqlite-last-insert-rowid + (let ((last-rowid (pointer->procedure + int + (dynamic-func "sqlite3_last_insert_rowid" + (@@ (sqlite3) libsqlite3)) + (list '*)))) + (lambda (db) + "Gives the row id of the last inserted row in DB." + (last-rowid ((@@ (sqlite3) db-pointer) db))))) + + +;; Should I go from key->index here or try to change that in guile-sqlite3? +(define-syntax sql-parameters + (syntax-rules () + "Converts key-value pairs into sqlite bindings for a specific statement." + ((sql-parameters statement (name1 val1) (name2 val2) (name3 val3) ...) + (begin (sqlite-bind statement name1 val1) + (sql-parameters statement (name2 val2) (name3 val3) ...))) + ((sql-parameters statement (name value)) + (sqlite-bind statement name value)))) + +(define* (step-all statement #:optional (callback noop)) + "Step until statement is completed. Return number of rows." + ;; Where "number of rows" is assumed to be number of steps taken, excluding + ;; the last one. + (let maybe-step ((ret (sqlite-step statement)) + (count 0)) + (if ret + (maybe-step ret (+ count 1)) + count))) + +;; I get the feeling schemers have probably already got this "with" business +;; much more automated than this... +(define-syntax with-sql-statement + (syntax-rules () + "Automatically prepares statements and then finalizes statements once the +scope of this macro is left. Also with built-in sqlite parameter binding via +key-value pairs." + ((with-sql-statement db sql statement-var + ((name1 val1) (name2 val2) ...) + exps ...) + (let ((statement-var (sqlite-prepare db sql))) + (dynamic-wind noop + (lambda () + (sql-parameters statement-var + (name1 val1) + (name2 val2) ...) + exps ...) + (lambda () + (sqlite-finalize statement-var))))) + ((with-sql-statement db sql statement-var () exps ...) + (let ((statement-var (sqlite-prepare db sql))) + (dynamic-wind noop + (lambda () + exps ...) + (lambda () + (sqlite-finalize statement-var))))))) + +(define-syntax with-sql-database + (syntax-rules () + "Automatically closes the database once the scope of this macro is left." + ((with-sql-database location db-var exps ...) + (let ((db-var (sqlite-open location))) + (dynamic-wind noop + (lambda () + exps ...) + (lambda () + (sqlite-close db-var))))))) + +(define-syntax run-sql + (syntax-rules () + "For one-off queries that don't get repeated on the same +database. Everything after database and sql source should be 2-element lists +containing the sql placeholder name and the value to use. Returns the number +of rows." + ((run-sql db sql (name1 val1) (name2 val2) ...) + (let ((statement (sqlite-prepare db sql))) + (dynamic-wind noop + (lambda () + (sql-parameters statement + (name1 val1) + (name2 val2) ...) + (step-all statement)) + (lambda () + (sqlite-finalize statement))))) + ((run-sql db sql) + (let ((statement (sqlite-prepare db sql))) + (dynamic-wind noop + (lambda () + (step-all statement)) + (lambda () + (sqlite-finalize statement))))))) + +(define-syntax run-statement + (syntax-rules () + "For compiled statements that may be run multiple times. Everything after +database and sql source should be 2-element lists containing the sql +placeholder name and the value to use. Returns the number of rows." + ((run-sql db statement (name1 val1) (name2 val2) ...) + (dynamic-wind noop + (lambda () + (sql-parameters statement + (name1 val1) + (name2 val2) ...) + (step-all statement)) + (lambda () + (sqlite-reset statement)))) + ((run-sql db statement) + (dynamic-wind noop + (lambda () + (step-all statement)) + (lambda () + (sqlite-reset statement)))))) + +(define path-id-sql + "SELECT id FROM ValidPaths WHERE path = $path") + +(define (single-result statement) + "Gives the first element of the first row returned by statement." + (let ((row (sqlite-step statement))) + (if row + (vector-ref row 0) + #f))) + +(define* (path-id db path) + "If the path \"path\" exists in the ValidPaths table, return its +id. Otherwise, return #f. If you already have a compiled statement for this +purpose, you can give it as statement." + (with-sql-statement db path-id-sql statement + (;("$path" path) + (1 path)) + (single-result statement))) + + +(define update-sql + "UPDATE ValidPaths SET hash = $hash, registrationTime = $time, deriver = +$deriver, narSize = $size WHERE id = $id") + +(define insert-sql + "INSERT INTO ValidPaths (path, hash, registrationTime, deriver, narSize) +VALUES ($path, $hash, $time, $deriver, $size)") + +(define* (update-or-insert #:key db path deriver hash nar-size time) + "The classic update-if-exists and insert-if-doesn't feature that sqlite +doesn't exactly have... they've got something close, but it involves deleting +and re-inserting instead of updating, which causes problems with foreign keys, +of course. Returns the row id of the row that was modified or inserted." + (let ((id (path-id db path))) + (if id + (begin + (run-sql db update-sql + ;; As you may have noticed, sqlite-bind doesn't behave + ;; exactly how I was expecting... + ;; ("$id" id) + ;; ("$deriver" deriver) + ;; ("$hash" hash) + ;; ("$size" nar-size) + ;; ("$time" time) + (5 id) + (3 deriver) + (1 hash) + (4 nar-size) + (2 time)) + id) + (begin + (run-sql db insert-sql + ;; ("$path" path) + ;; ("$deriver" deriver) + ;; ("$hash" hash) + ;; ("$size" nar-size) + ;; ("$time" time) + (1 path) + (4 deriver) + (2 hash) + (5 nar-size) + (3 time)) + (sqlite-last-insert-rowid db))))) + +(define add-reference-sql + "INSERT OR IGNORE INTO Refs (referrer, reference) SELECT $referrer, id +FROM ValidPaths WHERE path = $reference") + +(define (add-references db referrer references) + "referrer is the id of the referring store item, references is a list +containing store item paths being referred to. Note that all of the store +items in \"references\" should already be registered." + (with-sql-statement db add-reference-sql add-reference-statement () + (for-each (lambda (reference) + (run-statement db + add-reference-statement + ;("$referrer" referrer) + ;("$reference" reference) + (1 referrer) + (2 reference))) + references))) + +;; XXX figure out caching of statement and database objects... later +(define* (sqlite-register #:key dbpath path references deriver hash nar-size) + "Registers this stuff in a database specified by DBPATH. PATH is the string +path of some store item, REFERENCES is a list of string paths which the store +item PATH refers to (they need to be already registered!), DERIVER is a string +path of the derivation that created the store item PATH, HASH is the +base16-encoded sha256 hash of the store item denoted by PATH (prefixed with +\"sha256:\") after being converted to nar form, and nar-size is the size in +bytes of the store item denoted by PATH after being converted to nar form." + (with-sql-database dbpath db + (let ((id (update-or-insert #:db db + #:path path + #:deriver deriver + #:hash hash + #:nar-size nar-size + #:time (current-time)))) + (add-references db id references)))) diff --git a/guix/store.scm b/guix/store.scm index c94dfea95..a62fcf3f1 100644 --- a/guix/store.scm +++ b/guix/store.scm @@ -27,6 +27,7 @@ #:use-module (guix hash) #:autoload (guix build syscalls) (terminal-columns) #:use-module (rnrs bytevectors) + #:use-module (rnrs io ports) #:use-module (ice-9 binary-ports) #:use-module (srfi srfi-1) #:use-module (srfi srfi-9) @@ -41,6 +42,8 @@ #:use-module (ice-9 vlist) #:use-module (ice-9 popen) #:use-module (web uri) + #:use-module (sqlite3) + #:use-module (guix sql) #:export (%daemon-socket-uri %gc-roots-directory %default-substitute-urls @@ -1206,32 +1209,71 @@ The result is always the empty list unless the daemon was started with This makes sense only when the daemon was started with '--cache-failures'." boolean) + +;; Would it be better to just make WRITE-FILE give size as well? I question +;; the general utility of this approach. +(define (counting-wrapper-port output-port) + "Some custom ports don't implement GET-POSITION at all. But if we want to +figure out how many bytes are being written, we will want to use that. So this +makes a wrapper around a port which implements GET-POSITION." + (let ((byte-count 0)) + (make-custom-binary-output-port "counting-wrapper" + (lambda (bytes offset count) + (set! byte-count + (+ byte-count count)) + (put-bytevector output-port bytes + offset count) + count) + (lambda () + byte-count) + #f + (lambda () + (close-port output-port))))) + + +(define (nar-sha256 file) + "Gives the sha256 hash of a file and the size of the file in nar form." + (let-values (((port get-hash) (open-sha256-port))) + (let ((wrapper (counting-wrapper-port port))) + (write-file file wrapper) + (force-output wrapper) + (force-output port) + (let ((hash (get-hash)) + (size (port-position wrapper))) + (close-port wrapper) + (values hash + size))))) + +;; TODO: make this canonicalize store items that are registered. This involves +;; setting permissions and timestamps, I think. Also, run a "deduplication +;; pass", whatever that involves. Also, honor environment variables. Also, +;; handle databases not existing yet (what should the default behavior be? +;; Figuring out how the C++ stuff currently does it sounds like a lot of +;; grepping for global variables...) + (define* (register-path path - #:key (references '()) deriver prefix - state-directory) + #:key (references '()) deriver (prefix "") + (state-directory + (string-append prefix %state-directory))) "Register PATH as a valid store file, with REFERENCES as its list of references, and DERIVER as its deriver (.drv that led to it.) If PREFIX is -not #f, it must be the name of the directory containing the new store to -initialize; if STATE-DIRECTORY is not #f, it must be a string containing the +given, it must be the name of the directory containing the new store to +initialize; if STATE-DIRECTORY is given, it must be a string containing the absolute file name to the state directory of the store being initialized. Return #t on success. Use with care as it directly modifies the store! This is primarily meant to be used internally by the daemon's build hook." - ;; Currently this is implemented by calling out to the fine C++ blob. - (let ((pipe (apply open-pipe* OPEN_WRITE %guix-register-program - `(,@(if prefix - `("--prefix" ,prefix) - '()) - ,@(if state-directory - `("--state-directory" ,state-directory) - '()))))) - (and pipe - (begin - (format pipe "~a~%~a~%~a~%" - path (or deriver "") (length references)) - (for-each (cut format pipe "~a~%" <>) references) - (zero? (close-pipe pipe)))))) + (let* ((to-register (string-append %store-directory "/" (basename path)))) + (let-values (((hash nar-size) + (nar-sha256 (string-append prefix "/" to-register)))) + (sqlite-register #:dbpath (string-append state-directory "/db/db.sqlite") + #:path to-register + #:references references + #:deriver deriver + #:hash (string-append "sha256:" + (bytevector->base16-string hash)) + #:nar-size nar-size)))) ;;; -- 2.13.0
>From ef072b28b764c192a31e4a4c7cd1b384e0943e49 Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt <[email protected]> Date: Mon, 5 Jun 2017 01:34:28 -0500 Subject: [PATCH 2/7] guix: register-path: Honor environment variables. * guix/store.scm (register-path): Honor environment variables involving the store, state directory, or database path. Update copyright info. * guix/sql.scm: Add copyright notice. --- guix/sql.scm | 18 +++++++++++++++++ guix/store.scm | 61 +++++++++++++++++++++++++++++++++++++++++++--------------- 2 files changed, 63 insertions(+), 16 deletions(-) diff --git a/guix/sql.scm b/guix/sql.scm index b1e0c0aa4..b6153e332 100644 --- a/guix/sql.scm +++ b/guix/sql.scm @@ -1,3 +1,21 @@ +;;; Copyright © 2017 Caleb Ristvedt <[email protected]> +;;; +;;; This file is part of GNU Guix. +;;; +;;; GNU Guix is free software; you can redistribute it and/or modify it +;;; under the terms of the GNU General Public License as published by +;;; the Free Software Foundation; either version 3 of the License, or (at +;;; your option) any later version. +;;; +;;; GNU Guix is distributed in the hope that it will be useful, but +;;; WITHOUT ANY WARRANTY; without even the implied warranty of +;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;;; GNU General Public License for more details. +;;; +;;; You should have received a copy of the GNU General Public License +;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. + + (define-module (guix sql) #:use-module (sqlite3) #:use-module (system foreign) diff --git a/guix/store.scm b/guix/store.scm index a62fcf3f1..2f16ec2b1 100644 --- a/guix/store.scm +++ b/guix/store.scm @@ -1,5 +1,6 @@ ;;; GNU Guix --- Functional package management for GNU ;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017 Ludovic Courtès <[email protected]> +;;; Copyright © 2017 Caleb Ristvedt <[email protected]> ;;; ;;; This file is part of GNU Guix. ;;; @@ -1246,15 +1247,17 @@ makes a wrapper around a port which implements GET-POSITION." ;; TODO: make this canonicalize store items that are registered. This involves ;; setting permissions and timestamps, I think. Also, run a "deduplication -;; pass", whatever that involves. Also, honor environment variables. Also, -;; handle databases not existing yet (what should the default behavior be? -;; Figuring out how the C++ stuff currently does it sounds like a lot of -;; grepping for global variables...) +;; pass", whatever that involves. Also, handle databases not existing yet +;; (what should the default behavior be? Figuring out how the C++ stuff +;; currently does it sounds like a lot of grepping for global +;; variables...). Also, return #t on success like the documentation says we +;; should. (define* (register-path path - #:key (references '()) deriver (prefix "") - (state-directory - (string-append prefix %state-directory))) + #:key (references '()) deriver prefix + state-directory) + ;; Priority for options: first what is given, then environment variables, + ;; then defaults. "Register PATH as a valid store file, with REFERENCES as its list of references, and DERIVER as its deriver (.drv that led to it.) If PREFIX is given, it must be the name of the directory containing the new store to @@ -1264,16 +1267,42 @@ Return #t on success. Use with care as it directly modifies the store! This is primarily meant to be used internally by the daemon's build hook." - (let* ((to-register (string-append %store-directory "/" (basename path)))) + (let* ((db-dir (cond + (state-directory + (string-append state-directory "/db")) + (prefix + (string-append prefix %state-directory "/db")) + ((getenv "NIX_DB_DIR") + (getenv "NIX_DB_DIR")) + ((getenv "NIX_STATE_DIR") + (string-append (getenv "NIX_STATE_DIR") "/db")) + (else + (string-append %state-directory "/db")))) + (store-dir (if prefix + (string-append prefix %store-directory) + (or + (getenv "NIX_STORE_DIR") + (getenv "NIX_STORE") + %store-directory))) + (to-register (if prefix + ;; note: we assume here that if path is, for example, + ;; /foo/bar/gnu/store/thing.txt, then an environment + ;; variable has been used to change the store + ;; directory to /foo/bar/gnu/store. + (string-append %store-directory "/" (basename path)) + path)) + (real-path (string-append store-dir "/" + (basename path)))) (let-values (((hash nar-size) - (nar-sha256 (string-append prefix "/" to-register)))) - (sqlite-register #:dbpath (string-append state-directory "/db/db.sqlite") - #:path to-register - #:references references - #:deriver deriver - #:hash (string-append "sha256:" - (bytevector->base16-string hash)) - #:nar-size nar-size)))) + (nar-sha256 real-path))) + (sqlite-register + #:dbpath (string-append db-dir "/db.sqlite") + #:path to-register + #:references references + #:deriver deriver + #:hash (string-append "sha256:" + (bytevector->base16-string hash)) + #:nar-size nar-size)))) ;;; -- 2.13.0
>From 855a2347ee90cda2835deffda0a7fcb1348b8aef Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt <[email protected]> Date: Mon, 5 Jun 2017 20:28:33 -0500 Subject: [PATCH 3/7] .dir-locals.el: properly indent sql macros. * .dir-locals.el: add indentation for with-sql-statement, with-sql-database, run-sql, and run-statement. * guix/sql.scm: use that indentation. (step-all): fix invocation of named-let so it works for multi-step statements. --- .dir-locals.el | 5 +++ guix/sql.scm | 99 ++++++++++++++++++++++++++++++---------------------------- 2 files changed, 57 insertions(+), 47 deletions(-) diff --git a/.dir-locals.el b/.dir-locals.el index 04b58d2ce..d5caef0a6 100644 --- a/.dir-locals.el +++ b/.dir-locals.el @@ -73,6 +73,11 @@ (eval . (put 'wrap-program 'scheme-indent-function 1)) (eval . (put 'with-imported-modules 'scheme-indent-function 1)) + (eval . (put 'with-sql-statement 'scheme-indent-function 1)) + (eval . (put 'with-sql-database 'scheme-indent-function 1)) + (eval . (put 'run-sql 'scheme-indent-function 1)) + (eval . (put 'run-statement 'scheme-indent-function 1)) + (eval . (put 'call-with-container 'scheme-indent-function 1)) (eval . (put 'container-excursion 'scheme-indent-function 1)) (eval . (put 'eventually 'scheme-indent-function 1)) diff --git a/guix/sql.scm b/guix/sql.scm index b6153e332..d5c72105b 100644 --- a/guix/sql.scm +++ b/guix/sql.scm @@ -55,7 +55,7 @@ (let maybe-step ((ret (sqlite-step statement)) (count 0)) (if ret - (maybe-step ret (+ count 1)) + (maybe-step (sqlite-step statement) (+ count 1)) count))) ;; I get the feeling schemers have probably already got this "with" business @@ -72,8 +72,8 @@ key-value pairs." (dynamic-wind noop (lambda () (sql-parameters statement-var - (name1 val1) - (name2 val2) ...) + (name1 val1) + (name2 val2) ...) exps ...) (lambda () (sqlite-finalize statement-var))))) @@ -155,10 +155,11 @@ placeholder name and the value to use. Returns the number of rows." "If the path \"path\" exists in the ValidPaths table, return its id. Otherwise, return #f. If you already have a compiled statement for this purpose, you can give it as statement." - (with-sql-statement db path-id-sql statement - (;("$path" path) - (1 path)) - (single-result statement))) + (with-sql-statement + db path-id-sql statement + (;("$path" path) + (1 path)) + (single-result statement))) (define update-sql @@ -177,32 +178,34 @@ of course. Returns the row id of the row that was modified or inserted." (let ((id (path-id db path))) (if id (begin - (run-sql db update-sql - ;; As you may have noticed, sqlite-bind doesn't behave - ;; exactly how I was expecting... - ;; ("$id" id) - ;; ("$deriver" deriver) - ;; ("$hash" hash) - ;; ("$size" nar-size) - ;; ("$time" time) - (5 id) - (3 deriver) - (1 hash) - (4 nar-size) - (2 time)) + (run-sql + db update-sql + ;; As you may have noticed, sqlite-bind doesn't behave + ;; exactly how I was expecting... + ;; ("$id" id) + ;; ("$deriver" deriver) + ;; ("$hash" hash) + ;; ("$size" nar-size) + ;; ("$time" time) + (5 id) + (3 deriver) + (1 hash) + (4 nar-size) + (2 time)) id) (begin - (run-sql db insert-sql - ;; ("$path" path) - ;; ("$deriver" deriver) - ;; ("$hash" hash) - ;; ("$size" nar-size) - ;; ("$time" time) - (1 path) - (4 deriver) - (2 hash) - (5 nar-size) - (3 time)) + (run-sql + db insert-sql + ;; ("$path" path) + ;; ("$deriver" deriver) + ;; ("$hash" hash) + ;; ("$size" nar-size) + ;; ("$time" time) + (1 path) + (4 deriver) + (2 hash) + (5 nar-size) + (3 time)) (sqlite-last-insert-rowid db))))) (define add-reference-sql @@ -213,15 +216,16 @@ FROM ValidPaths WHERE path = $reference") "referrer is the id of the referring store item, references is a list containing store item paths being referred to. Note that all of the store items in \"references\" should already be registered." - (with-sql-statement db add-reference-sql add-reference-statement () - (for-each (lambda (reference) - (run-statement db - add-reference-statement - ;("$referrer" referrer) - ;("$reference" reference) - (1 referrer) - (2 reference))) - references))) + (with-sql-statement + db add-reference-sql add-reference-statement () + (for-each (lambda (reference) + (run-statement + db add-reference-statement + ;("$referrer" referrer) + ;("$reference" reference) + (1 referrer) + (2 reference))) + references))) ;; XXX figure out caching of statement and database objects... later (define* (sqlite-register #:key dbpath path references deriver hash nar-size) @@ -232,11 +236,12 @@ path of the derivation that created the store item PATH, HASH is the base16-encoded sha256 hash of the store item denoted by PATH (prefixed with \"sha256:\") after being converted to nar form, and nar-size is the size in bytes of the store item denoted by PATH after being converted to nar form." - (with-sql-database dbpath db - (let ((id (update-or-insert #:db db - #:path path - #:deriver deriver - #:hash hash - #:nar-size nar-size - #:time (current-time)))) + (with-sql-database + dbpath db + (let ((id (update-or-insert #:db db + #:path path + #:deriver deriver + #:hash hash + #:nar-size nar-size + #:time (current-time)))) (add-references db id references)))) -- 2.13.0
>From e3619f840c69ea75669cf302fa3054ddc53aefb5 Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt <[email protected]> Date: Mon, 5 Jun 2017 21:31:24 -0500 Subject: [PATCH 4/7] guix: sql.scm: split into generic and store-specific parts. * guix/sql.scm (path-id-sql, path-id, update-sql, insert-sql, update-or-insert, add-reference-sql, add-references, sqlite-register): removed. (sqlite-parameter-index): new procedure. (sqlite-parameters): use sqlite-parameter-index, works with parameter names instead of indexes now. Updated callers. * guix/store/database.scm: new file. (path-id-sql, path-id, update-sql, insert-sql, update-or-insert, add-reference-sql, add-references, sqlite-register): added. * guix/store.scm: use (guix store database) instead of (guix sql). --- guix/sql.scm | 134 ++++++++++++------------------------------------ guix/store.scm | 2 +- guix/store/database.scm | 104 +++++++++++++++++++++++++++++++++++++ 3 files changed, 138 insertions(+), 102 deletions(-) create mode 100644 guix/store/database.scm diff --git a/guix/sql.scm b/guix/sql.scm index d5c72105b..6b6f7867d 100644 --- a/guix/sql.scm +++ b/guix/sql.scm @@ -21,7 +21,20 @@ #:use-module (system foreign) #:use-module (rnrs bytevectors) #:use-module (srfi srfi-9) - #:export (sqlite-register)) + #:export (sqlite-last-insert-rowid + sql-parameters + with-sql-statement + with-sql-database + run-sql + run-statement + single-result) + #:re-export (sqlite-step + sqlite-fold + sqlite-fold-right + sqlite-map + sqlite-prepare + sqlite-reset + sqlite-finalize)) ;; Miscellaneous SQL stuff, currently just setup for sqlite-register. Mostly ;; macros. @@ -37,16 +50,31 @@ "Gives the row id of the last inserted row in DB." (last-rowid ((@@ (sqlite3) db-pointer) db))))) +(define sqlite-parameter-index + (let ((param-index (pointer->procedure + int + (dynamic-func "sqlite3_bind_parameter_index" + (@@ (sqlite3) libsqlite3)) + (list '* '*)))) + (lambda (statement key) + "Gives the index of an sqlite parameter for a certain statement with a +certain (string) name." + (param-index ((@@ (sqlite3) stmt-pointer) statement) + (string->pointer key "utf-8"))))) + -;; Should I go from key->index here or try to change that in guile-sqlite3? (define-syntax sql-parameters (syntax-rules () "Converts key-value pairs into sqlite bindings for a specific statement." ((sql-parameters statement (name1 val1) (name2 val2) (name3 val3) ...) - (begin (sqlite-bind statement name1 val1) + (begin (sqlite-bind statement + (sqlite-parameter-index statement name1) + val1) (sql-parameters statement (name2 val2) (name3 val3) ...))) ((sql-parameters statement (name value)) - (sqlite-bind statement name value)))) + (sqlite-bind statement + (sqlite-parameter-index statement name) + value)))) (define* (step-all statement #:optional (callback noop)) "Step until statement is completed. Return number of rows." @@ -141,8 +169,7 @@ placeholder name and the value to use. Returns the number of rows." (lambda () (sqlite-reset statement)))))) -(define path-id-sql - "SELECT id FROM ValidPaths WHERE path = $path") + (define (single-result statement) "Gives the first element of the first row returned by statement." @@ -150,98 +177,3 @@ placeholder name and the value to use. Returns the number of rows." (if row (vector-ref row 0) #f))) - -(define* (path-id db path) - "If the path \"path\" exists in the ValidPaths table, return its -id. Otherwise, return #f. If you already have a compiled statement for this -purpose, you can give it as statement." - (with-sql-statement - db path-id-sql statement - (;("$path" path) - (1 path)) - (single-result statement))) - - -(define update-sql - "UPDATE ValidPaths SET hash = $hash, registrationTime = $time, deriver = -$deriver, narSize = $size WHERE id = $id") - -(define insert-sql - "INSERT INTO ValidPaths (path, hash, registrationTime, deriver, narSize) -VALUES ($path, $hash, $time, $deriver, $size)") - -(define* (update-or-insert #:key db path deriver hash nar-size time) - "The classic update-if-exists and insert-if-doesn't feature that sqlite -doesn't exactly have... they've got something close, but it involves deleting -and re-inserting instead of updating, which causes problems with foreign keys, -of course. Returns the row id of the row that was modified or inserted." - (let ((id (path-id db path))) - (if id - (begin - (run-sql - db update-sql - ;; As you may have noticed, sqlite-bind doesn't behave - ;; exactly how I was expecting... - ;; ("$id" id) - ;; ("$deriver" deriver) - ;; ("$hash" hash) - ;; ("$size" nar-size) - ;; ("$time" time) - (5 id) - (3 deriver) - (1 hash) - (4 nar-size) - (2 time)) - id) - (begin - (run-sql - db insert-sql - ;; ("$path" path) - ;; ("$deriver" deriver) - ;; ("$hash" hash) - ;; ("$size" nar-size) - ;; ("$time" time) - (1 path) - (4 deriver) - (2 hash) - (5 nar-size) - (3 time)) - (sqlite-last-insert-rowid db))))) - -(define add-reference-sql - "INSERT OR IGNORE INTO Refs (referrer, reference) SELECT $referrer, id -FROM ValidPaths WHERE path = $reference") - -(define (add-references db referrer references) - "referrer is the id of the referring store item, references is a list -containing store item paths being referred to. Note that all of the store -items in \"references\" should already be registered." - (with-sql-statement - db add-reference-sql add-reference-statement () - (for-each (lambda (reference) - (run-statement - db add-reference-statement - ;("$referrer" referrer) - ;("$reference" reference) - (1 referrer) - (2 reference))) - references))) - -;; XXX figure out caching of statement and database objects... later -(define* (sqlite-register #:key dbpath path references deriver hash nar-size) - "Registers this stuff in a database specified by DBPATH. PATH is the string -path of some store item, REFERENCES is a list of string paths which the store -item PATH refers to (they need to be already registered!), DERIVER is a string -path of the derivation that created the store item PATH, HASH is the -base16-encoded sha256 hash of the store item denoted by PATH (prefixed with -\"sha256:\") after being converted to nar form, and nar-size is the size in -bytes of the store item denoted by PATH after being converted to nar form." - (with-sql-database - dbpath db - (let ((id (update-or-insert #:db db - #:path path - #:deriver deriver - #:hash hash - #:nar-size nar-size - #:time (current-time)))) - (add-references db id references)))) diff --git a/guix/store.scm b/guix/store.scm index 2f16ec2b1..f32cdc6aa 100644 --- a/guix/store.scm +++ b/guix/store.scm @@ -44,7 +44,7 @@ #:use-module (ice-9 popen) #:use-module (web uri) #:use-module (sqlite3) - #:use-module (guix sql) + #:use-module (guix store database) #:export (%daemon-socket-uri %gc-roots-directory %default-substitute-urls diff --git a/guix/store/database.scm b/guix/store/database.scm new file mode 100644 index 000000000..ecf7ba4aa --- /dev/null +++ b/guix/store/database.scm @@ -0,0 +1,104 @@ +;;; Copyright © 2017 Caleb Ristvedt <[email protected]> +;;; +;;; This file is part of GNU Guix. +;;; +;;; GNU Guix is free software; you can redistribute it and/or modify it +;;; under the terms of the GNU General Public License as published by +;;; the Free Software Foundation; either version 3 of the License, or (at +;;; your option) any later version. +;;; +;;; GNU Guix is distributed in the hope that it will be useful, but +;;; WITHOUT ANY WARRANTY; without even the implied warranty of +;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;;; GNU General Public License for more details. +;;; +;;; You should have received a copy of the GNU General Public License +;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. + +(define-module (guix store database) + #:use-module (guix sql) + #:export (sqlite-register)) + +;;; Code for working with the store database directly. + +(define path-id-sql + "SELECT id FROM ValidPaths WHERE path = $path") + +(define* (path-id db path) + "If the path \"path\" exists in the ValidPaths table, return its +id. Otherwise, return #f." + (with-sql-statement + db path-id-sql statement + (("$path" path)) + (single-result statement))) + + +(define update-sql + "UPDATE ValidPaths SET hash = $hash, registrationTime = $time, deriver = +$deriver, narSize = $size WHERE id = $id") + +(define insert-sql + "INSERT INTO ValidPaths (path, hash, registrationTime, deriver, narSize) +VALUES ($path, $hash, $time, $deriver, $size)") + +(define* (update-or-insert #:key db path deriver hash nar-size time) + "The classic update-if-exists and insert-if-doesn't feature that sqlite +doesn't exactly have... they've got something close, but it involves deleting +and re-inserting instead of updating, which causes problems with foreign keys, +of course. Returns the row id of the row that was modified or inserted." + (let ((id (path-id db path))) + (if id + (begin + (run-sql + db update-sql + ("$id" id) + ("$deriver" deriver) + ("$hash" hash) + ("$size" nar-size) + ("$time" time)) + id) + (begin + (run-sql + db insert-sql + ("$path" path) + ("$deriver" deriver) + ("$hash" hash) + ("$size" nar-size) + ("$time" time)) + (sqlite-last-insert-rowid db))))) + +(define add-reference-sql + "INSERT OR IGNORE INTO Refs (referrer, reference) SELECT $referrer, id +FROM ValidPaths WHERE path = $reference") + +(define (add-references db referrer references) + "referrer is the id of the referring store item, references is a list +containing store item paths being referred to. Note that all of the store +items in \"references\" should already be registered." + (with-sql-statement + db add-reference-sql add-reference-statement () + (for-each (lambda (reference) + (run-statement + db add-reference-statement + ("$referrer" referrer) + ("$reference" reference))) + references))) + +;; XXX figure out caching of statement and database objects... later +(define* (sqlite-register #:key dbpath path references deriver hash nar-size) + "Registers this stuff in a database specified by DBPATH. PATH is the string +path of some store item, REFERENCES is a list of string paths which the store +item PATH refers to (they need to be already registered!), DERIVER is a string +path of the derivation that created the store item PATH, HASH is the +base16-encoded sha256 hash of the store item denoted by PATH (prefixed with +\"sha256:\") after being converted to nar form, and nar-size is the size in +bytes of the store item denoted by PATH after being converted to nar form." + (with-sql-database + dbpath db + (let ((id (update-or-insert #:db db + #:path path + #:deriver deriver + #:hash hash + #:nar-size nar-size + #:time (current-time)))) + (add-references db id references)))) -- 2.13.0
>From ae3f83c82c46d8dec3dce5a3ff89805111c8e7ab Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt <[email protected]> Date: Mon, 5 Jun 2017 22:34:59 -0500 Subject: [PATCH 5/7] guix: register-path: use new %store-database-directory * guix/config.scm.in (%store-database-directory): new variable. * guix/store.scm (register-path): use variables from (guix config) instead of using environment variables directly. --- guix/config.scm.in | 6 ++++++ guix/store.scm | 41 ++++++++++++++++++++++------------------- 2 files changed, 28 insertions(+), 19 deletions(-) diff --git a/guix/config.scm.in b/guix/config.scm.in index 8f2c4abd8..dfe5fe0db 100644 --- a/guix/config.scm.in +++ b/guix/config.scm.in @@ -1,5 +1,6 @@ ;;; GNU Guix --- Functional package management for GNU ;;; Copyright © 2012, 2013, 2014, 2015, 2016 Ludovic Courtès <[email protected]> +;;; Copyright © 2017 Caleb Ristvedt <[email protected]> ;;; ;;; This file is part of GNU Guix. ;;; @@ -29,6 +30,7 @@ %store-directory %state-directory + %store-database-directory %config-directory %guix-register-program @@ -80,6 +82,10 @@ (or (getenv "NIX_STATE_DIR") (string-append %localstatedir "/guix"))) +(define %store-database-directory + (or (and=> (getenv "NIX_DB_DIR") canonicalize-path) + (string-append %state-directory "/db"))) + (define %config-directory ;; This must match `GUIX_CONFIGURATION_DIRECTORY' as defined in `nix/local.mk'. (or (getenv "GUIX_CONFIGURATION_DIRECTORY") diff --git a/guix/store.scm b/guix/store.scm index f32cdc6aa..77fd5b51e 100644 --- a/guix/store.scm +++ b/guix/store.scm @@ -1257,7 +1257,10 @@ makes a wrapper around a port which implements GET-POSITION." #:key (references '()) deriver prefix state-directory) ;; Priority for options: first what is given, then environment variables, - ;; then defaults. + ;; then defaults. %state-directory, %store-directory, and + ;; %store-database-directory already handle the "environment variables / + ;; defaults" question, so we only need to choose between what is given and + ;; those. "Register PATH as a valid store file, with REFERENCES as its list of references, and DERIVER as its deriver (.drv that led to it.) If PREFIX is given, it must be the name of the directory containing the new store to @@ -1271,28 +1274,28 @@ be used internally by the daemon's build hook." (state-directory (string-append state-directory "/db")) (prefix - (string-append prefix %state-directory "/db")) - ((getenv "NIX_DB_DIR") - (getenv "NIX_DB_DIR")) - ((getenv "NIX_STATE_DIR") - (string-append (getenv "NIX_STATE_DIR") "/db")) + ;; If prefix is specified, the value of NIX_STATE_DIR + ;; (which affects %state-directory) isn't supposed to + ;; affect db-dir, only the compile-time-customized + ;; default should. + (string-append prefix %localstatedir "/guix/db")) (else - (string-append %state-directory "/db")))) + %store-database-directory))) (store-dir (if prefix - (string-append prefix %store-directory) - (or - (getenv "NIX_STORE_DIR") - (getenv "NIX_STORE") - %store-directory))) + ;; same situation as above + (string-append prefix %storedir) + %store-directory)) (to-register (if prefix - ;; note: we assume here that if path is, for example, - ;; /foo/bar/gnu/store/thing.txt, then an environment - ;; variable has been used to change the store - ;; directory to /foo/bar/gnu/store. - (string-append %store-directory "/" (basename path)) + (string-append %storedir "/" (basename path)) + ;; note: we assume here that if path is, for + ;; example, /foo/bar/gnu/store/thing.txt and prefix + ;; isn't given, then an environment variable has + ;; been used to change the store directory to + ;; /foo/bar/gnu/store, since otherwise real-path + ;; would end up being /gnu/store/thing.txt, which is + ;; probably not the right file in this case. path)) - (real-path (string-append store-dir "/" - (basename path)))) + (real-path (string-append store-dir "/" (basename path)))) (let-values (((hash nar-size) (nar-sha256 real-path))) (sqlite-register -- 2.13.0
>From caccdd578b2bb87a23d4bdbe29039595391bf768 Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt <[email protected]> Date: Tue, 6 Jun 2017 00:04:54 -0500 Subject: [PATCH 6/7] guix: register-path: reset timestamps after registering. * guix/store.scm (register-path): Now resets timestamps. --- guix/store.scm | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/guix/store.scm b/guix/store.scm index 77fd5b51e..cf08da632 100644 --- a/guix/store.scm +++ b/guix/store.scm @@ -45,6 +45,7 @@ #:use-module (web uri) #:use-module (sqlite3) #:use-module (guix store database) + #:use-module (gnu build install) #:export (%daemon-socket-uri %gc-roots-directory %default-substitute-urls @@ -1245,17 +1246,14 @@ makes a wrapper around a port which implements GET-POSITION." (values hash size))))) -;; TODO: make this canonicalize store items that are registered. This involves -;; setting permissions and timestamps, I think. Also, run a "deduplication -;; pass", whatever that involves. Also, handle databases not existing yet -;; (what should the default behavior be? Figuring out how the C++ stuff -;; currently does it sounds like a lot of grepping for global -;; variables...). Also, return #t on success like the documentation says we -;; should. +;; TODO: Run a "deduplication pass", whatever that involves. Also, handle +;; databases not existing yet (what should the default behavior be? Figuring +;; out how the C++ stuff currently does it sounds like a lot of grepping for +;; global variables...). Also, return #t on success like the documentation +;; says we should. (define* (register-path path - #:key (references '()) deriver prefix - state-directory) + #:key (references '()) deriver prefix state-directory) ;; Priority for options: first what is given, then environment variables, ;; then defaults. %state-directory, %store-directory, and ;; %store-database-directory already handle the "environment variables / @@ -1305,7 +1303,13 @@ be used internally by the daemon's build hook." #:deriver deriver #:hash (string-append "sha256:" (bytevector->base16-string hash)) - #:nar-size nar-size)))) + #:nar-size nar-size) + ;; reset-timestamps prints a message on each invocation that we probably + ;; don't want. + (with-output-to-port + (%make-void-port "w") + (lambda () + (reset-timestamps real-path)))))) ;;; -- 2.13.0
>From 23c5dfb1b32e8ba9e820ce9866fc44a28b5603c2 Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt <[email protected]> Date: Tue, 6 Jun 2017 02:44:41 -0500 Subject: [PATCH 7/7] guix: register-path: do deduplication. * guix/store.scm (get-temp-link, replace-with-link, deduplicate): new procedures. (register-path): uses deduplicate now. --- guix/store.scm | 47 ++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 40 insertions(+), 7 deletions(-) diff --git a/guix/store.scm b/guix/store.scm index cf08da632..6284736fa 100644 --- a/guix/store.scm +++ b/guix/store.scm @@ -43,7 +43,6 @@ #:use-module (ice-9 vlist) #:use-module (ice-9 popen) #:use-module (web uri) - #:use-module (sqlite3) #:use-module (guix store database) #:use-module (gnu build install) #:export (%daemon-socket-uri @@ -1246,11 +1245,44 @@ makes a wrapper around a port which implements GET-POSITION." (values hash size))))) -;; TODO: Run a "deduplication pass", whatever that involves. Also, handle -;; databases not existing yet (what should the default behavior be? Figuring -;; out how the C++ stuff currently does it sounds like a lot of grepping for -;; global variables...). Also, return #t on success like the documentation -;; says we should. +(define (get-temp-link target) + "Like mkstemp!, but instead of creating a new file and giving you the name, +it creates a new hardlink to TARGET and gives you the name." + (let try-again ((tempname (tmpnam))) + (catch + #t + (lambda () + (link target tempname) + tempname) + (lambda () + (try-again (tmpnam)))))) + +(define (replace-with-link target to-replace) + "Replaces the file TO-REPLACE with a hardlink to TARGET" + ;; According to the C++ code, this is how you replace it with a link + ;; "atomically". + (let ((temp-link (get-temp-link target))) + (delete-file to-replace) + (rename-file temp-link to-replace))) + +;; TODO: handling in case the .links directory doesn't exist? For now I'll +;; just assume it's the responsibility of whoever makes the store to create +;; it. +(define (deduplicate path store hash) + "Checks if a store item with hash HASH already exists. If so, replaces PATH +with a hardlink to the already-existing one. If not, it registers PATH so that +future duplicates can hardlink to it." + (let ((links-path (string-append store + "/.links/" + (bytevector->base16-string hash)))) + (if (file-exists? links-path) + (replace-with-link links-path path) + (link path links-path)))) + +;; TODO: Handle databases not existing yet (what should the default behavior +;; be? Figuring out how the C++ stuff currently does it sounds like a lot of +;; grepping for global variables...). Also, return #t on success like the +;; documentation says we should. (define* (register-path path #:key (references '()) deriver prefix state-directory) @@ -1309,7 +1341,8 @@ be used internally by the daemon's build hook." (with-output-to-port (%make-void-port "w") (lambda () - (reset-timestamps real-path)))))) + (reset-timestamps real-path))) + (deduplicate real-path store-dir hash)))) ;;; -- 2.13.0
> Thank you, and thumbs up for the quality work so far! > > Ludo’. Thanks!
