[sqlite] Re: [Monotone-devel] [sqlite] disk locality (and delta storage)

2006-02-11 Thread drh
Daniel Carosone <[EMAIL PROTECTED]> wrote:
> 
> > Just type (for example):
> > 
> >    monotone httpserver &
> > 
> > and then point your webbrowser at 127.0.0.1.
> 
> My personal favourite is jetty in Java for this kind of thing; I'm not
> sure monotone itself should grow an HTTP server :) 

The built-in webserver need not have all the features or
performance of apache.  Something very simple will suffice.
CVSTrac has the ability to act as its own web server, or 
to let inetd be the "server" that calls CVSTrac on each 
request.  Both features combined are implemented in less 
than 150 lines of C code.

To see the code visit
http://www.cvstrac.org/cvstrac/getfile/cvstrac/cgi.c
Search for cgi_http_server and cgi_handle_http_request.
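
The dual-mode idea (standalone server, or handler launched by inetd) can
be sketched briefly.  What follows is a minimal Python illustration, not
the CVSTrac code: the names handle_http_request, serve_forever and
serve_inetd, the port number, and the response body are all assumptions
made for this example.  The point is only that both modes can share one
request handler that reads from one stream and writes to another.

```python
import socket
import sys


def handle_http_request(rfile, wfile):
    """Read one HTTP request from rfile and write one response to wfile."""
    request_line = rfile.readline().decode("latin-1").strip()
    parts = request_line.split()
    # Skip the header lines; a blank line (or end of input) ends them.
    while True:
        line = rfile.readline()
        if line in (b"", b"\r\n", b"\n"):
            break
    if len(parts) != 3 or parts[0] != "GET":
        status, body = "501 Not Implemented", b"only GET is supported\n"
    else:
        # A real tool would dispatch on the path; here we just echo it.
        status, body = "200 OK", ("you asked for %s\n" % parts[1]).encode("utf-8")
    head = ("HTTP/1.0 %s\r\n"
            "Content-Type: text/plain\r\n"
            "Content-Length: %d\r\n\r\n" % (status, len(body)))
    wfile.write(head.encode("latin-1") + body)
    wfile.flush()


def serve_forever(port=8080):
    """Standalone mode: accept connections, answer one request on each."""
    with socket.create_server(("127.0.0.1", port)) as listener:
        while True:
            conn, _addr = listener.accept()
            with conn, conn.makefile("rb") as rfile, conn.makefile("wb") as wfile:
                handle_http_request(rfile, wfile)


def serve_inetd():
    """inetd mode: the connection is already attached to stdin/stdout."""
    handle_http_request(sys.stdin.buffer, sys.stdout.buffer)
```

In this sketch the standalone mode handles one connection at a time,
which is likely enough for a developer browsing a local database.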

--
D. Richard Hipp   <[EMAIL PROTECTED]>



[sqlite] Re: [Monotone-devel] [sqlite] disk locality (and delta storage)

2006-02-11 Thread Nathaniel Smith
On Fri, Feb 10, 2006 at 07:06:27PM -0500, [EMAIL PROTECTED] wrote:
> Daniel Carosone <[EMAIL PROTECTED]> wrote:
> > Wiki pages don't seem so hard, they're pretty much text documents
> > stored in a VCS anyway. 
> 
> There are some complications.  Each wiki page acts as if it were
> its own independent project or branch.  And then you probably want
> some way to see all current leaves and to do merges from the web
> interface.  
>
> If you intend your system to be publicly accessible, then
> you will also need some mechanism to delete (or at least 
> permanently hide from public view) the spam that miscreants
> occasionally post on wiki pages and sometimes also on tickets.
> Some kind of "death cert" perhaps.

These all seem reasonably straightforward; basically just UI issues.

The trickiest bit is making each page independent... but you
can just do that by having each page be a single file named "page.txt"
in its own branch.  Alternatively, sometimes it is very nice to allow
renaming in a wiki.  (I have definitely wanted this feature, when I
realized that the spec I have been working out on a wiki page actually
should be called something different than what it currently is.)  You
can allow this pretty straightforwardly even in the distributed wiki
case if you just use monotone's rename support...

> > Bug tracking state would require a little
> > more smarts, with suitable text formats stored in the VCS and
> > code/hooks to assist in making sane merges between multiple heads of a
> > given ticket, but even that doesn't seem terribly hard.
> 
> My thinking about tickets is that each change to a ticket
> is just a small, signed, immutable record (like a cert)
> that contains a variable number of name/value pairs.  The
> names are things like "title", "description", "status",
> "assigned-to", etc.  To reconstruct the current state of
> a ticket, you sort all the pieces by their creation time
> (their actual creation time, not the time they happened to
> arrive at your database) and read through them one by one.
> Values overwrite prior values with the same name.  So
> in the end, you have one value for each field name - and
> that is the current state of your ticket.

You seem to be reinventing history tracking with certs.

It seems a lot simpler to just make tickets, like, a text file, and
then use the existing history tracking tools.  They're quite tuned and
involve some non-obvious bits.  In particular, compared to this
scheme, they're insensitive to clock skew, they provide real merging
(in your scheme, if I change a field locally, and you change the same
field locally, then whichever of us has the later clock setting at that
time will silently clobber the other's change), and users can
investigate the history of a ticket using normal history browsing
tools...
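
To make the "real merging" point concrete, here is a minimal sketch of
a field-wise three-way merge over ticket fields, written in Python.  It
is an illustration only and not monotone's merge algorithm; the function
name merge_ticket and the dict-of-fields representation are assumptions
made for this example.  The idea is that edits to the same field on both
sides are reported as a conflict instead of being settled by whose clock
reads later.

```python
def merge_ticket(base, mine, theirs):
    """Three-way merge of ticket fields.

    base is the common ancestor; mine and theirs are the two edited
    copies.  Returns (merged, conflicts), where conflicts maps a field
    name to the pair of competing values.  A missing key means the
    field is absent on that side.
    """
    merged, conflicts = {}, {}
    for name in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(name), mine.get(name), theirs.get(name)
        if m == t:
            value = m            # both sides agree (or neither changed it)
        elif m == b:
            value = t            # only their side changed the field
        elif t == b:
            value = m            # only my side changed the field
        else:
            conflicts[name] = (m, t)
            value = b            # keep the ancestor's value until resolved
        if value is not None:
            merged[name] = value
    return merged, conflicts
```

Edits to different fields merge cleanly here regardless of timestamps;
only genuinely competing edits need a human (or a hook) to decide.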

> It is also very useful to have certs that record a link 
> between a revision and a ticket.  In this way you can record 
> that a bug was introduced by, or fixed by, a particular 
> revision.  The linkage between tickets and revisions has 
> proven to be particularly valuable in CVSTrac.

Indeed!  Enabling this kind of workflow integration is _exactly_ why
we have this generic cert thing built in.  No-one's done much with it
yet, though, so really we don't even know how far the possibilities go
-- so I get all excited when someone does :-).

> Once you have cross linking between revisions and tickets,
> if you add a google-like search you then have a very powerful
> system for doing source tree archeological work.  This comes
> up a lot in maintaining SQLite.  Somebody on the web reports
> a problem.  I have some vague memory of fixing a similar
> problem 9 months ago, but do not recall where it was or how
> I fixed it.  By using the search feature of CVSTrac together
> with the links between tickets and check-in comments, I can
> quickly find the historical problem and pinpoint exactly when
> and how the problem was fixed.  Then post a URL to the specific
> ticket that described the problem and the specific revision that
> fixed it.

Yep!  Monotone has much more structured information than CVS, and it's
much easier to get at.  You can track not just the list of commits, but
also things like branches: figure out whether they've been merged yet,
see which branches work is occurring on...

Or how about another cool use of certs:
  http://mt.xaraya.com/com.xaraya.core/index.psp


I think the move from CVS to systems with simpler, saner data models
opens up huge opportunities for better visualization, data mining,
data navigation, community awareness, workflow management... sadly, I
never get to work on this, because I end up spending my time trying to
make the basic pieces just work :-).  But I can't wait to see what
other people come up with, and can cheer loudly while they do...

-- Nathaniel

-- 
So let us espouse a less contested notion of truth and falsehood, even
if it is philosophically debatable (if we 

[sqlite] Re: [Monotone-devel] [sqlite] disk locality (and delta storage)

2006-02-10 Thread drh
Daniel Carosone <[EMAIL PROTECTED]> wrote:
> 
> It seems a little odd to me to build a centralised, online
> information system for tracking state and documenting activity around
> and about source code in a distributed and disconnected VCS.  
> 

Ah yes, you're right.  But in the system I envision, the wiki
and bug-tracking are also decentralized, disconnected, and
distributed.

> 
> What I'd really like to see is something that, instead of just
> plugging into monotone to show source code state and patches, actually
> used monotone for all storage and 'information transport', and allowed
> developers to update the wiki pages and bug tracking information in
> the same way they can update the code: offline, with syncs and merges
> later as needed.  

This is more in line with my thinking.

> 
> Wiki pages don't seem so hard, they're pretty much text documents
> stored in a VCS anyway. 

There are some complications.  Each wiki page acts as if it were
its own independent project or branch.  And then you probably want
some way to see all current leaves and to do merges from the web
interface.  

If you intend your system to be publicly accessible, then
you will also need some mechanism to delete (or at least 
permanently hide from public view) the spam that miscreants
occasionally post on wiki pages and sometimes also on tickets.
Some kind of "death cert" perhaps.

> Bug tracking state would require a little
> more smarts, with suitable text formats stored in the VCS and
> code/hooks to assist in making sane merges between multiple heads of a
> given ticket, but even that doesn't seem terribly hard.

My thinking about tickets is that each change to a ticket
is just a small, signed, immutable record (like a cert)
that contains a variable number of name/value pairs.  The
names are things like "title", "description", "status",
"assigned-to", etc.  To reconstruct the current state of
a ticket, you sort all the pieces by their creation time
(their actual creation time, not the time they happened to
arrive at your database) and read through them one by one.
Values overwrite prior values with the same name.  So
in the end, you have one value for each field name - and
that is the current state of your ticket.

This approach gives you automatic merging and a complete
change history/audit trail.  Some people are initially shocked
that any user can update any field of the ticket, but that
kind of openness is in keeping with the wiki tradition and
has actually been used very successfully in CVSTrac.  If
you wanted to restrict changes on selected fields, you could
just ignore the name/value pairs for that field on certs
from unauthorized users.
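
The reconstruction rule described above can be sketched in a few lines
of Python.  This is only an illustration under stated assumptions: the
TicketChange record, its field names, and the restricted/authorized
parameters are hypothetical, and signature checking is left out
entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TicketChange:
    """One immutable change record (hypothetical layout)."""
    ticket_id: str
    created: float    # creation time as claimed by the author
    author: str
    fields: dict      # name/value pairs set by this change


def ticket_state(changes, ticket_id, restricted=(), authorized=()):
    """Replay the changes for one ticket in creation-time order.

    Later values overwrite earlier values with the same name.  Pairs
    that touch a restricted field are ignored unless the author is in
    the authorized set.
    """
    state = {}
    relevant = [c for c in changes if c.ticket_id == ticket_id]
    # sorted() is stable, so records with equal times keep arrival order.
    for change in sorted(relevant, key=lambda c: c.created):
        for name, value in change.fields.items():
            if name in restricted and change.author not in authorized:
                continue
            state[name] = value
    return state
```

Because the result depends only on the set of records and their
creation times, two databases that hold the same records compute the
same ticket state no matter what order the records arrived in.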

Tickets also benefit from having "remarks" that people can
append to the ticket (without overwriting) and attachments.
Both are handled by separate certs. 

It is also very useful to have certs that record a link 
between a revision and a ticket.  In this way you can record 
that a bug was introduced by, or fixed by, a particular 
revision.  The linkage between tickets and revisions has 
proven to be particularly valuable in CVSTrac.

Once you have cross linking between revisions and tickets,
if you add a google-like search you then have a very powerful
system for doing source tree archeological work.  This comes
up a lot in maintaining SQLite.  Somebody on the web reports
a problem.  I have some vague memory of fixing a similar
problem 9 months ago, but do not recall where it was or how
I fixed it.  By using the search feature of CVSTrac together
with the links between tickets and check-in comments, I can
quickly find the historical problem and pinpoint exactly when
and how the problem was fixed.  Then post a URL to the specific
ticket that described the problem and the specific revision that
fixed it.

> 
> It could still be a web ui if people find that comfortable, just one
> that developers would often run pointing the browser at a local
> server, with a commit at the end of a session, and a later sync and
> merge. Of course, you could/would always have an instance of this
> running on a public webserver in a well-known place, just like
> monotone projects typically do for their source databases: for
> convenience, rather than necessity.

My thinking exactly.  I want the ability to drop a standalone
binary into a cgi-bin on a $7/mo hosting site and have an
instant sourceforge for some small project.  It is also nice
to be able to run things out of a database file in your home
directory.  Or do both.

Some things, like ticket reports, tend to want to use a web
interface.  So how do you do that when you're riding on an
airplane or otherwise cut off from your favorite server?
Just type (for example):

   monotone httpserver &

and then point your webbrowser at 127.0.0.1.

> 
> Some of these things might eventually go into monotone itself, perhaps
> building more tools for project policy and practice assistance around
> base mechanisms such as certs and the DAG structure.  More