Socialtext provides a relatively sane REST API to their Wiki service, and so I implemented a CVS-style interface that lets you edit Wiki pages at your leisure and merge them back up to the Wiki server when you feel like it.
Aside from the usual Web stuff (LWP, URI) this depends on the JSON module. I've tested it with version 1.00 of the JSON module from Debian. I'll put this up under a free-software license on http://pobox.com/~kragen/sw/stclient shortly, but I have to sleep first. For the time being, just notice that it has a notice that it is not in the public domain. #!/usr/bin/perl -w # -*- mode: cperl; encoding: utf-8 -*- use strict; use LWP::UserAgent; use HTTP::Request; use URI; # XXX there are a couple of inline modules here that probably should # be split out into separate files for maintainability, but having # them inline here makes this program easier to distribute. =head1 NAME st - CVS-style interaction with a Socialtext WikiWikiWeb =head1 SYNOPSIS st checkout http://www.socialtext.net/your-wiki-name cd your-wiki-name vi homepage st diff -u st update st diff -u st commit =head1 DESCRIPTION C<st> is an example application of Socialtext's REST API. It checks out a copy of your Socialtext Wiki into your local filesystem and lets you edit the text of your Wiki pages with your text editor of choice, and then upload them again later, possibly merging in changes. C<st checkout> or C<st co> prompts you for your email address and Socialtext password, creates the work directory, and downloads all of the pages in your Wiki. This takes a while, and won't be practical on a sufficiently large Wiki, but you only need to do it once. C<st diff> or C<st di> shows the edits you currently have waiting to be committed. It produces unified diff output by default; C<st diff -c> will give you context diffs instead. This command works even when you have no network access, such as when you're on a commercial flight. C<st update> or C<st up> updates the pages in your filesystem to the current version on the Wiki server, trying to automatically merge in any changes you have made locally. C<st commit> or C<st ci> uploads any edits you have made to the Wiki server. =head1 FILESYSTEM LAYOUT C<st> creates a directory named after your Socialtext workspace as a child of the current directory. Inside that directory, it creates a .st directory containing the following files: =over 4 =item C<user> Your email address. =item C<pass> Your Socialtext password. =item C<url> The URL of the Wiki. =item C<pristine> A directory containing the last versions of all of the pages downloaded from the server, one per file. =item C<etags> A directory containing the entity tags ("ETags") of all of the pages downloaded from the server, one per file. =back =head1 BUGS =over 4 =item * Asks you for your username and password even for unparsable URLs. =item * Has a lost-update race condition when saving. L<http://www.socialtext.net/st-rest-docs/index.cgi?http_status_codes> doesn't list C<412 Precondition Failed> (see L<http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.13>) so there probably isn't any way to avoid that. Should use If-Match (L<http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.24>) or If-Unmodified-Since (L<http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.28>) in order to try to avoid that. =item * Will fail if someone puts a newline in the user or password file using C<vi>. =item * Won't be able to merge if C<diff3> isn't installed. C<diff3> is an "essential" package on Debian and has been part of Unix since at least the 1980s, so this probably won't be a problem on any Unix system. =item * Won't be able to show diffs if C<diff> isn't installed. =item * No support for tags. =item * No support for attachments. =item * No support for comments. =item * No support for users, breadcrumbs, backlinks. =item * Very inefficient at figuring out whether files have been locally modified. =item * Emits useless message apparently when there's a conflict: C<diff3 exited with 256. at ../st line 442.> =item * Doesn't keep you from committing with conflicts. =back =head1 COPYRIGHT Copyright 2008 Kragen Javier Sitaker. =cut my $sample_url = 'http://www.socialtext.net/st-rest-docs'; { package Dir; # "Dir" package to simplify reading and writing a bit. Does this # maybe exist in the standard library yet? sub new { my ($class, $dir) = @_; return bless { dir => $dir }, $class; } sub create { my ($self) = @_; mkdir $self->{dir} or die "Can't create directory $self->{dir}: $!"; } # XXX surely this is in the standard Perl library, right? This # version won't work on old MacOS or VMS --- does anyone still care? sub _join { my ($dir, $filename) = @_; return "$dir/$filename"; } sub child { my ($self, $filename) = @_; return _join($self->{dir}, $filename); } sub _read_file { my ($fn) = @_; open my $fh, '<', $fn or die "Couldn't open $fn: $!"; return do { local $/; <$fh> } } sub _write_file { my ($fn, $content) = @_; my $fn_new = "$fn.new.$$"; eval { open my $fh, '>', $fn_new or die "Couldn't open $fn_new: $!"; print $fh $content or die "Couldn't write to $fn_new: $!"; close $fh or die "Couldn't write to $fn_new: $!"; rename $fn_new, $fn or die "Couldn't rename $fn_new to $fn: $!"; }; unlink $fn_new; die $@ if $@; } sub get { my ($self, $filename) = @_; return _read_file($self->child($filename)); } sub set { my ($self, $filename, $content) = @_; return _write_file($self->child($filename), $content); } sub has { my ($self, $filename) = @_; return -e $self->child($filename); } sub subdir { my ($self, $subdirname) = @_; return Dir->new($self->child($subdirname)); } sub filenames { my ($self) = @_; opendir my $dh, $self->{dir}; return grep { $_ ne '.' and $_ ne '..' } readdir $dh; } sub pathnames { my ($self) = @_; return map { $self->child($_) } $self->filenames; } } { # An HTTP response for an x.socialtext-wiki page in the workspace. package Page; sub new { my ($class, $response) = @_; return bless { resp => $response }, $class; } sub content { my ($self) = @_; return $self->{resp}->content; } sub etag { my ($self) = @_; my $etag = $self->{resp}->header('ETag'); die "No ETag for page: " . $self->content if not defined $etag; return $etag; } } { # Encapsulates both the filesystem directory and the HTTP user-agent # to talk to the server. package Client; use JSON qw(); sub new { my ($class, $url, $user, $pass) = @_; my $uo = URI->new($url); my $local = $uo->path; unless ($local =~ [EMAIL PROTECTED]/([^/]+)\Z@) { die "Couldn't parse URL $local --- should be something like $sample_url\n"; } my $workspace_name = $1; my $ua = LWP::UserAgent->new; $ua->credentials($uo->host_port, "Socialtext", $user, $pass); return bless { url_obj => $uo, workspace => $workspace_name, user => $user, pass => $pass, ua => $ua, }, $class; } sub new_from_dir { my ($class, $dirname) = @_; my $dir = Dir->new($dirname); my $conf = $dir->subdir('.st'); # XXX duplication my $self = $class->new($conf->get('url'), $conf->get('user'), $conf->get('pass')); $self->{dir} = $dir; return $self; } sub conf { my ($self) = @_; return $self->{dir}->subdir('.st'); } sub pristine { my ($self) = @_; return $self->conf->subdir('pristine'); } sub etags { my ($self) = @_; return $self->conf->subdir('etags'); } sub data_url { my ($self, $trailing_path) = @_; my $uo = $self->{url_obj}->clone; $uo->path("/data/workspaces/$self->{workspace}/$trailing_path"); return $uo; } sub http_get { my ($self, $url, $content_type) = @_; my $req = HTTP::Request->new(GET => $url->as_string); $req->header(Accept => $content_type); my $resp = $self->{ua}->request($req); die "HTTP request failure: " . $resp->as_string if $resp->is_error; return $resp; } sub get_json_data { my ($self, $trailing_path) = @_; my $resp = $self->http_get($self->data_url($trailing_path), 'application/json'); return JSON::jsonToObj($resp->content); } sub get_page { my ($self, $page_id) = @_; return Page->new($self->http_get($self->data_url("pages/$page_id"), 'text/x.socialtext-wiki')); } # Sample page datum from JSON index: # { # 'page_uri' => 'https://www.socialtext.net/kregan-test-space/index.cgi?meeting_agendas', # 'page_id' => 'meeting_agendas', # 'name' => 'Meeting agendas', # 'modified_time' => '1205267435', # 'tags' => [ # 'Welcome', # 'Recent Changes' # ], # 'uri' => 'meeting_agendas', # 'revision_id' => '20080311203035', # 'workspace_name' => 'kregan-test-space', # # (yeah, somebody misspelled my name) # 'last_edit_time' => '2008-03-11 20:30:35 GMT', # 'last_editor' => '[EMAIL PROTECTED]', # 'revision_count' => '1' # }, sub list_pages { my ($self) = @_; return @{$self->get_json_data('pages')}; } sub outdated_files { my ($self) = @_; my @filenames = $self->etags->filenames; my %revision_ids = (); foreach my $page ($self->list_pages) { $revision_ids{$page->{page_id}} = $page->{revision_id}; } return grep { not $self->etags->has($_) or $self->etags->get($_) ne $revision_ids{$_} } keys %revision_ids; } # Called when a page has been updated from the server. sub updated { my ($self, $id, $resp) = @_; $self->pristine->set($id => $resp->content); $self->etags->set($id => $resp->etag); } # Download and save a page from the server; assumes no local changes sub update_page { my ($self, $id) = @_; print "U $id\n"; # XXX shouldn't this be in "command-line processing"? my $resp = $self->get_page($id); $self->{dir}->set($id => $resp->content); $self->updated($id => $resp); return $resp; } # Sets up a new workspace directory. sub create_dir { my ($self) = @_; $self->{dir} = Dir->new($self->{workspace}); if (-e $self->{workspace}) { die "The subdirectory '$self->{workspace}' already exists here.\n"; } $self->{dir}->create(); $self->conf->create(); $self->conf->set(user => $self->{user}); $self->conf->set(pass => $self->{pass}); $self->conf->set(url => $self->{url_obj}->as_string); $self->pristine->create(); $self->etags->create(); for my $page ($self->list_pages) { my $resp = $self->update_page($page->{page_id}); # We insist on the following because it allows us to find out # which pages have been updated with a single collection GET # request, rather than a HEAD request for each page: my $etag = $resp->etag; die "ETag for page $page->{page_id} is $etag " . "but should be $page->{revision_id}" if $etag ne $page->{revision_id}; } } # Merges updates from a single file. sub merge_updates { my ($self, $filename) = @_; my $tmpname = ".tmp.$$"; my $newfile = ".tmp2.$$"; eval { my $page = $self->get_page($filename); $self->{dir}->set($tmpname, $page->content); # XXX assumes {dir} means '.' open(my $pipe, '-|', 'diff3', '-m', $filename, $self->pristine->child($filename), $tmpname) or die "Can't open pipe: $!"; open my $newfh, '>', $newfile or die "Can't open $newfile: $!"; while (<$pipe>) { print $newfh $_; } if (not close $pipe) { die "popen of diff3 failed: $!" if $! != 0; warn "diff3 exited with $?." } close $newfh or die "Couldn't write to $newfile: $!"; rename $newfile, $filename; $self->updated($filename => $page); }; unlink $tmpname; unlink $newfile; die $@ if $@; } # Return a list of locally modified files. sub modified_files { my ($self) = @_; return grep { $self->{dir}->get($_) ne $self->pristine->get($_) } $self->pristine->filenames; } # Upload a (hopefully-up-to-date) file to the server. sub commit_file { my ($self, $filename) = @_; my $data = $self->{dir}->get($filename); # XXX hope this is a byte string my $req = HTTP::Request->new(PUT => $self->data_url("pages/$filename")); # XXX the documentation doesn't say the server supports If-Match, # but if it gets correct support for it, then we'll get # concurrency-safe updates without any further changes to this # code. $req->header("If-Match" => $self->etags->get($filename)); $req->header("Content-Type" => "text/x.socialtext-wiki"); $req->header("Content-Length" => length $data); $req->content($data); my $response = $self->{ua}->request($req); if ($response->is_error) { die "Error uploading new version of $filename: " . $response->as_string; } $self->update_page($filename); } } # Ask the user for a line of input. sub ask { my ($string) = @_; print $string; my $result = <STDIN>; chomp $result; return $result; } ### User command-line processing. sub checkout { my ($url) = @_; die_with_usage() if not $url; my $user = ask('Email address you use for your Socialtext account: '); my $pass = ask('Socialtext password (will be echoed): '); my $client = Client->new($url, $user, $pass); $client->create_dir(); } sub diff { my (@flags) = @_; @flags = qw(-u) if not @flags; # specify --normal if you're a masochist foreach my $filename (Client->new_from_dir('.')->pristine->pathnames) { system 'diff', @flags, $filename, '.'; } } sub update { my $client = Client->new_from_dir('.'); foreach my $filename ($client->outdated_files) { $client->merge_updates($filename); print "M $filename\n"; } } sub commit { my $client = Client->new_from_dir('.'); # XXX need to check for unresolved conflicts my @files = $client->outdated_files; if (@files) { for my $filename (@files) { print "M $filename\n"; } die "Bring the above files up-to-date with $0 update before committing.\n"; } foreach my $filename ($client->modified_files) { $client->commit_file($filename); print "! $filename\n"; } } sub die_with_usage { die <<EOF; $0: usage: one of the following: $0 checkout $sample_url $0 diff $0 update $0 commit EOF } my %cmds = ( co => \&checkout, checkout => \&checkout, di => \&diff, diff => \&diff, up => \&update, update => \&update, ci => \&commit, commit => \&commit, ); my $cmd = shift @ARGV; die_with_usage if not $cmd or not $cmds{$cmd}; $cmds{$cmd}->(@ARGV);