A couple of people asked about this, so here it is under a sensible title so
people can find it in the archives.

Below is a Perl script which does a reasonable job of converting Survex .svx
files to Therion .th files. It was written by Olly.

I have noticed a few problems, which I might as well document here, as it is
as good a place as any. (I was going to tidy them up and send them to Ol, as
I was given the script for 'testing'.)

*calibrate declination comes out wrong. It should be:
*calibrate declination n ->  declination n degrees
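
Something like the following might do it. This is only a sketch, not a tested
patch to svx2th: the helper name fix_declination is made up for illustration,
and it assumes the declination is given as a plain number.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of svx2th): rewrite a Survex declination
# calibration into the form described above.
sub fix_declination {
    my ($line) = @_;
    # "*calibrate declination 2.5" -> "declination 2.5 degrees"
    $line =~ s/^(\s*)\*\s*calibrate\s+declination\s+([-+]?[\d.]+)/$1declination $2 degrees/i;
    return $line;
}

print fix_declination("*calibrate declination 2.5\n");
```

Other *calibrate lines (compass, clino, tape) are left alone and fall through
to the normal handling.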

Therion doesn't understand 'ignoreall', so it needs commenting out.

It doesn't try to deal with the different 'team' syntax, but it could be fixed
to get most of them right (lower-case them all, 'pics' -> 'pictures',
'disto' -> 'length #disto', etc.).
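
A rough sketch of the role clean-up, covering only the two renames mentioned
above. The names %role_map and normalise_role are invented for this example,
and picking the role word out of a *team line is left out, since the Survex
format for that is unspecified.

```perl
use strict;
use warnings;

# Hypothetical role map: only the two renames mentioned above.
my %role_map = (
    pics  => 'pictures',
    disto => 'length #disto',
);

# Lower-case a role word and apply the renames; unknown roles pass through.
sub normalise_role {
    my ($role) = @_;
    $role = lc $role;
    return exists $role_map{$role} ? $role_map{$role} : $role;
}

print normalise_role($_), "\n" for qw(Compass Pics disto);
```

Any further renames would just be extra entries in the map.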

The only significant problem was with 'overview' files, which include a number
of others:
* All the equates need enclosing in 'centreline'/'endcentreline'.
* The converter reads in any 'included' files and inserts them. It stopped
after doing two of these and truncated the file. It should probably just
convert '*include foo' to 'input foo.th'.
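
The '*include foo' -> 'input foo.th' idea could look roughly like this. It is
a sketch only: include_to_input is a made-up name, it reuses the matching
regex from the script's *include branch, and it throws away any trailing
comment on the line.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a Survex *include into a Therion input line
# instead of inlining the included file.
sub include_to_input {
    my ($line) = @_;
    if ($line =~ /^(\s*)\*\s*include\s*"?([^"\s]*)/i) {
        my ($indent, $name) = ($1, $2);
        $name =~ s!\\!/!g;       # Unix path separators, as svx2th does
        $name =~ s/\.svx$//i;    # drop any .svx extension
        return "${indent}input $name.th\n";
    }
    return $line;                # not an *include: leave untouched
}

print include_to_input("*include caves\\foo\n");
```

Each included file would then need converting separately, which the directory
loop further down already does.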

It once inserted a top-level 'dummy' survey where one wasn't needed. This is
a feature of converting files individually rather than as a coherent dataset,
I suspect. It could probably do a better job if it 'spidered' the dataset
from a top-level overview file, but that's a lot more work  :-)

I found one case where the ';' remained as the comment char instead of
getting converted to '#'. I need to look at that or send Ol the offending
file.

Then you can do this to convert a whole directory:
for FILE in *.svx; do
    FILE="${FILE%.svx}"
    echo "Converting $FILE.svx"
    svx2th "$FILE.svx" > "$FILE.th"
done

#!/usr/bin/perl -w
use strict;
# svx2th v0.1
# Copyright (C) Olly Betts 2004

sub convert_file($);

my $in_survey = 0;
my $had_fix = 0;
print "encoding iso8859-1\n";
for my $filename (@ARGV) {
    convert_file($filename);
}
if ($in_survey) {
    print "endsurvey dummy\n";
}

sub convert_file($) {
    my $filename = shift;
    open F, "<", $filename or die "$filename: $!\n";
    my @lines = <F>;
    close F;
    my $in_centre_line = 0;
    my $lineno = 0;
    my $dummy_survey = -1;
    foreach $_ (@lines) {
        ++$lineno;
        # Replace ; with # as comment separator.
        # FIXME won't cope with ; in a filename or *title
        s/;/#/;

        # Comment out "*export" and "*entrance" as there seems to be no
        # equivalent of either.
        if (/^\s*\*\s*(?:export|entrance)\b/i) {
            print "#$_";
            next;
        }

        if ($in_centre_line) {
            if (/^\s*\*/ &&
                !/^\s*\*\s*(?:date|calibrate|fix|equate|data|instrument|units|sd|infer|flags|team)\b/i) {
                print "endcentreline\n";
                $in_centre_line = 0;
            }
        } else {
            # Match case-insensitively, like the other command checks.
            if (/^\s*[^\s*#]/ ||
                /^\s*\*\s*(?:date|calibrate|fix|equate|data|instrument|units|sd|infer|flags|team)\b/i) {
                if (!$in_survey) {
                    # Therion can't handle these outside a centreline which
                    # must be inside a survey, so we have to add a dummy
                    # top-level survey.
                    print "survey dummy -title \"Therion is crap\"\n";
                    ++$in_survey;
                }
                print "centreline\n";
                $in_centre_line = 1;
            }
        }

        # *begin <survey> -> survey <survey>
        if (s/^(\s*)\*(\s*)begin\b(\s*)(\S+)/$1$2survey$3$4 -title "$4" /i) {
            ++$in_survey;
            print $_;
            next;
        }
        # *begin -> <nothing>
        # The *begin will cause an endcentreline / centreline pair to be
        # output which hopefully prevents settings from escaping.  However
        # this doesn't restore the old settings, just undoes any new ones.
        # FIXME: Need to address this somehow...
        if (/^(\s*)\*(\s*)begin\b[ \t]*$/i) {
            ++$in_survey;
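            # $dummy_survey is -1 unless we are already inside an unnamed
            # *begin (comparing against 0 would die on the first one).
            if ($dummy_survey != -1) {
                # FIXME: doesn't cope with nested *begin with no arguments...
                die "This convertor doesn't currently handle nested *begin with no arguments\n";
            }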
            $dummy_survey = $in_survey;
            print "#$_";
            next;
        }
        # *end [<survey>] -> endsurvey [<survey>]
        if (s/^(\s*)\*(\s*)end\b/$1$2endsurvey/i) {
            if ($dummy_survey == $in_survey) {
                $_ = "#$_";
                $dummy_survey = -1;
            }
            --$in_survey;
            if ($in_centre_line) {
                print "endcentreline\n";
                $in_centre_line = 0;
            }
            print $_;
            next;
        }
        # *title <title> -> # -title <title>
        # FIXME: just comment out for now - should really convert to -title on
        # the "survey" line.
        if (s/^(\s*)\*\s*title\b\s*/$1# -title /i) {
            print $_;
            next;
        }
        # *team and *instrument format is unspecified in Survex, and they're
        # just informational so comment them out for now...
        # NB therion seems to be case sensitive so "Compass" isn't a valid role
        # ("compass" is)...
        # NB in *team pics -> pictures
        if (s/^(\s*)\*(\s*(?:team|instrument))\b/#$1$2/i) {
            print $_;
            next;
        }
        # *include -> literal text inclusion.
        # Note that the *include means an implicit *begin, but the output
        # may not reflect this correctly (since we can't handle *begin
        # with no survey name anyway...)
        if (/^\s*\*\s*include\s*"?([^"\s]*)/i) {
            # Use Unix path separators (/ not \) - Survex understands either on
            # either platform.
            my $filename = $1;
            $filename =~ s!\\!/!g;
            $filename .= '.svx' unless $filename =~ /\.svx$/i;
            convert_file($filename);
            next;
        }
        # survey.subsurvey.12 -> 12@subsurvey.survey
        if (s/^(\s*)\*(\s*equate)\b/$1$2/i) {
            # Ensure that a comment separator doesn't get eaten by station name.
            s/(\S)#/$1 #/;
            s/(\S+)\.(\S+)/"$2\@".join(".",reverse split m!\.!, $1)/ge;
            print $_;
            next;
        }
        if (/^\s*\*\s*fix\b/i) {
            $had_fix = 1;
        }
        s/^(\s*)\*/$1/;
        print;
    }
}
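
To see what the *equate branch does to station names, here is the same
substitution pulled out on its own and run on a made-up line. The wrapper name
reverse_stations is just for this example.

```perl
use strict;
use warnings;

# Same substitution as in the *equate branch of svx2th: the last dotted
# component becomes the station, the rest is reversed and put after '@'.
sub reverse_stations {
    my ($line) = @_;
    $line =~ s/(\S+)\.(\S+)/"$2\@".join(".",reverse split m!\.!, $1)/ge;
    return $line;
}

# Prints: equate 12@subsurvey.survey 3@other
print reverse_stations("equate survey.subsurvey.12 other.3\n");
```

Station names with no dot in them are left as they are.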

Wookey
--
Aleph One Ltd, Bottisham, CAMBRIDGE, CB5 9BA, UK Tel +44 (0) 1223 811679
work: http://www.aleph1.co.uk/ play: http://www.chaos.org.uk/~wookey/


