Jonathan Kaye wrote:
Harold Fuchs wrote:
Not using a Calc macro. If it were me I'd export the sheet as a CSV file,
write a Perl script to generate a new [correctly formatted] CSV file and
import that into a new sheet. I doubt a suitable Perl script would be more
than about 10 lines of *un*obfuscated code.
Hi Harold,
I tried it out using Unicon on the CSV file. I used ISO 8859-15 encoding
which took care of the kinkier characters. It's a bit more than 10 lines but
when you take out the i/o stuff and the pretty formatting for ease of
reading it comes to about that. I had to use "=" as a field delimiter since
commas are crucial to splitting the records. The unary "\" operator is a
test for non-nullness. Here's the code:
---------------------------------------------------------------
procedure main()
    datadir := "/home/jdkaye/MYPROGS/Data/"
    outdir := "/home/jdkaye/MYPROGS/Output/"
    intext := open(datadir || "8_sept_sample3.csv") |
        stop("can't open data file")
    outtext := open(outdir || "8_sept_sample3_fixed.csv", "w") |
        stop("can't open output file")
    while entry := read(intext) do {
        entry ? if ((gloss := tab(upto('='))) & rem := tab(0)) then {
            if gloss == "" then
                next
            while \find(",", gloss) do {
                gloss ? if ((gl := tab(upto(','))) & move(1) &
                            nrem := tab(0)) then {
                    write(outtext, gl, rem)
                    gloss := nrem
                }
            }
        }
        write(outtext, gloss, rem)
    }
end
---------------------------------------------------------------------------
Not too bad, eh? Thanks for the tip.
Jonathan
Hmmm.
Exactly 10 lines of Perl:
#!/usr/bin/perl
while (<>) {
    ($field1,@fields)=split(/;/,$_);
    $field1 =~ s/"//g;
    @subfields=split(/,/,$field1);
    $list=join(";",@fields);
    foreach $subfield (@subfields) {
        print "\"$subfield\";$list";
    }
}
Assuming the program is named "splitter.pl", use it as:
    splitter.pl <input_file >output_file
In other words, the script reads stdin and writes stdout.
NB: I saved the spreadsheet in CSV format using semicolon as the
delimiter to avoid confusion with the commas in column A. Now I can
split the columns on semicolon and the column A value on comma without
parsing problems.
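To make the semicolon/comma splitting concrete, here is a minimal sketch of the same idea written as a small function. It is only an illustration: the name expand_row and the sample row are made up, and it assumes that no field contains an embedded semicolon.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Expand one semicolon-delimited row into one row per comma-separated
# value in its first column. expand_row is a hypothetical helper name.
# Assumption: semicolons never occur inside a field.
sub expand_row {
    my ($line) = @_;
    chomp $line;                                 # drop the trailing newline
    my ($first, @rest) = split /;/, $line, -1;   # -1 keeps trailing empty columns
    $first = '' unless defined $first;           # an empty line yields no rows
    $first =~ s/"//g;                            # strip the quotes around column A
    # One output row per comma-separated value; whitespace after a comma
    # is kept as-is, not trimmed.
    return map { join ';', qq("$_"), @rest } split /,/, $first;
}

# Hypothetical sample row: column A holds two values.
print "$_\n" for expand_row(qq("cat,dog";noun;animal\n));
# prints:
# "cat";noun;animal
# "dog";noun;animal
```

The sketch strips the newline first and adds it back on output, so a row that has only column A still ends cleanly.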
Not a bad guess :-)
--
Harold Fuchs
London, England