A user of my tune finder pointed out an especially bad example of
line wrapping, by way of asking me if there was something wrong with
my abc conversion code. You can see the tune by asking the tune
finder for Highlanders Farewell to Ireland, which will produce four
matches. The last one is my copy in my Test directory, which looks
like this:
X:95
T:Test: Wrap 1 (Highlanders Farewell to Ireland, The)
M:C|
L:1/8
Q:86
R:Strathspey
K:C
"Am"A,2 CB, A,/2A,/2A,A,>^F|"G"G>E D<B, G,/2G,/2G,G,>G|"Am"E>D E<G A>BA>
G|"Em"E<GD>B, "Am"A,/2A,/2A,A,2:|!
|:a>g a>b a/2a/2aa2|"G"g>ag>e d/2c/2B/2A/2 G>g|"Em"e>d e<g "Am"a>b a>g|"
C"e<g "G"d>B "Am"A/2A/2AA2:|!
|:e>A A/2A/2A e>Ae2|"G"d>G G/2G/2G B>G B<d|"Am"e>A A/2A/2A e>d e<a|"G"g>
e d>B "Am"A/2A/2AA2:|!
|:"Am"a>ba>e a>ba>e|"G"g>ag>e d/2c/2B/2A/2 G>g|"Em"e>d e<g "Am"a>b a>g|"
C"e<g "G"d>B "Am"A/2A/2AA2:||
I decided to capture this in my own file (in case Chris Ricker sees
this and decides to fix it ;-). Meanwhile, I've added a bit of a
kludge to my tune-extraction code that is in fact successful at
recovering from the damage here. If you use the tune finder to return
PS or PNG for this tune, it will look good. Before I added the
kludge, the tune was hopelessly garbled. It's easy to see why. In the
second (fourth?) music line, the initial C"e<g... has chords that look
like notes and vice versa, because a line feed was inserted inside a
quoted string.
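To make the failure concrete, here is a small Python sketch of what a naive reader does with the broken line. It is only an illustration, not the tune finder's code; split_chords is a made-up name, and the sketch assumes that chord names are simply whatever sits between pairs of double quotes.

```python
def split_chords(line):
    """Split one ABC music line on double quotes.

    Pieces at even positions are outside quotes (music); pieces at odd
    positions are inside quotes (chord names). This is an assumption for
    illustration, not a full ABC parser.
    """
    parts = line.split('"')
    music = parts[0::2]
    chords = parts[1::2]
    return music, chords


# The wrapped continuation line, read on its own: roles are swapped.
broken = 'C"e<g "G"d>B "Am"A/2A/2AA2:|!'
music, chords = split_chords(broken)
# music  == ['C', 'G', 'Am']
# chords == ['e<g ', 'd>B ', 'A/2A/2AA2:|!']

# The same text with the lost |" put back in front of it.
repaired = '|"' + broken
music, chords = split_chords(repaired)
# chords == ['C', 'G', 'Am']
```

Once the opening quote has been cut off onto the previous line, every quote on the continuation line toggles the wrong way, so the notes are read as chord names and the chord names as notes.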
I'm mentioning this with the idea that maybe people have some good
ideas for heuristics to recover from such damage. There are several
possibilities here.
First, my kludge just looks for the ! at the end of the line, as a
clue that this is from a Microsoft system, which is the origin of
most of the line wrapping that I've seen. (Unix users have beaten up
their email providers so badly over this issue that most Unix-based
email software now just passes the message along, and doesn't try to
"fix" any of the text - except for that silly ">From" that is still
with us. But we don't know how to stop the Microsoft crowd from being
"user friendly" like this. And most of them have never even seen a
punch card. ;-)
The final ! isn't enough of a clue by itself. My kludge only combines
the lines if the preceding line ends with a few characters that
should never end ABC lines, [<>"] at present. This is to prevent
merging the line with a preceding header line. There is a potential
gotcha here: A header line inside the music might end with some of
these. For instance a w: line can. Maybe a more sophisticated test
is needed for a preceding header line.
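For anyone who wants to experiment, the rule described above can be sketched in Python roughly as follows. This is not the actual extraction code. The names join_wrapped, is_header and SUSPECT_ENDINGS are invented for the example, and the header test is only one guess at what a "more sophisticated test" might look like. Lines are assumed to arrive without their trailing newlines.

```python
import re

# Characters that should never end a line of ABC music, per the rule
# described above: < > and the double quote.
SUSPECT_ENDINGS = '<>"'


def is_header(line):
    """Rough guess at a header/field line such as 'K:C' or 'w:words'.

    Assumption: a single letter followed by a colon at the start of the line.
    """
    return re.match(r'^[A-Za-z]:', line) is not None


def join_wrapped(lines):
    """Merge a line ending in '!' into the line before it when that line
    ends with a suspect character and does not look like a header."""
    out = []
    for line in lines:
        prev = out[-1].rstrip() if out else ""
        if (line.rstrip().endswith("!")
                and prev
                and prev[-1] in SUSPECT_ENDINGS
                and not is_header(prev)):
            # No separator: the wrap inserted a line feed, nothing else.
            out[-1] = prev + line
        else:
            out.append(line)
    return out
```

With this, the pair of lines ending in A>BA> and starting with G|"Em" is glued back together, while a w: line that happens to end with a quote is left alone. A line wrapped into three or more pieces is not handled by this sketch.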
Also, my kludge didn't catch the very last line at first, because it
doesn't end with ! but with a bar line. So I added a second kludge,
which looks for the invalid :|| token, and tries the same merger. I'm
not sure which programs generate :||, but I've only seen it from
Windows tools. My kludgery can also reduce it to :| while it's
looking, so this is a useful thing in general.
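The second kludge might look something like this in Python. Again this is only a sketch under assumptions: fix_double_bar_repeat is a made-up name, :|| is only rewritten when it ends a line, and the merge uses the same "previous line ends with <, > or a double quote" test as before.

```python
def fix_double_bar_repeat(lines):
    """Reduce a trailing ':||' to ':|' and, if the previous line ends with
    a character that should never end ABC music, merge the two lines."""
    out = []
    for line in lines:
        stripped = line.rstrip()
        if stripped.endswith(":||"):
            stripped = stripped[:-1]          # ':||' becomes ':|'
            prev = out[-1].rstrip() if out else ""
            if prev and prev[-1] in '<>"':
                out[-1] = prev + stripped     # undo the wrap
            else:
                out.append(stripped)
        else:
            out.append(line)
    return out


tune_end = ['|:"Am"a>ba>e|"', 'C"e<g "G"d>B "Am"A/2A/2AA2:||']
fixed = fix_double_bar_repeat(tune_end)
# fixed == ['|:"Am"a>ba>e|"C"e<g "G"d>B "Am"A/2A/2AA2:|']
```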
But I sorta suspect that the heuristics I've implemented aren't all
that good. And they certainly won't catch all the cases of line
wrapping. So I thought I'd toss it out to the user crowd, and see if
any good ideas pop up.
I've been seriously considering making my tune extractor do a full
parse of the ABC, and canonicalize it into something that abc2ps can
handle. I also use abc2midi, so there's an obvious qualification
here. But the tune extractor knows what format the user wants, so
it's easy to do a different canonicalization for abc2ps and abc2midi,
once I understand what each program's rules are.
OTOH, maybe a full parse and canonicalization isn't necessary, and
some good heuristics can be discovered that handle all but a few
pathological cases.
One obvious heuristic would discover the ABC for bass and alto clef
that uses all the excess commas, and adjust them so that they don't
come out with zillions of leger lines below the staff. I've been
thinking of how this might be added to a web interface in general, so
that users could get such adjustments intentionally. The debate over
the Right Way to handle clefs is fun but rather moot here. I can't
use Windows or Mac tools on my web site, so incoming ABC must be
adjusted to something that my Unix (FreeBSD and Linux actually) tools
will handle correctly. And for my own purposes, I'd like to have the
ability to ask for octave and clef adjustments, regardless of what
the file's owner decided to use.
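As a starting point for discussion, an octave adjustment could be sketched in Python like this. It is a toy, not a proposal for the real thing: shift_octaves and shift_note are invented names, the code only looks at plain note letters A-G/a-g with their commas and apostrophes, it skips anything inside double quotes, and it assumes it is handed music lines only. Header lines, inline fields such as [K:G], and !...! decorations would need to be excluded by the caller.

```python
import re

# A note letter followed by any octave marks (commas lower, apostrophes raise).
NOTE = re.compile(r"([A-Ga-g])([,']*)")


def shift_note(match, octaves):
    """Rewrite one note so that it sounds `octaves` octaves higher."""
    letter, marks = match.group(1), match.group(2)
    # Height 0 is the plain upper-case octave, 1 the plain lower-case one.
    height = (1 if letter.islower() else 0)
    height += marks.count("'") - marks.count(",") + octaves
    if height >= 1:
        return letter.lower() + "'" * (height - 1)
    return letter.upper() + "," * (-height)


def shift_octaves(music_line, octaves=1):
    """Shift every note outside quoted strings by `octaves` octaves."""
    parts = music_line.split('"')
    for i in range(0, len(parts), 2):   # even positions are outside quotes
        parts[i] = NOTE.sub(lambda m: shift_note(m, octaves), parts[i])
    return '"'.join(parts)


line = '"Am"A,2 CB, A,/2A,/2A,A,>^F|'
raised = shift_octaves(line, 1)
# raised == '"Am"A2 cB A/2A/2AA>^f|'
```

Splitting on quotes first keeps chord names like "Am" from being treated as notes, which only works once any wrapping damage inside quoted strings has been repaired.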
So, are there any good ideas out there for how to correct for damaged
ABC like the above?