Re: [abcusers] ABCp output data structure

2004-09-13 Thread Steven Bennett
Well, considering you really need to parse most fields *anyway* in order for
the parser to have a context for parsing the actual tune data (for example,
if Key is currently G, then the F note I just read is actually an F#...),
I'm not sure it makes much sense to leave that decision up to the calling
program.  You're going to have the parsed data in a structure *somewhere*
for the parser itself to access -- may as well pass back that structure as
part of the overall output, and save the calling application extra work.
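
A minimal sketch of the F-in-G example above, assuming the parser keeps the current key as a count of sharps or flats. The names KeyContext and pitch_class are hypothetical and not part of ABCp; explicit accidentals and bar-scoped accidentals are deliberately left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical parser state: the number of sharps (positive) or flats
   (negative) in the key signature currently in force. */
typedef struct {
    int sharps;   /* e.g. 1 for G major, -1 for F major, 0 for C major */
} KeyContext;

/* Return the pitch class (0 = C ... 11 = B) of an unmarked note letter
   'A'..'G' as modified by the key signature held in ctx. */
int pitch_class(const KeyContext *ctx, char letter)
{
    static const char letters[]     = "CDEFGAB";
    static const int  natural[7]    = { 0, 2, 4, 5, 7, 9, 11 };
    static const char sharp_order[] = "FCGDAEB";  /* order in which sharps appear */
    static const char flat_order[]  = "BEADGCF";  /* order in which flats appear */

    assert(ctx != NULL && letter != '\0');
    const char *p = strchr(letters, letter);
    assert(p != NULL);                            /* only 'A'..'G' are valid */
    int pc = natural[p - letters];

    for (int i = 0; i < ctx->sharps && i < 7; i++)
        if (sharp_order[i] == letter) pc += 1;    /* letter is sharpened by the key */
    for (int i = 0; i < -ctx->sharps && i < 7; i++)
        if (flat_order[i] == letter) pc -= 1;     /* letter is flattened by the key */

    return (pc + 12) % 12;
}
```

With the key set to G (one sharp), the letter F resolves to pitch class 6, which is F#; with the key set to C it stays at 5.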

Fields and settings the parser doesn't recognize *should* be passed back to
the caller as text.  For example, if that key field had a flag like:
abc2dulcimer:tuning=DAD
...then the parser should pass back that part of the key field intact (or
broken into tag name and tag value, or maybe even the name broken into app
name and tag name) as a substructure of a key structure.  Likewise for many
of the Xcommands, whose scope really doesn't fall into the job of a basic
ABC parser.
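
One way the "app name, tag name, tag value" breakdown could look, as a sketch only. UnknownTag and split_unknown_tag are hypothetical names, and the fixed 32-byte buffers are an assumption made to keep the example short.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical substructure for a setting the parser does not understand. */
typedef struct {
    char app[32];    /* e.g. "abc2dulcimer" */
    char name[32];   /* e.g. "tuning" */
    char value[32];  /* e.g. "DAD" */
} UnknownTag;

/* Split "app:name=value" into its three parts.
   Returns 0 on success, -1 if the text is malformed or a part is too long. */
int split_unknown_tag(const char *text, UnknownTag *out)
{
    assert(text != NULL && out != NULL);

    const char *colon = strchr(text, ':');
    if (colon == NULL) return -1;
    const char *equals = strchr(colon + 1, '=');
    if (equals == NULL) return -1;

    size_t app_len   = (size_t)(colon - text);
    size_t name_len  = (size_t)(equals - (colon + 1));
    size_t value_len = strlen(equals + 1);
    if (app_len >= sizeof out->app || name_len >= sizeof out->name ||
        value_len >= sizeof out->value)
        return -1;

    memcpy(out->app, text, app_len);
    out->app[app_len] = '\0';
    memcpy(out->name, colon + 1, name_len);
    out->name[name_len] = '\0';
    memcpy(out->value, equals + 1, value_len + 1);  /* copies the terminator too */
    return 0;
}
```

A caller that does not know abc2dulcimer can still carry the three strings along and write them back out unchanged.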

But any field which the parser *should* know about should be parsed and the
results passed back to the calling app.

IMHO,
--Steve Bennett

Remo D. wrote:

 To avoid unnecessary work, I decided to provide separate functions to parse
 the field bodies instead of parsing them anyway.
 
 In other words, you first receive the entire body of the field as a string
 and then, if you are really really interested, you call a function like:
 
 k=abcKey(string);
 
 to really parse the field body, extract all the information and pack it
 into k.

To subscribe/unsubscribe, point your browser to: http://www.tullochgorm.com/lists.html


Re: [abcusers] ABCp output data structure

2004-09-11 Thread Bernard Hill
In message [EMAIL PROTECTED], Paul Rosen 
[EMAIL PROTECTED] writes
What about fermata over a barline?
I didn't know it was possible, but, OK.
For a little book giving a lot of information about Music notation I 
commend Gerou: Essential Dictionary of Music Notation. I don't think any 
writer of music software in any sense should be without it.


The FORMATTING element is one of:
End of line
break beam
How do you handle principal beams? ie
---
   
|   |  |   |   |  |
|   |  |   |   |  |
|   |  |   |   |  |
I didn't, but I didn't think ABC did either. For completeness, it would be
nice to specify, in case ABC version 3.0 handles it, but that is starting to
make my head hurt.
The NOTE element contains:
guitar chord unrecognized - arbitrary length string
guitar chord recognized - root pitch (enum), type (enum), base note
(enum)
gracings - enum
bowing - enum
staccato/legato - enum
one or more stacked notes, containing:
grace notes - array of pitches
Distinction between acciaccatura and appoggiatura?
I'm afraid that question is beyond my musical depth.
See Gerou. It's the slash in the tail. The acciaccatura (with the slash) 
is a short crushed note (or more) inserted, the appoggiatura is a note 
which steals its complete length from the next note. Acciaccaturas are 
always 8th notes (or if there are many, 16th notes) whereas 
appoggiaturas can be any length. Quarter note ones are quite common.


How do you handle chords in grace notes?
I didn't think it was possible.
In piano music it's quite common. For instance a low octave (2 notes) in 
the left hand followed by a chord in both hands.

If there are multiple notes stacked, then
each one can have an independent set of grace notes.
start tie - bit field
end tie - bit field
start slur - bit field
end slur - bit field
What's a bit field? Is it just an array of 0..1? If so, I don't
understand how a start tie et al is a bit field.
Sorry about the terminology. I meant that those items can only have the
value true or false. They can be represented by a byte that only
contains 0 or 1, or they can be represented by a particular bit that is
packed with other short fields. It doesn't matter conceptually. It does
matter if particular languages don't support bit fields.
OK
I was thinking that what is important in a tie or a slur is what note it
starts on and what note it ends on.
And what position it's in. Head end, or stem end. Or even mid-stem. See 
Gerou.

The app would interpret a slur over
three notes like this:
Process note 1. Notice that the start slur bit is ON. Remember that.
Process note 2. Notice that the end slur bit is OFF. Keep looking.
Process note 3: Notice that the end slur bit is ON. Now we have the info
to draw the slur from note 1 to note 3.
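
The three-note walk above can be sketched as a single pass over the note flags. This is an illustration under assumptions: the flag names, SlurSpan and find_slurs are hypothetical, and a small stack of open slurs is added so that a slur nested inside a longer phrase mark also pairs up. With only one start bit and one end bit per note, a note carrying both is read here as a one-note slur, which cannot be told apart from "end one slur, begin the next".

```c
#include <assert.h>
#include <stddef.h>

enum { NOTE_START_SLUR = 1 << 0, NOTE_END_SLUR = 1 << 1 };
enum { MAX_OPEN_SLURS = 16 };

typedef struct { size_t from, to; } SlurSpan;   /* note indexes, inclusive */

/* Walk flags[0..n) and record each completed slur in spans[].
   Returns the number of slurs found, or -1 if slurs are unbalanced,
   nested too deeply, or there are more than max_spans of them. */
int find_slurs(const unsigned char *flags, size_t n,
               SlurSpan *spans, size_t max_spans)
{
    size_t open[MAX_OPEN_SLURS];   /* start indexes of slurs not yet closed */
    size_t depth = 0, count = 0;

    assert(flags != NULL && spans != NULL);
    for (size_t i = 0; i < n; i++) {
        if (flags[i] & NOTE_START_SLUR) {        /* remember where it began */
            if (depth == MAX_OPEN_SLURS) return -1;
            open[depth++] = i;
        }
        if (flags[i] & NOTE_END_SLUR) {          /* close the innermost open slur */
            if (depth == 0 || count == max_spans) return -1;
            spans[count].from = open[--depth];
            spans[count].to = i;
            count++;
        }
    }
    return depth == 0 ? (int)count : -1;
}
```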
And slurs within slurs? ie a slur within a long phrase mark?
This also allows a slur to start and stop on the same note.
What about the articulations: accent (), tenuto, legato, staccato,
staccatissimo, martellato and don't forget these can be at the stem end
and the head end.
OK.
I think the above is fairly complete.
What about 1st/2nd/3rd time endings for repeats. Coda sections and DC
al fine and such like.
1st/2nd endings are in the BAR object.
Coda and DC al fine and anything else you can think of like that should
probably also be in the bar object.
What about the type of binding between staves making systems? ie
brackets, simple line, brace, brace+bracket?
I didn't see anything in ABC that allowed multiple staves, so I didn't
address that, although it bugged me. If we allow multiple staves we should
allow a comment to appear before them, like having four staves marked
violin violin viola cello.
Multiple staves are fine in ABC2 and a lot of music has been written 
with them.


What about arpeggiando marks?
Caesura and commas.
Out of my depth, again, I'm afraid.
Arpeggiando: the twiddly vertical line before a chord indicating a 
spread chord. It can also have arrows on top or bottom.

Caesura is a // mark indicating a complete break in the flow of music. A 
comma (breath mark) is a brief pause. Again, Gerou is invaluable.


Note size (cue sized, grace notes, normal).
I guess that note shape, like using diamonds and x, also. I guess that
should optionally be part of each note, and the default should be in the
header.
Stave size (cue sized or normal)
Hmmm... I don't understand this one.
When you have (eg) a flute and piano work, then the flautist gets only 
his own part. The piano player typically gets his own part in normal 
size, and above it the flute staff in a smaller size. It's called cue 
sized.


Fingerings.
I had suggested that under the note section. I don't think ABC supports it,
but it would be nice to have.
I suggest the following for indicating note length: Let's pick a small
value
that is the shortest note length we support: perhaps a triplet of 1/64th
notes. That would be indicated by 1. Then internally every length is a
multiple of that.
That 
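
A sketch of the integer-length idea, assuming the unit is one note of a 1/64th-note triplet. Three such notes fill the time of two 64ths, so the unit is (2/64)/3 = 1/96 of a whole note. UNITS_PER_WHOLE and duration_units are hypothetical names.

```c
#include <assert.h>

/* One unit = one note of a 64th-note triplet = 1/96 of a whole note. */
enum { UNITS_PER_WHOLE = 96 };

/* Convert a duration of num/den of a whole note into integer units.
   Returns -1 if the duration is not an exact multiple of the unit. */
int duration_units(int num, int den)
{
    assert(num > 0 && den > 0);
    long scaled = (long)num * UNITS_PER_WHOLE;
    if (scaled % den != 0) return -1;
    return (int)(scaled / den);
}
```

Under this reading a quarter note is 24 units and a 32nd note is 3, but a plain 64th note comes out as 1.5 units and is rejected. A scheme that needs both plain and triplet 64ths to be exact would have to pick a finer unit, for example 1/192 of a whole note.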

Re: [abcusers] ABCp output data structure

2004-09-11 Thread Richard Robinson
On Sat, Sep 11, 2004 at 02:17:00AM -0400, Paul Rosen wrote:
 First of all, I was working from the 1.6 document, I probably should have
 been using a newer one. I noticed that
 http://www.gre.ac.uk/%7Ec.walshaw/abc/abc-draft.txt contains some
 interesting stuff. Is that the latest that has generally been agreed upon?
 
   elemskip - arbitrary length string.[What does this do?]
 
  It's a hangover from abc2mtex.  I don't think it's meaningful to any
  other program.
 
 We could either carry it forward, knowing that no one will use it, but some
 will be confused by it, or we can deprecate it. Or we can define it so that
 it is useful.

I've just seen John W's comments - that nonessential information
should be preserved, and agree with it. It's conceivable that someone
would use this new parser to transpose a piece (or do something else
with it - who knows where the parser would end up ?) and then want to
feed the transformed piece back through abc2mtex. So it would have to be
possible to carry any contents of this field forward.

  Since the default length can have only a limited number of values an int or
  enum would probably be more appropriate?

What about the Balkan-style signatures ? 

  In BarFly I opted for representing it as two integers, representing C and C| by
  making the numerator negative, but it's a kludge and I have no sensible way of
  extending it to deal with complex metres.  It needs some serious thought.
 
 I'm not familiar enough with the odd meters to know what they are supposed
 to look like when printed. And do playback programs need to be aware of
 them?

For stressing purposes, maybe.

 I'd still support a specific copyright notice.

It's definitely a thing that some of us need to be able to write.

   ignore some of them. Is there a recommended place that each of the header
   text fields should go?
 
  Apart from X: and K: they can go in any order.  What I do in BarFly is
  rather than parsing the header from top to bottom I seek for the fields
  that I actually need, ignoring everything else.
 
 I didn't state that clearly, and two out of two people misunderstood.
 Whoops! I meant, where the header fields go on the printed music. In other
 words, is there anything that says that Composer should go on the top
 right hand corner of the music instead of underneath the music? If I put
 vital info in the History field, will all programs display it somewhere?

I'd like to see this handled with something like a printf-style format string.
I maybe want different fields printed, in different positions with
respect to the tune, for different purposes, any attempt by a printer
program to place them once and for all would be less than ideal. But I'm
not sure this is a problem for the parser ?


 Now my head is really starting to hurt.
 
 Whenever I see these obscure cases, I want to scream, But this is for FOLK
 music! But I try to contain myself, because the more complete we make this,
 the less likely it is that the first person who tries to use this will be frustrated.

Perhaps different folk-musics are simple in different ways ? (if at
all).


-- 
Richard Robinson
The whole plan hinged upon the natural curiosity of potatoes - S. Lem



Re: [abcusers] ABCp output data structure

2004-09-11 Thread Phil Taylor
On 11 Sep 2004, at 07:17, Paul Rosen wrote:

rhythm - arbitrary length string.[can this be interpreted in any way?]
Yes.  Several programs (BarFly, abcMus, abc2midi) use it to invoke a stress
program when playing.  However, it's probably best to leave it as a string
here and let the controlling program interpret it.
If there is a set of acceptable words, I'd think it is in our interest to
put them in the standard as preferred, while allowing any arbitrary
string. Then playback programs would know which rhythms to emulate. I'd
suggest this:

rhythm-enum: enum of common rhythm markings (apps are encouraged to support
these)
rhythm-other: arbitrary string.(apps don't need to worry about these other
than printing the string)
That's a very reasonable suggestion.  Both abcMus and BarFly allow users
to create their own stress programs and identify them by name in the
R: field.  It would be nice, however, to have a common set of names
which all programs which support this feature can be expected to recognise.
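
A sketch of the rhythm-enum / rhythm-other split, assuming a short list of common names. The list itself (reel, jig, hornpipe, waltz) is only an illustration and is not taken from any agreed standard; RhythmKind, Rhythm and classify_rhythm are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    RHYTHM_OTHER,      /* anything not in the common list: keep the string only */
    RHYTHM_REEL,
    RHYTHM_JIG,
    RHYTHM_HORNPIPE,
    RHYTHM_WALTZ
} RhythmKind;

typedef struct {
    RhythmKind  kind;  /* recognized value, or RHYTHM_OTHER */
    const char *text;  /* the original R: text, always preserved */
} Rhythm;

/* Compare two strings ignoring ASCII case; returns 1 if equal. */
int equal_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both ended together */
}

/* Classify the body of an R: field without discarding the original text. */
Rhythm classify_rhythm(const char *text)
{
    static const struct { const char *name; RhythmKind kind; } known[] = {
        { "reel", RHYTHM_REEL },         { "jig", RHYTHM_JIG },
        { "hornpipe", RHYTHM_HORNPIPE }, { "waltz", RHYTHM_WALTZ }
    };
    assert(text != NULL);

    Rhythm result = { RHYTHM_OTHER, text };
    for (size_t i = 0; i < sizeof known / sizeof known[0]; i++) {
        if (equal_ignore_case(text, known[i].name)) {
            result.kind = known[i].kind;
            break;
        }
    }
    return result;
}
```

Keeping the original text alongside the enum means a user-defined stress program name still reaches the player program untouched.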


default length - double?
Since the default length can have only a limited number of values an int or
enum would probably be more appropriate?
I was thinking that if we define the duration in a particular way that the
length would use the same units. (see below)

meter - double?
No.  As you point out below you are going to want to deal with the distinction
between C and 4/4, and at some point deal with complex time signatures
like 2+2+2+3/8.
In BarFly I opted for representing it as two integers, representing C and C| by
making the numerator negative, but it's a kludge and I have no sensible way of
extending it to deal with complex metres.  It needs some serious thought.
I'm not familiar enough with the odd meters to know what they are supposed
to look like when printed. And do playback programs need to be aware of
them?
They are printed much the way they appear in the abc, i.e. in the above
example the time signature would have 2+2+2+3 on top and 8 on the bottom.
Player programs could make use of this information for stress programming.
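
One possible shape for a meter that keeps C distinct from 4/4 and also holds an additive numerator such as 2+2+2+3/8. This is a sketch under assumptions, not BarFly's representation: Meter, MeterSymbol and parse_meter are hypothetical names, the limit of 8 groups is arbitrary, and the 2/2 stored for C| is only a default, since C| is not always 2/2.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef enum { METER_NUMERIC, METER_COMMON, METER_CUT } MeterSymbol;

typedef struct {
    MeterSymbol symbol;   /* how the signature was written: digits, C or C| */
    int groups[8];        /* numerator parts, e.g. {2, 2, 2, 3} */
    int group_count;
    int denominator;      /* e.g. 8 */
} Meter;

/* Parse "C", "C|", "6/8" or "2+2+2+3/8". Returns 0 on success, -1 on error. */
int parse_meter(const char *text, Meter *out)
{
    assert(text != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    if (strcmp(text, "C") == 0 || strcmp(text, "C|") == 0) {
        int cut = (text[1] == '|');
        out->symbol = cut ? METER_CUT : METER_COMMON;
        out->groups[0] = cut ? 2 : 4;      /* default reading only */
        out->group_count = 1;
        out->denominator = cut ? 2 : 4;
        return 0;
    }

    out->symbol = METER_NUMERIC;
    const char *p = text;
    for (;;) {                             /* numerator: digits joined by '+' */
        if (!isdigit((unsigned char)*p) || out->group_count == 8) return -1;
        char *end;
        long v = strtol(p, &end, 10);
        if (v <= 0 || v > 999) return -1;
        out->groups[out->group_count++] = (int)v;
        p = end;
        if (*p != '+') break;
        p++;
    }
    if (*p != '/' || !isdigit((unsigned char)p[1])) return -1;
    char *end;
    long d = strtol(p + 1, &end, 10);      /* denominator */
    if (*end != '\0' || d <= 0 || d > 999) return -1;
    out->denominator = (int)d;
    return 0;
}
```

A printing program can join the groups with "+" over the denominator, and a player can use the group sizes to place stresses.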


tempo - [note length and beats per minute] double? and int
parts - array of bytes
starting key - enum
That enum's definition is going to be very large to deal with all
possibilities.
It would probably be worth breaking this down into several fields, e.g.

tonic
mode
globalaccidentals
keysignature (an array of seven enums, each of which can represent #,
b, natural or null?)
OK.
I'd add double-sharp and double-flat to the keysignature array, too.
Yes.
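
The tonic / mode / keysignature breakdown could be sketched like this. KeySig, Accidental, Mode and key_fill_signature are hypothetical names; the range -2..2 for accidentals follows the suggestion made elsewhere in the thread, the globalaccidentals field is left out, and "null" is folded into ACC_NATURAL for brevity. The mode offsets are the usual circle-of-fifths shifts relative to the major key on the same tonic.

```c
#include <assert.h>
#include <string.h>

typedef enum {
    ACC_DOUBLE_FLAT = -2, ACC_FLAT = -1, ACC_NATURAL = 0,
    ACC_SHARP = 1, ACC_DOUBLE_SHARP = 2
} Accidental;

typedef enum {
    MODE_MAJOR, MODE_DORIAN, MODE_PHRYGIAN, MODE_LYDIAN,
    MODE_MIXOLYDIAN, MODE_MINOR, MODE_LOCRIAN
} Mode;

typedef struct {
    char       tonic;          /* 'A'..'G' */
    Accidental tonic_acc;      /* e.g. ACC_FLAT for Bb */
    Mode       mode;
    Accidental signature[7];   /* one entry per letter, in the order C D E F G A B */
} KeySig;

/* Fill key->signature from tonic and mode.
   Returns 0 on success, -1 if the key needs more than 7 sharps or flats. */
int key_fill_signature(KeySig *key)
{
    static const char letters[]       = "CDEFGAB";
    static const int  major_sharps[7] = { 0, 2, 4, -1, 1, 3, 5 };   /* C D E F G A B major */
    static const int  mode_shift[7]   = { 0, -2, -4, 1, -1, -3, -5 };
    static const char sharp_order[]   = "FCGDAEB";
    static const char flat_order[]    = "BEADGCF";

    assert(key != NULL);
    assert((int)key->mode >= MODE_MAJOR && (int)key->mode <= MODE_LOCRIAN);
    const char *p = (key->tonic != '\0') ? strchr(letters, key->tonic) : NULL;
    if (p == NULL) return -1;

    int count = major_sharps[p - letters] + 7 * (int)key->tonic_acc
              + mode_shift[key->mode];
    if (count < -7 || count > 7) return -1;

    for (int i = 0; i < 7; i++) key->signature[i] = ACC_NATURAL;
    for (int i = 0; i < count; i++)
        key->signature[strchr(letters, sharp_order[i]) - letters] = ACC_SHARP;
    for (int i = 0; i < -count; i++)
        key->signature[strchr(letters, flat_order[i]) - letters] = ACC_FLAT;
    return 0;
}
```

Because the array is stored rather than derived on demand, a caller can still overwrite individual entries afterwards to express an unusual signature such as one sharp plus one flat.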

There's also some additional stuff which can go in the K: field:
middle
transpose
capo
I guess transpose and capo are really flavors of the same thing, which is to
make the playback program play notes other than what's written, and put a
note on notation programs.

What does middle do?
middle defines the position of notes on the staff.  For a treble staff the
default is middle = B and you don't normally have to change it.  For a
bass staff some programs default to middle = d and others to middle = D,
which means that they display each other's abc output wrongly, unless
there is a middle directive in the abc which specifies exactly what is
intended.  (We have never been able to agree on what the default should
be, and the middle directive was a way around the impasse:-)

transpose applies only to player programs, and shifts the pitch by the
specified number of semitones.
capo applies only when playing, and shifts the pitch of guitar chords
only.
Now, the first thing that I would do is write a C++ class that accepts that
structure as input, and puts it into collections, with lots of access
functions. We could simulate that with C, where we write a set of functions
like:

GetNote(struct theData* pData, int iNoteNumber, struct aNote* pNote);
Personally, I'd rather work directly with the data, rather than use access
functions.  It's possible that this would pose problems for some languages
though.
I'm a little surprised, but that's ok, the data structure is there. With
variable length fields I'd find it easier to encapsulate finding the data.
I'm a bare metal kind of person, and if I were obliged to use accessor
functions I'd inevitably find myself wanting to do something that wasn't
supported.  On the other hand, using accessor functions has the advantage
that you can change the underlying data structures without breaking
existing programs.


I wouldn't worry too much about overhead, since modern computers have vast
amounts of memory to play with.  Much better to plan for the future (e.g.
by including blocks of unused space reserved for future use) than to strive
to keep it as compact as possible.
With the variable length fields that are started by a type-byte, we have
that without adding blank space. We need a clear rule for the length of a
field so that unrecognized ones can be ignored. Perhaps all fields can be
defined the same, that is:

Byte 0: Type (enum of all the field types)
Byte 1-2: length of this 

Re: [abcusers] ABCp output data structure

2004-09-10 Thread Bernard Hill
In message [EMAIL PROTECTED], Paul Rosen 
[EMAIL PROTECTED] writes
 as you might have read in other posts, I would be very interested in any
work on API for accessing ABC file once parsed. I still did not have a clue
for creating one and I would welcome any suggestion! Just let me know when
you got an idea.
I would break the problem into two parts: first decide what data needs to be
represented, then figure out the physical layout.
Here's my first shot at a comprehensive description of the data:
Have a header section followed by a repeating field section.
The header section contains:
version - (probably either 1.6 or 2.0 or now)
tune number - int
title - arbitrary length string.
area - arbitrary length string.
book - arbitrary length string
composer - arbitrary length string.
discography - arbitrary length string.
elemskip - arbitrary length string.[What does this do?]
group - arbitrary length string.
history - arbitrary length string.
information - arbitrary length string.
notes - arbitrary length string.
origin - arbitrary length string.
source - arbitrary length string.
transcription notes - arbitrary length string.
rhythm - arbitrary length string.[can this be interpreted in any way?]
default length - double?
meter - double?
String. To include C, C| or ¢ or 3+3+2/8 or 4 (which is the
same as 4/1). And don't forget that C| is not always 2/2. It can be
n/2 in musical terms.

tempo - [note length and beats per minute] double? and int
or string. To include allegro etc.
parts - array of bytes
starting key - enum
Going from what to what? Are you including minor keys?
Or is this just a list of accidentals? What about 
one-sharp-plus-one-flat (etc)

I would suggest an array of [-2..2]
[I'd suggest the following additional fields that aren't in the spec:
clef, copyright and additional lyrics]
Then the header section is followed by a set of repeating fields of one of
four types.
The types are: note element, bar element, formatting element, and header
element.
the HEADER element is one of:
elemskip [What does this do?]
key (as above)
default length (as above)
meter (as above)
part - byte
tempo (as above)
title (as above)
words [how is this supposed to look?]
The BAR element contains:
bar type - enum (single, repeat left, etc.)
start ending - bit field
ending number - int
end ending - bit field
What about fermata over a barline?
The FORMATTING element is one of:
End of line
break beam
How do you handle principal beams? ie
---
   
|   |  |   |   |  |
|   |  |   |   |  |
|   |  |   |   |  |
The NOTE element contains:
guitar chord unrecognized - arbitrary length string
guitar chord recognized - root pitch (enum), type (enum), base note (enum)
gracings - enum
bowing - enum
staccato/legato - enum
one or more stacked notes, containing:
   grace notes - array of pitches
Distinction between acciaccatura and appoggiatura?
How do you handle chords in grace notes?

   pitch - enum [includes an enum for a rest]
   length - int
what's the unit here?
   start tie - bit field
   end tie - bit field
   start slur - bit field
   end slur - bit field
What's a bit field? Is it just an array of 0..1? If so, I don't 
understand how a start tie et al is a bit field.

[I'd also recommend the following extension: an array of syllables to appear
as the lyrics under the note.]
[Also, can we add loudness, fermat, and start crescendo, end
crescendo, fingering, retard, a tempo, etc.?]
Fermat?
What about the articulations: accent (), tenuto, legato, staccato, 
staccatissimo, martellato and don't forget these can be at the stem end 
and the head end.

---
I think the above is fairly complete.
What about 1st/2nd/3rd time endings for repeats. Coda sections and DC 
al fine and such like.

What about the type of binding between staves making systems? ie 
brackets, simple line, brace, brace+bracket?

What about arpeggiando marks?
Caesura and commas.
Note size (cue sized, grace notes, normal).
Stave size (cue sized or normal)
Fingerings.
Now, to represent it is tougher to
allow ease of use in all programming languages.
No it's not. Make it strings.
The way I'd represent it without using objects is with a stream of variable
length fields. That is, there would be a series of [type length data]
elements.
The overhead would be:
element type - enum
length - byte on some element types, short on some element types, and not
present for some element types.
data - length bytes of data, interpreted differently for each element type.
The beginning of the structure would contain a 2-byte version number and a
4-byte total length, and possibly a signature.
In addition, we could have an array of indexes into the start of each
element. Perhaps another array of indexes into the start of each note
element, so that a MIDI program wouldn't have to wade through non-sounding
elements.
The bit fields would be combined in a byte when possible.
The enums are a byte.
In the note structure, for each field there is a value that 

Re: [abcusers] ABCp output data structure

2004-09-10 Thread John Walsh
Paul Rosen writes:

elemskip - arbitrary length string.[What does this do?]


Elemskip is the distance between notes, a real number.  It is used by abc2mtex,
but probably not by any other program.  It's good to have the parser accept an
arbitrary string, tho, since if the field is eventually re-cycled, it could be
used for something having only text; then there'd be no backward compatibility
problem.

The thing that has always puzzled me about ABC is all the header fields. As
far as I can tell, not all programs treat the headers the same, and some
ignore some of them. Is there a recommended place that each of the header
text fields should go?


Yes---elemskip is a good example---all programs but one ignore it. In the header
section, only the X: and K: fields have fixed positions. (Of course, it is important
whether the fields occur in the header or inline.) But the order of the fields is
purposely flexible; makes parsing harder, perhaps, but it cuts down enough on the errors
in writing tunes to make that worthwhile, especially to musicians. (!)  This goes for a
number of other features of the language, since it's supposed to be both human-readable
and human-writable, as well as machine-readable.  I gather from the comments I read in
these threads that the result is an uncomfortable cross between computer and human
languages, which might be aggravating when you're the one who has to write the parser.
 
But then, this is yet another reason that a universal parser would be a boon.

There is one major limitation with the data as expressed above: If the point
of the application using it is to modify the file, then comments, line
breaks, and other details are important so that the file looks as much like
the original as possible. In other words, not only should the structure be a
straightforward description of the music, it should have all the information
that is required to write the tune back out identically. For instance, we
should be able to tell between C and 4/4 in the time signature. One way
to handle comments, spaces, and line breaks is to have a second structure
that contained them and instructions for inserting them back where they need
to be. Many programs would ignore that, a transposing program wouldn't.


A good point.  Since the notation is supposed to be human readable, you want to
keep just about everything in place--it's difficult to know beforehand what small changes
will confuse a human reader, or, for that matter, for what purposes the parser will be
used.  Secondly, this is a good test of your parser: if you can reproduce the tune from its
representation in the parser, you know that the parser is complete, i.e. it has all the
information it needs.  (In mathematical terms, the mapping abc --> parsed abc is
invertible.)

Cheers,
John Walsh



[abcusers] ABCp output data structure

2004-09-09 Thread Paul Rosen
  as you might have read in other posts, I would be very interested in any
 work on API for accessing ABC file once parsed. I still did not have a clue
 for creating one and I would welcome any suggestion! Just let me know when
 you got an idea.

I would break the problem into two parts: first decide what data needs to be
represented, then figure out the physical layout.

Here's my first shot at a comprehensive description of the data:

Have a header section followed by a repeating field section.

The header section contains:
version - (probably either 1.6 or 2.0 or now)
tune number - int
title - arbitrary length string.
area - arbitrary length string.
book - arbitrary length string
composer - arbitrary length string.
discography - arbitrary length string.
elemskip - arbitrary length string.[What does this do?]
group - arbitrary length string.
history - arbitrary length string.
information - arbitrary length string.
notes - arbitrary length string.
origin - arbitrary length string.
source - arbitrary length string.
transcription notes - arbitrary length string.
rhythm - arbitrary length string.[can this be interpreted in any way?]
default length - double?
meter - double?
tempo - [note length and beats per minute] double? and int
parts - array of bytes
starting key - enum
[I'd suggest the following additional fields that aren't in the spec:
clef, copyright and additional lyrics]

Then the header section is followed by a set of repeating fields of one of
four types.

The types are: note element, bar element, formatting element, and header
element.

the HEADER element is one of:
elemskip [What does this do?]
key (as above)
default length (as above)
meter (as above)
part - byte
tempo (as above)
title (as above)
words [how is this supposed to look?]

The BAR element contains:
bar type - enum (single, repeat left, etc.)
start ending - bit field
ending number - int
end ending - bit field

The FORMATTING element is one of:
End of line
break beam

The NOTE element contains:
guitar chord unrecognized - arbitrary length string
guitar chord recognized - root pitch (enum), type (enum), base note (enum)
gracings - enum
bowing - enum
staccato/legato - enum
one or more stacked notes, containing:
grace notes - array of pitches
pitch - enum [includes an enum for a rest]
length - int
start tie - bit field
end tie - bit field
start slur - bit field
end slur - bit field
[I'd also recommend the following extension: an array of syllables to appear
as the lyrics under the note.]
[Also, can we add loudness, fermat, and start crescendo, end
crescendo, fingering, retard, a tempo, etc.?]

---

I think the above is fairly complete. Now, to represent it is tougher to
allow ease of use in all programming languages.

The way I'd represent it without using objects is with a stream of variable
length fields. That is, there would be a series of [type length data]
elements.

The overhead would be:
element type - enum
length - byte on some element types, short on some element types, and not
present for some element types.
data - length bytes of data, interpreted differently for each element type.

The beginning of the structure would contain a 2-byte version number and a
4-byte total length, and possibly a signature.

In addition, we could have an array of indexes into the start of each
element. Perhaps another array of indexes into the start of each note
element, so that a MIDI program wouldn't have to wade through non-sounding
elements.
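
A sketch of walking such a stream and building the note index. For simplicity it assumes every element uses the same framing, one type byte followed by a two-byte little-endian length, rather than the per-type length sizes described above; the ELEM_ values and index_notes are hypothetical, and the version and total-length prefix is assumed to have been consumed already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element type codes. */
enum { ELEM_HEADER = 1, ELEM_BAR = 2, ELEM_FORMATTING = 3, ELEM_NOTE = 4 };

/* Scan a buffer of [type][length lo][length hi][data...] elements and
   store the byte offset of each note element in index[].
   Returns the number of note elements found (only the first max_index
   offsets are stored), or -1 if the stream is truncated. */
long index_notes(const uint8_t *buf, size_t size,
                 size_t *index, size_t max_index)
{
    assert(buf != NULL || size == 0);
    assert(index != NULL || max_index == 0);

    size_t pos = 0;
    long notes = 0;
    while (pos < size) {
        if (size - pos < 3) return -1;                    /* no room for a header */
        uint8_t type = buf[pos];
        size_t len = (size_t)buf[pos + 1] | ((size_t)buf[pos + 2] << 8);
        if (size - pos - 3 < len) return -1;              /* data runs off the end */

        if (type == ELEM_NOTE) {
            if ((size_t)notes < max_index) index[notes] = pos;
            notes++;
        }
        /* Every other type, known or unknown, is skipped using its length. */
        pos += 3 + len;
    }
    return notes;
}
```

Because the length is read before the type is examined, an element type introduced by a later version is stepped over without any special handling.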

The bit fields would be combined in a byte when possible.

The enums are a byte.

In the note structure, for each field there is a value that indicates that
the element isn't there. That is, one of the gracings is none.

---

This is just a sketch. Obviously we would have to define it more carefully,
and define all our enums. I think the most important thing, first, is to
make sure that we can express anything we want with the structure. Also, we
should make sure that the structure is backward and forward compatible:
unknown fields should be easy to ignore.

I think the above can express anything in ABC, but I'd want to look toward
the future and try to express most things that can appear in music.

There is one major limitation with the data as expressed above: If the point
of the application using it is to modify the file, then comments, line
breaks, and other details are important so that the file looks as much like
the original as possible. In other words, not only should the structure be a
straightforward description of the music, it should have all the information
that is required to write the tune back out identically. For instance, we
should be able to tell between C and 4/4 in the time signature. One way
to handle comments, spaces, and line breaks is to have a second structure
that contained them and instructions for inserting them back where they need
to be. Many programs would ignore that, a transposing program wouldn't.

---

The other major problem