Re: [abcusers] ABCp output data structure
Well, considering you really need to parse most fields *anyway* in order for the parser to have a context for parsing the actual tune data (for example, if Key is currently G, then the F note I just read is actually an F#...), I'm not sure it makes much sense to leave that decision up to the calling program. You're going to have the parsed data in a structure *somewhere* for the parser itself to access -- may as well pass back that structure as part of the overall output, and save the calling application extra work.

Fields and settings the parser doesn't recognize *should* be passed back to the caller as text. For example, if that key field had a flag like:

abc2dulcimer:tuning=DAD

...then the parser should pass back that part of the key field intact (or broken into tag name and tag value, or maybe even the name broken into app name and tag name) as a substructure of a key structure. Likewise for many of the Xcommands, whose scope really doesn't fall into the job of a basic ABC parser. But any field which the parser *should* know about should be parsed and the results passed back to the calling app.

IMHO,
--Steve Bennett

Remo D. wrote:
To avoid unnecessary work, I decided to provide separate functions to parse the field bodies instead of parsing them anyway. In other words, you first receive the entire body of the field as a string and then, if you are really really interested, you call a function like:

k=abcKey(string);

to really parse the field body, extract all the information and pack them into k.

To subscribe/unsubscribe, point your browser to: http://www.tullochgorm.com/lists.html
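A minimal sketch of the point made above, that the parser needs the key as context anyway and can hand back anything it does not recognise as text. The names (`Key`, `parse_key`, `sounding_accidental`) are hypothetical and not part of ABCp; the sketch assumes the first token of the K: body is the tonic with an optional mode, optionally followed by a separate mode word, and that every later token is an extra to be passed back untouched (split into app name, tag name and value when it has that shape).

```python
import re
from dataclasses import dataclass, field

LETTERS = "CDEFGAB"
SHARP_ORDER = "FCGDAEB"   # order in which sharps are added; flats use the reverse
FIFTHS = {"F": -1, "C": 0, "G": 1, "D": 2, "A": 3, "E": 4, "B": 5}
# Shift in number of sharps relative to the major key on the same tonic.
MODE_SHIFT = {"": 0, "maj": 0, "lyd": 1, "mix": -1, "dor": -2,
              "m": -3, "min": -3, "phr": -4, "loc": -5}

@dataclass
class Key:
    tonic: str
    mode: str
    signature: list                              # seven ints for C D E F G A B: +1 sharp, -1 flat
    extras: list = field(default_factory=list)   # unrecognised tokens, kept as text

def parse_key(body):
    tokens = body.split()
    m = re.fullmatch(r"([A-G][#b]?)([A-Za-z]*)", tokens[0]) if tokens else None
    if m is None or m.group(2).lower()[:3] not in MODE_SHIFT:
        raise ValueError(f"unrecognised key: {body!r}")
    tonic, mode, rest = m.group(1), m.group(2).lower()[:3], tokens[1:]
    # Accept the mode as a separate word, as in "D mix".
    if not mode and rest and rest[0].isalpha() and rest[0].lower()[:3] in MODE_SHIFT:
        mode, rest = rest[0].lower()[:3], rest[1:]
    n = FIFTHS[tonic[0]] + 7 * (tonic.count("#") - tonic.count("b")) + MODE_SHIFT[mode]
    if abs(n) > 7:
        raise ValueError(f"key needs double accidentals: {body!r}")
    signature = [0] * 7
    changed = SHARP_ORDER[:n] if n > 0 else SHARP_ORDER[::-1][:-n]
    for letter in changed:
        signature[LETTERS.index(letter)] = 1 if n > 0 else -1
    extras = []
    for tok in rest:
        # "app:tag=value" or "tag=value" is split; anything else is kept whole.
        m = re.fullmatch(r"(?:(\w+):)?(\w+)=(.*)", tok)
        extras.append(m.groups() if m else (None, None, tok))
    return Key(tonic, mode or "maj", signature, extras)

def sounding_accidental(key, letter, written=None):
    """An accidental written on the note wins; otherwise the key signature applies."""
    return written if written is not None else key.signature[LETTERS.index(letter)]
```

With this, `parse_key("G abc2dulcimer:tuning=DAD")` yields a signature in which F is sharp, and the dulcimer tag comes back in `extras` for the calling application to interpret or ignore.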
Re: [abcusers] ABCp output data structure
Paul Rosen writes: What about fermata over a barline? I didn't know it was possible, but, OK. For a little book giving a lot of information about Music notation I commend Gerou: Essential Dictionary of Music Notation. I don't think any writer of music software in any sense should be without it. The FORMATTING element is one of: End of line break beam How do you handle principal beams? ie --- | | | | | | | | | | | | | | | | | | I didn't, but I didn't think ABC did either. For completeness, it would be nice to specify, in case ABC version 3.0 handles it, but that is starting to make my head hurt. The NOTE element contains: guitar chord unrecognized - arbitrary length string guitar chord recognized - root pitch (enum), type (enum), base note (enum) gracings - enum bowing - enum staccato/legato - enum one or more stacked notes, containing: grace notes - array of pitches Distinction between acciaccatura and appoggiatura? I'm afraid that question is beyond my musical depth. See Gerou. It's the slash in the tail. The acciaccatura (with the slash) is a short crushed note (or more) inserted, the appoggiatura is a note which steals its complete length from the next note. Acciaccaturas are always 8th notes (or if there are many, 16th notes) whereas appoggiaturas can be any length. Quarter note ones are quite common. How do you handle chords in grace notes? I didn't think it was possible. In piano music it's quite common. For instance a low octave (2 notes) in the left hand followed by a chord in both hands. If there are multiple notes stacked, then each one can have an independent set of grace notes. start tie - bit field end tie - bit field start slur - bit field end slur - bit field What's a bit field? Is it just an array of 0..1? If so, I don't understand how a start tie et al is a bit field. Sorry about the terminology. I meant that those items can only have the value true or false. 
They can be represented by a byte that only contains 0 or 1, or they can be represented by a particular bit that is packed with other short fields. It doesn't matter conceptually. It does matter if particular languages don't support bit fields. OK I was thinking that what is important in a tie or a slur is what note it starts on and what note it ends on. And what position it's in. Head end, or stem end. Or even mid-stem. See Gerou. The app would interpret a slur over three notes like this: Process note 1. Notice that the start slur bit is ON. Remember that. Process note 2. Notice that the end slur bit is OFF. Keep looking. Process note 3: Notice that the end slur bit is ON. Now we have the info to draw the slur from note 1 to note 3. And slurs within slurs? ie a slur within a long phrase mark? This also allows a slur to start and stop on the same note. What about the articulations: accent (), tenuto, legato, staccato, staccatissimo, martellato and don't forget these can be at the stem end and the head end. OK. I think the above is fairly complete. What about 1st/2nd/3rd time endings for repeats. Coda sections and DC al fine and such like. 1st/2nd endings are in the BAR object. Coda and DC al fine and anything else you can think of like that should probably also be in the bar object. What about the type of binding between staves making systems? ie brackets, simple line, brace, brace+bracket? I didn't see anything in ABC that allowed multiple staves, so I didn't address that, although it bugged me. If we allow multiple staves we should allow a comment to appear before them, like having four staves marked violin violin viola cello. Multiple staves are fine in ABC2 and a lot of music has been written with them. What about arpeggiando marks? Caesura and commas. Out of my depth, again, I'm afraid. Arpeggiando: the twiddly vertical line before a chord indicating a spread chord. It can also have arrows on top or bottom. 
Caesura is a // mark indicating a complete break in the flow of music. A comma (breath mark) is a brief pause. Again, Gerou is invaluable. Note size (cue sized, grace notes, normal). I guess that note shape, like using diamonds and x, also. I guess that should optionally be part of each note, and the default should be in the header. Stave size (cue sized or normal) Hmmm... I don't understand this one. When you have (eg) a flute and piano work, then the flautist gets only his own part. The piano player typically gets his own part in normal size, and above it the flute staff in a smaller size. It's called cue sized. Fingerings. I had suggested that under the note section. I don't think ABC supports it, but it would be nice to have. I suggest the following for indicating note length: Let's pick a small value that is the shortest note length we support: perhaps a triplet of 1/64th notes. That would be indicated by 1. Then internally every length is a multiple of that. That
Re: [abcusers] ABCp output data structure
On Sat, Sep 11, 2004 at 02:17:00AM -0400, Paul Rosen wrote: First of all, I was working from the 1.6 document, I probably should have been using a newer one. I noticed that http://www.gre.ac.uk/%7Ec.walshaw/abc/abc-draft.txt contains some interesting stuff. Is that the latest that has generally been agreed upon? elemskip - arbitrary length string.[What does this do?] It's a hangover from abc2mtex. I don't think it's meaningful to any other program. We could either carry it forward, knowing that no one will use it, but some will be confused by it, or we can deprecate it. Or we can define it so that it is useful. I've just seen John W's comments - that nonessential information should be preserved, and agree with it. It's conceivable that someone would use this new parser to transpose a piece (or do something else with it - who knows where the parser would end up?) and then want to feed the transformed piece back through abc2mtex. So it would have to be possible to carry any contents of this field forward. Since the default length can have only a limited number of values an int or enum would probably be more appropriate? What about the Balkan-style signatures? In BarFly I opted for representing it as two integers, representing C and C| by making the numerator negative, but it's a kludge and I have no sensible way of extending it to deal with complex metres. It needs some serious thought. I'm not familiar enough with the odd meters to know what they are supposed to look like when printed. And do playback programs need to be aware of them? For stressing purposes, maybe. I'd still support a specific copyright notice. It's definitely a thing that some of us need to be able to write. ignore some of them. Is there a recommended place that each of the header text fields should go? Apart from X: and K: they can go in any order. What I do in BarFly is rather than parsing the header from top to bottom I seek for the fields that I actually need, ignoring everything else. 
I didn't state that clearly, and two out of two people misunderstood. Whoops! I meant, where the header fields go on the printed music. In other words, is there anything that says that Composer should go on the top right hand corner of the music instead of underneath the music? If I put vital info in the History field, will all programs display it somewhere?

I'd like to see this handled with something like a printf-style format string. I maybe want different fields printed, in different positions with respect to the tune, for different purposes; any attempt by a printer program to place them once and for all would be less than ideal. But I'm not sure this is a problem for the parser?

Now my head is really starting to hurt. Whenever I see these obscure cases, I want to scream, But this is for FOLK music! But I try to contain myself, because the more complete we make this, the less likely it is that the first person who tries to use this will be frustrated.

Perhaps different folk-musics are simple in different ways? (if at all).

--
Richard Robinson
The whole plan hinged upon the natural curiosity of potatoes - S. Lem
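On the meter question raised in this message (C versus 4/4, and Balkan-style additive signatures), here is one possible representation, offered only as a sketch. `Meter` and `parse_meter` are hypothetical names. The sketch assumes C| means 2/2, which is only one reading of that symbol, and that a bare number such as 4 means 4/1. It keeps the written symbol separately so that C and 4/4 stay distinguishable, and keeps the beat groups so that a player could use them for stress.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class Meter:
    groups: tuple      # numbers printed on top, e.g. (2, 2, 2, 3)
    unit: int          # number printed on the bottom, e.g. 8
    symbol: str = ""   # "C" or "C|" when written as a symbol, else ""

    @property
    def bar_length(self):
        """Length of one bar as a fraction of a whole note."""
        return Fraction(sum(self.groups), self.unit)

def parse_meter(text):
    text = text.strip()
    if text == "C":
        return Meter((4,), 4, "C")
    if text == "C|":
        return Meter((2,), 2, "C|")      # assumption: cut time taken as 2/2
    top, _, bottom = text.partition("/")
    groups = tuple(int(part) for part in top.split("+"))
    return Meter(groups, int(bottom) if bottom else 1)
```

A printing program would look at `symbol` and `groups`, while a playback program could ignore `symbol` and work from `groups` and `bar_length` alone.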
Re: [abcusers] ABCp output data structure
On 11 Sep 2004, at 07:17, Paul Rosen wrote: rhythm - arbitrary length string.[can this be interpreted in any way?] Yes. Several programs (BarFly, abcMus, abc2midi) use it to invoke a stress program when playing. However, it's probably best to leave it as a string here and let the controlling program interpret it. If there is a set of acceptable words, I'd think it is in our interest to put them in the standard as preferred, while allowing any arbitrary string. Then playback programs would know which rhythms to emulate. I'd suggest this: rhythm-enum: enum of common rhythm markings (apps are encouraged to support these) rhythm-other: arbitrary string.(apps don't need to worry about these other than printing the string) That's a very reasonable suggestion. Both abcMus and BarFly allow users to create their own stress programs and identify them by name in the R: field. It would be nice, however, to have a common set of names which all programs which support this feature can be expected to recognise. default length - double? Since the default length can have only a limited number of values an int or enum would probably be more appropriate? I was thinking that if we define the duration in a particular way that the length would use the same units. (see below) meter - double? No. As you point out below you are going to want to deal with the distinction between C and 4/4, and at some point deal with complex time signatures like 2+2+2+3/8. In BarFly I opted for representing it as two integers, representing C and C| by making the numerator negative, but it's a kludge and I have no sensible way of extending it to deal with complex metres. It needs some serious thought. I'm not familiar enough with the odd meters to know what they are supposed to look like when printed. And do playback programs need to be aware of them? They are printed much the way they appear in the abc, i.e. in the above example the time signature would have 2+2+2+3 on top and 8 on the bottom. 
Player programs could make use of this information for stress programming. tempo - [note length and beats per minute] double? and int parts - array of bytes starting key - enum That enum's definition is going to be very large to deal with all possibilities. It would probably be worth breaking this down into several fields, e.g. tonic mode globalaccidentals keysignature (an array of seven enums, each of which can represent #, b, natural or null?) OK. I'd add double-sharp and double-flat to the keysignature array, too. Yes. There's also some additional stuff which can go in the K: field: middle transpose capo I guess transpose and capo are really flavors of the same thing, which is to make the playback program play notes other than what's written, and put a note on notation programs. What does middle do? middle defines the position of notes on the staff. For a treble staff the default is middle = B and you don't normally have to change it. For a bass staff some programs default to middle = d and others to middle = D, which means that they display each other's abc output wrongly, unless there is a middle directive in the abc which specifies exactly what is intended. (We have never been able to agree on what the default should be, and the middle directive was a way around the impasse:-) transpose applies only to player programs, and shifts the pitch by the specified number of semitones. capo applies only when playing, and shifts the pitch of guitar chords only. Now, the first thing that I would do is write a C++ class that accepts that structure as input, and puts it into collections, with lots of access functions. We could simulate that with C, where we write a set of functions like: GetNote(struct theData* pData, int iNoteNumber, struct aNote* pNote); Personally, I'd rather work directly with the data, rather than use access functions. It's possible that this would pose problems for some languages though. 
I'm a little surprised, but that's ok, the data structure is there. With variable length fields I'd find it easier to encapsulate finding the data. I'm a bare metal kind of person, and if I were obliged to use accessor functions I'd inevitably find myself wanting to do something that wasn't supported. On the other hand, using accessor functions has the advantage that you can change the underlying data structures without breaking existing programs. I wouldn't worry too much about overhead, since modern computers have vast amounts of memory to play with. Much better to plan for the future (e.g. by including blocks of unused space reserved for future use) than to strive to keep it as compact as possible. With the variable length fields that are started by a type-byte, we have that without adding blank space. We need a clear rule for the length of a field so that unrecognized ones can be ignored. Perhaps all fields can be defined the same, that is: Byte 0: Type (enum of all the field types) Byte 1-2: length of this
Re: [abcusers] ABCp output data structure
Paul Rosen writes: as you might have read in other posts, I would be very interested in any work on API for accessing ABC file once parsed. I still did not have a clue for creating one and I would welcome any suggestion! Just let me know when you got an idea. I would break the problem into two parts: first decide what data needs to be represented, then figure out the physical layout. Here's my first shot at a comprehensive description of the data: Have a header section followed by a repeating field section. The header section contains: version - (probably either 1.6 or 2.0 or now) tune number - int title - arbitrary length string. area - arbitrary length string. book - arbitrary length string composer - arbitrary length string. discography - arbitrary length string. elemskip - arbitrary length string.[What does this do?] group - arbitrary length string. history - arbitrary length string. information - arbitrary length string. notes - arbitrary length string. origin - arbitrary length string. source - arbitrary length string. transcription notes - arbitrary length string. rhythm - arbitrary length string.[can this be interpreted in any way?] default length - double? meter - double? String. To include C, C| or ¢ or 3+3+2/8 or 4 (which is the same as 4/1). And don't forget that C| is not always 2/2. It can be n/2 in musical terms. tempo - [note length and beats per minute] double? and int or string. To include allegro etc. parts - array of bytes starting key - enum Going from what to what? Are you including minor keys? Or is this just a list of accidentals? What about one-sharp-plus-one-flat (etc) I would suggest an array of [-2..2] [I'd suggest the following additional fields that aren't in the spec: clef, copyright and additional lyrics] Then the header section is followed by a set of repeating fields of one of four types. The types are: note element, bar element, formatting element, and header element. 
the HEADER element is one of: key (as above) elemskip [What does this do?] key (as above) default length (as above) meter (as above) part - byte tempo (as above) title (as above) words [how is this supposed to look?] The BAR element contains: bar type - enum (single, repeat left, etc.) start ending - bit field ending number - int end ending - bit field What about fermata over a barline? The FORMATTING element is one of: End of line break beam How do you handle principal beams? ie --- | | | | | | | | | | | | | | | | | | The NOTE element contains: guitar chord unrecognized - arbitrary length string guitar chord recognized - root pitch (enum), type (enum), base note (enum) gracings - enum bowing - enum staccato/legato - enum one or more stacked notes, containing: grace notes - array of pitches Distinction between acciaccatura and appoggiatura? How do you handle chords in grace notes? pitch - enum [includes an enum for a rest] length - int what's the unit here? start tie - bit field end tie - bit field start slur - bit field end slur - bit field What's a bit field? Is it just an array of 0..1? If so, I don't understand how a start tie et al is a bit field. [I'd also recommend the following extension: an array of syllables to appear as the lyrics under the note.] [Also, can we add loudness, fermat, and start crescendo, end crescendo, fingering, retard, a tempo, etc.?] Fermat? What about the articulations: accent (), tenuto, legato, staccato, staccatissimo, martellato and don't forget these can be at the stem end and the head end. --- I think the above is fairly complete. What about 1st/2nd/3rd time endings for repeats. Coda sections and DC al fine and such like. What about the type of binding between staves making systems? ie brackets, simple line, brace, brace+bracket? What about arpeggiando marks? Caesura and commas. Note size (cue sized, grace notes, normal). Stave size (cue sized or normal) Fingerings. 
Now, to represent it is tougher to allow ease of use in all programming languages. No it's not. Make it strings. The way I'd represent it without using objects is with a stream of variable length fields. That is, there would be a series of [type length data] elements. The overhead would be: element type - enum length - byte on some element types, short on some element types, and not present for some element types. data - length bytes of data, interpreted differently for each element type. The beginning of the structure would contain a 2-byte version number and a 4-byte total length, and possibly a signature. In addition, we could have an array of indexes into the start of each element. Perhaps another array of indexes into the start of each note element, so that a MIDI program wouldn't have to wade through non-sounding elements. The bit fields would be combined in a byte when possible. The enums are a byte. In the note structure, for each field there is a value that
Re: [abcusers] ABCp output data structure
Paul Rosen writes: elemskip - arbitrary length string.[What does this do?] Elemskip is the distance between notes, a real number. It is used by abc2mtex, but probably not by any other program. It's good to have the parser accept an arbitrary string, tho, since if the field is eventually re-cycled, it could be used for something having only text; then there'd be no backward compatibility problem. The thing that has always puzzled me about ABC is all the header fields. As far as I can tell, not all programs treat the headers the same, and some ignore some of them. Is there a recommended place that each of the header text fields should go? Yes---elemskip is a good example---all programs but one ignore it. In the header section, only the X: and K: fields have fixed positions. (Of course, it is important whether the fields occur in the header or inline.) But the order of the fields is purposely flexible; makes parsing harder, perhaps, but it cuts down enough on the errors in writing tunes to make that worthwhile, especially to musicians. (!) This goes for a number of other features of the language, since it's supposed to be both human-readable and human-writable, as well as machine-readable. I gather from the comments I read in these threads that the result is an uncomfortable cross between computer and human languages, which might be aggravating when you're the one who has to write the parser. But then, this is yet another reason that a universal parser would be a boon. There is one major limitation with the data as expressed above: If the point of the application using it is to modify the file, then comments, line breaks, and other details are important so that the file looks as much like the original as possible. In other words, not only should the structure be a straightforward description of the music, it should have all the information that is required to write the tune back out identically. 
For instance, we should be able to tell between C and 4/4 in the time signature. One way to handle comments, spaces, and line breaks is to have a second structure that contained them and instructions for inserting them back where they need to be. Many programs would ignore that, a transposing program wouldn't.

A good point. Since the notation is supposed to be human readable, you want to keep just about everything in place--it's difficult to know beforehand what small changes will confuse a human reader, or, for that matter, for what purposes the parser will be used. Secondly, this is a good test of your parser: if you can reproduce the tune from its representation in the parser, you know that the parser is complete, i.e. it has all the information it needs. (In mathematical terms, the mapping abc ---> parsed abc is invertible.)

Cheers, John Walsh
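The round-trip test suggested above can be tried on even a very crude parser. The following is a sketch only, with hypothetical names (`parse_lossless`, `write_back`): it works line by line, treats everything from the first % on a line as a comment (an assumption that ignores any escaping rules), and keeps every character, including line endings, so that writing the elements back out gives the original text.

```python
import re

FIELD_RE = re.compile(r"[A-Za-z]:")   # e.g. "X:", "T:", "K:" at the start of a line

def classify(body, comment):
    """Label a line by its non-comment part."""
    if not body.strip():
        return "comment" if comment else "blank"
    return "field" if FIELD_RE.match(body) else "music"

def parse_lossless(abc_text):
    elements = []
    for line in abc_text.splitlines(keepends=True):
        # Everything from the first '%' onward is kept verbatim as the comment.
        body, percent, rest = line.partition("%")
        comment = percent + rest
        elements.append({"kind": classify(body, comment),
                         "body": body, "comment": comment})
    return elements

def write_back(elements):
    """Inverse of parse_lossless: concatenating the pieces restores the source."""
    return "".join(e["body"] + e["comment"] for e in elements)
```

A transposing program built on this would rewrite only the `body` of the music elements and leave comments, blank lines and unknown fields exactly as they were.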
[abcusers] ABCp output data structure
as you might have read in other posts, I would be very interested in any work on API for accessing ABC file once parsed. I still did not have a clue for creating one and I would welcome any suggestion! Just let me know when you got an idea. I would break the problem into two parts: first decide what data needs to be represented, then figure out the physical layout. Here's my first shot at a comprehensive description of the data: Have a header section followed by a repeating field section. The header section contains: version - (probably either 1.6 or 2.0 or now) tune number - int title - arbitrary length string. area - arbitrary length string. book - arbitrary length string composer - arbitrary length string. discography - arbitrary length string. elemskip - arbitrary length string.[What does this do?] group - arbitrary length string. history - arbitrary length string. information - arbitrary length string. notes - arbitrary length string. origin - arbitrary length string. source - arbitrary length string. transcription notes - arbitrary length string. rhythm - arbitrary length string.[can this be interpreted in any way?] default length - double? meter - double? tempo - [note length and beats per minute] double? and int parts - array of bytes starting key - enum [I'd suggest the following additional fields that aren't in the spec: clef, copyright and additional lyrics] Then the header section is followed by a set of repeating fields of one of four types. The types are: note element, bar element, formatting element, and header element. the HEADER element is one of: key (as above) elemskip [What does this do?] key (as above) default length (as above) meter (as above) part - byte tempo (as above) title (as above) words [how is this supposed to look?] The BAR element contains: bar type - enum (single, repeat left, etc.) 
start ending - bit field ending number - int end ending - bit field The FORMATTING element is one of: End of line break beam The NOTE element contains: guitar chord unrecognized - arbitrary length string guitar chord recognized - root pitch (enum), type (enum), base note (enum) gracings - enum bowing - enum staccato/legato - enum one or more stacked notes, containing: grace notes - array of pitches pitch - enum [includes an enum for a rest] length - int start tie - bit field end tie - bit field start slur - bit field end slur - bit field [I'd also recommend the following extension: an array of syllables to appear as the lyrics under the note.] [Also, can we add loudness, fermat, and start crescendo, end crescendo, fingering, retard, a tempo, etc.?] --- I think the above is fairly complete. Now, to represent it is tougher to allow ease of use in all programming languages. The way I'd represent it without using objects is with a stream of variable length fields. That is, there would be a series of [type length data] elements. The overhead would be: element type - enum length - byte on some element types, short on some element types, and not present for some element types. data - length bytes of data, interpreted differently for each element type. The beginning of the structure would contain a 2-byte version number and a 4-byte total length, and possibly a signature. In addition, we could have an array of indexes into the start of each element. Perhaps another array of indexes into the start of each note element, so that a MIDI program wouldn't have to wade through non-sounding elements. The bit fields would be combined in a byte when possible. The enums are a byte. In the note structure, for each field there is a value that indicates that the element isn't there. That is, one of the gracings is none. --- This is just a sketch. Obviously we would have to define it more carefully, and define all our enums. 
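To make the [type length data] idea concrete, here is a small sketch of such a stream. Several details are assumptions made only for illustration: little-endian byte order, a 2-byte length on every element (the proposal above allows a byte, a short, or no length depending on the type), no signature or index arrays, and made-up type codes. It shows the property that matters for compatibility: a reader can step over any element whose type it does not know.

```python
import struct

HEADER = struct.Struct("<HI")    # 2-byte version, 4-byte total length
ELEMENT = struct.Struct("<BH")   # 1-byte element type, 2-byte data length

# Hypothetical type codes, for illustration only.
T_TITLE, T_NOTE, T_FUTURE = 1, 2, 99

def pack_stream(version, elements):
    """elements is a list of (type, data_bytes) pairs."""
    body = b"".join(ELEMENT.pack(etype, len(data)) + data
                    for etype, data in elements)
    return HEADER.pack(version, HEADER.size + len(body)) + body

def read_stream(buf, known_types):
    """Return (version, elements), skipping element types not in known_types."""
    version, total = HEADER.unpack_from(buf, 0)
    pos, found = HEADER.size, []
    while pos < total:
        etype, length = ELEMENT.unpack_from(buf, pos)
        pos += ELEMENT.size
        if etype in known_types:
            found.append((etype, buf[pos:pos + length]))
        pos += length            # unknown elements are stepped over here
    return version, found
```

An older reader given a stream that contains a newer element type, such as `T_FUTURE` here, still recovers every element it understands.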
I think the most important thing, first, is to make sure that we can express anything we want with the structure. Also, we should make sure that the structure is backward and forward compatible: unknown fields should be easy to ignore. I think the above can express anything in ABC, but I'd want to look toward the future and try to express most things that can appear in music. There is one major limitation with the data as expressed above: If the point of the application using it is to modify the file, then comments, line breaks, and other details are important so that the file looks as much like the original as possible. In other words, not only should the structure be a straightforward description of the music, it should have all the information that is required to write the tune back out identically. For instance, we should be able to tell between C and 4/4 in the time signature. One way to handle comments, spaces, and line breaks is to have a second structure that contained them and instructions for inserting them back where they need to be. Many programs would ignore that, a transposing program wouldn't. --- The other major problem