On 10/12/2012 4:00 AM, Peter von Kaehne wrote:
> Sorry, while the crash has gone, the function is not correct - at
> all.
>
> \cp is meant to give a printed chapter number which has no influence
> on the underlying counting of verses and chapters. How exactly to
> represent it in OSIS, we would need to figure out, but it should not
> influence the creation of subsequent osisIDs. I would think <hi
> type="bold"> is probably the best for our purposes. The OSIS
> reference is not exactly helpful at this point, nor does it reflect
> the reality of module making.

\cp (like \vp) is a workaround for a limitation in Paratext. Paratext requires that all chapter and verse numbers be numeric and strictly increasing; lettered, out-of-order, or repeated chapter or verse numbers are not permitted. Actual Bibles sometimes include these things, however, so you have to enumerate the chapters/verses with strictly increasing numerals, and \cp and \vp let Paratext substitute the correct underlying number when rendering.

The description of \cp in the USFM docs states: "This is a chapter marker (number, letter) which would be displayed in the published text (where the published marker is different than the \c # used within the translation editing environment)." The words "translation editing environment" are a reference to Paratext specifically, and the description as a whole conveys that \cp is the real chapter number if a different \c value is necessitated by Paratext.

OSIS doesn't have this limitation. You can encode the real verse and chapter numbers in OSIS, without need for a workaround.

So usfm2osis.py's replacement of the numeric dummy-chapter with the chapter number specified in \cp is correct.

If you look at your USFM document, I expect you will see something like:

\c 1
\cp A
...
\c 2
\cp 1
...
\c 3
\cp 2
...
\c 4
\cp 3
...
\c 5
\cp B
...
\c 6
\cp 3
...
\c 7
\cp 4

The strictly increasing \c values are just dummy values for Paratext. The \cp values represent the actual underlying chapter numbers for this reference scheme. There aren't two different chapter 3s in Esther, just one that is briefly interrupted by chapter B, but Paratext can't deal with the underlying reference system, so it requires the \cp workaround. Likewise, chapter 4 (\cp 4) isn't really chapter 7 (\c 7).
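To make the mapping concrete, here is a minimal Python sketch that pairs each dummy \c value with the chapter label that actually applies. It is not the usfm2osis.py code; the function name published_chapters and the regular expressions are my own assumptions for illustration, and it only looks at \c and \cp lines.

```python
import re

# Hypothetical helper for illustration only; not taken from usfm2osis.py.
C_RE = re.compile(r"\\c\s+(\d+)\s*$")     # \c <number>
CP_RE = re.compile(r"\\cp\s+(\S+)\s*$")   # \cp <published label>


def published_chapters(usfm_text):
    """Return (dummy, published) pairs, one per \\c marker, in document order."""
    pairs = []
    for line in usfm_text.splitlines():
        line = line.strip()
        m = C_RE.match(line)
        if m:
            # Without a following \cp, the \c value is the real chapter.
            pairs.append([m.group(1), m.group(1)])
            continue
        m = CP_RE.match(line)
        if m and pairs:
            # \cp overrides the dummy value of the most recent \c.
            pairs[-1][1] = m.group(1)
    return [tuple(p) for p in pairs]


SAMPLE = "\n".join([
    r"\c 1", r"\cp A",
    r"\c 2", r"\cp 1",
    r"\c 3", r"\cp 2",
    r"\c 4", r"\cp 3",
    r"\c 5", r"\cp B",
    r"\c 6", r"\cp 3",
    r"\c 7", r"\cp 4",
])

labels = [published for _, published in published_chapters(SAMPLE)]
print(labels)  # ['A', '1', '2', '3', 'B', '3', '4']
```

Note that the label 3 shows up twice (for \c 4 and \c 6). That is the single chapter 3 resuming after chapter B, not a second chapter 3.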

This is mostly based on my experience encoding USX docs for ABS. If your USFM encoder intends that the value in \c be the chapter value, then \cp should not be used. You should look into \ca or \cl as alternatives.

> Right now the code does two things: in the sample below, it replaces
> the chapter number 1 with an A in the subsequent verse's osisID
> ("Esth.A.1" instead of "Esth.1.1"), and it leaves the \cp A in place.
> Both are wrong, according to both the OSIS reference and the wishes
> of the USFM writer in my example.

With the update just committed, usfm2osis.py should now correctly remove \cp (and \vp). That was a bug--actually a set of bugs. Again, I regrettably haven't tested this, but the code looks good to me.
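For reference, the intended behaviour (use \cp and \vp to build the ID, then drop them from the output) can be sketched roughly as below. This is a simplified illustration under my own assumptions, not the committed code: verse_ids, the default book abbreviation, and the one-verse-per-line input are all hypothetical, and the resulting ID strings are not checked against the OSIS schema.

```python
import re

# Hypothetical sketch; real USFM handling in usfm2osis.py is more involved.
VP_RE = re.compile(r"\\vp\s+(\S+?)\s*\\vp\*\s*")  # \vp <label>\vp*


def verse_ids(usfm_text, book="Esth"):
    """Return (id, text) pairs, consuming \\cp and \\vp instead of emitting them."""
    chapter = None
    out = []
    for line in usfm_text.splitlines():
        line = line.strip()
        m = re.match(r"\\c\s+(\d+)$", line)
        if m:
            chapter = m.group(1)      # dummy value, may be overridden by \cp
            continue
        m = re.match(r"\\cp\s+(\S+)$", line)
        if m:
            chapter = m.group(1)      # real chapter label; the \cp line is dropped
            continue
        m = re.match(r"\\v\s+(\S+)\s*(.*)$", line)
        if m:
            verse, text = m.group(1), m.group(2)
            vp = VP_RE.match(text)
            if vp:
                verse = vp.group(1)   # real verse label replaces the dummy \v value
                text = text[vp.end():]
            out.append((f"{book}.{chapter}.{verse}", text))
    return out


sample = "\n".join([
    r"\c 1",
    r"\cp A",
    r"\v 1 first verse text",
    r"\v 2 \vp 5\vp* second verse text",
])
result = verse_ids(sample)
```

With this input, the first verse gets the ID "Esth.A.1", and neither \cp nor \vp survives in the returned text.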

--Chris


