On Fri, Jul 27, 2012 at 2:11 AM, Gunnar Hellström <[email protected]> wrote:
> I see a need to deal with the 'xml:lang' attribute in XEP-0301.
>
> This attribute can introduce alternative language variants of the text in
> messages and other elements.
> The use is described in RFC 6121. For us it is of interest to study its use
> for the <body/> element:
>
> ----copy from RFC 6121 section 5.2.3 Body element--------------------------
> There are no attributes defined for the <body/> element, with the exception
> of the 'xml:lang' attribute. Multiple instances of the <body/> element MAY
> be included in a message stanza for the purpose of providing alternate
> versions of the same body, but only if each instance possesses an 'xml:lang'
> attribute with a distinct language value (either explicitly or by
> inheritance from the 'xml:lang' value of an element farther up in the XML
> hierarchy, which from the sender's perspective can include the XML stream
> header as described in [XMPP-CORE]).
>
> <message from='[email protected]/balcony'
>          id='z94nb37h' to='[email protected]' type='chat' xml:lang='en'>
>   <body>Wherefore art thou, Romeo?</body>
>   <body xml:lang='cs'> PročeŽ jsi ty, Romeo? </body>
> </message>
>
> -----------end of copy---------------------------------
>
> For XEP-0301 it would be natural to either offer the same opportunity to
> provide the alternative languages in the same message, or explicitly say
> that alternative languages are not supported.
>
> This would at least go into section 4.2 RTT attributes and 4.5.3.1 <t/>
> element
>
> Each language will have its own editing elements and values, so the xml:lang
> attribute should be on the <rtt/> level.
>
> I propose insertion of a new subsection in 4.2
> ----------------------------------------------------------------------
> 4.2.4 Language
> Multiple instances of the <rtt/> element MAY be included in a message stanza
> for the purpose of providing alternate versions of the same real-time text,
> but only if each instance possesses an 'xml:lang' attribute with a distinct
> language value (either explicitly or by inheritance from the 'xml:lang'
> value of an element farther up in the XML hierarchy, which from the sender's
> perspective can include the XML stream header as described in RFC 6120 [
> ]). The support for language variants SHALL follow the principles of support
> for language variants in message bodies specified in RFC 6121 [ ].
>
> This example provides a small part of real-time text in the default language
> English and the alternative language Czech.
>
> <message from='[email protected]/balcony'
>          id='z94nb37h' to='[email protected]' type='chat' xml:lang='en'>
>   <rtt xmlns='urn:xmpp:rtt:0' seq='89002'><t>tho</t></rtt>
>   <rtt xmlns='urn:xmpp:rtt:0' seq='32304' xml:lang='cs'> <t>ty</t></rtt>
> </message>
>
> ----------------------------------------------------------------------
> The second line from the bottom of 4.1 should be changed from
> "There MUST NOT be more than one <rtt/> element per <message/> stanza."
> to
> "There MUST NOT be more than one <rtt/> element per language variant in each
> <message/> stanza."
>
> -----------------------------------------------------
> Gunnar

This will require deliberation over an extended period. People can only type
on one keyboard at a time, so this is of interest only in special situations
such as simultaneous interpreters working concurrently (e.g. European Union or
United Nations meetings). You can also solve this by having a separate
nickname for each language (InterpreterEN, InterpreterFR, InterpreterCS,
etc.).

The XSF meeting logs show that other people want to review Version 0.6 this
weekend, so I'm going to submit 0.6 tonight. Since more deliberation is
needed about the language attribute, I am going to leave this out of
Version 0.6 unless there's pressure from XSF, or unless there's a good reason
(e.g. Europeans promised quick inclusion of XEP-0301 during
simultaneous-multiple-translation at European Union meetings) to say "HOLD
THE PRESSES".

Also, here is an alternate method that keeps one <rtt/> per message stanza.
I suggest that this is preferable, because the interpreters will be typing
independently of each other and may pause independently of each other, so
there's no good reason to combine multiple <rtt/> into the same message
stanza:

<message from='[email protected]/balcony'
         id='z94nb37h' to='[email protected]' type='chat' xml:lang='en'>
  <rtt xmlns='urn:xmpp:rtt:0' seq='89002'><t>Hello</t></rtt>
</message>
<message from='[email protected]/balcony'
         id='z94nb37h' to='[email protected]' type='chat' xml:lang='fr'>
  <rtt xmlns='urn:xmpp:rtt:0' seq='32304'><t>Bonjour</t></rtt>
</message>

The advantage is that the above continues to use the existing XEP-0301
protocol, and keeps the language attribute out of the <rtt/> element. It is
more backwards compatible, I think.

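For what it's worth, here is a minimal Python sketch of how a language-aware
client might keep separate real-time buffers under this one-<rtt/>-per-stanza
approach. It is only an illustration, not proposed spec text: the function
name append_rtt, the default_lang fallback, and the trimmed stanzas (no
from/to/id attributes) are my own assumptions, and it handles only <t/>
appends, ignoring seq checking and the other action elements.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# ElementTree expands the predefined 'xml:' prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
RTT_NS = "urn:xmpp:rtt:0"


def append_rtt(buffers, stanza_xml, default_lang="en"):
    """Append the <t/> text of one <message/> stanza to a per-language buffer.

    Hypothetical helper: the language is read from the stanza-level xml:lang,
    falling back to default_lang (standing in for the stream default).
    """
    message = ET.fromstring(stanza_xml)
    lang = message.get(XML_LANG, default_lang)
    rtt = message.find(f"{{{RTT_NS}}}rtt")
    if rtt is None:
        return buffers  # no real-time text in this stanza
    for t in rtt.findall(f"{{{RTT_NS}}}t"):
        buffers[lang] += t.text or ""
    return buffers


# Trimmed versions of the two stanzas above (addresses and id omitted).
stanzas = [
    "<message type='chat' xml:lang='en'>"
    "<rtt xmlns='urn:xmpp:rtt:0' seq='89002'><t>Hello</t></rtt></message>",
    "<message type='chat' xml:lang='fr'>"
    "<rtt xmlns='urn:xmpp:rtt:0' seq='32304'><t>Bonjour</t></rtt></message>",
]

buffers = defaultdict(str)
for stanza in stanzas:
    append_rtt(buffers, stanza)

print(dict(buffers))  # {'en': 'Hello', 'fr': 'Bonjour'}
```

A real client would also need to track seq per language, since each
interpreter's stream advances independently.
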
Clients that don't track multiple languages will simply focus on the default
language (if they already filter XML to a specific language) or will simply
stall (see "Keeping Real-Time Text Synchronized"), while clients that
distinguish the language attribute would know to keep separate real-time
messages per language. This can also easily be done as a private extension
for a single specialized client.

On the other hand, at this time I need an opinion from XSF about whether it
is acceptable to hold this off until 0.7, since this is a very "niche" and
specialized feature. It can still have merit in international and
supranational organizations, where members of the public might download
off-the-shelf software to watch captioning/translations in their languages.

XSF: I need comments by the end of today about whether it is OK to hold this
off till 0.7.
Gunnar: I need comments: is this related to the European procurement
interests that you told me about?

Thanks
Mark Rejhon
