Re: [sword-devel] RTF in conf files

2024-04-25 Thread Peter von Kaehne

  
  
  

The wiki is, and has always been, a work in progress, and everybody is
invited to improve it. David and I wrote the bulk of it at one point. For
this and similar topics we based it on the code as we understood it, the
mailing list, and the old documentation site. We have been quite explicit
about that, including the fact that neither of us is a programmer.

In that sense it is a bit upsetting and annoying to read this from you here,
Jaak. You could easily have pointed this out at the time, so that we could
either have improved the page or at least noted our lack of clarity and any
confusion or shortcomings in the code.

That said, much of the coverage of all the filters has always been a moving
target: if anyone asked, with a good use case, for improved or clarified
coverage of something, it has often been added. I certainly did a lot of
that in the SWORD filters, making them comply better with OSIS, XHTML, RTF
and so on.

OK, my moan is over. I suggest you turn your list of moans into a list of
bugs; then they might end up getting squashed one by one.
  

From: sword-devel on behalf of Jaak Ristioja
Sent: Friday, April 26, 2024 2:12 AM
To: sword-devel@crosswire.org
Subject: Re: [sword-devel] RTF in conf files

When I tried to write a similar parser some years ago (or rewrite the 
libsword parser(s) in Sword++), I discovered to my dismay that the wiki 
page is quite insufficient. The lack of a formal specification for the 
configuration format leads to various serious ambiguities or questions 
when wanting to write a parser. Some examples:

   * How should different parsing errors be handled?
   * What are the phases for parsing? Should the output of each phase be 
a single string, or a list of strings parsed separately by next phases 
(e.g. lines in case of continuations)?
   * Should continuations be handled in a phase before or after parsing 
RTF? How should "\n\n" be parsed?
   * How to include a literal backslash? If escaped, in which phase of 
parsing?
   * Should official Microsoft RTF syntax rules be used for RTF control 
word tokenization and semantics? Which version(s) of RTF exactly? The 
rules on the Crosswire wiki page might differ from RTF specs.
   * The wiki page states that "using the actual UTF-8 character is 
preferred" to RTF "\u" escapes, but the RTF syntax only allows 7-bit 
ASCII characters. Does this mean that all UTF-8 characters should be 
converted to "\u"-style RTF escapes before handing off to the RTF 
parser? Since the "\u" escapes can only handle code points U+ to 
U+, how should other UTF-8 code points beyond U+ be handled?

The original libsword implementation also seemed to suffer from various 
issues and was not of much help to me, thus I eventually ended up 
abandoning this effort.

J

On 16.04.24 10:20, domcox wrote:
> 
> Only a very small, restricted subset of RTF markup is supported, see:
> https://wiki.crosswire.org/DevTools:conf_Files#RTF
> 
> 
> "David \"Judah's Shadow\" Blue"  writes:
> 
>> I'm working on an info command to display some basic info about modules,
>> and I ran into the fact that, at least in the About entry, the conf file
>> can contain RTF formatting. As it stands I strip out \pard, replace \par
>> with \n, and strip out the tag portion of any anchor/link tags found. My
>> question is, are there any other tags that are likely to appear in conf
>> entries that I should be either handling or stripping (since my front end
>> does no formatting of text whatsoever)?
> 
> 

___
sword-devel mailing list: sword-devel@crosswire.org
http://crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page



Re: [sword-devel] RTF in conf files

2024-04-25 Thread Jaak Ristioja
When I tried to write a similar parser some years ago (or rewrite the 
libsword parser(s) in Sword++), I discovered to my dismay that the wiki 
page is quite insufficient. The lack of a formal specification for the 
configuration format leads to various serious ambiguities or questions 
when wanting to write a parser. Some examples:


  * How should different parsing errors be handled?
  * What are the phases for parsing? Should the output of each phase be 
a single string, or a list of strings parsed separately by next phases 
(e.g. lines in case of continuations)?
  * Should continuations be handled in a phase before or after parsing 
RTF? How should "\n\n" be parsed?
  * How to include a literal backslash? If escaped, in which phase of 
parsing?
  * Should official Microsoft RTF syntax rules be used for RTF control 
word tokenization and semantics? Which version(s) of RTF exactly? The 
rules on the Crosswire wiki page might differ from RTF specs.
  * The wiki page states that "using the actual UTF-8 character is 
preferred" to RTF "\u" escapes, but the RTF syntax only allows 7-bit 
ASCII characters. Does this mean that all UTF-8 characters should be 
converted to "\u"-style RTF escapes before handing off to the RTF 
parser? Since the "\u" escapes can only handle code points U+ to 
U+, how should other UTF-8 code points beyond U+ be handled?


The original libsword implementation also seemed to suffer from various 
issues and was not of much help to me, thus I eventually ended up 
abandoning this effort.
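
One possible reading of the phase-ordering question above, as a minimal
sketch: join continuation lines first, so that every later phase (including
any RTF handling) sees one logical value. It assumes that a continuation is
marked by a trailing backslash; this is an assumption made for illustration,
not something confirmed here against the wiki or libsword. joinContinuations
is a hypothetical helper, not a libsword function.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical phase 1: merge physical lines into logical lines.
// Assumption: a line ending in '\' continues on the next physical line.
// Note the ambiguity raised above: a value that legitimately ends in a
// backslash cannot be told apart from a continuation under this rule.
std::vector<std::string> joinContinuations(const std::string &text) {
    std::vector<std::string> logical;
    std::istringstream in(text);
    std::string line;
    std::string pending;
    bool continuing = false;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // CRLF input
        const bool cont = !line.empty() && line.back() == '\\';
        if (cont) line.pop_back();  // drop the continuation marker itself
        pending += line;
        if (!cont) {
            logical.push_back(pending);
            pending.clear();
        }
        continuing = cont;
    }
    if (continuing) logical.push_back(pending);  // input ended mid-continuation
    return logical;
}
```

With this ordering, the physical lines "About=Line one \" and "line two"
become the single entry "About=Line one line two" before any RTF control
words are examined.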


J

On 16.04.24 10:20, domcox wrote:
> Only a very small, restricted subset of RTF markup is supported, see:
> https://wiki.crosswire.org/DevTools:conf_Files#RTF
>
> "David \"Judah's Shadow\" Blue" writes:
>
>> I'm working on an info command to display some basic info about modules,
>> and I ran into the fact that, at least in the About entry, the conf file
>> can contain RTF formatting. As it stands I strip out \pard, replace \par
>> with \n, and strip out the tag portion of any anchor/link tags found. My
>> question is, are there any other tags that are likely to appear in conf
>> entries that I should be either handling or stripping (since my front end
>> does no formatting of text whatsoever)?







Re: [sword-devel] RTF in conf files

2024-04-24 Thread David "Judah's Shadow" Blue
On Tuesday, April 23, 2024 10:41:39 AM EDT Troy A. Griffitts wrote:
> Yes, you are correct, there was no RTFPlain filter.  If you svn update,
> you should see it now.  I just copied the RTFHTML filter and changed it
> to output newlines instead of an HTML tag, and a couple tabs for center.  I was
> surprised to see how few RTF tags we support in this filter, but these
> must be the only ones we list on the wiki because I use the RTFHTML
> filter everywhere.  I also remember we support 1 single HTML tag, so you
> will likely need to handle this in your frontend:
> <a href="...">Link Text</a>

Awesome. Since we just use the 3 formatting tags so far, I'll keep my 
processing for now so I'm not requiring users to build against SVN. Once 1.9.1 
(or whatever is next) is released I'll work on switching it over to the new 
filter.

The only RTF tag I'm not handling right now is the \uXXX? tag. I assume the 
new RTFPlain filter will properly substitute the correct Unicode/ASCII as 
needed?
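
For reference, a minimal sketch of how the \uN? escape could be decoded by
hand. In RTF the number after \u is a signed 16-bit decimal value, so values
above 32767 are written as negative numbers, and by default one fallback
character follows the number. The sketch assumes that fallback is always a
literal '?', as in the \uXXX? form mentioned above. decodeRtfUnicode and
appendUtf8 are hypothetical names, not libsword API, and surrogate values
are replaced with U+FFFD rather than combined into pairs.

```cpp
#include <cctype>
#include <cstdint>
#include <cstdlib>
#include <string>

// Append a code point in the range U+0000..U+FFFF to out as UTF-8.
static void appendUtf8(std::string &out, std::uint32_t cp) {
    if (cp >= 0xD800 && cp <= 0xDFFF) cp = 0xFFFD;  // lone surrogate: substitute
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Replace every "\uN?" escape with the UTF-8 form of the character;
// all other text, including other control words, is copied unchanged.
std::string decodeRtfUnicode(const std::string &in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in.compare(i, 2, "\\u") == 0) {
            std::size_t j = i + 2;
            if (j < in.size() && in[j] == '-') ++j;  // optional sign
            const std::size_t firstDigit = j;
            while (j < in.size() && std::isdigit(static_cast<unsigned char>(in[j]))) ++j;
            if (j > firstDigit) {
                const long n = std::strtol(in.substr(i + 2, j - (i + 2)).c_str(), nullptr, 10);
                // Negative values wrap into the upper half of the 16-bit range.
                appendUtf8(out, static_cast<std::uint16_t>(n));
                i = j;
                if (i < in.size() && in[i] == '?') ++i;  // skip the fallback char
                continue;
            }
        }
        out += in[i++];
    }
    return out;
}
```

Control words that merely start with "\u" but carry no number are left
untouched, so the function can run before or after other RTF handling.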

> Sorry about the misinformation.

Not at all. I was certain I was missing something myself.





Re: [sword-devel] RTF in conf files

2024-04-23 Thread Troy A. Griffitts
Yes, you are correct, there was no RTFPlain filter.  If you svn update, 
you should see it now.  I just copied the RTFHTML filter and changed it 
to output newlines instead of an HTML tag, and a couple tabs for center.  I was 
surprised to see how few RTF tags we support in this filter, but these 
must be the only ones we list on the wiki because I use the RTFHTML 
filter everywhere.  I also remember we support 1 single HTML tag, so you 
will likely need to handle this in your frontend:
<a href="...">Link Text</a>


Sorry about the misinformation.

Troy


On 4/23/24 11:06, David "Judah's Shadow" Blue wrote:
> On Tuesday, April 16, 2024 5:35:50 AM EDT Troy A. Griffitts wrote:
>> There is an SWFilter to help with this.
>>
>> E.g., to get HTML, try something like:
>>
>> #include <rtfhtml.h>
>> SWBuf confValue = module.getConfigValue("About");
>> RTFHTML().processText(confValue);
>>
>> If you don't want HTML, I believe there are also other RTF filter
>> conversions like RTFPlain which should give you things like your newlines
>> instead.
>
> Having looked into this some in the class documentation, I'm not finding an
> RTFPlain filter that I can see in the docs. There is a SWBasicFilter but I
> can't figure out how to call its processText() method.
>
> I had thought to possibly use RTFHTML and then a filter to take HTML to plain
> text, but I'm not seeing any implementation of SWFilter that will do what I'm
> after.




Re: [sword-devel] RTF in conf files

2024-04-23 Thread David "Judah's Shadow" Blue
On Tuesday, April 16, 2024 5:35:50 AM EDT Troy A. Griffitts wrote:
> There is an SWFilter to help with this.
> 
> E.g., to get HTML, try something like:
> 
> #include <rtfhtml.h>
> SWBuf confValue = module.getConfigValue("About");
> RTFHTML().processText(confValue);
> 
> If you don't want HTML, I believe there are also other RTF filter
> conversions like RTFPlain which should give you things like your newlines
> instead.

Having looked into this some in the class documentation, I'm not finding an 
RTFPlain filter that I can see in the docs. There is a SWBasicFilter but I 
can't figure out how to call its processText() method.

I had thought to possibly use RTFHTML and then a filter to take HTML to plain 
text, but I'm not seeing any implementation of SWFilter that will do what I'm 
after.




Re: [sword-devel] RTF in conf files

2024-04-16 Thread Troy A. Griffitts
There is an SWFilter to help with this.

E.g., to get HTML, try something like:

#include <rtfhtml.h>
SWBuf confValue = module.getConfigValue("About");
RTFHTML().processText(confValue);  // converts confValue in place

If you don't want HTML, I believe there are also other RTF filter conversions 
like RTFPlain which should give you things like your newlines instead.

On April 16, 2024 04:20:51 GMT-03:00, domcox  wrote:
>
>Only a very small, restricted subset of RTF markup is supported, see:
>https://wiki.crosswire.org/DevTools:conf_Files#RTF
>
>
>"David \"Judah's Shadow\" Blue"  writes:
>
>> I'm working on an info command to display some basic info about modules,
>> and I ran into the fact that, at least in the About entry, the conf file
>> can contain RTF formatting. As it stands I strip out \pard, replace \par
>> with \n, and strip out the tag portion of any anchor/link tags found. My
>> question is, are there any other tags that are likely to appear in conf
>> entries that I should be either handling or stripping (since my front end
>> does no formatting of text whatsoever)?
>
>
>-- 
>Dom



Re: [sword-devel] RTF in conf files

2024-04-16 Thread domcox



Only a very small, restricted subset of RTF markup is supported, see:
https://wiki.crosswire.org/DevTools:conf_Files#RTF

"David \"Judah's Shadow\" Blue" writes:

> I'm working on an info command to display some basic info about modules,
> and I ran into the fact that, at least in the About entry, the conf file
> can contain RTF formatting. As it stands I strip out \pard, replace \par
> with \n, and strip out the tag portion of any anchor/link tags found. My
> question is, are there any other tags that are likely to appear in conf
> entries that I should be either handling or stripping (since my front end
> does no formatting of text whatsoever)?



--
Dom


[sword-devel] RTF in conf files

2024-04-15 Thread David "Judah's Shadow" Blue
I'm working on an info command to display some basic info about modules, and I
ran into the fact that, at least in the About entry, the conf file can contain
RTF formatting. As it stands I strip out \pard, replace \par with \n, and
strip out the tag portion of any anchor/link tags found. My question is, are
there any other tags that are likely to appear in conf entries that I should
be either handling or stripping (since my front end does no formatting of text
whatsoever)?
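
The approach described above could look roughly like the following sketch.
confRtfToPlain is a hypothetical helper, not part of libsword, and it
assumes that links appear as plain HTML anchor tags (<a ...> and </a>)
inside the conf value.

```cpp
#include <string>

// Hypothetical helper (not part of libsword): strips \pard, turns \par into
// a newline, and removes anchor tags while keeping their link text.
// Simplifications: tag matching is case-sensitive, a space following a
// control word is left in place, and any word starting with "\par" is
// treated as \par.
std::string confRtfToPlain(const std::string &in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in.compare(i, 5, "\\pard") == 0) {
            i += 5;  // test "\pard" before "\par", which is its prefix
        } else if (in.compare(i, 4, "\\par") == 0) {
            out += '\n';
            i += 4;
        } else if (in.compare(i, 3, "<a ") == 0 || in.compare(i, 3, "<a>") == 0 ||
                   in.compare(i, 4, "</a>") == 0) {
            const std::size_t close = in.find('>', i);
            if (close == std::string::npos) {
                out += in[i++];  // unterminated tag: keep the text as-is
            } else {
                i = close + 1;   // drop the tag, keep the link text around it
            }
        } else {
            out += in[i++];
        }
    }
    return out;
}
```

Anything not recognised passes through unchanged, so unexpected control
words stay visible in the output instead of being silently dropped.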

