Well, for converting HTML to RTF, I believe Johan meant that you should 
use an HTML parser AND an RTF generator to:

read the HTML file, watching for events;
when an event happens, check the event data (such as which tag fired the 
event) and then pass that info, along with the tag data, off to the RTF 
generator object.
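
To make those two steps concrete, here is a minimal sketch using HTML::Parser (assumed to be installed from CPAN). The html2rtf name and the tag-to-control-word tables are made up for illustration, and the "RTF generator" is just a string being appended to; a real converter would hand the events to a proper RTF generator object instead.

```perl
use strict;
use warnings;
use HTML::Parser;    # CPAN module, assumed to be installed

# Hypothetical mapping from (lowercased) HTML tags to RTF control words.
my %rtf_on  = (b => '\b ',  i => '\i ',  u => '\ul ');
my %rtf_off = (b => '\b0 ', i => '\i0 ', u => '\ulnone ');

# html2rtf is an illustrative name, not part of any module.
sub html2rtf {
    my ($html) = @_;
    my $rtf = '{\rtf1\ansi ';

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Fired for every opening tag; "tagname" arrives lowercased.
        start_h => [ sub {
            my ($tag) = @_;
            if    ($tag eq 'br')          { $rtf .= '\line ' }
            elsif (exists $rtf_on{$tag})  { $rtf .= $rtf_on{$tag} }
        }, 'tagname' ],
        # Fired for every closing tag.
        end_h => [ sub {
            my ($tag) = @_;
            $rtf .= $rtf_off{$tag} if exists $rtf_off{$tag};
        }, 'tagname' ],
        # Fired for text; "dtext" has entities such as &amp; already decoded.
        text_h => [ sub {
            my ($text) = @_;
            $text =~ s/([\\{}])/\\$1/g;    # escape RTF special characters
            $rtf .= $text;
        }, 'dtext' ],
    );

    $parser->parse($html);
    $parser->eof;
    return $rtf . '}';
}

print html2rtf('Plain <B>bold</B> &amp; <i>italic</i><br>next'), "\n";
```

Tags that are not in the tables are simply ignored, so supporting another tag means adding one entry rather than touching the parsing code.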

This is very similar to how the XML::SAX* modules work.  I have only worked 
with the XML::SAX* modules a few times, but basically, you write your own 
handler package and use XML::SAX* to capture events in the XML source file.  
It passes these to your package subs, where you can do conditional processing 
based on which event was sent.  You then pass the data wherever you need 
(usually an XML writer).  Basically, this is a way to transform one XML 
document into something else: XML, HTML, CSV, or whatever format you can 
write up.
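
As a rough illustration of that handler pattern, the sketch below uses only core Perl. The My::CsvHandler package, the replay helper and the row/cell element names are hypothetical; the method names and the Name/Data hash keys follow the SAX convention as I understand it. With the real modules you would hand the handler object to a parser (for example one obtained from XML::SAX::ParserFactory) rather than replaying a fixed list of events.

```perl
use strict;
use warnings;

# Hypothetical handler package: turns <row><cell>..</cell></row> events into CSV.
package My::CsvHandler;

sub new {
    my ($class) = @_;
    return bless { rows => [], row => [], buf => '' }, $class;
}

# Called when an element opens; $el->{Name} is the element name.
sub start_element {
    my ($self, $el) = @_;
    $self->{row} = [] if $el->{Name} eq 'row';
    $self->{buf} = '' if $el->{Name} eq 'cell';
}

# Called for text content; may fire several times per element.
sub characters {
    my ($self, $chars) = @_;
    $self->{buf} .= $chars->{Data};
}

# Called when an element closes; this is where the output is produced.
sub end_element {
    my ($self, $el) = @_;
    push @{ $self->{row} }, $self->{buf} if $el->{Name} eq 'cell';
    push @{ $self->{rows} }, join(',', @{ $self->{row} })
        if $el->{Name} eq 'row';
}

sub csv {
    my ($self) = @_;
    return join("\n", @{ $self->{rows} });
}

package main;

# Stand-in for the parser: feeds a list of [method => data] events to a handler.
sub replay {
    my ($handler, @events) = @_;
    for my $ev (@events) {
        my ($method, $arg) = @$ev;
        $handler->$method($arg);
    }
    return $handler;
}

my $h = replay(
    My::CsvHandler->new,
    [ start_element => { Name => 'row' } ],
    [ start_element => { Name => 'cell' } ],
    [ characters    => { Data => 'RTF' } ],
    [ end_element   => { Name => 'cell' } ],
    [ start_element => { Name => 'cell' } ],
    [ characters    => { Data => 'HTML' } ],
    [ end_element   => { Name => 'cell' } ],
    [ end_element   => { Name => 'row' } ],
);
print $h->csv, "\n";
```

The point is that all of the conditional processing lives in the handler subs, so changing the output format means swapping the handler, not the parser.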

While going with this approach would take a little longer:
1) its main advantage is that it is easier to package into a real module to 
share (hint)
2) it's extensible
3) with the events already defined in HTML and the output already defined in 
RTF, it will be far less work to change the parsing rules than with the 
roll-your-own approach taken in the sub below. (you don't have to worry about.

I agree with your point about RTF::Parser's lack of documentation, but it is 
still a decent prebuilt package.  Generally, we end up missing something when 
we try to do manually what a module has already been built to do.

I tried RTF::Parser's rtf2html.bat and found it did a very good job.  Now, 
granted, I did not pass anything odd into it, but it created very nice HTML 
output.

Hope this helps.

Joe Frazier, Jr.
Technical Support Engineer
Peopleclick Service Support

Tel:  +1-800-841-2365
E-Mail: mailto:[EMAIL PROTECTED]



> -----Original Message-----
> From: Ultimate Red Dragon [mailto:[EMAIL PROTECTED]
> Sent: Thursday, March 07, 2002 6:22 PM
> To: perl-win32-gui-users@lists.sourceforge.net
> Subject: [perl-win32-gui-users] Re: Re: RTF 2 HTML
> 
> 
> Well, in reply to Johan.  I'll admit that I kinda knew those 
> were there, but 
> the documentation on them is either horrible or non-existent 
> (depending on 
> which RTF modules you look at.)  As for the HTML2RTF, I know 
> of no already 
> existing interpreter, but I plan on using HTML::Parser to 
> make it simpler.
> 
> Anyway, I managed to get it to properly translate '<', '>' 
> and '&' into 
> their HTML counterparts.  Please point out any bugs or 
> suggestions you have.
> 
> sub rtf2html{
>    my $re = $main->reDesc;  #Just set this to the RichEdit object
>    my $oldtext = $re->Text();
>    my @escapes;
>    {
>       my $temp = -1;
>       while(($temp = index($oldtext,'<',$temp+1)) != -1){
>          push(@escapes,[$temp,'&lt;']);
>       }
>       $temp = -1;
>       while(($temp = index($oldtext,'>',$temp+1)) != -1){
>          push(@escapes,[$temp,'&gt;']);
>       }
>       $temp = -1;
>       while(($temp = index($oldtext,'&',$temp+1)) != -1){
>          push(@escapes,[$temp,'&amp;']);
>       }
>    }
> 
>    @escapes = sort({ $a->[0] <=> $b->[0] } @escapes);
>    foreach (@escapes){
>       print $_->[0]." = ".$_->[1]."\n";
>    }
> 
>    my $i = 0;
>    my $b = 0;
>    my $u = 0;
>    my $text = '';
> 
>    my $offset = 0;
>    foreach my $x (0..length($oldtext)-1){
>       $re->Select($x,$x+1);
>       my %att = $re->GetCharFormat();
>       if(($i && !exists($att{-italic})) || (!$i && 
> exists($att{-italic}))){
>          $i = $att{-italic};
>          $text .= ($i ? '<I>' : '</I>');
>       }
>       if(($b && !exists($att{-bold})) || (!$b && 
> exists($att{-bold}))){
>          $b = $att{-bold};
>          $text .= ($b ? '<B>' : '</B>');
>       }
>       if(($u && !exists($att{-underline})) || (!$u && 
> exists($att{-underline}))){
>          $u = $att{-underline};
>          $text .= ($u ? '<U>' : '</U>');
>       }
>       if(defined($escapes[0]->[0]) && $x == $escapes[0]->[0]){
>          my $temp = shift(@escapes);
>          $text .= $temp->[1];
>       }else{
>          $text .= substr($oldtext,$x,1);
>       }
>    }
>    $text =~ s/\r//g;
>    $text =~ s/\n/<BR>/gi;
>    return $text;
> }
> 
> 
> 
> Date: Thu, 07 Mar 2002 09:47:52 +0100
> To: perl-win32-gui-users@lists.sourceforge.net
> From: Johan Lindstrom <[EMAIL PROTECTED]>
> Subject: Re: [perl-win32-gui-users] RTF 2 HTML
> 
> At 23:37 2002-03-06 -0500, Ultimate Red Dragon wrote:
>  >It's not that great, I don't claim it's efficient, just 
> that it works.
>  >
>  >Currently, it supports new lines, bold, italics and underline.
> 
> This seems to be similar to what you want:
> http://search.cpan.org/search?dist=RTF-Parser
> 
> 
>  >I'm working on converting < and > correctly, as well as a 
> HTML 2 RTF sub
>  >(or is there already one?)
> 
> There are HTML parsers and RTF generators on CPAN.
> 
> Here is the search for module names with RTF:
> http://search.cpan.org/search?mode=module&query=rtf
> (but note that you often can get a lot more results by searching the
> documentation rather than the module name)
> 
> 
> /J
> 
> -------- ------ ---- --- -- --  --  -    -     -      -         -
> Johan Lindström    Sourcerer @ Boss Casinos     [EMAIL PROTECTED]
> 
> Latest bookmark: "(GUI) Windows Programming FAQ"
> http://www.perlmonks.org/index.pl?node_id=108708
> 
