Re: [OT] HTML to XHTML conversion

2002-08-28 Thread Jean-Michel Hiver

On Fri 23-Aug-2002 at 11:07:35AM -0500, D. Hageman wrote:
 
 My suggestion would be to just use an XML parser module like XML::LibXML.
 Load the file up using the HTML loading functions and print it using the
 XML printing functions ... since the only difference I can see between
 HTML and XHTML is that optional ending tags are no longer optional (per
 XML spec) and single tags must be ended properly (per XML spec).

There's a lot more than that.

<body><body></body></body> is not valid XHTML, for example.
<input type="text" name="foo"></input> is not valid XHTML either.
You have to be careful about block-level and inline elements.
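
The point about nesting can be illustrated with a small check. This is only a sketch in Python (standard library), not one of the Perl modules named in this thread. The function name nesting_problems and the two element sets are assumptions made up for the example; the sets are deliberately tiny subsets, and the real content rules live in the XHTML DTDs. It shows that markup can be well-formed XML and still break XHTML's nesting rules.

```python
import xml.etree.ElementTree as ET

# Deliberately small, illustrative subsets (assumption: not the full DTD lists).
INLINE = {"a", "em", "strong", "span", "b", "i"}
BLOCK = {"div", "p", "ul", "ol", "table", "h1", "h2", "h3", "blockquote"}


def nesting_problems(xml_text):
    """Return a list of nesting problems found in well-formed markup."""
    problems = []

    def walk(elem, ancestors):
        # A body element may not appear inside another body element.
        if elem.tag == "body" and "body" in ancestors:
            problems.append("body nested inside body")
        # Block-level elements may not appear inside inline elements.
        if elem.tag in BLOCK and any(a in INLINE for a in ancestors):
            problems.append(
                "block-level <%s> inside an inline element" % elem.tag
            )
        for child in elem:
            walk(child, ancestors + [elem.tag])

    # ET.fromstring only checks well-formedness, not validity.
    walk(ET.fromstring(xml_text), [])
    return problems


print(nesting_problems("<body><body></body></body>"))
print(nesting_problems("<body><span><div>x</div></span></body>"))
print(nesting_problems("<body><div><span>x</span></div></body>"))
```

Both of the first two inputs parse without error as XML, which is why an XML printer alone does not guarantee valid XHTML.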

etc. etc...

Besides, you cannot use an XML parser to parse HTML. You have to use
something like HTML::TreeBuilder instead. Part of HTML::Tree, excellent
module IMHO.

Cheers,
-- 
IT'S TIME FOR A DIFFERENT KIND OF WEB

  Jean-Michel Hiver - Software Director
  [EMAIL PROTECTED]
  +44 (0)114 255 8097

  VISIT HTTP://WWW.MKDOC.COM



Re: [OT] HTML to XHTML conversion

2002-08-28 Thread Ilya Martynov

 On Wed, 28 Aug 2002 10:07:07 +0100, Jean-Michel Hiver [EMAIL PROTECTED] said:

JM <body><body></body></body> is not valid XHTML, for example.
JM <input type="text" name="foo"></input> is not valid XHTML either.
JM You have to be careful about block-level and inline elements.

Actually <input type="text" name="foo"></input> is valid XHTML.

Correct me if I'm wrong but AFAIK <xxx></xxx> is exactly equivalent to
<xxx/>.

<input type="text" name="foo">something</input> is not valid.
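
A quick way to see the equivalence at the XML level is to parse both spellings and compare what comes out. This is a minimal sketch in Python (standard library only, not XML::LibXML); the helper name element_summary is made up for the example. It checks well-formedness only, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def element_summary(xml_text):
    """Return (tag, attributes, text, number of children) of the root."""
    root = ET.fromstring(xml_text)
    return (root.tag, dict(root.attrib), root.text, len(root))


# Start tag immediately followed by end tag, and the empty-element tag.
paired = element_summary('<input type="text" name="foo"></input>')
empty = element_summary('<input type="text" name="foo"/>')

# Same element, but with character data inside it.
with_text = element_summary('<input type="text" name="foo">something</input>')

print(paired == empty)   # both spellings give the same parsed element
print(with_text[2])      # the third form carries content
```

The third form still parses, because it is well-formed XML; it fails only at the validity level, since the XHTML DTD declares input as an empty element.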

JM etc. etc...

JM Besides, you cannot use an XML parser to parse HTML. You have to use
JM something like HTML::TreeBuilder instead. Part of HTML::Tree, excellent
JM module IMHO.

XML::LibXML supports HTML too.

-- 
Ilya Martynov (http://martynov.org/)



RE: [OT] HTML to XHTML conversion

2002-08-26 Thread Narins, Josh

Reviewing the "What is different between HTML and XHTML?" question, we have:

All tags must close, <X></X>,
or be single tags like <X />

Tag names are case sensitive (XHTML element and attribute names are lower case)

All attributes must be name="value" (quotes required, no more minimized
attributes such as multiple, checked, selected)

And all tags must nest properly

XHTML also has rules about which elements can appear where (the XHTML DTD)

NOTE: There are two XHTML DTDs of interest, Strict and Transitional.
Transitional is much more forgiving. I always View Source of www.w3.org to
get the strange DOCTYPE syntax for Transitional, and the path to the DTD.









-----Original Message-----
From: D. Hageman [mailto:[EMAIL PROTECTED]]
Sent: Friday, August 23, 2002 12:08 PM
To: Jonathan M. Hollin
Cc: [EMAIL PROTECTED]
Subject: Re: [OT] HTML to XHTML conversion



My suggestion would be to just use an XML parser module like XML::LibXML.
Load the file up using the HTML loading functions and print it using the
XML printing functions ... since the only difference I can see between
HTML and XHTML is that optional ending tags are no longer optional (per
XML spec) and single tags must be ended properly (per XML spec).



On Fri, 23 Aug 2002, Jonathan M. Hollin wrote:

 [OFF TOPIC]
 
 I am trying to find a module that can convert HTML to XHTML, but have 
 drawn a blank on CPAN and GOOGLE.  Is there anything out there to do 
 this other than HTML TIDY?
 
 I am developing a mod_perl CMS application at the moment.  All its 
 output is compliant with XHTML Transitional.  But its users can create 
 content that isn't (and are likely to) and I'd like to parse this and 
 convert it to XHTML before it goes into the RDBMS if possible.
 
 If nothing exists along these lines - would anyone like to collaborate 
 on the development of a module for this purpose?  HTML::XHTML anyone?
 
 
 

-- 
D. Hageman [EMAIL PROTECTED]







Re: [CGI] [OT] HTML to XHTML conversion

2002-08-25 Thread Roy Schroeder

Complete automatic conversion is not possible, since someone could enter
HTML code that omits attributes that are required, or contains attributes
that are not allowed, in XHTML Transitional. The conversion program would
1. not know what value to add for required but omitted attributes, or
2. seriously change the rendering of the page by removing the attributes
that are not allowed.

Of course the simple mechanical rules -
1. all tag names in lower case
2. all tags closed
3. all attribute values quoted
4. proper tag nesting
etc.
could be automated and may be sufficient for your purposes.
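
The mechanical rules above could be automated roughly as follows. This is a minimal sketch in Python using only the standard library's html.parser, not one of the Perl modules discussed in this thread; the names XhtmlSketch and to_xhtml_fragment are made up for illustration. It lower-cases names, quotes attribute values, expands minimized attributes, self-closes empty elements, and closes tags that were left open. It does not infer optional end tags (a second <p> simply nests inside the first), does not treat script or style content specially, drops comments and DOCTYPEs, and knows nothing about required or disallowed attributes.

```python
from html import escape
from html.parser import HTMLParser

# Elements declared EMPTY in the XHTML 1.0 Strict/Transitional DTDs.
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}


class XhtmlSketch(HTMLParser):
    """Re-emit HTML applying only the simple mechanical rules."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []     # output pieces
        self.stack = []   # currently open non-empty elements

    def handle_starttag(self, tag, attrs):
        # html.parser already lower-cases tag and attribute names (rule 1).
        parts = [tag]
        for name, value in attrs:
            if value is None:          # minimized attribute, e.g. CHECKED
                value = name
            # Rule 3: every attribute value is quoted and escaped.
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        if tag in VOID:
            self.out.append("<%s />" % " ".join(parts))   # rule 2
        else:
            self.out.append("<%s>" % " ".join(parts))
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Ignore end tags for empty elements and stray end tags.
        if tag in VOID or tag not in self.stack:
            return
        # Rule 4: close anything still open inside the element first.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append("</%s>" % open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def result(self):
        self.close()                   # flush any buffered text
        while self.stack:              # rule 2: close whatever is left open
            self.out.append("</%s>" % self.stack.pop())
        return "".join(self.out)


def to_xhtml_fragment(html_text):
    parser = XhtmlSketch()
    parser.feed(html_text)
    return parser.result()


print(to_xhtml_fragment("<P>Hello<BR><INPUT TYPE=checkbox CHECKED><B>bold</P>"))
```

The output is well-formed, but whether it is valid XHTML Transitional still depends on the attribute and content rules described above.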

Regards
Roy


----- Original Message -----
From: Jonathan M. Hollin [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: CGI List [EMAIL PROTECTED]
Sent: Friday, August 23, 2002 8:54 AM
Subject: [CGI] [OT] HTML to XHTML conversion


 [NOTICE: see the message footer for important information]
 [OFF TOPIC]

 I am trying to find a module that can convert HTML to XHTML, but have
 drawn a blank on CPAN and GOOGLE.  Is there anything out there to do
 this other than HTML TIDY?

 I am developing a mod_perl CMS application at the moment.  All its
 output is compliant with XHTML Transitional.  But its users can create
 content that isn't (and are likely to) and I'd like to parse this and
 convert it to XHTML before it goes into the RDBMS if possible.

 If nothing exists along these lines - would anyone like to collaborate
 on the development of a module for this purpose?  HTML::XHTML anyone?


 --
 Jonathan M. Hollin

 Co-ordinator:  WYPUG (http://wypug.pm.org/)






Re: [OT] HTML to XHTML conversion

2002-08-25 Thread Adrian Howard


On Friday, August 23, 2002, at 04:54  pm, Jonathan M. Hollin wrote:

 [OFF TOPIC]

 I am trying to find a module that can convert HTML to XHTML, but have 
 drawn a blank on CPAN and GOOGLE.  Is there anything out there to do 
 this other than HTML TIDY?
[snip]
 If nothing exists along these lines - would anyone like to collaborate 
 on the development of a module for this purpose?  HTML::XHTML anyone?

Out of curiosity... why not tidy? It seems to do a pretty darn good job 
of it - I use it all of the time.

Adrian