http://www.w3.org/TR/1998/REC-xml-19980210#sec-cdata-sect

But I think it applies only to XML, not HTML. And if you can't influence the HTML because you fetch the pages from anywhere on the web, it is of no further interest.

Joerg
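
For readers unfamiliar with CDATA sections, here is a minimal sketch of what the linked part of the XML recommendation describes. It uses only the JDK's built-in XML parser, not JTidy or Cocoon. The class name CdataDemo, its helper methods, and the sample input are assumptions made up for illustration. It shows that markup inside a CDATA section reaches the application as plain text, which only helps when the input is already well-formed XML, not arbitrary HTML fetched from the web.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class CdataDemo {

    // Hypothetical well-formed input: the script body is wrapped in a CDATA section.
    static final String XML =
        "<html><body><table>"
      + "<script><![CDATA[document.write('<tr><td>testing</td></tr>');]]></script>"
      + "</table></body></html>";

    // Parses the given XML string with the JDK's default DOM parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("input is not well-formed XML", e);
        }
    }

    // Returns the text content of the first <script> element.
    static String scriptText(String xml) {
        return parse(xml).getElementsByTagName("script").item(0).getTextContent();
    }

    // Counts the <tr> elements that the parser actually created.
    static int rowCount(String xml) {
        return parse(xml).getElementsByTagName("tr").getLength();
    }

    public static void main(String[] args) {
        System.out.println(scriptText(XML)); // the script body, tags included, as text
        System.out.println(rowCount(XML));   // 0: the <tr> inside CDATA is not an element
    }
}
```

Whether JTidy leaves such a section alone when it reads HTML is a separate question that this sketch does not answer.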

Anna Afonchenko wrote:
Hi Joerg.
Thanks for answering.

I think the HTML DTD does not allow script directly inside a table.
But somebody actually uses a script inside a table, and it probably works.
BTW, if I embed the script tag inside the tr/td elements, then JTidy doesn't
touch it.
But this is not my code, and when I use JTidy inside HTMLGenerator,
I will load pages from the web that are not mine, just any page, so I can't
really control the content of these pages, but I still want to run my
stylesheet on them. And if JTidy messes things up like this, it becomes really
complicated.
What do you mean by CDATA? Are you saying that if I embed the script tag in
CDATA, it will not be messed up? Can you clarify this for me, please?

Anyway, I will post the bug report on JTidy SourceForge.
Thank you very much for your help.

Anna

----- Original Message -----
From: "Joerg Heinicke" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, December 26, 2002 2:54 PM
Subject: Re: Bug in JTidy?


Hello Anna,

JTidy is sometimes too intelligent ;-) It tries to fix too much. Have a
look at the HTML DTD and see whether <script> is allowed in <table>.
If yes, post a bug at JTidy SourceForge; otherwise the behaviour of
JTidy is ok. We have encountered many similar problems with JTidy.

In your case JTidy gets especially confused by <tr> and <td> in the
script. Maybe you will have to fix these pages by hand. Does CDATA exist
in HTML? If yes, maybe this helps.

Regards,

Joerg

Anna Afonchenko wrote:

Hi all. I use an HTMLGenerator to tidy up the pages that I load, and I
encountered a very strange behaviour concerning scripts. This is my
input file:
test.html

<html>
<head>
 <title>Testing JTidy page</title>
</head>
<body>
   <p>This is test</p>
   <table>
       <tr>
           <td>Hello world</td>
       </tr>
       <script language="JavaScript">
           document.write('<tr>');
           document.write('<td>');
           document.write('testing the JavaScript');
           document.write('</td>');
           document.write('</tr>');
       </script>
       <tr>
           <td>After script</td>
       </tr>
   </table>
 </body>
</html>

As you can see, the script tag is not inside a tr/td tag, but it
writes them, so the resulting table contains three rows (one of them output
by the script).
This is the actual code that I took from somebody's page.

When I put this page into the pipeline, using HTMLGenerator (to tidy
it), this is the VERY weird result that I get:
pipeline:
<map:match pattern="test">
   <map:generate src="test.html" type="html"/>
   <map:serialize type="xml"/>
</map:match>

the result shown in the Cocoon browser window:
<?xml version="1.0" encoding="utf-8" ?>
<html>
   <head>
      <title>Testing JTidy page</title>
   </head>
   <body>
      <p>This is test</p>
      <script language="JavaScript" type="text/javascript" />
      document.write(''); document.write(''); document.write('');
      <table>
         <tr>
            <td>Hello world</td>
         </tr>
         <tr>
            <td>'); document.write('testing the JavaScript'); document.write('</td>
         </tr>
      </table>
      <table>
         <tr>
            <td>After script</td>
         </tr>
      </table>
   </body>
</html>

JTidy took the script out and messed up the table!

Has anybody encountered such behaviour when using HTMLGenerator?
I know that this is not really related to Cocoon, but Cocoon uses
JTidy, so I thought that somebody may have dealt with this already.
Also, I looked at the JTidy page on SourceForge, but I didn't find
anything related to this.

Please, if somebody understands what is going on with this JTidy feature,
please help me.

Sorry for a not-so-related question.

Thank you very much for help.

Anna



---------------------------------------------------------------------
Please check that your question  has not already been answered in the
FAQ before posting.     <http://xml.apache.org/cocoon/faq/index.html>

To unsubscribe, e-mail:     <[EMAIL PROTECTED]>
For additional commands, e-mail:   <[EMAIL PROTECTED]>

