RE: TidySerializer

2003-06-05 Thread Arje Cahn
JH>> In conclusion:
JH>> 1. We need a patch for the HTMLSerializer for the namespace issue.
JH>> 2. A validation transformer seems to be really welcome.
JH>> 3. For human readability we do not need really a new 
JH>> serializer. What about the indent parameter on the serializer 
JH>> (like indent in )? -> patch

Agree, agree and agree

BD> I'm convinced there is a place for a tidyserializer. Its 
BD> purpose is not
BD> to replace the htmlserializer, but it provides some 
BD> features that can be
BD> useful in some cases (mainly validation and beautifying).
BD> FWIW, I'm +1 on adding the tidyserializer.

Agree - I can imagine it *might* be useful, although all my troubles have been fixed 
with the above 3 points.


Re: TidySerializer

2003-06-05 Thread Joerg Heinicke


Bruno Dumon wrote:
> On Wed, 2003-06-04 at 21:09, Joerg Heinicke wrote:
> > In conclusion:
> > 1. We need a patch for the HTMLSerializer for the namespace issue.
> > 2. A validation transformer seems to be really welcome.
> > 3. For human readability we do not need really a new serializer. What 
> > about the indent parameter on the serializer (like indent in 
> > )? At the moment you can set it to true, but this only adds 
> > line breaks AFAIK and does not indent the code. Maybe we simply need a 
> > patch too?
>
> Back to the start...
>
> I'm convinced there is a place for a tidyserializer. Its purpose is not
> to replace the htmlserializer, but it provides some features that can be
> useful in some cases (mainly validation and beautifying).
> FWIW, I'm +1 on adding the tidyserializer.

-0.1 for the known reasons



Re: TidySerializer

2003-06-05 Thread Bruno Dumon
On Wed, 2003-06-04 at 21:09, Joerg Heinicke wrote:
> In conclusion:
> 1. We need a patch for the HTMLSerializer for the namespace issue.
> 2. A validation transformer seems to be really welcome.
> 3. For human readability we do not need really a new serializer. What 
> about the indent parameter on the serializer (like indent in 
> )? At the moment you can set it to true, but this only adds 
> line breaks AFAIK and does not indent the code. Maybe we simply need a 
> patch too?

Back to the start...

I'm convinced there is a place for a tidyserializer. Its purpose is not
to replace the htmlserializer, but it provides some features that can be
useful in some cases (mainly validation and beautifying).
FWIW, I'm +1 on adding the tidyserializer.

-- 
Bruno Dumon http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center



Re: TidySerializer

2003-06-05 Thread Joerg Heinicke
In conclusion:
1. We need a patch for the HTMLSerializer for the namespace issue.
2. A validation transformer seems to be really welcome.
3. For human readability we do not need really a new serializer. What 
about the indent parameter on the serializer (like indent in 
)? At the moment you can set it to true, but this only adds 
line breaks AFAIK and does not indent the code. Maybe we simply need a 
patch too?
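
To illustrate point 3, here is a minimal sketch of what an indent setting on a serializer can look like at the JAXP level. It is not the Cocoon serializer code. The class name IndentSketch and the sample document are made up for illustration, and the "{http://xml.apache.org/xslt}indent-amount" key is a Xalan-specific output property that other processors may silently ignore, in which case only line breaks are added.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of Cocoon.
class IndentSketch {

    // Runs an identity transformation and asks the serializer to indent.
    static String serialize(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, "xml");
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Standard property: permits the serializer to add whitespace.
        t.setOutputProperty(OutputKeys.INDENT, "yes");
        // Xalan-specific property (assumption: the processor honours it).
        t.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize("<page><section><p>hi</p></section></page>"));
    }
}
```

The exact whitespace produced depends on the XSLT processor and its version, so the output should be treated as a debugging aid only.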

Joerg





Re: TidySerializer

2003-06-05 Thread Torsten Knodt
On Tuesday 03 June 2003 23:46, Joerg Heinicke wrote:
JH> For debug output I normalize-space every text node and simply indent
JH> them by counting the ancestor nodes. This has no influence for HTML,
JH> because HTML normalizes text nodes too (exception: ).

Right, would be enough for html. And for others? tidy can also work on normal 
xml and do this. There you need strip-space.
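
The debug approach discussed above (normalize the whitespace of every text node, then indent by the number of ancestors) can be sketched roughly as follows. This is a plain DOM walk in Java rather than the XSLT Joerg describes. The class DebugIndent and its methods are hypothetical names, attributes are left out to keep it short, and the regex is only an approximation of XPath normalize-space().

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical debug helper: one node per line, indented by ancestor count.
class DebugIndent {

    static String dump(String xml) throws Exception {
        Node root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        StringBuilder out = new StringBuilder();
        walk(root, 0, out);
        return out.toString();
    }

    // depth equals the number of element ancestors of the node.
    static void walk(Node node, int depth, StringBuilder out) {
        StringBuilder pad = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            pad.append("  ");
        }
        if (node.getNodeType() == Node.TEXT_NODE) {
            // Collapse whitespace runs and trim, similar to normalize-space().
            String text = node.getNodeValue().trim().replaceAll("\\s+", " ");
            if (!text.isEmpty()) {
                out.append(pad).append(text).append('\n');
            }
        } else if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.append(pad).append('<').append(node.getNodeName()).append(">\n");
            NodeList kids = node.getChildNodes();
            for (int i = 0; i < kids.getLength(); i++) {
                walk(kids.item(i), depth + 1, out);
            }
            out.append(pad).append("</").append(node.getNodeName()).append(">\n");
        }
    }
}
```

As noted in the thread, this changes whitespace, so it is only safe where whitespace is not significant.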


JH> But it's also resource consuming if it's integrated in Xalan :)

XSLT 2.0 will consume some resources, yes, but computers get faster. And the 
features of 2.0 help in many cases (like grouping of data).
BTW: Why is nearly everyone afraid of spending resources on security or 
quality checks, while spending them freely on Flash, videos and the like?


JH> TK> Redirecting to an internal error page, like with other errors too.
JH> What's the advantage for the user? Who cares about invalid HTML pages?
JH> It's only important for development, must not be running on live systems.

First, an internal error shouldn't occur too often. When it does, I think 
it is better to have an error page than invalid output. When you get this 
invalid output after the development phase, you can't be sure where the 
problem is. Perhaps your transformation process has an error, which also 
gives wrong data on which the user makes decisions. Also, bots or agents 
parsing the data should be interested in valid output.
Filtering proxies can also make good use of valid content. They do not have to 
do regular expression matching or things like this. They can read their HTML 
and xmlize it or take the XML, validate it (and it will be valid) and then do 
their filtering on a nice DOM tree or SAX stream.


JH> > cocoon. When this is a parameter or a view, even better.
JH> Ok, reason accepted :) But what about an extra validating transformer as
JH> last pipeline step? Seems to make more sense IMO.

This would be the perfect solution.
I've already looked how this could be done.
But at least Xerces can only do validation on an input stream or DOM tree. And 
even on the DOM tree only with XML Schema, if I have read it right. At least 
for the servlet environment, I would prefer a SAX solution.
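
For reference, a rough sketch of validation on a SAX stream. It assumes a JAXP 1.3 or later runtime, whose javax.xml.validation API postdates this thread, and it is not an existing Cocoon component. The class name, the inline schema and the page/title vocabulary are invented for illustration. In a pipeline, the ValidatorHandler would receive the SAX events of the previous step instead of those of a parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.ValidatorHandler;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical sketch: validate a SAX event stream against an XML Schema.
class SaxValidationSketch {

    // Made-up schema: a <page> element containing exactly one <title>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='page'><xs:complexType><xs:sequence>"
      + "<xs:element name='title' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Returns the validation error messages; an empty list means valid.
    static List<String> validate(String xml) throws Exception {
        final List<String> problems = new ArrayList<String>();
        Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));

        // The ValidatorHandler is a SAX ContentHandler, so no DOM is built.
        ValidatorHandler handler = schema.newValidatorHandler();
        handler.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored here */ }
            public void error(SAXParseException e) { problems.add(e.getMessage()); }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // schema validation needs namespace-aware events
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return problems;
    }
}
```

A caller could switch to an error page when the returned list is not empty, which is the behaviour argued for earlier in this message.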

-- 
Domain in provider transition, hope for smoothness. Planned date is 1.7.2003.



RE: TidySerializer

2003-06-05 Thread Arje Cahn
1) 

RE: TidySerializer

2003-06-04 Thread Hunsberger, Peter
Arje Cahn <[EMAIL PROTECTED]> wrote:

> Another inventory.
> 
> 1) 

RE: ValidatingTransformer (WAS RE: TidySerializer)

2003-06-04 Thread Geissel, Adrian
> > 
> > 
> >  
> > 
> > 

Or, perhaps 

 

/Adrian




RE: TidySerializer

2003-06-04 Thread Arje Cahn
> > BD> TK> We have a current problem, that cocoon is not useable in many cases,

I think I just changed my opinion. I don't need a TidySerializer as desperately as I 
thought I did. 

What I need is HTML-valid (whatever that may be) output from Cocoon. I saw Joerg 
rescue someone on the users list whose tag got messed up. This issue has been 
going on for a long, long time. And I have the feeling a lot of users really don't 
understand this behaviour. I know it is fixed now, but it's not clear enough for 
users. It's probably a matter of time, but some Wikifying could speed up the process 
[note to myself]. So that's done.

Another inventory.

1)