"&amp;"... wow. No wonder my URLs never worked; I just used "&" and 
then gave up.
Thanks for the new info, it will really help.

On Saturday, August 10, 2002, at 11:03 PM, Justin Fagnani-Bell wrote:

> Hi again,
>
>   <warning> this is a long post </warning>
>
>   I'm still working on HTML forms where the user (me for the moment:) 
> is supposed to input HTML into a text area that will be stored in an 
> XML format. I'm still having problems, so I haven't written a SUMMARY 
> post...
>
> My new problem occurred last night when I was testing the system and 
> put in an anchor tag with a URL that has request parameters, like 
> this:
>
> <a href="http://www.something.net/apage.jsp?p1=hi&p2=bye">link</a>
>
> Well, when I hit submit the form is supposed to come back filled out, 
> but instead I get an error that states "the entity 'p2' must end with 
> a ';'".
>
> So I did some searching on w3.org and sure enough, URLs in XHTML have 
> to use '&amp;' instead of '&'. Arrgh, I know this will cause problems 
> once people who are used to normal HTML start using this. I'm 
> considering writing a filter that will escape illegal characters on the 
> way in and un-escape them going back to the user, but that seems like 
> a bit of a pain. Combined with the problems I'm having making people 
> type XML-compliant HTML in the first place, I'm wondering if there's a 
> completely different way I could do this.
>
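
For what it's worth, the "escape on the way in" half of that filter can be quite small. Here is a rough sketch in Java, not a tested Cocoon component: the class and method names are made up for illustration. It only rewrites an '&' that does not already start a reference, so input that is already escaped passes through unchanged.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cocoon.
final class AmpersandEscaper {
    // Matches an '&' that does NOT begin a named (&amp;), decimal (&#38;)
    // or hexadecimal (&#x26;) reference.
    private static final Pattern BARE_AMPERSAND =
        Pattern.compile("&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Escapes bare ampersands and leaves existing references untouched.
    static String escapeBareAmpersands(String input) {
        return BARE_AMPERSAND.matcher(input).replaceAll("&amp;");
    }
}
```

Two caveats. Named references such as &nbsp; are left alone here, but an XML parser will still reject them unless a DTD declares them. And un-escaping on the way back out cannot tell an '&' the user typed from an '&amp;' the user typed, so the round trip is not exact.
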
> I'm sure someone else out there has come across these problems before. 
> It seems inevitable when building a webapp that uses XML on the 
> backend and lets users edit content. The users only marginally know 
> HTML in the first place and can't be expected to follow the rules 
> correctly every time. The app, after all, is supposed to be easy 
> to use.
>
> I would love to start some discussion on different ideas for handling 
> these types of problems. They must be common among Cocoon users, and 
> maybe we can come up with a set of solutions (HOW-TO's, Java helper 
> classes, taglibs) to make life easier on Cocoon developers and 
> end-users.
>
> Here's my little list of requirements, issues, and assumptions when 
> dealing with forms, user input, and xml.
>
> 1) My users are used to HTML, not XML
> 2) My users are not fail proof, and are probably prone to occasional 
> mistakes
> 3) Ideally I want them to be able to input HTML (not XML-compliant), 
> plain text, or XML (not HTML, but any XML; this is actually preferred, 
> but sometimes users are just entering a news item or a BBS post, and 
> it seems reasonable to allow them to use HTML for formatting rather 
> than inventing my own XML dialect)
> 4) The data is going to be in an XML document/SAX stream at some point
>   (either stored that way, or stored in a database and turned into xml 
> through a generator)
> 5) Sometimes I want to run XSL transformations on the data when it is 
> output.
> 6) When editing the data, I'd like to have it appear exactly as the 
> user typed it
> 7) but I'd also like to have the ability to clean it up (as an option)
> 8) Browsers handle HTML 4 much better than XHTML, so the pages 
> I send them work better if I use the HTMLSerializer
>
> Here are some problems I've encountered so far.
>
> 1) users don't follow XML rules very well (goes along with point 1)
> 2) the HTMLSerializer changes the user's data by turning <br/> into 
> <br>, etc.
> 3) the XML Serializer changes the user's data by turning 
> <textarea></textarea> into <textarea/>, etc.
> 4) bad user input will cause SAXExceptions if it's not enclosed in 
> CDATA sections
>
> (Oh, to clarify here: I typically have two pages which show the data. 
> One is the 'edit' page with the form; the other, the 'viewing page', 
> is where the data actually shows up. The HTMLSerializer is no problem 
> on the viewing page, just on the editing page.)
>
> Some of these points interfere with some solutions. For example, I 
> could wrap the data in a CDATA section to get around XML compliance, 
> but then I wouldn't be able to run XSL transformations on it (correct 
> me if I'm wrong anywhere). Maybe I could check if the data is xml 
> compliant and wrap it only if it isn't.
>
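
The "check if it's compliant and wrap it only if it isn't" idea could look roughly like the sketch below. It uses only the JAXP/SAX classes that ship with the JDK. The class name, method names, and the throwaway <root> wrapper element are my own assumptions, not anything Cocoon provides.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Cocoon.
final class FragmentWrapper {
    // True if the fragment parses once it is given a single dummy root element.
    static boolean isWellFormed(String fragment) {
        String doc = "<root>" + fragment + "</root>";
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(doc)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the input
        } catch (Exception e) {
            // Parser configuration or I/O trouble, not a problem with the input.
            throw new IllegalStateException(e);
        }
    }

    // Returns well-formed input unchanged; wraps anything else in CDATA.
    static String wrapIfNeeded(String fragment) {
        if (isWellFormed(fragment)) {
            return fragment;
        }
        // "]]>" would end the section early, so split it across two sections.
        return "<![CDATA[" + fragment.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }
}
```

With this, wrapIfNeeded("<b>x</b>") comes back as-is and stays visible to a stylesheet as markup, while wrapIfNeeded("<br>") comes back as a CDATA section that a stylesheet sees only as text.
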
> Here are some ideas for solutions:
>
> 1) Create a new HTMLSerializer that can selectively determine which 
> tags it will convert into HTML and which it will leave alone. This way 
> you could specify that all textarea tags and their contents shouldn't 
> be touched (I would think this would be a reasonable default feature 
> anyway)
> 2) Create a jTidy-like program that will turn HTML into XHTML, but work 
> for fragments (jTidy seems to only output complete HTML documents)
> 3) Create a class that can find an XML error, and report it nicely back 
> to the user so they can fix it. (I recall a demo with Cocoon 1.8.x that 
> had something like this...)
>
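
For idea 3), the standard SAXParseException already carries a line number and a message, so a first cut at "report it nicely" might be as simple as the sketch below. The names are hypothetical and this is not the Cocoon 1.8.x demo mentioned above. The message text comes straight from whichever parser is installed, so its wording will vary.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Cocoon.
final class XmlErrorReporter {
    // Returns null when the fragment parses, otherwise a one-line description.
    static String describeError(String fragment) {
        // The wrapper tags sit on their own lines, so line N of the fragment
        // is line N + 1 of the document handed to the parser.
        String doc = "<root>\n" + fragment + "\n</root>";
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(doc)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            // Clamp to the fragment: an unclosed tag is only noticed at </root>.
            int lines = fragment.split("\n", -1).length;
            int line = Math.max(1, Math.min(e.getLineNumber() - 1, lines));
            return "Line " + line + ": " + e.getMessage();
        } catch (Exception e) {
            // Parser configuration or I/O trouble, not a problem with the input.
            throw new IllegalStateException(e);
        }
    }
}
```

The returned string could be shown next to the textarea when the form is redisplayed, so the user can find and fix the offending line.
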
> Hmm, these three things might do it. The new serializer would work for 
> editing, and the Tidy-like class would work for either storing the 
> data as XML or just viewing it as XML. I think I have an idea of how 
> to do the serializer, but it wouldn't rely on a transformer like the 
> current one. I looked at the code for jTidy and there's a ton of 
> classes, so I've yet to fully comprehend how it works; it might 
> already be able to do what I want. And like I said, I saw something 
> similar to 3) a year or so ago...
>
> ok, that's my thoughts...
>
> Justin
>
>
>
> ---------------------------------------------------------------------
> Please check that your question  has not already been answered in the
> FAQ before posting.     <http://xml.apache.org/cocoon/faq/index.html>
>
> To unsubscribe, e-mail:     <[EMAIL PROTECTED]>
> For additional commands, e-mail:   <[EMAIL PROTECTED]>
>
>

