"&amp;"... wow, no wonder my URLs never worked. I just used the bare "&", then I gave up. Thanks for the new info, it will really help.
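For anyone else who hits this, here is a rough sketch of the kind of "escape on the way in" filter Justin describes in the quoted message below. It is only an illustration, not something taken from Cocoon: the class and method names (AmpersandEscaper, escapeBareAmpersands) are made up, and the regular expression is a heuristic that leaves anything already shaped like an entity or character reference alone.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cocoon: turns a bare '&' into "&amp;"
// so that user-typed URLs survive an XML parser.
final class AmpersandEscaper {

    // An '&' that is NOT followed by "name;", "#123;" or "#x1F;".
    // Heuristic only: text that merely looks like an entity reference
    // (for example "&copy;" typed by mistake) is left untouched.
    private static final Pattern BARE_AMP = Pattern.compile(
        "&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    static String escapeBareAmpersands(String input) {
        return BARE_AMP.matcher(input).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        String raw =
            "<a href=\"http://www.something.net/apage.jsp?p1=hi&p2=bye\">link</a>";
        // prints <a href="http://www.something.net/apage.jsp?p1=hi&amp;p2=bye">link</a>
        System.out.println(escapeBareAmpersands(raw));
    }
}
```

Because already-escaped references are skipped, running the filter twice gives the same result as running it once, so it should be safe to apply to input that is partly escaped already.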
On Saturday, August 10, 2002, at 11:03 PM, Justin Fagnani-Bell wrote:

> Hi again,
>
> <warning> this is a long post </warning>
>
> I'm still working on HTML forms where the user (me for the moment :)
> is supposed to input HTML into a text area that will be stored in an
> XML format. I'm still having problems, so I haven't written a SUMMARY
> post...
>
> My new problem occurred last night while I was testing the system: I
> put in an anchor tag with a URL that has request parameters, like
> this:
>
> <a href="http://www.something.net/apage.jsp?p1=hi&p2=bye">link</a>
>
> Well, when I hit submit the form is supposed to come back filled out,
> but instead I get an error that states "the entity 'p2' must end with
> a ';'".
>
> So I did some searching on w3.org and sure enough, URLs in XHTML have
> to use '&amp;' instead of '&'. Arrgh, I know this will cause problems
> once people who are used to normal HTML start using this. I'm
> considering writing a filter that will escape illegal characters on the
> way in, and un-escape them going back to the user, but that seems like
> a bit of a pain, and combined with the problems I'm having making people
> type XML-compliant HTML in the first place, I'm wondering if there's a
> completely different way I could do this.
>
> I'm sure someone else out there has come across these problems before.
> They seem inevitable when building a webapp that lets users edit some
> content and uses XML on the backend. The users only marginally know
> HTML in the first place and can't be expected to follow the rules
> correctly every time. The app, after all, is supposed to be easy
> to use.
>
> I would love to start some discussion on different ideas for handling
> these types of problems. They must be common among Cocoon users, and
> maybe we can come up with a set of solutions (HOW-TOs, Java helper
> classes, taglibs) to make life easier on Cocoon developers and
> end-users.
>
> Here's my little list of requirements, issues, and assumptions when
> dealing with forms, user input, and XML.
>
> 1) My users are used to HTML, not XML
> 2) My users are not foolproof, and are probably prone to occasional
>    mistakes
> 3) Ideally I want them to be able to input HTML (non-XML-compliant),
>    plain text, or XML (not HTML, but any XML. This is actually
>    preferred, but sometimes users are just entering a news item or a
>    BBS post, and it seems reasonable to allow them to use HTML for
>    formatting rather than inventing my own XML dialect)
> 4) The data is going to be in an XML document/SAX stream at some point
>    (either stored that way, or stored in a database and turned into XML
>    through a generator)
> 5) Sometimes I want to run XSL transformations on the data when it is
>    output
> 6) When editing the data, I'd like to have it appear exactly as the
>    user typed it
> 7) But I'd also like to have the ability to clean it up (as an option)
> 8) The browsers like HTML 4 much better than XHTML, therefore the pages
>    I send them work better if I use the HTMLSerializer
>
> Here are some problems I've encountered so far.
>
> 1) Users don't follow XML rules very well (goes along with point 1)
> 2) The HTMLSerializer changes the user's data by turning <br/> into
>    <br>, etc.
> 3) The XML Serializer changes the user's data by turning
>    <textarea></textarea> into <textarea/>, etc.
> 4) Bad user input will cause SAXExceptions if it's not enclosed in
>    CDATA sections
>
> (Oh, to clarify here: I typically have two pages which show the data.
> One is the 'edit' page with the form, the other is where the data
> actually shows up, the 'viewing page'. The HTMLSerializer is no problem
> on the viewing page, just on the editing page.)
>
> Some of these points interfere with some solutions. For example, I
> could wrap the data in a CDATA section to get around XML compliance,
> but then I wouldn't be able to run XSL transformations on it (correct
> me if I'm wrong anywhere).
> Maybe I could check if the data is XML compliant and wrap it only if
> it isn't.
>
> Here are some ideas for solutions:
>
> 1) Create a new HTMLSerializer that can selectively determine which
>    tags it will convert into HTML and which it will leave alone. This
>    way you could specify that all textarea tags and their contents
>    shouldn't be touched (I would think this would be a reasonable
>    default feature anyway)
> 2) Create a jTidy-like program that will turn HTML into XHTML, but work
>    for fragments (jTidy seems to only output complete HTML documents)
> 3) Create a class that can find an XML error and report it nicely back
>    to the user so they can fix it. (I recall a demo with Cocoon 1.8.x
>    that had something like this...)
>
> Hmm, these three things might do it. The new serializer would work for
> editing, and the Tidy-like class would work for either storing the data
> as XML or just viewing it as XML. I think I have an idea on how to do
> the serializer, but it wouldn't rely on a transformer like the current
> one. I looked at the code for jTidy and there's a ton of classes, so
> I've yet to fully comprehend how it works. It might already be able to
> do what I want, and like I said, I saw something similar to 3) a year
> or so ago...
>
> OK, those are my thoughts...
>
> Justin
>
> ---------------------------------------------------------------------
> Please check that your question has not already been answered in the
> FAQ before posting. <http://xml.apache.org/cocoon/faq/index.html>
>
> To unsubscribe, e-mail: <[EMAIL PROTECTED]>
> For additional commands, e-mail: <[EMAIL PROTECTED]>
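
On the "check if the data is XML compliant and wrap it only if it isn't" idea from the quoted message, plus idea 3) about reporting the error back to the user: a rough sketch using the standard JAXP SAX parser might look like the following. FragmentChecker, findError and wrapIfNeeded are hypothetical names invented for this example, not existing Cocoon classes. The dummy <root> wrapper is an assumption that lets fragments with several top-level elements be checked.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: checks a user-typed fragment for well-formedness.
final class FragmentChecker {

    /** Returns null if the fragment is well-formed, else a readable message. */
    static String findError(String fragment) {
        // The dummy root sits on its own line, so the parser's line
        // numbers are off by exactly one relative to the user's text.
        String doc = "<root>\n" + fragment + "\n</root>";
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // DefaultHandler rethrows fatal (well-formedness) errors.
            parser.parse(new InputSource(new StringReader(doc)),
                         new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return "Line " + (e.getLineNumber() - 1)
                 + ", column " + e.getColumnNumber()
                 + ": " + e.getMessage();
        } catch (Exception e) {
            return "Could not parse input: " + e.getMessage();
        }
    }

    /** Leaves well-formed input alone; wraps anything else in CDATA. */
    static String wrapIfNeeded(String fragment) {
        if (findError(fragment) == null) {
            return fragment;
        }
        // "]]>" cannot appear inside a CDATA section, so split it in two.
        return "<![CDATA["
             + fragment.replace("]]>", "]]]]><![CDATA[>")
             + "]]>";
    }
}
```

The message from findError could be shown next to the form so the user can fix the input. As Justin notes, anything that ends up wrapped in CDATA is just text as far as an XSL transformation is concerned, so this only helps with storing and redisplaying the input, not with transforming it.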