You might want to put your findings into the Lenya Wiki ;-)
Michi
Douglas Hurbon wrote:
I just spent a bit of time looking into this issue in terms of Lenya
1.2. See my discussion with myself from 3/30/06 in this forum, subject
"Entity Resolution".
The bottom line is that the one form editor can be made to accept
entities, but this requires modifying
org.apache.lenya.cms.cocoon.acting.OneFormEditorSaveAction. This is
fast and will work, but it means that all your docs have to adhere to
the same DTD.
Alternatively, I suppose you could rewrite the pipelines for the
form-based editors so that the serializers that provide their content
output a DTD, and have the forms pass that DTD back as part of the
document so that they can be saved. The default publication doesn't do
this, and I've not experimented with it, but it seems like it'd work.
Keep in mind that unless you get Xalan to stop transforming entities,
your entities will be saved as their numeric equivalent once they leave
the form-based editors (assuming this setup works at all, which I can't
vouch for). Anyone else played with this?
BXE just won't allow entities: it resolves them when the document opens,
and then it doesn't expect to see them anymore. (Resolving them means you
have to get the Lenya serializer that provides the XML to BXE to output a
DTD, but then of course any entities in that doc will already have been
resolved by Xalan on the generate statement.) I'd suggest using snippets
to provide special characters as numeric character references.
Here are my notes on my final conclusions concerning Lenya 1.2 and
implementing a custom set of xml entities to be provided by Cocoon's
entity catalogues:
On Entities in Lenya:
Use the hexadecimal equivalent of html entities, and we have no
problems in Lenya as long as the final serializer outputs a browser
friendly encoding. The safest seems to be "ISO (blah blah blah)".
Anything outside that character set gets sent as a numeric character
reference such as &#8226; instead of the actual character •, which
makes IE happy.
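To make the "hexadecimal equivalent" idea concrete, here is a minimal sketch in plain Java. It is not Lenya or Cocoon code; the class name NumericRefs and the method toHexReferences are made up for illustration. It assumes that everything above 7-bit ASCII should be written as a hexadecimal character reference.

```java
/** Hypothetical helper, not part of Lenya: rewrites non-ASCII characters as hex references. */
final class NumericRefs {

    /** Replaces every code point above 0x7F with a reference of the form &#xHHHH;. */
    static String toHexReferences(String text) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (cp > 0x7F) {
                // Outside ASCII: emit a hexadecimal numeric character reference.
                out.append("&#x").append(Integer.toHexString(cp).toUpperCase()).append(';');
            } else {
                // ASCII passes through unchanged (markup characters are NOT escaped here).
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: Caf&#xE9; &#x2022; Lenya
        System.out.println(toHexReferences("Caf\u00E9 \u2022 Lenya"));
    }
}
```

Because the references are plain ASCII, they survive whatever encoding the final serializer picks, which is the property the note above relies on.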
Entities cannot persist in Lenya: any time Xalan parses an XML
document (whether it's an .xsl or not), that is, any time it reads one
from disk into memory, it resolves entities according to whatever DTD
is declared in that document. I've heard this can be turned off, but
haven't played with it.
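The behaviour is easy to reproduce outside Lenya. The sketch below uses the JDK's built-in JAXP parser, not the Cocoon/Xalan pipeline, and an internal DTD subset stands in for a real DTD or entity catalogue; the class name EntityExpansionDemo is made up. It only shows the default behaviour, where the reference is gone as soon as the document is in memory.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Hypothetical demo, not Lenya code: a default JAXP parse expands declared entities. */
final class EntityExpansionDemo {

    /** Sample document; the internal subset stands in for a DTD that declares &baruch;. */
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE p [ <!ENTITY baruch \"Baruch College\"> ]>\n"
        + "<p>Welcome to &baruch;</p>";

    /** Parses the XML with default settings and returns the root element's text. */
    static String parseText(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    public static void main(String[] args) {
        // prints: Welcome to Baruch College
        System.out.println(parseText(SAMPLE));
    }
}
```

JAXP does offer DocumentBuilderFactory.setExpandEntityReferences(false), but whether an equivalent switch can be reached through Lenya's pipelines is something I have not tested.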
Any round trip of entities will be overly complicated, and not at all
possible with BXE -- BXE itself will resolve any entities passed to it
according to the doctype, and so it will also write "Baruch College" if
we manage to pass it a DTD and &baruch;.
I think the best we can do without completely rewriting Lenya and BXE
is to use the snippets in BXE for special characters, and write those
special characters as their hexadecimal equivalent.
Hand editing (BBEdit via WebDAV) with entities requires declaring an
XHTML doctype. That part is fine: generating that file will resolve
the entities without trouble.
The form editors are finally processed by Java classes which receive a
document without a DTD or XML declaration, such as:
org.apache.lenya.cms.cocoon.acting.OneFormEditorSaveAction
    // Aggregate content
    String encoding = request.getCharacterEncoding();
    String content =
        "<?xml version=\"1.0\" encoding=\""
            + encoding
            + "\"?>\n"
            + addNamespaces(namespaces, request.getParameter("content"));
They currently add an XML declaration, and it'd be here that they'd
need to have a DTD imposed on them. Imposing a DTD on them by
hard-coding it here in the Java works perfectly, allowing entities in
the oneform editor.
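For anyone who wants to try it, the change amounts to something like the following sketch. It is a standalone illustration, not the actual patch: the class DoctypePrefix and the method withDoctype are made up, the XHTML 1.0 Transitional doctype is just an assumed choice, and the real action also runs the content through addNamespaces, which is left out here.

```java
/** Hypothetical sketch, not the real OneFormEditorSaveAction: prepends a fixed DOCTYPE. */
final class DoctypePrefix {

    /** Assumed doctype; pick whichever DTD declares the entities your editors need. */
    static final String XHTML_DOCTYPE =
        "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
        + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">";

    /** Builds the text handed to the parser: XML declaration, hard-coded DOCTYPE, then the form content. */
    static String withDoctype(String encoding, String body) {
        return "<?xml version=\"1.0\" encoding=\"" + encoding + "\"?>\n"
            + XHTML_DOCTYPE + "\n"
            + body;
    }

    public static void main(String[] args) {
        System.out.println(withDoctype("UTF-8", "<html><body><p>&nbsp;</p></body></html>"));
    }
}
```

The single hard-coded constant is exactly where the one-DTD-for-everything limitation comes from.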
The big problem with this is that it doesn't allow you to mix and match
DTDs. Our custom XML resource types would cause an error when saving
because they don't match the XHTML doctype imposed by the hard-coded
Java class.
Further, we wouldn't benefit much from this, because the saved document
would have "Baruch College" written there, and NOT &baruch;, as passing
the doc through Xalan as part of the save process transforms all the
entities. So we could do a lot of work to allow editors to use
entities ONCE, and thereafter have to spell everything correctly by
hand. Not a great idea.
On Apr 11, 2006, at 6:20 AM, Andreas Hartmann wrote:
[EMAIL PROTECTED] wrote:
Michael Ralston wrote:
What do you mean by 'Predefined'... is it possible to edit this
definition?
Predefined entities are declared in the XML spec:
http://www.w3.org/TR/REC-xml/#sec-predefined-ent
BTW, &nbsp; is not one of them.
So if &nbsp; is not a predefined entity... where is it defined?
In the corresponding DTD (e.g., HTML 4 and XHTML).
I really need to work out where to define &rsquo;
I find it hard to believe nobody has solved this problem before...
I always use numeric character references, like &#160;.
-- Andreas
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--
Michael Wechner
Wyona Inc. - Open Source Content Management - Apache Lenya
http://www.wyona.com http://cocoon.apache.org/lenya/
[EMAIL PROTECTED] [EMAIL PROTECTED]