(Context: I'm trying to understand the requirements for our serializers in case we rewrite them [in Rust].)
The HTML fragment parsing algorithm can have only one context node. The context is never a chain of nodes toward the root, since such a chain wouldn't affect the result per the HTML parsing algorithm. However, when the HTML parsing algorithm is in the non-fragment mode, some tags get ignored without an appropriate parent, so e.g. to represent <td> in the non-fragment mode, you need to include <table>, etc. But that's about it.

The Windows CF_HTML clipboard format, https://msdn.microsoft.com/en-us/library/windows/desktop/ms649015(v=vs.85).aspx , represents fragments by designating them in a full HTML document, so what are logically fragments have to work with non-fragment parsing.

This suggests that when we export a fragment to the clipboard, we should serialize its parent if the fragment is not table-related, or reconstruct a full table if it is table-related. Yet, it seems that we serialize much more ancestor context. Is there a good reason to? For example, does Microsoft Office (our old bugs suggest that Excel is the pickiest consumer) or do other CF_HTML consumers on Windows care about more context than the standard HTML parsing algorithm does? What could consumers possibly do with knowledge about ancestors beyond the parent or the nearest <table>? (I'm ignoring SVG and MathML for the moment.)

OTOH, it seems that we include only some element types in the context (https://searchfox.org/mozilla-central/source/dom/base/nsDocumentEncoder.cpp#1540). It's unclear to me why. The first revision of the list came from jst during the Netscape 6 crunch without an explanation in either Bugzilla or code comments. (https://bugzilla.mozilla.org/show_bug.cgi?id=50742) Does anyone know why?

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
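
To make the "parent, or the nearest <table>" idea concrete, here is a rough Rust sketch of what a minimal context could look like. It is only an illustration under stated assumptions, not what nsDocumentEncoder does today: the function names minimal_context_chain and wrap_fragment are hypothetical, element names are assumed to be lowercase HTML local names, ancestor attributes are dropped, only table-related contexts get special treatment, and the CF_HTML byte-offset header is left out. The only CF_HTML detail used is the pair of <!--StartFragment--> and <!--EndFragment--> comment markers.

```rust
/// Hypothetical helper: given the local name of the fragment's parent
/// (the would-be context node), return the ancestor chain, outermost
/// first, that a non-fragment parse needs so the content is not dropped.
/// Assumption: only table-related parents need more than themselves.
fn minimal_context_chain(context: &str) -> Vec<&str> {
    match context {
        // The wrapper document below already supplies these.
        "html" | "body" => vec![],
        "table" => vec!["table"],
        // Direct children of <table>.
        "caption" | "colgroup" | "thead" | "tbody" | "tfoot" => vec!["table", context],
        // Rows: spell out the <tbody> the parser would otherwise imply.
        "tr" => vec!["table", "tbody", "tr"],
        // Cells: rebuild the full table skeleton around them.
        "td" | "th" => vec!["table", "tbody", "tr", context],
        // Everything else: the parent alone is assumed to be enough.
        other => vec![other],
    }
}

/// Hypothetical helper: place `fragment` in a full document, surrounded
/// by the minimal context and the CF_HTML fragment marker comments.
/// The CF_HTML header (Version, StartHTML, ...) is deliberately omitted.
fn wrap_fragment(context: &str, fragment: &str) -> String {
    let chain = minimal_context_chain(context);
    let mut out = String::from("<html><body>");
    // Open the context elements, outermost first.
    for name in &chain {
        out.push('<');
        out.push_str(name);
        out.push('>');
    }
    out.push_str("<!--StartFragment-->");
    out.push_str(fragment);
    out.push_str("<!--EndFragment-->");
    // Close the context elements, innermost first.
    for name in chain.iter().rev() {
        out.push_str("</");
        out.push_str(name);
        out.push('>');
    }
    out.push_str("</body></html>");
    out
}
```

With this sketch, a copied cell whose parent is <tr> would be wrapped in <table><tbody><tr>, while a copied <span> whose parent is <p> would be wrapped in <p> only. Whether real consumers such as Excel are satisfied with that little context is exactly the open question above.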