Re: [Wikitech-l] Html.php line 269
I was digging and I found that commit you sent - eefe1b13a but you can see that this commit modified 2 functions: both open and closeElement, however for some reason the code was removed from closeElement so now calling openElement('html') would produce empty string, while closeElement('html') would produce /html and that should IMHO create HTML code that would only contain closing tags in case that html / head didn't have any parameters. Actually head, at least on wmf installation doesn't have any parameters, but it has opening tag, which is weird. (Perhaps wgWellFormedXml is true by default?) On Wed, Feb 18, 2015 at 10:31 PM, Brion Vibber bvib...@wikimedia.org wrote: On Wed, Feb 18, 2015 at 1:23 PM, Petr Bena benap...@gmail.com wrote: Can someone explain the point of these lines to me? https://github.com/wikimedia/mediawiki/blob/master/includes/Html.php#L269 Someone thought it would be clever to reduce page size by a few bytes here and there. :) In practice we always have attributes on the html so this would only take effect on head for this particular case. However there are other bits which can be more aggressively dropped, such as closing tags for table parts etc. Here's the commit that added this and related bits: https://github.com/wikimedia/mediawiki/commit/eefe1b13a382a7d11c2137bbf900b783ee445323 If these tags are optional, they can be there, so why remove them? If you remove them, you should probably also care about them in closeElement() so that mediawiki doesn't produce html which has only ending tags for html and head. I believe that's also covered, yes. (Keep in mind also that the way the HTML content model works is that those elements always exist in the DOM; the tags can simply be omitted from the markup because they are implied by the surrounding content.) It seems that on WMF installation we don't use anyway as there is head tag. So why is it there? What purpose does it server? I think we still have the well-formed XML mode enabled for backwards-compatibility? -- brion ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Html.php line 269
On Wed, Feb 18, 2015 at 1:23 PM, Petr Bena benap...@gmail.com wrote: Can someone explain the point of these lines to me? https://github.com/wikimedia/mediawiki/blob/master/includes/Html.php#L269 Someone thought it would be clever to reduce page size by a few bytes here and there. :) In practice we always have attributes on the html so this would only take effect on head for this particular case. However there are other bits which can be more aggressively dropped, such as closing tags for table parts etc. Here's the commit that added this and related bits: https://github.com/wikimedia/mediawiki/commit/eefe1b13a382a7d11c2137bbf900b783ee445323 If these tags are optional, they can be there, so why remove them? If you remove them, you should probably also care about them in closeElement() so that mediawiki doesn't produce html which has only ending tags for html and head. I believe that's also covered, yes. (Keep in mind also that the way the HTML content model works is that those elements always exist in the DOM; the tags can simply be omitted from the markup because they are implied by the surrounding content.) It seems that on WMF installation we don't use anyway as there is head tag. So why is it there? What purpose does it server? I think we still have the well-formed XML mode enabled for backwards-compatibility? -- brion ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Html.php line 269
I think it's the other way, omiting the html tags is probably correct, as long as both opening and ending tags are omitted :P so I think the closeElement function needs a fix On Wed, Feb 18, 2015 at 10:42 PM, Ricordisamoa ricordisa...@openmailbox.org wrote: Uploaded https://gerrit.wikimedia.org/r/191473/ for consistency with closeElement. Thanks for reporting. Il 18/02/2015 22:38, Petr Bena ha scritto: I was digging and I found that commit you sent - eefe1b13a but you can see that this commit modified 2 functions: both open and closeElement, however for some reason the code was removed from closeElement so now calling openElement('html') would produce empty string, while closeElement('html') would produce /html and that should IMHO create HTML code that would only contain closing tags in case that html / head didn't have any parameters. Actually head, at least on wmf installation doesn't have any parameters, but it has opening tag, which is weird. (Perhaps wgWellFormedXml is true by default?) On Wed, Feb 18, 2015 at 10:31 PM, Brion Vibber bvib...@wikimedia.org wrote: On Wed, Feb 18, 2015 at 1:23 PM, Petr Bena benap...@gmail.com wrote: Can someone explain the point of these lines to me? https://github.com/wikimedia/mediawiki/blob/master/includes/Html.php#L269 Someone thought it would be clever to reduce page size by a few bytes here and there. :) In practice we always have attributes on the html so this would only take effect on head for this particular case. However there are other bits which can be more aggressively dropped, such as closing tags for table parts etc. Here's the commit that added this and related bits: https://github.com/wikimedia/mediawiki/commit/eefe1b13a382a7d11c2137bbf900b783ee445323 If these tags are optional, they can be there, so why remove them? If you remove them, you should probably also care about them in closeElement() so that mediawiki doesn't produce html which has only ending tags for html and head. I believe that's also covered, yes. (Keep in mind also that the way the HTML content model works is that those elements always exist in the DOM; the tags can simply be omitted from the markup because they are implied by the surrounding content.) It seems that on WMF installation we don't use anyway as there is head tag. So why is it there? What purpose does it server? I think we still have the well-formed XML mode enabled for backwards-compatibility? -- brion ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Html.php line 269
That patch you made even breaks tests: https://integration.wikimedia.org/ci/job/mediawiki-core-phpcs-HEAD/13002/console :) so either remove a unit test for that feature, or just fix the closeElement but I don't even know if it's really broken, maybe there is some more magic on behing that would fix the broken html before it gets printed out On Wed, Feb 18, 2015 at 10:47 PM, Petr Bena benap...@gmail.com wrote: I think it's the other way, omiting the html tags is probably correct, as long as both opening and ending tags are omitted :P so I think the closeElement function needs a fix On Wed, Feb 18, 2015 at 10:42 PM, Ricordisamoa ricordisa...@openmailbox.org wrote: Uploaded https://gerrit.wikimedia.org/r/191473/ for consistency with closeElement. Thanks for reporting. Il 18/02/2015 22:38, Petr Bena ha scritto: I was digging and I found that commit you sent - eefe1b13a but you can see that this commit modified 2 functions: both open and closeElement, however for some reason the code was removed from closeElement so now calling openElement('html') would produce empty string, while closeElement('html') would produce /html and that should IMHO create HTML code that would only contain closing tags in case that html / head didn't have any parameters. Actually head, at least on wmf installation doesn't have any parameters, but it has opening tag, which is weird. (Perhaps wgWellFormedXml is true by default?) On Wed, Feb 18, 2015 at 10:31 PM, Brion Vibber bvib...@wikimedia.org wrote: On Wed, Feb 18, 2015 at 1:23 PM, Petr Bena benap...@gmail.com wrote: Can someone explain the point of these lines to me? https://github.com/wikimedia/mediawiki/blob/master/includes/Html.php#L269 Someone thought it would be clever to reduce page size by a few bytes here and there. :) In practice we always have attributes on the html so this would only take effect on head for this particular case. However there are other bits which can be more aggressively dropped, such as closing tags for table parts etc. Here's the commit that added this and related bits: https://github.com/wikimedia/mediawiki/commit/eefe1b13a382a7d11c2137bbf900b783ee445323 If these tags are optional, they can be there, so why remove them? If you remove them, you should probably also care about them in closeElement() so that mediawiki doesn't produce html which has only ending tags for html and head. I believe that's also covered, yes. (Keep in mind also that the way the HTML content model works is that those elements always exist in the DOM; the tags can simply be omitted from the markup because they are implied by the surrounding content.) It seems that on WMF installation we don't use anyway as there is head tag. So why is it there? What purpose does it server? I think we still have the well-formed XML mode enabled for backwards-compatibility? -- brion ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Html.php line 269
Most of browsers would probably handle it OK, but validators not. On Wed, Feb 18, 2015 at 10:45 PM, Petr Bena benap...@gmail.com wrote: Yep That makes me feel like this is bug which everyone overlooked because it's true by default. If you set it to false mediawiki would likely write out syntactically wrong code. On Wed, Feb 18, 2015 at 10:43 PM, Gergo Tisza gti...@wikimedia.org wrote: On Wed, Feb 18, 2015 at 1:38 PM, Petr Bena benap...@gmail.com wrote: (Perhaps wgWellFormedXml is true by default?) It is: https://www.mediawiki.org/wiki/Manual:$wgWellFormedXml ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Html.php line 269
Uploaded https://gerrit.wikimedia.org/r/191473/ for consistency with closeElement. Thanks for reporting. Il 18/02/2015 22:38, Petr Bena ha scritto: I was digging and I found that commit you sent - eefe1b13a but you can see that this commit modified 2 functions: both open and closeElement, however for some reason the code was removed from closeElement so now calling openElement('html') would produce empty string, while closeElement('html') would produce /html and that should IMHO create HTML code that would only contain closing tags in case that html / head didn't have any parameters. Actually head, at least on wmf installation doesn't have any parameters, but it has opening tag, which is weird. (Perhaps wgWellFormedXml is true by default?) On Wed, Feb 18, 2015 at 10:31 PM, Brion Vibber bvib...@wikimedia.org wrote: On Wed, Feb 18, 2015 at 1:23 PM, Petr Bena benap...@gmail.com wrote: Can someone explain the point of these lines to me? https://github.com/wikimedia/mediawiki/blob/master/includes/Html.php#L269 Someone thought it would be clever to reduce page size by a few bytes here and there. :) In practice we always have attributes on the html so this would only take effect on head for this particular case. However there are other bits which can be more aggressively dropped, such as closing tags for table parts etc. Here's the commit that added this and related bits: https://github.com/wikimedia/mediawiki/commit/eefe1b13a382a7d11c2137bbf900b783ee445323 If these tags are optional, they can be there, so why remove them? If you remove them, you should probably also care about them in closeElement() so that mediawiki doesn't produce html which has only ending tags for html and head. I believe that's also covered, yes. (Keep in mind also that the way the HTML content model works is that those elements always exist in the DOM; the tags can simply be omitted from the markup because they are implied by the surrounding content.) It seems that on WMF installation we don't use anyway as there is head tag. So why is it there? What purpose does it server? I think we still have the well-formed XML mode enabled for backwards-compatibility? -- brion ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Html.php line 269
On Wed, Feb 18, 2015 at 1:38 PM, Petr Bena benap...@gmail.com wrote: (Perhaps wgWellFormedXml is true by default?) It is: https://www.mediawiki.org/wiki/Manual:$wgWellFormedXml ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Html.php line 269
Yep That makes me feel like this is bug which everyone overlooked because it's true by default. If you set it to false mediawiki would likely write out syntactically wrong code. On Wed, Feb 18, 2015 at 10:43 PM, Gergo Tisza gti...@wikimedia.org wrote: On Wed, Feb 18, 2015 at 1:38 PM, Petr Bena benap...@gmail.com wrote: (Perhaps wgWellFormedXml is true by default?) It is: https://www.mediawiki.org/wiki/Manual:$wgWellFormedXml ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Html.php line 269
On 19/02/15 08:43, Gergo Tisza wrote: On Wed, Feb 18, 2015 at 1:38 PM, Petr Bena benap...@gmail.com wrote: (Perhaps wgWellFormedXml is true by default?) It is: https://www.mediawiki.org/wiki/Manual:$wgWellFormedXml There was a Bugzilla report and Gerrit change requesting that it be set to false: https://phabricator.wikimedia.org/T52040 https://gerrit.wikimedia.org/r/#/c/70036/ I was against it, partly because of the omitted head tag. -- Tim Starling ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Html.php line 269
I would also recommend against actively trying to emit barely parsing output. Any savings after compression should be rather small, and if only end tags are omitted the DOM will of course still be the same size after parsing. In Parsoid we went to some modest lengths https://github.com/wikimedia/parsoid/blob/master/lib/XMLSerializer.js to produce polyglot markup http://www.w3.org/TR/html-polyglot/, which is both valid XML and HTML5. This has enabled consumers to use either XML or HTML5 parsers, which has proven very useful in practice. For example, this makes it easier to consume this content using PHP's libxml. Doing the same in MediaWiki core is admittedly harder, but I still think that we should follow the robustness principle https://en.wikipedia.org/wiki/Robustness_principle wherever we can. Gabriel On Wed, Feb 18, 2015 at 5:59 PM, Tim Starling tstarl...@wikimedia.org wrote: On 19/02/15 08:43, Gergo Tisza wrote: On Wed, Feb 18, 2015 at 1:38 PM, Petr Bena benap...@gmail.com wrote: (Perhaps wgWellFormedXml is true by default?) It is: https://www.mediawiki.org/wiki/Manual:$wgWellFormedXml There was a Bugzilla report and Gerrit change requesting that it be set to false: https://phabricator.wikimedia.org/T52040 https://gerrit.wikimedia.org/r/#/c/70036/ I was against it, partly because of the omitted head tag. -- Tim Starling ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l