Arlolra has uploaded a new change for review.
https://gerrit.wikimedia.org/r/230268
Change subject: Remove resetting the parse position
......................................................................
Remove resetting the parse position
* This was actually broken by Ib2193153341d92304a47b8689de88bad77415eac:
peg$reportedPos corresponds to startOffset(), but was replaced
with endOffset().
* Why didn't that matter? Well, the original intent was to not eat
the spaces necessary to separate the next attribute, but the lookahead
for = meant that any following attribute name would not match
anyway. So, some confusion on my part apparently.
Indeed, if you check out Iac546cca5a9e441723c8c8e474b30b7df617a34a and
remove the if blocks, the test pages parse as expected.
frwikisource/La_Mirlitantouille_(Lenotre)?oldid=4669681
plwiki/Nedre_Eiker?oldid=37975712
There's also the test case "div with multiple empty attribute values"
which obviously continued to pass.
* I'm also removing the ? here to make it clearer that we don't want to
eat the spaces. These _att_value rules are both used optionally.
* In any case, this should all change as we move closer to the HTML5
attribute parsing algorithm in T108134. I'm just trying to justify
that this isn't a change in behaviour yet.
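
The argument above can be sketched with a small hand-written model of the
last alternative in the _att_value rules. Everything below is a hypothetical
simplification, not the Parsoid tokenizer: matchText stands in for
attribute_preprocessor_text (assumed here to be a run of characters other
than whitespace, "=", ">" and "|"), and optionalValue stands in for the
optional use of the rule by its caller.

```javascript
// Hypothetical model of the rule; names and character classes are assumptions.
function skipSpaces(input, pos) {
  while (pos < input.length && /\s/.test(input[pos])) pos++;
  return pos;
}

// Stand-in for attribute_preprocessor_text: one or more characters that are
// not whitespace, "=", ">" or "|". Returns the end position, or -1 on failure.
function matchText(input, pos) {
  let end = pos;
  while (end < input.length && !/[\s=>|]/.test(input[end])) end++;
  return end > pos ? end : -1;
}

// Old alternative: space* text? !"="  (the position reset was a no-op)
function oldAttValue(input, pos) {
  const valPos1 = skipSpaces(input, pos);
  let end = matchText(input, valPos1);
  let t = end === -1 ? null : input.slice(valPos1, end);
  if (end === -1) end = valPos1;
  if (input[end] === "=") return null; // !"=" fails; caller backtracks to pos
  if (t === null) t = "";
  return { value: t, nextPos: end };
}

// New alternative: space* text !"="
function newAttValue(input, pos) {
  const valPos1 = skipSpaces(input, pos);
  const end = matchText(input, valPos1);
  if (end === -1) return null;         // no text: fail without eating spaces
  if (input[end] === "=") return null; // text is the next attribute's name
  return { value: input.slice(valPos1, end), nextPos: end };
}

// The rule is used optionally: a failed match yields an empty value at pos.
function optionalValue(rule, input, pos) {
  return rule(input, pos) || { value: "", nextPos: pos };
}

// Inputs are what follows the "=" of a first attribute.
for (const rest of [" b=c", " b", " >"]) {
  console.log(JSON.stringify(rest),
    optionalValue(oldAttValue, rest, 0),
    optionalValue(newAttValue, rest, 0));
}
```

In this model both variants produce the same attribute value for each input.
They differ only in whether the spaces before an empty value are consumed,
and for " b=c" both fail on the lookahead, so the separating space is left
in place either way.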
Change-Id: I3fc1965ddc6068e65d925311d122dd08c74f0a1e
---
M lib/pegTokenizer.pegjs.txt
1 file changed, 4 insertions(+), 16 deletions(-)
git pull ssh://gerrit.wikimedia.org:29418/mediawiki/services/parsoid refs/changes/68/230268/1
diff --git a/lib/pegTokenizer.pegjs.txt b/lib/pegTokenizer.pegjs.txt
index 20e2389..5cebc1b 100644
--- a/lib/pegTokenizer.pegjs.txt
+++ b/lib/pegTokenizer.pegjs.txt
@@ -1346,14 +1346,8 @@
/ valPos1:("" { return endOffset(); }) t2:attribute_preprocessor_text_double_broken? valPos2:("" { return endOffset(); }) &[|>]
{ return tu.getAttributeValueAndSource(input, t2, valPos1, valPos2); } )
{ return r; }
- / space_or_newline* valPos1:("" { return endOffset(); }) t:attribute_preprocessor_text? !"=" valPos2:("" { return endOffset(); })
- { if (t === null) {
- t = "";
- // Reset the current parse position in order to reparse any
- // captured spaces, which are needed to separate attributes.
- peg$currPos = endOffset();
- }
- return tu.getAttributeValueAndSource(input, t, valPos1, valPos2); }
+ / space_or_newline* valPos1:("" { return endOffset(); }) t:attribute_preprocessor_text !"=" valPos2:("" { return endOffset(); })
+ { return tu.getAttributeValueAndSource(input, t, valPos1, valPos2); }
// Attribute value, restricted to a single line.
table_att_value
@@ -1369,14 +1363,8 @@
/ valPos1:("" { return endOffset(); }) t2:attribute_preprocessor_text_double_line_broken? valPos2:("" { return endOffset(); }) &[|>\n]
{ return tu.getAttributeValueAndSource(input, t2, valPos1, valPos2); } )
{ return r; }
- / space* valPos1:("" { return endOffset(); }) t:attribute_preprocessor_text_line? !"=" valPos2:("" { return endOffset(); })
- { if (t === null) {
- t = "";
- // Reset the current parse position in order to reparse any
- // captured spaces, which are needed to separate attributes.
- peg$currPos = endOffset();
- }
- return tu.getAttributeValueAndSource(input, t ? t : "", valPos1, valPos2); }
+ / space* valPos1:("" { return endOffset(); }) t:attribute_preprocessor_text_line !"=" valPos2:("" { return endOffset(); })
+ { return tu.getAttributeValueAndSource(input, t ? t : "", valPos1, valPos2); }
/*
* A variant of generic_tag, but also checks if the tag name is a block-level
--
To view, visit https://gerrit.wikimedia.org/r/230268
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3fc1965ddc6068e65d925311d122dd08c74f0a1e
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/services/parsoid
Gerrit-Branch: master
Gerrit-Owner: Arlolra <[email protected]>