[MediaWiki-commits] [Gerrit] Remove resetting the parse position - change (mediawiki...parsoid)

Arlolra (Code Review) Fri, 07 Aug 2015 18:56:41 -0700

Arlolra has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/230268


Change subject: Remove resetting the parse position
......................................................................

Remove resetting the parse position

 * The reset was actually broken by Ib2193153341d92304a47b8689de88bad77415eac:
   peg$reportedPos corresponds to startOffset(), but it was replaced
   with endOffset().

 * Why didn't that matter? Well, the original intent was to not eat the
   spaces necessary to separate the next attribute, but the lookahead
   for = meant that any following attribute name would not match
   anyway. So, some confusion on my part, apparently.

   Indeed, if you check out Iac546cca5a9e441723c8c8e474b30b7df617a34a and
   remove the if blocks, the test pages parse as expected.

     frwikisource/La_Mirlitantouille_(Lenotre)?oldid=4669681
     plwiki/Nedre_Eiker?oldid=37975712

   There's also the test case "div with multiple empty attribute values"
   which obviously continued to pass.

 * I'm also removing the ? here to make it clearer that we don't want to
   eat the spaces. These _att_value rules are both used optionally.

 * In any case, this should all change as we move closer to the HTML5
   attribute parsing algorithm in T108134. I'm just trying to justify
   that this isn't a change in behaviour yet.
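
The lookahead argument above can be sketched outside the grammar. The
following is a minimal hand-written model in JavaScript, not Parsoid code:
unquotedAttValue, matchAt, and the TEXT pattern are hypothetical stand-ins
(the real attribute_preprocessor_text rule is more involved), and the model
only assumes standard PEG semantics, where "e?" does not backtrack once e
has matched.

```javascript
// Simplified stand-in for attribute_preprocessor_text: a run of characters
// that are not whitespace, "=", ">" or "|". (Assumption, not the real rule.)
const TEXT = /[^\s=>|]+/y;
// Stand-in for space_or_newline*; always matches, possibly the empty string.
const SPACES = /\s*/y;

// Match a sticky regex at exactly `pos`; return the matched text or null.
function matchAt(re, input, pos) {
  re.lastIndex = pos;
  const m = re.exec(input);
  return m ? m[0] : null;
}

// Models the sequence:  space_or_newline* t:text(?) !"="
// textIsOptional = true  models the old rule (text?)
// textIsOptional = false models the new rule (text)
// Returns { value, end } on success, or null on failure. On failure a PEG
// parser backtracks to `pos`, so no spaces are consumed.
function unquotedAttValue(input, pos, textIsOptional) {
  let cur = pos + matchAt(SPACES, input, pos).length;
  const t = matchAt(TEXT, input, cur);
  if (t === null && !textIsOptional) {
    return null; // required text is missing
  }
  if (t !== null) {
    cur += t.length;
  }
  if (input[cur] === "=") {
    return null; // negative lookahead !"=" fails
  }
  return { value: t === null ? "" : t, end: cur };
}

// Next token is an attribute name followed by "=": both variants fail,
// so the separating space is left for the next attribute either way.
console.log(unquotedAttValue(" b=c>", 0, true));  // null
console.log(unquotedAttValue(" b=c>", 0, false)); // null

// Only spaces before ">": the optional variant matches an empty value and
// eats the spaces; the required variant fails and consumes nothing.
console.log(unquotedAttValue("  >", 0, true));  // value "", end 2
console.log(unquotedAttValue("  >", 0, false)); // null
```

In this model the two variants differ only when no text follows the spaces,
which is the case the removed "if (t === null)" block was written for.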

Change-Id: I3fc1965ddc6068e65d925311d122dd08c74f0a1e
---
M lib/pegTokenizer.pegjs.txt
1 file changed, 4 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/mediawiki/services/parsoid refs/changes/68/230268/1

diff --git a/lib/pegTokenizer.pegjs.txt b/lib/pegTokenizer.pegjs.txt
index 20e2389..5cebc1b 100644
--- a/lib/pegTokenizer.pegjs.txt
+++ b/lib/pegTokenizer.pegjs.txt
@@ -1346,14 +1346,8 @@
         / valPos1:("" { return endOffset(); }) t2:attribute_preprocessor_text_double_broken? valPos2:("" { return endOffset(); }) &[|>]
             { return tu.getAttributeValueAndSource(input, t2, valPos1, valPos2); } )
                 { return r; }
-  / space_or_newline* valPos1:("" { return endOffset(); }) t:attribute_preprocessor_text? !"=" valPos2:("" { return endOffset(); })
-        {   if (t === null) {
-                t = "";
-                // Reset the current parse position in order to reparse any
-                // captured spaces, which are needed to separate attributes.
-                peg$currPos = endOffset();
-            }
-            return tu.getAttributeValueAndSource(input, t, valPos1, valPos2); }
+  / space_or_newline* valPos1:("" { return endOffset(); }) t:attribute_preprocessor_text !"=" valPos2:("" { return endOffset(); })
+        { return tu.getAttributeValueAndSource(input, t, valPos1, valPos2); }
 
 // Attribute value, restricted to a single line.
 table_att_value
@@ -1369,14 +1363,8 @@
         / valPos1:("" { return endOffset(); }) t2:attribute_preprocessor_text_double_line_broken? valPos2:("" { return endOffset(); }) &[|>\n]
             { return tu.getAttributeValueAndSource(input, t2, valPos1, valPos2); } )
                 { return r; }
-  / space* valPos1:("" { return endOffset(); }) t:attribute_preprocessor_text_line? !"=" valPos2:("" { return endOffset(); })
-        {   if (t === null) {
-                t = "";
-                // Reset the current parse position in order to reparse any
-                // captured spaces, which are needed to separate attributes.
-                peg$currPos = endOffset();
-            }
-            return tu.getAttributeValueAndSource(input, t ? t : "", valPos1, valPos2); }
+  / space* valPos1:("" { return endOffset(); }) t:attribute_preprocessor_text_line !"=" valPos2:("" { return endOffset(); })
+        { return tu.getAttributeValueAndSource(input, t ? t : "", valPos1, valPos2); }
 
 /*
  * A variant of generic_tag, but also checks if the tag name is a block-level

-- 
To view, visit https://gerrit.wikimedia.org/r/230268
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3fc1965ddc6068e65d925311d122dd08c74f0a1e
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/services/parsoid
Gerrit-Branch: master
Gerrit-Owner: Arlolra <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
