I have two questions about this:
1) Any idea how this affects performance?
2) Is it reasonable to just create an instance of a new class here, or
maybe a struct, rather than passing a hash? The hash is built the
same every time (although obviously with different data), which
implies it makes sense to use a class.
It'd be best if we could just return the instance to the parser,
rather than [:token, ...], but we'd probably need to find some way to
make the token evaluate appropriately for the grammar.
That is, it might work to add a 'to_s' method that returns the token
string.
The performance thing probably isn't a big deal, but considering the
number of lexing tokens in a given configuration, it's at least worth
looking at. It'd be great if someone with a big configuration can
give this a try.
On Jun 28, 2009, at 7:34 AM, Brice Figureau wrote:
>
> Careful inspection of the parser code show that when we
> associate a source line number for an AST node, we use the
> current line number of the currently lexed token.
> In many case, this is correct, but there are some cases where
> this is incorrect.
> Unfortunately due to how LALR parser works the ast node creation
> of a statement can appear _after_ we lexed another token after
> the current statement:
>
> 1. $foo = 1
> 2.
> 3. class test
>
> When the parser asks for the class token, it can reduce the
> assignement statement into the AST VarDef node, because no other
> grammar rule match. Unfortunately we already lexed the class token
> so we affect to the VarDef node the line number 3 instead of 1.
>
> This is not a real issue for error reporting, but becomes a real
> concern when we associate documentation comments to AST node for
> puppetdoc.
>
> The solution is to enhance the tokens lexed and returned to the parser
> to carry their declaration line number.
> Thus a token value becomes a hash: { :value => tokenvalue, :line }
>
> Next, each time we create an AST node, we use the line number of
> the correct token (ie the foo line number in the previous example).
>
> Signed-off-by: Brice Figureau <[email protected]>
> ---
> lib/puppet/parser/grammar.ra | 119 ++++++-----
> lib/puppet/parser/lexer.rb | 8 +-
> lib/puppet/parser/parser.rb | 391 +++++++++++++++++++
> +---------------
> lib/puppet/parser/parser_support.rb | 7 +-
> spec/unit/parser/lexer.rb | 34 +++-
> 5 files changed, 316 insertions(+), 243 deletions(-)
>
> diff --git a/lib/puppet/parser/grammar.ra b/lib/puppet/parser/
> grammar.ra
> index d7828d7..4c5d320 100644
> --- a/lib/puppet/parser/grammar.ra
> +++ b/lib/puppet/parser/grammar.ra
> @@ -75,26 +75,30 @@ statement: resource
> fstatement: NAME LPAREN funcvalues RPAREN {
> args = aryfy(val[2])
> result = ast AST::Function,
> - :name => val[0],
> + :name => val[0][:value],
> + :line => val[0][:line],
> :arguments => args,
> :ftype => :statement
> }
> | NAME LPAREN funcvalues COMMA RPAREN {
> args = aryfy(val[2])
> result = ast AST::Function,
> - :name => val[0],
> + :name => val[0][:value],
> + :line => val[0][:line],
> :arguments => args,
> :ftype => :statement
> } | NAME LPAREN RPAREN {
> result = ast AST::Function,
> - :name => val[0],
> + :name => val[0][:value],
> + :line => val[0][:line],
> :arguments => AST::ASTArray.new({}),
> :ftype => :statement
> }
> | NAME funcvalues {
> args = aryfy(val[1])
> result = ast AST::Function,
> - :name => val[0],
> + :name => val[0][:value],
> + :line => val[0][:line],
> :arguments => args,
> :ftype => :statement
> }
> @@ -126,7 +130,7 @@ namestring: name
> | selector
> | quotedtext
> | CLASSNAME {
> - result = ast AST::Name, :value => val[0]
> + result = ast AST::Name, :value => val[0][:value]
> }
>
> resource: classname LBRACE resourceinstances endsemi RBRACE {
> @@ -267,15 +271,16 @@ collstatement: collexpr
> result.parens = true
> }
>
> -colljoin: AND | OR
> +colljoin: AND { result=val[0][:value] }
> + | OR { result=val[0][:value] }
>
> collexpr: colllval ISEQUAL simplervalue {
> - result = ast AST::CollExpr, :test1 => val[0], :oper =>
> val[1], :test2 => val[2]
> + result = ast AST::CollExpr, :test1 => val[0], :oper => val[1]
> [:value], :test2 => val[2]
> #result = ast AST::CollExpr
> #result.push *val
> }
> | colllval NOTEQUAL simplervalue {
> - result = ast AST::CollExpr, :test1 => val[0], :oper =>
> val[1], :test2 => val[2]
> + result = ast AST::CollExpr, :test1 => val[0], :oper => val[1]
> [:value], :test2 => val[2]
> #result = ast AST::CollExpr
> #result.push *val
> }
> @@ -305,11 +310,11 @@ undef: UNDEF {
> }
>
> name: NAME {
> - result = ast AST::Name, :value => val[0]
> + result = ast AST::Name, :value => val[0][:value], :line =>
> val[0][:line]
> }
>
> type: CLASSREF {
> - result = ast AST::Type, :value => val[0]
> + result = ast AST::Type, :value => val[0][:value], :line =>
> val[0][:line]
> }
>
> resourcename: quotedtext
> @@ -320,17 +325,17 @@ resourcename: quotedtext
> | array
>
> assignment: VARIABLE EQUALS expression {
> - if val[0] =~ /::/
> + if val[0][:value] =~ /::/
> raise Puppet::ParseError, "Cannot assign to variables in
> other namespaces"
> end
> # this is distinct from referencing a variable
> - variable = ast AST::Name, :value => val[0]
> - result = ast AST::VarDef, :name => variable, :value => val[2]
> + variable = ast AST::Name, :value => val[0][:value], :line =>
> val[0][:line]
> + result = ast AST::VarDef, :name => variable, :value =>
> val[2], :line => val[0][:line]
> }
>
> append: VARIABLE APPENDS expression {
> - variable = ast AST::Name, :value => val[0]
> - result = ast AST::VarDef, :name => variable, :value =>
> val[2], :append => true
> + variable = ast AST::Name, :value => val[0][:value], :line =>
> val[0][:line]
> + result = ast AST::VarDef, :name => variable, :value =>
> val[2], :append => true, :line => val[0][:line]
> }
>
> params: # nothing
> @@ -348,11 +353,11 @@ params: # nothing
> }
>
> param: NAME FARROW rvalue {
> - result = ast AST::ResourceParam, :param => val[0], :value =>
> val[2]
> + result = ast AST::ResourceParam, :param => val[0]
> [:value], :line => val[0][:line], :value => val[2]
> }
>
> addparam: NAME PARROW rvalue {
> - result = ast AST::ResourceParam, :param => val[0], :value =>
> val[2],
> + result = ast AST::ResourceParam, :param => val[0]
> [:value], :line => val[0][:line], :value => val[2],
> :add => true
> }
>
> @@ -404,29 +409,29 @@ rvalue: quotedtext
> funcrvalue: NAME LPAREN funcvalues RPAREN {
> args = aryfy(val[2])
> result = ast AST::Function,
> - :name => val[0],
> + :name => val[0][:value], :line => val[0][:line],
> :arguments => args,
> :ftype => :rvalue
> } | NAME LPAREN RPAREN {
> result = ast AST::Function,
> - :name => val[0],
> + :name => val[0][:value], :line => val[0][:line],
> :arguments => AST::ASTArray.new({}),
> :ftype => :rvalue
> }
>
> quotedtext: DQTEXT {
> - result = ast AST::String, :value => val[0]
> + result = ast AST::String, :value => val[0][:value], :line =>
> val[0][:line]
> } | SQTEXT {
> - result = ast AST::FlatString, :value => val[0]
> + result = ast AST::FlatString, :value => val[0][:value], :line
> => val[0][:line]
> }
>
> boolean: BOOLEAN {
> - result = ast AST::Boolean, :value => val[0]
> + result = ast AST::Boolean, :value => val[0][:value], :line =>
> val[0][:line]
> }
>
> resourceref: NAME LBRACK rvalues RBRACK {
> Puppet.warning addcontext("Deprecation notice: Resource
> references should now be capitalized")
> - result = ast AST::ResourceReference, :type => val[0], :title =>
> val[2]
> + result = ast AST::ResourceReference, :type => val[0]
> [:value], :line => val[0][:line], :title => val[2]
> } | classref LBRACK rvalues RBRACK {
> result = ast AST::ResourceReference, :type => val[0], :title =>
> val[2]
> }
> @@ -482,52 +487,52 @@ else: # nothing
>
> expression: rvalue
> | expression PLUS expression {
> - result = ast AST::ArithmeticOperator, :operator =>
> val[1], :lval => val[0], :rval => val[2]
> + result = ast AST::ArithmeticOperator, :operator => val[1]
> [:value], :lval => val[0], :rval => val[2]
> }
> | expression MINUS expression {
> - result = ast AST::ArithmeticOperator, :operator =>
> val[1], :lval => val[0], :rval => val[2]
> + result = ast AST::ArithmeticOperator, :operator => val[1]
> [:value], :lval => val[0], :rval => val[2]
> }
> | expression DIV expression {
> - result = ast AST::ArithmeticOperator, :operator =>
> val[1], :lval => val[0], :rval => val[2]
> + result = ast AST::ArithmeticOperator, :operator => val[1]
> [:value], :lval => val[0], :rval => val[2]
> }
> | expression TIMES expression {
> - result = ast AST::ArithmeticOperator, :operator =>
> val[1], :lval => val[0], :rval => val[2]
> + result = ast AST::ArithmeticOperator, :operator => val[1]
> [:value], :lval => val[0], :rval => val[2]
> }
> | expression LSHIFT expression {
> - result = ast AST::ArithmeticOperator, :operator =>
> val[1], :lval => val[0], :rval => val[2]
> + result = ast AST::ArithmeticOperator, :operator => val[1]
> [:value], :lval => val[0], :rval => val[2]
> }
> | expression RSHIFT expression {
> - result = ast AST::ArithmeticOperator, :operator =>
> val[1], :lval => val[0], :rval => val[2]
> + result = ast AST::ArithmeticOperator, :operator => val[1]
> [:value], :lval => val[0], :rval => val[2]
> }
> | MINUS expression =UMINUS {
> result = ast AST::Minus, :value => val[1]
> }
> | expression NOTEQUAL expression {
> - result = ast AST::ComparisonOperator, :operator =>
> val[1], :lval => val[0], :rval => val[2]
> + result = ast AST::ComparisonOperator, :operator => val[1]
> [:value], :lval => val[0], :rval => val[2]
> }
> | expression ISEQUAL expression {
> - result = ast AST::ComparisonOperator, :operator =>
> val[1], :lval => val[0], :rval => val[2]
> + result = ast AST::ComparisonOperator, :operator => val[1]
> [:value], :lval => val[0], :rval => val[2]
> }
> | expression GREATERTHAN expression {
> - result = ast AST::ComparisonOperator, :operator =>
> val[1], :lval => val[0], :rval => val[2]
> + result = ast AST::ComparisonOperator, :operator => val[1]
> [:value], :lval => val[0], :rval => val[2]
> }
> | expression GREATEREQUAL expression {
> - result = ast AST::ComparisonOperator, :operator =>
> val[1], :lval => val[0], :rval => val[2]
> + result = ast AST::ComparisonOperator, :operator => val[1]
> [:value], :lval => val[0], :rval => val[2]
> }
> | expression LESSTHAN expression {
> - result = ast AST::ComparisonOperator, :operator =>
> val[1], :lval => val[0], :rval => val[2]
> + result = ast AST::ComparisonOperator, :operator => val[1]
> [:value], :lval => val[0], :rval => val[2]
> }
> | expression LESSEQUAL expression {
> - result = ast AST::ComparisonOperator, :operator =>
> val[1], :lval => val[0], :rval => val[2]
> + result = ast AST::ComparisonOperator, :operator => val[1]
> [:value], :lval => val[0], :rval => val[2]
> }
> | NOT expression {
> result = ast AST::Not, :value => val[1]
> }
> | expression AND expression {
> - result = ast AST::BooleanOperator, :operator => val[1], :lval
> => val[0], :rval => val[2]
> + result = ast AST::BooleanOperator, :operator => val[1]
> [:value], :lval => val[0], :rval => val[2]
> }
> | expression OR expression {
> - result = ast AST::BooleanOperator, :operator => val[1], :lval
> => val[0], :rval => val[2]
> + result = ast AST::BooleanOperator, :operator => val[1]
> [:value], :lval => val[0], :rval => val[2]
> }
> | LPAREN expression RPAREN {
> result = val[1]
> @@ -605,7 +610,7 @@ selectlhand: name
> | boolean
> | undef
> | DEFAULT {
> - result = ast AST::Default, :value => val[0]
> + result = ast AST::Default, :value => val[0][:value], :line =>
> val[0][:line]
> }
>
> # These are only used for importing, and we don't interpolate there.
> @@ -626,14 +631,14 @@ import: IMPORT qtexts {
> #definition: DEFINE NAME argumentlist parent LBRACE statements
> RBRACE {
> definition: DEFINE classname argumentlist LBRACE statements RBRACE {
> @lexer.commentpop
> - newdefine classname(val[1]), :arguments => val[2], :code =>
> val[4]
> + newdefine classname(val[1]), :arguments => val[2], :code =>
> val[4], :line => val[0][:line]
> @lexer.indefine = false
> result = nil
>
> #} | DEFINE NAME argumentlist parent LBRACE RBRACE {
> } | DEFINE classname argumentlist LBRACE RBRACE {
> @lexer.commentpop
> - newdefine classname(val[1]), :arguments => val[2]
> + newdefine classname(val[1]), :arguments => val[2], :line =>
> val[0][:line]
> @lexer.indefine = false
> result = nil
> }
> @@ -643,30 +648,30 @@ hostclass: CLASS classname classparent LBRACE
> statements RBRACE {
> @lexer.commentpop
> # Our class gets defined in the parent namespace, not our own.
> @lexer.namepop
> - newclass classname(val[1]), :code => val[4], :parent => val[2]
> + newclass classname(val[1]), :code => val[4], :parent =>
> val[2], :line => val[0][:line]
> result = nil
> } | CLASS classname classparent LBRACE RBRACE {
> @lexer.commentpop
> # Our class gets defined in the parent namespace, not our own.
> @lexer.namepop
> - newclass classname(val[1]), :parent => val[2]
> + newclass classname(val[1]), :parent => val[2], :line => val[0]
> [:line]
> result = nil
> }
>
> nodedef: NODE hostnames nodeparent LBRACE statements RBRACE {
> @lexer.commentpop
> - newnode val[1], :parent => val[2], :code => val[4]
> + newnode val[1], :parent => val[2], :code => val[4], :line =>
> val[0][:line]
> result = nil
> } | NODE hostnames nodeparent LBRACE RBRACE {
> @lexer.commentpop
> - newnode val[1], :parent => val[2]
> + newnode val[1], :parent => val[2], :line => val[0][:line]
> result = nil
> }
>
> -classref: CLASSREF
> +classref: CLASSREF { result = val[0][:value] }
>
> -classname: NAME
> - | CLASSNAME
> +classname: NAME { result = val[0][:value] }
> + | CLASSNAME { result = val[0][:value] }
>
> # Multiple hostnames, as used for node names. These are all literal
> # strings, not AST objects.
> @@ -674,13 +679,13 @@ hostnames: hostname
> | hostnames COMMA hostname {
> result = val[0]
> result = [result] unless result.is_a?(Array)
> - result << val[2]
> + result << val[2][:value]
> }
>
> -hostname: NAME
> - | SQTEXT
> - | DQTEXT
> - | DEFAULT
> +hostname: NAME { result = val[0][:value] }
> + | SQTEXT { result = val[0][:value] }
> + | DQTEXT { result = val[0][:value] }
> + | DEFAULT { result = val[0][:value] }
>
> nil: {
> result = nil
> @@ -708,15 +713,15 @@ arguments: argument
>
> argument: NAME EQUALS rvalue {
> Puppet.warning addcontext("Deprecation notice: must now include
> '$' in prototype")
> - result = [val[0], val[2]]
> + result = [val[0][:value], val[2]]
> }
> | NAME {
> Puppet.warning addcontext("Deprecation notice: must now include
> '$' in prototype")
> - result = [val[0]]
> + result = [val[0][:value]]
> } | VARIABLE EQUALS rvalue {
> - result = [val[0], val[2]]
> + result = [val[0][:value], val[2]]
> } | VARIABLE {
> - result = [val[0]]
> + result = [val[0][:value]]
> }
>
> nodeparent: nil
> @@ -732,7 +737,7 @@ classparent: nil
> classnameordefault: classname | DEFAULT
>
> variable: VARIABLE {
> - result = ast AST::Variable, :value => val[0]
> + result = ast AST::Variable, :value => val[0][:value], :line =>
> val[0][:line]
> }
>
> array: LBRACK rvalues RBRACK {
> diff --git a/lib/puppet/parser/lexer.rb b/lib/puppet/parser/lexer.rb
> index 6884e68..e296872 100644
> --- a/lib/puppet/parser/lexer.rb
> +++ b/lib/puppet/parser/lexer.rb
> @@ -353,7 +353,7 @@ class Puppet::Parser::Lexer
>
> return if token.skip
>
> - return token, value
> + return token, { :value => value, :line => @line }
> end
>
> # Go up one in the namespace.
> @@ -415,13 +415,15 @@ class Puppet::Parser::Lexer
> @last_return = false
> end
>
> - final_token, value = munge_token(matched_token, value)
> + final_token, token_value = munge_token(matched_token,
> value)
>
> unless final_token
> skip()
> next
> end
>
> + value = token_value[:value]
> +
> if match = @@pairs[value] and final_token.name !
> = :DQUOTE and final_token.name != :SQUOTE
> @expected << match
> elsif exp = @expected[-1] and exp == value and
> final_token.name != :DQUOTE and final_token.name != :SQUOTE
> @@ -432,7 +434,7 @@ class Puppet::Parser::Lexer
> commentpush
> end
>
> - yield [final_token.name, value]
> + yield [final_token.name, token_value]
>
> if @previous_token
> namestack(value) if @previous_token.name == :CLASS
> diff --git a/lib/puppet/parser/parser_support.rb b/lib/puppet/parser/
> parser_support.rb
> index 1d81e60..2fb0755 100644
> --- a/lib/puppet/parser/parser_support.rb
> +++ b/lib/puppet/parser/parser_support.rb
> @@ -324,6 +324,7 @@ class Puppet::Parser::Parser
> args[:code] = code if code
> args[:parentclass] = parent if parent
> args[:doc] = doc
> + args[:line] = options[:line]
>
> @astset.classes[name] = ast AST::HostClass, args
> end
> @@ -350,7 +351,8 @@ class Puppet::Parser::Parser
> :code => options[:code],
> :parser => self,
> :classname => name,
> - :doc => options[:doc]
> + :doc => options[:doc],
> + :line => options[:line]
> }
>
> [:code, :arguments].each do |param|
> @@ -374,7 +376,8 @@ class Puppet::Parser::Parser
> args = {
> :name => name,
> :parser => self,
> - :doc => doc
> + :doc => doc,
> + :line => options[:line]
> }
> if options[:code]
> args[:code] = options[:code]
> diff --git a/spec/unit/parser/lexer.rb b/spec/unit/parser/lexer.rb
> index 2952b73..c6b6e82 100755
> --- a/spec/unit/parser/lexer.rb
> +++ b/spec/unit/parser/lexer.rb
> @@ -4,6 +4,20 @@ require File.dirname(__FILE__) + '/../../spec_helper'
>
> require 'puppet/parser/lexer'
>
> +# This is a special matcher to match easily lexer output
> +Spec::Matchers.create :be_like do |ary|
> + match do |result|
> + r = true
> + result.zip(ary) do |a,b|
> + unless a[0] == b[0] and ((a[1].is_a?(Hash) and a[1]
> [:value] == b[1]) or a[1] == b[1])
> + r = false
> + break
> + end
> + end
> + r
> + end
> +end
> +
> describe Puppet::Parser::Lexer do
> describe "when reading strings" do
> before { @lexer = Puppet::Parser::Lexer.new }
> @@ -474,7 +488,7 @@ describe Puppet::Parser::Lexer, "when lexing
> comments" do
>
> it "should skip whitespace before lexing the next token after a
> non-token" do
> @lexer.string = "/* 1\n\n */ \ntest"
> - @lexer.fullscan.include?([:NAME, "test"]).should be_true
> + @lexer.fullscan.should be_like([[:NAME, "test"],
> [false,false]])
> end
> end
>
> @@ -491,7 +505,7 @@ describe "Puppet::Parser::Lexer in the old
> tests" do
> }
> strings.each { |str,ary|
> @lexer.string = str
> - @lexer.fullscan().should == ary
> + @lexer.fullscan().should be_like(ary)
> }
> end
>
> @@ -515,7 +529,7 @@ describe "Puppet::Parser::Lexer in the old
> tests" do
> }
> strings.each { |str,array|
> @lexer.string = str
> - @lexer.fullscan().should == array
> + @lexer.fullscan().should be_like(array)
> }
> end
>
> @@ -535,7 +549,7 @@ describe "Puppet::Parser::Lexer in the old
> tests" do
>
> it "should correctly identify keywords" do
> @lexer.string = "case"
> - @lexer.fullscan.should == [[:CASE, "case"], [false, false]]
> + @lexer.fullscan.should be_like([[:CASE, "case"], [false,
> false]])
> end
>
> it "should correctly match strings" do
> @@ -545,11 +559,11 @@ describe "Puppet::Parser::Lexer in the old
> tests" do
>
> names.each { |t|
> @lexer.string = t
> - @lexer.fullscan.should == [[:NAME,t],[false,false]]
> + @lexer.fullscan.should be_like([[:NAME,t],[false,false]])
> }
> types.each { |t|
> @lexer.string = t
> - @lexer.fullscan.should == [[:CLASSREF,t],[false,false]]
> + @lexer.fullscan.should be_like([[:CLASSREF,t],
> [false,false]])
> }
> end
>
> @@ -558,7 +572,7 @@ describe "Puppet::Parser::Lexer in the old
> tests" do
>
> string.each { |t|
> @lexer.string = t
> - @lexer.fullscan.should == [[:NAME,t],[false,false]]
> + @lexer.fullscan.should be_like([[:NAME,t],[false,false]])
> }
> end
>
> @@ -575,7 +589,7 @@ describe "Puppet::Parser::Lexer in the old
> tests" do
>
> @lexer.string = string
>
> - @lexer.fullscan.should == [[:AT, "@"], [:NAME, "type"],
> [:LBRACE, "{"], [false,false]]
> + @lexer.fullscan.should be_like([[:AT, "@"], [:NAME,
> "type"], [:LBRACE, "{"], [false,false]])
> end
>
> it "should correctly deal with namespaces" do
> @@ -618,7 +632,7 @@ describe "Puppet::Parser::Lexer in the old
> tests" do
>
> @lexer.scan do |t, s|
> t.should == :VARIABLE
> - string.sub(/^\$/, '').should == s
> + string.sub(/^\$/, '').should == s[:value]
> break
> end
> end
> @@ -630,7 +644,7 @@ describe "Puppet::Parser::Lexer in the old
> tests" do
>
> string.each do |foo|
> @lexer.string = foo
> - @lexer.fullscan[0].should == [:CLASSREF, foo]
> + @lexer.fullscan.should be_like([[:CLASSREF, foo],
> [false,false]])
> end
> end
>
> --
> 1.6.0.2
>
>
> >
--
I am a kind of paranoiac in reverse. I suspect people of plotting
to make me happy. --J. D. Salinger
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"Puppet Developers" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/puppet-dev?hl=en
-~----------~----~----~----~------~----~------~--~---