On Fri, May 24, 2013 at 2:53 PM, Albert-Jan Roskam <fo...@yahoo.com> wrote:
> Why do I need to use a trailing comma to create a singleton
> tuple? Without a comma it seems to mean "parenthesized single
> object", ie the parentheses are basically not there.

Here are some technical notes and references to augment the other
answers. I think this is on topic in this thread: even though the
title is "Difference between types", Citizen Kant seems mostly
concerned with tokenizing and parsing source code.

Python 3.3's grammar definition:

http://hg.python.org/cpython/file/3.3/Grammar/Grammar

Here's an incomplete subset:

    power: atom trailer* ['**' factor]
    factor: ('+'|'-'|'~') factor | power
    term: factor (('*'|'/'|'%'|'//') factor)*
    arith_expr: term (('+'|'-') term)*
    shift_expr: arith_expr (('<<'|'>>') arith_expr)*
    and_expr: shift_expr ('&' shift_expr)*
    xor_expr: and_expr ('^' and_expr)*
    expr: xor_expr ('|' xor_expr)*
    star_expr: '*' expr

    comp_op: '<'|'>'|'=='|'>='|'<='|'<>'|'!='|
             'in'|'not' 'in'|'is'|'is' 'not'
    comparison: expr (comp_op expr)*
    not_test: 'not' not_test | comparison
    and_test: not_test ('and' not_test)*
    or_test: and_test ('or' and_test)*
    test: or_test ['if' or_test 'else' test] | lambdef

    testlist: test (',' test)* [',']
    exprlist: (expr|star_expr) (',' (expr|star_expr))* [',']
    testlist_comp: (test|star_expr)
                   ( comp_for | (',' (test|star_expr))* [','] )

    atom: ('(' [yield_expr|testlist_comp] ')' |
           '[' [testlist_comp] ']' |
           '{' [dictorsetmaker] '}' |
           NAME | NUMBER | STRING+ | '...' |
           'None' | 'True' | 'False')

    augassign: ('+=' | '-=' | '*=' | '/=' | '%=' | '&=' | '|=' |
                '^=' | '<<=' | '>>=' | '**=' | '//=')
    testlist_star_expr: (test|star_expr) (',' (test|star_expr))* [',']
    expr_stmt: testlist_star_expr
               ( augassign (yield_expr|testlist) |
                 ('=' (yield_expr|testlist_star_expr))*)
    pass_stmt: 'pass'
    small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt |
                 import_stmt | global_stmt | nonlocal_stmt |
                 assert_stmt)
    simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE

    suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT
    for_stmt: 'for' exprlist 'in' testlist ':' suite
              ['else' ':' suite]
    compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt |
                   with_stmt | funcdef | classdef | decorated
    stmt: simple_stmt | compound_stmt

    eval_input: testlist NEWLINE* ENDMARKER
    file_input: (NEWLINE | stmt)* ENDMARKER
    single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE

A possible match for atom is a set of empty parentheses, which
defines an empty tuple. Another match is a testlist_comp in
parentheses. In this case, if there's only one test/star_expr, then
a trailing comma is required to define a tuple; without the comma,
the parentheses merely group the single expression.

The final parse tree depends on where you "start". For example, the
interactive shell uses single_input mode, so it looks for either
NEWLINE (i.e. you hit <enter> at a blank prompt), a simple_stmt, or
compound_stmt NEWLINE. If you enter a compound statement such as a
for_stmt, you have to hit enter on an empty line in order to finish
parsing, compile, and execute the statement. Note also that eval('')
is a syntax error because there's nothing to match testlist in
eval_input mode.

The parser start constants are defined in Include/graminit.h:

    #define single_input 256
    #define file_input 257
    #define eval_input 258

The corresponding C API constants are defined in Include/compile.h:

    #define Py_single_input 256
    #define Py_file_input 257
    #define Py_eval_input 258

For the built-in compile() function these constants map to the modes
'single', 'exec', and 'eval'.

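To make the two points above concrete (the comma rule in atom, and the way the start symbol decides what is accepted), here is a minimal sketch using only built-ins. The helper name compiles() and the '<example>' filename are just made up for illustration; they are not part of any Python API.

```python
def compiles(source, mode):
    """Return True if source compiles in the given compile() mode."""
    try:
        compile(source, '<example>', mode)
    except SyntaxError:
        return False
    return True

# Parentheses alone only group; the comma is what makes a tuple.
print(type((0)).__name__)    # int
print(type((0,)).__name__)   # tuple
print(type(()).__name__)     # tuple (the empty-parentheses case)

# 'eval' mode has to match a testlist, so empty source and statements
# are rejected; 'exec' mode accepts zero or more statements.
print(compiles('', 'eval'))                      # False
print(compiles('', 'exec'))                      # True
print(compiles('for i in (0,): pass', 'eval'))   # False
print(compiles('for i in (0,): pass', 'exec'))   # True
```

The same source can therefore be fine in one mode and a syntax error in another, which is all that "where you start" amounts to in practice.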
The parser module has a couple of functions that create syntax
trees: parser.expr (eval_input mode) and parser.suite (file_input
mode). The resulting st objects have tolist() and totuple() methods,
but the resulting nested sequences of numeric constants aren't all
that helpful for interactive inspection. I scribbled a little code
to pretty print them using the corresponding names from the tokenize
and symbol modules.

    import parser
    from pprint import pprint
    from tokenize import tok_name
    from symbol import sym_name

    # Map both token numbers and grammar symbol numbers to names.
    stnames = dict(tok_name)
    stnames.update(sym_name)

    def stnumtoname(st):
        """Return a copy of the tree with numeric codes as names."""
        if isinstance(st, parser.STType):
            st = st.tolist()
        # The first item of every node is its token/symbol number.
        stnew = [stnames[st[0]]]
        for item in st[1:]:
            if isinstance(item, (list, tuple)):
                stnew.append(stnumtoname(item))
            elif isinstance(item, int):
                stnew.append(stnames[item])
            else:
                stnew.append(item)
        return stnew

    def pprintst(st, **kwds):
        """Pretty print a syntax tree, 60 columns wide by default."""
        kwds.setdefault('width', 60)
        pprint(stnumtoname(st), **kwds)

Here's an example tree for the compound statement
"for i in (0,): pass":

    >>> pprintst(parser.suite('for i in (0,): pass'))
    ['file_input',
     ['stmt',
      ['compound_stmt',
       ['for_stmt',
        ['NAME', 'for'],
        ['exprlist',
         ['expr',
          ['xor_expr',
           ['and_expr',
            ['shift_expr',
             ['arith_expr',
              ['term',
               ['factor',
                ['power', ['atom', ['NAME', 'i']]]]]]]]]]],
        ['NAME', 'in'],
        ['testlist',
         ['test',
          ['or_test',
           ['and_test',
            ['not_test',
             ['comparison',
              ['expr',
               ['xor_expr',
                ['and_expr',
                 ['shift_expr',
                  ['arith_expr',
                   ['term',
                    ['factor',
                     ['power',
                      ['atom',
                       ['LPAR', '('],
                       ['testlist_comp',
                        ['test',
                         ['or_test',
                          ['and_test',
                           ['not_test',
                            ['comparison',
                             ['expr',
                              ['xor_expr',
                               ['and_expr',
                                ['shift_expr',
                                 ['arith_expr',
                                  ['term',
                                   ['factor',
                                    ['power',
                                     ['atom',
                                      ['NUMBER', '0']]]]]]]]]]]]]]],
                        ['COMMA', ',']],
                       ['RPAR', ')']]]]]]]]]]]]]]]],
        ['COLON', ':'],
        ['suite',
         ['simple_stmt',
          ['small_stmt', ['pass_stmt', ['NAME', 'pass']]],
          ['NEWLINE', '']]]]]],
     ['NEWLINE', ''],
     ['ENDMARKER', '']]

If you have a debug build of CPython, use the -d command-line option
to see the parser at work. For example, here's the [abridged] output
for parsing "(0,)":

    >>> (0,)
    Token LPAR/'(' ... It's a token we know
     DFA 'single_input', state 0: Push ...
     DFA 'simple_stmt', state 0: Push ...
     DFA 'small_stmt', state 0: Push ...
     DFA 'expr_stmt', state 0: Push ...
     DFA 'testlist_star_expr', state 0: Push ...
     DFA 'test', state 0: Push ...
     [snip]
     DFA 'atom', state 0: Shift.
    Token NUMBER/'0' ... It's a token we know
     DFA 'atom', state 1: Push ...
     DFA 'testlist_comp', state 0: Push ...
     DFA 'test', state 0: Push ...
     [snip]
     DFA 'atom', state 0: Shift.
     DFA 'atom', state 4: Direct pop.
    Token COMMA/',' ... It's a token we know
     DFA 'power', state 1: Pop ...
     DFA 'factor', state 2: Pop ...
     [snip]
     DFA 'test', state 1: Pop ...
     DFA 'testlist_comp', state 1: Shift.
    Token RPAR/')' ... It's a token we know
     DFA 'testlist_comp', state 3: Pop ...
     DFA 'atom', state 6: Shift.
     DFA 'atom', state 4: Direct pop.
    Token NEWLINE/'' ... It's a token we know
     DFA 'power', state 1: Pop ...
     [snip]
     DFA 'test', state 1: Pop ...
     DFA 'testlist_star_expr', state 1: Pop ...
     DFA 'expr_stmt', state 1: Pop ...
     DFA 'small_stmt', state 1: Pop ...
     DFA 'simple_stmt', state 1: Shift.
     DFA 'simple_stmt', state 3: Direct pop.
     DFA 'single_input', state 1: Direct pop.
     ACCEPT.
    (0,)

The concrete syntax tree (CST) is only the first stage. The next
step is to transform the CST into an AST (abstract syntax tree),
constructed according to the following abstract syntax description
(ASDL):

http://hg.python.org/cpython/file/3.3/Parser/Python.asdl

Use ast.parse() and ast.dump() to build and inspect the AST for
source code (ast.parse() defaults to 'exec' mode):

    >>> import ast
    >>> print(ast.dump(ast.parse('for i in (0,): pass')))
    Module(body=[For(target=Name(id='i', ctx=Store()), iter=Tuple(elts=[Num(n=0)], ctx=Load()), body=[Pass()], orelse=[])])
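As a small follow-up to the AST step, here is a hedged sketch that checks the trailing-comma rule directly on the AST instead of reading the dump by eye. The helper iter_node() is a hypothetical name of mine. It deliberately avoids looking at the node for the literal 0, since that shows up as Num in 3.3 but, as far as I know, as Constant on newer releases (3.8 and later).

```python
import ast

def iter_node(source):
    """Parse a one-statement 'for' loop; return its iterable's AST node."""
    tree = ast.parse(source)      # defaults to mode='exec'
    return tree.body[0].iter      # the 'iter' field of the For node

with_comma = iter_node('for i in (0,): pass')
without_comma = iter_node('for i in (0): pass')

print(type(with_comma).__name__)              # Tuple
print(len(with_comma.elts))                   # 1
print(isinstance(without_comma, ast.Tuple))   # False
```

Note that "for i in (0): pass" still parses; the parentheses just disappear from the AST, and the problem (an int isn't iterable) only shows up at run time.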