Better way to do parsing?

2005-05-04 Thread André Roberge
Hi all,

I posted the following on the python tutor list 3 days ago ... and 
haven't heard a peep from anyone - which is highly unusual on that list.

[Apologies for the slightly longer post due to code
with test cases included at the end.]

I have created a severely restricted environment within which
users can learn the basics of programming in Python.

[note: eval, exec, chr, input, raw_input are not allowed.]

Within that environment, I want to have the user test the five
valid forms that an import statement can have, by attempting to
import a fake module named "useful".  Other import
statements are disallowed.

1. import useful
2. from useful import *
3. from useful import valid_function1 [, valid_function2, ...]
4. from useful import valid_function as better_named_function
5. import useful as not_so_useful_after_all

As far as I can tell, the following works, but it looks rather
clunky to me.  My *very limited* experience with the
re module may have something to do with this.

Any suggestion would be most welcome.

André

=== Here's the code, formatted (fingers crossed) to work if cut and
pasted from email ===


# test_import.py

import re

isolate_words = re.compile(r'\W+')  # used with .split()

# pre-compiled regular expressions for the allowable uses of import

imp_use = re.compile(r'^import useful', re.MULTILINE)
imp_use_as = re.compile(r'^import useful as (\w+)', re.MULTILINE)
from_use_imp_star = re.compile(r'^from useful import \*', re.MULTILINE)
from_use_imp_names = re.compile(
    r'^from useful import (\w+(,[ ]*\w+)*)',
    re.MULTILINE)
from_use_imp_as = re.compile(
    r'^from useful import (\w+) as (\w+)',
    re.MULTILINE)


# In the following, r is used so that \b identifies a word boundary,
# and is not interpreted as a backspace character by Python.

import_misuse = re.compile(r'\bimport\b', re.MULTILINE)

# used to comment out the valid import statements after they are processed.

comment_from = re.compile('^from ', re.MULTILINE)
comment_import = re.compile('^import ', re.MULTILINE)

# Create a fake module which can be imported

right = "turn_right():\n" + \
    "  turn_left()\n" + \
    "  turn_left()\n" + \
    "  turn_left()\n\n"

around = "turn_around():\n" + \
    "  turn_left()\n" + \
    "  turn_left()\n\n"

up_east = "climb_up_east():\n" + \
    "  turn_left()\n" + \
    "  move()\n" + \
    "  turn_left()\n" + \
    "  turn_left()\n" + \
    "  turn_left()\n\n"

up_west = "climb_up_west():\n" + \
    "  turn_left()\n" + \
    "  turn_left()\n" + \
    "  turn_left()\n" + \
    "  move()\n" + \
    "  turn_left()\n\n"

down_west = "climb_down_west():\n" + \
    "  turn_left()\n" + \
    "  move()\n" + \
    "  turn_left()\n" + \
    "  turn_left()\n" + \
    "  turn_left()\n\n"

down_east = "climb_down_east():\n" + \
    "  turn_left()\n" + \
    "  turn_left()\n" + \
    "  turn_left()\n" + \
    "  move()\n" + \
    "  turn_left()\n\n"

commands = {'turn_right': right, 'turn_around': around,
  'climb_up_east': up_east, 'climb_up_west': up_west,
  'climb_down_east': down_east, 'climb_down_west': down_west}

#=== end of info on fake module

# The following functions are helper functions to
# process the import statement:
# they add the appropriate imported commands
# before the import statement,
# before commenting out (by pre-pending #) the import statement line

def import_useful():
    added_text = ''
    for instruction in commands:
        new = "def " + 'useful.' + commands[instruction]
        added_text += new
    return added_text, True

def from_useful_import_star():
    added_text = ''
    for instruction in commands:
        new = "def " + commands[instruction]
        added_text += new
    return added_text, True

def import_useful_as(syn):
    added_text = ''
    for instruction in commands:
        new = "def " + syn + '.' + commands[instruction]
        added_text += new
    return added_text, True

def from_useful_import_names(names):
    added_text = ''
    for instruction in isolate_words.split(names):
        try:
            new = "def " + commands[instruction]
        except KeyError:
            print("%s not found in module useful" % instruction)
            continue  # skip unknown names rather than reuse a stale value
        added_text += new
    return added_text, True

def from_useful_import_as(name, syn):
    added_text = ''
    try:
        new = "def " + commands[name].replace(name, syn)
    except KeyError:
        print("%s not found in module useful" % name)
    else:
        added_text += new
    return added_text, True

def process_no_import():
    added_text = ''
    return added_text, True

# the basic processing function

def process_file(file_text):
    if imp_use_as.search(file_text): # look for import useful as ...
        syn = imp_use_as.findall(file_text)
  

Re: Better way to do parsing?

2005-05-04 Thread [EMAIL PROTECTED]
Wouldn't it be a better idea to use the 'imp' module to restrict
what the import statement is allowed to do?
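
One way to picture restricting the import statement itself is sketched
below. It does not use the 'imp' module (which is deprecated and was
removed in Python 3.12); it swaps in a different technique, namely
running the user's code with a replacement __import__ function. The
names ALLOWED_MODULES, make_useful, restricted_import and run_user_code
are hypothetical, as is the content of the stand-in module, and this is
an illustration only, not a secure sandbox.

```python
import types

ALLOWED_MODULES = {"useful"}  # hypothetical whitelist of importable names


def make_useful():
    """Build a stand-in 'useful' module (contents are made up for the demo)."""
    mod = types.ModuleType("useful")
    mod.turn_right = lambda: "turn_right called"
    return mod


def restricted_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Replacement for __import__ that rejects everything but the whitelist."""
    if name not in ALLOWED_MODULES:
        raise ImportError("import of %r is not allowed here" % name)
    return make_useful()


def run_user_code(source):
    """Run source with a builtins dict that only contains our __import__."""
    env = {"__builtins__": {"__import__": restricted_import}}
    exec(source, env)
    return env
```

With this in place, run_user_code("import useful as u") binds u to the
stand-in module, while run_user_code("import os") raises ImportError.
All five import forms go through the same hook, so no per-form pattern
is needed.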

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Better way to do parsing?

2005-05-04 Thread Bengt Richter
On Wed, 04 May 2005 07:13:22 -0300, André Roberge [EMAIL PROTECTED] wrote:

> Hi all,
>
> I posted the following on the python tutor list 3 days ago ... and
> haven't heard a peep from anyone - which is highly unusual on that list.
>
> [Apologies for the slightly longer post due to code
> with test cases included at the end.]
>
> I have created a severely restricted environment within which
> users can learn the basics of programming in Python.
>
> [note: eval, exec, chr, input, raw_input are not allowed.]
>
> Within that environment, I want to have the user test the five
> valid forms that an import statement can have, by attempting to
> import a fake module named "useful".  Other import
> statements are disallowed.
>
> 1. import useful
> 2. from useful import *
> 3. from useful import valid_function1 [, valid_function2, ...]
> 4. from useful import valid_function as better_named_function
> 5. import useful as not_so_useful_after_all
>
> As far as I can tell, the following works, but it looks rather
> clunky to me.  My *very limited* experience with the
> re module may have something to do with this.
>
> Any suggestion would be most welcome.

I don't in any way want to discourage your enthusiastic pursuit of
your goal, but I suspect you might have more fun with another approach,
unless you really want to learn about the limitations of regexes first ;-)
(Don't get me wrong, regexes are great for what they do best).

I.e., if you want to parse Python, there are modules that will help you
a lot more than just re will. If you want to validate source
according to your own rules before compiling, you could walk the
AST and raise exceptions where your rules are violated. Or if you
want to emulate execution as you walk the tree, you can do that too.
Either way, this post of Michael Spencer's ought to give you food for thought:

http://mail.python.org/pipermail/python-list/2005-March/270760.html
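
As a rough illustration of the "walk the AST and raise exceptions"
idea, here is a minimal sketch. It assumes a current Python 3 and its
standard ast module, not whatever tools the linked post uses; the names
ALLOWED_MODULE, ImportRuleError, ImportChecker and check_source are
made up for the example.

```python
import ast

ALLOWED_MODULE = "useful"  # the only module user code may import


class ImportRuleError(Exception):
    """Raised when the source contains a disallowed import."""


class ImportChecker(ast.NodeVisitor):
    def visit_Import(self, node):
        # Covers "import useful" and "import useful as alias".
        for alias in node.names:
            if alias.name != ALLOWED_MODULE:
                raise ImportRuleError(
                    "line %d: import of %r is not allowed"
                    % (node.lineno, alias.name))

    def visit_ImportFrom(self, node):
        # Covers "from useful import *", "... import a, b" and "... import a as b".
        # Relative imports (level > 0) are rejected as well.
        if node.module != ALLOWED_MODULE or node.level != 0:
            raise ImportRuleError(
                "line %d: 'from %s import ...' is not allowed"
                % (node.lineno, node.module))


def check_source(source):
    """Parse source and raise ImportRuleError on the first bad import."""
    ImportChecker().visit(ast.parse(source))
    return True
```

Because the parser has already separated statements from strings and
comments, the word "import" inside a string literal or a comment does
not trigger the check, which is hard to guarantee with a plain
\bimport\b pattern. Checking the imported names against the list of
valid functions could be added by inspecting node.names in
visit_ImportFrom.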

Also, for more general parsing (not just Python syntax), the pyparsing
program referenced in the follow-on post looks very nice, though I have not
tried it.

http://pyparsing.sourceforge.net/


Regards,
Bengt Richter