[Caja] Re: An Alpha Renamer that makes it safe to embed user code in generated code.

ihab . awad Wed, 09 Sep 2009 21:11:25 -0700


I have not looked in detail at the rules in AlphaRenamingRewriter or the
tests in AlphaRenamingTest.


Will do these next. Meanwhile, here are the remarks I have so far.


http://codereview.appspot.com/110096/diff/3023/3028
File src/com/google/caja/parser/quasiliteral/AlphaRenaming.java (right):

http://codereview.appspot.com/110096/diff/3023/3028#newcode48
Line 48: * that has no masking identifiers, and that only has free
variables within an
What is a "masking identifier"?

http://codereview.appspot.com/110096/diff/3023/3028#newcode65
Line 65: * non-synthetic identifiers is trivial.
You seem to be describing some process but I don't get the flow. Perhaps
a small worked example would help? Also see my questions below about the
API -- perhaps an example here would clarify?

http://codereview.appspot.com/110096/diff/3023/3028#newcode75
Line 75: *     used to generate new names for local variables.
Is this description correct? If so, I don't understand it.

http://codereview.appspot.com/110096/diff/3023/3028#newcode81
Line 81: Expression e, NameContext<String, ?> ctxt, Set<String>
freeSynthetics,
First of all, I would have expected "ctxt" and "freeSynthetics" to be
named, respectively, something like "renamableEntryPoints" and
"nonRenamableEntryPoints" -- since that seems to be the distinction of
interest for the purposes of this class. If that would not be a useful
nomenclature, then I misunderstand the use case; perhaps that should be
clarified in the class or method docs.

That said, and more broadly, I don't understand the need for the "ctxt"
argument. Perhaps other code (yet to be submitted for review) clarifies
this, but it would be nice if this class could explain itself all by its
lonesome. As I see it, we have some chunk of code like the following,
where the stuff in double brackets is subject to renaming:

  var a = 1, b;
  b = [[ (function(theArg) { return theArg + 1; })(a); ]]

where I expect "a" and "b" are not renameable, since they are provided
by the stuff in which the [[ ... ]] expression is embedded, but "theArg"
can be renamed to something.

If so, then all we need for the API of this class here is  just
"freeSynthetics" (aka "nonRenamableEntryPoints"), right? Or not? If not,
why?

And finally, if the variables in the embedding ("a" and "b" in my
example) are themselves being alpha renamed, then why does this class
not operate on something bigger than just the teeny little Expression?
Why not save all the alpha renaming until you have a whole chunk (like,
a Block or a Program or what not) and run the renaming on that? This
would presumably ensure that everything got renamed in one go and all
you'd need would be "freeSynthetics" (aka "nonRenamableEntryPoints").

http://codereview.appspot.com/110096/diff/3023/3028#newcode103
Line 103: // If the input NameContext contains (foo => a, bar => b) then
the program
This creates in my mind more confusion about the input NameContext; see
remarks above.

http://codereview.appspot.com/110096/diff/3023/3028#newcode138
Line 138: if (!outers.contains(name) && isOuter(name, s)) {
Don't need "!outers.contains(name)" since "outers" is a Set.

http://codereview.appspot.com/110096/diff/3023/3028#newcode164
Line 164: final class AlphaRenamingRewriter extends Rewriter {
It would be nice if this were in a separate file....

http://codereview.appspot.com/110096/diff/3023/3027
File src/com/google/caja/parser/quasiliteral/NameContext.java (right):

http://codereview.appspot.com/110096/diff/3023/3027#newcode32
Line 32: * scoped, so we alpha-rename all variables to prevent
collisions.
"... we alpha-rename all variables ..." seems overly general; do you
mean only the variables introduced by the IHTML?

http://codereview.appspot.com/110096/diff/3023/3027#newcode34
Line 34: * @param <NAME> a type that can work as a hashtable key
Also describe <BINDING> param.

http://codereview.appspot.com/110096/diff/3023/3027#newcode82
Line 82: public NameContext<NAME, BINDING> subScope() {
Rename this method to "newChildContext" or "makeChildContext"?
Especially to make it complementary to "getParentContext".

http://codereview.appspot.com/110096/diff/3023/3031
File src/com/google/caja/util/SafeIdentifierMaker.java (right):

http://codereview.appspot.com/110096/diff/3023/3031#newcode25
Line 25: * {...@link SafeIdentifierMaker#isSafeIdentifier unsafe}
The method "isSafeIdentifier" and the doc comment "unsafe" have reverse
Boolean sense. :)

http://codereview.appspot.com/110096/diff/3023/3031#newcode26
Line 26: * identifiers.
One idea might be to describe here why generating a lexicographic
sequence from a custom alphabet is better than simply doing (say) "v0",
"v1", "v2", ..., "v3941", .... I assume the reason is that, with a
larger alphabet (54 chars, as opposed to 10 for decimal and 16 for hex),
you can have smaller, "denser" names. If so, a comment about this might
help the reader get oriented as to the purpose for this class.

http://codereview.appspot.com/110096/diff/3023/3031#newcode38
Line 38: this(charRanges('a', 'z', 'A', 'Z'));
Given that a larger alphabet means "denser" and shorter identifers, it
behooves one to choose the largest alphabet possible. If you make
"isSafeIdentifier" reject ones starting with a digit, you can use
'0'..'9' as well, boosting your |alphabet| from 54 to 64. Perhaps a few
other ASCII characters could be similarly pressed into service?

http://codereview.appspot.com/110096/diff/3023/3031#newcode58
Line 58: if (isSafeIdentifier(word)) { return word; }
This algorithm of "generate guff and throw away the bad eggs" is very
good. :)

http://codereview.appspot.com/110096/diff/3023/3031#newcode70
Line 70: private static char[] charRanges(char... startsAndEnds) {
"assert((startsAndEnds %2) == 0)"? For documentation more than anything
else....

http://codereview.appspot.com/110096/diff/3023/3024
File tests/com/google/caja/parser/quasiliteral/AlphaRenamingTest.java
(right):

http://codereview.appspot.com/110096/diff/3023/3024#newcode303
Line 303: private void assertRenamed(String golden, String input,
String... globals)
Based on the rest of our tests, I'm used to the input coming first, then
the expected output. It was hard to read the tests until I reoriented
myself.

http://codereview.appspot.com/110096

[Caja] Re: An Alpha Renamer that makes it safe to embed user code in generated code.

Reply via email to