Re: new FAQs

chromatic Wed, 23 May 2007 19:10:54 -0700

On Wednesday 23 May 2007 18:06:38 Will Coleda wrote:

> I confess to not grasping the point you claim is simple.  As you
> understand it, what is there about a register based machine, as
> opposed to a stack based machine, that specifically improves the
> performance of operating on dynamically typed data, without regard to
> performance differences between the two architectures that are
> independent of typing models?


Parrot has four types of registers, for integers, floating point numbers, 
strings, and PMCs.

With hot-spot operations as you might find in number crunching code, Parrot 
and the underlying CPU may be able to operate on the values in registers 
almost directly (that is, load from Parrot register into CPU register, store 
back) without having to marshal and unmarshal all of the Parrot arguments 
for a section of code into and out of a single stack.

Without benchmarks, however, it's difficult to give any specific performance 
guidelines.

For a point of comparison, however, Perl 5 spends a lot of time in managing 
the argument stack--a significant amount of time for otherwise small and fast 
operations.

> It sounds like you are saying that languages are free to implement
> their own semantics using their own code, and that they can choose not
> to interoperate with predefined Parrot types or types from other
> languages when that would negatively impact their goals, such as
> performance. While that rings true, it seems that Parrot is not
> providing that ability -- languages can already implement whatever
> they want without Parrot.  And if languages are free to ignore
> predefined and foreign types, then what benefit will they actually get
> from Parrot?

 - better compiler tools than lex and yacc
 - native support for plenty of dynamic language features, if they want them
 - whatever portability and maturity Parrot has at the time
 - interoperability to whatever degree they allow

> Moreover, this does not address my initial question.  I am asking, to
> rephrase it bluntly, "If Parrot makes dynamic typing faster, doesn't
> that have to make static typing slower?"  That is, is Parrot making a
> tradeoff here?

Of course.  We can't do the C++ trick of compiling class attribute access into 
slot offset access.  The world doesn't cool to near absolute zero when 
compile time ends.  Someday, someone may want to add an attribute to a class 
at runtime.
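
As a minimal sketch of why fixed slot offsets don't survive in a dynamic 
language, here is an illustration in Python, which stands in for any dynamic 
language.  The class and attribute names are hypothetical, and none of this is 
Parrot code.

```python
class Point:
    """A class whose attribute set is not frozen when compilation ends."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# Long after "compile time", someone adds an attribute to the class.
# Existing instances see it immediately, so attribute access has to be
# resolved by name at runtime rather than through a precomputed slot offset.
Point.color = "red"

print(p.color)
```

A compiler that had already turned every attribute access into "load the word 
at offset N" would have nowhere to put the new attribute.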

> If it is, how large is the tradeoff and what is its 
> nature?

No benchmark; won't speculate.

> If it is not, then why doesn't everyone else simply do what 
> you are doing and gain the same benefit?

Some static compilers and environments do use runtime profiling to identify 
optimizations that static analysis was unable to discover.

> It would seem that Parrot either has to be different from the JVM and
> CLR due to design or implementation optimizations that favor a
> specific typing model over others -- which is what it seems to claim --
> or else it does not -- either it is not thus differently designed, or
> it is not thus differently implemented.  If it does not, then it seems
> inappropriate for it to make the claim -- and thus would raise the
> question of why Parrot should be considered a superior target for
> dynamically or statically typed language compilers.

Snarkily, it's better than the JVM because it actually supports features of 
dynamic languages natively without forcing all dynamic languages built on it 
to invent everything besides "look up named method at runtime".

It's better than the CLR because the Parrot copyright holders don't have the 
plans or (as far as I know) the ability to bring patent suits against anyone 
who uses Parrot.  Also our definition of portability is more than "Both 
Windows XP AND Vista".

> What tradeoffs could Parrot be making that will have a significant
> benefit for dynamically typed languages -- significant enough to
> justify the creation of Parrot itself -- without significant detriment
> to statically typed languages?  Again, if these tradeoffs are so
> broadly beneficial, why would the JVM or CLR not simply implement them
> themselves?

> Most simply: What is being lost to gain whatever is being gained?

Structural typing and related optimizations.

As for gains, it's not difficult for me to imagine that Parrot could support 
something like Smalltalk's browser.

> If Parrot is designed for the benefit of dynamically typed languages, how
> will Parrot handle statically typed code in those languages?

I don't understand this question.  Are they statically typed or dynamically 
typed?

Now if you're asking whether, as with some Lisp implementations, you can give 
optional type hints to the compiler so that it can perform optimizations, 
then that's indeed a feature of Perl 6 and Parrot will support it somehow.

We haven't reached the point of worrying about optimizations, however.

> Will Parrot discourage the use of static typing features in languages like
> Perl by making that code execute more slowly or inefficiently than
> equivalent dynamically typed code?

No benchmark; won't speculate.

>  > > 2. General Features
>  > >
>  > > a. How will Parrot support reflection and attributes?
>  > >
>  > > b. How will Parrot support generic types?
>  > >
>  > > c. How will Parrot support interface types?
>  > >
>  > > d. What kind of security models will Parrot support?
>  > >
>  > > e. How will Parrot support small-footprint systems?
>  >
>  > Perhaps miniparrot can help take care of this.  If miniparrot's a
>  > miniature parrot, and perhaps supporting only those features that

> While many things are perhaps true, this answer sounds like "There is
> no definite plan for supporting this."

More charitably you might say "There are no concrete designs for these systems 
yet."

>  > > f. How will Parrot support direct access to "unmanaged" resources?

>  > Is this like UnmanagedStruct?

> I mean supporting direct access to the underlying address space and
> support for determining the sizes of data within that memory.  For
> example, direct access to a framebuffer.

This is UnManagedStruct.

>  > > g. How will Parrot facilitate distributed processing?
>  >
>  > With native threading support.
>
> I think you misunderstood my question.  By "distributed", I meant the
> execution of code in multiple address spaces, or the non-concurrent
> execution of code.  What support will Parrot provide for migrating
> data or code between environments with different byte orders?  How will
> Parrot support capturing execution state into a preservable or
> transportable form?

Unspecified.

> Again, this does not seem to be clear, so I will provide an
> example. If a Perl compiler is compiling Perl code, and that code is
> written to increment the result of a call into some Python code that
> returns a PythonString, how can the compiler ask the PythonString PMC
> if it implements the "increment", so that it can detect at compile
> time what the behavior of the statement will be?

I don't believe it would do so at compile time.  Neither Python nor Perl does 
this, to my knowledge.

> More broadly, how can statically typed code determine if the values
> produced by an operation will conform to the type requirements?

I suppose it would perform type analysis on some sort of abstract tree 
structure before generating Parrot bytecode.
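
A rough sketch of what that could look like, in Python.  The node classes 
(Const, Add), the type names ("int", "num"), and infer_type are all 
hypothetical and invented for illustration; this is not how any existing 
Parrot compiler is written.

```python
from dataclasses import dataclass


@dataclass
class Const:
    value: object


@dataclass
class Add:
    left: object
    right: object


def infer_type(node):
    """Return 'int' or 'num' for a tree node, or raise TypeError."""
    if isinstance(node, Const):
        if isinstance(node.value, int):
            return "int"
        if isinstance(node.value, float):
            return "num"
        raise TypeError(f"unsupported constant: {node.value!r}")
    if isinstance(node, Add):
        left = infer_type(node.left)
        right = infer_type(node.right)
        # int + int stays int; any float operand widens the result to num.
        return "int" if left == right == "int" else "num"
    raise TypeError(f"unknown node: {node!r}")


print(infer_type(Add(Const(1), Const(2))))
print(infer_type(Add(Const(1), Const(2.5))))
```

The compiler runs a pass like this over its own tree and only emits bytecode 
once the types check out; nothing here requires help from the virtual machine.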

> What are "basic things"?

Primitive handling operations.
Aggregate access.
Method lookup and invocation.

> What if a language inherently differs in how 
> it handles those things?

Then their semantics differ, and you can only rely on their documented 
interfaces for appropriate behavior, the same way that you can only rely on 
the documented interface of objects outside of your control.

> For example, incrementing a scalar would 
> seem to be a basic operation in Perl, but Python will not implement
> that basic thing in the same way.  It would seem that one or both
> sides of this cross-language exchange of very basic types of data will
> be problematic.

I don't see why.  Certainly you have to be cognizant if you're crossing a 
language boundary, but all of the cross-language code I've written so far has 
worked without problems.

> You say "the best way for parrot" -- how can Parrot have a judgmental
> reference point independent from the languages that target it and the
> users of those languages?

I don't understand the question.  The point of view of Parrot is "What 
behavior does Parrot need to support to be a good host for dynamic 
languages?"  That's not at all independent of target languages.

> You say "No" initially, but then go on to say "yes" in substance.  If
> the PMCs are responsible for this, and if languages provide the PMCs,
> then the languages are responsible for this.
>
> To explicitly state what is implied by this question.  If every
> language must provide PMCs that understand how to interact with types
> of other languages, then languages will only be able to interact with
> each other to the degree that one or both of those languages provides
> support.  For Perl to use data returned from Python code, either Perl
> will have to recognize Python types or Python will have to know to
> produce Perl types.  Then for Perl to call Tcl code, Perl and/or Tcl
> will have to be taught about each other.  And then for Python to call
> Tcl, yet additional code will need to be created.  Indeed, it could be
> necessary for Python code to call Perl code that calls Tcl code,
> because Perl might understand how to handle a Tcl type that Python
> does not.  And the more languages that are added, the more types each
> language will be asked to implement code to interact with.
>
> This seems like a scalability problem.

Thus, vtables.

All string-like PMCs implement a set of string vtable entries.  All access to 
string data goes through those vtables.

A GroovyString PMC that performs the "clear string" operation when code 
invokes its "concatenate" vtable entry is buggy in the same way that it would 
be if it launched Nethack instead.

For a language to interoperate with other languages through its PMCs, those 
PMCs must adhere to the semantics of the appropriate PMC vtable interfaces.
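
To make the scaling argument concrete, here is a small Python sketch of the 
idea.  Every name in it (PMC, StringVTable, PERLISH, PYTHONISH, concat) is 
hypothetical; Parrot's real vtables are C structures, and this only models the 
shape of the contract.

```python
class PMC:
    """An object carrying its data plus a reference to its type's vtable."""

    def __init__(self, vtable, data):
        self.vtable = vtable  # one vtable object shared by every PMC of a type
        self.data = data


class StringVTable:
    """Entries every string-like type agrees to provide."""

    def get_string(self, obj):
        raise NotImplementedError

    def concatenate(self, obj, other):
        raise NotImplementedError


class PerlishStringVTable(StringVTable):
    # Stores its data as a plain str.
    def get_string(self, obj):
        return obj.data

    def concatenate(self, obj, other):
        return PMC(obj.vtable, obj.data + other.vtable.get_string(other))


class PythonishStringVTable(StringVTable):
    # Stores its data as a list of characters, just to be different.
    def get_string(self, obj):
        return "".join(obj.data)

    def concatenate(self, obj, other):
        return PMC(obj.vtable, obj.data + list(other.vtable.get_string(other)))


PERLISH = PerlishStringVTable()
PYTHONISH = PythonishStringVTable()


def concat(a, b):
    """Generic operation: goes through a's vtable, never inspects b's type."""
    return a.vtable.concatenate(a, b)


a = PMC(PERLISH, "foo")
b = PMC(PYTHONISH, list("bar"))
result = concat(a, b)
print(result.vtable.get_string(result))
```

Neither string type knows the other exists.  Each one is written against the 
shared interface once, so adding a third language means one more vtable 
implementation rather than one more adapter per existing language.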

> This would mean that any cross-language code could generate runtime
> exceptions in operations that otherwise are generally considered not to
> be able to fail.  Indeed, it would seem that every possible operation
> would possibly fail at runtime when handling foreign data.

I think you're overstating the case.

> This would seem to strongly discourage multi-language programming --
> to the point of it never happening.

Ditto.

> What will Parrot do to make this acceptable?  Will end-users be forced
> to write their own test cases that attempt all valid combinations of
> all data between all languages they wish to use?

I don't see it as such a problem.  I can imagine that Parrot could detect 
cross-language calls and morph variables appropriately, but I have some 
amount of confidence that the vtable interfaces are well-chosen.


> Now, this was not the best of examples in the first place, because I
> would not argue that 'ToString' is not the kind of really-useful thing
> you want in a core data type.  The essential meaning of the routine
> being "make something a human can read" -- and humans are the people
> using the machines.  But, as you can see, there was no need for the
> core data type to provide me with an implemented 'addValue' -- it can
> simply be layered on using a more primitive and extensible runtime
> support for properties.

Of course, and we could suggest in a very JVM-ish way that all dynamic 
languages should roll their own string concatenation operators (hey, just two 
string fetches, a splice, and a store!).

That's not Parrot's goal, however.

> I don't see the simplicity or the speed benefit.  I do see the memory
> cost.  If anything, I suspect that these larger objects will fill a
> CPU cache faster and be slower to load because of this increased size,
> leading to slower runtime performance.

No benchmark; won't speculate beyond saying that the object structure only 
needs a pointer to a vtable shared by all entities of that PMC type.

> No, I mean why is the type-specific functionality not pushed down into
> the next tier where it is actually needed, like the JVM and CTS do,
> leaving the base PMC with only the same four or five methods those
> systems have?

I don't understand this question.

> Without opening a can of bees, this sounds like Parrot's performance
> will vary greatly, depending on the quantity of variables in scope in
> subroutines.

Indeed, in the same way that C's performance (especially during compilation) 
can vary greatly, depending on the quantity of variables in scope in 
subroutines.

> While it is generally true for most languages that a 
> large number of variables can trigger load/store operations when the
> register capacity is exceeded, Parrot will switch from JIT code to
> purely interpreted code?

Perhaps.  Perhaps not.  Unimplemented.

> While most people don't worry about 
> incurring a few load/store operations, this kind of variation may
> cause programmers to alter their programming style significantly in
> order to avoid unacceptable performance.

I doubt that.  By far most of the programs I've ever written (let alone seen) 
spend more time I/O-bound than CPU-bound.

> As you say, i386 has fewer registers, but it is a very common
> platform.  Given that, many programmers may consider it necessary to
> write code that will be JIT-able on that platform, leading to a rather
> awkward programming style, encouraging the use of a larger number of
> subroutines, thus more calling, and ultimately a lot of register
> shuffling anyway.

I consider the use of smaller subroutines to offer orders of magnitude in 
benefit with regard to maintainable code, far more than any potential gain or 
loss of performance.

I suspect that line of thinking is prevalent among many other users of dynamic 
languages.

> When I asked this question, I thought I was asking if the compiler
> could suggest which variables should map to registers and which ones
> should be loaded/stored.  But it seems this is a question of which
> subroutines will use registers at all.  In that case, I wonder what
> mechanisms Parrot will provide to inform a compiler how JIT-able a
> subroutine is -- both on the current platform and on other
> architectures -- to enable the compiler to know when it would make
> sense to either automatically modify the code into JIT-able form, or
> to warn the developer.

Unspecified at the moment.

> Frankly, this is not much of an answer.  I am not asking if CISC
> architectures exist, but rather I am asking why you are choosing to
> create one.

Interoperability.

To some degree, optimization.

It's easier to optimize a reasonably simple, if specialized, operation than it 
is to optimize the same operation built out of many more, if simpler, 
components.

You can certainly have only one "add" opcode in a system, and that's fine.  It 
pops two arguments off of the stack, checks their types, and performs the 
right type of operation, deciding if it's an integral addition or a floating 
point addition and what type of result to provide.

Or you could have "add_int_int_int" (which takes two ints and produces an int) 
and "add_float_int_float" (which takes an int and a float and produces a 
float) and compile that code down to JITted instructions.

Then get rid of the stack and you can dispatch to those ops without going 
through the thunk.

Then use VM registers, and you always know how to map to and from the right 
CPU registers with the proper offsets into your call frame, and things might 
get even faster.

Then again, no benchmarks; shouldn't speculate.
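
For the sake of illustration only, the contrast between the two styles can be 
sketched in Python.  The op names follow the made-up spellings used above, and 
the Registers class and its I/N fields are assumptions for this sketch, not 
Parrot's actual opcodes or register layout.  It shows the structure of the 
dispatch and says nothing about speed.

```python
def add_generic(stack):
    """Stack-style 'add': pop two operands, inspect their types, push a result."""
    right = stack.pop()
    left = stack.pop()
    if isinstance(left, int) and isinstance(right, int):
        stack.append(left + right)                 # integral addition
    else:
        stack.append(float(left) + float(right))   # floating point addition


class Registers:
    """Separate typed register files: I for integers, N for floats."""

    def __init__(self, size=4):
        self.I = [0] * size
        self.N = [0.0] * size


def add_int_int_int(regs, dest, a, b):
    # Operand types are fixed by the opcode itself; no runtime type check.
    regs.I[dest] = regs.I[a] + regs.I[b]


def add_float_int_float(regs, dest, a, b):
    # Integer register a plus float register b, stored in float register dest.
    regs.N[dest] = regs.I[a] + regs.N[b]


stack = [2, 3]
add_generic(stack)
print(stack)

regs = Registers()
regs.I[0], regs.I[1] = 2, 3
regs.N[1] = 0.5
add_int_int_int(regs, 2, 0, 1)
add_float_int_float(regs, 0, 0, 1)
print(regs.I[2], regs.N[0])
```

In the typed version the compiler has already chosen the op, so the type test 
and the push/pop traffic disappear from the operation itself.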

> It is not sufficient to say that one can write the code.  How will
> Parrot inform an existing compiler that the new operation exists (or
> does not exist if the version of Parrot is older).

I don't really understand the question.

> Will compilers have to themselves be recompiled even if they do not use the 
> new operators?

I don't understand this question either.  Can you give an example?

> Also, this seems, as a design, to simply be a bag of operations.

So is Perl.  I find understanding Perl 5's design helpful in working on 
Parrot.

> Finally, I would like to add some additional questions.
>
> 2.h. Will Parrot support inline assembly language?

Doubtful.

> 2.i. Will Parrot support primitive types?

It already supports strings, floats, and integers.  Do you mean bits and 
chunks of memory x-bits wide?  There is support for that to some degree.

> 4.c. How will registers benefit PMCs (e.g. PerlScalar), which are not
> primitive types and cannot be stored in a hardware register?

I don't understand which type of register you mean when you first use the 
word.  Additionally, I don't understand the question either way I could 
interpret it.

-- c
