[I am only quoting the parts of the previous thread that seemed relevant to the points I am responding to. See the forum archives if you need the full text.]
On Thu, Feb 9, 2012 at 10:53 AM, Viktor Cerovski <[email protected]> wrote:

> For programs that don't use append, the very existence of append
> and its properties is of course totally irrelevant.

Ok. I think this means that we have some significant "implicit scope"
constraints on the conversation which I was not aware of.

>>>> viktor=:4 :0
>>>> R=. x,y
>>>> if. (1-:#R) *. 0 e. x ,&(#@$) y do. {.R end.
>>>> )
>>>>
>>>> ...and let's say that I use this 'viktor' operation in place of ',' --
>>>> has the definition of "Array" changed?
>>>
>>> Array hasn't changed, but append *has* changed to accommodate
>>> the exception.
>>
>> No, append has not changed. Not unless we define append as something
>> like "whatever operation we happen to be considering". The existence
>> of (viktor) does not replace the existence of (,).
>>
> append is supposed to add one extra property to , without breaking
> any of the current properties of , . It's a tiny bit more general than ,

Note that a current property of (,) is that its result always has a
rank greater than zero.

(append) added an exception: when one argument is rank zero and the
other argument is rank one and length zero.

(viktor) added a different exception: when one argument is rank zero
and the other argument is empty on its leading dimension, do not
include that [initially empty] dimension in the result. I thought that
this was more self-consistent than adding a special case for rank 1
arguments.

>>> Equivalently,
>>>
>>> p =: *.&(''&-:) $@]
>>> append =: ,`[@.(p~)`]@.p
>>
>>    3 (append -: viktor) i. i. 3
>> 0
>>
>>> (x p y is the predicate: is x a scalar and y an empty array?)
>>>
>>>> Is your argument that , is a language primitive where 'viktor' is not
>>>> a language primitive? If so, does that mean that
>>>>
>>>>    i.2 2
>>>> 0 1
>>>> 2 3
>>>>
>>>> is not a matrix, because +/ .* is not a primitive?
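Stepping back to the (,) / (append) / (viktor) distinction above: a rough Python model, restricted to scalars and rank-1 lists (all names here are mine, not J's, and this is only a sketch of the shape behavior being debated):

```python
# Scalars are bare numbers; rank-1 arrays are Python lists.

def j_comma(x, y):
    """Model of J's , in this restricted setting:
    the result is always a list (rank >= 1)."""
    xs = x if isinstance(x, list) else [x]
    ys = y if isinstance(y, list) else [y]
    return xs + ys

def viktor(x, y):
    """The 'viktor' variant: if the result has exactly one item and
    one argument was a scalar, return that item as a scalar --
    i.e. drop the initially empty leading dimension."""
    r = j_comma(x, y)
    a_scalar = not isinstance(x, list) or not isinstance(y, list)
    return r[0] if len(r) == 1 and a_scalar else r

print(j_comma(3, []))  # [3] -- rank 1; the empty dimension survives
print(viktor(3, []))   # 3   -- the scalar itself, rank 0
print(viktor(2, [5]))  # [2, 5] -- agrees with , in the ordinary case
```

The two definitions agree everywhere except the scalar-with-empty case, which is exactly the exception under discussion.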
>>>
>>> I think that if , were defined as you suggest or in a similar way
>>> we would have had a somewhat better definition of append, more
>>> attuned to the 'everything is array' statement.
>>
>> But why?
>>
>> We can have all three definitions -- the language allows us to do that.
>
> It's about what , as a primitive operation of the language should do.

"should" is a deep topic, especially in the context of language
design. Issues include: simplicity of implementation, simplicity of
documentation, relevance to application contexts, and patterns
involving other parts of the language.

>>> Well, the problem is that the Jarr type, together with operations + and ,
>>> is isomorphic to (which roughly means "the same as" or "equivalent to"
>>> or "structurally the same as") the following type:
>>>
>>> Jarr' = int + [int]*[int]
>>
>> Or we could say:
>>
>> Jarr' = int + [int] + [[int]] + [[[int]]] + ...
>>
>> with some constraints on the sizes of the underlying sublists.
>>
>> But I see no advantage to mixing explicit shape with implicit shape
>> for different instances of the same type.
>>
> You have to specify a type in such a way that I know how to construct
> a value of that type. More importantly, a type has to be defined so
> that the computer knows how to construct a value of the type. So, the
> last definition is faulty insofar as it does not contain the shape
> information in it.
>
> Here is one in the same spirit:
>
> Jarr'1 = int + int*[int] + int*int*[int] + int*int*int*[int] + ...

Ok, but even if [int] were infinite the spirit here is different.
Here is a variant that constrains lists to a fixed size:

Jarr'1 = int + int*[int] + int*int*[[int]] + int*int*int*[[[int]]] + ...

If you have three constraints on list size here, you have three list
contexts that need to be constrained. This is consistent with
eliminating the list structure in the context where you have only a
single item.

As for constraints: the big constraint is going to be available
memory.
We also have a constraint on structure -- each dimension has to have a
length that fits into an int. The final constraint -- that the number
of dimensions also has to fit into an int -- is not represented here,
but I think that that is because the notation does not give us a good
way of expressing it.

And, if I understood how algebraic data types store the "which option
are we picking" information, I think that I could make that problem go
away. In Jarr', how do we store the information which lets us decide
whether a scalar or a non-scalar is present?

Note also that if lists are infinite in size, by default, then Jarr
was also incorrect. Even if we constrained the contained content to
only be int, it would need to be:

Jarr = int*[int]*[int]

int * _ * _ is rank

_ * [int] * _ is shape, and the length of the shape list is the rank
of the array

_ * _ * [int] is data, and the length of the data list is the product
of the shape list

In previous proposals, we do not know how many elements are in the
shape list.

> Now the issue here is that of termination (i.e. what's "..."),
> so we could end Jarr'1 with some finite number of int*int*..*int
> (the number would be the maximal allowed rank of an array in
> the corresponding implementation of J int arrays) or define some
> alternative way to have it extend indefinitely.

This difficulty seems to be related to the absence of a rank
declaration in early formulations of Jarr. Also, I do not know how the
+ alternation syntax for algebraic data types can represent the choice
of which alternate is being used.

> That alternative way is of course present in both Jarr and Jarr'
> due to shape being a list.

Except that that was not sufficient, because [int] is infinite, so
shape was infinite and its product was not available in the general
case.
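The invariants tying the three parts of Jarr = int*[int]*[int] together can be sketched as a small Python check (the triple layout and the function name are my own stand-ins):

```python
from math import prod

def valid_jarr(rank, shape, data):
    """Check the structural constraints on a (rank, shape, data)
    triple: rank is the length of the shape list, and the data
    length is the product of the shape list."""
    return (rank == len(shape)
            and prod(shape) == len(data))

print(valid_jarr(2, [2, 3], list(range(6))))  # True: a 2x3 array
print(valid_jarr(0, [], [7]))                 # True: the scalar 7
                                              # (an empty product is 1)
print(valid_jarr(1, [0], []))                 # True: an empty list
print(valid_jarr(1, [2], [7]))                # False: shape/data mismatch
```

Note that the scalar case works only because the product over an empty shape is 1, which is exactly why the explicit rank component matters: without it, nothing distinguishes an absent dimension from an unstated one.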
>>> where the list representing rank is now further constrained to
>>> *non-empty* ranks, while operations + and , now work as:
>>>
>>> 2 + 5 = 7
>>> 2 , 5 = [2]*[2,5]
>>> 2 , [1]*[5] = [3]*[2,1,5]
>>> etc.
>>
>> Here 7 is equivalent to []*[7]
>>
> It's equivalent in the sense of two structures being isomorphic,
> but when we get to programming, we have to choose between
> the two (or have them both, when they would be independent
> one from another).

And, since [] indicates an infinite list, []*[7] was inadequate; we
would need 0*[]*[7]

> Whichever way, when we have scalars represented via Jarr',
> 2 + 5 is 7, not []*[7]. That is in a nutshell the main advantage
> of Jarr' over Jarr -- it can specialize verbs over numbers!

Then would it not also be an advantage to specialize rank 1 arrays
(such as character strings) from rank 2 arrays?

> Let me give you one real-world example:
> I'm working, as spare time permits, on a LISP implementation of
> J, which I call J. (with the dot after J).
>
> This implementation does lots of type-foo a la Jarr', and is sometimes
> able to get through and generate and execute fast code on the spot.
> Here is an example:
>
> In J:
>
>    (+%)/ %:i.1000000
> 0.652664
>    ts'(+%)/ %:i.1000000'
> 1.67805e7 0.885019
>
> In the J. implementation of J:
>
>    (+%)/ %:i.100000.0
> Evaluation took:
>   0.139 seconds of real time
>   0.130000 seconds of total run time (0.110000 user, 0.020000 system)
>   93.53% CPU
>   164 lambdas converted
>   323,250,718 processor cycles
>   24,330,976 bytes consed
>
> 0.6526637623347914
>
> So the J. implementation somehow manages to compile 164 lambdas
> and still compute the result ~6x faster than J, with a somewhat
> larger memory footprint (~24MB in J. and ~16MB in J). No special
> code has been used.

Except if we consider a computed lambda to be special code. But I
understand your point here: that this is somewhat generic special
code.
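The scalar-specialization advantage claimed for Jarr' above can be sketched in Python (the representation and names are my own stand-ins, not anything from J or J.): scalars stay unboxed, so an operation on two scalars never touches shape bookkeeping.

```python
def box(v):
    """Lift a bare scalar into the (shape, data) array form."""
    return ((), [v]) if isinstance(v, (int, float)) else v

def plus(x, y):
    # Jarr'-style dispatch: two bare scalars take the fast path.
    if isinstance(x, (int, float)) and isinstance(y, (int, float)):
        return x + y                    # 2 + 5 is 7, not ()*[7]
    # Otherwise fall back to the general (shape, data) form.
    (xs, xd), (ys, yd) = box(x), box(y)
    assert xs == ys, "this sketch only handles matching shapes"
    return (xs, [a + b for a, b in zip(xd, yd)])

print(plus(2, 5))                              # 7
print(plus(((2,), [1, 2]), ((2,), [10, 20])))  # ((2,), [11, 22])
```

A compiler working over such a sum type can emit the raw-number branch directly when it can prove both arguments are scalars, which is presumably the kind of specialization J. exploits.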
Anyways, I expect that the underlying issue with special code remains:
all code paths need to pay the costs which support the special code.
[The choice of what special code to include depends somewhat on how
useful it seems likely to be.]

> Detailed analysis would be too involved, and the speed is not exclusively
> because of the Jarr'-like representation, and I haven't yet studied
> Roger's code (I will eventually, of course), but J. so far looks very
> promising.
>
> I don't want to give the false impression that J. is always so
> fast---in fact it is not, and for some quite standard J expressions it
> can be like 15x slower than J, so there are still a few things to
> figure out. I'll post further info on the list about the J. project
> when it reaches some kind of alpha stage.

This does sound interesting. And, in a lisp implementation, I would be
tempted to factor things into "scalar" and "list" cases. But this is
because of the structure of lisp, not because of the structure of J.

>>>>>> But a bigger problem is that we don't really have a good definition of
>>>>>> "type". And by "good" I mean:
>>>>>>
>>>>>> a. concise,
>>>>>> b. accurate, and
>>>>>> c. consistent.
>>>>>>
>>>>> I don't know what's wrong with types as we use them today,
>>>>> either in the simplest cases or in some deep theoretical sense.
>>>>
>>>> That sounds excellent -- but what is this definition for 'type'?
>>>>
>>> We gather some things together and give them a common name:
>>> like the numbers 1,2,3,4,5 and call them int1to5, or True and False
>>> and call them Bool, or all 32 bit machine integers and call them Int32,
>>> or all possible lists of ints, including the empty list, and call
>>> them [int], etc.
>>>
>>> There also have to be some operations available and/or definable
>>> in order to make programs, and it is often that types are chosen
>>> according to the operations that are to be done with values of a
>>> given type.
>>
>> Ok...
>> I will try to keep this definition in mind the next time I see
>> statements that make assertions based on concepts of "type", to see if
>> the statements still make sense. Thank you.
>>
>> But I often see "type" used as almost equivalent to "domain" and
>> "range" -- for example, the distinction between "static" and "dynamic"
>> type systems seems to imply that in a "static type system" the domain
>> of all functions must exactly match the underlying type. I am not
>> sure how I could make sense of that kind of thinking with your
>> definition.
>>
> In a first approximation, we can think functionally: every operation
> is a function that maps its domain to its codomain. Domain and
> codomain are types. Dynamically typed languages give the impression
> that there are one or a few types that are sufficient for everything,
> so instead of thinking "what's the type that's suited for my problem",
> I use the provided types for representation, taking them as givens
> that I'm accustomed to using.

Your definition of "type" essentially treats all types as
enumerations. This seems like a viable approach, but I think it
ignores any internal structure in any implementation of the type. You
can, hypothetically speaking, use "functions on that type" to deal
with this issue, but that leads us back into blurry territory
(especially where the same value, in the context of the functions
being used, has different "enumerations").

> Well, a single type or a few types are always sufficient. One can
> program anything in sed, using streams of strings. Period.

So, yes, we need some criteria besides "sufficient" if we are to
distinguish between implementations.

> However, explicit typing allows us to: tune programs by fiddling with
> data representations (here I want an array, not a list; a single
> float, not a double);

This can also be done through a choice of functions. And, if types are
totally opaque, functions are the only entities which represent type
structure.
> make programs where the compiler proves things for us, like I want to
> be sure that all functions of my module generate only non-empty
> strings; have conformance of the input and output of a function to
> predefined types, so that the program does not have to check at run
> time whether an empty string or some malformed input has been fed to
> a function; have purely functional input-output, etc.

Agreed -- type specialization in the context of the relevant ranges of
values for an application can be a big win.

> You know the joke: if a program compiles in Haskell, then it's correct.

And, the other joke: "I have only proved it correct, not tried it."

> Of course, one can go blind in Haskell figuring out why a particular
> f o g doesn't compile given the types of f and g and the compiler
> error message, just as well as in J, figuring out why u"1 _1 doesn't
> give the result one wants and expects given some input arrays, or get
> conjunctivitis doing tacit J.

Personally, I know how to debug J programs. Basically, there are four
stages in J program execution:

0. starting point -- here I need to understand the data
1. parsing -- here I need to understand the grammar (including adverbs
   and conjunctions)
2. execution -- here I need to understand the verbs
3. finish -- here I need to understand my expected results

0 and 3 are the hardest. 1 and 2 are mixed in explicit code, and
sometimes I need to use breakpoints and single stepping to find the
relevant piece of explicit code.

Once I have isolated a problem down to a specific sentence (where the
starting point is good and the finish is bad) I can start eliminating
final stages of the process, and I can inspect intermediate forms of
the data. When I find something that does not make sense, I need to
investigate that thing until I understand it (which lets me determine
whether it was correct, in the "tried it" sense).
>>>> Hypothetically, I could see a J array being: T*[int]*[atom] where T
>>>> matches the type of the atoms and product([int]) matches the number
>>>> of atoms.
>>>
>>> Sure thing (just the type would be more like T*[int]*[T], and my
>>> previous example with Jarr' would now become T*(T+[int]*[T]).)
>>
>> And, of course: T*[int]*[T] < T*(T+[int]*[T])
>>
>> But if we ignore the higher ranked arrays, your variant simplifies to
>> T*T which seems wrong.
>>
> You're right, I made a mistake. We can parametrize Jarr or Jarr'
> over some type T simply as:
>
> Jarr T = [int]*[T]
> Jarr' T = T + [int]*[T]
>
> to cover Jarr float, Jarr char, etc...

I think that this needs to be updated to represent rank, given that
[int] is infinite in length.

>>>> But the underlying values here seem to be near-infinities, and
>>>> I am not certain how I could work with them. Or, if I take their base
>>>> 2 log, they seem to be a [relatively bland] restatement of the numbers
>>>> used by memory management.
>>>>
>>> I'm not sure what you mean: when we define a type, we haven't really
>>> constructed a single instance of the type, just constructively
>>> described a (finite or potentially infinite) collection of values
>>> that belong to the type.
>>
>> I mean that the algebraic data type system seems to want to use
>> extremely large numbers when describing otherwise practical problems
>> (remembering that if 'int' represents a 32 bit integer value, then
>> 'int' is 2^32 -- coincidentally a value too large to be represented
>> directly using the 'int' data type).
>>
> Still don't get when/where consideration of such large numbers
> is necessary within ADT design? There are 2^32 ints, but int itself
> is not a value but a type, and int stands for any one particular int,
> not all of them at once.

Algebraic data types are based on the idea that types can be
represented as values. The value of a type is the number of distinct
cases represented by that type.
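That counting reading can be made concrete (the helper below is my own illustration): alternation adds cardinalities, pairing multiplies them, and a list type [A] is the infinite sum 1 + |A| + |A|^2 + ..., truncated here to a finite maximum length.

```python
# Counting interpretation of ADTs: the "value" of a type is how many
# distinct values inhabit it.

INT = 2**32                 # a 32-bit int type has 2^32 inhabitants

def lists_up_to(t, n):
    """|[t]| restricted to lists of length at most n -- a finite
    stand-in for the infinite list type [t]."""
    return sum(t**k for k in range(n + 1))

print(lists_up_to(INT, 0))  # 1: only the empty list
print(lists_up_to(INT, 2))  # 1 + 2^32 + 2^64: empty, one-int,
                            # and two-int lists
print(INT + INT * INT)      # |int + int*int|: sums add,
                            # products multiply
```

This is also where the "extremely large numbers" impression comes from: the counts themselves are astronomical, but nothing ever needs to enumerate them.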
It is this property which makes them algebraic, and this is the basis
for talking about (for example) the derivative of a type.

> Note also that the type of lists of ints, [int], describes infinitely
> many values: one empty list, 2^32 lists of one int, 2^32*2^32 lists of
> two ints, ad infinitum, but of course we don't have to count them, let
> alone construct them all, to do ADT.

That is only because we avoid realizing any case where the list is
actually infinite.

-- 
Raul

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
