Hi Damian - Thanks for your response. I'll answer some specific questions below.
On 7/31/15, 6:50 PM, "Damian McGuckin" <[email protected]> wrote:

>There are a lot of issues here.
>
>On Fri, 31 Jul 2015, Michael Ferguson wrote:
>
>> I've found it useful to allow some Chapel arrays to be read without
>> knowing their size in advance. In particular, non-strided 1-D Chapel
>> arrays that have sole ownership over their domain could be read into
>> where that operation will resize the array to match the data read. I've
>> prototyped this for JSON and Chapel style textual array formats
>?
>> (e.g. [1,2,3,4] ).

JSON and Chapel array literals both use the square-bracket syntax. See
json.org if you're not familiar with JSON. I'd like the I/O system to be
able to read such arrays - in particular, arrays written in the same way as
a Chapel array literal.

If you're using a Chapel array literal, e.g.:

  var A = [1,2,3,4];

you don't have to specify the domain beforehand. So why should you have to
read a domain before the array contents when you are doing I/O?

>>
>> E.g.
>> var A:[1..0] int;
>
>... revisiting history here.
>
>This looks like what Algol68 called flexible array bounds. Note that I was
>not programming when Algol68 was released!! I do remember reading stuff
>about the issues when I was using this in the early 80s. It was so long ago
>that I do not have references, but unlike when I was younger, we now have
>access to Google. So fishing for
>
> algol68 flexible array
>
>Not that it says anything negative
>
> http://www.cs.virginia.edu/~mpw7t/cs655/pos2.html
>
>There are books by the guy who coined the term 'Software Engineering',
>Friedrich L. Bauer. He was one of the original Algol68 architects (who died
>only about 4 months ago). Both discuss this topic. One of the books is
>titled 'Compiler Construction: An Advanced Course', the other 'Algorithmic
>Language and Program Development'.
>
>From memory, 'flex' bounds were regarded as a big no-no by many people,
>even though those saying this agreed that there were lots of cases where
>having them would be nice and make for much cleaner algorithms.

What would be most useful about this bit of history is if we knew *why*
flexible array bounds were regarded as a no-no by many people. A lot has
changed in languages and their implementation since then, and it would be
easy to dismiss this as a problem that just needed a slightly better
solution.

>
>They were the basic mechanism behind Algol68's strings.
>
>> mychannel.read(A);
>>
>> could read into A any number of elements and adjust its domain
>> accordingly.
>> The alternative is that such an operation is an error.
>
>> So, I think that this kind of feature would be an improvement, but I'm
>> not sure everyone will agree. To start the discussion, I have four design
>> questions:
>
>> 1) Does changing the size of a read array when possible seem like the
>> right idea? Or should reading an array always insist that the input
>> has the same size as the existing array (which I believe is behavior
>> we are stuck with for arrays that share domains...)
>
>I always prefer consistency. Not sure whether that is a very valid reason.
>
>That said, and as Michael mentioned later, reading is a bit separate to
>how something is stored.

Both ways get us consistency with a different thing. I want consistency
between I/O operations and array literals in Chapel. I don't think we can
have that and also have it work for all array scenarios.
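To make that concrete, here's the kind of code I'd like to just work (a
sketch only - the file name and contents are made up, and the resize on
read in the middle is the prototyped/proposed behavior, not something you
can rely on today):

  // data.txt contains a single bracketed array, e.g.:  [1, 2, 3, 4]
  var f = open("data.txt", iomode.r);
  var ch = f.reader();

  var A: [1..0] int;   // 1-D array that solely owns its (empty) domain
  ch.read(A);          // proposed: resizes A to fit however many values appear
  writeln(A.domain);   // would print {1..4}
  writeln(A);          // would print 1 2 3 4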
However, I don't think making it an error to read an array of a different
shape when the domain is shared would be so bad... we are already doing that
kind of thing with the array-as-list operations like push_back. I think it's
worth the improvement in productivity, and I don't think we can convince
people who have used Python and the like that they can't have, e.g.,
myArray.push_back(1); just because it resizes the array.

Similarly, in the I/O scenario I'm working on, I'm reading JSON, which is an
existing format that we have no control over. JSON does not use a format for
arrays that starts with the array length, and all arrays are variable sized.

>
>> 2) Should any-dimensional rectangular arrays be written in binary in a
>> form that encodes the size of each dimension? (In other words, write
>> the domain first). Such a feature would make something like (1)
>> possible for multi-dimensional arrays but might not match what people
>> expect for binary array formats. (I don't think we've documented
>> what you actually get when writing an array in binary yet...)
>
>Remind me what the argument is against demanding that prior to reading the
>array contents, the domain be read and then used to allocate the array?

Three reasons:

 1) Similarity with Chapel's array literal syntax.
 2) Some file formats do not include the length (e.g. JSON).
 3) You can't do a whole-array read from JSON or a Chapel array literal-like
    format without this functionality, so it makes the I/O code more
    complicated.

For example, if I have

  record MyRecord {
    var myArray: [1..0] int;   // (start with an empty array)
  }

I might want to read something in this JSON format:

  {"myArray":[1,2,3]}
  {"myArray":[4,5]}

I'd like to be able to do that with repeated calls to readf that indicate to
use JSON format, like this:

  var r: MyRecord;
  readf("%jt", r);

where the format string has j for JSON format and t means "read or write
anything" (vs. e.g. a number or string).

If I had to read the array's size separately, I would have several problems
in expressing this:

 1) I now have to write a readThis for MyRecord when I didn't before (the
    readThis would resize the array). I still have to resize the array, I
    just have to do it in user code.
 2) Since, when doing the I/O, I don't actually know the size of the array,
    I have to either:
    a) write my own array element reading code that resizes the array as it
       goes (see the sketch below), or
    b) read the array twice, once to count the number of elements and again
       to actually read them.

I think all this comes at a serious productivity cost (and a cost that I
don't think is worthwhile if all we are hoping for is consistency between
array types - which has a nebulous benefit on productivity, especially if we
can provide a reasonable error message when the implementation behaves
differently than someone expects).
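To make (2a) concrete, here's roughly the hand-rolled, resize-as-you-go
reading loop I mean. This is a sketch only - assume r is an open reading
channel, I'm glossing over whitespace and error handling, and I'm leaning on
readf's literal matching for the punctuation and on push_back to grow the
array one element at a time:

  var A: [1..0] int;
  r.readf("[");                   // opening bracket
  var x: int;
  while r.readf("%i", x) {
    A.push_back(x);               // resize by hand, one element at a time
    if !r.readf(",") then break;  // no comma => that was the last element
  }
  r.readf("]");                   // closing bracket

That's the kind of code I'd rather the I/O system write for the user.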
>
>> 3) Any suggestions for a Chapel array literal format for
>> multi-dimensional arrays? How would you write such arrays in JSON (and
>> would anyone want to)? At one point there was a proposal to put the
>> domain in array literals, like this:
>>
>>   var A = [ over {1..10} ];
>>
>> but that doesn't really answer how to write multidimensional array
>> literals. One approach would be to store the array elements in a flat
>> way and just reshape them while reading; e.g.
>>
>>   var A = [ over {1..2, 1..3}
>>             11, 12, 13,
>>             21, 22, 23 ];
>>
>> where the spacing would not be significant.
>
>Do you mean reshape (resize?) then, after reading?

I don't see the difference, but whether it happens during or after reading
is an implementation matter that I don't think we need to decide upon now.

>
>> If we had a reasonable format, we could extend support like (1) to
>> any-dimensional arrays that do not share domains, even for some textual
>> formats.
>
>Not sure what you mean here.

We'd be able to read(myArray) even if myArray is multidimensional, if we're
using the Chapel array literal format that stores the domain as the first
thing (but only if the array's domain is not shared, since we don't want to
resize the array if it means resizing other arrays also).

Cheers,
-michael
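P.S. To make the multi-dimensional case concrete, here's roughly what I'd
imagine the user-facing side looking like. This is entirely hypothetical -
it assumes both the 'over {...}' literal format above and the resize-on-read
behavior, and mychannel is just some open reading channel:

  var B: [1..0, 1..0] int;   // 2-D array that solely owns its (empty) domain
  mychannel.read(B);
  // given input like  [ over {1..2, 1..3}  11, 12, 13,  21, 22, 23 ]
  // B.domain would become {1..2, 1..3} and B would hold those six values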
