Re: [Jprogramming] Ragged Array Shapes are Trees

'Bo Jacoby' via Programming Tue, 09 Sep 2014 04:41:46 -0700

Note that ordinal fractions also allows for trees with arrays on their leaves, 
and arrays with trees as array elements.
- Bo.




Den 9:25 tirsdag den 9. september 2014 skrev Raul Miller 
<[email protected]>:
 

>
>
>I've worked with a representation sort of like that, but [in my
>opinion] considerably more efficient and flexible.
>
>I'll model the two main routines here:
>
>ssdir=:3 :0
>   assert. 1=#$y
>   (}:,.2 -~/\])I.(={.)(,{.)y
>)
>
>ssndx=:4 :0
>   assert. 1=#$y
>   assert. 2=#$x
>   assert. 2={:$x
>   (;<@(+i.)/"1 x){y
>)
>
>Example use:
>
>   ssdir ' this is a test'
>0 5
>5 3
>8 2
>10 5
>
>   ex=: ' this is a test'
>   ((0 3{ssdir) ssndx ]) ex
>this test
>
>In other words, ssndx was analogous to { in design.
>
>Other notes:
>
>(0) the 'ss' part stood for Segmented String.
>
>(1) This was something like 20 years ago, using APL.
>
>(2) we had compiled APL back then, for a rank 0 subset of APL (and
>with type inference within that mini language - anything which
>constrained a type was equivalent to a type declaration), I think our
>first implementations of this code were using that compiled APL
>mechanism (but then I think it was redone in assembly language - I"m a
>bit foggy on the details at this point, but I do remember that
>performance was decent).
>
>(3) There was an additional constraint on the code, in the final
>version, such that only type character was supported. We were mostly
>interested in working with text (we were building search engines and
>related stuff ... Google's emphasis on map-reduce algorithms seems
>like a rather straightforward and obvious development from my
>perspective). But it's easy enough to build another one (not quite as
>fast as the hand assembled version) to work on other types.
>
>[Actually, I'll have to admit that I haven't been really impressed
>with a lot of the advancement of state of the art our industry. It's
>been looking like a lot of activity but most of the work (including my
>own) just has not impressed me as being all that significant. There
>have been significant improvements, of course, but not in proportion
>to the number of people employed by the industry.]
>
>Anyways... that's just two primitives, but they are quite flexible.
>You can replace ssdir with other mechanisms in some contexts, for
>example. And you can perform operations on the underlying characters
>and then use ssndx to or straight indexing to extract it in a useful
>form. (Obviously you'll need a lot more code, but this alleviated a
>significant performance bottleneck.)
>
>Thanks,
>
>-- 
>Raul
>
>On Tue, Sep 9, 2014 at 1:13 AM, Michal D. <[email protected]> wrote:
>> One can imagine that it's possible for
>>
>>    ] x =. (<@i."0) 1+i.3
>> +-+---+-----+
>> |0|0 1|0 1 2|
>> +-+---+-----+
>>
>> to be laid out in memory as [0 0 1 0 1 2] with an additional vector
>> representing the starting point of each rank-1 array [0 1 3 6].  I would be
>> willing to wager a large sum that this is much more computationally
>> efficient than having a representation where each rank-1 array is floating
>> off somewhere in memory [0] [0 1] [0 1 2] with an array of pointers to them.
>>
>> Of course this representation is possible, but somewhat verbose, to do
>> manually in J (http://dl.acm.org/citation.cfm?id=804488).  That's why I'm
>> fantasizing about a language where such a representation would be closer to
>> the core of the language.
>>
>> Something like +/"1 x instead of +/&.> x (obviously there's not much
>> difference here).
>>
>> Like we've both suggested, it may be that boxing with a sufficiently
>> optimized interpreter is the solution.
>>
>> Cheers,
>>
>> Mike
>>
>> On Mon, Sep 8, 2014 at 8:10 AM, Raul Miller <[email protected]> wrote:
>>
>>> What does this mean "shapes extended to support ragged arrays without
>>> boxing"?
>>>
>>> K has ragged arrays without boxing, but K's concept of shape only
>>> applies to the non-ragged prefix dimensions.
>>>
>>> Meanwhile, it seems to me that the guarantee you suggest would be
>>> computationally expensive.
>>>
>>> (Not saying the ideas shouldn't be explored, but I suspect what you
>>> are really after is a more efficient boxing implementation and that
>>> will lead in an entirely different direction.)
>>>
>>> Thanks,
>>>
>>> --
>>> Raul
>>>
>>>
>>> On Mon, Sep 8, 2014 at 11:01 AM, Michal D. <[email protected]>
>>> wrote:
>>> > Thanks Pascal, that's a nice formulation for the 2-level deep case.  It
>>> > breaks down of course for deeper nesting but could serve as a base.
>>> >
>>> >     ([:|. $ ;~ $ each) ((i.2 3);(,0)) ; (i. 2 3)
>>> > +-+-------+
>>> > |2|+-+---+|
>>> > | ||2|2 3||
>>> > | |+-+---+|
>>> > +-+-------+
>>> >
>>> > There's no specific problem I'm trying to solve directly in J at the
>>> > moment.  It's more of a language design problem.  My interest in this is
>>> a
>>> > J like language, but with shapes extended to support ragged arrays
>>> without
>>> > boxing.  Failing that, we could have some optimization guarantees like
>>> 'an
>>> > array of boxes containing equal rank values are allocated contiguously'.
>>> >
>>> > Cheers,
>>> >
>>> > Mike
>>> >
>>> >
>>> >
>>> > On Sat, Sep 6, 2014 at 9:28 AM, 'Pascal Jasmin' via Programming <
>>> > [email protected]> wrote:
>>> >
>>> >> Hi Michal,
>>> >>
>>> >> maybe this is an interesting representation for what you want?
>>> >>
>>> >>    ([:|. $ ;~ $ each) (i. 2 3) ; (,0) ; 0 1
>>> >> ┌─┬─────────┐
>>> >> │3│┌───┬─┬─┐│
>>> >> │ ││2 3│1│2││
>>> >> │ │└───┴─┴─┘│
>>> >> └─┴─────────┘
>>> >>
>>> >>
>>> >>
>>> >> ----- Original Message -----
>>> >> From: Michal D. <[email protected]>
>>> >> To: [email protected]
>>> >> Cc:
>>> >> Sent: Saturday, September 6, 2014 12:06 PM
>>> >> Subject: [Jprogramming] Ragged Array Shapes are Trees
>>> >>
>>> >> I just came to the realization that ragged array shapes, regardless of
>>> how
>>> >> you want to represent them, are trees.  I would be interested in any
>>> >> references to prior exploration of this idea.
>>> >>
>>> >> Some example data (boxes are used to display ragged arrays, but
>>> otherwise
>>> >> have no semantic meaning):
>>> >>
>>> >>    ] A =. i. 2 3
>>> >> 0 1 2
>>> >> 3 4 5
>>> >>    ] B =. (< @ i."0) 1+i.2
>>> >> +-+---+
>>> >> |0|0 1|
>>> >> +-+---+
>>> >>    ] C =. A ; B
>>> >> +-----+-+---+
>>> >> |0 1 2|0|0 1|
>>> >> |3 4 5| |   |
>>> >> +-----+-+---+
>>> >>    ] D =. A ; < B
>>> >> +-----+-------+
>>> >> |0 1 2|+-+---+|
>>> >> |3 4 5||0|0 1||
>>> >> |     |+-+---+|
>>> >> +-----+-------+
>>> >>
>>> >> The corresponding shape trees (monospace font required, root node is on
>>> the
>>> >> left):
>>> >>
>>> >> A: 2 - 3
>>> >>
>>> >>        1
>>> >>      /
>>> >> B: 2
>>> >>      \
>>> >>        2
>>> >>
>>> >>        2 - 3
>>> >>      /
>>> >> C: 3 - 1
>>> >>      \
>>> >>        2
>>> >>
>>> >>        2 - 3
>>> >>      /
>>> >> D: 2       1
>>> >>      \   /
>>> >>        2
>>> >>          \
>>> >>            2
>>> >>
>>> >> In some sense the shape tree for A is compressed, the verbose
>>> alternative
>>> >> being:
>>> >>
>>> >>        3
>>> >>      /
>>> >> A: 2
>>> >>      \
>>> >>        3
>>> >>
>>> >> Compressing D in a similar manner leaves the full shape of the tree
>>> >> ambiguous - there is no way to distinguish the duplicate structure of
>>> leaf
>>> >> 3 from the structure of leaves 1, 2.
>>> >>
>>> >>            3
>>> >>          /
>>> >> D: 2 - 2 - 1
>>> >>          \
>>> >>            2
>>> >>
>>> >> We could resolving this ambiguity by listing the shapes of all the items
>>> >> explicitly.  The problem here is that 'compression' could very easily
>>> lead
>>> >> to an increase in representation size, although it is nice and uniform
>>> for
>>> >> ragged arrays of uniform rank.
>>> >>
>>> >>             3
>>> >>           /
>>> >>          |  3
>>> >>          | /
>>> >> D: 2 - 2 +
>>> >>          | \
>>> >>          |  1
>>> >>           \
>>> >>             2
>>> >>
>>> >> Regardless of compression, I would be interested in prior work in
>>> >> representing shapes of ragged arrays as trees.
>>> >>
>>> >> Cheers,
>>> >>
>>> >> Mike
>>> >> ----------------------------------------------------------------------
>>> >> For information about J forums see http://www.jsoftware.com/forums.htm
>
>>> >>
>>> >> ----------------------------------------------------------------------
>>> >> For information about J forums see http://www.jsoftware.com/forums.htm
>>> >>
>>> > ----------------------------------------------------------------------
>>> > For information about J forums see http://www.jsoftware.com/forums.htm
>>> ----------------------------------------------------------------------
>>> For information about J forums see http://www.jsoftware.com/forums.htm
>>>
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
>----------------------------------------------------------------------
>For information about J forums see http://www.jsoftware.com/forums.htm
>
>
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Re: [Jprogramming] Ragged Array Shapes are Trees

Reply via email to