Re: [Jprogramming] A canon of J violated

Roger Hui Sat, 17 Oct 2015 00:50:21 -0700

The point is that you can not use (i.#v)=i.~v because you can get some
strange results, results that you can not justify as being correct.  That
is, wrong results.  For example:


   t=: 9!:18 ''
   v=: 1+(t%2)*i:3

   nubsieve     =: i.@# = i.~
   nub          =: nubsieve # ]
   selfclassify =: nub =/ ]

   nubsieve v
1 0 0 0 0 0 0
   nub v
1
   selfclassify v
1 1 0 0 0 0 0

nubsieve v is wrong because there are items in v not tolerantly equal to
the first item.  Therefore nub v is wrong.  Therefore selfclassify v is
wrong.  As well, +/,selfclassify v should be #v .

In contrast, if we do the preprocessing (in the interpreter),

   v1=: i.~ v          NB. preprocessing
   nubsieve v1
1 0 1 0 1 1 1
   (nubsieve v1) # v   NB. AKA nub v
1 1 1 1 1
   selfclassify v1
1 1 0 0 0 0 0
0 0 1 1 0 0 0
0 0 0 0 1 0 0
0 0 0 0 0 1 0
0 0 0 0 0 0 1

In my previous message I did not specify how the preprocessing idea would
affect key.  The dictionary defines key as:  x u/.y ←→ (=x) u@# y .  With
the preprocessing idea, this becomes (=i.~x) u@# y .  For example:

   (=i.~v) <@# v
┌───┬───┬─┬─┬─┐
│1 1│1 1│1│1│1│
└───┴───┴─┴─┴─┘
   (=i.~v) <@# i.#v
┌───┬───┬─┬─┬─┐
│0 1│2 3│4│5│6│
└───┴───┴─┴─┴─┘
   (=v1) <@# i.#v
┌───┬───┬─┬─┬─┐
│0 1│2 3│4│5│6│
└───┴───┴─┴─┴─┘

So just to be clear, the preprocessing i.~x happens in the interpreter,
when it detects that you are applying one of these set/array operations
that requires tolerant equal.  It's the way I propose to define the set
operations on arguments requiring tolerant comparison, and the way I
propose to fix the bugs.



On Fri, Oct 16, 2015 at 8:48 PM, Henry Rich <[email protected]> wrote:

> I would be content to use
>
> (i.#v)=i.~v
>
> as the definition for nub etc.
>
> The big question to me is, Can f/. be implemented so that it uses the same
> partition as is induced by (i.#v)=i.~v  ?
>
> It feels as if yes... calculate the partition p using i.~, and then make a
> pass through the result, front to back, replacing element i{p with (i{p){p .
>
> This has cost O(n)... and could be avoided for integers and f/.!.0 .  Is
> there an easy way to tell whether tolerant i.~ encountered items that were
> tolerantly equal but not intolerantly equal?
>
> Henry Rich
>
>
> On 10/16/2015 5:47 PM, Roger Hui wrote:
>
>> I agree that there are bugs, but fixing them requires some thinking.  As
>> Bill Lam surmised, the problem ultimately can be attributed to tolerant
>> comparison, tolerant equality in particular.
>>
>> The J version I am using in this message is the following:
>>
>>     9!:14 ''
>> j602/2008-03-03/16:45
>>
>> but the bugs/problems exist in all J versions, going as far back and as
>> far
>> forward as you care to go.  The tolerance I will use is the default in a
>> session:
>>
>>     ] t=: 9!:18 ''
>> 5.68434e_14
>>
>>     t - 2^_44    NB. i.e. t is exactly 2^_44
>> 0
>>
>> but any non-zero tolerance will have similar bugs/problems.
>>
>> Tolerance makes = non-transitive (a=b and b=c, but a may not equal c):
>>
>>     a=: 1-t
>>     b=: 1
>>     c=: 1+t
>>     a=b
>> 1
>>     b=c
>> 1
>>     a=c
>> 0
>>
>> Non-transitivity is OK when tolerant equality is used for its intended
>> purpose, namely compensating for the vicissitudes of floating point
>> representation and floating point computations.  It is OK if you are
>> comparing scalar numbers, but causes problems when it is used with
>> insufficient care to deal with sets and arrays.  Non-transitivity
>> invalidates some properties, important properties, that we take for
>> granted; properties that we assume without being conscious that we are
>> assuming them.
>>
>> The primitives that are affected are as follows.  (There may be a few
>> other
>> more minor ones.)
>>
>> i.  dyad   based on =
>> ~.  monad  based on = or i.
>> ~:  monad  based on = or i.
>> =   dyad   based on ~. =
>> f/. dyad   based on i.
>>
>> The dictionary entry on the monad ~. has it "correct", among other
>> possible
>> correct definitions:
>>
>> More precisely, the nub is found by selecting the leading item,
>> suppressing
>> from the argument all items tolerantly equal to it, selecting the next
>> remaining item, and so on.
>>
>>
>> I am loath to use this in the implementation because it is O(n*2).  More
>> importantly, it is "correct" only if you believe that nub or uniqueness
>> makes
>> sense when = is non-transitive.  For example, the following verb
>> implements
>> the dictionary definition for vectors:
>>
>> nub=: 3 : 0
>>   z=. ''
>>   while. *#y do.
>>    z=. z,{.y
>>    y=. y#~y~:{.y
>>   end.
>>   z
>> )
>>
>> When = is transitive, as it is when tolerance is not involved, "nub" is
>> fine:
>>
>>     nub 'mississippi'
>> misp
>>     nub 3 1 4 1 5 9 2 6 5 3 5 8 9 7 9
>> 3 1 4 5 9 2 6 8 7
>>
>> But if tolerant equality is involved:
>>
>>     X=: 1 + (t%2) * ?.1600$100
>>     # nub X
>> 28
>>     # nub /:~X
>> 34
>>     # nub \:~X
>> 34
>>     # nub X {~ ?~#X
>> 27
>>     # nub X {~ ?~#X
>> 28
>>     /:~ ~. 3 : '# nub X {~ ?~ #X' "0 i. 1e4
>> 23 24 25 26 27 28 29 30 31 32
>>     /:~ ~. 3 : '# nub X {~ ?~ #X' "0 i. 3e4
>> 23 24 25 26 27 28 29 30 31 32
>>
>> That is, if you permute X, the number of "unique items" changes.  The last
>> two expressions show that the number of unique items can range from 23 to
>> 32 when you permute X.  Note also that 34 can be the number of unique
>> items
>> but does not come up as a result in 3e4 trial random permutations.  The
>> problem is not the particular definition in the dictionary or its
>> implementation above, but with ANY definition of nub based on tolerant and
>> therefore non-transitive equality.
>>
>> To discuss the bugs that Henry discussed, I'll use the following array.
>>   (Similar to the one Henry used.)
>>
>>     v=: 1+(t%2)*i:3
>>     ~. v
>> 1
>>     ~: v
>> 1 0 0 0 0 0 0
>>     = v
>> 1 1 0 0 0 0 0
>>     v </. v
>> ┌───┬───┬─┬─┬─┐
>> │1 1│1 1│1│1│1│
>> └───┴───┴─┴─┴─┘
>>     #/.~ v
>> 2 1
>>
>> The problem here is that the "idiom" (i.#v)=i.~v is used or assumed
>> for computing
>> the nub or the nubsieve.  The same idiom has been used in APL since time
>> immemorial for such purposes, but is not valid or, at the very least,
>> gives
>> misleading results, when transitivity does not hold.
>>
>>     i.~ v
>> 0 0 1 1 2 3 4
>>
>>     (i.#v) = i.~ v
>> 1 0 0 0 0 0 0
>>     ~:v
>> 1 0 0 0 0 0 0
>>
>>     v #~ (i.#v)=i.~v
>> 1
>>     ~.v
>> 1
>>
>>     (v#~(i.#v)=i.~v) =/ v
>> 1 1 0 0 0 0 0
>>     =v
>> 1 1 0 0 0 0 0
>>
>> I believe that x i. y is correct, the index of the first item in x that is
>> tolerantly equal to an item of y, modeled as follows:
>>
>>     ix=: #@[ - +/ @ (+./\) @ (=/~)
>>     ix~ v
>> 0 0 1 1 2 3 4
>>     i.~ v
>> 0 0 1 1 2 3 4
>>
>> But using the result of i. in other set type things such as nub, key, etc.
>> requires more analysis and care.  So what can be done?
>>
>> a. Use zero tolerance.  This can be done via the fit conjunction, =!.0,
>> ~:!.0, etc., without doing anything drastic (or anything at all) to the
>> system.
>>
>> b. For the primitives in question, on an argument that uses tolerant
>> equality,
>> pre-process the argument with i.~ and, if necessary, post-process by
>> selecting from the original argument.  The result i.~ are representatives
>> for the original items; comparisons on these representatives are
>> transitive.  For example:
>>
>>     ] v1=: i.~ v   NB. representatives of v
>> 0 0 1 1 2 3 4
>>
>>     (i.#v1) = i.~ v1
>> 1 0 1 0 1 1 1
>>     ~: v1          NB. AKA ~:v
>> 1 0 1 0 1 1 1
>>
>>     v #~ ~:v1      NB. AKA ~.v, post-processing by selecting from v
>> 1 1 1 1 1
>>     (t%2) %~ _1 + v #~ ~:v1
>> _3 _1 1 2 3
>>
>>     (~.v1)=/v1     NB. AKA =v
>> 1 1 0 0 0 0 0
>> 0 0 1 1 0 0 0
>> 0 0 0 0 1 0 0
>> 0 0 0 0 0 1 0
>> 0 0 0 0 0 0 1
>>
>>     #/.~ v1        NB. AKA #/.~v
>> 2 2 1 1 1
>>
>> What about the "number of unique items" under permutation example?
>>
>>     X1=: i.~ X   NB. representatives of X
>>     /:~ ~. 3 : '# nub X1 {~ ?~ #X1' "0 i. 3e4
>> 34
>>
>>
>>
>>
>> On Sat, Oct 10, 2015 at 11:40 PM, bill lam <[email protected]> wrote:
>>
>> I think the issue is tolerance, it works for 0-tolerance.
>>> eg.
>>>    (#,{.)/.!.0~ tb
>>> 1 1
>>> 1 1
>>> 1 1
>>> 1 1
>>> 1 1
>>> 1 1
>>> 1 1
>>>
>>> Сб, 10 окт 2015, Henry Rich написал(а):
>>>
>>>> B Jonas reported that ~. is inconsistent with the definition:
>>>>
>>>>     tb=. 1+2e_14*i:3
>>>>
>>>>     ~: tb
>>>>
>>>> 1 0 0 0 0 0 0
>>>>
>>>>     =tb
>>>>
>>>> 1 1 1 0 0 0 0
>>>>
>>>>
>>>> You could fix that in the docs.  But the problem is much more serious:
>>>>
>>> the
>>>
>>>> partitioning induced by ~. does not match that induced by u/.  .  That
>>>>
>>> these
>>>
>>>> be identical is a canon of J, relied on by code whose name is legion.
>>>>
>>>>
>>>>      ~. tb
>>>>
>>>> 1
>>>>
>>>>      tb </. tb
>>>>
>>>> +-----+-+-+-+-+
>>>>
>>>> |1 1 1|1|1|1|1|
>>>>
>>>> +-----+-+-+-+-+
>>>>
>>>>
>>>> What!? Only 1 unique item, but 4 partitions???
>>>>
>>>>
>>>>
>>>>
>>>> The special code for #/. shares the problem:
>>>>
>>>>
>>>>     #/.~ tb
>>>>
>>>> 3
>>>>
>>>>
>>>> only 3 items from a list of 7!
>>>>
>>>>
>>>>
>>>> The interpreter itself relies on the equivalence, and is let down:
>>>>
>>>> (#,{.)/.~ tb
>>>>
>>>> |length error
>>>>
>>>> | (#,{.)/.~tb
>>>>
>>>>
>>>> there shouldn't be any error.
>>>>
>>>>
>>>>
>>>> IMO this will be the first bug for us to fix, if Jsoftware ever creates
>>>> a
>>>> proper open-source setup that allows user contributions to get into the
>>>> official release.
>>>>
>>>> Henry Rich
>>>>
>>>> ----------------------------------------------------------------------
>>>> For information about J forums see http://www.jsoftware.com/forums.htm
>>>>
>>> --
>>> regards,
>>> ====================================================
>>> GPG key 1024D/4434BAB3 2008-08-24
>>> gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
>>> gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3
>>> ----------------------------------------------------------------------
>>> For information about J forums see http://www.jsoftware.com/forums.htm
>>>
>>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
>>
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Re: [Jprogramming] A canon of J violated

Reply via email to