Re: [Jprogramming] Mask from list of indices with multiplicity

Roger Hui Tue, 07 Sep 2010 15:31:44 -0700

Given that I.^:_1 doesn't exist, you can call it whatever you like ;-)

I needed a term for the situation where an (integer) argument has
a small difference between the smallest and the largest element,
small relative to the size of the argument.  "Small range" seems apt.
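
A rough sketch of that definition in J, for illustration only. The names
range and smallrange are made up here, and comparing the range against the
number of elements is an assumed cutoff, not the test the interpreter uses
(which is done in C):

```j
   range      =: >./ - <./     NB. largest minus smallest element
   smallrange =: range < #     NB. assumed cutoff: range less than the length
   smallrange 0 2e9            NB. two elements 2e9 apart: large range
0
   smallrange 2e9 2e9          NB. two equal elements: small range
1
```

Unlike (# % >./), this looks at the spread of the values rather than at the
largest value alone, so it tells the two lists apart.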




----- Original Message -----
From: Marshall Lochbaum <[email protected]>
Date: Tuesday, September 7, 2010 15:23
Subject: Re: [Jprogramming] Mask from list of indices with multiplicity
To: 'Programming forum' <[email protected]>

> Well, I guess we don't mean exactly the same thing. The idea 
> with I.^:_1 is that if I. returns 2e9 as one of its results, you 
> know you have a list of length at least 2e9. If this list is 
> also short, then prefacing it with i. 2e9 would be ridiculous; 
> you would rather start with 2e9$0 and amend that.
> So yes, the two terms are different, but mine is indeed the 
> correct one for I.^:_1.
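> 
> To make the two approaches concrete on a small case (a sketch only;
> n stands in for the 2e9 above, and the names n and idx are made up
> for this illustration):
> 
> ```j
>    n   =: 8               NB. stand-in for the length 2e9
>    idx =: 2 5 5 7         NB. indices, with 5 occurring twice
>    <: #/.~ (i. n) , idx   NB. preface with i. n, count each value, subtract 1
> 0 0 1 0 0 2 0 1
>    1 idx} n $ 0           NB. start with n$0 and amend
> 0 0 1 0 0 1 0 1
> ```
> 
> Note that the amend form as written gives a 0/1 mask only, so the
> repeated 5 is not counted.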
> 
> Marshall
> 
> -----Original Message-----
> From: [email protected] [mailto:programming-
> [email protected]] On Behalf Of Roger Hui
> Sent: Monday, September 06, 2010 11:18 PM
> To: Programming forum
> Subject: Re: [Jprogramming] Mask from list of indices with multiplicity
> 
> Perhaps we are using the same terms to mean different things.
> 
>    (# % >./) 0 2e9
> 1e_9
>    (# % >./) 2e9 2e9
> 1e_9
> 
> The way I think of it, 0 2e9 has large range because the 
> difference between the smallest and largest elements is large 
> compared to the length of the list, and 2e9 2e9 has small range 
> because the difference between the smallest and largest elements 
> is small.
> Yet your discriminant gives the same value for each.
> 
> 
> 
> ----- Original Message -----
> From: Marshall Lochbaum <[email protected]>
> Date: Monday, September 6, 2010 19:42
> Subject: Re: [Jprogramming] Mask from list of indices with multiplicity
> To: 'Programming forum' <[email protected]>
> 
> > Did you read my examination of
> > #/.`(~...@])`(0 $~ >:@(>./)@])}~  (small-range) versus
> > [: <:@(#/.~) i.@>:@(>./) , ]  (large-range)?
> > Basically what I got was about the same thing you have for nub:
> > I used (# % >./), which is just about instantaneous, as a
> > discriminator and got that you should use small-range when it is
> > less than 1 and large-range when it is more than 1.
> > The speed increase is not gigantic here, so choosing whether or not
> > to distinguish the two cases is probably not that important.
> > 
> > I can substantiate my result on Boolean arrays, which is weird.
> > It is pretty consistently as represented, averaging over 100 trials.
> > I am using 64-bit Windows with a dual-core processor. Would any part
> > of that make the difference?
> > 
> > Marshall
> > 
> > 
> > 
> > -----Original Message-----
> > From: [email protected] [mailto:programming- 
> > [email protected]] On Behalf Of Roger Hui
> > Sent: Monday, September 06, 2010 10:09 PM
> > To: Programming forum
> > Subject: Re: [Jprogramming] Mask from list of indices with multiplicity
> > 
> > Candidates for special code are legion.  One often has to
> > exercise judgment, sometimes arbitrary judgment, in whether some
> > particular special code is worth it.  In this particular case
> > 
> > - The general expression #/.~ already has special code and is already
> >   pretty fast.
> > - It is not acceptable to slow down the general case too much in order
> >   to detect a special case, in this case to detect that the result is
> >   going to be boolean.
> > 
> > I can not reproduce the timings posted in one of your earlier msgs:
> > 
> > > On the Boolean case, which I would consider the most important,
> > > ([: <:@(#/.~) i.@>:@(>./) , ]) is not optimal:
> > > 
> > >    a=.0.1> 100000 ?...@$ 0
> > >    ia=.I. a
> > >    6!:2 '([: <:@(#/.~) i.@>:@(>./) , ]) ia'
> > > 0.00682521
> > >    6!:2 '(e.~ i.@>:@(>./)) ia'
> > > 0.002004
> > >    6!:2 '1 ia} (0 $~ >:@(>./)) ia'
> > > 0.000679754
> > 
> > What I got instead is:
> > 
> >    a=.0.1> 100000 ?...@$ 0
> >    ia=.I. a
> >    10 timer '([: <:@(#/.~) i.@>:@(>./) , ]) ia'
> > 0.00639606
> >    10 timer '(e.~ i.@>:@(>./)) ia'
> > 0.00181928
> >    10 timer '1 ia} (0 $~ >:@(>./)) ia'
> > 0.00160842
> > 
> > The last two times are what I'd expect, approximately equal, because
> > the way e. on such integers is implemented is basically the "amend"
> > method.
> > 
> > Allow me to pose an interesting problem in this regard.
> > Several important functions in J (sort, i.) have fast implementations
> > if the arguments are "small range integers".
> > 
> >    x=: 1e6 ?...@$ 2e9        NB. large range
> >    10 timer 'i.~x'
> > 0.106972
> >    10 timer '/:~x'
> > 0.0685986
> > 
> >    y=: 1e9 + 1e6 ?...@$ 1e5  NB. small range
> >    10 timer 'i.~y'
> > 0.0197962
> >    10 timer '/:~y'
> > 0.0209372
> > 
> > It is therefore important to be able to detect inexpensively (in
> > C) when something is small range.
> > So that is the interesting problem.  (It is not a hard problem.)
> > 
> > For a related puzzle, see:
> > http://www.jsoftware.com/jwiki/Essays/Index%20in%20Nub
> > 
> > 
> > 
> > ----- Original Message -----
> > From: Marshall Lochbaum <[email protected]>
> > Date: Monday, September 6, 2010 9:55
> > Subject: Re: [Jprogramming] Mask from list of indices with multiplicity
> > To: 'Programming forum' <[email protected]>
> > 
> > > The tolerance thing is a problem... you might be able to solve it with
> > > a weird construct like I.!.(tol)^:_1!.(universe) y.
> > > I think this would be the same as
> > > 
> > >    (universe) (3 :0)"1 y
> > > ($:~ i.@>:@(>./)) y
> > > :
> > > (universe) <:@(#/.!.(tol))@, y
> > > )
> > > 
> > > Where universe and tol are optional. As for the domain problem with
> > > I.^:_1, I. has the same problem in that it only works on numerics, so
> > > it is properly reflective of the inverse, and
> > > I.^:_1 I. is always the identity minus trailing zeros (which can be
> > > added back with ($ $!.0 I.^:_...@i.).
> > > 
> > > I agree that #/.~ is fast, but for arrays that were Boolean before I.
> > > (i.e. now return 1 for (-:~.)) the amend method is about 10 times
> > > faster.
> > > Also,
> > > #/.`(~...@])`(0 $~ >:@(>./)@])}~
> > > is a candidate that works the same as the original but is faster for
> > > sparser arrays. However, it only works if we assume a numeric
> > > universe. That can probably be fixed--I'll get back to it later.
> > > 
> > > Marshall
> > > 
> > > -----Original Message-----
> > > From: [email protected] [mailto:programming- 
> > > [email protected]] On Behalf Of Henry Rich
> > > Sent: Monday, September 06, 2010 11:35 AM
> > > To: Programming forum
> > > Subject: Re: [Jprogramming] Mask from list of indices with multiplicity
> > > 
> > > Allowing !. to change the universe is an interesting
> > > idea, but wouldn't !. be needed in its usual role of tolerance, since
> > > there are implied comparisons in the operation?
> > > 
> > > 
> > > Also, your Ii works only on numeric operands, while
> > > 
> > > universe   <:@(#/.~)@,  y
> > > 
> > > works on all types (and shapes).
> > > 
> > > 
> > > Since the key part,   #/.~  , is already fast, I think that nothing
> > > more is really needed.  This is a matter of taste.
> > > 
> > > Henry Rich
> > > 
> > > 
> > > On 9/6/2010 10:06 AM, Marshall Lochbaum wrote:
> > > > The only way I. I.^:_1 changes x is that it sorts it, and
> > > > I.^:_1 I. only cuts out the trailing 0s:
> > > >     Ii=. ([:<:@(#/.~) i.@>:@(>./) , ])
> > > >     (; (;Ii)@I.) 1...@$2
> > > > ┌───────────────────┬───────────┬─────────────────┐
> > > > │1 1 1 0 0 1 1 0 1 0│0 1 2 5 6 8│1 1 1 0 0 1 1 0 1│
> > > > └───────────────────┴───────────┴─────────────────┘
> > > >     (; (;I.)@Ii) 6...@$10
> > > > ┌───────────┬─────────────────┬───────────┐
> > > > │0 0 8 6 6 3│2 0 0 1 0 0 2 0 1│0 0 3 6 6 8│
> > > > └───────────┴─────────────────┴───────────┘
> > > >
> > > > I think each of these cases is acceptable, especially if you
> > > > allow !. to change the universe (this removes the I.^:_1 I. problem).
> > > >
> > > > Marshall
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: [email protected]
> > > > [mailto:[email protected]] On Behalf Of Roger Hui
> > > > Sent: Monday, September 06, 2010 2:26 AM
> > > > To: Programming forum
> > > > Subject: Re: [Jprogramming] Mask from list of indices with 
> > > > multiplicity
> > > >
> > > > I.^:_1 seems too far from what <: #/.~ universe,x does.
> > > > Another way to say that is that I. I.^:_1 x can bear scant
> > > > resemblance to x .
> > > >
> > > >
> > > >
> > > > ----- Original Message -----
> > > > From: Zsbán Ambrus<[email protected]>
> > > > Date: Sunday, September 5, 2010 14:21
> > > > Subject: Re: [Jprogramming] Mask from list of indices with 
> > > > multiplicity
> > > > To: Programming forum<[email protected]>
> > > >
> > > >> On Sun, Sep 5, 2010 at 10:57 PM, Roger Hui<[email protected]>  wrote:
> > > >>>    <: #/.~ (i.7) ,x
> > > >>
> > > >> Nice solution.
> > > >>
> > > >>> ----- Original Message -----
> > > >>> From: Zsbán Ambrus<[email protected]>
> > > >>>> Would it make
> > > >>>> sense if I.^:_1 did this?
> > > >>
> > > >> ^^^ What's your opinion on this?
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
