Well, ... the implementation posted by xash is very nice, but in http://jsoftware.com/pipermail/programming/2022-January/059790.html Pawel Jakubas specified that the value appears first, and the index appears second, and that the indices start with 1 for the first value.
Also, somewhere along the line, was the suggestion that the result of f should be empty rather than throwing an error if there was no such duplicate value. FYI, -- Raul On Thu, Jan 27, 2022 at 12:50 PM Hauke Rehr <[email protected]> wrote: > > I wonder why there’s still so much traffic on this thread > (okay, I’m to blame for quite some of it) > I thought xash published the best correct solution. > Am I wrong? > > Am 27.01.22 um 18:42 schrieb 'Mike Day' via Programming: > > Pawel Jakubas has moved on to something else in his correspondence, > > presumably related to > > this topic. > > > > Anyway, I think we need his adjudication, as I don't think Raul Miller's f > > is doing what PJ asked > > for. > > > > Hauke Rehr's proposal is > > hrf =: {{ (,~ {&y) {. I. (>: x) e."1 +/\ =/~ y }} > > > > > > My latest version, mdf, is long-winded but perhaps reasonably efficient for > > large arguments; > > > > probably not an important consideration. > > > > > > Consider: (better in fixed width font - which I can't get this end!) > > > > (,:~i.@#)d NB. example and indices > > 0 1 2 3 4 5 6 7 8 9 10 11 12 > > 1 _1 2 3 4 2 5 6 3 8 10 3 2 > > > > > > hrf&d each _1 0 1 2 3 4 NB. not error-checking > > +---+---+---+----+---+---+ > > |1 0|1 0|2 5|3 11|1 0|1 0| > > +---+---+---+----+---+---+ > > mdf&d each _1 0 1 2 3 4 > > ++---+---+----+++ > > ||1 0|2 5|3 11||| > > ++---+---+----+++ > > f&d each _1 0 1 2 3 4 NB. Raul Miller's verb, f > > +++---+---+----+----+ > > |||2 5|3 8|3 11|2 12| > > +++---+---+----+----+ > > > > > > I _think_ Pawel asked for the first x-th repeated value (among all > > x-repeats), where repeat x > > > > requires at least x+1 mentions. > > > > So 2 5 is correct - the 1st repeat of 2 is at 5, while that of 3 is at 8, > > and 5 < 8. > > > > And I think 3 11 is the required result for x = 2 - the 2nd repeat of 3 is > > at 11, while that of 2 is at 12, and 11 < 12. > > > > > > Sorry if that's wrong! > > > > > > FWIW, mdf is pretty long-winded. Its speed would be compromised if the > > > > the value of x, the left-hand-side, is highish, as there's an explicit loop > > with x + 1 entries. > > > > > > mdf =: {{ > > irepvals =. '' > > NB. return empty if there are no x-repeats > > r =. '' NB. empty default > > nn =. >: x > > if. 0 = nn =. x do. > > r =. 0,~ {.y NB. not much to do, as must it be first value, at 0! > > elseif. # repvals =. (~. #~ (nn<#)/.~ ) d do. NB. find instances, if any, > > of values with sufficient repeats > > min =. <: <./ y NB. value to substitute for unwanted early occurrences of > > repeated values > > for. i. 0 >. >: nn do. > > y =. min irepvals } y NB. substitute early occurrences with irrelevant > > values > > irepvals =. y i. repvals NB. get next indices > > end. > > end. > > > > if. # irepvals do. > > r =. y ({~,] ) <./ irepvals NB. find the lowest relevant index and return > > required pair > > else. > > r NB. return prepared default > > end. > > }} > > > > Cheers, > > > > > > Mike > > > > > > On 27/01/2022 13:04, Raul Miller wrote: > >> Here's how I would modify my implementation to produce an empty result > >> when a non-duplicate is referenced. > >> > >> d=: 1 _1 2 3 4 2 5 6 3 8 10 3 2 > >> f=: {{y ({~,]) I.</\(* * x = ])+/\(i.@#~:i.~)y }} > >> > >> 0 f d > >> > >> 1 f d > >> 2 5 > >> 2 f d > >> 3 8 > >> 3 f d > >> 3 11 > >> 4 f d > >> 2 12 > >> 5 f d > >> > >> Here, instead of using x i. ... to find the relevant duplicate index, > >> I am using I. </\ (* * x = ]) ... > >> > >> Breaking down that expression, consider: > >> > >> 2 = 0 0 0 0 0 1 1 1 2 2 2 3 4 > >> 0 0 0 0 0 0 0 0 1 1 1 0 0 > >> </\ 0 0 0 0 0 0 0 0 1 1 1 0 0 > >> 0 0 0 0 0 0 0 0 1 0 0 0 0 > >> > >> So, we're finding all values starting at the indicated duplicate and > >> stopping before the next duplicate, then refining that to the first > >> location (which is the duplicate we want). > >> > >> The * * x... part is because there is no duplicate 0. > >> > >> *0 0 0 0 0 1 1 1 2 2 2 3 4 > >> 0 0 0 0 0 1 1 1 1 1 1 1 1 > >> > >> Anyways, I. here will either return a list of a single index (which > >> basically gives behavior like the previous version) or a list > >> containing no indices (which gives an empty result). > >> > >> (Note also that if we were working with extremely long lists, and if > >> the duplicates we were looking for were typically at the beginning of > >> those long lists, it would be statistically advantageous to do a > >> little more work up front to avoid scanning the entire list.) > >> > >> I hope this helps. > >> > >> FYI, > >> > >> > >> -- > >> Raul > >> > >> On Thu, Jan 27, 2022 at 6:28 AM 'Mike Day' via Programming > >> <[email protected]> wrote: > >>> Perhaps I should point out that my verb, h (see below), uses a lot of > >>> space for a large list, especially if there are few repeats, as it’s > >>> using an outer product. > >>> > >>> I later produced an even more verbose effort which runs in less space on > >>> a 10^6 long vector, but takes 3-4x the space & time of Raul’s f, so it > >>> would be nice to get that one right. > >>> > >>> Cheers, > >>> > >>> Mike > >>> > >>> Sent from my iPad > >>> > >>>> On 26 Jan 2022, at 22:36, Mike Day <[email protected]> wrote: > >>>> > >>>> Unfortunately, Pawel wants 2 f d to be 3 11. However, I find that 3 f d > >>>> IS 3 11. > >>>> Other results are a bit strange, too: > >>>> 4 f d > >>>> 2 12 > >>>> 8 f d > >>>> |index error: f > >>>> | y ({~,])x i.~+/\(i.@#~:i.~)y > >>>> > >>>> I wasn’t going to post my effort, but it might interest Pawel. This > >>>> version works on the slightly more intuitive (for me at least, here) > >>>> origin 1 value of “x”: > >>>> > >>>> g =: ] ({~ , ]) (i.~ >./@:(+/\"1)@:(~. =/ ])) > >>>> > >>>> 2 g d > >>>> 2 5 > >>>> 3 g d > >>>> 3 11 > >>>> 4 g d. NB. also not error-checked, though! > >>>> |index error: g > >>>> | 4 g d > >>>> > >>>> This is a quick get-around to act as Pawel asks, and to give an empty > >>>> result rather than an error if nothing satisfies the left argument: > >>>> > >>>> h =: (g~ >:)~ :: ‘’ > >>>> 2 h d > >>>> 3 11 > >>>> 5 h d > >>>> > >>>> #5 h d > >>>> 0 > >>>> > >>>> FWIW, > >>>> > >>>> Mike > >>>> > >>>> Sent from my iPad > >>>> > >>>>> On 26 Jan 2022, at 21:00, Raul Miller <[email protected]> wrote: > >>>>> > >>>>> Here's a variation that works: > >>>>> > >>>>> d=: 1 _1 2 3 4 2 5 6 3 8 10 3 2 > >>>>> f=: {{y ({~,]) x i.~ +/\(i.@#~:i.~)y }} > >>>>> 1 f d > >>>>> 2 5 > >>>>> 2 f d > >>>>> 3 8 > >>>>> > >>>>> The phrase (i.@# ~: i.~) finds the locations of duplicates > >>>>> > >>>>> (i.@#~:i.~) 1 _1 2 3 4 2 5 6 3 8 10 3 2 > >>>>> 0 0 0 0 0 1 0 0 1 0 0 1 1 > >>>>> > >>>>> And, +/\ computes a running sum > >>>>> +/\(i.@#~:i.~) d > >>>>> 0 0 0 0 0 1 1 1 2 2 2 3 4 > >>>>> > >>>>> With this, we can find the index of the first occurrence of a > >>>>> duplicate count number using x i.~ ... > >>>>> > >>>>> Once we have the index of a duplicate, we can return that index and > >>>>> the corresponding value from the list. > >>>>> > >>>>> -- > >>>>> Raul > >>>>> > >>>>>> On Wed, Jan 26, 2022 at 3:49 PM Pawel Jakubas > >>>>>> <[email protected]> wrote: > >>>>>> > >>>>>> It should be of course > >>>>>> 1 f d > >>>>>> 2 5 > >>>>>> > >>>>>> Would be great if you could decompose your solution and the idea > >>>>>> behind the > >>>>>> solution. Many thanks. > >>>>>> > >>>>>> Cheers, > >>>>>> Pawel > >>>>>> ---------------------------------------------------------------------- > >>>>>> For information about J forums see http://www.jsoftware.com/forums.htm > >>>>> ---------------------------------------------------------------------- > >>>>> For information about J forums see http://www.jsoftware.com/forums.htm > >>> ---------------------------------------------------------------------- > >>> For information about J forums see http://www.jsoftware.com/forums.htm > >> ---------------------------------------------------------------------- > >> For information about J forums see http://www.jsoftware.com/forums.htm > > ---------------------------------------------------------------------- > > For information about J forums see http://www.jsoftware.com/forums.htm > > -- > ---------------------- > mail written using NEO > neo-layout.org > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
