fnames=: <;._1 tolower ' John Dakota Wilson Diana Joan Roberto John John'
  lnames=: <;._1 tolower ' Smith Jones Chan Wilson Saxon Angelo Smith
Wilson'
    ppl i. tolower each p2
4


2007/10/12, Sherlock, Ric <[EMAIL PROTECTED]>:
>
> I was doing a case-insensitive lookup of firstname and lastname in a
> 2-column boxed table.
>    fnames=: <;._1 ' John Dakota Wilson Diana Joan Roberto John John'
>    lnames=: <;._1 ' Smith Jones Chan Wilson Saxon Angelo Smith Wilson'
>    ]ppl=:500 $ fnames,.lnames
> +-------+------+
> |John   |Smith |
> +-------+------+
> |Dakota |Jones |
> +-------+------+
> |Wilson |Chan  |
> +-------+------+
> |Diana  |Wilson|
> +-------+------+
> |Joan   |Saxon |
> +-------+------+
> |Roberto|Angelo|
> +-------+------+
> |John   |Smith |
> +-------+------+
> |John   |Wilson|
> ...
>
>    p=: 'Joan';'Saxon'
>    p2=:'JOAN';'saxon'
>    ppl i. p
> 4
>   (tolower each ppl) i. tolower each p2
> 4
>
> However performance wasn't great, which I tracked it down to having to
> run the verb tolower so many times. Below I've documented a solution to
> this performance problem using inverted tables, but would be interested
> in other possible ways of bypassing the performance hit caused by making
> the lookup case-insensitive.
>
> A solution using inverted tables.
> (Load collected definitions from
> http://www.jsoftware.com/jwiki/Essays/Inverted_Table )
>
>    mfv=: ,:^:(#&$ = 1:) NB. Create 1 row matrix from vector
>    pplinv=: ifa ppl
>    pinv =:  ifa mfv p
>    p2inv=:  ifa mfv p2
>    pplinv tindexof pinv
> length error
>
> The problem is that converting ppl to an inverted table extends each
> name to the length of the longest name. For pinv to match, its names
> also need to be extended to that same width.
> How can this best be done?
>
> My solutions as follows:
>    textend=: {:@$&.>@[ {."1&.> ]
>    pplinv textend pinv
> +-------+------+
> |Joan   |Saxon |
> +-------+------+
>
>    pplinv tindexof pplinv textend pinv
> 4
>
> Or more directly:
>
>    tindexof1=: [ tindexof {:@$&.>@[ {."1&.> ]
>    pplinv tindexof1 pinv
> 4
>    (tolower each pplinv) tindexof1 tolower each p2inv
> 4
>
>    ts=: 6!:2 , 7!:[EMAIL PROTECTED]
>    ts '(tolower each ppl) i. tolower each p2'
> 0.0206076470613 147456
>    ts '(tolower each ppl2inv) tindexof1 tolower each p2inv'
> 0.000427987355935 48000
>
> About 48 times faster and 3 time leaner using inverted tables.
>
> Even for a single lookup the overhead of converting to inverted tables
> is worthwhile:
>    ts '(tolower each ifa ppl) tindexof1 tolower each ifa mfv p2'
> 0.000516546097339 57216
>
> About 40 times faster and 2.5 times leaner.
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>



-- 
Björn Helgason, Verkfræðingur
Fugl&Fiskur ehf, Þerneyjarsund 23, Box 127
801 Grímsnes ,t-póst: [EMAIL PROTECTED]
Skype: gosiminn, gsm: +3546985532
Landslags og skrúðgarðagerð, gröfuþjónusta
http://groups.google.com/group/J-Programming


Tæknikunnátta höndlar hið flókna, sköpunargáfa er meistari einfaldleikans

góður kennari getur stigið á tær án þess að glansinn fari af skónum
          /|_      .-----------------------------------.
         ,'  .\  /  | Með léttri lund verður        |
     ,--'    _,'   | Dagurinn í dag                     |
    /       /       | Enn betri en gærdagurinn  |
   (   -.  |        `-----------------------------------'
   |     ) |        (\_ _/)
  (`-.  '--.)       (='.'=)
   `. )----'        (")_(")
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to