fnames=: <;._1 tolower ' John Dakota Wilson Diana Joan Roberto John John'
lnames=: <;._1 tolower ' Smith Jones Chan Wilson Saxon Angelo Smith Wilson'
ppl=: 500 $ fnames,.lnames NB. rebuild the table from the lowercased names
ppl i. tolower each p2
4
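
If the original capitalisation needs to survive for display, one variation is to keep ppl as entered and hold a separately lowercased copy for lookups. A minimal sketch, assuming the standard library's tolower and each are loaded; pplkey and find are hypothetical names, not from the thread:

```j
fnames=: <;._1 ' John Dakota Wilson Diana Joan Roberto John John'
lnames=: <;._1 ' Smith Jones Chan Wilson Saxon Angelo Smith Wilson'
ppl=: 500 $ fnames,.lnames              NB. original capitalisation, kept for display
pplkey=: tolower each ppl               NB. hypothetical name: case folded once, up front
find=: pplkey&i.@:(tolower each)        NB. hypothetical helper: folds only the lookup key
idx=: find 'JOAN';'saxon'               NB. row index of the first match
idx { ppl                               NB. the matching row in its original case
```

The cost of tolower over the whole table is then paid once, and each lookup only folds the two boxes of the key. Note that find captures the value of pplkey when it is defined, so it has to be redefined if the table changes.
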
2007/10/12, Sherlock, Ric <[EMAIL PROTECTED]>:
>
> I was doing a case-insensitive lookup of firstname and lastname in a
> 2-column boxed table.
> fnames=: <;._1 ' John Dakota Wilson Diana Joan Roberto John John'
> lnames=: <;._1 ' Smith Jones Chan Wilson Saxon Angelo Smith Wilson'
> ]ppl=:500 $ fnames,.lnames
> +-------+------+
> |John   |Smith |
> +-------+------+
> |Dakota |Jones |
> +-------+------+
> |Wilson |Chan  |
> +-------+------+
> |Diana  |Wilson|
> +-------+------+
> |Joan   |Saxon |
> +-------+------+
> |Roberto|Angelo|
> +-------+------+
> |John   |Smith |
> +-------+------+
> |John   |Wilson|
> ...
>
> p=: 'Joan';'Saxon'
> p2=:'JOAN';'saxon'
> ppl i. p
> 4
> (tolower each ppl) i. tolower each p2
> 4
>
> However, performance wasn't great, which I tracked down to having to
> run the verb tolower so many times. Below I've documented a solution to
> this performance problem using inverted tables, but would be interested
> in other possible ways of bypassing the performance hit caused by making
> the lookup case-insensitive.
>
> A solution using inverted tables.
> (Load collected definitions from
> http://www.jsoftware.com/jwiki/Essays/Inverted_Table )
>
> mfv=: ,:^:(#&$ = 1:) NB. Create 1 row matrix from vector
> pplinv=: ifa ppl
> pinv =: ifa mfv p
> p2inv=: ifa mfv p2
> pplinv tindexof pinv
> length error
>
> The problem is that converting ppl to an inverted table extends each
> name to the length of the longest name. For pinv to match, its names
> also need to be extended to that same width.
> How can this best be done?
>
> My solution is as follows:
> textend=: {:@$&.>@[ {."1&.> ]
> pplinv textend pinv
> +-------+------+
> |Joan   |Saxon |
> +-------+------+
>
> pplinv tindexof pplinv textend pinv
> 4
>
> Or more directly:
>
> tindexof1=: [ tindexof {:@$&.>@[ {."1&.> ]
> pplinv tindexof1 pinv
> 4
> (tolower each pplinv) tindexof1 tolower each p2inv
> 4
>
> ts=: 6!:2 , 7!:2@]
> ts '(tolower each ppl) i. tolower each p2'
> 0.0206076470613 147456
> ts '(tolower each pplinv) tindexof1 tolower each p2inv'
> 0.000427987355935 48000
>
> About 48 times faster and 3 times leaner using inverted tables.
>
> Even for a single lookup the overhead of converting to inverted tables
> is worthwhile:
> ts '(tolower each ifa ppl) tindexof1 tolower each ifa mfv p2'
> 0.000516546097339 57216
>
> About 40 times faster and 2.5 times leaner.
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>
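
On the padding issue in the quoted message: the width mismatch can be seen with primitives alone, without the essay's definitions. A small sketch (col, key and wkey are illustrative names, not from the thread) of how opening boxed names pads them to a common width, and how {."1 brings a lookup key to that width, which is the step textend applies to each column:

```j
NB. Opening boxed strings of unequal length pads the shorter ones with spaces,
NB. which is what happens to each column when a boxed table is inverted.
col=: > 'Joan';'Roberto';'John'     NB. 3-by-7 character matrix
$ col                               NB. shape is 3 7
key=: ,: 'Joan'                     NB. 1-by-4 matrix: narrower than col
wkey=: ({: $ col) {."1 key          NB. take 7 from each row, padding with spaces
$ wkey                              NB. shape is 1 7, now the same width as col
col i. wkey                         NB. index of the matching row of col
```

Because the key now has the same row length as the column, i. can compare it row by row against col.
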
--
Björn Helgason, Engineer
Fugl&Fiskur ehf, Þerneyjarsund 23, Box 127
801 Grímsnes, e-mail: [EMAIL PROTECTED]
Skype: gosiminn, gsm: +3546985532
Landscaping and ornamental gardening, excavator services
http://groups.google.com/group/J-Programming
Technical skill handles the complex; creativity is the master of simplicity
A good teacher can step on toes without taking the shine off the shoes