I was doing a case-insensitive lookup of firstname and lastname in a 2-column boxed table. fnames=: <;._1 ' John Dakota Wilson Diana Joan Roberto John John' lnames=: <;._1 ' Smith Jones Chan Wilson Saxon Angelo Smith Wilson' ]ppl=:500 $ fnames,.lnames +-------+------+ |John |Smith | +-------+------+ |Dakota |Jones | +-------+------+ |Wilson |Chan | +-------+------+ |Diana |Wilson| +-------+------+ |Joan |Saxon | +-------+------+ |Roberto|Angelo| +-------+------+ |John |Smith | +-------+------+ |John |Wilson| ...
p=: 'Joan';'Saxon' p2=:'JOAN';'saxon' ppl i. p 4 (tolower each ppl) i. tolower each p2 4 However performance wasn't great, which I tracked it down to having to run the verb tolower so many times. Below I've documented a solution to this performance problem using inverted tables, but would be interested in other possible ways of bypassing the performance hit caused by making the lookup case-insensitive. A solution using inverted tables. (Load collected definitions from http://www.jsoftware.com/jwiki/Essays/Inverted_Table ) mfv=: ,:^:(#&$ = 1:) NB. Create 1 row matrix from vector pplinv=: ifa ppl pinv =: ifa mfv p p2inv=: ifa mfv p2 pplinv tindexof pinv length error The problem is that converting ppl to an inverted table extends each name to the length of the longest name. For pinv to match, its names also need to be extended to that same width. How can this best be done? My solutions as follows: textend=: {:@$&.>@[ {."1&.> ] pplinv textend pinv +-------+------+ |Joan |Saxon | +-------+------+ pplinv tindexof pplinv textend pinv 4 Or more directly: tindexof1=: [ tindexof {:@$&.>@[ {."1&.> ] pplinv tindexof1 pinv 4 (tolower each pplinv) tindexof1 tolower each p2inv 4 ts=: 6!:2 , 7!:[EMAIL PROTECTED] ts '(tolower each ppl) i. tolower each p2' 0.0206076470613 147456 ts '(tolower each ppl2inv) tindexof1 tolower each p2inv' 0.000427987355935 48000 About 48 times faster and 3 time leaner using inverted tables. Even for a single lookup the overhead of converting to inverted tables is worthwhile: ts '(tolower each ifa ppl) tindexof1 tolower each ifa mfv p2' 0.000516546097339 57216 About 40 times faster and 2.5 times leaner. ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
