M$ Excel BIFF uses a similar method: strings inside a workbook are stored
in a global shared string table (SST), and the text in a cell is encoded
as an index into that table. You may browse the tara source code for
details. The real M$ Excel is more efficient because it can use hash
lookup instead of pure dyad i.
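
For illustration, here is a minimal sketch of that table-plus-index idea in J. It is not taken from tara or from the BIFF specification; the names sheet, sst and cells are hypothetical, and the lookup is plain dyad i. with no hashing.

```j
NB. Sketch only: sheet, sst and cells are hypothetical names, not from tara.
data  =: (;: 'aa c3 d5 ae af ax ac ee') , <'The answer is 42'
sheet =: 8 3 $ data           NB. a "worksheet" of boxed strings

sst   =: ~. , sheet           NB. string table: each distinct string stored once
cells =: sst i. sheet         NB. every cell stored as an index into sst

sheet -: cells { sst          NB. indexing the table recovers the original strings
```

The cells array is purely numeric, so it can be sorted, compared or selected from like any other integer table, and { against sst turns the result back into text.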
Fri, 19 Feb 2010, Alex Rufon wrote:
> I know I've been on this path before but how did you guys do the conversion
> from strings to numbers (and vice versa) in APL? Can we replicate this in J?
>
> Right now, I'm doing this by making a global vector of unique strings I've
> encountered.
> DICTIONARY_z_ =: < every a.
>
> search=: verb define
> data=. , boxopen y.
> DICTIONARY_z_=: ~. DICTIONARY_z_,data
> DICTIONARY_z_ i. boxopen y.
> )
>
> NB. boxopen is defined as:
> NB. boxopen=: <^:(L. = 0:)
>
>
> NB. Sample string data
> [data=. (;: 'aa c3 d5 ae af ax ac ee'),<'The answer is 42'
> +--+--+--+--+--+--+--+--+----------------+
> |aa|c3|d5|ae|af|ax|ac|ee|The answer is 42|
> +--+--+--+--+--+--+--+--+----------------+
>
> NB. Test data is 2 dimensions
> [testdata=. 8 3 $ data
> +--+--+----------------+
> |aa|c3|d5              |
> +--+--+----------------+
> |ae|af|ax              |
> +--+--+----------------+
> |ac|ee|The answer is 42|
> +--+--+----------------+
> |aa|c3|d5              |
> +--+--+----------------+
> |ae|af|ax              |
> +--+--+----------------+
> |ac|ee|The answer is 42|
> +--+--+----------------+
> |aa|c3|d5              |
> +--+--+----------------+
> |ae|af|ax              |
> +--+--+----------------+
>
> NB. Numeric equivalent of the strings
> [asnumbers=. search testdata
> 265 257 258
> 266 267 268
> 269 263 264
> 265 257 258
> 266 267 268
> 269 263 264
> 265 257 258
> 266 267 268
>
> NB. Check if the numbers convert to strings
> asnumbers { DICTIONARY
> +--+--+----------------+
> |aa|c3|d5              |
> +--+--+----------------+
> |ae|af|ax              |
> +--+--+----------------+
> |ac|ee|The answer is 42|
> +--+--+----------------+
> |aa|c3|d5              |
> +--+--+----------------+
> |ae|af|ax              |
> +--+--+----------------+
> |ac|ee|The answer is 42|
> +--+--+----------------+
> |aa|c3|d5              |
> +--+--+----------------+
> |ae|af|ax              |
> +--+--+----------------+
>
> NB. Switch the columns
> DICTIONARY {~ 1 2 1 { "1 asnumbers
> +--+----------------+--+
> |c3|d5              |c3|
> +--+----------------+--+
> |af|ax              |af|
> +--+----------------+--+
> |ee|The answer is 42|ee|
> +--+----------------+--+
> |c3|d5              |c3|
> +--+----------------+--+
> |af|ax              |af|
> +--+----------------+--+
> |ee|The answer is 42|ee|
> +--+----------------+--+
> |c3|d5              |c3|
> +--+----------------+--+
> |af|ax              |af|
> +--+----------------+--+
>
> NB. Get only the rows whose first column equals 'aa'
> [akey=. search 'aa'
> 265
> [arows=. akey = {. "1 asnumbers
> 1 0 0 1 0 0 1 0
> DICTIONARY {~ arows # asnumbers
> +--+--+--+
> |aa|c3|d5|
> +--+--+--+
> |aa|c3|d5|
> +--+--+--+
> |aa|c3|d5|
> +--+--+--+
>
> NB. Or simply
> DICTIONARY {~ asnumbers #~ (search 'aa') = {. "1 asnumbers
> +--+--+--+
> |aa|c3|d5|
> +--+--+--+
> |aa|c3|d5|
> +--+--+--+
> |aa|c3|d5|
> +--+--+--+
>
> As you can see, it works. But I'm curious if there is another (better?) way
> to do it. The only requirement is that the numbers generated can be processed
> to get the original strings.
>
> Thanks.
>
> r/Alex
>
--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3