M$ Excel's BIFF format uses a similar method: strings in a workbook are
stored in a global shared string table (SST), and the text in a cell is
encoded as an index into that SST. You may browse the tara source code
for details. The real M$ Excel is more efficient because it can use a
hash lookup instead of pure dyad i.
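
A minimal sketch of the idea in J, for illustration only. The names sst
and lookup are made up here and are not taken from tara or from Excel.
As far as I know, binding the table with & lets J build a hash for it
once, when the verb is defined, rather than on every call; the bound
verb must be redefined whenever the table grows.

NB. shared string table: each distinct string is stored once
sst=: ~. ;: 'aa c3 d5 aa c3 ee'
NB. index lookup against the bound (assumed prehashed) table
lookup=: sst&i.

   lookup ;: 'c3 ee aa'
1 3 0
   (lookup ;: 'c3 ee aa') { sst
+--+--+--+
|c3|ee|aa|
+--+--+--+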

Fri, 19 Feb 2010, Alex Rufon wrote:
> I know I've been on this path before but how did you guys do the conversion 
> from string to numbers (and vice versa) in APL? Can we replicate this in J?
> 
> Right now, I'm doing this by making a global vector of unique strings I've 
> encountered.
> DICTIONARY_z_ =: < every a.
> 
> search=: verb define
> data=. , boxopen y
> DICTIONARY_z_=: ~. DICTIONARY_z_,data
> DICTIONARY_z_ i. boxopen y
> )
> 
> NB. boxopen is defined as:
> NB. boxopen=: <^:(L. = 0:)
> 
> 
> NB. Sample string data
>    [data=. (;: 'aa c3 d5 ae af ax ac ee'),<'The answer is 42'
> +--+--+--+--+--+--+--+--+----------------+
> |aa|c3|d5|ae|af|ax|ac|ee|The answer is 42|
> +--+--+--+--+--+--+--+--+----------------+
> 
> NB. Test data is 2 dimensions
>    [testdata=. 8 3 $ data
> +--+--+----------------+
> |aa|c3|d5              |
> +--+--+----------------+
> |ae|af|ax              |
> +--+--+----------------+
> |ac|ee|The answer is 42|
> +--+--+----------------+
> |aa|c3|d5              |
> +--+--+----------------+
> |ae|af|ax              |
> +--+--+----------------+
> |ac|ee|The answer is 42|
> +--+--+----------------+
> |aa|c3|d5              |
> +--+--+----------------+
> |ae|af|ax              |
> +--+--+----------------+
> 
> NB. Numeric equivalent of the strings
>    [asnumbers=. search testdata
> 265 257 258
> 266 267 268
> 269 263 264
> 265 257 258
> 266 267 268
> 269 263 264
> 265 257 258
> 266 267 268
> 
> NB. Check if the numbers convert to strings
>    asnumbers { DICTIONARY
> +--+--+----------------+
> |aa|c3|d5              |
> +--+--+----------------+
> |ae|af|ax              |
> +--+--+----------------+
> |ac|ee|The answer is 42|
> +--+--+----------------+
> |aa|c3|d5              |
> +--+--+----------------+
> |ae|af|ax              |
> +--+--+----------------+
> |ac|ee|The answer is 42|
> +--+--+----------------+
> |aa|c3|d5              |
> +--+--+----------------+
> |ae|af|ax              |
> +--+--+----------------+
> 
> NB. Switch the columns
>    DICTIONARY {~ 1 2 1 { "1 asnumbers
> +--+----------------+--+
> |c3|d5              |c3|
> +--+----------------+--+
> |af|ax              |af|
> +--+----------------+--+
> |ee|The answer is 42|ee|
> +--+----------------+--+
> |c3|d5              |c3|
> +--+----------------+--+
> |af|ax              |af|
> +--+----------------+--+
> |ee|The answer is 42|ee|
> +--+----------------+--+
> |c3|d5              |c3|
> +--+----------------+--+
> |af|ax              |af|
> +--+----------------+--+
> 
> NB. Only get rows that are equal to 'aa' in the first column
>    [akey=. search 'aa'
> 265
>    [arows=. akey = {. "1 asnumbers
> 1 0 0 1 0 0 1 0   
>    DICTIONARY {~ arows # asnumbers
> +--+--+--+
> |aa|c3|d5|
> +--+--+--+
> |aa|c3|d5|
> +--+--+--+
> |aa|c3|d5|
> +--+--+--+
> 
> NB. Or simply
>    DICTIONARY {~ asnumbers #~ (search 'aa') = {. "1 asnumbers
> +--+--+--+
> |aa|c3|d5|
> +--+--+--+
> |aa|c3|d5|
> +--+--+--+
> |aa|c3|d5|
> +--+--+--+   
> 
> As you can see, it works. But I'm curious if there is another (better?) way 
> to do it. The only requirement is that the numbers generated can be processed 
> to get the original strings.
> 
> Thanks.
> 
> r/Alex   
> 

-- 
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3