Hello,

I'm new to J so please forgive me if this is a FAQ.

I wrote some short sentences to parse a log file. I want to retrieve all the
unique values of some attribute. Each occurrence appears in the log file as
<attribute name>SPACE<attribute value>, for example "..... csn 92892849893284
..."

My initial (brute force) program is:

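text =: 1!:1 < '/tmp/logfile'     NB. read the whole file as one string
words =: (' ',LF) cutopen text    NB. box each token, cutting on blanks and line feeds
bv =: (<'csn') = words            NB. 1 where the token is the attribute name
srbv =: _1 |.!.0 bv               NB. shift right by one (fill 0) to mark the token after it
csns =: ~. srbv # words           NB. select the marked values and keep the unique ones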

Now csns holds the unique values as requested.
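
To make the steps concrete, here is a small sketch of the same sentences run
on a made-up two-line string instead of the real file. The sample text and its
values are invented for illustration only, and I am assuming the delimiters
should be passed to cutopen explicitly as ' ',LF so that tokens are cut on both
blanks and line ends.

```j
NB. hypothetical sample data standing in for the contents of the log file
text =: 'a csn 111 b',LF,'c csn 222 d csn 111',LF

words =: (' ',LF) cutopen text   NB. 10 boxed tokens: a csn 111 b c csn 222 d csn 111
bv =: (<'csn') = words           NB. 0 1 0 0 0 1 0 0 1 0
srbv =: _1 |.!.0 bv              NB. 0 0 1 0 0 0 1 0 0 1
csns =: ~. srbv # words          NB. the two boxes 111 and 222
```

The value 111 occurs twice in the sample but is kept once, which is the
behaviour I want from ~. on the selected tokens.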

The program works fine for small files (a few megabytes).

My question is, what should be done to make it work for large files (say,
1GB or more)? I guess it involves memory-mapped files, but I have no clue
how to proceed from here.

Further, is there any notion of 'laziness' (evaluate only when the data is
really needed) in J? Can a verb be declared as a lazy verb?

Thanks,

Yoel
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
