Hi Nick -

I use J to work with the data for the Netflix challenge, albeit with little
success so far.  In any case, this data comprises over 100 million records
(where I represent a record as 4 integers), so it's over 1.5 GB in total.
To deal with it, I broke the data into about 100 pieces that I treat as a
numbered set of variables, say 'murd0' to 'murd99', and I work on the
entire set by using an adverb or two.  You can check out an essay I wrote
on improving this adverb at
http://www.jsoftware.com/jwiki/NYCJUG/2007-03-13?highlight=%28gavinfo%29#head-0c69d286922c2243a3e9c74bb1f06a4d9e88bbfd.
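
Concretely, the pattern looks something like the sketch below.  This is a
rough illustration, not the code from the essay: the adverb 'onall' and its
body are my assumptions here, and it presumes 'murd0' through 'murd99'
already exist as Nx4 integer tables.

onall =: adverb define
NB. run the verb u on each numbered piece and reassign it in place
for_k. i. 100 do.
  nm =. 'murd' , ": k      NB. build the name, e.g. 'murd37'
  (nm) =: u ". nm          NB. evaluate it, apply u, assign it back
end.
''
)

   /:~ onall               NB. e.g. sort the rows of every piece

The nice part is that the verb you pass in never needs to know the data is
split into pieces at all.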

The essay pertains to a long, complex operation, but most simple things I do
to all 100 million records take about 2 minutes.  A heavy-duty re-ordering
of the entire set might take 5 or 6 hours.
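
If you want to gather timings like these yourself, J's 6!:2 foreign returns
the time in seconds to execute a sentence; for example, with the
hypothetical 'onall' sketched above:

   6!:2 '/:~ onall'        NB. seconds to sort the rows of all 100 pieces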

Let me know if you'd like to know more.

Regards,

Devon

On 12/27/07, Nick Kostirya <[EMAIL PROTECTED]> wrote:
>
> Hello All.
>
> To gauge which situations I can use J for, I am looking for success
> stories of processing large data arrays with J.
>
> First of all, I am interested in the sizes of the data sets, the space
> they occupied in storage, and the hardware used for the computation.
> Beyond that, any details about the specifics of operating on such huge
> data arrays would be welcome.
>
> I'd be much obliged for any detailed information.
>
> All the best, Nick



-- 
Devon McCormick, CFA
^me^ at acm.org is my preferred e-mail
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
