Hi Mattia - I was working on a longer answer to your solver question, since I have something written up on that, but the short answer is that any number of simple, elegant solvers can be implemented in J, and I have examples of using a couple of them. Unfortunately, searching the J wiki for "solver" turns up nothing of interest, though I'm sure there is material out there.
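To give you a taste in the meantime - this is just a sketch I'm adding here, not the write-up itself, and the example function is arbitrary - J's matrix divide %. solves linear systems directly, and a Newton-Raphson iteration is a one-liner once you supply the derivative:

   NB. Linear system: x %. m gives the (least-squares) solution z of m +/ .* z = x
   mat =. 3 3 $ 4 1 0  1 3 1  0 1 2
   rhs =. 5 10 7
   rhs %. mat

   NB. Newton-Raphson for a scalar equation, derivative supplied by hand
   f    =. _2 + *:         NB. f(x) = x^2 - 2
   f1   =. +:              NB. f'(x) = 2x
   step =. ] - f % f1      NB. one Newton step: x - f(x) % f'(x)
   step ^: _ ] 1           NB. iterate to a fixed point: ~1.41421 (sqrt 2)

The same step-and-iterate shape carries over to fancier solvers; only the step verb changes.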
As far as working with large datasets goes, I've been vainly attempting to meet
the Netflix challenge, though I have been able to work with its dataset easily,
if crudely. It's about 100 million small records - roughly 1% the size of yours
in terms of bytes - and I can do simple things to all of the records in about
five minutes on a 2 GHz XP machine with 1 GB of RAM. I don't know whether that
is good or bad from your perspective.

The crude technique I use with this large dataset is to break it up into about
100 files that I treat en masse by applying a J verb across the whole group, or
across selected parts of it. It works well enough for my purposes (I've appended
a rough sketch of the idea below, after my signature).

Anyway, with any luck we'll continue this dialog.

Good luck,

Devon

On 3/30/08, Mattia Landoni <[EMAIL PROTECTED]> wrote:
>
> Hi all,
>
> this is a narrowed-down version of an email I just sent to the general list
> with the same subject.
>
> I am an economist and I discovered J a few days ago. I haven't been so
> excited since I was 13 and Santa brought me an 8-bit Nintendo Entertainment
> System. Yet before taking a week off from work to study J (just kidding),
> I would like to be sure it does everything I need. Here is what concerns me
> the most.
>
> - How does J deal with very large datasets? Currently I am dealing with a
> 65 GB dataset. So far the only software I can use is SAS. Performing an SQL
> query [SELECT, GROUP BY] in SAS on a dedicated server takes me six hours,
> of which a large part is network I/O (I guess SAS's computing time would be
> an hour, perhaps two). The data is divided into 7 chunks of 7 to 13 GB each.
> With the same amount of data on a good computer, would I be able to perform
> the same operations with J? Assuming plentiful RAM and a speedy processor,
> what's the order of magnitude of the time it would take?
> - I read something about memory mapping in past posts and I intuitively
> understand what it means, but I have never done it. What are the limits of
> memory mapping? In general, what are the techniques for dealing with large
> datasets?
>
> Any answer, hint, link, ... most welcome.
>
> Mattia
>
> --
> Mattia Landoni
> 1201 S Eads St Apt 417
> Arlington, VA 22202-2837
> USA
> Greenwich -5 hours
>
> Office: +1 202 62 35922
> Cell: +1 202 492 3404
> Home: +1 360 968 1684
>
> Govern a great country as you would fry a small fish: do not poke at it too
> much.
> -- Lao Tzu
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm

--
Devon McCormick, CFA
^me^ at acm. org is my preferred e-mail
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
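Here is the rough sketch of the "one verb across many chunk files" idea
mentioned above. It is not the code I actually run on the Netflix data; the
file names (chunk0 .. chunk99) and the per-file verb are invented just to show
the shape of the approach, and each chunk is assumed to hold one number per
line:

   require 'files'                         NB. standard library: fread reads a whole file

   names  =. ('chunk' , ":) &.> i. 100     NB. boxed file names: chunk0 chunk1 .. chunk99
   persum =. +/ @: (". ;. _2) @: fread @: >   NB. open box, read file, parse lines, sum
   NB. grand =. +/ persum"0 names          NB. apply the verb across the whole group

Anything that maps one file name to one result can be slotted in for persum,
so the same pattern covers selections, tallies, and so on over the group or
any subset of it.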

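On the memory-mapping question: the usual route in J is the jmf script, which
lets you treat a file on disk as an ordinary J noun without reading it all into
memory. The sketch below is from memory and the file name is made up, so take
the exact verb names and arguments as assumptions to check against the jmf
script in your own installation:

   require 'jmf'                           NB. J memory-mapped file utilities

   NB. Map an existing flat file as a character array (file name hypothetical)
   JCHAR map_jmf_ 'dat' ; 'bigfile.txt'
   NB. ... use the noun dat like any other J array ...
   unmap_jmf_ 'dat'

The practical limits are mostly address space and the speed of your disk, which
is why I fell back on the many-small-files approach above for the Netflix data.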