I took the liberty of changing the subject line on this since it is a different topic.
I have been handling large amounts of data for the Netflix challenge using the WS.ijs functions to which I've already alluded. It's kind of kludgy, but it works OK: I break the 100 million elements of the dataset into about 100 variables with names like "murd0", "murd1", ..., "murd99", then define an adverb to apply an arbitrary verb across the variables on file.

Other than this, I've used J's ODBC functionality to work with large amounts of data in a database, and memory-mapped files for fairly large amounts of simply-structured data. Of these methods, I like ODBC best for data with any complexity of structure, though I might use it with .CSV files as my database tables to avoid the limitations and restrictions of any particular database system.

On 1/29/07, dly <[EMAIL PROTECTED]> wrote:
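In case it helps, the adverb-over-file-variables idea might be sketched roughly as below. This is a hypothetical sketch, not the actual WS.ijs code: "getvar" stands in for whatever verb retrieves one named variable from file, and the chunk names follow the murd0..murd99 scheme mentioned above.

```
NB. Hypothetical sketch: apply a verb u to each on-file chunk.
NB. 'getvar' is an assumed stand-in for the WS.ijs retrieval verb
NB. that takes a name like 'murd0' and returns its value.

names =: (<'murd') ,each ": each i. 100   NB. boxed names 'murd0' .. 'murd99'

overChunks =: 1 : 'u@getvar each names'   NB. boxed per-chunk results of u

NB. e.g. per-chunk sums, then a grand total by razing the boxes:
NB.   sums  =: (+/) overChunks
NB.   total =: +/ ; sums
```

The point of the adverb is that any aggregation that can be computed chunk-by-chunk (sums, counts, maxima) never needs all 100 million elements in memory at once.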
This brings up the topic of good ways to handle large amounts of data: not just data that can be regenerated, but reference data or records in databases. J is very powerful, so care needs to be taken not to overwrite or erase data files. Are there applications people use with J to manage this?

dly
[EMAIL PROTECTED]

On 29-Jan-07, at 3:41 PM, Devon McCormick wrote:
> That's what I was thinking of, since the existing code is geared
> toward saving individual nouns.
>
> On 1/29/07, Roger Hui <[EMAIL PROTECTED]> wrote:
>>
>> When the workspace is 250 MB, almost certainly most
>> of it is data (nouns). One variant could be to save
>> just the verbs, adverbs, and conjunctions. You can
>> get this by using nl__y 1 2 3 instead of
>> nl__y i.4 .

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
--
Devon McCormick
^me^ at acm. org is my preferred e-mail
