I took the liberty of changing the subject line on this since it is a different topic.
I have been handling large amounts of data for the Netflix challenge using the WS.ijs functions to which I've already alluded. It's kind of kludgy, but it works OK: I break the 100 million elements of the dataset into about 100 variables with names like "murd0", "murd1", ..., "murd99", then define an adverb to apply an arbitrary verb across the variables on file.

Other than this, I've used J's ODBC functionality to work with large amounts of data in a database, and memory-mapped files for fairly large amounts of simply-structured data. Of these methods, I like ODBC best for data with any complexity of structure, though I might use it with .CSV files as my database tables to avoid the limitations and restrictions of any particular database system.

On 1/29/07, dly <[EMAIL PROTECTED]> wrote:
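In case it helps, the adverb-over-file-variables idea might be sketched roughly as below. This is a hypothetical sketch, not the actual WS.ijs code: "getvar" stands in for whatever verb retrieves one named variable from file, and the chunk names follow the murd0..murd99 scheme mentioned above.

```
NB. Hypothetical sketch: apply a verb u to each on-file chunk.
NB. 'getvar' is an assumed stand-in for the WS.ijs retrieval verb
NB. that takes a name like 'murd0' and returns its value.

names =: (<'murd') ,each ": each i. 100   NB. boxed names 'murd0' .. 'murd99'

overChunks =: 1 : 'u@getvar each names'   NB. boxed per-chunk results of u

NB. e.g. per-chunk sums, then a grand total by razing the boxes:
NB.   sums  =: (+/) overChunks
NB.   total =: +/ ; sums
```

The point of the adverb is that any aggregation that can be computed chunk-by-chunk (sums, counts, maxima) never needs all 100 million elements in memory at once.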
This brings up the topic of good ways to handle large amounts of data: not just data that can be regenerated, but reference data or records in databases. J is very powerful, so care needs to be taken not to overwrite or erase data files. Are there applications people use with J to manage this?

dly
[EMAIL PROTECTED]

On 29-Jan-07, at 3:41 PM, Devon McCormick wrote:
> That's what I was thinking of, since the existing code is geared
> toward saving individual nouns.
>
> On 1/29/07, Roger Hui <[EMAIL PROTECTED]> wrote:
>>
>> When the workspace is 250 MB, almost certainly most
>> of it is data (nouns). One variant could be to save
>> just the verbs, adverbs, and conjunctions. You can
>> get this by using nl__y 1 2 3 instead of
>> nl__y i.4 .

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
--
Devon McCormick
^me^ at acm. org is my preferred e-mail
