RE: [Jgeneral] Successful stories

Joey K Tuttle Thu, 03 Jan 2008 20:58:05 -0800

One advantage of fixed length rows is that a file of them
can be mapped as a matrix, rather than having to deal with
CSV as varying length bits of a stream.
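
A minimal sketch of this point, written in Python rather than J and using only the standard library. The record layout (two 64-bit integer coordinates plus one 64-bit float) and the file name are assumptions made for illustration, not something specified in the thread. With fixed-length records, record k sits at byte offset k times the record size, so a memory-mapped file can be indexed like a table without scanning:

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout: row index, column index (64-bit ints), value (64-bit float).
RECORD = struct.Struct("<qqd")  # 24 bytes per record, little-endian

def write_records(path, triples):
    """Write (row, col, value) triples as fixed-length binary records."""
    with open(path, "wb") as f:
        for r, c, v in triples:
            f.write(RECORD.pack(r, c, v))

def read_record(mapped, k):
    """Fetch record k by offset alone; earlier records are never parsed."""
    return RECORD.unpack_from(mapped, k * RECORD.size)

triples = [(0, 0, 4.0), (0, 1, 1.0), (1, 1, 3.0), (2, 0, 2.0)]
path = os.path.join(tempfile.mkdtemp(), "matrix.bin")  # hypothetical file name
write_records(path, triples)

with open(path, "rb") as f:
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    n_records = len(mapped) // RECORD.size  # record count follows from file size
    third = read_record(mapped, 2)          # random access to the third record
    mapped.close()
```

With CSV, finding the third record would require scanning for two line breaks first, because the length of each line is not known in advance.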


- joey

At 11:32  +0800 2008/01/04, Alex Rufon wrote:

Hi Nick,

If your only choice is between fixed length and CSV ... I'd go with CSV.

I don't actually know if it's possible, but maybe you can store your data using SQLite or any database. That way you can partition your data to only what you need by constructing clever SQL statements (although since your columns are the coordinates, it may be a pain). I believe that the problem is not in storing the data but in retrieving the right amount of data to fit into your memory/work space.
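
A rough sketch of the database idea, in Python with the standard sqlite3 module. The table name `cells`, the column names, and the in-memory database are assumptions for illustration only; whether this scales to billions of rows is not something the sketch demonstrates. It stores one (row, column, value) triple per table row and pulls back only a chosen block of matrix rows:

```python
import sqlite3

# In-memory database for the sketch; a real run would presumably use a file on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cells (r INTEGER, c INTEGER, v REAL, PRIMARY KEY (r, c))"
)

triples = [(0, 0, 4.0), (0, 1, 1.0), (1, 1, 3.0), (2, 0, 2.0)]
conn.executemany("INSERT INTO cells VALUES (?, ?, ?)", triples)
conn.commit()

def load_rows(conn, first, last):
    """Return only the nonzero cells whose matrix row lies in [first, last]."""
    cur = conn.execute(
        "SELECT r, c, v FROM cells WHERE r BETWEEN ? AND ? ORDER BY r, c",
        (first, last),
    )
    return cur.fetchall()

block = load_rows(conn, 0, 1)  # cells of matrix rows 0 and 1 only
```

The primary key on (r, c) gives the database an index to locate a row range, so only the requested partition needs to be brought into the workspace.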


r/Alex

-----Original Message-----
From: Nick Kostirya [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 03, 2008 9:23 PM
To: General forum
Cc: Alex Rufon
Subject: Re: [Jgeneral] Successful stories

On Thu, 3 Jan 2008 17:58:59 +0800
"Alex Rufon" <[EMAIL PROTECTED]> wrote:

 ...

 Nick. Just tell us what you think of doing. Maybe we can help. There
 are a lot of brilliant people here on the list and maybe we can help
 things move along for you. ;)


Thanks a lot.

I have two goals.

The first one amounts to solving a giant system of linear equations.
The matrix A in the equation Y=A*X is rather sparse, but large.
The size of matrix A is 100 million by 100 million; however, each
row will have only around 100 nonzero cells.

The matrix is described in a file with 3 columns.
The first two columns contain the cell's coordinates, the third column
contains the value. That is, the file contains 10 billion rows. I
haven't decided yet which format is better for storing the data in one
file. It can be either fixed length rows, or CSV. Which is better from
the J viewpoint?

That said, the task consists of the following:
1) read the file and generate the matrix
2) solve the system of linear equations
3) save the result
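
The three steps can be sketched on a toy scale as follows, in Python rather than J. Several things here are assumptions, not taken from the thread: that Y is known and X is the unknown, that the input is CSV with "row,col,value" lines, and that A is diagonally dominant so that plain Jacobi iteration converges. Jacobi is used only because it is short and touches just the nonzero cells; it is not presented as the best method for a problem of this size.

```python
import csv
import io
from collections import defaultdict

def read_triples(stream):
    """Step 1: read 'row,col,value' lines into a dict of sparse rows."""
    rows = defaultdict(dict)
    for r, c, v in csv.reader(stream):
        rows[int(r)][int(c)] = float(v)
    return rows

def jacobi(rows, y, iterations=200):
    """Step 2: solve A*x = y by Jacobi iteration (assumes diagonal dominance)."""
    x = [0.0] * len(y)
    for _ in range(iterations):
        new_x = []
        for i in range(len(y)):
            row = rows[i]
            # Only the stored (nonzero) off-diagonal cells contribute.
            off_diag = sum(v * x[j] for j, v in row.items() if j != i)
            new_x.append((y[i] - off_diag) / row[i])
        x = new_x
    return x

def save_vector(stream, x):
    """Step 3: write one solution component per line."""
    for value in x:
        stream.write(f"{value:.6f}\n")

# Toy 2x2 system: 4*x0 + x1 = 6 and 2*x0 + 5*x1 = 12.
data = io.StringIO("0,0,4\n0,1,1\n1,0,2\n1,1,5\n")
y = [6.0, 12.0]
x = jacobi(read_triples(data), y)
out = io.StringIO()
save_vector(out, x)
```

Each iteration costs work proportional to the number of nonzero cells, which is why the sparse row representation matters more than the choice of file format once the data is loaded.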

In short, it's simple, but for a newbie in J it's hard to determine
the best way for now. For example, the "Connection matrix" generation
process described at
http://www.jsoftware.com/help/dictionary/samp20.htm doesn't seem quite
optimal to me for the task above.

The second task is related to factor analysis; however, there is
less data there. So I believe that solving the first task will give me
the experience needed to solve the second one.
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm


