> I doubt that it is stored as an explicit matrix. Each page would probably > have a big table (or file) entry and would have a list of links including > link text.
Oh.. Probably, and some random walk on the link graph. On Wed, Oct 29, 2008 at 2:12 PM, Ted Dunning <[EMAIL PROTECTED]> wrote: > On Tue, Oct 28, 2008 at 5:15 PM, Edward J. Yoon <[EMAIL PROTECTED]>wrote: > >> ... >> In single machine, as far as we >> know graph can be stored to linked list or matrix. >> > > Since the matrix is normally very sparse for large graphs, these two > approaches are pretty similar. > > >> ... So, I guess google's web graph will be stored as a matrix in a >> bigTable. >> > > I doubt that it is stored as an explicit matrix. Each page would probably > have a big table (or file) entry and would have a list of links including > link text. > > > Have you seen my 2D block algorithm post?? -- >> http://blog.udanax.org/2008/10/parallel-matrix-multiply-on-hadoop.html >> > > I have now. Block decomposition for multiplies almost always applies only > to dense matrix operations. For most sparse matrix representations > extracting a block is only efficient if it is full width or height. For > very sparse matrix operations, the savings due to reuse of intermediate > results are completely dominated by the I/O cost so block decompositions are > much less helpful. > > In many cases, it isn't even very helpful to send around entire rows and > sending individual elements is about as efficient. > > FYI, Hama (http://incubator.apache.org/hama/) will be handled graph >> algorithms since it is a related with adjacency matrix and topological >> algebra. And I think 2000 node hadoop/hbase cluster is big enough if a >> sequential/random read/write speed will be improved 800%. :-) >> > > I think that a 5 node cluster is big enough without any improvement in > read/write speed. > > Of course, it depends on the size of the problem. I was only working with a > matrix with a few tens of billions of non-zero values. > -- Best regards, Edward J. Yoon [EMAIL PROTECTED] http://blog.udanax.org
