Oops - correction. The goal is to be as fast as or faster than Matlab/Octave/Scilab, but in certain cases we may accept being within 1.5x. So, report an issue if a particular sparse matrix operation is slower. Even if we can't address some of these issues right away, it is good to have them on the issue tracker to address later.
-viral

On Friday, November 14, 2014 9:06:23 AM UTC+5:30, Viral Shah wrote:

> There is no reason we should be slower than Matlab's C implementation by more than a factor of 2, ideally 1.5. Not counting GC time, which is being addressed separately anyway, if you conclusively find certain sparse matrix operations slower than Matlab, please file an issue. There are always some code paths and certain operations that may not have been fully optimized just yet.
>
> -viral

On Thursday, November 13, 2014 1:09:15 AM UTC+5:30, Joshua Tokle wrote:

> I unrolled the loop as suggested and it brought the runtime down to 85-100 seconds from around 220 seconds. Thanks! We're now approaching the speed of the original matlab code, which runs in about 70 seconds. I'll try to think of other ways to bypass the unnecessary creation of temporary objects.
>
> Out of curiosity, why do you suppose the matlab code performs so well on the original implementation compared to julia? Better sparse matrix operations? Is it better optimized for temporary objects? Something else?

On Tue, Nov 11, 2014 at 11:25 PM, Tony Kelman <[email protected]> wrote:

> At this point the most effective way to write decently-performing sparse matrix code in Julia is to familiarize yourself with the compressed sparse column format (http://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_column_.28CSC_or_CCS.29) and work directly on the colptr, rowval, and nzval vectors. This isn't much better than writing sparse matrix code in C (just keep in mind that the arrays are 1-based in Julia, whereas if you were writing a mex file or standalone C code they would be 0-based), but it's not too terrible once you've done it a few times.
>
> Any time you are trying to do a standard elementwise, binary, or reduction operation across dimensions - min, max, sum, sumabs, sumabs2, etc. - on sparse matrices and you see the current Julia version performing poorly because the fallback version is dense, you can implement it manually without too much trouble by operating directly on the colptr, rowval, and nzval vectors. Pull requests to add additional sparse-specialized operations to sparsematrix.jl are welcome. I'd personally like to see a more general design of the array system long-term that would reduce the number of specialized methods that need to be written for each sparse matrix format, but that's a ways off.
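[For illustration, a minimal sketch of the approach Tony describes: a column-wise sum implemented directly on the CSC storage, so it touches only the stored entries instead of falling back to a dense method. The helper name `colsums` is made up, and the sketch assumes the SparseMatrixCSC fields of the time (colptr, rowval, nzval).]

    # Illustrative helper (not from the thread): column-wise sum over a
    # SparseMatrixCSC, walking colptr/nzval directly so only the stored
    # nonzeros of each column are visited.
    function colsums(A::SparseMatrixCSC)
        n = size(A, 2)
        out = zeros(eltype(A), n)
        for j in 1:n
            # entries of column j live at positions colptr[j]:(colptr[j+1]-1)
            for k in A.colptr[j]:(A.colptr[j+1] - 1)
                out[j] += A.nzval[k]   # A.rowval[k] would give the row index
            end
        end
        return out
    end

The same pattern - an outer loop over columns, an inner loop over each column's stored range - covers most of the elementwise and reduction operations mentioned above.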
On Tuesday, November 11, 2014 7:11:22 AM UTC-8, Milan Bouchet-Valat wrote:

> On Tuesday, November 11, 2014 at 09:54 -0500, Joshua Tokle wrote:
> > I think I see: you're suggesting a single loop that performs the a*b*(c+d+e) operation element-wise, and this will prevent the allocation of intermediate results. Is that right?
> Yes.
>
> > Yes, the sparsity pattern will differ between the vectors I'm operating on, but maybe I can adapt the code from sparsematrix.jl to my special case.
> Indeed, that makes my suggestion more challenging. A general algorithm to combine several sparse matrices would be very useful.
>
> Regards

On Tue, Nov 11, 2014 at 9:43 AM, Milan Bouchet-Valat <[email protected]> wrote:

> On Tuesday, November 11, 2014 at 04:52 -0800, Joshua Tokle wrote:
> > Thanks for your response. Because the operations are on sparse matrices I'm pretty sure the arithmetic is already more optimized than something I would write:
> >
> > https://github.com/JuliaLang/julia/blob/master/base/sparse/sparsematrix.jl#L530
> >
> > I did actually spend some time searching for sparse matrix arithmetic algorithms, but I didn't come up with anything that seemed like it would be an improvement.
> Of course the algorithm is optimized for sparse matrices, but every call to + or .* creates a copy of the matrix, which is not efficient at all.
>
> > It seems that the issue may be with garbage collection. I'm going to post a top-level reply with more on this.
> I'm not an expert on sparse matrices, but one efficient way of avoiding temporary copies and garbage collection is to manually unroll your loop, as I advised in my first message. Now I must admit it's more complex than it seems if the structure of nonzeros is not always the same in the various matrices you want to combine. Is that the case?
>
> Regards
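[Combining indicator vectors with different sparsity patterns, as discussed above, comes down to a merge over their sorted row indices. Below is a minimal sketch of the simplest case: an elementwise AND of two sparse 0/1 indicator columns, stored as n-by-1 SparseMatrixCSC values as sparse vectors were at the time. The helper name `and_rows` is made up, and it assumes all stored entries are nonzero.]

    # Illustrative helper (not from the thread): row indices where both 0/1
    # indicator columns are nonzero. rowval is sorted within a column, so a
    # single merge pass suffices and no temporary matrices are allocated.
    function and_rows(a::SparseMatrixCSC, b::SparseMatrixCSC)
        ra, rb = a.rowval, b.rowval
        out = Int[]
        i, j = 1, 1
        while i <= length(ra) && j <= length(rb)
            if ra[i] == rb[j]
                push!(out, ra[i])      # row present in both patterns
                i += 1; j += 1
            elseif ra[i] < rb[j]
                i += 1                 # advance whichever vector is behind
            else
                j += 1
            end
        end
        return out
    end

A k-way version of the same merge would handle expressions like a .* (b + c + d) in one pass.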
On Monday, November 10, 2014 5:05:29 PM UTC-5, Milan Bouchet-Valat wrote:

> On Monday, November 10, 2014 at 13:03 -0800, Joshua Tokle wrote:
> > Hello! I'm trying to replace an existing matlab program with julia and I'm having trouble matching the performance of the original code. The matlab code is here:
> >
> > https://github.com/jotok/InventorDisambiguator/blob/julia/Disambig.m
> >
> > The program clusters inventors from a database of patent applications. The input data is a sparse boolean matrix (named XX in the script), where each row defines an inventor and each column defines a feature. For example, the jth column might correspond to a feature "first name is John". If there is a 1 in XX[i, j], this means that inventor i's first name is John. Given an inventor i, we find similar inventors by identifying rows in the matrix that agree with XX[i, :] on a given column and then applying element-wise boolean operations to the rows. In the code, for a given value of `index`, C_lastname holds the unique column in XX corresponding to a "last name" feature such that XX[index, :] equals 1. C_firstname holds the unique column in XX corresponding to a "first name" feature such that XX[index, :] equals 1. And so on. The following code snippet finds all rows in the matrix that agree with XX[index, :] on full name and one of patent assignee name, inventor city, or patent class:
> >
> >     lump_index_2 = step & ((C_assignee | C_city | C_class))
> >
> > The `step` variable is an indicator that's used to prevent the same inventors from being considered multiple times. My attempt at a literal translation of this code to julia is here:
> >
> > https://github.com/jotok/InventorDisambiguator/blob/julia/disambig.jl
> >
> > The matrix X is of type SparseMatrixCSC{Int64, Int64}. Boolean operations aren't supported for sparse matrices in julia, so I fake it with integer arithmetic. The line that corresponds to the matlab code above is
> >
> >     lump_index_2 = find(step .* (C_name .* (C_assignee + C_city + C_class)))
> >
> You should be able to get a speedup by replacing this line with an explicit `for` loop. First, you'll avoid memory allocation (one allocation for each + or .* operation). Second, you'll be able to return as soon as the index is found, instead of computing the value for all elements (IIUC you're only looking for one index, right?).
>
> My two cents
>
> > The reason I grouped it this way is that initially `step` will be a "sparse" vector of all 1's, and I thought it might help to do the truly sparse arithmetic first.
> >
> > I've been testing this code on a Windows 2008 Server. The test data contains 45,763 inventors and 274,578 possible features (in other words, XX is a 45,763 x 274,578 sparse matrix). The matlab program consistently takes about 70 seconds to run on this data. The julia version shows a lot of variation: it's taken as little as 60 seconds and as much as 10 minutes. However, most runs take around 3.5 to 4 minutes. I pasted one output from the sampling profiler here [1]. If I'm reading this correctly, it looks like the program is spending most of its time performing element-wise multiplication of the indicator vectors I described above.
> >
> > I would be grateful for any suggestions that would bring the performance of the julia program in line with the matlab version. I've heard that the last time the matlab code was run on the full data set it took a couple of days, so a slow-down of 3-4x is a significant burden. I did attempt to write a more idiomatic julia version using Dicts and Sets, but it's slower than the version that uses sparse matrix operations:
> >
> > https://github.com/jotok/InventorDisambiguator/blob/julia/disambig2.jl
> >
> > Thank you!
> > Josh
> >
> > [1] https://gist.github.com/jotok/6b469a1dc0ff9529caf5
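[To make Milan's inline suggestion concrete, here is a sketch of the explicit-loop version of that line for the case where the indicator vectors are dense 0/1 vectors of equal length; the names match the snippet above, and the helper name `lump_indices` is made up. It makes a single pass and allocates nothing besides the result.]

    # Illustrative fused replacement (not from the thread) for
    #     lump_index_2 = find(step .* (C_name .* (C_assignee + C_city + C_class)))
    # assuming dense 0/1 indicator vectors: no temporary vectors are created,
    # and && / || short-circuiting skips most of the per-element work.
    function lump_indices(step, C_name, C_assignee, C_city, C_class)
        out = Int[]
        for i in 1:length(step)
            if step[i] != 0 && C_name[i] != 0 &&
               (C_assignee[i] != 0 || C_city[i] != 0 || C_class[i] != 0)
                push!(out, i)
            end
        end
        return out
    end

If the vectors are kept sparse instead, the same effect requires the merge over sorted row indices sketched earlier in the thread.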
