> though to be frank, I don't understand your second paragraph i.e, how
> turning the vectors into sparse vectors will enable me to do transpose in a
> easier fashion without resorting to doing it manually), however, I suppose
> the purpose of the DRM format was to make step 5,6 much easier so I guess I

What I meant: since you can use sparse vectors, you don't have to number the
rows strictly sequentially with a single reducer. You can still have several
reducers, each numbering rows sequentially within its own range rather than
globally. The resulting gaps in the index space will not hurt you from the
problem-size point of view, because only the number of nonzero elements
matters.

-d

>
> Thanks again!
>
>
>
> On Thu, May 5, 2011 at 9:40 AM, Dmitriy Lyubimov <[email protected]> wrote:
>
>> I think the first step is to decide on the pipeline of algorithms. Once u
>> know the algorithms u want to run thru, it would be easier to come up with
>> vectorization requirements.
>>
>> That said, for the sake of transposition, note that mahout supports sparse
>> vectors, i.e. it doesn't matter what the element index is, as long as it
>> is unique; only the number of nonzero elements does. So I don't think that
>> u are per se constrained in number of reducers during vectorization for
>> transpose. That would have been pretty scale restricting, indeed.
>>
>> apologies for brevity.
>>
>> Sent from my android.
>> -Dmitriy
>> On May 5, 2011 6:58 AM, "Vckay" <[email protected]> wrote:
>>
>
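
For illustration, the per-reducer numbering described in the reply at the top of this message could be sketched roughly as below. This is plain Python, not Mahout code: the names `assign_row_ids`, `transpose`, and `range_size` are hypothetical, and a list of key lists stands in for the reducers' partitions. Each "reducer" numbers its rows sequentially inside its own disjoint range, so the ids are unique but not globally contiguous, and the sparse transpose only stores the nonzero entries.

```python
from collections import defaultdict


def assign_row_ids(partitions, range_size=1_000_000):
    """Give every row key a unique integer id.

    partitions: one list of row keys per (simulated) reducer.
    range_size: assumed upper bound on rows handled by one reducer.
    Reducer r hands out ids r*range_size, r*range_size + 1, ...
    """
    ids = {}
    for reducer_id, keys in enumerate(partitions):
        base = reducer_id * range_size
        for local_index, key in enumerate(keys):
            if local_index >= range_size:
                raise ValueError("reducer exceeded its id range")
            ids[key] = base + local_index
    return ids


def transpose(rows, ids):
    """Transpose sparse rows {key: {col: value}} into {col: {row_id: value}}.

    Only nonzero entries are stored, so gaps between the id ranges
    of different reducers cost nothing.
    """
    transposed = defaultdict(dict)
    for key, row in rows.items():
        for col, value in row.items():
            transposed[col][ids[key]] = value
    return dict(transposed)


if __name__ == "__main__":
    ids = assign_row_ids([["a", "b"], ["c"]], range_size=1000)
    rows = {"a": {0: 1.0}, "b": {0: 2.0, 2: 3.0}, "c": {2: 4.0}}
    print(ids)
    print(transpose(rows, ids))
```

The point of the sketch is only that the ids need to be unique, not dense: the transposed vectors hold one entry per nonzero regardless of how far apart the row ids are.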
