Hi,

The main optimization step is still left; I wanted to be sure I had ICF right before moving ahead, since the time complexity of the entire algorithm is dominated by the ICF decomposition.
I will update you guys soon once the final implementation is done. I am eager to try it on datasets as well :)
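As a rough illustration of the n x p factorization discussed in this thread, below is a minimal sequential sketch of pivoted incomplete Cholesky. It is not the parallel row-based code from the repository linked further down; the names IcfSketch, icf, frobeniusError and demoMatrix, the tol parameter and the demo matrix are all assumptions made for the example. The factor loop costs O(n * p^2), and the printed error shrinks as p approaches n.

```java
/** Sequential pivoted incomplete Cholesky sketch (hypothetical names, not the Hama code). */
class IcfSketch {

    /** Returns an n x p factor g such that g * g^T approximates the symmetric PSD matrix k. */
    static double[][] icf(double[][] k, int p, double tol) {
        int n = k.length;
        double[][] g = new double[n][p];
        double[] d = new double[n];            // diagonal of the residual k - g * g^T
        for (int i = 0; i < n; i++) d[i] = k[i][i];

        for (int j = 0; j < p; j++) {
            // Pivot: the row with the largest remaining residual diagonal.
            int piv = 0;
            for (int i = 1; i < n; i++) if (d[i] > d[piv]) piv = i;
            if (d[piv] <= tol) break;          // residual is numerically zero; later columns stay 0

            double root = Math.sqrt(d[piv]);
            for (int i = 0; i < n; i++) {
                // Residual entry (i, piv), scaled by the pivot's square root.
                double s = k[i][piv];
                for (int c = 0; c < j; c++) s -= g[i][c] * g[piv][c];
                g[i][j] = s / root;
            }
            for (int i = 0; i < n; i++) d[i] -= g[i][j] * g[i][j];
            d[piv] = 0.0;                      // guard against round-off re-selecting this pivot
        }
        return g;
    }

    /** Frobenius norm of k - g * g^T. */
    static double frobeniusError(double[][] k, double[][] g) {
        int n = k.length;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double approx = 0.0;
                for (int c = 0; c < g[i].length; c++) approx += g[i][c] * g[j][c];
                double diff = k[i][j] - approx;
                sum += diff * diff;
            }
        }
        return Math.sqrt(sum);
    }

    /** Small positive definite test matrix, k[i][j] = min(i, j) + 1 (an assumption for the demo). */
    static double[][] demoMatrix(int n) {
        double[][] k = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) k[i][j] = Math.min(i, j) + 1;
        return k;
    }

    public static void main(String[] args) {
        int n = 8;
        double[][] k = demoMatrix(n);
        for (int p = 1; p <= n; p++) {
            double err = frobeniusError(k, icf(k, p, 1e-12));
            System.out.println("p=" + p + " error=" + err);
        }
    }
}
```

Running main prints the approximation error for p = 1 .. n on the demo matrix, which is one way to check the "error drops as p approaches n" behaviour on small inputs before moving to a cluster.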

On Fri, May 18, 2012 at 1:44 AM, Thomas Jungblut <[email protected]> wrote:

> Thanks for the explanation!
> I have plenty of time today so I can clone your stuff and play around with
> it.
> Are there any steps left to use this as SVM? I wanted to try it out on the
> mushroom set.
>
> 2012/5/18 Aditya Sarawgi <[email protected]>
>
> > @Edward it's not urgent, I am ready when you are :)
> >
> > @Thomas Thanks for the feedback and help. Sure, you can use the code
> > for the jiras. But do remember it is slightly different from the actual
> > icf in the sense that here the dimension of the result matrix would be
> > n x p (where p is typically sqrt(n)) and the approximation error changes
> > with p. If p is close to n the error is low.
> >
> > It seems to work on smaller matrices pretty well. I tried it by varying
> > the values of p and as p approaches n, the decomposition has less error.
> > I have to do some more testing though.
> >
> > On Thu, May 17, 2012 at 11:06 AM, Thomas Jungblut <[email protected]> wrote:
> >
> > > instanceof is slow as hell, but if you have no other solution then this
> > > is okay.
> > >
> > > > 2) What is like the standard way to load matrices in different nodes
> > > > with a custom partitioning scheme
> > >
> > > It depends on your algorithm's needs, but I think you will need to
> > > implement your own partitioner, since HashPartitioning may not apply to
> > > this ICF.
> > > Generally you need to use the input system to read a part of a matrix
> > > into each peer.
> > >
> > > We also script a mapreduce job that will create random input for x GB
> > > to check scalability.
> > > Here is that for graphs: https://issues.apache.org/jira/browse/HAMA-558
> > > But I think this is easily extendable to matrices. There is an issue
> > > for that as well, I don't know how far Mikalai came with that.
> > >
> > > BTW your code looks good ;)
> > >
> > > Can we use this for https://issues.apache.org/jira/browse/HAMA-94 or
> > > https://issues.apache.org/jira/browse/HAMA-553 ? Would be a great
> > > addition if it works!
> > >
> > > Greetings from Germany,
> > > Thomas
> > >
> > > 2012/5/17 Aditya Sarawgi <[email protected]>
> > >
> > > > Thanks Thomas.
> > > > I am actually using tags for something else. So for now using
> > > > instanceof is just fine with me.
> > > >
> > > > I had a couple more questions regarding benchmarking stuff on hama.
> > > > I have a working implementation of parallel row-based icf that,
> > > > given an n x n matrix, returns a decomposed n x p matrix.
> > > >
> > > > https://github.com/truncs/hello-world/blob/master/src/main/java/edu/sunysb/cs/Icf.java
> > > >
> > > > Now I would like to test this on a big input and possibly in full
> > > > distributed mode, so I was wondering how people usually do this
> > > > sort of benchmarking.
> > > >
> > > > Specifically,
> > > > 1) Do they setup a cluster on AWS ?
> > > > 2) What is like the standard way to load matrices in different nodes
> > > > with a custom partitioning scheme
> > > > 3) Is there anything else that I should know
> > > >
> > > > On Thu, May 17, 2012 at 3:20 AM, Thomas Jungblut <[email protected]> wrote:
> > > >
> > > > > Hi Aditya,
> > > > >
> > > > > that's where the concept of Message Tagging comes into play. You
> > > > > have tags in each message which are hardcoded as Strings.
> > > > > But as Edward told you, you can use GenericWritable or ObjectWritable
> > > > > instead, so they will tag your messages with the classnames and give
> > > > > you the correct class.
> > > > >
> > > > > > Is there any way by which I can pop from the receive queue ?
> > > > >
> > > > > peer.getCurrentMessage() is popping from the received queue.
> > > > >
> > > > > 2012/5/17 Aditya Sarawgi <[email protected]>
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > But that's not the only problem. Consider the case where a
> > > > > > variable number of messages is being sent, so I would have to
> > > > > > maintain counts for each peer pointing to the last unread message.
> > > > > >
> > > > > > Is there any way by which I can pop from the receive queue ?
> > > > > >
> > > > > > On Wed, May 16, 2012 at 10:23 PM, Suraj Menon <[email protected]> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Please take a look at this snippet of code copied and modified
> > > > > > > from the Mapper class to implement your scenario. -
> > > > > > >
> > > > > > > https://github.com/ssmenon/hama/edit/master/hama-mapreduce/src/org/apache/hama/computemodel/mapreduce/Trials.java
> > > > > > > Between lines 233 to 245 I am able to send different types of
> > > > > > > messages.
> > > > > > > With type checks and generics you shouldn't be encountering a
> > > > > > > ClassCastException at the receiving end either. I am yet to test
> > > > > > > the next superstep, shall update you with sample code for the
> > > > > > > next superstep mimicking your scenario for receiving.
> > > > > > >
> > > > > > > For elegance, we have an experimental Superstep#compute
> > > > > > > API (org.apache.hama.bsp.Superstep). I have encountered an issue
> > > > > > > in the job submission framework with this method in distributed
> > > > > > > mode; the fix for this would be pushed to trunk in the next few
> > > > > > > hours. You can still run it using LocalBSPRunner for now.
> > > > > > >
> > > > > > > -Suraj
> > > > > > >
> > > > > > > On Wed, May 16, 2012 at 9:18 PM, Aditya Sarawgi
> > > > > > > <[email protected]> wrote:
> > > > > > > > Hi Edward,
> > > > > > > >
> > > > > > > > Yes that is what I did
> > > > > > > > I wrote an ArrayMessage class (doesn't use generics for now
> > > > > > > > but can be converted easily)
> > > > > > > >
> > > > > > > > https://github.com/truncs/hello-world/blob/master/src/main/java/edu/sunysb/cs/ArrayMessage.java
> > > > > > > >
> > > > > > > > But the problem is that I am sending an IntegerMessage before,
> > > > > > > > and after reading the IntegerMessage I am sending
> > > > > > > > an ArrayMessage but the previous IntegerMessage is still there.
> > > > > > > >
> > > > > > > > On Wed, May 16, 2012 at 8:34 PM, Edward J. Yoon
> > > > > > > > <[email protected]> wrote:
> > > > > > > >
> > > > > > > >> Hi,
> > > > > > > >>
> > > > > > > >> To send or receive multiple Message types, I think you can use
> > > > > > > >> GenericWritable. You can also implement your own GenericMessage
> > > > > > > >> and contribute it to our project!
> > > > > > > >>
> > > > > > > >> Hope this helps you.
> > > > > > > >>
> > > > > > > >> On Thu, May 17, 2012 at 7:48 AM, Aditya Sarawgi
> > > > > > > >> <[email protected]> wrote:
> > > > > > > >> > Hi Guys,
> > > > > > > >> >
> > > > > > > >> > I am wondering how the receive queues in hama work. Consider
> > > > > > > >> > the case where I want to send a different type of BSPMessage
> > > > > > > >> > in 2 consecutive supersteps.
> > > > > > > >> > In the first superstep I am sending IntMessage and in the
> > > > > > > >> > next one I am sending an ArrayMessage (custom message class).
> > > > > > > >> >
> > > > > > > >> > Now in the second superstep when I do a
> > > > > > > >> > while ((arrayMessage = (ArrayMessage) peer.getCurrentMessage()) != null) {
> > > > > > > >> >
> > > > > > > >> > it is throwing a java.lang.ClassCastException, which is
> > > > > > > >> > obvious since it's trying to cast IntMessage to ArrayMessage.
> > > > > > > >> > I thought the message is dropped from the queue after it is
> > > > > > > >> > read, is this not the case ?
> > > > > > > >> > And if it is not, how can this be handled elegantly ?
> > > > > > > >> >
> > > > > > > >> > --
> > > > > > > >> > Cheers,
> > > > > > > >> > Aditya Sarawgi
> > > > > > > >>
> > > > > > > >> --
> > > > > > > >> Best Regards, Edward J. Yoon
> > > > > > > >> @eddieyoon
> > > > > > > >
> > > > > > > > --
> > > > > > > > Cheers,
> > > > > > > > Aditya Sarawgi
> > > > > >
> > > > > > --
> > > > > > Cheers,
> > > > > > Aditya Sarawgi
> > > > >
> > > > > --
> > > > > Thomas Jungblut
> > > > > Berlin <[email protected]>
> > > >
> > > > --
> > > > Cheers,
> > > > Aditya Sarawgi
> > >
> > > --
> > > Thomas Jungblut
> > > Berlin <[email protected]>
> >
> > --
> > Cheers,
> > Aditya Sarawgi
>
> --
> Thomas Jungblut
> Berlin <[email protected]>

--
Cheers,
Aditya Sarawgi
