Hey Walker, cool thing, can you share a link to your cNeural library? Martin is working on GPU support for Hama in relation to Hama pipes (https://issues.apache.org/jira/browse/HAMA-619). He wants to go the native way, but personally I have had pretty good experiences with JCUDA; I used it for my neural net implementation for image recognition tasks with many input neurons. However, that is just one part of making the matrix multiplications faster, which usually take most of the time.
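[Editor's note: the hot loop being discussed is the dense matrix-vector product of forward propagation. A minimal plain-Java sketch of that bottleneck, for illustration only; the sigmoid activation and weight layout are assumptions, not details of cNeural or the JCUDA kernels mentioned above:]

```java
// Illustrative sketch: one forward-propagation step for a fully
// connected layer. Not from cNeural; the activation and layout are
// assumptions for the example.
public class ForwardPass {
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // out[j] = sigmoid(sum_i weights[j][i] * in[i]) -- the dense
    // matrix-vector product that dominates training time and that a
    // GPU kernel (e.g. via JCUDA) would accelerate.
    static double[] forward(double[][] weights, double[] in) {
        double[] out = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            double sum = 0.0;
            for (int i = 0; i < in.length; i++) {
                sum += weights[j][i] * in[i];
            }
            out[j] = sigmoid(sum);  // sigmoid(0) = 0.5
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] w = {{1.0, -1.0}, {0.5, 0.5}};
        double[] out = forward(w, new double[]{1.0, 1.0});
        System.out.println(out[0] + " " + out[1]);
    }
}
```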
The HBase storage is really interesting, thanks for sharing!

2012/9/20 顾荣 <[email protected]>

> Hi, Thomas.
>
> I read your blog and your GitHub notes on training NNs on Hama a few
> days ago. I agree with you on this topic, based on my experience
> implementing NNs in a distributed way. That happened before I knew
> about the Hama project, so I implemented a customized distributed
> system for training NNs with large-scale training data myself; the
> system is called cNeural. It basically follows a master/slave
> architecture. I adopted Hadoop RPC for communication and HBase for
> storing the large-scale training dataset, and I used a batch-mode BP
> training algorithm.
>
> BTW, HBase is very suitable for storing training datasets for machine
> learning. No matter how large a training dataset is, an HTable can
> easily store it across many regionservers. Each training sample can be
> stored as a record in an HTable, even if it is sparse-coded.
> Furthermore, HBase provides random access to your training samples. In
> my experience, it is much better to store structured data in HBase
> than directly in HDFS.
>
> Back to this topic: as you mentioned, I can read training data
> directly from HDFS through the HDFS API during the setup stage of the
> vertex. I also considered this, and have known how to use the HDFS API
> for a long time, but thanks for the hint anyway :)
> However, I am afraid it may cost quite a lot of time, because for a
> large-scale NN with thousands of neurons, each neuron vertex reading
> the same training sample almost simultaneously would cause a lot of
> network traffic and put too much stress on HDFS. What's more, it seems
> unnecessary. I planned to select a master vertex responsible for
> reading samples from HDFS and to initialize each input neuron by
> sending it the feature value. However, even if I can do this, there
> are many more tough problems to solve, such as partitioning.
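[Editor's note: the sparse row-per-sample layout described above can be illustrated in plain Java. The sketch mimics the row -> column -> value shape of an HTable; it does not use the real HBase client API, and all names are hypothetical:]

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only (plain Java, not the HBase client API):
// one training sample per row key, one column per nonzero feature,
// matching the sparse-coded layout described in the mail above.
public class SparseSampleTable {
    // rowKey -> (featureIndex -> value); a stand-in for HTable's
    // row -> column-qualifier -> cell structure.
    private final Map<String, Map<Integer, Double>> rows = new TreeMap<>();

    void put(String rowKey, int feature, double value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(feature, value);
    }

    // Random access to a single sample, analogous to an HBase Get.
    Map<Integer, Double> get(String rowKey) {
        return rows.get(rowKey);
    }

    public static void main(String[] args) {
        SparseSampleTable t = new SparseSampleTable();
        t.put("sample-00001", 3, 0.7);   // only nonzero features stored
        t.put("sample-00001", 42, 1.0);
        System.out.println(t.get("sample-00001")); // {3=0.7, 42=1.0}
    }
}
```

The point of the layout is that samples scale out with regionservers while each sample stays individually addressable.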
> As you said, controlling this training workflow in a distributed way
> is too complex. And with so much network communication and distributed
> synchronization, it will be much slower than the sequential program
> executed on a single machine. In a word, this kind of distribution
> will probably lead to no improvement, only slower speed and higher
> complexity. As for the high dimensionalities you mention, I suggest
> using a GPU to handle this; distribution may not be a good solution in
> this case. Of course, we can combine GPUs with Hama, and I believe
> that will be necessary in the near future.
>
> As I mentioned at the beginning of this mail, I implemented cNeural,
> and I also compared cNeural with Hadoop for solving this problem. The
> experimental results can be found in the attachment of this mail. In
> general, cNeural adopts a parallel strategy like the BSP model, so I
> am about to reimplement cNeural on Hama BSP. I learned Hama Graph this
> week, just came across the thought of implementing NNs on Hama Graph,
> considered this case, and asked this question. I agree with your
> analysis.
>
> Regards,
> Walker.
>
> 2012/9/20 Thomas Jungblut <[email protected]>
>
>> Hi,
>>
>> Nice idea, but I am honestly unsure whether the graph module really
>> fits your needs. In backprop you need to set the input on the
>> different neurons in your input layer and forward-propagate it until
>> you reach the output layer. Calculating the error from this single
>> step in your architecture would consume many supersteps. In my
>> opinion this is totally inefficient, but let's set that thought
>> aside.
>>
>> Assume you have an n-by-m matrix which contains your whole training
>> set, and the m-th column holds the outcome for the preceding
>> features. An input vertex should be able to read a row of the
>> corresponding column vector from the training set, and the output
>> neurons need to do the same.
>> Good news: you can do this by reading a file within the setup
>> function of a vertex, or by reading it line by line when compute is
>> called. You can access filesystems with the Hadoop DFS API pretty
>> easily. Just type it into your favourite search engine; it is simply
>> called FileSystem, and you can get it via
>> FileSystem.get(Configuration conf).
>>
>> Now here is my experience with raw BSP and neural networks, if you
>> weigh it against the graph module:
>> - partition the neurons horizontally (through the layers), not by
>>   layer
>> - weights must be averaged across multiple tasks
>>
>> I concluded for myself that it is far better to implement a function
>> optimizer with raw BSP to train the weights (a simple stochastic
>> gradient descent totally works out for almost every normal use case
>> if your network has a convex cost function). Of course this doesn't
>> work out well for higher dimensionalities, but more data usually
>> wins, even with simpler models. In the end you can always boost it
>> anyway.
>>
>> I will of course support you on this if you like. I am fairly
>> certain that your approach can work, but it will be slow as hell.
>> Just my usual two cents on various topics ;)
>>
>> 2012/9/20 顾荣 <[email protected]>
>>
>> > Hi, guys.
>> >
>> > As you are calling for application programs on Hama in the *Future
>> > Plans* section of the Hama programming wiki (
>> > https://issues.apache.org/jira/secure/attachment/12528218/ApacheHamaBSPProgrammingmodel.pdf
>> > ), and I am very interested in machine learning, I have a plan to
>> > implement neural networks (e.g. a multilayer perceptron with BP)
>> > on Hama. Hama seems to be a nice tool for training large-scale
>> > neural networks. Especially for those with a large-scale structure
>> > (many hidden layers and many neurons), I find Hama Graph provides
>> > a good solution.
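[Editor's note: the function-optimizer recommendation in the quoted mail, plain SGD plus per-superstep weight averaging, can be sketched in plain Java. The least-squares cost and all names are illustrative assumptions, and the actual BSP communication is omitted:]

```java
// Sketch of the "function optimizer" idea: stochastic gradient
// descent on a convex cost, here least-squares for a single linear
// neuron. In a raw BSP job each task would run the update loop on
// its data partition and average weights once per superstep; the
// messaging itself is left out. All names are illustrative.
public class SgdSketch {
    // One SGD update: w <- w - eta * (w.x - y) * x
    // (the gradient of the squared error for a linear model).
    static double[] sgdStep(double[] w, double[] x, double y, double eta) {
        double pred = 0.0;
        for (int i = 0; i < w.length; i++) pred += w[i] * x[i];
        double err = pred - y;
        double[] next = w.clone();
        for (int i = 0; i < w.length; i++) next[i] -= eta * err * x[i];
        return next;
    }

    // Per-superstep weight averaging across tasks, as recommended
    // in the mail above ("weights must be averaged").
    static double[] averageWeights(double[][] perTask) {
        double[] avg = new double[perTask[0].length];
        for (double[] w : perTask)
            for (int i = 0; i < w.length; i++) avg[i] += w[i] / perTask.length;
        return avg;
    }

    public static void main(String[] args) {
        // Learn y = 2x from three samples; w[0] converges toward 2.
        double[] w = {0.0};
        for (int epoch = 0; epoch < 200; epoch++)
            for (double x = 1.0; x <= 3.0; x += 1.0)
                w = sgdStep(w, new double[]{x}, 2.0 * x, 0.05);
        System.out.println(w[0]);
    }
}
```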
>> > We can regard each neuron in a NN (neural network) as a vertex in
>> > Hama Graph, and the links between neurons as edges in the graph.
>> > Then the training process can be regarded as updating the weights
>> > of the edges among the vertices. However, I have encountered a
>> > problem in the current Hama Graph implementation.
>> >
>> > Let me explain. As you may know, during the training process of
>> > many machine learning algorithms, we need to input many training
>> > samples into the model one by one. Usually, more training samples
>> > lead to more precise models. However, as far as I know, the only
>> > input file interface provided by Hama Graph is the input for the
>> > graph structure. Sadly, it is hard to read and distribute the
>> > training samples at runtime, as users can only implement their
>> > computing logic by overriding a few key functions such as
>> > compute() in the Vertex class. So, does Hama Graph provide any
>> > flexible file-reading interface for users at runtime?
>> >
>> > Thanks in advance.
>> >
>> > Walker.
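[Editor's note: for reference, the neuron-as-vertex idea in the question above amounts to a per-superstep compute step like this plain-Java sketch. It imitates, but is not, the Hama Graph Vertex.compute() API; names are hypothetical:]

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: each superstep, a "neuron vertex" sums its
// incoming weighted messages and emits its activation to the next
// layer. This mimics the shape of a Hama Graph compute() method
// without using the actual API.
public class NeuronVertexSketch {
    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    // One superstep for a single hidden neuron: each message is a
    // pair {activation from previous layer, edge weight}.
    static double compute(List<double[]> messages) {
        double sum = 0.0;
        for (double[] m : messages) sum += m[0] * m[1];
        return sigmoid(sum);
    }

    public static void main(String[] args) {
        List<double[]> msgs = Arrays.asList(
            new double[]{1.0, 0.5},    // activation 1.0 over weight 0.5
            new double[]{1.0, -0.5});  // cancels out -> net input 0
        System.out.println(compute(msgs)); // 0.5
    }
}
```

Because each layer's activations depend on the previous layer's messages, one full forward pass costs one superstep per layer, which is the inefficiency Thomas points out above.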
