Hey Walker,

cool thing, can you share a link to your cNeural library?
Martin is working on GPU support for Hama in relation to Hama Pipes (
https://issues.apache.org/jira/browse/HAMA-619).
He wants to go the native way, but personally I have made pretty good
experiences with JCUDA; I used it in my neural net implementation for
image recognition tasks with a large number of input neurons.
However, that only addresses making the matrix multiplications faster,
which is usually where most of the time goes anyway.
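To give an idea, the JCublas part of a single layer's forward pass
(y = W * x) looked roughly like this; just a sketch, the sizes and
names are illustrative:

    import jcuda.Pointer;
    import jcuda.Sizeof;
    import jcuda.jcublas.JCublas;

    // y = W * x for one layer; W is m x n in column-major order.
    static float[] forward(float[] W, float[] x, int m, int n) {
      float[] y = new float[m];
      JCublas.cublasInit();
      Pointer dW = new Pointer(), dX = new Pointer(), dY = new Pointer();
      JCublas.cublasAlloc(m * n, Sizeof.FLOAT, dW);
      JCublas.cublasAlloc(n, Sizeof.FLOAT, dX);
      JCublas.cublasAlloc(m, Sizeof.FLOAT, dY);
      JCublas.cublasSetVector(m * n, Sizeof.FLOAT, Pointer.to(W), 1, dW, 1);
      JCublas.cublasSetVector(n, Sizeof.FLOAT, Pointer.to(x), 1, dX, 1);
      // single GPU matrix-vector multiplication: y = 1 * W * x + 0 * y
      JCublas.cublasSgemv('n', m, n, 1.0f, dW, m, dX, 1, 0.0f, dY, 1);
      JCublas.cublasGetVector(m, Sizeof.FLOAT, dY, 1, Pointer.to(y), 1);
      JCublas.cublasFree(dW);
      JCublas.cublasFree(dX);
      JCublas.cublasFree(dY);
      JCublas.cublasShutdown();
      return y;
    }

(In a real net you would init once and keep the weights on the device,
of course.)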

The HBase storage is really interesting, thanks for sharing!

2012/9/20 顾荣 <[email protected]>

> Hi, Thomas.
>
> I read your blog and GitHub page about training NNs on Hama several
> days ago. Based on my own experience implementing NNs in a distributed
> way, I agree with you on this topic.
> That was before I knew about the Hama project, so I implemented a
> customized distributed system for training NNs on large-scale training
> data myself; the system is called cNeural.
> It basically follows a master/slave architecture. I adopted Hadoop RPC
> for communication and HBase for storing the large-scale training
> dataset, and I used a batch-mode BP training algorithm.
> BTW, HBase is very well suited to storing training data sets for
> machine learning. No matter how large a training data set is, an
> HTable can easily store it across many region servers.
> Each training sample can be stored as one record in an HTable, even if
> it is sparsely coded. Furthermore, HBase provides random access to
> your training samples. In my experience, it is much better to store
> such structured data in HBase than directly in HDFS.
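> For example, writing and randomly reading one sample per row looks
> roughly like this in my code (the table and column names here are just
> illustrative, and serialize() stands for whatever encoding you use):
>
>   import org.apache.hadoop.conf.Configuration;
>   import org.apache.hadoop.hbase.HBaseConfiguration;
>   import org.apache.hadoop.hbase.client.Get;
>   import org.apache.hadoop.hbase.client.HTable;
>   import org.apache.hadoop.hbase.client.Put;
>   import org.apache.hadoop.hbase.client.Result;
>   import org.apache.hadoop.hbase.util.Bytes;
>
>   Configuration conf = HBaseConfiguration.create();
>   HTable table = new HTable(conf, "training_samples");
>   String sampleId = "sample-00001";
>   double label = 1.0;
>
>   // one row per training sample, features and label in one family
>   Put put = new Put(Bytes.toBytes(sampleId));
>   put.add(Bytes.toBytes("d"), Bytes.toBytes("features"),
>       serialize(features)); // serialize() is a made-up helper
>   put.add(Bytes.toBytes("d"), Bytes.toBytes("label"),
>       Bytes.toBytes(label));
>   table.put(put);
>
>   // random access to any sample by its row key
>   Result r = table.get(new Get(Bytes.toBytes(sampleId)));
>   byte[] encoded = r.getValue(Bytes.toBytes("d"),
>       Bytes.toBytes("features"));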
>
> Back to this topic: as you mentioned, I can read training data
> directly from HDFS through the HDFS API during the setup stage of a
> vertex. I considered this too, and I have known how to use the HDFS
> API for a long time, but thanks for the hint anyway :)
> However, I am afraid it may cost quite a lot of time: in a large-scale
> NN with thousands of neurons, every neuron vertex reading the same
> training sample almost simultaneously would cause a lot of network
> traffic and put too much stress on HDFS. What's more, it seems
> unnecessary. I planned to select a master vertex responsible for
> reading the samples from HDFS and to initialize each input neuron by
> sending it its feature value; a rough sketch follows below.
> However, even if I can do this, there are many more tough problems to
> solve, such as partitioning. As you said, controlling this training
> workflow in a distributed way is too complex, and with so much network
> communication and distributed synchronization it will be much slower
> than a sequential program executed on a single machine. In a word,
> such a forced distribution will probably lead to no improvement, only
> slower speed and higher complexity. As for the high dimensionalities
> you mentioned, I suggest using a GPU to handle this; distribution may
> not be a good solution in that case. Of course, we can combine GPU
> with Hama, and I believe that will be necessary in the near future.
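> The master vertex idea would look roughly like this (only a sketch; I
> assume a Vertex API along the lines of the current Hama Graph one, and
> readNextSample() is a hypothetical helper that reads from HDFS):
>
>   import java.io.IOException;
>   import java.util.Iterator;
>   import org.apache.hadoop.io.DoubleWritable;
>   import org.apache.hadoop.io.Text;
>   import org.apache.hama.graph.Edge;
>   import org.apache.hama.graph.Vertex;
>
>   public class MasterVertex
>       extends Vertex<Text, DoubleWritable, DoubleWritable> {
>
>     @Override
>     public void compute(Iterator<DoubleWritable> messages)
>         throws IOException {
>       // read the next training sample from HDFS once, centrally
>       double[] sample = readNextSample(); // hypothetical HDFS reader
>       // push one feature value along each edge to its input neuron
>       int i = 0;
>       for (Edge<Text, DoubleWritable> e : getEdges()) {
>         sendMessage(e, new DoubleWritable(sample[i++]));
>       }
>     }
>   }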
>
> As I mentioned at the beginning of this mail, I implemented cNeural,
> and I also compared cNeural with Hadoop for solving this problem.
> The experiment results can be found in the attachment of this mail. In
> general, cNeural adopts a parallel strategy like the BSP model, so I
> am about to reimplement cNeural on Hama BSP. I learned Hama Graph this
> week, and the thought of implementing a NN on Hama Graph just came to
> me, so I considered this case and asked this question. I agree with
> your analysis.
>
> Regards,
> Walker.
>
>
>
> 2012/9/20 Thomas Jungblut <[email protected]>
>
>> Hi,
>>
>> nice idea, but I'm honestly unsure whether the graph module really
>> fits your needs.
>> In backprop you need to set the input on the different neurons of
>> your input layer and forward-propagate it until you reach the output
>> layer. Calculating the error from this single step in your
>> architecture would consume many supersteps. This is totally
>> inefficient in my opinion, but let's just put that thought aside.
>>
>> Assume you have an n by m matrix which contains your whole training
>> set, and the m-th column holds the outcome of the preceding features.
>> An input vertex should be able to read a row of the corresponding
>> column vector from the training set, and the output neurons need to
>> do the same.
>> Good news: you can do this by reading a file within the setup
>> function of a vertex, or by reading it line by line whenever compute
>> is called. You can access filesystems with the Hadoop DFS API pretty
>> easily. Just type it into your favourite search engine; the class is
>> simply called FileSystem, and you can get an instance via
>> FileSystem.get(Configuration conf).
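>> A minimal sketch (the path and the parsing are up to you):
>>
>>   import java.io.BufferedReader;
>>   import java.io.InputStreamReader;
>>   import org.apache.hadoop.conf.Configuration;
>>   import org.apache.hadoop.fs.FileSystem;
>>   import org.apache.hadoop.fs.Path;
>>
>>   // e.g. inside the setup function of a vertex
>>   FileSystem fs = FileSystem.get(new Configuration());
>>   BufferedReader reader = new BufferedReader(
>>       new InputStreamReader(fs.open(new Path("/data/trainingset.txt"))));
>>   String line;
>>   while ((line = reader.readLine()) != null) {
>>     // parse one row: n-1 feature columns, outcome in the last column
>>   }
>>   reader.close();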
>>
>> Now here is my experience with raw BSP and neural networks, in case
>> you weigh it against the graph module:
>> - partition the neurons horizontally (through the layers), not by
>> layer
>> - weights must be averaged across the multiple tasks
>>
>> I came to the conclusion for myself that it is far better to
>> implement a function optimizer with raw BSP to train the weights (a
>> simple stochastic gradient descent totally works out for almost every
>> normal use case if your network has a convex cost function); see the
>> sketch below.
>> Of course this doesn't work out well for higher dimensionalities, but
>> more data usually wins, even with simpler models. In the end you can
>> always boost it anyway.
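>> The core of such an optimizer fits in a few lines per superstep (a
>> sketch only; localSamples(), gradient() and VectorWritable are
>> made-up helpers, the peer calls are Hama's BSPPeer API as I remember
>> it):
>>
>>   // inside bsp(BSPPeer<...> peer): one pass over the local shard
>>   for (double[] sample : localSamples()) {        // made-up helper
>>     double[] grad = gradient(weights, sample);    // made-up helper
>>     for (int i = 0; i < weights.length; i++) {
>>       weights[i] -= learningRate * grad[i];       // plain SGD step
>>     }
>>   }
>>   // average the weights across all tasks
>>   for (String other : peer.getAllPeerNames()) {
>>     peer.send(other, new VectorWritable(weights)); // made-up Writable
>>   }
>>   peer.sync();
>>   VectorWritable msg;
>>   int n = 0;
>>   double[] sum = new double[weights.length];
>>   while ((msg = peer.getCurrentMessage()) != null) {
>>     double[] v = msg.get();
>>     for (int i = 0; i < sum.length; i++) sum[i] += v[i];
>>     n++;
>>   }
>>   for (int i = 0; i < weights.length; i++) weights[i] = sum[i] / n;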
>>
>> I will of course support you on this if you like; I'm fairly certain
>> that your way can work, but it will be slow as hell.
>> Just my usual two cents on various topics ;)
>>
>> 2012/9/20 顾荣 <[email protected]>
>>
>> > Hi, guys.
>> >
>> > As you are calling for application programs on Hama in the *Future
>> > Plans* of the Hama programming wiki here (
>> >
>> https://issues.apache.org/jira/secure/attachment/12528218/ApacheHamaBSPProgrammingmodel.pdf
>> > ),
>> > and I am very interested in machine learning, I plan to implement
>> > neural networks (e.g. a multilayer perceptron with BP) on Hama.
>> > Hama seems to be a nice tool for training large-scale neural
>> > networks. Especially for those with a large structure (many hidden
>> > layers and many neurons), Hama Graph appears to provide a good
>> > solution. We can regard each neuron in a NN (neural network) as a
>> > vertex in Hama Graph, and the links between neurons as edges in the
>> > graph. Then the training process can be regarded as updating the
>> > weights of the edges among vertices. However, I encountered a
>> > problem with the current Hama Graph implementation.
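>> > In code, I imagine each neuron roughly like this (just a sketch,
>> > assuming a Vertex API along the lines of the current Hama Graph
>> > one; method names may differ):
>> >
>> >   public class NeuronVertex
>> >       extends Vertex<Text, DoubleWritable, DoubleWritable> {
>> >
>> >     @Override
>> >     public void compute(Iterator<DoubleWritable> messages)
>> >         throws IOException {
>> >       // sum the weighted activations from the previous layer
>> >       double net = 0d;
>> >       while (messages.hasNext()) {
>> >         net += messages.next().get();
>> >       }
>> >       double activation = 1d / (1d + Math.exp(-net)); // sigmoid
>> >       // forward my activation, scaled by each outgoing edge weight
>> >       for (Edge<Text, DoubleWritable> e : getEdges()) {
>> >         sendMessage(e,
>> >             new DoubleWritable(activation * e.getValue().get()));
>> >       }
>> >     }
>> >   }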
>> >
>> > Let me explain this to you. As you may know, during the training
>> > process of many machine learning algorithms, we need to feed many
>> > training samples into the model one by one. Usually, more training
>> > samples lead to more precise models. However, as far as I know, the
>> > only input file interface provided by Hama Graph is the input for
>> > the graph structure. Sadly, it's hard to read and distribute the
>> > training samples at runtime, as users can only implement their
>> > computing logic by overriding a few key functions such as compute()
>> > in the Vertex class. So, does Hama Graph provide any flexible
>> > file-reading interface for users at runtime?
>> >
>> > Thanks in advance.
>> >
>> > Walker.
>> >
>>
>
>
