Thanks for the code example, Joonas. There are GPU / OpenCL coders on this list who can look at it and contribute ideas. We need enough understanding of GPU acceleration to hold a discussion and build a proof of concept: comparing shader / bitmap convolution operations against structured-kernel (looping) OpenCL operations, and deciding when and where to use existing library code. This requires understanding CLA theory, the CLA core in NuPIC, and GPU architecture (beyond my skill set at the moment). It may also impact the data structures we use in the core CLA if we change them to GPU-friendly versions.
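To make the convolution-vs-looping comparison concrete, here is a plain C++ sketch (no OpenCL; the function name, radius, and top-k rule are my own illustrative assumptions, not NuPIC code) of SP-style neighbourhood inhibition written so that each column's result depends only on a fixed window of overlap scores. That access pattern is exactly what maps onto one GPU work-item per column, the same shape as a 1-D convolution:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical sketch: a column stays active only if fewer than topK
// neighbours have a higher overlap score. Each outer-loop iteration reads
// a fixed-size window and writes one output element, so on a GPU it would
// become one work-item per column (a convolution-like access pattern).
std::vector<int> inhibit(const std::vector<float>& overlap,
                         int radius, int topK) {
    const int n = static_cast<int>(overlap.size());
    std::vector<int> active(n, 0);
    for (int col = 0; col < n; ++col) {          // one GPU work-item each
        int betterNeighbours = 0;
        const int lo = std::max(0, col - radius);
        const int hi = std::min(n - 1, col + radius);
        for (int j = lo; j <= hi; ++j)           // fixed neighbourhood read
            if (overlap[j] > overlap[col])
                ++betterNeighbours;
        active[col] = (betterNeighbours < topK) ? 1 : 0;
    }
    return active;
}
```

Because every work-item only reads a bounded neighbourhood, the loop body could be translated almost line-for-line into an OpenCL kernel, with the window served from local memory the way image-convolution kernels tile their inputs.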
We can't rule out running many small regions on individual processor cores instead of offloading the processing of one large region. This might work if it were easy to wire up small regions in networks that scale out dynamically, something we are hoping to achieve after the NuPIC core extraction. If so, porting region code would be a relatively easy 1:1 operation. This would allow us to run NuPIC on big iron (SPARC / Solaris) or on distributed low-power ARM arrays like Parallella (https://www.kickstarter.com/projects/adapteva/parallella-a-supercomputer-for-everyone).

-Doug

On Thu, May 8, 2014 at 2:51 PM, Joonas Haapala <[email protected]> wrote:
> Hi,
>
> I did some work six months ago porting the whitepaper to OpenCL. The
> WIP code is available here: https://github.com/Jontte/CortiCL
>
> CLA doesn't port to GPU as well as deep neural networks do, because some
> parts of the algorithm need memory access to the neighbouring columns
> (neighbourhood inhibition in the SP) and must maintain variable-sized
> memory buffers (the list of proposed changes to a segment), but all of
> this can be worked around.
>
> CPU<->GPU memory bandwidth shouldn't be a problem, since the state of the
> whole network doesn't have to be accessible from the CPU side during
> operation unless one wants debug diagnostics. In my application I only
> move the network input/output bit patterns across the boundary each step:
> https://github.com/Jontte/CortiCL/blob/master/src/cltemporal.cpp#L83
>
> Joonas Haapala
>
> 2014-05-05 8:28 GMT+03:00 Sergey Bryukov <[email protected]>:
>
>> Hi, is there any progress for CLA on GPU?
>>
>> There is a mention of a GPU database. Don't know if it would be useful
>> for CLA.
>>
>> "Known as MapD <http://geops.csail.mit.edu/docs/mapd_overview.pdf>, or
>> massively parallel database, the new technology achieves big speed gains
>> by storing the data in the onboard memory of graphics processing units
>> (GPUs) instead of in central processing units (CPUs), as is conventional."
>>
>> http://www.technologyreview.com/news/520021/graphics-chips-help-process-big-data-sets-in-milliseconds/
_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
