RE: Using CUDA within Spark / boosting linear algebra

2015-02-05 Thread Ulanov, Alexander
Hi Evan, Thank you for the suggestion! BIDMat seems to have terrific speed. Do you know what makes it faster than netlib-java? The same group has the BIDMach library, which implements machine learning. For some examples they use the Caffe convolutional neural network library owned by another group in

Re: Using CUDA within Spark / boosting linear algebra

2015-02-05 Thread Evan R. Sparks
I'd be surprised if BIDMat+OpenBLAS was significantly faster than netlib-java+OpenBLAS, but if it is much faster it's probably due to data layout and fewer levels of indirection - it's definitely a worthwhile experiment to run. The main speedups I've seen from using it come from highly optimized

PSA: Maven supports parallel builds

2015-02-05 Thread Nicholas Chammas
Y’all may already know this, but I haven’t seen it mentioned anywhere in our docs on here and it’s a pretty easy win. Maven supports parallel builds https://cwiki.apache.org/confluence/display/MAVEN/Parallel+builds+in+Maven+3 with the -T command line option. For example: ./build/mvn -T 1C
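As a rough illustration of what the -T value means: as far as I understand the Maven option, a plain integer is taken as a fixed thread count, while a number followed by C is multiplied by the number of available CPU cores (so 1C is one thread per core). The helper below is a hypothetical sketch of that arithmetic, not Maven's own code; the name resolve_thread_count and its rounding behavior are assumptions made for this example.

```python
import os


def resolve_thread_count(spec, cores=None):
    """Hypothetical helper: turn a Maven-style -T value into a thread count.

    "4"    -> 4 threads
    "1C"   -> 1 thread per available core
    "1.5C" -> 1.5 threads per available core, truncated to an integer
    """
    if cores is None:
        # Assumption: fall back to 1 if the core count cannot be determined.
        cores = os.cpu_count() or 1
    spec = spec.strip()
    if spec.upper().endswith("C"):
        multiplier = float(spec[:-1])
        # Never return fewer than one thread.
        return max(1, int(multiplier * cores))
    return int(spec)


if __name__ == "__main__":
    for value in ("1C", "1.5C", "4"):
        print(value, "->", resolve_thread_count(value))
```

On an 8-core machine, for instance, this sketch would map 1C to 8 threads and leave 4 as 4 threads regardless of core count.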

Re: PSA: Maven supports parallel builds

2015-02-05 Thread Patrick Wendell
I've done this in the past, but back when I wasn't using Zinc it didn't make a big difference. It's worth doing this in our Jenkins environment though. - Patrick On Thu, Feb 5, 2015 at 4:52 PM, Dirceu Semighini Filho dirceu.semigh...@gmail.com wrote: Thanks Nicholas, I didn't know this.

Re: spark 1.3 sbt build seems to be broken

2015-02-05 Thread shane knapp
here's the hash of the breaking commit: Started on Feb 5, 2015 12:01:01 PM Using strategy: Default [poll] Last Built Revision: Revision de112a2096a2b84ce2cac112f12b50b5068d6c35 (refs/remotes/origin/branch-1.3) git ls-remote -h https://github.com/apache/spark.git branch-1.3 # timeout=10 [poll]

Re: PSA: Maven supports parallel builds

2015-02-05 Thread Dirceu Semighini Filho
Thanks Nicholas, I didn't know this. 2015-02-05 22:16 GMT-02:00 Nicholas Chammas nicholas.cham...@gmail.com: Y’all may already know this, but I haven’t seen it mentioned anywhere in our docs on here and it’s a pretty easy win. Maven supports parallel builds

RE: Using CUDA within Spark / boosting linear algebra

2015-02-05 Thread Ulanov, Alexander
Thank you for the explanation! I’ve watched the BIDMach presentation by John Canny and I am really inspired by his talk and comparisons with Spark MLlib. I am very interested to find out what will be better within Spark: BIDMat or netlib-java with CPU or GPU natives. Could you suggest a fair way to

spark 1.3 sbt build seems to be broken

2015-02-05 Thread shane knapp
https://amplab.cs.berkeley.edu/jenkins/job/Spark-1.3-SBT/ we're seeing java OOMs and heap space errors: https://amplab.cs.berkeley.edu/jenkins/job/Spark-1.3-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop1.0,label=centos/19/console

Re: Using CUDA within Spark / boosting linear algebra

2015-02-05 Thread Joseph Bradley
Hi Alexander, Using GPUs with Spark would be very exciting. Small comment: Concerning your question earlier about keeping data stored on the GPU rather than having to move it between main memory and GPU memory on each iteration, I would guess this would be critical to getting good performance.
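To make the point about data movement concrete, here is a back-of-the-envelope cost model. It is only a sketch under simplifying assumptions: the function name and the per-iteration timings are hypothetical placeholders rather than measurements, and it ignores overlap between transfer and compute.

```python
def total_seconds(iterations, transfer_s, compute_s, keep_on_gpu):
    """Hypothetical cost model for an iterative algorithm on a GPU.

    transfer_s: assumed time to copy the data from main memory to GPU memory
    compute_s:  assumed GPU compute time per iteration
    keep_on_gpu: if True, the data is copied once and stays resident;
                 if False, it is copied again on every iteration.
    """
    transfers = 1 if keep_on_gpu else iterations
    return transfers * transfer_s + iterations * compute_s


if __name__ == "__main__":
    # Placeholder numbers chosen only to show the shape of the trade-off.
    print(total_seconds(100, 0.5, 0.25, keep_on_gpu=False))
    print(total_seconds(100, 0.5, 0.25, keep_on_gpu=True))
```

With these placeholder values the transfer term dominates when the data is re-copied every iteration, which is the effect described above.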

Re: When will Spark Streaming supports Kafka-simple consumer API?

2015-02-05 Thread Xuelin Cao.2015
Hi Tathagata, Thanks for the information. I'm trying to build the 1.3 snapshot and will try again. There are 2 reasons why we use the Kafka SimpleConsumer API: 1. Previously, in our company, all of the real-time processing systems were built on Apache Storm. So, the kafka environment

Re: Using CUDA within Spark / boosting linear algebra

2015-02-05 Thread Evan R. Sparks
I'd expect that we can make GPU-accelerated BLAS faster than CPU BLAS in many cases. You might consider taking a look at the codepaths that BIDMat (https://github.com/BIDData/BIDMat) takes and comparing them to netlib-java/breeze. John Canny et al. have done a bunch of work optimizing to make

Using CUDA within Spark / boosting linear algebra

2015-02-05 Thread Ulanov, Alexander
Dear Spark developers, I am exploring how to make linear algebra operations faster within Spark. One way of doing this is to use the Scala Breeze library that is bundled with Spark. For matrix operations, it employs netlib-java, which has a Java wrapper for BLAS (Basic Linear Algebra Subprograms)

Re: Broken record a bit here: building spark on intellij with sbt

2015-02-05 Thread Stephen Boesch
Hi Akhil, Those instructions you provided show how to manually build an sbt project that may include adding Spark dependencies, whereas my OP was about how to open the existing Spark sbt project. These two are not similar tasks. 2015-02-04 21:46 GMT-08:00 Akhil Das

Re: Broken record a bit here: building spark on intellij with sbt

2015-02-05 Thread Arush Kharbanda
I follow these steps to import sbt projects. 1. Install the sbt plugin: Go to File -> Settings -> Plugins -> Install IntelliJ Plugins -> search for sbt and install it. 2. File -> Import -> browse to the root of the Spark source code. I hope this helps. On Fri, Feb 6, 2015 at 1:41 AM, Stephen Boesch