Large variation in Spark Task Deserialization Time

2016-10-10 Thread Pulasthi Supun Wickramasinghe
Hi All, I am seeing a huge variation in Spark Task Deserialization Time for my collect and reduce operations. While most tasks complete within 100 ms, a few take more than a couple of seconds, which slows the entire program down. I have attached a screenshot of the web UI where you can see the

How to perform reduce operation in the same order as partition indexes

2016-05-18 Thread Pulasthi Supun Wickramasinghe
Hi Devs/All, I am pretty new to Spark. I have a program which does some map reduce operations with matrices. Here *shortrddFinal* is of type *RDD[Array[Short]]* and consists of several partitions: *var BC = shortrddFinal.mapPartitionsWithIndex(calculateBCInternal).reduce(mergeBC)* The map
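
One way to get an order-sensitive merge, sketched here in plain Java without a Spark dependency: tag each partition's result with its partition index, gather the pairs, sort them by index, and fold left to right. The names OrderedReduceSketch and reduceInPartitionOrder are hypothetical, and string concatenation stands in for mergeBC. In a Spark job the tagged pairs would presumably come from mapPartitionsWithIndex followed by a collect; that is an assumption about the setup, not something the thread confirms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

public class OrderedReduceSketch {

    // Merges per-partition results strictly in ascending partition-index order.
    // 'tagged' holds (partitionIndex, partialResult) pairs in arbitrary arrival order.
    static <T> T reduceInPartitionOrder(List<Map.Entry<Integer, T>> tagged,
                                        BinaryOperator<T> merge) {
        if (tagged.isEmpty()) {
            throw new IllegalArgumentException("no partition results to merge");
        }
        // Copy before sorting so the caller's list is left untouched.
        List<Map.Entry<Integer, T>> sorted = new ArrayList<>(tagged);
        sorted.sort(Map.Entry.comparingByKey());

        // Left fold: ((p0 merge p1) merge p2) ...
        T acc = sorted.get(0).getValue();
        for (int i = 1; i < sorted.size(); i++) {
            acc = merge.apply(acc, sorted.get(i).getValue());
        }
        return acc;
    }

    public static void main(String[] args) {
        // Partial results arriving out of order, as they might from parallel tasks.
        List<Map.Entry<Integer, String>> arrived = List.of(
                Map.entry(2, "c"), Map.entry(0, "a"), Map.entry(1, "b"));

        // Concatenation is not commutative, so the result exposes the merge order.
        System.out.println(reduceInPartitionOrder(arrived, (x, y) -> x + y)); // prints "abc"
    }
}
```

The sort runs on whichever side gathers the pairs, so this only suits cases where the per-partition results are small enough to collect in one place.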

Re: Creating BlockMatrix with java API

2015-09-23 Thread Pulasthi Supun Wickramasinghe
efine 'rdd' as JavaRDD<Tuple2<Tuple2<Object, > Object>, Matrix>> > > As Yanbo has mentioned, I think a Java friendly constructor is still in > demand. > > 2015-09-23 13:14 GMT+08:00 Pulasthi Supun Wickramasinghe > <pulasthi...@gmail.com>: > > Hi Sab

Re: Creating BlockMatrix with java API

2015-09-22 Thread Pulasthi Supun Wickramasinghe
BlockMatrix/RowMatrix/IndexedRowMatrix/CoordinateMatrix do > not provide Java friendly constructors. I have filed SPARK-10757 > <https://issues.apache.org/jira/browse/SPARK-10757> to track this issue. > > 2015-09-18 3:36 GMT+08:00 Pulasthi Supun Wickramasinghe < > pulasthi...

Re: Creating BlockMatrix with java API

2015-09-22 Thread Pulasthi Supun Wickramasinghe
w BlockMatrix(rdd.rdd(), 2, 2) > > should work. > > Regards > Sab > > On Tue, Sep 22, 2015 at 10:50 PM, Pulasthi Supun Wickramasinghe < > pulasthi...@gmail.com> wrote: > >> Hi Yanbo, >> >> Thanks for the reply. I thought I might be missing something. A

Creating BlockMatrix with java API

2015-09-17 Thread Pulasthi Supun Wickramasinghe
Hi All, I am new to Spark and I am trying to do some BlockMatrix operations with the MLlib APIs. But I can't seem to create a BlockMatrix with the Java API. I tried the following: Matrix matrixa = Matrices.rand(4, 4, new Random(1000)); List,Matrix>> list = new

Re: Do I need to learn Scala for spark ?

2014-04-21 Thread Pulasthi Supun Wickramasinghe
Hi, I think you can do just fine with your Java knowledge. There is a Java API that you can use [1]. I am also new to Spark and I have gotten by with just my Java knowledge. And Scala is easy to learn if you are good with Java. [1] http://spark.apache.org/docs/latest/java-programming-guide.html