Hi,

Can someone please suggest a real-life application implemented in Spark (something like gene sequencing) that follows the pattern in the code below? Basically, the application should submit its jobs from as many threads as possible. I need this kind of Spark application for benchmarking.

val threadA = new Thread(new Runnable {
  def run() {
    for (i <- 0 until end) {
      val numAs = logData.filter(line => line.contains("a"))
      // numAs.saveAsTextFile("hdfs:/t1")
      println("Lines with a: %s".format(numAs.count))
    }
  }
})

val threadB = new Thread(new Runnable {
  def run() {
    for (i <- 0 until end) {
      val numBs = logData.filter(line => line.contains("b"))
      // numBs.saveAsTextFile("hdfs:/t2")
      println("Lines with b: %s".format(numBs.count))
    }
  }
})

val threadC = new Thread(new Runnable {
  def run() {
    for (i <- 0 until end) {
      val numCs = logData.filter(line => line.contains("c"))
      // numCs.saveAsTextFile("hdfs:/t3")
      println("Lines with c: %s".format(numCs.count))
    }
  }
})

val threadD = new Thread(new Runnable {
  def run() {
    for (i <- 0 until end) {
      val numDs = logData.filter(line => line.contains("d"))
      // numDs.saveAsTextFile("hdfs:/t4")
      println("Lines with d: %s".format(numDs.count))
    }
  }
})

Regards,
Karthik
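
For reference, the four-thread pattern above can be written more compactly. This is only a minimal sketch, not a real-life workload: the object name ConcurrentJobs, the helper countConcurrently, the local[4] master, the iteration count, and the input path taken from args(0) are all assumptions made for illustration. It relies on the documented Spark behaviour that jobs submitted from separate threads against one SparkContext can run concurrently, and it sets spark.scheduler.mode to FAIR so the jobs share resources rather than queue FIFO.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import scala.collection.concurrent.TrieMap

object ConcurrentJobs {

  // Hypothetical helper: starts one thread per letter; each thread submits
  // `iterations` count jobs and records the last count it observed.
  def countConcurrently(logData: RDD[String],
                        letters: Seq[String],
                        iterations: Int): Map[String, Long] = {
    val results = TrieMap.empty[String, Long] // thread-safe map
    val threads = letters.map { letter =>
      new Thread(new Runnable {
        def run(): Unit = {
          for (_ <- 0 until iterations) {
            // count() is an action, so each call submits one Spark job.
            val n = logData.filter(line => line.contains(letter)).count()
            results.put(letter, n)
          }
        }
      })
    }
    threads.foreach(_.start()) // submit jobs from all threads at once
    threads.foreach(_.join())  // wait for every thread to finish
    results.toMap
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with the FAIR scheduler; adjust for a cluster.
    val conf = new SparkConf()
      .setAppName("ConcurrentJobs")
      .setMaster("local[4]")
      .set("spark.scheduler.mode", "FAIR")
    val sc = new SparkContext(conf)
    try {
      // Assumption: the input path is passed as the first argument.
      val logData = sc.textFile(args(0)).cache()
      val counts = countConcurrently(logData, Seq("a", "b", "c", "d"), 10)
      counts.foreach { case (letter, n) => println(s"Lines with $letter: $n") }
    } finally {
      sc.stop()
    }
  }
}
```

Scaling the number of submitting threads then only means passing a longer list of filter terms.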