Hi,

Can someone please suggest a real-life application implemented in Spark
(something like gene sequencing) that follows the pattern in the code below?
Basically, the application should submit jobs from as many threads as
possible. I need a Spark application of this kind for benchmarking.


// Assumes logData (an RDD[String]) and end (the number of iterations)
// are already defined.

// Each thread repeatedly submits its own count job on the shared RDD.
val threadA = new Thread(new Runnable {
  def run(): Unit = {
    for (_ <- 0 until end) {
      val numAs = logData.filter(line => line.contains("a"))
      // numAs.saveAsTextFile("hdfs:/t1")
      println("Lines with a: %s".format(numAs.count()))
    }
  }
})

val threadB = new Thread(new Runnable {
  def run(): Unit = {
    for (_ <- 0 until end) {
      val numBs = logData.filter(line => line.contains("b"))
      // numBs.saveAsTextFile("hdfs:/t2")
      println("Lines with b: %s".format(numBs.count()))
    }
  }
})

val threadC = new Thread(new Runnable {
  def run(): Unit = {
    for (_ <- 0 until end) {
      val numCs = logData.filter(line => line.contains("c"))
      // numCs.saveAsTextFile("hdfs:/t3")
      println("Lines with c: %s".format(numCs.count()))
    }
  }
})

val threadD = new Thread(new Runnable {
  def run(): Unit = {
    for (_ <- 0 until end) {
      val numDs = logData.filter(line => line.contains("d"))
      // numDs.saveAsTextFile("hdfs:/t4")
      println("Lines with d: %s".format(numDs.count()))
    }
  }
})

// Start all four threads so their jobs are submitted concurrently,
// then wait for every thread to finish.
val threads = Seq(threadA, threadB, threadC, threadD)
threads.foreach(_.start())
threads.foreach(_.join())
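
To scale this to "as many threads as possible" without copying the block once per letter, the same pattern could be written once over a thread pool. This is only a rough sketch: ConcurrentJobs, runConcurrently and countFor are names I made up for illustration, and countFor stands in for the Spark action so that the snippet also compiles without Spark on the classpath.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

// Hypothetical helper, not part of Spark.
object ConcurrentJobs {
  // Runs `iterations` counting jobs per key, one worker thread per key,
  // and returns the last count observed for each key.
  // `countFor` is a stand-in for a Spark action (see the usage note).
  def runConcurrently(keys: Seq[String], iterations: Int)(
      countFor: String => Long): Map[String, Long] = {
    // One thread per key; the pool needs at least one thread.
    val pool = Executors.newFixedThreadPool(math.max(1, keys.size))
    implicit val ec: ExecutionContext =
      ExecutionContext.fromExecutorService(pool)
    try {
      val futures = keys.map { key =>
        Future {
          var last = 0L
          for (_ <- 0 until iterations) {
            last = countFor(key) // each call submits one job
            println(s"Lines with $key: $last")
          }
          key -> last
        }
      }
      // Block until every worker has finished all of its iterations.
      Await.result(Future.sequence(futures), Duration.Inf).toMap
    } finally {
      pool.shutdown()
    }
  }
}
```

With the logData and end from my snippet, the call would look something like
ConcurrentJobs.runConcurrently(Seq("a", "b", "c", "d"), end)(k => logData.filter(_.contains(k)).count()),
and adding more keys adds more concurrent submitters.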

Regards
Karthik



