Hi, everyone!
    I consider flatmap as a narrow dependency , but why it has shuffle?
as shown on the web UI:
my code is as below :
val transferRDD = sc.textFile("hdfs://host:port/path")



    val rdd = transferRDD.map(line => {

                          val trunks = line.split("\t")

                          if(trunks.length == 32){

                            (trunks(11), trunks(13), 
Try(java.lang.Long.parseLong(trunks(9))).getOrElse(0l), trunks(14), trunks(19))

                          }

                        }).filter(arg =>arg != ()).map(arg => 
arg.asInstanceOf[(String, String, Long, String, String)]).filter(arg => arg._3 
!= 0)val flatMappedRDD = rdd.flatMap(arg => List((arg._1, (arg._2, arg._3, 1)), 
(arg._2, (arg._1, arg._3, 0))))
Thank for your help!


qinwei

Reply via email to