I am not sure what you are trying to achieve here. Could you please tell us what the goal of the program is, perhaps with some example data?
Besides this, I have the feeling that it will fail as soon as it runs on more than a single node, because the function passed to map refers to the global counter variable. Each executor JVM would update its own copy of that static field, and nothing guarantees that rows are processed in sorted order across partitions. It is also unclear why you collect the data to the driver first, only to parallelize it again.

> On 18 Sep 2016, at 14:26, sudhindra <smag...@gmail.com> wrote:
>
> Hi i have coded something like this , pls tell me how bad it is .
>
> package Spark.spark;
> import java.util.List;
> import java.util.function.Function;
>
> import org.apache.spark.SparkConf;
> import org.apache.spark.SparkContext;
> import org.apache.spark.api.java.JavaRDD;
> import org.apache.spark.api.java.JavaSparkContext;
> import org.apache.spark.sql.DataFrame;
> import org.apache.spark.sql.Dataset;
> import org.apache.spark.sql.Row;
> import org.apache.spark.sql.SQLContext;
>
> public class App
> {
>     static long counter=1;
>     public static void main( String[] args )
>     {
>         SparkConf conf = new SparkConf().setAppName("sorter").setMaster("local[2]").set("spark.executor.memory","1g");
>         JavaSparkContext sc = new JavaSparkContext(conf);
>
>         SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);
>
>         DataFrame df = sqlContext.read().json("path");
>         DataFrame sortedDF = df.sort("id");
>         //df.show();
>         //sortedDF.printSchema();
>
>         System.out.println(sortedDF.collectAsList().toString());
>         JavaRDD<Row> distData = sc.parallelize(sortedDF.collectAsList());
>
>         List<String >missingNumbers=distData.map(new org.apache.spark.api.java.function.Function<Row, String>() {
>
>             public String call(Row arg0) throws Exception {
>                 // TODO Auto-generated method stub
>
>                 if(counter!=new Integer(arg0.getString(0)).intValue())
>                 {
>                     StringBuffer misses = new StringBuffer();
>                     long newCounter=counter;
>                     while(newCounter!=new Integer(arg0.getString(0)).intValue())
>                     {
>                         misses.append(new String(new Integer((int) counter).toString()) );
>                         newCounter++;
>
>                     }
>                     counter=new Integer(arg0.getString(0)).intValue()+1;
>                     return misses.toString();
>
>                 }
>                 counter++;
>                 return null;
>             }
>         }).collect();
>
>         for (String name: missingNumbers) {
>             System.out.println(name);
>         }
>     }
> }
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/filling-missing-values-in-a-sequence-tp5708p27748.html
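
On the global counter point: if the goal is to list the ids missing from a sorted sequence (my guess; please correct me if not), each gap depends only on two neighbouring values, so no shared mutable counter is needed. Below is a minimal plain-Java sketch of that idea, without Spark. The class name GapFinder and the sample ids are made up for illustration, and it only reports gaps between neighbours (it does not assume the sequence starts at 1, as your counter does). In Spark, the same pairing of each row with its predecessor could probably be obtained with a window function such as lag, but I have not tried that on your data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original program.
final class GapFinder {

    // Returns every integer that lies strictly between two consecutive
    // entries of sortedIds. Assumes the list is sorted ascending;
    // duplicate ids simply produce no gap.
    static List<Long> missing(List<Long> sortedIds) {
        List<Long> gaps = new ArrayList<>();
        for (int i = 1; i < sortedIds.size(); i++) {
            long prev = sortedIds.get(i - 1);
            long curr = sortedIds.get(i);
            // The gap is computed from the pair (prev, curr) alone,
            // so there is no state carried between iterations.
            for (long m = prev + 1; m < curr; m++) {
                gaps.add(m);
            }
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Made-up example data.
        List<Long> ids = Arrays.asList(1L, 2L, 5L, 6L, 9L);
        System.out.println(missing(ids)); // prints [3, 4, 7, 8]
    }
}
```

Because the result for each pair is independent of every other pair, this formulation does not care which node or thread handles which pair, which is exactly what the static counter version cannot guarantee.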