[ https://issues.apache.org/jira/browse/SPARK-21664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
jifei_yang closed SPARK-21664.
------------------------------

You can use partitioning to encode the column values in the output path, for example:

{code:java}
import scala.util.Random

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext, SaveMode}

case class UserInfo(name: String, favorite_number: Int, favorite_color: String) extends Serializable

def mainSaveAsParquet(args: Array[String]): Unit = {
  val fileName = new Random().nextInt(43952858)
  val outPath = s"G:/project/idea15/xlwl/bigdata002/bigdata/sparkmvn/outpath/user/spark/parquet/temp/$fileName"
  val sparkConf = new SparkConf().setAppName("Spark Avro Test").setMaster("local[4]")
  MyKryoRegistrator.register(sparkConf)
  val sc = new SparkContext(sparkConf)
  val sqlContext = new SQLContext(sc)

  // Build 3001 sample records, alternating between two user-agent strings.
  val array = new Array[UserInfo](3001)
  for (i <- 0 to 3000) {
    i % 2 match {
      case 0 => array(i) = UserInfo("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36", 256 + (i / 102), "blue")
      case 1 => array(i) = UserInfo("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36 Edge/15.15063", 256 + i, "blue")
    }
  }

  import sqlContext.implicits._
  val records: DataFrame = sc.parallelize(array).toDF()
  // partitionBy writes one output directory per distinct (name, favorite_number) pair.
  records.repartition(1).write.partitionBy("name", "favorite_number").format("parquet").mode(SaveMode.ErrorIfExists).save(outPath)
  sc.stop()
}
{code}

This writes the name and favorite_number column values into the output directory names (Hive-style partitioning), so the column values appear in the file paths.

> Use the column name as the file name.
> --------------------------------------
>
>                 Key: SPARK-21664
>                 URL: https://issues.apache.org/jira/browse/SPARK-21664
>             Project: Spark
>          Issue Type: Question
>          Components: Input/Output
>    Affects Versions: 2.2.0
>            Reporter: jifei_yang
>            Priority: Major
>
> When we save a DataFrame, we want to use the column name as the file name. This is achievable with PairRDDFunctions. Can the same be done for DataFrame? Thank you.
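For illustration, the Hive-style partition path that partitionBy produces can be sketched in plain Scala. Note that partitionPath below is a hypothetical helper written for this example, not Spark's actual implementation; the real writer also escapes special characters in column values.

```scala
// Hypothetical sketch of how partitionBy-style paths are formed:
// each partition column contributes a "column=value" directory segment.
def partitionPath(cols: Seq[(String, Any)]): String =
  cols.map { case (col, value) => s"$col=$value" }.mkString("/")

// A row with name="alice" and favorite_number=258 would land under:
println(partitionPath(Seq("name" -> "alice", "favorite_number" -> 258)))
// name=alice/favorite_number=258
```

Files for each row are then written beneath that directory as part files (e.g. part-00000-*.parquet), which is why the column values, rather than the column names alone, end up in the path.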
-- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org