I am not sure why you need to create an RDD first. You can create a DataFrame directly from the CSV files, for instance:

spark.read.format("csv").option("header","true").schema(yourSchema).load(ftpUrl)
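
A fuller sketch of the same idea, in case it helps. The object name `FtpCsv`, the helper names, and the assumption that each record is four comma-separated fields with no quoted commas are illustrative and not taken from the original code. Whether the `ftp://` glob resolves also depends on the Hadoop FTP filesystem support in your build, which I have not verified. `readDirect` is the DataFrame route shown above; `readViaRdd` shows what the RDD route would need, since `createDataFrame(rdd, schema)` expects an `RDD[Row]` rather than an `RDD[Seq[String]]`.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical wrapper object; the name is illustrative only.
object FtpCsv {
  // Same four columns as the schema in the question.
  val schema: StructType = StructType(List(
    StructField("id", StringType, nullable = true),
    StructField("name", StringType, nullable = true),
    StructField("year", IntegerType, nullable = true),
    StructField("city", StringType, nullable = true)))

  // Route 1: let the CSV reader do the parsing (assumes a header row).
  def readDirect(spark: SparkSession, url: String): DataFrame =
    spark.read.format("csv")
      .option("header", "true")
      .schema(schema)
      .load(url)

  // Turn one CSV line into a Row whose value types match the schema.
  // Assumes exactly four fields and no quoted commas.
  def parseLine(line: String): Row = {
    val f = line.split(",", -1).map(_.trim)
    Row(f(0), f(1), f(2).toInt, f(3))
  }

  // Route 2: keep wholeTextFiles, but split each file into lines and
  // build an RDD[Row]. Assumes the files have no header row.
  def readViaRdd(spark: SparkSession, url: String): DataFrame = {
    val rows = spark.sparkContext.wholeTextFiles(url)
      .flatMap { case (_, content) => content.split("\\r?\\n") }
      .filter(_.trim.nonEmpty)
      .map(parseLine)
    spark.createDataFrame(rows, schema)
  }
}
```

With an existing session, either `FtpCsv.readDirect(spark, ftpUrl)` or `FtpCsv.readViaRdd(spark, ftpUrl)` would return the DataFrame. The first is shorter and also handles quoting and malformed rows through the reader's own options.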

-- ND

On 8/5/21 3:14 AM, igyu wrote:
val ftpUrl = "ftp://test:test@ip:21/upload/test/_temporary/0/_temporary/task_20191211114756_0002_m_000000_0/*"
val rdd = spark.sparkContext.wholeTextFiles(ftpUrl)
val value = rdd.map(_._2).map(csv => csv.split(",").toSeq)

val schemas = StructType(List(
  new StructField("id", DataTypes.StringType, true),
  new StructField("name", DataTypes.StringType, true),
  new StructField("year", DataTypes.IntegerType, true),
  new StructField("city", DataTypes.StringType, true)))
val DF = spark.createDataFrame(value,schemas)
How can I create the DataFrame?

------------------------------------------------------------------------
igyu
