documentDF = spark.createDataFrame([
    ("Hi I heard about Spark".split(" "), ),
    ("I wish Java could use case classes".split(" "), ),
    ("Logistic regression models are neat".split(" "), )
], ["text"])

How can I get the same DataFrame when I am reading from a source file?

doc = spark.read.text("/Users/rs/Desktop/nohup.out")

How can I create an array<string> column named "sentences" from doc (the DataFrame)?

The attempt below creates more than one column:

rdd.map(lambda rdd: rdd[0]).map(lambda row: row.split(" "))

--
Selvam Raman
"லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
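
One possible way to get there is sketched below, assuming the file has one sentence per line and that a single space is the delimiter. The likely reason the RDD attempt yields several columns is that each token list is treated as a whole row, so every token becomes its own field; wrapping the list in a one-element tuple avoids that. The function names (line_to_row, read_sentences, read_sentences_rdd) are hypothetical, chosen only for this illustration.

```python
def line_to_row(line):
    """Wrap the token list in a 1-tuple so Spark sees a single array
    column instead of one column per token."""
    return (line.split(" "),)


def read_sentences(spark, path):
    """DataFrame-only variant: split the 'value' column in place."""
    # Imported here so line_to_row stays usable without Spark installed.
    from pyspark.sql.functions import col, split

    doc = spark.read.text(path)  # one string column named "value"
    # split() takes a regex pattern; a plain space is assumed here.
    return doc.select(split(col("value"), " ").alias("sentences"))


def read_sentences_rdd(spark, path):
    """RDD variant, closer to the attempt in the question."""
    doc = spark.read.text(path)
    # row[0] is the line text; each mapped element is a 1-tuple,
    # so toDF builds exactly one column of type array<string>.
    return doc.rdd.map(lambda row: line_to_row(row[0])).toDF(["sentences"])
```

With an existing SparkSession named spark, something like read_sentences(spark, "/Users/rs/Desktop/nohup.out").printSchema() should then report one column, sentences, of type array<string>. The DataFrame variant is usually preferable because it avoids the round trip through Python for every row.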