final_schema_noise_data = sqlContext.createDataFrame(noise_data_parts, noise_data_struct_schema)
for a_name in name_field_names:
    final_schema_noise_data = final_schema_noise_data.withColumn(a_name, spaceDeleteUDF(a_name))
    #--- till here final_schema_noise_data.collect() is working---
    for t in noise_chars:
        final_schema_noise_data = final_schema_noise_data.na.replace(t, '', a_name)
        print a_name, t
# The above loop completes, but final_schema_noise_data.collect() does not
# yield any result: the cursor moves to the next line and some processing goes
# on for hours with no output.
# Before the inner for loop, df.collect() returns output in seconds; after the
# loop completes, there is no output for hours.
*Is there any known issue with the df.na.replace function?*
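
A possible alternative to the chained na.replace calls, as a minimal sketch only: build one regex character class from noise_chars and strip all noise characters in a single pass per column. As far as I understand, na.replace matches whole cell values rather than substrings, so a regex may also be closer to the intent. build_noise_pattern, strip_noise and the sample values are hypothetical names introduced for illustration. The plain-Python part only demonstrates the pattern; the Spark call in the trailing comment (pyspark.sql.functions.regexp_replace) is an assumption that was not run against this data.

```python
import re


def build_noise_pattern(noise_chars):
    """Return a regex character class matching any single noise character.

    Assumes noise_chars is non-empty and each entry is one character.
    """
    # re.escape guards characters such as ']' or '^' that are special inside [...]
    return "[" + "".join(re.escape(c) for c in noise_chars) + "]"


def strip_noise(value, pattern):
    """Remove every match of pattern from value; None passes through unchanged."""
    if value is None:
        return None
    return re.sub(pattern, "", value)


# Hypothetical sample values; the real noise_chars list is not shown in this post.
sample_noise_chars = ["#", "@", "*"]
sample_pattern = build_noise_pattern(sample_noise_chars)
cleaned = strip_noise("J#oh@n*", sample_pattern)

# Assumed Spark equivalent (untested here), one column expression per name field:
#   from pyspark.sql.functions import regexp_replace
#   df = df.withColumn(a_name, regexp_replace(df[a_name], sample_pattern, ""))
```

With this approach each name column would need one withColumn call instead of one na.replace call per noise character, so the number of chained transformations stays at len(name_field_names).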
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Dataframe-replace-collect-going-in-indefinite-time-loop-tp21492.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.