When data is stored as a Dataset[Row], its memory usage is very small.
When I call collect() to bring the data to the driver, each row of the
Dataset is converted to a Row object and stored in an array, which causes
a large memory overhead.
So, can I collect a Dataset[Row] to the driver while keeping its data format?

Best regards,
maqy
