When data is stored as a Dataset[Row], its memory usage is very small. When I call collect() to bring the data to the driver, each row of the Dataset is converted to a Row object and stored in an array, which adds significant memory overhead. Is there a way to collect a Dataset[Row] to the driver while keeping its original data format?
Best regards, maqy