Re: How the data is distributed

2022-06-07 Thread Sid
Thank you for the information.


On Tue, 7 Jun 2022, 03:21 Sean Owen wrote:

> Data is not distributed to executors by anything. If you are processing
> data with Spark, Spark spawns tasks on executors to read chunks of data
> from wherever they are (S3, HDFS, etc.).
>
>
> On Mon, Jun 6, 2022 at 4:07 PM Sid wrote:
>
>> Hi experts,
>>
>>
>> When we load any file, I know that, based on the information in the Spark
>> session about the executors' location, status, etc., the data is
>> distributed among the worker nodes and executors.
>>
>> But I have one question: is the data first loaded on the driver and then
>> distributed, or is it distributed directly among the workers?
>>
>> Thanks,
>> Sid
>>
>


Re: How the data is distributed

2022-06-06 Thread Sean Owen
Data is not distributed to executors by anything. If you are processing
data with Spark, Spark spawns tasks on executors to read chunks of data
from wherever they are (S3, HDFS, etc.).
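
To make the idea concrete, here is a toy sketch in plain Python. It is
not Spark code and does not use any Spark API. The names plan_splits,
read_split and run_job are hypothetical, a local file stands in for
S3/HDFS, and a thread pool stands in for the executors. The point it
illustrates is that the "driver" only looks at metadata and hands out
byte ranges, while each task opens the source and reads its own chunk.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def plan_splits(path, split_size):
    """'Driver' side (hypothetical): use only file metadata to plan byte ranges."""
    size = os.path.getsize(path)
    return [(off, min(split_size, size - off)) for off in range(0, size, split_size)]


def read_split(path, offset, length):
    """'Executor' side (hypothetical): the task opens the source and reads its own range."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def run_job(path, split_size, num_executors=4):
    """Plan splits, run one read task per split, and return only small summaries."""
    splits = plan_splits(path, split_size)
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        futures = [pool.submit(read_split, path, off, ln) for off, ln in splits]
        # Only the byte counts come back to the caller, not the chunks themselves.
        sizes = [len(f.result()) for f in futures]
    return splits, sizes
```

In this sketch the file contents never pass through the code that plans
the splits. Only the path and the (offset, length) pairs are handed to
the tasks, which is the analogue of the driver never holding the data.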


On Mon, Jun 6, 2022 at 4:07 PM Sid wrote:

> Hi experts,
>
>
> When we load any file, I know that, based on the information in the Spark
> session about the executors' location, status, etc., the data is
> distributed among the worker nodes and executors.
>
> But I have one question: is the data first loaded on the driver and then
> distributed, or is it distributed directly among the workers?
>
> Thanks,
> Sid
>


Re: How the data is distributed

2022-06-06 Thread Peyman Mohajerian
The latter.

On Mon, Jun 6, 2022 at 2:07 PM Sid wrote:

> Hi experts,
>
>
> When we load any file, I know that, based on the information in the Spark
> session about the executors' location, status, etc., the data is
> distributed among the worker nodes and executors.
>
> But I have one question: is the data first loaded on the driver and then
> distributed, or is it distributed directly among the workers?
>
> Thanks,
> Sid
>