How many columns do you need from the big file?

Also, how CPU- and memory-intensive are the computations you want to perform?
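
To illustrate why the column count matters: Parquet is a columnar format, so a reader that selects only some columns does not have to fetch the rest. Below is a back-of-the-envelope sketch in Python. It assumes every column takes the same space on disk, which is rarely true. The 50-column schema and the 5 selected columns are made-up numbers, not taken from the question, and `estimate_scan_bytes` is a hypothetical helper rather than part of any library.

```python
TB = 10**12  # decimal terabyte


def estimate_scan_bytes(total_bytes, columns_needed, columns_total):
    """Rough number of bytes read from a columnar file.

    Assumption: all columns occupy the same space on disk.
    """
    if columns_total <= 0 or not 0 <= columns_needed <= columns_total:
        raise ValueError("need 0 <= columns_needed <= columns_total and columns_total > 0")
    return total_bytes * columns_needed / columns_total


# Hypothetical example: a 10 TB file with 50 columns, of which 5 are needed.
scan = estimate_scan_bytes(10 * TB, columns_needed=5, columns_total=50)
print(f"{scan / TB:.1f} TB")
```

Decoded data in memory is typically larger than the compressed bytes on disk, so an estimate like this is a starting point for sizing, not an upper bound.
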
Alexander Czech <alexander.cz...@googlemail.com> wrote on Mon, 27 Nov
2017 at 10:57:

> I want to load a 10 TB Parquet file from S3 and I'm trying to decide which
> EC2 instances to use.
>
> Should I go for instances whose combined memory is larger than 10 TB? Or
> is it enough that their combined SSD storage is large enough for
> everything to be spilled to disk?
>
> thanks
>
