Hi All,

Any idea about this?
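
For reference, option 2 in the quoted message below could be sketched roughly as follows. This is only a sketch under stated assumptions, not a recommended implementation: the helper names (partitions_for, write_with_dynamic_partitions), the Parquet output format, the overwrite mode, and the default of one million rows per partition are all made up for illustration. The idea is to persist the DataFrame first so that count() and the write reuse the same cached data instead of recomputing the lineage twice.

```python
import math


def partitions_for(total_rows, rows_per_partition):
    """Return how many partitions are needed so that each one holds
    roughly rows_per_partition rows. Always returns at least 1."""
    if rows_per_partition <= 0:
        raise ValueError("rows_per_partition must be positive")
    return max(1, math.ceil(total_rows / rows_per_partition))


def write_with_dynamic_partitions(df, path, rows_per_partition=1_000_000):
    """Hypothetical helper: df is assumed to be a PySpark DataFrame.

    Persisting before count() lets the later write read the cached
    data rather than recompute the DataFrame a second time.
    """
    df = df.persist()
    try:
        total_rows = df.count()  # materializes the cache
        n = partitions_for(total_rows, rows_per_partition)
        # repartition(n) shuffles the rows into n roughly equal partitions
        df.repartition(n).write.mode("overwrite").parquet(path)
        return n
    finally:
        df.unpersist()  # release the cached blocks once the write is done
```

Persisting costs memory or disk, so this only pays off when recomputing the DataFrame is more expensive than caching it. If the real goal is to cap file size rather than to fix the partition count, the DataFrameWriter option "maxRecordsPerFile" (Spark 2.2 and later) may avoid the count altogether.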

Thanks,
Rishi

On Tue, May 21, 2019 at 11:29 PM Rishi Shah <rishishah.s...@gmail.com>
wrote:

> Hi All,
>
> What is the best way to determine the number of partitions for a
> DataFrame dynamically before writing it to disk?
>
> 1) Choose a fixed number based on what is known about the data, and use
> coalesce or repartition when writing.
> 2) Count the records in the entire DataFrame and divide that count by a
> target number of records per partition. But how can the total count be
> obtained without risking computing the DataFrame twice (if the DataFrame
> is not cached and count() is used)?
>
> --
> Regards,
>
> Rishi Shah
>


-- 
Regards,

Rishi Shah
