Hi All,

Any idea about this?

Thanks,
Rishi

On Tue, May 21, 2019 at 11:29 PM Rishi Shah <rishishah.s...@gmail.com> wrote:

> Hi All,
>
> What is the best way to determine partitions of a dataframe dynamically
> before writing to disk?
>
> 1) statically determine based on data and use coalesce or repartition
> while writing
> 2) somehow determine count of records for entire dataframe and divide that
> number to determine partition - however how to determine total count
> without having to risk computing dataframe twice (if dataframe is not
> cached, and count() is used)
>
> --
> Regards,
>
> Rishi Shah

--
Regards,

Rishi Shah
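
To make the division step in option 2 concrete, here is a minimal sketch of the arithmetic only. The function name `partitions_for` and the default target of 1,000,000 records per partition are hypothetical, chosen for illustration; a sensible target depends on row width and the file size you want on disk.

```python
def partitions_for(total_records: int, records_per_partition: int = 1_000_000) -> int:
    """Partition count such that the average partition holds at most the target.

    Both the name and the default target are illustrative assumptions.
    """
    if total_records < 0:
        raise ValueError("total_records must be non-negative")
    if records_per_partition <= 0:
        raise ValueError("records_per_partition must be positive")
    # Integer ceiling division avoids float rounding on very large counts.
    needed = -(-total_records // records_per_partition)
    # Always keep at least one partition, even for an empty dataframe.
    return max(1, needed)
```

In PySpark this would presumably be fed by calling `df.cache()` and then `df.count()`, so that the later write can reuse the cached data instead of recomputing the dataframe, followed by `df.repartition(n)` before writing. Caching has its own memory cost, so whether this beats a static choice depends on the job.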