Hi George,

You can try mounting a larger PersistentVolume as Spark's local scratch directory, as described here, instead of relying on the pod's default local dir, which may have site-specific size constraints:
https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-kubernetes-volumes
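As a rough sketch (the claim name "spark-scratch" and the mount path are placeholders; this assumes a pre-created PVC and Spark 3.1+), Spark uses any mounted volume whose name starts with spark-local-dir- for shuffle/spill scratch space instead of the default emptyDir:

    spark-submit \
      ... \
      # mount the PVC "spark-scratch" into each executor; the
      # spark-local-dir- name prefix makes Spark use it as local storage
      --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-scratch.options.claimName=spark-scratch \
      --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-scratch.mount.path=/data/spark-scratch \
      --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-scratch.mount.readOnly=false \
      ...

If the volume name doesn't carry that prefix, Spark keeps writing shuffle data to the emptyDir-backed local dir (often on /tmp of the node), which would explain still running out of space even with a PV mounted.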
-Matt

> On Sep 1, 2022, at 09:16, Manoj GEORGE <manoj.geo...@amadeus.com.invalid> wrote:
>
> Hi Team,
>
> I am new to Spark, so please excuse my ignorance.
>
> Currently we are trying to run PySpark on a Kubernetes cluster. The setup is working fine for some jobs, but when we are processing a large file (36 GB), we run into disk space issues.
>
> Based on what we found on the internet, we have mapped the local dir to a persistent volume. This still doesn't solve the issue.
>
> I am not sure if it is still writing to the /tmp folder on the pod. Is there some other setting which needs to be changed for this to work?
>
> Thanks in advance.
>
> Thanks,
> Manoj George
> Manager Database Architecture
> M: +1 3522786801
> manoj.geo...@amadeus.com
> www.amadeus.com