I've been looking at the source code of the PySpark date_add function
(https://spark.apache.org/docs/latest/api/python/_modules/pyspark/sql/functions.html#date_add)
and I'm wondering why the days argument is not converted to a Java column
the way the start argument is. As a result, when working with DataFrames you
can only add the same number of days to every date. I think it would make
more sense to convert days to a Java column as well, so that you could add a
different number of days to each date. The JVM date_add function has no
problem with this: I can add an integer column to a date column using the
expr function, e.g. expr("date_add(start, days)"). And if you wanted to add
the same number of days to every row, you could just pass a lit column
holding that number. The same argument applies to date_sub and add_months.
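
To make the workaround concrete, here is a minimal sketch of the expr approach
described above. The helper names (date_add_sql, with_per_row_date_add, demo)
and the column names start, days and end are my own, chosen for illustration;
they are not part of PySpark. The pyspark imports sit inside the functions so
the string-building helper can be used on its own.

```python
def date_add_sql(start_col: str, days_col: str) -> str:
    """Build the SQL snippet that adds a per-row number of days to a date."""
    return f"date_add({start_col}, {days_col})"


def with_per_row_date_add(df, start_col: str, days_col: str, out_col: str):
    """Return df with out_col = start_col + days_col, evaluated row by row.

    Goes through expr() so the SQL date_add function receives a column
    for the number of days instead of a single Python int.
    """
    from pyspark.sql.functions import expr

    return df.withColumn(out_col, expr(date_add_sql(start_col, days_col)))


def demo():
    """Small end-to-end example; needs a local Spark installation to run."""
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import to_date

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.createDataFrame(
        [("2020-01-01", 3), ("2020-01-01", 10)],
        ["start", "days"],
    ).withColumn("start", to_date("start"))

    # Each row gets its own offset taken from the days column.
    with_per_row_date_add(df, "start", "days", "end").show()
    spark.stop()
```

Calling demo() should add a different number of days to each row, which the
Python date_add wrapper as described above does not allow.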

Clay


