Hey everyone, hope you are doing well.
I was looking through the datetime format¹ in the docs and think the docs are slightly incorrect in describing the year format. It is written as:

"... If the count of letters is less than four (but not two), then the sign is only output for negative years. Otherwise, the sign is output if the pad width is exceeded when 'G' is not present ..."

A couple of things about this statement don't match the actual behaviour (the code below was run on Python 3.13.1 and Spark 3.5.4):

1. The negative sign is output for four or more letters regardless of whether the pad width is exceeded. In the following example the year is shorter than the pad width:

>>> df.select(F.date_format(F.make_date(F.lit(-2012), F.lit(1), F.lit(1)), 'yyyyy')).show()
+------------------------------------------+
|date_format(make_date(-2012, 1, 1), yyyyy)|
+------------------------------------------+
|                                    -02012|
+------------------------------------------+

I think the behaviour itself is obvious, but the doc needs some refinement.

2. The positive sign is output (when the pad width is exceeded) even if 'G' is present. Example:

>>> df.select(F.date_format(F.make_date(F.lit(20125), F.lit(1), F.lit(1)), 'yyyy G')).show()
+-------------------------------------------+
|date_format(make_date(20125, 1, 1), yyyy G)|
+-------------------------------------------+
|                                  +20125 AD|
+-------------------------------------------+

I don't know the exact behaviour of 'G', but it seems it never prints the negative sign: it converts the negative year to a 'BC' year and still prints the positive sign when the pad width is exceeded. Example:

>>> df.select(F.date_format(F.make_date(F.lit(-20125), F.lit(1), F.lit(1)), 'yyyy G')).show()
+--------------------------------------------+
|date_format(make_date(-20125, 1, 1), yyyy G)|
+--------------------------------------------+
|                                   +20126 BC|
+--------------------------------------------+

I think the statement should read something like: "Prints the negative sign for any number of letters except two.
Prints the positive sign for four or more letters when the pad width is exceeded."

It might be a small thing, but I still wanted to make sure. If I am correct, I can raise a pull request for the same.

Thanks,
Dhruv

¹: https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html
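P.S. To make the proposed wording concrete, here is a rough pure-Python sketch of the sign and era rules as I observe them in the examples above. `format_year` is a hypothetical helper I made up for illustration, not Spark's (or java.time's) actual implementation, and the two-letter reduced-year form ('yy') is deliberately not modeled:

```python
def format_year(year: int, letters: int, g_present: bool = False) -> str:
    """Sketch of how 'y' repeated `letters` times appears to behave.

    Assumptions (from the observed outputs, not from the source code):
    - with 'G', the sign is folded into the era: year 0 is 1 BC, -1 is 2 BC, ...
    - '+' is printed only for 4+ letters when the value exceeds the pad width
    - '-' is printed for negative years for any letter count except two
    """
    if g_present:
        # 'G' never prints a negative sign; negative years become BC years.
        year_of_era = year if year > 0 else 1 - year  # -20125 -> 20126 BC
        digits = str(year_of_era).rjust(letters, "0")
        sign = "+" if len(str(year_of_era)) > letters else ""
        era = "AD" if year > 0 else "BC"
        return f"{sign}{digits} {era}"
    digits = str(abs(year)).rjust(letters, "0")
    if year < 0 and letters != 2:
        # Negative sign regardless of whether the pad width is exceeded.
        return "-" + digits
    sign = "+" if year > 0 and letters >= 4 and len(str(year)) > letters else ""
    return sign + digits

print(format_year(-2012, 5))        # matches '-02012' above
print(format_year(20125, 4, True))  # matches '+20125 AD' above
print(format_year(-20125, 4, True)) # matches '+20126 BC' above
```

This is only meant to pin down the rule the reworded doc sentence should describe; the authoritative behaviour is java.time's formatter, which Spark delegates to.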