[ 
https://issues.apache.org/jira/browse/SPARK-28874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-28874:
---------------------------------
    Description: 
PySpark date_format adds one year to dates in the last days of the year:

Example:
{code:python}
from pyspark.sql.functions import date_format, lit
spark.range(1).select(date_format(lit("2010-12-26"), "YYYY-MM-dd")).show()
{code}

{code}
+-----------------------------------+
|date_format(2010-12-26, YYYY-MM-dd)|
+-----------------------------------+
|                         2011-12-26|
+-----------------------------------+
{code}
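
A likely explanation, offered as an assumption rather than a confirmed diagnosis: in Java's SimpleDateFormat pattern syntax, uppercase "Y" means the week-based year and lowercase "y" means the calendar year. If date_format passes the pattern to SimpleDateFormat with a US-style week definition (weeks start on Sunday, and week 1 is the week containing January 1), then every date in the same Sunday-to-Saturday week as the next January 1 is printed with the next year. The pure-Python sketch below models that rule. The helper name week_based_year is hypothetical and is not part of PySpark.

```python
from datetime import date, timedelta


def week_based_year(d: date) -> int:
    """Week-based year under an assumed US-style rule.

    Assumption: weeks run Sunday..Saturday and week 1 of a year is the
    week that contains January 1. Under that rule a date's week-based
    year is simply the calendar year of the Saturday ending its week.
    """
    # date.weekday(): Monday == 0 ... Sunday == 6
    days_since_sunday = (d.weekday() + 1) % 7
    week_end = d + timedelta(days=6 - days_since_sunday)
    return week_end.year


def is_shifted(d: date) -> bool:
    """True when the week-based year differs from the calendar year."""
    return week_based_year(d) != d.year


if __name__ == "__main__":
    for d in (date(2010, 12, 25), date(2010, 12, 26), date(2011, 12, 31)):
        print(d.isoformat(), week_based_year(d), is_shifted(d))
```

If this assumption holds, the lowercase pattern "yyyy-MM-dd" would be expected to return 2010-12-26 for the example above.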
 

  was:
PySpark date_format adds one year to dates in the last days of the year:

Example:
{code:python}
from datetime import datetime
from dateutil.relativedelta import relativedelta

import pandas as pd

from pyspark.sql.functions import date_format, col
from pyspark.sql.types import *

start_date = datetime(2010,1,1)
end_date = datetime(2055,1,1)
indx_ts = pd.date_range(start_date.strftime('%m/%d/%Y'), end_date.strftime('%m/%d/%Y'), freq='D')
data_date = [{"d": datetime.utcfromtimestamp(x.tolist() / 1e9)} for x in indx_ts.values]

df_p = spark.createDataFrame(data_date, StructType([StructField('d', DateType(), True)]))
df_string = df_p.withColumn("date_string" ,date_format(col("d"), "YYYY-MM-dd"))
df_string.filter("d!=date_string").show(1000)
{code}

{code}
+----------+-----------+
|         d|date_string|
+----------+-----------+
|2010-12-26| 2011-12-26|
|2010-12-27| 2011-12-27|
|2010-12-28| 2011-12-28|
|2010-12-29| 2011-12-29|
|2010-12-30| 2011-12-30|
|2010-12-31| 2011-12-31|
|2012-12-30| 2013-12-30|
|2012-12-31| 2013-12-31|
|2013-12-29| 2014-12-29|
|2013-12-30| 2014-12-30|
|2013-12-31| 2014-12-31|
|2014-12-28| 2015-12-28|
|2014-12-29| 2015-12-29|
|2014-12-30| 2015-12-30|
|2014-12-31| 2015-12-31|
|2015-12-27| 2016-12-27|
|2015-12-28| 2016-12-28|
|2015-12-29| 2016-12-29|
|2015-12-30| 2016-12-30|
|2015-12-31| 2016-12-31|
|2017-12-31| 2018-12-31|
|2018-12-30| 2019-12-30|
|2018-12-31| 2019-12-31|
|2019-12-29| 2020-12-29|
|2019-12-30| 2020-12-30|
|2019-12-31| 2020-12-31|
|2020-12-27| 2021-12-27|
|2020-12-28| 2021-12-28|
|2020-12-29| 2021-12-29|
|2020-12-30| 2021-12-30|
|2020-12-31| 2021-12-31|
|2021-12-26| 2022-12-26|
|2021-12-27| 2022-12-27|
|2021-12-28| 2022-12-28|
|2021-12-29| 2022-12-29|
|2021-12-30| 2022-12-30|
|2021-12-31| 2022-12-31|
|2023-12-31| 2024-12-31|
|2024-12-29| 2025-12-29|
|2024-12-30| 2025-12-30|
|2024-12-31| 2025-12-31|
|2025-12-28| 2026-12-28|
|2025-12-29| 2026-12-29|
|2025-12-30| 2026-12-30|
|2025-12-31| 2026-12-31|
|2026-12-27| 2027-12-27|
|2026-12-28| 2027-12-28|
|2026-12-29| 2027-12-29|
|2026-12-30| 2027-12-30|
|2026-12-31| 2027-12-31|
|2027-12-26| 2028-12-26|
|2027-12-27| 2028-12-27|
|2027-12-28| 2028-12-28|
|2027-12-29| 2028-12-29|
|2027-12-30| 2028-12-30|
|2027-12-31| 2028-12-31|
|2028-12-31| 2029-12-31|
|2029-12-30| 2030-12-30|
|2029-12-31| 2030-12-31|
|2030-12-29| 2031-12-29|
|2030-12-30| 2031-12-30|
|2030-12-31| 2031-12-31|
|2031-12-28| 2032-12-28|
|2031-12-29| 2032-12-29|
|2031-12-30| 2032-12-30|
|2031-12-31| 2032-12-31|
|2032-12-26| 2033-12-26|
|2032-12-27| 2033-12-27|
|2032-12-28| 2033-12-28|
|2032-12-29| 2033-12-29|
|2032-12-30| 2033-12-30|
|2032-12-31| 2033-12-31|
|2034-12-31| 2035-12-31|
|2035-12-30| 2036-12-30|
|2035-12-31| 2036-12-31|
|2036-12-28| 2037-12-28|
|2036-12-29| 2037-12-29|
|2036-12-30| 2037-12-30|
|2036-12-31| 2037-12-31|
|2037-12-27| 2038-12-27|
|2037-12-28| 2038-12-28|
|2037-12-29| 2038-12-29|
|2037-12-30| 2038-12-30|
|2037-12-31| 2038-12-31|
|2038-12-26| 2039-12-26|
|2038-12-27| 2039-12-27|
|2038-12-28| 2039-12-28|
|2038-12-29| 2039-12-29|
|2038-12-30| 2039-12-30|
|2038-12-31| 2039-12-31|
|2040-12-30| 2041-12-30|
|2040-12-31| 2041-12-31|
|2041-12-29| 2042-12-29|
|2041-12-30| 2042-12-30|
|2041-12-31| 2042-12-31|
|2042-12-28| 2043-12-28|
|2042-12-29| 2043-12-29|
|2042-12-30| 2043-12-30|
|2042-12-31| 2043-12-31|
|2043-12-27| 2044-12-27|
|2043-12-28| 2044-12-28|
|2043-12-29| 2044-12-29|
|2043-12-30| 2044-12-30|
|2043-12-31| 2044-12-31|
|2045-12-31| 2046-12-31|
|2046-12-30| 2047-12-30|
|2046-12-31| 2047-12-31|
|2047-12-29| 2048-12-29|
|2047-12-30| 2048-12-30|
|2047-12-31| 2048-12-31|
|2048-12-27| 2049-12-27|
|2048-12-28| 2049-12-28|
|2048-12-29| 2049-12-29|
|2048-12-30| 2049-12-30|
|2048-12-31| 2049-12-31|
|2049-12-26| 2050-12-26|
|2049-12-27| 2050-12-27|
|2049-12-28| 2050-12-28|
|2049-12-29| 2050-12-29|
|2049-12-30| 2050-12-30|
|2049-12-31| 2050-12-31|
|2051-12-31| 2052-12-31|
|2052-12-29| 2053-12-29|
|2052-12-30| 2053-12-30|
|2052-12-31| 2053-12-31|
|2053-12-28| 2054-12-28|
|2053-12-29| 2054-12-29|
|2053-12-30| 2054-12-30|
|2053-12-31| 2054-12-31|
|2054-12-27| 2055-12-27|
|2054-12-28| 2055-12-28|
|2054-12-29| 2055-12-29|
|2054-12-30| 2055-12-30|
|2054-12-31| 2055-12-31|
+----------+-----------+
{code}
 


> Pyspark bug in date_format
> --------------------------
>
>                 Key: SPARK-28874
>                 URL: https://issues.apache.org/jira/browse/SPARK-28874
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.1.0, 2.3.0
>            Reporter: Luis
>            Priority: Major


