dchvn opened a new pull request #34182:
URL: https://github.com/apache/spark/pull/34182


   ### What changes were proposed in this pull request?
   Refactor the to_datetime function in namespace.py so that ps.to_datetime 
accepts plural column names such as years, months, and days.
   
   
   ### Why are the changes needed?
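   In pandas API on Spark, ps.to_datetime does not support plural column names 
such as years, months, and days, although pandas accepts them. The call fails 
with a KeyError because the function selects the singular names year, month, 
and day: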
   ``` python
   # pandas
   df_test = pd.DataFrame({'years': [2015, 2016], 'months': [2, 3], 'days': [4, 5]})
   df_test['date'] = pd.to_datetime(df_test[['years', 'months', 'days']])
   df_test
   
      years  months  days       date
   0   2015       2     4 2015-02-04
   1   2016       3     5 2016-03-05
   
   
   # pandas on spark
   df_test = ps.DataFrame({'years': [2015, 2016], 'months': [2, 3], 'days': [4, 5]})
   df_test['date'] = ps.to_datetime(df_test[['years', 'months', 'days']])
   
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/home/project/spark/python/pyspark/pandas/namespace.py", line 1643, in to_datetime
       psdf = arg[["year", "month", "day"]]
     File "/home/project/spark/python/pyspark/pandas/frame.py", line 11888, in __getitem__
       return self.loc[:, list(key)]
     File "/home/project/spark/python/pyspark/pandas/indexing.py", line 480, in __getitem__
       ) = self._select_cols(cols_sel)
     File "/home/project/spark/python/pyspark/pandas/indexing.py", line 325, in _select_cols
       return self._select_cols_by_iterable(cols_sel, missing_keys)
     File "/home/project/spark/python/pyspark/pandas/indexing.py", line 1356, in _select_cols_by_iterable
       raise KeyError("['{}'] not in index".format(name_like_string(key)))
   KeyError: "['year'] not in index"
   ```
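
The traceback shows the lookup failing on the hard-coded singular names. One way to tolerate plurals is to normalize each column name to a canonical unit before selecting columns. The following is a minimal sketch of that idea, not the code in this patch: `UNIT_ALIASES`, `REQUIRED_UNITS`, and `resolve_unit_columns` are hypothetical names chosen for illustration, and only the year, month, and day units from the example above are covered.

```python
# Hypothetical helper for illustration only; these names are not taken
# from the patch or from pyspark.
UNIT_ALIASES = {
    "year": "year", "years": "year",
    "month": "month", "months": "month",
    "day": "day", "days": "day",
}
REQUIRED_UNITS = ("year", "month", "day")


def resolve_unit_columns(columns):
    """Return the actual column names supplying year, month, day, in that order."""
    found = {}
    for col in columns:
        # Normalize case and map singular or plural spellings to one unit.
        unit = UNIT_ALIASES.get(str(col).lower())
        if unit is not None and unit not in found:
            found[unit] = col
    missing = [unit for unit in REQUIRED_UNITS if unit not in found]
    if missing:
        raise ValueError(
            "to assemble datetimes, columns are missing for: " + ", ".join(missing)
        )
    return [found[unit] for unit in REQUIRED_UNITS]


# The plural names from the failing example resolve to themselves,
# so the frame can be indexed with the names it actually has.
print(resolve_unit_columns(["years", "months", "days"]))
```

With a helper like this, the selection step could index the frame with the resolved names instead of the fixed list `["year", "month", "day"]`.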
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. After this change, ps.to_datetime accepts the plural column names:
   ``` python
   df_test = ps.DataFrame({'years': [2015, 2016], 'months': [2, 3], 'days': [4, 5]})
   df_test['date'] = ps.to_datetime(df_test[['years', 'months', 'days']])
   df_test
   
      years  months  days       date
   0   2015       2     4 2015-02-04
   1   2016       3     5 2016-03-05
   ```
   
   ### How was this patch tested?
   Unit test


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]