jorisvandenbossche commented on a change in pull request #11445:
URL: https://github.com/apache/arrow/pull/11445#discussion_r732616210



##########
File path: docs/source/python/pandas.rst
##########
@@ -160,12 +160,50 @@ Arrow -> pandas Conversion
 Categorical types
 ~~~~~~~~~~~~~~~~~
 
-TODO
+`Pandas categorical <https://pandas.pydata.org/pandas-docs/stable/user_guide/categorical.html>`_
+columns are converted to :ref:`Arrow array dictionaries <data.dictionary>`,
+an special array type optimized to handle repeated, limited, and fixed,
+number of possible values.

Review comment:
       ```suggestion
   a special array type optimized to handle a repeated and limited
   number of possible values.
   ```
   
   (In pyarrow the number of possible values is not "fixed" in the same sense as
   in pandas: for example, concatenating two dictionary arrays with different
   dictionary values merges the two dictionaries.)

##########
File path: docs/source/python/pandas.rst
##########
@@ -160,12 +160,50 @@ Arrow -> pandas Conversion
 Categorical types
 ~~~~~~~~~~~~~~~~~
 
-TODO
+`Pandas categorical <https://pandas.pydata.org/pandas-docs/stable/user_guide/categorical.html>`_
+columns are converted to :ref:`Arrow array dictionaries <data.dictionary>`,
+an special array type optimized to handle repeated, limited, and fixed,
+number of possible values.
+
+.. ipython:: python
+
+   df = pd.DataFrame({"cat": pd.Categorical(["a", "b", "c", "a", "b", "c"])})
+   df.cat.dtype.categories
+   df
+
+   table = pa.Table.from_pandas(df)
+   table
+
+We can inspect the :class:`~.ChunkedArray` of the created table and see the
+same categories of the Pandas DataFrame.
+
+.. ipython:: python
+
+   column = table[0]
+   chunk = column.chunk(0)
+   chunk.dictionary
+   chunk.indices
 
 Datetime (Timestamp) types
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-TODO
+`Pandas Timestamps <https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html>`_
+use the ``datetime64[ns]`` type in Pandas and are converted to an Arrow
+:class:`~.TimestampArray`.
+
+.. ipython:: python
+
+   df = pd.DataFrame({"datetime": pd.date_range("2020-01-01T00:00:00Z", freq="H", periods=3)})
+   df.dtypes
+   df
+
+   table = pa.Table.from_pandas(df)
+   table
+   table[0]
+
+On this example the Pandas Timestamp is time zone aware
+(``UTC`` on this case) information that is used to create the Arrow
+:class:`~.TimestampArray`.

Review comment:
       ```suggestion
   In this example the Pandas Timestamp is time zone aware
   (``UTC`` in this case), and this information is used to create the Arrow
   :class:`~.TimestampArray`.
   ```

##########
File path: docs/source/python/pandas.rst
##########
@@ -160,12 +160,50 @@ Arrow -> pandas Conversion
 Categorical types
 ~~~~~~~~~~~~~~~~~
 
-TODO
+`Pandas categorical <https://pandas.pydata.org/pandas-docs/stable/user_guide/categorical.html>`_
+columns are converted to :ref:`Arrow array dictionaries <data.dictionary>`,
+an special array type optimized to handle repeated, limited, and fixed,
+number of possible values.
+
+.. ipython:: python
+
+   df = pd.DataFrame({"cat": pd.Categorical(["a", "b", "c", "a", "b", "c"])})
+   df.cat.dtype.categories
+   df
+
+   table = pa.Table.from_pandas(df)
+   table
+
+We can inspect the :class:`~.ChunkedArray` of the created table and see the
+same categories of the Pandas DataFrame.
+
+.. ipython:: python
+
+   column = table[0]
+   chunk = column.chunk(0)
+   chunk.dictionary
+   chunk.indices
 
 Datetime (Timestamp) types
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-TODO
+`Pandas Timestamps <https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html>`_
+use the ``datetime64[ns]`` type in Pandas and are converted to an Arrow
+:class:`~.TimestampArray`.
+
+.. ipython:: python
+
+   df = pd.DataFrame({"datetime": pd.date_range("2020-01-01T00:00:00Z", freq="H", periods=3)})
+   df.dtypes
+   df
+
+   table = pa.Table.from_pandas(df)
+   table
+   table[0]

Review comment:
       ```suggestion
      table
   ```
   
   (The repr of `table` now includes a preview of the data, so also showing
   the column adds little information.)

##########
File path: docs/source/python/pandas.rst
##########
@@ -160,12 +160,50 @@ Arrow -> pandas Conversion
 Categorical types
 ~~~~~~~~~~~~~~~~~
 
-TODO
+`Pandas categorical <https://pandas.pydata.org/pandas-docs/stable/user_guide/categorical.html>`_
+columns are converted to :ref:`Arrow array dictionaries <data.dictionary>`,

Review comment:
       ```suggestion
   columns are converted to :ref:`Arrow dictionary arrays <data.dictionary>`,
   ```



