[
https://issues.apache.org/jira/browse/ARROW-18001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17616421#comment-17616421
]
Alenka Frim edited comment on ARROW-18001 at 10/13/22 4:08 AM:
---------------------------------------------------------------
Oh, sorry for the confusion and thank you for making it clear!
was (Author: alenkaf):
Oh, sorry for the confusion and thank you for clearing it out for me!
> [Python] Provide a way to specify the type of a subset of columns for
> from_pandas
> ---------------------------------------------------------------------------------
>
> Key: ARROW-18001
> URL: https://issues.apache.org/jira/browse/ARROW-18001
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Alenka Frim
> Priority: Major
>
> This question came up in the GitHub issue:
> [https://github.com/apache/arrow/issues/14025].
> h6. Description:
> If a user wants to change the type of a single column when using
> {{to_parquet}} in pandas (or dask), they currently need to specify a schema
> that includes all columns. Any column that is not listed in the schema
> is not written to the Parquet file.
> Type inference happens when converting a Python object (e.g. a pandas
> DataFrame or a dict) to an Arrow Table. Once you have a table
> with a fixed schema, writing to Parquet does no further type inference
> (since Arrow types map to Parquet types).
> h6. Proposal
> {{from_pandas}} should provide a way to specify the type of a subset of
> columns, with the remaining column types still inferred.