[ 
https://issues.apache.org/jira/browse/ARROW-18001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17616421#comment-17616421
 ] 

Alenka Frim edited comment on ARROW-18001 at 10/13/22 4:08 AM:
---------------------------------------------------------------

Oh, sorry for the confusion and thank you for making it clear!


was (Author: alenkaf):
Oh, sorry for the confusion and thank you for clearing it out for me!

> [Python] Provide a way to specify the type of a subset of columns for 
> from_pandas
> ---------------------------------------------------------------------------------
>
>                 Key: ARROW-18001
>                 URL: https://issues.apache.org/jira/browse/ARROW-18001
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Alenka Frim
>            Priority: Major
>
> This question came up in the GitHub issue: 
> [https://github.com/apache/arrow/issues/14025] .
> h6. Description:
> If a user wants to change the type of a single column when using 
> {{to_parquet}} in pandas (or dask), they currently need to specify a schema 
> that includes all columns. If a column is not specified in the schema, it 
> will not be included in the Parquet file.
> The type inference happens when converting a Python object (e.g. a pandas 
> DataFrame or a dict) to an Arrow Table. Once you have such a table 
> with a fixed schema, writing to Parquet doesn't do type inference anymore 
> (since Arrow types map to Parquet types).
> h6. Proposal
> There should be a way to specify the type of a subset of columns for 
> {{from_pandas}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
