[ https://issues.apache.org/jira/browse/ARROW-14798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17470904#comment-17470904 ]
Will Jones commented on ARROW-14798:
------------------------------------
Another issue I noticed: most column types appear on a single line in a
Table repr, but a struct column displays its child arrays across multiple
lines (as it does when the array is printed on its own).
{code:python}
>>> pa.table({'arr': pa.array([{'x': 1}, {'x': 2}, {'x': 3}]),
... 'arr2': pa.array([1, 2, 3])})
pyarrow.Table
arr: struct<x: int64>
  child 0, x: int64
arr2: int64
----
arr: [ -- is_valid: all not null -- child 0 type: int64
[
1,
2,
3
]]
arr2: [[1,2,3]]
{code}
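For illustration only, here is a minimal sketch of how a multi-line element list like the one above could be folded onto a single line. The helper name {{collapse_repr}} is hypothetical and not part of pyarrow; it handles only the plain bracketed list and ignores the struct header text ({{-- is_valid}}, {{-- child 0 type}}).

```python
def collapse_repr(text: str) -> str:
    """Join a multi-line bracketed repr into one line.

    Hypothetical helper for illustration; not a pyarrow API.
    Leading and trailing whitespace on every line is dropped,
    then the lines are concatenated without a separator.
    """
    return "".join(line.strip() for line in text.splitlines())


multi_line = "[\n  1,\n  2,\n  3\n]"
one_line = collapse_repr(multi_line)
print(one_line)
```

A real fix would more likely live in the pretty-printer itself rather than in post-processing the string; this only shows the target shape.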
> [Python] Limit the size of the repr for large Tables
> ----------------------------------------------------
>
> Key: ARROW-14798
> URL: https://issues.apache.org/jira/browse/ARROW-14798
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++, Python
> Reporter: Joris Van den Bossche
> Assignee: Will Jones
> Priority: Major
> Labels: good-first-issue, pull-request-available
> Fix For: 8.0.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The new repr is nice in that it shows a preview of the data, but it can also
> become very long and flood your console output for larger tables.
> We already default to 10 preview cols, but each column can still consist of
> many chunks. So it might be good to also limit it to 2 chunks?
> The ChunkedArray.to_string method already has a {{window}} keyword, but that
> seems to control both the number of elements to show per chunk and the number
> of chunks (while it would be nice to limit e.g. to 2 chunks but show up to 10
> elements for each chunk).
> cc [~amol-]
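
The separation requested in the description (one limit for the number of chunks, another for the elements shown per chunk) could look roughly like the sketch below. It uses plain Python lists as stand-ins for chunks; {{preview_chunks}}, {{max_chunks}} and {{max_elements}} are hypothetical names for illustration, not existing pyarrow parameters.

```python
_ELLIPSIS = object()  # sentinel marking where items were elided


def _shorten(items, limit):
    """Keep at most `limit` items, split between head and tail."""
    items = list(items)
    if len(items) <= limit:
        return items
    head = (limit + 1) // 2
    tail = limit - head
    return items[:head] + [_ELLIPSIS] + items[len(items) - tail:]


def preview_chunks(chunks, max_chunks=2, max_elements=10):
    """Render chunks with independent chunk and element limits.

    Hypothetical helper: `chunks` is a list of lists standing in
    for the chunks of a ChunkedArray.
    """
    parts = []
    for chunk in _shorten(chunks, max_chunks):
        if chunk is _ELLIPSIS:
            parts.append("...")
            continue
        values = [
            "..." if v is _ELLIPSIS else str(v)
            for v in _shorten(chunk, max_elements)
        ]
        parts.append("[" + ",".join(values) + "]")
    return "[" + ",".join(parts) + "]"


print(preview_chunks([[1, 2, 3], [4, 5], [6], [7, 8]],
                     max_chunks=2, max_elements=2))
```

With these settings the first and last chunk are shown, and each shown chunk keeps its first and last element. Whether truncation should keep both ends or only the head is a design choice this sketch does not settle.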