[ 
https://issues.apache.org/jira/browse/ARROW-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17608723#comment-17608723
 ] 

Joris Van den Bossche commented on ARROW-17821:
-----------------------------------------------

It seems that a map type would suit this case better (if the lists always 
have equal lengths and are essentially a key-value mapping), as you also 
mentioned in the top post. A MapArray can be created from two individual 
list arrays as follows:

{code}
In [1]: A = pa.array([['a', 'b'], ['a', 'b', 'c']])

In [2]: B = pa.array([[1, 2], [3, 4, 5]])

In [3]: M = pa.MapArray.from_arrays(A.offsets, A.values, B.values)

In [4]: M.type
Out[4]: MapType(map<string, int64>)

In [5]: M
Out[5]: 
<pyarrow.lib.MapArray object at 0x7fe7620e7340>
[
  keys:
  [
    "a",
    "b"
  ]
  values:
  [
    1,
    2
  ],
  keys:
  [
    "a",
    "b",
    "c"
  ]
  values:
  [
    3,
    4,
    5
  ]
]

In [6]: M.to_pandas()
Out[6]: 
0            [(a, 1), (b, 2)]
1    [(a, 3), (b, 4), (c, 5)]
dtype: object
{code}

> Implement zip()
> ---------------
>
>                 Key: ARROW-17821
>                 URL: https://issues.apache.org/jira/browse/ARROW-17821
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++, Python
>            Reporter: Adam Lippai
>            Priority: Major
>
> If column A has list(x), column B has list(y), column C has list(z), I'd 
> like to be able to create D = zip(A,B,C) where D would be list({ A: x, B: y, 
> C: z}).
> x, y, z are types in the example; the type of the resulting D is list(struct).
> Other features to consider:
>  * Zipping list(struct) with list\(x) or list(struct) with list(struct) 
> should be able to merge
>  * Zipping A,B into a Map with keys from A, values from B



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
