I may be doing something wrong here, so any help would be greatly
appreciated. I am trying to store a nested Python dict in an Arrow table,
and I am getting some unexpected results. Here is the sample code:

import copy
import random

import pyarrow as pa


def test_it():
    arr = []
    for f in range(5):
        num_maps = random.randrange(4) + 1
        print("Number of maps = {}".format(num_maps))
        mdict = {}
        mdict["CORE"] = {}
        for r in range(num_maps):
            mdict["CORE"][str(r)] = {"status": "realized"}
        arr.append(copy.deepcopy(mdict))
    tbl = pa.Table.from_pydict({"_map": arr})
    print(tbl.to_pydict())


test_it()

This is the output of the code:
Number of maps = 1
Number of maps = 1
Number of maps = 2
Number of maps = 3
Number of maps = 2
{'_map': [{'CORE': {'0': {'status': 'realized'}, '1': None, '2': None}},
{'CORE': {'0': {'status': 'realized'}, '1': None, '2': None}}, {'CORE':
{'0': {'status': 'realized'}, '1': {'status': 'realized'}, '2': None}},
{'CORE': {'0': {'status': 'realized'}, '1': {'status': 'realized'}, '2':
{'status': 'realized'}}}, {'CORE': {'0': {'status': 'realized'}, '1':
{'status': 'realized'}, '2': None}}]}
It seems that when the table is created, any key missing from a row's "CORE"
dict is filled in with None, so that every row ends up with the same set of
keys. This is not what I wanted. Is this a feature, or am I missing something
that would let me get output without the null values?
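
To make the behavior concrete, the effect I am seeing can be mimicked in
plain Python, without pyarrow. This is only a sketch of the observed result
and not a claim about how Arrow does it internally; fill_missing_keys is a
hypothetical helper made up for illustration.

```python
def fill_missing_keys(rows):
    """Hypothetical helper: pad every dict to the union of all keys with None."""
    # Collect every key seen in any row, in first-seen order.
    all_keys = []
    for row in rows:
        for key in row:
            if key not in all_keys:
                all_keys.append(key)
    # Rebuild each row so it has every key; absent keys become None.
    return [{key: row.get(key) for key in all_keys} for row in rows]


# Two "CORE" dicts with a different number of entries, as in my test.
cores = [
    {"0": {"status": "realized"}},
    {"0": {"status": "realized"}, "1": {"status": "realized"}},
]
padded = fill_missing_keys(cores)
print(padded)
```

What I would like instead is for each row to come back with only the keys
it was stored with, i.e. the first row above should round-trip without the
"1" entry.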
Thanks,
Partha
--
Partha Dutta
[email protected]