MurrayData commented on issue #34583:
URL: https://github.com/apache/arrow/issues/34583#issuecomment-1473235522

   > The easier workaround is probably to cast to large_string first (then you 
don't need the manual chunking):
   > 
   > ```
   > postcode_dict_large = postcode_dict.cast(pa.dictionary(pa.int32(), pa.large_string()))
   > postcode_dict_large.dictionary_decode()
   > ```
   
   Thank you @jorisvandenbossche. Simply specifying 
`type=pa.large_string()` on the dictionary values passed to 
`DictionaryArray.from_arrays` solved the problem. Noted for future 
applications. `pcds_id` is already `np.int32`.
   
   ````
   postcode_dict = pa.DictionaryArray.from_arrays(pa.array(pcds_id), 
pa.array(pcds, type=pa.large_string()))
   ````
   This worked fine:
   ````
   postcode_dict.dictionary_decode()
   
   <pyarrow.lib.LargeStringArray object at 0x7f3ca9aea0e0>
   [
     "AB1 0AA",
     "AB1 0AA",
     "AB1 0AA",
     "AB1 0AA",
     "AB1 0AA",
     "AB1 0AA",
     "AB1 0AA",
     "AB1 0AA",
     "AB1 0AA",
     "AB1 0AA",
     ...
     "ZE3 9JZ",
     "ZE3 9JZ",
     "ZE3 9XP",
     "ZE3 9XP",
     "ZE3 9XP",
     "ZE3 9XP",
     "ZE3 9XP",
     "ZE3 9XP",
     "ZE3 9XP",
     "ZE3 9XP"
   ]
   ````


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
