jorisvandenbossche commented on issue #35624:
URL: https://github.com/apache/arrow/issues/35624#issuecomment-1560683539

   Thanks for the report!
   
   The code in question is:
   
   
https://github.com/apache/arrow/blob/6d3d2fca2c9693231fa1e52c142ceef563fc23f9/python/pyarrow/compute.py#L520-L525
   
   So if the types don't match, it tries to convert the fill_value to a 
_scalar_ of the correct type. Two things look wrong here: 1) this should 
probably cast to the correct type instead of going through `as_py`, and 2) the 
current version only works for scalars, not arrays. 
   We can see this with a simple (non-nested) array example as well:
   
   ```python
   >>> arr = pa.array([1, 2, None, 4, None])
   >>> arr.fill_null(pa.array([10, 20, 30, 40, 50]))
   <pyarrow.lib.Int64Array object at 0x7f41c56983a0>
   [
     1,
     2,
     30,
     4,
     50
   ]
   >>> arr.fill_null(pa.array([10, 20, 30, 40, 50], type="int32"))
   ...
   AttributeError: 'pyarrow.lib.Int32Array' object has no attribute 'as_py'
   ```
   
   
   ---
   
   Now, for your original example using a list array, the situation is a bit 
more complicated. When you pass a pyarrow.Array, the types also don't match 
(list of float vs float), and automatically casting wouldn't work in that case 
either. 
   The Python list works because we convert it to a scalar of the correct type 
under the hood (the `pa.scalar(fill_value, type=values.type)` in the code 
snippet above). 
   
   So if you pass a pyarrow object, I think we will need to require that you 
pass a "correct" scalar instead (because when passing an array, we assume that 
it is meant for filling element-wise, and not for filling with a scalar). If 
you start from a pyarrow array, you could convert it to a scalar of the correct 
type before passing it to `fill_null`, but that conversion also fails at the 
moment:
   
   ```python
   >>> pa.scalar(pa.array([0.0, 0.0]), pa.list_(pa.float64(), 2))
   ...
   ArrowInvalid: Could not convert <pyarrow.DoubleScalar: 0.0> with type 
pyarrow.lib.DoubleScalar: tried to convert to double
   ```
   
   So that's also something we should fix. 


