jorisvandenbossche commented on pull request #10557:
URL: https://github.com/apache/arrow/pull/10557#issuecomment-880798233


   > Any other thoughts here? If we're ok with CaseWhen(struct(bool...), T...) 
over CaseWhen(bool, T, bool, T, ...)
   
   As mentioned in the meeting, I don't have a strong opinion on the exact 
signature (assuming you can create the inputs easily in the bindings either 
way), and if it simplifies the implementation / signature quite a bit, that 
sounds like a good reason.
   
   I quickly tried it out in Python:
   
   ```
   In [7]: cond = pc.project(pa.array([True, False, None]), pa.array([False, True, None]), field_names=[b"a", b"b"])
   
   In [8]: pc.case_when(cond, pa.array([1, 2, 3]), pa.array([11, 12, 13]))
   Out[8]: 
   <pyarrow.lib.Int64Array object at 0x7f104124c820>
   [
     1,
     12,
     null
   ]
   ```
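
To spell out why the result above is `[1, 12, null]`, here is a rough pure-Python model of the row-wise selection, not the actual kernel. The name `case_when_model` is made up for illustration, and it assumes that a null condition counts as "not true", that a row with no matching condition yields null, and that one extra trailing value array acts as the "else" branch.

```python
from typing import Any, List, Optional, Sequence


def case_when_model(
    conditions: Sequence[Sequence[Optional[bool]]],
    *values: Sequence[Any],
) -> List[Any]:
    """Illustrative model: per row, take the value of the first true condition."""
    n_rows = len(values[0])
    # Assumption: one value array more than conditions means an "else" branch.
    else_values = values[len(conditions)] if len(values) > len(conditions) else None
    out = []
    for row in range(n_rows):
        # Default: the "else" value if present, otherwise null (None).
        chosen = else_values[row] if else_values is not None else None
        for cond, vals in zip(conditions, values):
            # A null (None) condition is treated like False here.
            if cond[row] is True:
                chosen = vals[row]
                break
        out.append(chosen)
    return out


cond_a = [True, False, None]
cond_b = [False, True, None]
result = case_when_model([cond_a, cond_b], [1, 2, 3], [11, 12, 13])
print(result)  # [1, 12, None]
```

Row 0 matches the first condition, row 1 falls through to the second, and row 2 matches neither because both conditions are null.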
   
   What I find a little bit annoying is that I have to provide "dummy" names 
(that are never used) for my boolean conditions in `project` (maybe we could 
have a default for this, so that specifying names is not required in this 
specific case?). 
   (I also noticed that the field names need to be bytes and not strings; I 
suppose that's an error in the Cython bindings, so I will open an issue about that.) 
 
   One other issue is that this uses the "project" kernel, which I thought was 
not really meant for users? (ARROW-11206) I could of course also have used 
`pa.StructArray.from_arrays` I assume. 
   
   
   

