thisisnic commented on PR #34825:
URL: https://github.com/apache/arrow/pull/34825#issuecomment-1503246885

   Thanks for summarising the extra context there that we didn't have before, 
@nealrichardson, that's super helpful; now I can see why things are the way 
they currently are.
   
   FWIW, I agree with your points, Dean, but I don't see a reasonable solution 
to that problem which doesn't cause other issues, given the different priorities 
we're balancing.
   
   I'm still on the fence about the best solution here, but given 
that the issue regarding the `as.data.frame` return type is 1) fairly complex, 
2) contentious, 3) not the entire issue here, and 4) not something that has 
been reported by anyone else (yet), I'm going to close this PR and submit a new 
one for the narrower problem of results that vary with argument ordering.
   
   Open to more discussion here regarding what our priorities *should* be, if 
not the current ones in the order they're listed above.
   
   For example, I don't fully see the importance of roundtrip fidelity; if the 
input and output have the same (non-default) metadata and contents, then is 
there any harm caused by returning a tibble instead of a data.frame, if a user 
then just has the additional step of calling `as.data.frame()` to get a vanilla 
`data.frame`? Haven't sketched that out in practice though.

