klemniops commented on PR #15861:
URL: https://github.com/apache/datafusion/pull/15861#issuecomment-2833568011

   From the lint description:
   > Enum size is bounded by the largest variant. Having one large variant can 
penalize the memory layout of that enum.
   
   That is to say, the presence of the large variant `AvroError` affects the 
whole layout of `DataFusionError`.
   
   Transitively, the presence of the large variant `AvroError` affects the 
whole layout of `Result<T, DataFusionError>`.  This affects nearly every 
function in the DataFusion API.
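
   The layout effect described above can be observed directly with `std::mem::size_of`. The following is a minimal sketch, not DataFusion code: `BigPayload`, `SmallError`, and `LargeError` are hypothetical stand-ins, and the 256-byte payload is an arbitrary assumption chosen only to make the difference visible.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a large error payload (assumed size: 256 bytes).
#[allow(dead_code)]
struct BigPayload([u8; 256]);

// An error enum whose variants are all small.
#[allow(dead_code)]
enum SmallError {
    NotFound,
    Io(u32),
}

// The same enum plus one large variant stored inline.
#[allow(dead_code)]
enum LargeError {
    NotFound,
    Io(u32),
    Big(BigPayload),
}

fn main() {
    // The enum must be able to hold its largest variant.
    println!("SmallError: {} bytes", size_of::<SmallError>());
    println!("LargeError: {} bytes", size_of::<LargeError>());

    // The Result must in turn be able to hold the whole error enum,
    // even though the Ok payload here is a single byte.
    println!("Result<u8, SmallError>: {} bytes", size_of::<Result<u8, SmallError>>());
    println!("Result<u8, LargeError>: {} bytes", size_of::<Result<u8, LargeError>>());
}
```

   Exact sizes depend on the target and compiler, but `LargeError` and `Result<u8, LargeError>` can never be smaller than the 256-byte payload.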
   
   This [related lint pull 
request](https://github.com/rust-lang/rust-clippy/pull/9373) explains the 
cost in more detail:
   
   > - A large Err-variant may force an equally large Result if Err is actually 
bigger than Ok.
   > - There is a cost involved in large Result, as LLVM may choose to memcpy 
them around above a certain size.
   > - We usually expect the Err variant to be seldomly used, but pay the cost 
every time.
   > - Result returned from library code has a high chance of bubbling up the 
call stack, getting stuffed into MyLibError { IoError(std::io::Error), 
ParseError(parselib::Error), ...}, exacerbating the problem.
   
   As applied here:
   1) every API that returns `Result<T, DataFusionError>` may pay the `memcpy` 
cost of a large value, even when it returns `Ok`
   2) a returned `Err(DataFusionError::AvroError(...))` will bubble up the 
call stack in nearly all cases, so that (2a) downstream libraries that wrap 
`DataFusionError` in their own error types inherit the same problem, and 
(2b) the end-user request in application code terminates
   
   
   > I think this error is not rarely used
   
   Indeed, `DataFusionError` is used nearly everywhere, which is precisely the 
point. `DataFusionError::AvroError` is produced only by the Avro reader, yet 
it affects every place where `DataFusionError` can appear.
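
   A common remedy for a single oversized variant is to store its payload behind a `Box`, so the enum holds only a pointer. The sketch below illustrates that general technique with hypothetical types (`BigPayload`, `InlineError`, `BoxedError`, and `produce` are assumptions for illustration); it is not a copy of the change in this PR.

```rust
use std::mem::size_of;

// Hypothetical large payload (assumed size: 256 bytes).
#[allow(dead_code)]
struct BigPayload([u8; 256]);

// The large payload stored inline: every value of the enum is large.
#[allow(dead_code)]
enum InlineError {
    Other(u32),
    Big(BigPayload),
}

// The large payload stored on the heap: the enum holds only a pointer.
#[allow(dead_code)]
enum BoxedError {
    Other(u32),
    Big(Box<BigPayload>),
}

// Hypothetical producer: only the rare error path pays for the allocation.
fn produce() -> Result<u8, BoxedError> {
    Err(BoxedError::Big(Box::new(BigPayload([0; 256]))))
}

fn main() {
    println!("InlineError: {} bytes", size_of::<InlineError>());
    println!("BoxedError:  {} bytes", size_of::<BoxedError>());
    println!("Result<u8, BoxedError>: {} bytes", size_of::<Result<u8, BoxedError>>());
    println!("produce() is Err: {}", produce().is_err());
}
```

   The trade-off is one heap allocation when the rare variant is constructed, in exchange for a small error type on every other path.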


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

