rroelke commented on PR #15861:
URL: https://github.com/apache/datafusion/pull/15861#issuecomment-2833570923

   From the lint description:
   
   >  Enum size is bounded by the largest variant. Having one large variant can 
penalize the memory layout of that enum.
   
   That is to say, the presence of the large variant AvroError affects the 
whole layout of DataFusionError.
   
   Transitively, the presence of the large variant AvroError affects the whole 
layout of `Result<T, DataFusionError>`. This affects nearly every function in 
the DataFusion API.
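   
   A minimal sketch of the layout effect, using hypothetical stand-in types (`BigPayload`, `InlineError`, `BoxedError`) rather than the real DataFusion or Avro definitions. The 256-byte payload is an arbitrary assumption chosen to make the difference visible:

```rust
use std::mem::size_of;

// Hypothetical stand-in for a bulky error payload (not the real Avro error type).
#[allow(dead_code)]
pub struct BigPayload([u8; 256]);

// One large inline variant sets the size of the whole enum.
#[allow(dead_code)]
pub enum InlineError {
    Small(u8),
    Big(BigPayload),
}

// Boxing the large variant stores only a pointer inline.
#[allow(dead_code)]
pub enum BoxedError {
    Small(u8),
    Big(Box<BigPayload>),
}

/// Returns (inline enum size, boxed enum size, size of Result<(), InlineError>).
pub fn error_sizes() -> (usize, usize, usize) {
    (
        size_of::<InlineError>(),
        size_of::<BoxedError>(),
        size_of::<Result<(), InlineError>>(),
    )
}
```

   Printing the tuple returned by `error_sizes()` should show that the inline enum is at least as large as its payload, that `Result<(), InlineError>` is no smaller than the enum, and that the boxed enum is only a few machine words. Exact numbers depend on the target and compiler.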
   
   This [related lint pull 
request](https://github.com/rust-lang/rust-clippy/pull/9373) elaborates more 
specifically:
   > - A large Err-variant may force an equally large Result if Err is actually 
bigger than Ok.
   > - There is a cost involved in large Result, as LLVM may choose to memcpy 
them around above a certain size.
   > - We usually expect the Err variant to be seldomly used, but pay the cost 
every time.
   > - Result returned from library code has a high chance of bubbling up the 
call stack, getting stuffed into MyLibError { IoError(std::io::Error), 
ParseError(parselib::Error), ...}, exacerbating the problem.
   
   As applied here:
   1) every API that returns `Result<T, DataFusionError>` might pay a large 
memcpy cost
   2) a return of `Err(DataFusionError::AvroError(...))` will bubble up the 
call stack in nearly all cases, such that (2a) downstream libraries wrapping 
`DataFusionError` in their own error types will also suffer this problem, and 
(2b) the end user's request in application code will terminate
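   
   Point (2a) can be sketched with hypothetical types (`UpstreamError`, `AppError`, `upstream_call`, `app_call`), none of which come from DataFusion. This assumes the wrapping library boxes the upstream error in its `From` impl, which is one possible mitigation and not necessarily the one this PR takes:

```rust
use std::mem::size_of;

// Hypothetical upstream library error with one bulky variant.
#[allow(dead_code)]
pub enum UpstreamError {
    Small(u8),
    Big([u8; 256]),
}

// Hypothetical downstream error that boxes the upstream error when wrapping it.
#[allow(dead_code)]
pub enum AppError {
    Config(u8),
    Upstream(Box<UpstreamError>),
}

impl From<UpstreamError> for AppError {
    fn from(e: UpstreamError) -> Self {
        AppError::Upstream(Box::new(e))
    }
}

// Stand-in for a library call that fails.
pub fn upstream_call() -> Result<u32, UpstreamError> {
    Err(UpstreamError::Small(1))
}

// `?` converts via `From`, so the large error is boxed as it crosses the boundary.
pub fn app_call() -> Result<u32, AppError> {
    let v = upstream_call()?;
    Ok(v + 1)
}

/// Returns (size of Result<u32, UpstreamError>, size of Result<u32, AppError>).
pub fn result_sizes() -> (usize, usize) {
    (
        size_of::<Result<u32, UpstreamError>>(),
        size_of::<Result<u32, AppError>>(),
    )
}
```

   Without the `Box`, `AppError` would be at least as large as `UpstreamError`, and every `Result<_, AppError>` in the downstream crate would inherit that size.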
   
   > I think this error is not rarely used
   
   Indeed, `DataFusionError` is used nearly everywhere, which is precisely the 
point. `DataFusionError::AvroError` is only produced by the Avro reader, 
yet it affects every place where `DataFusionError` can appear.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

