zanmato1984 commented on issue #36246: URL: https://github.com/apache/arrow/issues/36246#issuecomment-1867019439
> I think we could also investigate other improvements: > > * avoid including `kernel.h` from `function.h` > * avoid including `scalar.h` from `datum.h` Assuming we are going to proceed this work. Here lists the most recent comparison of include analysis. The result is trimmed by listing only arrow's own headers. Without this PR: ``` 30921 ms: arrow/array/data.h (included 387 times, avg 79 ms), included via: 29603 ms: arrow/compute/function.h (included 219 times, avg 135 ms), included via: 28142 ms: arrow/compute/kernel.h (included 226 times, avg 124 ms), included via: 25468 ms: arrow/array/array_base.h (included 385 times, avg 66 ms), included via: 23882 ms: arrow/type.h (included 519 times, avg 46 ms), included via: 22291 ms: arrow/buffer.h (included 491 times, avg 45 ms), included via: 22160 ms: arrow/status.h (included 576 times, avg 38 ms), included via: 21382 ms: arrow/compute/exec.h (included 235 times, avg 90 ms), included via: 20121 ms: arrow/result.h (included 564 times, avg 35 ms), included via: 19198 ms: arrow/util/string_builder.h (included 577 times, avg 33 ms), included via: 18382 ms: arrow/device.h (included 492 times, avg 37 ms), included via: 17972 ms: arrow/array.h (included 256 times, avg 70 ms), included via: 15382 ms: arrow/acero/options.h (included 105 times, avg 146 ms), included via: 14816 ms: arrow/compute/expression.h (included 238 times, avg 62 ms), included via: 13685 ms: arrow/datum.h (included 256 times, avg 53 ms), included via: 13645 ms: arrow/compute/api_aggregate.h (included 137 times, avg 99 ms), included via: 12846 ms: arrow/testing/gtest_util.h (included 240 times, avg 53 ms), included via: ``` With this PR: ``` 29645 ms: arrow/array/data.h (included 387 times, avg 76 ms), included via: 24930 ms: arrow/array/array_base.h (included 385 times, avg 64 ms), included via: 23593 ms: arrow/type.h (included 519 times, avg 45 ms), included via: 22147 ms: arrow/status.h (included 576 times, avg 38 ms), included via: 20656 ms: arrow/result.h (included 564 times, avg 36 ms), included via: 18931 ms: arrow/util/string_builder.h (included 577 times, avg 32 ms), included via: 17606 ms: arrow/datum.h (included 255 times, avg 69 ms), included via: 17484 ms: arrow/array.h (included 256 times, avg 68 ms), included via: 17216 ms: arrow/buffer.h (included 491 times, avg 35 ms), included via: 14315 ms: arrow/acero/options.h (included 105 times, avg 136 ms), included via: 13494 ms: arrow/device.h (included 492 times, avg 27 ms), included via: 12561 ms: arrow/testing/gtest_util.h (included 240 times, avg 52 ms), included via: 11080 ms: arrow/compute/exec.h (included 230 times, avg 48 ms), included via: 10445 ms: arrow/acero/exec_plan.h (included 89 times, avg 117 ms), included via: ``` We can see that: > * avoid including `kernel.h` from `function.h` The cost of including `kernel.h` is reduced along with the reduction of including `function.h`. > * avoid including `scalar.h` from `datum.h` Including `scalar.h` in `datum.h` isn't really expensive, with and without this PR. Is this result good enough already or should I keep investigating anything? Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
