ishansurdi commented on PR #3343:
URL: https://github.com/apache/avro/pull/3343#issuecomment-2730047250
Thank you for the review and for pointing this out!
The change was primarily aimed at improving **code readability and
maintainability** by using `tell()` instead of `seek(-SYNC_SIZE, 1)`. The
concern about an additional system call is valid—`tell()` does introduce an
extra operation in the success case.
However, here’s why I believe the tradeoff is acceptable:
1. **Clarity & Maintainability**: Using `tell()` makes the intent of the
function more explicit. The previous approach using `seek(-SYNC_SIZE, 1)`
relies on **relative positioning**, which can be **less intuitive** to new
maintainers.
2. **Performance Benchmark Results**: I tested both approaches, and the
results show that `tell()` is actually **faster** than `seek(-SYNC_SIZE, 1)` in
this scenario:
- `tell()` method: **0.0572 seconds**
- `seek(-SYNC_SIZE, 1)` method: **0.0777 seconds**
This suggests that the performance impact of `tell()` is **not a
concern** and may even be beneficial.
3. **Consistency with Other File Operations**: Many file-processing
functions in AVRO, particularly in `DataFileReader.java` and
`DataFileReader12.java`, **already use `tell()`** for tracking file positions.
This means my approach aligns with **existing AVRO patterns**.
Given these findings, I believe using `tell()` is both **efficient and
consistent**. However, if needed, I can provide additional benchmarks on
**larger datasets** to validate these results further. Let me know if that
would be helpful!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]