AndreaBozzo commented on PR #9165: URL: https://github.com/apache/arrow-rs/pull/9165#issuecomment-3750124544
> The approach looks very nice. > > > ⚠️ FYI, I used claude to refactor those tests, I read the changes and we are keeping the same test cases. > > Great use for an LLM IMO! > > One issue tho, which I also failed to notice when eyeballing the changes, but which claude pointed out when I asked it to check the before/after tests for equivalence: > > > The original tests for `subtract_day_time_opt` and `subtract_month_day_nano_opt` specifically tested: > > > > * Subtracting a positive interval from i64::MIN (should fail going more negative) > > * Subtracting a negative interval from i64::MAX (should fail going more positive) > > > > The refactored versions test: > > > > * Subtracting a positive interval from i64::MAX (mathematically valid direction but extreme input) > > * Subtracting a negative interval from i64::MIN (mathematically valid direction but extreme input) > > It suggested three possible ways to address this (favoring 2/): > > 1. Update the test logic to be add/sub aware and test the original edge cases > > 2. Expand the test logic to test both the original and new edge cases > > 3. Accept that the negative test "succeeds" (both BEFORE and AFTER) because we can't create a date from i64::MIN/MAX in the first place -- so it doesn't matter what interval we pass because the test will never get that far. > > > However, 3/ raised red flags for me: There are already tests to exercise physical (integer) overflow, and based on the comments here I suspect that the original test author _intended_ to test logical overflow, where the interval shifts a valid date in a "bad" direction that takes it out of gamut. If so, the original tests were broken and the refactored ones should actually be working with MIN/MAX_VALID_MILLIS instead of ia64::MIN/MAX. At which point it would be important to also apply 1/ from the list above. > > It also raises the question of whether we should use `?` for intermediate operations like the date creation, or if `unwrap` is more appropriate to be sure the test failed/succeeded for the reason we think it did? Hi, sorry for the intrusion, i approved this PR because the refactoring itself is clean and the core test coverage is preserved. Regarding scovich's observation about the subtle difference in edge case testing (i64::MIN/MAX with positive/negative intervals): since neither i64::MIN nor i64::MAX can be converted to a valid date in the first place, the test fails at date creation before the interval logic is even reached, the behavior is effectively the same (?) That said, if the original intent was to test *logical* overflow (a valid date shifted out of range), then both the original and refactored tests miss that case. Testing with MIN_VALID_MILLIS/MAX_VALID_MILLIS instead would be a nice improvement i super agree, even in separate pr. This would also be a great "good first issue" for newcomers to the repo (like myself 😄) who want to improve test coverage. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
