crm26 commented on PR #21376:
URL: https://github.com/apache/datafusion/pull/21376#issuecomment-4210330082

   Thanks for taking a look, @kumarUjjawal.
   
   Some context on why these are bundled the way they are, then a question for 
a committer on whether to split:
   
   **#21371** (2158 additions, 8 files) introduces 6 functions — 
`cosine_distance`, `inner_product`, `array_normalize`, `array_add`, 
`array_subtract`, `array_scale` — plus `vector_math.rs` with shared primitives 
(`dot_product_f64`, `magnitude_f64`, `convert_to_f64_array`) that three of the 
six functions use. If this is split, the primitives need to land in one PR and 
their callers in follow-ups, which creates cross-PR dependencies.
   
   **#21376** (this PR) is stacked on #21371 and adds `array_sum`, 
`array_product`, `array_avg`, plus a one-line `list_min` alias fix. The GitHub 
diff against `main` shows both PRs' changes together (~3030 additions) — the 
actual new content in this PR on top of #21371 is ~872 additions across 5 
new/modified files.
   
   **Question for the committers** (@alamb / @neilconway) — would you prefer 
these split further? Options I see:
   
   - **Option A (current):** #21371 = all vector+array math primitives and 6 
functions; #21376 = 3 array aggregates + 1 alias. Two PRs, related groupings, 
shared primitives co-located with first use.
   - **Option B:** Split into ~6 PRs: (1) vector_math.rs primitives + 
cosine_distance + inner_product, (2) array_normalize, (3) 
array_add/subtract/scale, (4) array_sum + list_min alias, (5) array_product, 
(6) array_avg.
   
   Happy to restructure if Option B is preferred. Option A is what's currently 
submitted. No strong preference on my end — defer to whatever makes review 
easier.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to