cloud-fan commented on PR #53829:
URL: https://github.com/apache/spark/pull/53829#issuecomment-3768217446

   Generated by: Cursor (Claude)
   
   ---
   
   Thanks for this improvement! The code looks good overall. A few minor 
suggestions:
   
   **1. Redundant sorting in `_list_grouped_function_infos`**
   
   The functions are sorted by name twice - once before `groupby` and again in 
the list comprehension. Since `itertools.groupby` preserves order within 
groups, the second sort is redundant:
   ```python
   # Current:
   return [(k, sorted(list(g), key=lambda x: x.name)) for k, g in grouped]
   # Could be:
   return [(k, list(g)) for k, g in grouped]
   ```
   
   **2. O(n²) lookup in index generation**
   
   In the index generation loop, `next((infos for gk, infos in grouped_infos if 
gk == group_key), [])` iterates through all groups for each category. A 
dictionary lookup would be more efficient:
   ```python
   grouped_dict = {k: infos for k, infos in grouped_infos}
   # Then:
   category_infos = grouped_dict.get(group_key, [])
   ```
   
   **3. CSS generation readability (optional)**
   
   The many individual `write()` calls for CSS could be consolidated into a 
multi-line string for better readability.
   
   These are all minor and don't affect correctness. The PR achieves its goal 
cleanly!
   
   ---
   _This comment was generated with [GitHub MCP](http://go/mcps)._


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to