codeant-ai-for-open-source[bot] commented on PR #36605:
URL: https://github.com/apache/superset/pull/36605#issuecomment-3653090044

   ## Nitpicks 🔍
   
   <table>
   <tr><td>🔒&nbsp;<strong>No security issues identified</strong></td></tr>
   <tr><td>⚡&nbsp;<strong>Recommended areas for review</strong><br><br>
   
   - [ ] <a 
href='https://github.com/apache/superset/pull/36605/files#diff-4fbc6dbbdd06680aff582fe2a2c59444368ada935b1af37c0646156d638a614aR126-R131'><strong>Unnecessary
 conversions / better numeric check</strong></a><br>The code converts only when 
the dtype equals object; there are other non-numeric dtypes (e.g., category, 
string) that may still need coercion for `nanpercentile`. Consider using 
`pandas.api.types.is_numeric_dtype` to skip already-numeric columns and convert 
all non-numeric types.<br>
   
   - [ ] <a 
href='https://github.com/apache/superset/pull/36605/files#diff-4fbc6dbbdd06680aff582fe2a2c59444368ada935b1af37c0646156d638a614aR128-R131'><strong>Missing
 column guard</strong></a><br>The loop iterates over `metrics` and indexes 
`df.dtypes[column]` / `df[column]` without checking that `column` exists in 
`df`. If `metrics` contains a name not present in the DataFrame this will raise 
a KeyError. Add an existence check before accessing the column.<br>
   
   - [ ] <a 
href='https://github.com/apache/superset/pull/36605/files#diff-4fbc6dbbdd06680aff582fe2a2c59444368ada935b1af37c0646156d638a614aR128-R130'><strong>Deprecated
 dtype check</strong></a><br>The code checks `df.dtypes[column] == np.object_`, 
which relies on `np.object_` (deprecated in recent NumPy). This can cause 
warnings or break with newer NumPy versions. Prefer using pandas type-check 
helpers (e.g., `pandas.api.types.is_object_dtype` or `is_numeric_dtype`) or 
`object`/`pd.Series.dtype` comparisons.<br>
   
   - [ ] <a 
href='https://github.com/apache/superset/pull/36605/files#diff-d51ae5c7b4b86566f46f9da12fdfe1e2975a390ac0b1c650571aa559b21fce25R56-R61'><strong>Coercion
 -> ValueError behaviour</strong></a><br>`to_numeric(..., errors="coerce")` 
will turn non-numeric values into NaN which immediately triggers the following 
`if df[column].isna().any(): raise ValueError(...)`. Verify that raising here 
is the intended behaviour for downstream callers (vs. dropping/coercing those 
rows or returning an empty histogram).<br>
   
   </td></tr>
   </table>
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to