EUbaldiEC commented on issue #40709:
URL: https://github.com/apache/superset/issues/40709#issuecomment-4617236735

   Hi,
   The bot pointed me in the (almost) right direction.
   
   It turns out that the second line of the `if` block in [rank.py](https://github.com/apache/superset/blob/master/superset/utils/pandas_postprocessing/rank.py#L35)
   ```python
   if group_by:
       gb = df.groupby(group_by, group_keys=False)
       df["rank"] = gb.apply(lambda x: x[metric].rank(pct=True))
   ```
   will fail when the heatmap has only one row/column. 
   
   Substituting it with:
   ```python
   if group_by:
       df["rank"] = df.groupby(group_by)[metric].rank(pct=True)
   ```
   solves the issue.
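
The difference between the two forms can be sketched on a toy frame. This is only an illustration: `rank_legacy` and `rank_fixed` are hypothetical helper names, and the frames below are made up rather than taken from the project's test fixtures. My understanding is that `groupby(...).apply(...)` can return a wide DataFrame instead of a Series when there is only one group, which the assignment to a single column then rejects; the exact outcome may depend on the pandas version, so the legacy call is wrapped in a `try` block and no particular result is assumed.

```python
import pandas as pd


def rank_legacy(df: pd.DataFrame, metric: str, group_by: str) -> pd.DataFrame:
    """Mirror of the quoted upstream lines: apply + per-group rank."""
    out = df.copy()
    gb = out.groupby(group_by, group_keys=False)
    out["rank"] = gb.apply(lambda x: x[metric].rank(pct=True))
    return out


def rank_fixed(df: pd.DataFrame, metric: str, group_by: str) -> pd.DataFrame:
    """Proposed form: rank the metric column directly on the groupby."""
    out = df.copy()
    # SeriesGroupBy.rank returns a Series aligned with the original index,
    # regardless of how many groups there are.
    out["rank"] = out.groupby(group_by)[metric].rank(pct=True)
    return out


# Hypothetical toy data (not the Superset test fixtures).
multi = pd.DataFrame(
    {
        "dept": ["dept0", "dept0", "dept1", "dept1"],
        "asc_idx": [0, 5, 10, 15],
    }
)
# Same data filtered down to a single group, as in the failing scenario.
single = multi[multi["dept"] == "dept0"].reset_index(drop=True)

multi_ranked = rank_fixed(multi, "asc_idx", "dept")
single_ranked = rank_fixed(single, "asc_idx", "dept")

# The legacy form may raise on the single-group frame; report what happens
# without assuming a specific pandas version.
try:
    rank_legacy(single, "asc_idx", "dept")
    legacy_outcome = "no error"
except Exception as exc:
    legacy_outcome = type(exc).__name__

print("legacy on single group:", legacy_outcome)
print(single_ranked)
```

The fixed form avoids `apply` entirely, so the shape of the result does not depend on the number of groups.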
   
   Running the old version against the test `categories_df`, filtered down to a single `dept` value, does indeed raise the error:
   ```python
   > tmp_df = categories_df[categories_df["dept"] == "dept0"].reset_index(drop=True)
   > tmp_df.drop(columns=["rank"], errors="ignore", inplace=True)
   > rank_upstream(tmp_df, "asc_idx", "dept")  # Prev version
   ---------------------------------------------------------------------------
   ValueError                                Traceback (most recent call last)
   ....
   ValueError: Cannot set a DataFrame with multiple columns to the single column rank
   
   > tmp_df = categories_df[categories_df["dept"] == "dept0"].reset_index(drop=True)
   > tmp_df.drop(columns=["rank"], errors="ignore", inplace=True)
   > rank(tmp_df, "asc_idx", "dept")  # NEW version
      constant  category   dept      name  asc_idx  desc_idx  idx_nulls      rank
   0     dummy      cat0  dept0   person0        0       100        0.0  0.047619
   1     dummy      cat2  dept0   person5        5        95        5.0  0.095238
   2     dummy      cat1  dept0  person10       10        90       10.0  0.142857
   ...
   ```
   
   I am drafting a PR with all the details and unit tests, but can already show 
here that this is working as expected in the failing scenarios (while 
preserving the same behaviour in the ones already functioning).
   
   - Overall case (the same works with normalisation across heatmap / Y):
   
   <img width="1489" height="838" alt="Image" src="https://github.com/user-attachments/assets/7fb21b04-afab-44f3-9d0d-bb77916f666b" />
   
   - Single column and normalize across X:
   
   <img width="1485" height="840" alt="Image" src="https://github.com/user-attachments/assets/a46e19fd-d2c5-4604-a96b-3e72a388f367" />
   
   - Single row and normalize across Y:
   
   <img width="1493" height="841" alt="Image" src="https://github.com/user-attachments/assets/6a86e433-40a9-478f-b978-b8c6f187c1da" />
   
   - Single row AND column:
   
   <img width="1519" height="839" alt="Image" src="https://github.com/user-attachments/assets/ad808981-2fa0-4970-af25-544cd51b8572" />

