It turned out my elderly conda environment was using sklearn < 1.2. After
much time wasted on piecemeal-upgrade conda solve attempts, this sequence
got sklearn to 1.6.1:
$ conda update -n base -c conda-forge conda
$ conda install conda=25.1.1
# reinstalls to fix packages broken by the upgrade
$ conda install tensorflow-gpu
$ conda install conda-forge::imbalanced-learn
Mantra:
$ python -c "import sklearn; sklearn.show_versions()"
I still don't get the original column order back, but the trailing columns
match (as below), and now I can actually see that, since
.set_output(transform="pandas") keeps the column labels.
Thanks,
Bill
--
Phobrain.com
On 2025-02-11 01:31, Bill Ross wrote:
> I applied ColumnTransformer, but the results are unexpected. It could be my
> lack of python skill, but it seems like the value of p1_1 in the original
> should persist at position (0, 0) in the transformed output?
>
> ------- pre-scale
>        p1_1      p1_2      p1_3      p1_4      p2_1  ...   resp1_4   resp2_1   resp2_2   resp2_3   resp2_4
> 760  1.382658  1.440719  1.555705  1.120171  1.717319  ...  0.598736  0.659797  0.376331  0.403887  0.390283
>
> ------- scaled
> [[0.17045455 0.04680535 0.04372197 ... 0.37633118 0.40388673 0.39028345]
>
> Thanks,
>
> Bill
>
> Fingers crossed on the formatting.
>
> from sklearn.compose import make_column_transformer
> from sklearn.preprocessing import MinMaxScaler
>
> # scale the mouse/timing features; pass the p*/resp* columns through
> column_trans = make_column_transformer(
>     (MinMaxScaler(),
>      ['order_in_session', 'big_stime', 'big_time', 'load_time',
>       'user_time', 'user_time2', 'mouse_down_time', 'mouse_time',
>       'mouse_dist', 'mouse_dist2', 'dot_count', 'mouse_dx', 'mouse_dy',
>       'mouse_vecx', 'mouse_vecy', 'dot_vec_len', 'mouse_maxv',
>       'mouse_maxa', 'mouse_mina', 'mouse_maxj', 'dot_max_vel',
>       'dot_max_acc', 'dot_max_jerk', 'dot_start_scrn', 'dot_end_scrn',
>       'dot_vec_ang']),
>     remainder='passthrough')
>
> print('------- pre-scale')
> print( str(X_train) )
>
> X_train = column_trans.fit_transform(X_train)
>
> print('------- scaled')
> print( str(X_train) )
> print('------- /scaled')
>
> split 414 414
> ------- pre-scale
>        p1_1      p1_2      p1_3      p1_4      p2_1  ...   resp1_4   resp2_1   resp2_2   resp2_3   resp2_4
> 760  1.382658  1.440719  1.555705  1.120171  1.717319  ...  0.598736  0.659797  0.376331  0.403887  0.390283
> 218  0.985645  0.532462  0.780601  0.687588  0.781293  ...  0.890886  1.072392  0.536962  0.715136  0.792722
> 603  0.783806  0.437074  0.694766  0.371121  0.995891  ...  1.055465  1.518875  1.129209  1.201864  1.476702
> 0    0.501352  0.253304  0.427804  0.283380  0.571035  ...  1.035323  1.621431  0.838613  1.031724  1.131344
> 604  1.442482  1.019641  0.798387  1.055465  1.518875  ...  2.779447  1.636363  1.212313  1.274595  1.723697
>
> ...
>
> ------- scaled
> [[0.17045455 0.04680535 0.04372197 ... 0.37633118 0.40388673 0.39028345]
> [0.27272727 0.04502229 0.04204036 ... 0.53696203 0.7151355 0.7927222 ]
> [0.30681818 0.04517088 0.04456278 ... 1.1292094 1.201864 1.4767016 ]
> ...
> [0.02272727 0.04457652 0.1680213 ... 1.796316 1.939811 2.1776829 ]
> [0.55681818 0.04546805 0.04176009 ... 0.48330075 0.37375322 0.29931256]
> [0.5 0.04457652 0.04091928 ... 0.6759416 0.7517819 0.8801653 ]]
>
> ---
> --
>
> Phobrain.com
>
> On 2025-01-23 01:21, Bill Ross wrote:
>
>>> ColumnTransformer
>>
>> Thanks!
>>
>> I was also thinking of trying TabPFN, not researched yet, in case you can
>> comment. <peeks/> Their attribution requirement seems overboard for what I
>> want, unless it's flat-out miraculous for the flat-footed. :-)
>>
>>> Some of us are working on a related package, skrub (https://skrub-data.org),
>>> which is more focused on heterogeneous dataframes. It does not currently
>>> have something that would help you much, but we are heavily brainstorming a
>>> variety of APIs to do flexible transformations of dataframes, including
>>> easily doing what you want. The challenge is to address the variety of
>>> cases.
>>
>> Those are the storms we want. I'd love to know if/how/which ML tools are
>> helping with that work, if appropriate here.
>>
>> Regards,
>> Bill
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn