Github user viirya commented on the issue:
https://github.com/apache/spark/pull/18902
@MLnick Thanks for pinging me.
I go through this quickly. The basic idea is the same, performing the
operations on multiple inputs columns at one single Dataset/DataFrame operation.
Unlike `Bucketizer`, `Imputer` has no compatibility concern because it
already supports multiple input columns (`HasInputCols`). In `Bucketizer`, we
don't want to break its current API so it makes thing more complicated a bit.
Actually I'm noticed by `ImputerModel` which also applies `withColumn`
sequentially on each input column. I'd like to address this part with the
`withColumns` API proposed in #17819. What do you think @MLnick?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]