Re: How to use StringIndexer for multiple input /output columns in Spark Java
Yes, the workaround is to create multiple StringIndexers as you described, one per column, and set them all as stages of a Pipeline. OneHotEncoderEstimator is only available from Spark 2.3.0 onward, so on 2.2.1 you will have to use the plain OneHotEncoder.

On Tue, May 15, 2018, 8:40 AM Mina Aslani wrote:
> Hi,
>
> So, what is the workaround? Should I create multiple indexers (one for each
> column), and then create a pipeline and set its stages to have all the
> StringIndexers?
> I am using 2.2.1 as I cannot move to 2.3.0. Looks like
> OneHotEncoderEstimator is broken, please see my email sent today with
> subject:
> OneHotEncoderEstimator - java.lang.NoSuchMethodError:
> org.apache.spark.sql.Dataset.withColumns
>
> Regards,
> Mina
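For illustration, the workaround described in this reply could look roughly like the sketch below. It assumes spark-mllib 2.2.x is on the classpath. The class name `MultiColumnIndexing` and the `_idx` / `_vec` output-column suffixes are made up for this example and are not from the thread.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.ml.Pipeline;
import org.apache.spark.ml.PipelineModel;
import org.apache.spark.ml.PipelineStage;
import org.apache.spark.ml.feature.OneHotEncoder;
import org.apache.spark.ml.feature.StringIndexer;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Hypothetical helper: the class name and column suffixes are assumptions.
public final class MultiColumnIndexing {

    private MultiColumnIndexing() {}

    /** Builds one StringIndexer and one OneHotEncoder stage per input column. */
    static PipelineStage[] buildStages(String[] columns) {
        List<PipelineStage> stages = new ArrayList<>();
        for (String col : columns) {
            // StringIndexer accepts a single input column here, so create one per column.
            stages.add(new StringIndexer()
                    .setInputCol(col)
                    .setOutputCol(col + "_idx"));
            // Plain OneHotEncoder (not OneHotEncoderEstimator) consumes the indexed column.
            stages.add(new OneHotEncoder()
                    .setInputCol(col + "_idx")
                    .setOutputCol(col + "_vec"));
        }
        return stages.toArray(new PipelineStage[0]);
    }

    /** Fits all stages as a single Pipeline and returns the transformed dataset. */
    static Dataset<Row> indexAndEncode(Dataset<Row> df, String[] columns) {
        Pipeline pipeline = new Pipeline().setStages(buildStages(columns));
        PipelineModel model = pipeline.fit(df);
        return model.transform(df);
    }
}
```

Calling `MultiColumnIndexing.indexAndEncode(df, new String[] {"color", "size"})` (column names are placeholders) would add `color_idx`, `color_vec`, `size_idx` and `size_vec` columns to `df`. If only the indices are needed, the OneHotEncoder stages can be left out.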
Re: How to use StringIndexer for multiple input /output columns in Spark Java
Hi,

So, what is the workaround? Should I create multiple indexers (one for each column), and then create a pipeline and set its stages to have all the StringIndexers?
I am using 2.2.1 as I cannot move to 2.3.0. Looks like OneHotEncoderEstimator is broken, please see my email sent today with subject:
OneHotEncoderEstimator - java.lang.NoSuchMethodError: org.apache.spark.sql.Dataset.withColumns

Regards,
Mina

On Tue, May 15, 2018 at 2:37 AM, Nick Pentreath wrote:
> Multi column support for StringIndexer didn't make it into Spark 2.3.0
>
> The PR is still in progress I think - should be available in 2.4.0
Re: How to use StringIndexer for multiple input /output columns in Spark Java
Multi column support for StringIndexer didn't make it into Spark 2.3.0

The PR is still in progress I think - should be available in 2.4.0

On Mon, 14 May 2018 at 22:32, Mina Aslani wrote:
> Please take a look at the api doc:
> https://spark.apache.org/docs/2.3.0/api/java/org/apache/spark/ml/feature/StringIndexer.html
Re: How to use StringIndexer for multiple input /output columns in Spark Java
Please take a look at the API doc:
https://spark.apache.org/docs/2.3.0/api/java/org/apache/spark/ml/feature/StringIndexer.html

On Mon, May 14, 2018 at 4:30 PM, Mina Aslani wrote:
> Hi,
>
> There is no setInputCols/setOutputCols for StringIndexer in Spark Java.
> How can multiple input/output columns be specified then?
>
> Regards,
> Mina
How to use StringIndexer for multiple input /output columns in Spark Java
Hi,

There is no setInputCols/setOutputCols for StringIndexer in Spark Java. How can multiple input/output columns be specified then?

Regards,
Mina