[MLlib]: GLM with multinomial family

Surya Rajaraman Iyer Mon, 14 Feb 2022 09:49:42 -0800

Hi Team,

I am using multinomial regression in Spark (Scala) and want to generate the
coefficients and p-values for every category.

For example, given two variables, salary group (dependent variable) and age
group (independent variable):

salary-group: 10,000-, 10,000-100,000, 100,000+
age-group: 30-, 30-40, 40+

I am looking to get output like the following: with 10,000- as the baseline,
the coefficients and p-values for each remaining category in the salary group.

10,000-100,000    coefficient   Pvalue
Intercept         ..            ..

age group
30-40             ..            ..
40+               ..            ..
30-               0             0

100,000+          coefficient   Pvalue
Intercept         ..            ..

age group
30-40             ..            ..
40+               ..            ..
30-               0             0

To do this, I am forced to fit a GLM with the binomial family twice. To
parallelize the fits, I am using thread pools, which doesn't seem ideal.
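
For concreteness, one of the two binomial fits might be sketched as below. This is only a sketch under assumptions: the helper name `binomialFit`, the column names `salary_group` and `features`, and the idea that `features` already holds the dummy-coded age group (with 30- left out as the reference level) are all hypothetical and not taken from the actual job.

```scala
import org.apache.spark.ml.regression.GeneralizedLinearRegression
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Hypothetical helper. Assumes `df` has a string column `salary_group`
// and an already-assembled vector column `features`.
def binomialFit(df: DataFrame, baseline: String, target: String): Array[(Double, Double)] = {
  // Keep only the two salary groups being compared; label the target as 1.0.
  val subset = df
    .filter(col("salary_group").isin(baseline, target))
    .withColumn("label", (col("salary_group") === target).cast("double"))

  val model = new GeneralizedLinearRegression()
    .setFamily("binomial")
    .setLink("logit")
    .setLabelCol("label")
    .setFeaturesCol("features")
    .fit(subset)

  // With fitIntercept left at its default (true), summary.pValues lists the
  // feature p-values first and the intercept's p-value last, so the intercept
  // is appended to the coefficients to keep the two arrays aligned.
  val estimates = model.coefficients.toArray :+ model.intercept
  estimates.zip(model.summary.pValues)
}
```

Calling `binomialFit(df, "10,000-", "10,000-100,000")` and `binomialFit(df, "10,000-", "100,000+")` would then give the (coefficient, p-value) pairs for the two blocks of the table above.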

Is there a way to do multinomial logit in Spark Scala? I do see it in
SparkR: https://rdrr.io/cran/SparkR/man/spark.logit.html
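
As a point of comparison, Spark ML's Scala API has `LogisticRegression` with `setFamily("multinomial")`, which fits all salary groups in a single model. The sketch below assumes a DataFrame with a numeric `label` column (class indices 0, 1, 2, for example from a `StringIndexer`) and a `features` vector column; both column names and the helper names are hypothetical. The softmax fit returns one coefficient row per class rather than coefficients relative to a baseline, so the second helper subtracts the baseline row to recover the log-odds against that baseline. As far as I can tell, this model's summary does not report p-values, so this covers only the coefficient half of the output.

```scala
import org.apache.spark.ml.classification.{LogisticRegression, LogisticRegressionModel}
import org.apache.spark.sql.DataFrame

// Hypothetical helper: a single multinomial (softmax) fit over all classes.
def multinomialFit(df: DataFrame): LogisticRegressionModel =
  new LogisticRegression()
    .setFamily("multinomial")
    .setLabelCol("label")
    .setFeaturesCol("features")
    .fit(df)

// Hypothetical helper: re-express the fit relative to one baseline class.
// Returns (classIndex, coefficients, intercept) for every non-baseline class.
def relativeToBaseline(
    model: LogisticRegressionModel,
    baseline: Int): Seq[(Int, Array[Double], Double)] = {
  val coef = model.coefficientMatrix // numClasses x numFeatures
  val icpt = model.interceptVector   // one intercept per class
  for (k <- 0 until coef.numRows if k != baseline) yield {
    // Log-odds of class k against the baseline: difference of the two rows.
    val row = Array.tabulate(coef.numCols)(j => coef(k, j) - coef(baseline, j))
    (k, row, icpt(k) - icpt(baseline))
  }
}
```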

Is there a Spark way to run the GLMs in parallel? Something like:

// Fit one GLM per DataFrame (body omitted).
def glm(df: DataFrame): SparkLogisticRegressionResult = ???

val dfs: Seq[DataFrame] = ???
dfs.map(glm)
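
For reference, the thread-based version of this pattern can be written with Scala futures on the driver, since a DataFrame cannot be used inside a transformation that runs on the executors. The sketch below is generic and makes no Spark calls itself; `fitAll` and its `parallelism` parameter are hypothetical names, and the `fit` argument stands in for a function such as `glm` above.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

// Hypothetical helper: run `fit` on every input concurrently from the driver
// and return the results in input order. Each call to `fit` is expected to
// submit its own Spark job from its own thread.
def fitAll[A, B](inputs: Seq[A], parallelism: Int)(fit: A => B): Seq[B] = {
  val pool = Executors.newFixedThreadPool(parallelism)
  implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
  try {
    val futures = inputs.map(in => Future(fit(in)))
    // Block until every fit has finished (or the first failure is rethrown).
    Await.result(Future.sequence(futures), Duration.Inf)
  } finally {
    pool.shutdown()
  }
}
```

With the definitions above, `fitAll(dfs, parallelism = 2)(glm)` would take the place of `dfs.map(glm)`.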

Thanks a lot for the help!

Regards,
Surya

