[ https://issues.apache.org/jira/browse/SPARK-28295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nils Skotara updated SPARK-28295:
---------------------------------
Description:
Using pyspark.ml.regression, I fit a GeneralizedLinearRegression like this:
from pyspark.ml.regression import GeneralizedLinearRegression
glr = GeneralizedLinearRegression(family="gaussian", link="identity",
                                  regParam=0.3, maxIter=10)
model = glr.fit(someData)
There seems to be no way to match the feature names to their coefficients or
standard errors. For now I am using an ugly workaround that reads a private
field of the summary through Java reflection:
# Look up the private field on the Java summary class
field = model.summary._call_java('getClass').getDeclaredField("coefficientsWithStatistics")
object2 = model._call_java('summary')
field.setAccessible(True)
value = field.get(object2)
# Parse each row's string form, "(name,coefficient,...)", into name -> coefficient
coef_value = {}
for i in range(len(value)):
    row = value[i].toString()
    values = row.split(',')
    coef_value[values[0].replace('(', '').replace(')', '')] = float(values[1])
Am I missing something?
If not, I would like to request an accessor, analogous to model.coefficients,
that returns the feature names in the same order as the coefficients, for
example model.features.
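One possible alternative to the reflection workaround, offered only as a sketch: if the features column was built with VectorAssembler, its column metadata usually carries an "ml_attr" entry that lists each feature's index and name. The helper names below (names_by_index, coefficients_by_name) and the sample metadata are hypothetical, and the example runs on a plain dict so that it does not need a Spark session.

```python
def names_by_index(ml_attr):
    """Return feature names ordered by vector index from an `ml_attr` dict."""
    attrs = ml_attr.get("attrs", {})
    indexed = [
        (attr["idx"], attr.get("name", "feature_%d" % attr["idx"]))
        for group in attrs.values()  # groups such as "numeric", "binary", "nominal"
        for attr in group
    ]
    return [name for _, name in sorted(indexed)]


def coefficients_by_name(ml_attr, coefficients):
    """Pair each feature name with the coefficient at the same index."""
    names = names_by_index(ml_attr)
    if len(names) != len(coefficients):
        raise ValueError("metadata lists %d features but there are %d coefficients"
                         % (len(names), len(coefficients)))
    return dict(zip(names, coefficients))


# Hypothetical metadata in the shape VectorAssembler typically writes.
example_meta = {
    "attrs": {
        "numeric": [{"idx": 0, "name": "age"}, {"idx": 2, "name": "income"}],
        "binary": [{"idx": 1, "name": "is_member"}],
    }
}
example_coefs = [0.5, -1.2, 0.03]
result = coefficients_by_name(example_meta, example_coefs)

# With Spark, assuming the default features column name "features":
#   ml_attr = someData.schema["features"].metadata["ml_attr"]
#   result = coefficients_by_name(ml_attr, model.coefficients.toArray().tolist())
```

This only helps when the metadata is actually present; a features column assembled by hand may carry no "ml_attr" entry at all, in which case the names have to come from wherever the vector was built.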
was:
In from pyspark.ml.regression
when I fit a GeneralizedLinearRegression like this:
glr = GeneralizedLinearRegression(family="gaussian", link="identity",
regParam=0.3, maxIter=10)
model = glr.fit(someData)
It seems like there is no way to get the matching of the features and their
coefficients or standard errors. I am using an ugly work around like this right
now:
field =
model.summary._call_java('getClass').getDeclaredField("coefficientsWithStatistics")
object2 = model._call_java('summary')
field.setAccessible(True)
value = field.get(object2)
coef_value = {}
for i in range(0, len(value)):
row = value[i].toString()
values = row.split(',')
coef_value[values[0].replace('(', '').replace(')', '')] = float(values[1])
Am I missing something?
If not, I'd like to request a method similar to model.coefficients with which
one can just get the feature names in the right order, like model.features or
something like that.
> Is there a way of getting feature names from pyspark.ml.regression
> GeneralizedLinearRegression?
> -----------------------------------------------------------------------------------------------
>
> Key: SPARK-28295
> URL: https://issues.apache.org/jira/browse/SPARK-28295
> Project: Spark
> Issue Type: Request
> Components: Build
> Affects Versions: 2.3.1
> Reporter: Nils Skotara
> Priority: Minor
> Labels: features
> Fix For: 2.3.1
>
>