Thanks. 8 of the 10 CRUZADAS coefficients are estimated as zero. The alpha and
lambda parameters are left at their defaults (zero, I think). The models in R
and SAS were fitted as binary logistic GLMs.

 

Cheers

 

From: Simon Dirmeier <simon.dirme...@web.de>
Date: Tuesday, 24 October 2017, 08:30
To: Alexis Peña <alexis.p...@exalitica.com>, <user@spark.apache.org>
Subject: Re: Zero Coefficient in logistic regression

 

So all the coefficients are the same except for CRUZADAS? How are you fitting
the model in R (glm)? Can you try explicitly setting zero penalties for alpha
and lambda:
  .setRegParam(0)
  .setElasticNetParam(0)
Cheers,
S
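
A minimal sketch of that suggestion, assuming the same column names as in the original pipeline. The name `lrNoPenalty` is hypothetical and used only for illustration. As far as I know, both parameters already default to 0.0 in Spark ML, so this mainly rules regularization out as the cause rather than changing the fit.

```scala
import org.apache.spark.ml.classification.LogisticRegression

// Same estimator as in the original pipeline, with both penalties set explicitly.
// `lrNoPenalty` is a hypothetical name chosen for this sketch.
val lrNoPenalty = new LogisticRegression()
  .setLabelCol("TARGET")
  .setFeaturesCol("Union")
  .setRegParam(0.0)        // lambda: overall penalty strength
  .setElasticNetParam(0.0) // alpha: L1/L2 mix, has no effect when lambda is 0

// Print the values the estimator will use when fit() is called.
println(s"regParam = ${lrNoPenalty.getRegParam}, " +
  s"elasticNetParam = ${lrNoPenalty.getElasticNetParam}")
```

Swapping this estimator in as the last pipeline stage and refitting should show whether the zero CRUZADAS estimates persist without any penalty.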

On 24.10.17 at 13:19, Alexis Peña wrote:

Thanks for your answer. The “Cruzadas” features are binary (0/1), so the
chi-squared statistic should work with 2x2 tables.
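
For reference, a rough sketch of the Pearson chi-squared statistic on a 2x2 table of a binary feature against a binary target. This is plain Scala without a continuity correction and is not claimed to match what ChiSqSelector computes internally. The function name `chiSq2x2` is hypothetical, and returning 0.0 when a margin is empty is a convention assumed here.

```scala
// Pearson chi-squared statistic for a 2x2 contingency table:
//
//              target=0  target=1
//   feature=0     a         b
//   feature=1     c         d
//
// chi2 = n * (a*d - b*c)^2 / ((a+b) * (c+d) * (a+c) * (b+d))
def chiSq2x2(a: Long, b: Long, c: Long, d: Long): Double = {
  val n = (a + b + c + d).toDouble
  val denom = (a + b).toDouble * (c + d).toDouble * (a + c).toDouble * (b + d).toDouble
  // Assumed convention: an empty row or column (e.g. a constant feature)
  // carries no information, so report 0.0 instead of dividing by zero.
  if (denom == 0.0) 0.0
  else {
    val diff = a.toDouble * d.toDouble - b.toDouble * c.toDouble
    n * diff * diff / denom
  }
}

// Illustrative counts only, not taken from the data in this thread.
println(chiSq2x2(10, 10, 10, 10)) // feature and target unrelated
println(chiSq2x2(20, 0, 0, 20))   // feature determines the target
```

A feature that is constant in the training sample gives an empty row, which is one case worth checking for the CRUZADAS columns.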

 

I fitted the model in SAS and in R, and in both the coefficients have nonzero
estimates (not significant). In Spark, only two of these features have estimates:

 

CRUZADAS  4907   0,247624087
CRUZADAS  5304  -0,161424508

 

 

Thanks

 

 

From: Weichen Xu <weichen...@databricks.com>
Date: Tuesday, 24 October 2017, 07:23
To: Alexis Peña <alexis.p...@exalitica.com>
CC: "user @spark" <user@spark.apache.org>
Subject: Re: Zero Coefficient in logistic regression

 

Yes, the chi-squared statistic is only used for categorical features. It does
not look appropriate here.

Thanks!

 

On Tue, Oct 24, 2017 at 5:13 PM, Simon Dirmeier <simon.dirme...@web.de> wrote:

Hey,

as far as I know, feature selection using a chi-squared statistic can only
be done on categorical features and not on possibly continuous ones?
Furthermore, since your logistic model doesn't use any regularization, you
should be fine on that side. So I'd check the ChiSqSelector and possibly
replace it with another feature selection method.

There is however always the chance that your response does not depend on your 
covariables, so you'd estimate a zero coefficient.

Cheers,
Simon


On 24.10.17 at 04:56, Alexis Peña wrote:

Hi Guys,

 

We are fitting a Logistic model using the following code.

 

 

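import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{ChiSqSelector, VectorAssembler}

// Keep the 10 features of VECTOR_1 ranked highest by the chi-squared test against TARGET
val Chisqselector = new ChiSqSelector()
  .setNumTopFeatures(10)
  .setFeaturesCol("VECTOR_1")
  .setLabelCol("TARGET")
  .setOutputCol("selectedFeatures")

val assembler = new VectorAssembler()
  .setInputCols(Array("FEATURES", "selectedFeatures", "PROM_MESES_DIST",
    "RECENCIA", "TEMP_MIN", "TEMP_MAX", "PRECIPITACIONES"))
  .setOutputCol("Union")

val lr = new LogisticRegression().setLabelCol("TARGET").setFeaturesCol("Union")

val pipeline = new Pipeline().setStages(Array(Chisqselector, assembler, lr))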

 

 

Do you know why the coefficients for the following features are estimated as
zero? Is this produced by the ChiSqSelector or by the logistic model?
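
One way to narrow this down is to look at each fitted stage separately. The sketch below assumes the stage order of the pipeline above (selector at index 0, logistic regression at index 2). The helper names `zeroCoefficientIndices` and `inspectStages` are hypothetical.

```scala
import org.apache.spark.ml.PipelineModel
import org.apache.spark.ml.classification.LogisticRegressionModel
import org.apache.spark.ml.feature.ChiSqSelectorModel

// Positions in the coefficient vector whose estimate is exactly zero.
def zeroCoefficientIndices(coefficients: Array[Double]): Seq[Int] =
  coefficients.zipWithIndex.collect { case (c, i) if c == 0.0 => i }.toSeq

// Print what each fitted stage produced.
// Assumes stages = Array(ChiSqSelector, VectorAssembler, LogisticRegression).
def inspectStages(model: PipelineModel): Unit = {
  val selector = model.stages(0).asInstanceOf[ChiSqSelectorModel]
  val lrModel = model.stages(2).asInstanceOf[LogisticRegressionModel]

  // Indices into VECTOR_1 that the selector kept.
  println("selected: " + selector.selectedFeatures.mkString(", "))

  // Coefficients follow the VectorAssembler input order:
  // FEATURES first, then selectedFeatures, then the remaining columns.
  println("zero coefficients at: " +
    zeroCoefficientIndices(lrModel.coefficients.toArray).mkString(", "))
  println("intercept: " + lrModel.intercept)
}
```

Usage would be along the lines of `inspectStages(pipeline.fit(trainingDF))`, where `trainingDF` stands for the training DataFrame, which is not shown in this thread. If the selector keeps the expected ten indices, the zeros come from the logistic regression fit rather than from the selection step.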

 

Thanks in advance!!

 

 

CODIGO      PARAMETRO         COEFICIENTES_MUESTREO_BALANCEADO
PROPIAS     CV_UM              0,276866756
PROPIAS     CV_U3M            -0,241851427
PROPIAS     CV_U6M            -0,568312819
PROPIAS     CV_U12M            0,134706601
PROPIAS     M_UM               5,47E-06
PROPIAS     M_U3M             -7,10E-06
PROPIAS     M_U6M              1,73E-05
PROPIAS     M_U12M            -5,41E-06
PROPIAS     CP_UM             -0,050750105
PROPIAS     CP_U3M             0,125483162
PROPIAS     CP_U6M            -0,353906788
PROPIAS     CP_U12M            0,159538155
PROPIAS     TUM               -0,020217902
PROPIAS     TU3M               0,002101906
PROPIAS     TU6M              -0,005481915
PROPIAS     TU12M              0,003443081
CRUZADAS    2303               0
CRUZADAS    3901               0
CRUZADAS    3905               0
CRUZADAS    3907               0
CRUZADAS    3909               0
CRUZADAS    4102               0
CRUZADAS    4307               0
CRUZADAS    4501               0
CRUZADAS    4907               0,247624087
CRUZADAS    5304              -0,161424508
LP          PROM_MESES_DIST   -0,680356554
PROPIAS     RECENCIA          -0,00289069
EXTERNAS    TEMP_MIN           0,006488683
EXTERNAS    TEMP_MAX          -0,013497441
EXTERNAS    PRECIPITACIONES   -0,007607086
INTERCEPTO                     2,401593191

 

 

 



