[jira] [Commented] (SPARK-21919) inconsistent behavior of AFTsurvivalRegression algorithm
[ https://issues.apache.org/jira/browse/SPARK-21919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157027#comment-16157027 ]

Yanbo Liang commented on SPARK-21919:
-------------------------------------

[~srowen] You are right, that is caused by the line search bug. The error log in 2.2.0 tells us what happened. Thanks for digging into it.

> inconsistent behavior of AFTsurvivalRegression algorithm
> --------------------------------------------------------
>
>                 Key: SPARK-21919
>                 URL: https://issues.apache.org/jira/browse/SPARK-21919
>             Project: Spark
>          Issue Type: Bug
>          Components: ML, PySpark
>    Affects Versions: 2.2.0
>         Environment: Spark Version: 2.2.0
>                      Cluster setup: Standalone single node
>                      Python version: 3.5.2
>            Reporter: Ashish Chopra
>
> Took the direct example from the Spark ML documentation:
> {code}
> from pyspark.ml.linalg import Vectors
> from pyspark.ml.regression import AFTSurvivalRegression
>
> training = spark.createDataFrame([
>     (1.218, 1.0, Vectors.dense(1.560, -0.605)),
>     (2.949, 0.0, Vectors.dense(0.346, 2.158)),
>     (3.627, 0.0, Vectors.dense(1.380, 0.231)),
>     (0.273, 1.0, Vectors.dense(0.520, 1.151)),
>     (4.199, 0.0, Vectors.dense(0.795, -0.226))],
>     ["label", "censor", "features"])
> quantileProbabilities = [0.3, 0.6]
> aft = AFTSurvivalRegression(quantileProbabilities=quantileProbabilities,
>                             quantilesCol="quantiles")
> model = aft.fit(training)
>
> # Print the coefficients, intercept and scale parameter for AFT survival regression
> print("Coefficients: " + str(model.coefficients))
> print("Intercept: " + str(model.intercept))
> print("Scale: " + str(model.scale))
> model.transform(training).show(truncate=False)
> {code}
> The result is:
> {code}
> Coefficients: [-0.496304411053,0.198452172529]
> Intercept: 2.6380898963056327
> Scale: 1.5472363533632303
> {code}
> ||label||censor||features||prediction||quantiles||
> |1.218|1.0|[1.56,-0.605]|5.718985621018951|[1.160322990805951,4.99546058340675]|
> |2.949|0.0|[0.346,2.158]|18.07678210850554|[3.66759199449632,15.789837303662042]|
> |3.627|0.0|[1.38,0.231]|7.381908879359964|[1.4977129086101573,6.4480027195054905]|
> |0.273|1.0|[0.52,1.151]|13.577717814884505|[2.754778414791513,11.859962351993202]|
> |4.199|0.0|[0.795,-0.226]|9.013087597344805|[1.828662187733188,7.8728164067854856]|
>
> But if we add 20 to every label value:
> {code}
> training = spark.createDataFrame([
>     (21.218, 1.0, Vectors.dense(1.560, -0.605)),
>     (22.949, 0.0, Vectors.dense(0.346, 2.158)),
>     (23.627, 0.0, Vectors.dense(1.380, 0.231)),
>     (20.273, 1.0, Vectors.dense(0.520, 1.151)),
>     (24.199, 0.0, Vectors.dense(0.795, -0.226))],
>     ["label", "censor", "features"])
> quantileProbabilities = [0.3, 0.6]
> aft = AFTSurvivalRegression(quantileProbabilities=quantileProbabilities,
>                             quantilesCol="quantiles")
> model = aft.fit(training)
>
> print("Coefficients: " + str(model.coefficients))
> print("Intercept: " + str(model.intercept))
> print("Scale: " + str(model.scale))
> model.transform(training).show(truncate=False)
> {code}
> the result changes to:
> {code}
> Coefficients: [23.9932020748,3.18105314757]
> Intercept: 7.35052273751137
> Scale: 7698609960.724161
> {code}
> ||label||censor||features||prediction||quantiles||
> |21.218|1.0|[1.56,-0.605]|4.0912442688237169E18|[0.0,0.0]|
> |22.949|0.0|[0.346,2.158]|6.011158613411288E9|[0.0,0.0]|
> |23.627|0.0|[1.38,0.231]|7.7835948690311181E17|[0.0,0.0]|
> |20.273|1.0|[0.52,1.151]|1.5880852723124176E10|[0.0,0.0]|
> |24.199|0.0|[0.795,-0.226]|1.4590190884193677E11|[0.0,0.0]|
>
> Can someone please explain this exponential blow-up in the predictions? As I understand it, the prediction in AFT is the predicted time at which the failure event will occur, so I don't see why it should change exponentially with the value of the label.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
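As to the question quoted above: the exponential relationship is expected once you note that Spark's AFT prediction is exp(coefficients · features + intercept), i.e. log-linear in the features. When the optimizer diverges and returns huge coefficients, the exponent explodes. A minimal sketch in plain Python (the `aft_predict` helper is illustrative, not a Spark API; the coefficient values are the ones printed above) reproduces both the sane and the blown-up predictions for the first training row:

```python
import math

def aft_predict(coefficients, intercept, features):
    # AFT survival regression prediction: exp(coefficients . features + intercept)
    return math.exp(sum(w * x for w, x in zip(coefficients, features)) + intercept)

# Original dataset, well-behaved fit, first row [1.56, -0.605]:
ok = aft_predict([-0.496304411053, 0.198452172529], 2.6380898963056327, [1.56, -0.605])
print(ok)   # ~5.7190, matching the prediction in the first table

# Shifted dataset, diverged fit, same row: huge coefficients blow up the exponent
bad = aft_predict([23.9932020748, 3.18105314757], 7.35052273751137, [1.56, -0.605])
print(bad)  # ~4.09e18, matching the blown-up prediction in the second table
```

So the predictions do not scale with the label directly; they scale exponentially with whatever coefficients the (here, failed) optimization produces.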
[jira] [Commented] (SPARK-21919) inconsistent behavior of AFTsurvivalRegression algorithm
[ https://issues.apache.org/jira/browse/SPARK-21919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157013#comment-16157013 ]

Sean Owen commented on SPARK-21919:
-----------------------------------

Hm, yeah, I suppose I should have tried it too. On {{master}}, and in Scala, I get:

{code}
import org.apache.spark.ml.linalg._
import org.apache.spark.ml.regression._

val training = spark.createDataFrame(Seq(
  (21.218, 1.0, Vectors.dense(1.560, -0.605)),
  (22.949, 0.0, Vectors.dense(0.346, 2.158)),
  (23.627, 0.0, Vectors.dense(1.380, 0.231)),
  (20.273, 1.0, Vectors.dense(0.520, 1.151)),
  (24.199, 0.0, Vectors.dense(0.795, -0.226))
)).toDF("label", "censor", "features")

val aft = new AFTSurvivalRegression().
  setQuantileProbabilities(Array(0.3, 0.6)).
  setQuantilesCol("quantiles")
val model = aft.fit(training)

println(s"Coefficients: ${model.coefficients}")
println(s"Intercept: ${model.intercept}")
println(s"Scale: ${model.scale}")
model.transform(training).show(truncate=false)
{code}

{code}
17/09/07 15:30:14 ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to 0.5
17/09/07 15:30:14 ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to 0.25
17/09/07 15:30:14 ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to 0.5
17/09/07 15:30:14 ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to 0.25
17/09/07 15:30:14 ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to 0.125
...
+------+------+--------------+------------------+---------------------------------------+
|label |censor|features      |prediction        |quantiles                              |
+------+------+--------------+------------------+---------------------------------------+
|21.218|1.0   |[1.56,-0.605] |24.20972861807431 |[21.617443110471118,23.97833624826161] |
|22.949|0.0   |[0.346,2.158] |26.461225875981285|[23.627858619625105,26.208314087493857]|
|23.627|0.0   |[1.38,0.231]  |24.565240805031497|[21.934888406858644,24.330450511651165]|
|20.273|1.0   |[0.52,1.151]  |26.074003958175602|[23.28209894956245,25.82479316934075]  |
|24.199|0.0   |[0.795,-0.226]|25.491396901107077|[22.761875236582238,25.247754569057985]|
+------+------+--------------+------------------+---------------------------------------+
{code}

But in 2.2.0, I get:

{code}
ERROR optimize.LBFGS: Failure! Resetting history: breeze.optimize.FirstOrderException: Line search failed
17/09/07 14:32:35 ERROR optimize.LBFGS: Failure again! Giving up and returning. Maybe the objective is just poorly behaved?
...
+------+------+--------------+---------------------+---------+
|label |censor|features      |prediction           |quantiles|
+------+------+--------------+---------------------+---------+
|21.218|1.0   |[1.56,-0.605] |4.091244268823746E18 |[0.0,0.0]|
|22.949|0.0   |[0.346,2.158] |6.011158613411288E9  |[0.0,0.0]|
|23.627|0.0   |[1.38,0.231]  |7.7835948690311731E17|[0.0,0.0]|
|20.273|1.0   |[0.52,1.151]  |1.5880852723124233E10|[0.0,0.0]|
|24.199|0.0   |[0.795,-0.226]|1.459019088419373E11 |[0.0,0.0]|
+------+------+--------------+---------------------+---------+
{code}

So I'm almost sure this is just another symptom of the Breeze / strong Wolfe line search bug: https://issues.apache.org/jira/browse/SPARK-21523
[jira] [Commented] (SPARK-21919) inconsistent behavior of AFTsurvivalRegression algorithm
[ https://issues.apache.org/jira/browse/SPARK-21919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156979#comment-16156979 ]

Yanbo Liang commented on SPARK-21919:
-------------------------------------

[~ashishchopra0308] [~srowen] I can't reproduce this issue; I get the correct result, which is consistent with R {{survreg}}.

{code}
>>> from pyspark.ml.regression import AFTSurvivalRegression
>>> from pyspark.ml.linalg import Vectors
>>> training = spark.createDataFrame([
...     (21.218, 1.0, Vectors.dense(1.560, -0.605)),
...     (22.949, 0.0, Vectors.dense(0.346, 2.158)),
...     (23.627, 0.0, Vectors.dense(1.380, 0.231)),
...     (20.273, 1.0, Vectors.dense(0.520, 1.151)),
...     (24.199, 0.0, Vectors.dense(0.795, -0.226))],
...     ["label", "censor", "features"])
>>> quantileProbabilities = [0.3, 0.6]
>>> aft = AFTSurvivalRegression(quantileProbabilities=quantileProbabilities,
...                             quantilesCol="quantiles")
>>> model = aft.fit(training)
17/09/07 21:54:31 ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to 0.5
17/09/07 21:54:31 ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to 0.25
17/09/07 21:54:31 ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to 0.5
17/09/07 21:54:31 ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to 0.25
17/09/07 21:54:31 ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to 0.125
>>> print("Coefficients: " + str(model.coefficients))
Coefficients: [-0.065814695216,0.00326705958509]
>>> print("Intercept: " + str(model.intercept))
Intercept: 3.29140205698
>>> print("Scale: " + str(model.scale))
Scale: 0.109856123692
>>> model.transform(training).show(truncate=False)
17/09/07 21:55:05 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
17/09/07 21:55:05 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
+------+------+--------------+------------------+---------------------------------------+
|label |censor|features      |prediction        |quantiles                              |
+------+------+--------------+------------------+---------------------------------------+
|21.218|1.0   |[1.56,-0.605] |24.20972861807431 |[21.617443110471118,23.97833624826161] |
|22.949|0.0   |[0.346,2.158] |26.461225875981285|[23.627858619625105,26.208314087493857]|
|23.627|0.0   |[1.38,0.231]  |24.565240805031497|[21.934888406858644,24.330450511651165]|
|20.273|1.0   |[0.52,1.151]  |26.074003958175602|[23.28209894956245,25.82479316934075]  |
|24.199|0.0   |[0.795,-0.226]|25.491396901107077|[22.761875236582238,25.247754569057985]|
+------+------+--------------+------------------+---------------------------------------+
{code}
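The healthy fit above can be checked by hand: Spark's AFT prediction is exp(coefficients · features + intercept), and the quantiles follow the standard Weibull AFT quantile formula, prediction × (−ln(1−p))^scale, which (as far as I can tell) is what predictQuantiles computes, since it reproduces the table above. A minimal sketch in plain Python, with illustrative `predict`/`quantile` helpers that are not Spark APIs:

```python
import math

# Values printed by the successful PySpark fit above
coefficients = [-0.065814695216, 0.00326705958509]
intercept = 3.29140205698
scale = 0.109856123692

def predict(features):
    # AFT prediction: exp(coefficients . features + intercept)
    return math.exp(sum(w * x for w, x in zip(coefficients, features)) + intercept)

def quantile(features, p):
    # Weibull AFT quantile: prediction * (-ln(1 - p))**scale
    return predict(features) * (-math.log(1.0 - p)) ** scale

row = [1.56, -0.605]  # first training row
pred = predict(row)
q30, q60 = quantile(row, 0.3), quantile(row, 0.6)
print(pred, q30, q60)  # ~24.2097, ~21.6174, ~23.9783 -- matching the first table row
```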
[jira] [Commented] (SPARK-21919) inconsistent behavior of AFTsurvivalRegression algorithm
[ https://issues.apache.org/jira/browse/SPARK-21919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156579#comment-16156579 ]

Yanbo Liang commented on SPARK-21919:
-------------------------------------

[~srowen] I will take a look at it. Thanks.
[jira] [Commented] (SPARK-21919) inconsistent behavior of AFTsurvivalRegression algorithm
[ https://issues.apache.org/jira/browse/SPARK-21919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16153367#comment-16153367 ]

Sean Owen commented on SPARK-21919:
-----------------------------------

It does look like a problem. From R's survreg I get:

{code}
survreg(formula = Surv(data$label, data$censor) ~ data$feature1 +
    data$feature2, dist = "weibull")
                 Value Std. Error       z        p
(Intercept)    3.29140      0.295 11.1737 5.49e-29
data$feature1 -0.06581      0.245 -0.2688 7.88e-01
data$feature2  0.00327      0.123  0.0265 9.79e-01
Log(scale)    -2.20858      0.642 -3.4390 5.84e-04

Scale= 0.11
{code}

[~yanboliang] I think you originally created this; does it ring any bells?
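The survreg output above agrees with the PySpark result reported elsewhere in this thread; one detail worth noting is that survreg prints Log(scale), whereas Spark prints scale itself. A quick check in plain Python (Spark's printed values are taken from the successful PySpark run; survreg's are rounded to the precision it prints):

```python
import math

# Spark's fitted parameters (from the successful PySpark run in this thread)
spark_scale = 0.109856123692
spark_coefficients = [-0.065814695216, 0.00326705958509]
spark_intercept = 3.29140205698

# survreg reports Log(scale) = -2.20858; log of Spark's scale should agree
print(math.log(spark_scale))   # ~-2.2086, matching survreg's Log(scale)

# Coefficients and intercept also line up with survreg's Value column
# ((Intercept) 3.29140, data$feature1 -0.06581, data$feature2 0.00327)
print(spark_intercept, spark_coefficients)
```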