In the earlier case I was deliberately trying to get a bad model, just to check whether validation is wrong or bugged. In the real case, when I choose the predictor variables correctly, validation is even weirder. Command to train the adaptive logistic regression (the real deal):

    mahout trainAdaptiveLogistic --input DataFraud100kTrening.csv --output model --target fraudRisk --predictors balance numTrans numIntlTrans creditLine --types text --passes 50 --categories 2

Validation command:

    mahout validateAdaptiveLogistic --input DataFraud100kTest.csv --model model --auc --confusion

and the output is:

    Log-likelihood:Min=-5.61, Max=-0.00, Mean=-0.16, Median=-0.01
    AUC = 0.12
    =======================================================
    Confusion Matrix
    -------------------------------------------------------
    a      b      <--Classified as
    18812  0       |  18812  a = 0
    0      1187    |  1187   b = 1

    Entropy Matrix: [[-4.3, -0.6], [-0.0, -0.1]]

I am using Mahout 0.9. The data has 100k points, with these variables: "custID", "gender", "state", "cardholder", "balance", "numTrans", "numIntlTrans", "creditLine", "fraudRisk". With standard logistic regression everything is fine; AUC is around 0.75. Sorry for my bad English :).

On Fri, Jul 11, 2014 at 5:53 AM, Ted Dunning <[email protected]> wrote:

> This is confusing for sure. The AUC says that you have no predictive
> power, but the confusion matrix says you have a perfect solution.
>
> Can you say more about what you did and what data you used?
>
> On Thu, Jul 10, 2014 at 6:14 AM, fqsbs1 . <[email protected]> wrote:
>
> > Hi, I have one question. Is validation of adaptive logistic regression
> > bugged, because I get these results?
> >
> > Log-likelihood:Min=-0.69, Max=-0.69, Mean=-0.69, Median=-0.69
> >
> > AUC = 0.48
> >
> > =======================================================
> > Confusion Matrix
> > -------------------------------------------------------
> > a      b      <--Classified as
> > 18812  0       |  18812  a = 0
> > 0      1187    |  1187   b = 1
> >
> > Entropy Matrix: [[-0.7, -0.2], [-0.7, -0.2]]
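
For anyone who wants to cross-check numbers like these outside Mahout, here is a minimal Python sketch. It is not Mahout's code; the function names `auc` and `confusion` and the 0.5 threshold are my own assumptions for illustration. It computes a rank-based AUC and a thresholded confusion matrix from the same list of (label, score) pairs. When both come from the same scores, a perfectly diagonal confusion matrix forces AUC = 1.0, so a diagonal matrix next to AUC = 0.12 suggests the two figures are not derived from the same scores or labels.

```python
def auc(labels, scores):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive example gets the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion(labels, scores, threshold=0.5):
    """2x2 matrix m[actual][predicted]; the threshold is an assumed cut-off."""
    m = [[0, 0], [0, 0]]
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        m[y][predicted] += 1
    return m


if __name__ == "__main__":
    # Toy data (made up for illustration, not taken from the fraud data set).
    labels = [0, 0, 0, 1, 1]
    scores = [0.1, 0.2, 0.3, 0.8, 0.9]
    print(auc(labels, scores), confusion(labels, scores))

    # Inverting the scores flips the ranking and the predictions together.
    flipped = [1.0 - s for s in scores]
    print(auc(labels, flipped), confusion(labels, flipped))
```

The pairwise loop is quadratic, which is fine for a test set of about 20k rows but slow for much larger ones. To use it on real output you would need the per-row scores from the model, which the validation summary above does not print.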
