Should we maintain a ( num_categories * num_of_features ) matrix for
per-term learning rates in a num_categories-way classification?
for (int i = 0; i < num_categories; i++) {
    for (int j = 0; j < num_of_features; j++) {
        // accumulate the squared gradient for each (category, term) cell
        sum_of_squares[i][j] = sum_of_squares[i][j] + gradient[i][j] * gradient[i][j];
    }
}
Yes. I think that maintaining a learning rate for every parameter that is
being learned is important. It might help to make that sparse, but I
wouldn't think so.
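A minimal sketch of such a per-term accumulator (illustrative only; the class name, the epsilon guard, and the exact rate formula are my own, not Mahout code):

```java
// Sketch of the (num_categories x num_features) accumulator discussed
// above. Everything here (class name, EPSILON, rate formula details) is
// illustrative, not Mahout's actual implementation.
public class AdaGradRates {
    private final double[][] sumOfSquares; // running sums of squared gradients
    private final double eta0;             // base learning rate
    private static final double EPSILON = 1e-8; // avoids division by zero

    public AdaGradRates(int numCategories, int numFeatures, double eta0) {
        this.sumOfSquares = new double[numCategories][numFeatures];
        this.eta0 = eta0;
    }

    // record one gradient observation for cell (i, j)
    public void update(int i, int j, double gradient) {
        sumOfSquares[i][j] += gradient * gradient;
    }

    // AdaGrad-style per-term rate: eta0 / sqrt(accumulated squared gradients)
    public double rate(int i, int j) {
        return eta0 / (Math.sqrt(sumOfSquares[i][j]) + EPSILON);
    }
}
```

Terms that have seen large gradients get small rates, rarely-updated terms keep rates near eta0, which is the point of making it per-term rather than global.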
On Sun, Mar 2, 2014 at 1:33 PM, Vishal Santoshi
vishal.santo...@gmail.com wrote:
I have been swamped. Generally, AdaGrad is a great idea. The code looks fine
at first glance. Certainly some sort of AdaGrad would be preferable to the
hack that I put in.
Sent from my iPhone
On Feb 26, 2014, at 18:30, Vishal Santoshi vishal.santo...@gmail.com wrote:
Ted, Any feedback ?
On Mon, Feb 24, 2014 at 2:58 PM, Vishal Santoshi
vishal.santo...@gmail.com wrote:
Hello Ted,
This is regarding the AdaGrad update per feature. I have attached a file
which reflects http://www.ark.cs.cmu.edu/cdyer/adagrad.pdf (2).
It does differ from OnlineLogisticRegression in the way it implements
public double perTermLearningRate(int j);
This class
Hey Ted,
I presume that you would like an AdaGrad-like solution to replace the
above?
Things that I could glean out:
* Maintain a simple d-dimensional vector to store a running total of the
squares of the gradients, where d is the number of terms. Say *gradients*.
* I do see that regularize has the prior (L1 and L2) depend on
perTermLearningRate(j) ...
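The running-total idea above could look like this in single-vector form (a sketch under my own naming, not Mahout's class hierarchy, with a weight update included to show where the per-term rate would apply):

```java
// Illustrative per-term SGD sketch: a d-dimensional running total of
// squared gradients drives an AdaGrad-style rate for each term. All
// names here are hypothetical, not Mahout API.
public class PerTermSgd {
    final double[] weights;
    final double[] gradientSquares; // running totals, length d
    final double eta0;              // base learning rate

    PerTermSgd(int d, double eta0) {
        this.weights = new double[d];
        this.gradientSquares = new double[d];
        this.eta0 = eta0;
    }

    // per-term learning rate, analogous in spirit to perTermLearningRate(j)
    double perTermLearningRate(int j) {
        return eta0 / Math.sqrt(gradientSquares[j] + 1e-8);
    }

    // one SGD step for term j with gradient g; accumulate first so the
    // current gradient influences its own step size
    void update(int j, double g) {
        gradientSquares[j] += g * g;
        weights[j] -= perTermLearningRate(j) * g;
    }
}
```

If the prior (L1/L2) is also scaled by perTermLearningRate(j), the same accumulator would feed the regularization step as well.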
On Thu, Feb 20, 2014 at 11:49 AM, Vishal Santoshi vishal.santo...@gmail.com
wrote:
:-)
Many leaks are *very* subtle.
One leak that had me going for weeks was in a news wire corpus. I couldn't
figure out why the cross validation was so good and running the classifier
on new data was so much worse.
The answer was that the training corpus had near-duplicate articles. This
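A crude check for this kind of leak could be sketched as follows (hypothetical helper, not from the thread; it is exact-match only, so true near-duplicates would need fuzzier fingerprints such as shingles or MinHash):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Crude, exact-match leak detector: flags test documents whose normalized
// text also appears in the training set. Hypothetical code, not from the
// thread.
public class LeakCheck {
    // collapse case and whitespace so trivial variants hash identically
    private static String normalize(String doc) {
        return doc.toLowerCase().replaceAll("\\s+", " ").trim();
    }

    public static int countLeaks(List<String> train, List<String> test) {
        Set<String> seen = new HashSet<>();
        for (String doc : train) {
            seen.add(normalize(doc));
        }
        int leaks = 0;
        for (String doc : test) {
            if (seen.contains(normalize(doc))) {
                leaks++;
            }
        }
        return leaks;
    }
}
```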
We've been playing around with a number of different parameters, feature
selection, etc. and are able to achieve pretty good results in
cross-validation.
When you say cross validation, do you mean the magic cross validation that
the ALR uses? Or do you mean your 20%?
I mean the 20%.
Gokhan
On Thu, Nov 28, 2013 at 3:18 AM, Ted Dunning ted.dunn...@gmail.com wrote:
On Wed, Nov 27, 2013 at 7:07 AM, Vishal Santoshi
vishal.santo...@gmail.com
Inline
On Mon, Dec 2, 2013 at 8:55 AM, optimusfan optimus...@yahoo.com wrote:
... To accomplish this, we used AdaptiveLogisticRegression and trained 46
binary classification models. Our approach has been to do an 80/20 split
on the data, holding the 20% back for cross-validation of the
Absolutely. I will read through. The idea is to first fix the learning
rate update equation in OLR.
I think this code in OnlineLogisticRegression is the current equation?
@Override
public double currentLearningRate() {
  return mu0 * Math.pow(decayFactor, getStep()) * Math.pow(getStep() + stepOffset, forgettingExponent);
}
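As a standalone sketch of this kind of schedule (assuming a decay of the form mu0 * decayFactor^step * (step + stepOffset)^forgettingExponent; the parameter values below are illustrative, not Mahout defaults):

```java
// Standalone sketch of an exponential-plus-polynomial learning-rate decay
// of the currentLearningRate() form. Parameter values used in main() are
// illustrative, not Mahout defaults.
public class DecaySchedule {
    public static double rate(double mu0, double decayFactor, double stepOffset,
                              double forgettingExponent, int step) {
        return mu0 * Math.pow(decayFactor, step)
                   * Math.pow(step + stepOffset, forgettingExponent);
    }

    public static void main(String[] args) {
        // the rate should shrink monotonically as steps accumulate
        for (int step : new int[] {0, 10, 100}) {
            System.out.println(step + " -> " + rate(1.0, 0.999, 10, -0.5, step));
        }
    }
}
```

Note this schedule decays on a fixed clock regardless of the gradients actually seen, which is exactly what the AdaGrad variants above replace with data-driven per-term decay.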
Yes. Exactly.
On Thu, Nov 28, 2013 at 6:32 AM, Vishal Santoshi
vishal.santo...@gmail.com wrote:
Hell Ted,
Are we to assume that SGD is still a work in progress and implementations (
Cross Fold, Online, Adaptive ) are too flawed to be realistically used ?
The evolutionary algorithm seems to be the core of OnlineLogisticRegression,
which in turn builds up to Adaptive/Cross Fold.
Sorry to spam, I never meant the Hello to come out as Hell. Given a
little disappointment in the mail, I figure I'd rather spam than be
misunderstood.
On Wed, Nov 27, 2013 at 10:07 AM, Vishal Santoshi vishal.santo...@gmail.com
wrote:
No problem at all. Kind of funny.
On Wed, Nov 27, 2013 at 7:08 AM, Vishal Santoshi
vishal.santo...@gmail.com wrote:
On Wed, Nov 27, 2013 at 7:07 AM, Vishal Santoshi vishal.santo...@gmail.com
Are we to assume that SGD is still a work in progress and implementations (
Cross Fold, Online, Adaptive ) are too flawed to be realistically used ?
They are too raw to be accepted uncritically, for sure. They have
Hi-
We're currently working on a binary classifier using Mahout's
AdaptiveLogisticRegression class. We're trying to determine whether or not the
models are suffering from high bias or variance and were wondering how to do
this using Mahout's APIs? I can easily calculate the cross validation
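One rough way to turn train/holdout error numbers into the bias-vs-variance reading asked about here (a heuristic sketch, not a Mahout API; the thresholds are arbitrary):

```java
// Heuristic bias/variance reading from training vs. held-out error.
// A large gap suggests variance (overfitting); both errors high suggests
// bias (underfitting). The 0.1 and 0.2 thresholds are arbitrary.
public class BiasVarianceCheck {
    public static String diagnose(double trainError, double holdoutError) {
        if (holdoutError - trainError > 0.1) {
            return "high variance"; // fits training data much better than new data
        } else if (trainError > 0.2) {
            return "high bias";     // cannot even fit the training data well
        }
        return "ok";
    }
}
```

Either metric can be computed from Mahout's classifier output against the 80/20 split already described; the heuristic itself is just a comparison.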
Well, first off, let me say that I am much less of a fan now of the magical
cross validation approach and adaptation based on that than I was when I
wrote the ALR code. There are definitely legs in the ideas, but my
implementation has a number of flaws.
For example:
a) the way that I provide