I'm looking at the Criteo Kaggle competition.  Each row is data related
to a single display of an advertisement.  You're trying to predict
whether the ad will be clicked or not.

Am I trying to categorize?  Yes and no.  I'm trying to predict whether the
ad will be clicked, but the way I'm trying to do that is by categorizing
the rows into buckets and calculating probability based on the category.

I'm not sure how else you'd go about it.
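
To make the bucket idea concrete, here is a minimal sketch of "group rows
into buckets, then use each bucket's click rate as the prediction".  It
does not use NuPIC at all: a plain CRC32 hash stands in for whatever
grouping the SP would learn, so only identical rows share a bucket (unless
the hash collides), whereas an SP would also group similar rows.  The
function names, the n_buckets parameter and the prior_weight smoothing are
all my own assumptions for illustration.

```python
from collections import defaultdict
import zlib


def bucket_of(row, n_buckets):
    """Map a tuple of categorical strings to a bucket id.

    Assumption: a CRC32 hash stands in for the SP's grouping.
    """
    key = "\x1f".join(row).encode("utf-8")
    return zlib.crc32(key) % n_buckets


def fit_bucket_rates(rows, labels, n_buckets, prior_weight=1.0):
    """Count displays and clicks per bucket, then turn them into rates.

    Each bucket's rate is pulled toward the global click rate by
    prior_weight pseudo-displays, so tiny buckets are not trusted blindly.
    """
    shows = defaultdict(int)
    clicks = defaultdict(int)
    for row, clicked in zip(rows, labels):
        b = bucket_of(row, n_buckets)
        shows[b] += 1
        clicks[b] += int(clicked)

    total_shows = sum(shows.values())
    global_rate = sum(clicks.values()) / total_shows if total_shows else 0.0
    rates = {
        b: (clicks[b] + prior_weight * global_rate) / (shows[b] + prior_weight)
        for b in shows
    }
    return rates, global_rate


def predict(row, rates, global_rate, n_buckets):
    """Return the click rate of the row's bucket, or the global rate
    if no training row landed in that bucket."""
    return rates.get(bucket_of(row, n_buckets), global_rate)
```

Usage would be something like calling fit_bucket_rates() once over the
training rows and then predict() for each test row.  Rows whose bucket was
never seen in training fall back to the overall click rate.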


On Thu, Aug 7, 2014 at 5:44 PM, Jim Bridgewater <[email protected]> wrote:

> Hi Ryan,
>
> For classification problems it sounds like you are headed in the right
> direction, but I'm unclear about what your objective is.  Are you just
> trying to categorize each row in the data set?
>
>
>
> On Thu, Aug 7, 2014 at 1:33 PM, Ryan Belcher <[email protected]> wrote:
> > I've been playing around with NuPIC for a while and am still trying to
> > wrap my head around how to use it.  Right now I'm playing with some
> > prediction scenarios where you have a number of input fields and you're
> > trying to predict one output.
> >
> > My understanding is that if the inputs aren't related temporally, then
> > it's a Spatial Pooling problem.  If there are common patterns in the
> > data, then it may be helpful to create hierarchies of SPs.
> >
> > The data I'm looking at right now probably doesn't have common patterns.
> > It's basically a bunch of categorical data from which you're trying to
> > predict a boolean outcome.  There are about 15M rows in the training set.
> >
> > So my thinking is to create 1 SP where inputDimensions is wide enough
> > to accommodate all of the fields and columnDimensions is sized so that
> > rows get grouped together.  (If there were 100k columns, then on average
> > 150 rows would be pooled together.)
> >
> > In theory I could run all of the training data through the SP, then run
> > it through again (without learning) and calculate an outcome probability
> > for each column.  Then I could run the test data through and its
> > probability would be the probability of the column it matches.
> >
> > Is that a reasonable approach or am I way out in left field?
> >
> > Thanks,
> > Ryan
> >
> > _______________________________________________
> > nupic mailing list
> > [email protected]
> > http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
> >
>
>
>
> --
> James Bridgewater, PhD
> Arizona State University
> 480-227-9592