Yep, product images are there as well. Have a go here: https://bbyopen.com/documentation/products-api/product-attributes#TableImages
What Ted said. (If I understand correctly.)

You could create two datasets from the one:
A dataset with: userID,skuID,1
Another with: searchID,skuID,1

Timestamps are there too if you want to get clever with preferences rather than binary. (I'd scrub the search terms before mapping them to IDs too.)

I also have an image with all this data packaged on AWS inside a Postgres database if you want to fart around with it. In public images just do a search for "ACM hackathon" and you should see it.

Feel free to ping me off list with specific questions on that.

On Tue, Apr 16, 2013 at 10:29 AM, Ted Dunning <[email protected]> wrote:

> Primary action can be emitting a search term. Secondary can be click to view.
>
> On Tue, Apr 16, 2013 at 4:53 PM, Pat Ferrel <[email protected]> wrote:
>
> > For the cross-recommender we need some replacement for a primary action--purchases and a secondary action--views, clicks, impressions, something.
> >
> > To use this data we would treat clicks like a purchase--the primary action we want to recommend. Then the search-result-item-impression (SRII) is like a view in the x-recommender description. In this case the SRII is an item seen on a search results page. Each SRII would come with a user ID, itemID, and implicit preference. Clicks also come with userID, itemID and implicit preference.
> >
> > The cross-recommender would have the effect of finding click recommendations from search result item impressions. At the very least this seems like a way to use clicks to re-rank search results.
> >
> > Is this good enough for testing the x-recommender algo? Do we have SRIIs with item ID and user ID? Maybe there are product page URLs we can use as item ids? I'll look, thanks.
> >
> > On Apr 15, 2013, at 5:52 PM, Nick Kolegraff <[email protected]> wrote:
> >
> > Hey Guys,
> > This is a dataset that kinda fits the bill, sorta -- probably the closest thing out there. I got this extracted from BestBuy. Now, while it is more focused on 'search' as opposed to recommendations...could probably double for a recs problem.
> >
> > Basically, each userid is mapped to a query that resulted in a click on a particular sku (product_id). They are the real skus as well, so they can map back to real products in their products API (this data is also provided in bulk on Kaggle):
> >
> > https://bbyopen.com/api-profiles/products-api
> > http://www.kaggle.com/c/acm-sf-chapter-hackathon-big/data
> >
> > On Mon, Apr 15, 2013 at 2:03 PM, Pat Ferrel <[email protected]> wrote:
> >
> > > MAJOR may be too tame a word.
> > >
> > > Furthermore there are several enhancements the community could make to support retail data and retail recommenders. For one thing without public data a *public* cross-recommender will probably not get built.
> > >
> > > The cross-recommender needs to separate action types and use them in slightly different ways so it is important to have a data set with users' purchases but also views, add-to-cart, impressions, purchases in groups (shopping carts)--whatever events are available with anonymized user IDs.
> > >
> > > This data set would be significant in getting new techniques into the community and therefore back to people like you.
> > >
> > > On Apr 15, 2013, at 9:49 AM, Koobas <[email protected]> wrote:
> > >
> > > Definitely of MAJOR interest.
> > > I am sure it would also draw all kinds of desired attention to your business.
> > > MovieLens is way too small to be meaningful any more.
> > > Wikipedia articles and StackOverflow tags are not retail data!
> > > By all means, post some real retail data, if you can.
> > > Meaningful sizes would be appreciated: millions of customers, thousands to tens of thousands of products.
> > >
> > > On Mon, Apr 15, 2013 at 12:27 PM, Robin Morris <[email protected]> wrote:
> > >
> > >> I asked management here a while ago whether there would be a problem with releasing an anonymized set of data from one of our retail customers, and didn't get too much push-back. If this is something that would be of major interest, I can ask again and see whether there's something we can put out as a community resource.
> > >>
> > >> Robin
> > >>
> > >> On 4/10/13 8:37 PM, "Pat Ferrel" <[email protected]> wrote:
> > >>
> > >>> I have retail data but can't publish results from it. If I could get a public sample I'd share how the technique worked out.
> > >>>
> > >>> Not sure how to simulate this data. It has the important characteristic that every purchase is also a view but not the other way around and Ted's technique is a way to scrub the views that don't lead to purchases. All these are implicit preferences but that's not the important part for this technique.
> > >>>
> > >>> On Apr 10, 2013, at 4:15 PM, Koobas <[email protected]> wrote:
> > >>>
> > >>> Retail data may be hard to impossible, but one can improvise.
> > >>> It seems to be fairly common to use Wikipedia articles (Myrrix, GraphLab).
> > >>> Another idea is to use StackOverflow tags (Myrrix examples).
> > >>> Although they are only good for emulating implicit feedback.
> > >>>
> > >>> On Wed, Apr 10, 2013 at 6:48 PM, Ted Dunning <[email protected]> wrote:
> > >>>
> > >>>> On Wed, Apr 10, 2013 at 10:38 AM, Pat Ferrel <[email protected]> wrote:
> > >>>>
> > >>>>> Does anyone know of a public data set that provides things like views and purchases?
> > >>>>
> > >>>> I don't.
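For anyone who wants to try the two-dataset split suggested at the top of this message (userID,skuID,1 and searchID,skuID,1), here is a minimal sketch in Python. It is only an illustration under assumptions: the input is taken to be an iterable of (user_id, query, sku) tuples, the names `scrub_query` and `build_datasets` are made up for this example, and the scrubbing rule (lowercase, strip punctuation, collapse whitespace) is just one plausible choice, not necessarily what was meant in the thread.

```python
import re


def scrub_query(query):
    """Normalize a raw search term (assumed rule: lowercase, drop
    punctuation, collapse runs of whitespace)."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", query.lower())
    return cleaned.strip()


def build_datasets(records):
    """Split click records into two binary-preference datasets.

    records: iterable of (user_id, query, sku) tuples (hypothetical layout).
    Returns (user_rows, search_rows, search_ids) where
      user_rows   = [(user_id, sku, 1), ...]
      search_rows = [(search_id, sku, 1), ...]
      search_ids  = {scrubbed_term: search_id}
    """
    search_ids = {}
    user_rows = []
    search_rows = []
    for user_id, query, sku in records:
        user_rows.append((user_id, sku, 1))
        term = scrub_query(query)
        if not term:
            # Nothing left after scrubbing, so there is no search ID to assign.
            continue
        # Assign the next integer ID the first time a scrubbed term is seen.
        search_id = search_ids.setdefault(term, len(search_ids))
        search_rows.append((search_id, sku, 1))
    return user_rows, search_rows, search_ids


if __name__ == "__main__":
    # Made-up sample rows; the skus here are placeholders, not real products.
    sample = [
        ("u1", "Xbox 360!", "sku-a"),
        ("u2", "xbox  360", "sku-a"),
        ("u1", "iPad", "sku-b"),
    ]
    users, searches, ids = build_datasets(sample)
    for row in users:
        print(",".join(str(field) for field in row))
    for row in searches:
        print(",".join(str(field) for field in row))
```

Scrubbing before assigning IDs means "Xbox 360!" and "xbox  360" collapse to the same searchID, which is the point of cleaning the terms first. The timestamps could be carried along as a fourth field if graded rather than binary preferences are wanted.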
