Hi Ted,

I agree with you, and I would love to release it. Unfortunately it is not my data, so I cannot simply release it to the public, not even anonymized. If someone is willing to contribute new algorithms, I can release anonymized data sets on a personal basis.
The problem is that there are quite reliable ways to deanonymize data [1], and this has also happened in practice [2]. In addition, privacy laws in Germany are a lot more restrictive. So if someone is interested in using my data set, send me an email.

/Manuel

[1] Narayanan, Arvind; Shmatikov, Vitaly: Robust De-anonymization of Large Sparse Datasets. In: Proceedings of the 2008 IEEE Symposium on Security and Privacy. Washington, DC, USA: IEEE Computer Society, 2008. ISBN 978-0-7695-3168-7, pp. 111-125
[2] Barbaro, Michael; Zeller Jr., Tom: A Face Is Exposed for AOL Searcher No. 4417749. http://www.nytimes.com/2006/08/09/technology/09aol.html?_r=1. Version: August 2006, Checked: 2011-03-09
[3] http://www.wired.com/threatlevel/2009/12/netflix-privacy-lawsuit/

On 29.11.2011, at 15:31, Ted Dunning wrote:

> Manuel,
>
> If you can blind your data sufficiently to release it publicly, it would
> make it much easier to get others to help with this.
>
> On Tue, Nov 29, 2011 at 3:21 AM, Manuel Blechschmidt <
> [email protected]> wrote:
>
>> Hello Anatoliy,
>>
>> On 29.11.2011, at 10:32, Anatoliy Kats wrote:
>>
>>> Hi,
>>>
>>> There was a conversation some time ago about incorporating time
>>> dependency for preferences:
>>> http://thread.gmane.org/gmane.comp.apache.mahout.user/2951
>>>
>>> Has there been any more discussion about this? Has anything been
>>> checked into Mahout? Is anyone working on it? I might be able to pitch in.
>>
>>
>> I am currently working with a data set which has highly seasonal data.
>> Actually it is the sales data of a merchant selling tea and spices.
>>
>> I benchmarked the different recommenders against it:
>> http://thread.gmane.org/gmane.comp.apache.mahout.user/10433
>>
>> As far as I know there are currently no recommenders that incorporate time
>> or seasons. The DataModel supports it but it isn't used.
>>
>> I would guess that identifying seasonal patterns could enhance my
>> recommendations a lot.
>>
>> I scanned the following paper:
>> Improving E-Commerce Recommender Systems by the Identification of Seasonal
>> Products
>> http://www.aaai.org/Papers/Workshops/2007/WS-07-08/WS07-08-011.pdf
>>
>> Actually I think that what the paper is doing is not that advanced.
>>
>> I currently try to identify seasonal products with R. I am playing around
>> with seasonal ARIMA models (http://www.duke.edu/~rnau/seasarim.htm
>> http://cran.r-project.org/web/packages/forecast/forecast.pdf). If I have
>> a working solution with R I might implement it in Mahout.
>>
>> What is your use case? Do you already have a data set?
>>
>>>
>>> Thanks,
>>>
>>> Anatoliy
>>
>> /Manuel
>>
>> --
>> Manuel Blechschmidt
>> Dortustr. 57
>> 14467 Potsdam
>> Mobil: 0173/6322621
>> Twitter: http://twitter.com/Manuel_B
>>
>>

--
Manuel Blechschmidt
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B
