Dear Chao,

The input file combines both the object popularity and the object size distribution. That’s why the sizes are not sorted and some sizes may even repeat.

Regards,
Djordje

________________________________________
From: Roy Lee [[email protected]]
Sent: Thursday, March 06, 2014 8:16 PM
To: [email protected]
Subject: A question about twitter data set of Data Caching

Hi,

I have a question about the Twitter dataset for the Data Caching benchmark. Each entry in the unscaled Twitter dataset contains a CDF value and the size of the data. Is the CDF the CDF of the data size distribution? If so, why are the data sizes in the dataset file not in order? I mean, why are the data sizes not listed from largest to smallest or from smallest to largest?

Thanks,
Chao
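
To make the point in the reply concrete, here is a minimal sketch in Python. It assumes one possible reading of the file: the CDF column is the cumulative object popularity and the size column is the size of that particular object. The entries, the `pick_object` helper, and the `sample_size` helper are all hypothetical and made up for illustration; they are not taken from the real dataset or from the benchmark's code.

```python
import bisect
import random

# Hypothetical entries: (cumulative popularity, object size in bytes).
# These values are invented for illustration only.
entries = [
    (0.50, 140),
    (0.80, 32),
    (0.95, 140),
    (1.00, 512),
]

# The CDF column is non-decreasing; the size column has no required order.
cdf = [c for c, _ in entries]
sizes = [s for _, s in entries]


def pick_object(u):
    """Map a uniform draw u in [0, 1) to the index of the object
    whose popularity bucket contains u."""
    return bisect.bisect_right(cdf, u)


def sample_size(rng):
    """Draw one object according to its popularity and return its size."""
    return sizes[pick_object(rng.random())]


if __name__ == "__main__":
    rng = random.Random(0)
    print([sample_size(rng) for _ in range(10)])
```

Under this reading, only the CDF column has to be sorted. Each size belongs to one object, so sizes can appear in any order and two objects can share the same size.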
