Hi all,

I'm afraid I won't be able to share the full HTTP logs I wanted to contribute if we keep the original format. Reza's proposal seems better to me.
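
To make Reza's proposal concrete, here is a minimal sketch in Python of what the anonymization could look like before a contributor uploads logs. The function names (mask_ip, round_to_day) are hypothetical, it assumes IPv4 addresses only, and it is an illustration rather than an agreed format:

```python
import ipaddress
from datetime import datetime, timedelta


def mask_ip(ip: str, drop_bits: int = 8) -> str:
    """Zero the last `drop_bits` bits of an IPv4 address (8 or 16 in the proposal)."""
    addr = ipaddress.IPv4Address(ip)
    # Build a 32-bit mask whose low `drop_bits` bits are cleared.
    mask = (0xFFFFFFFF << drop_bits) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(int(addr) & mask))


def round_to_day(ts: datetime) -> datetime:
    """Round a timestamp to the nearest day (noon and later round up)."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if ts - midnight >= timedelta(hours=12):
        midnight += timedelta(days=1)
    return midnight


if __name__ == "__main__":
    # Example values only; 192.0.2.0/24 is a documentation address range.
    print(mask_ip("192.0.2.123"))       # last 8 bits removed
    print(mask_ip("192.0.2.123", 16))   # last 16 bits removed
    print(round_to_day(datetime(2012, 10, 9, 15, 34)))
```

With the timestamp rounded to the day we could still count how often each user agent appears per day, and a masked address is probably still enough for a coarse GeoIP lookup, though removing 16 bits would likely cost more geographic precision than removing 8.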

Kind regards,
Bruno Verachten

> The only concern I have is that some of the information requested will have
> privacy impact. Is it possible to draft some sort of confidentiality
> agreement between the sharing parties, this project, and Apache to make sure
> this information is kept private and only used for the project?
>
> Also, would it make sense to remove the last 8 or 16 bits from the IP
> address? I think it would also be a good idea to round the timestamp to the
> nearest day.
>
> Thanks,
> Reza
>
> -----Original Message-----
> From: Stefano Andreani [mailto:[email protected]]
> Sent: Tuesday, October 09, 2012 3:34 PM
> To: [email protected]
> Subject: DDR update procedure
>
> Hi all,
>
> In order to set up a process to build and maintain the DDR, the first
> requirement is to identify a way to allow the uploading of http logs from
> contributors, in order to analyze the user agents. We should not receive just
> user agent lists, but full http logs, for these reasons:
> - we need the timestamp for each user agent, in order to identify the
> frequency of each user agent (we will not be able to process each user agent,
> but hopefully enough to cover 99% of the http requests);
> - we need the source IP address, in order to have a geographical MAP of
> distribution of the Devices, for two reasons: (1) analyzing IP addresses
> (using a geographical DB, like GeoIP) we would have a map of uncovered
> regions, so we would be able to improve the global coverage chasing
> contributions from specific regions. (2) the same device can have different
> user agents depending on the region where it has been commercialized and
> using IPs we can improve analysis.
>
> At the same time, uploading such information without a clear policy about how
> that data is handled could imply privacy issues, so we must keep the upload
> area private and guarantee that the information is used consistently with the
> objectives of DeviceMAP project, and not for other purposes.
>
> Do you agree on this?
>
> Can anyone from the Apache infrastructure team help to identify a
> technical solution to satisfy these requirements?
>
> Cheers,
> Stefano.
>

--
Bruno Verachten
