Hi Bertrand,

I agree that web server logs are the first input for starting work on 
DeviceMap data. I don't think, however, that it is a good idea to rely on 
whichever web server is immediately available, such as apache.org.

I would put the following process in place:
1. analyze logs from RELEVANT web servers
2. use the AVAILABLE TOOLS to identify device models and to find device 
properties
3. release a DeviceMap data snapshot
4. wait 30 days
5. goto 1

By RELEVANT I mean relevant in terms of both geographical coverage and device 
access. Geographical coverage is very important because device manufacturers 
do not distribute [mobile] devices homogeneously. Each market is different: 
while in the US today traffic from connected devices would be 9X% desktop 
browsers+iOS+Android, there are emerging markets where a significant share 
comes from feature phones, set-top boxes, etc. By "relevant in terms of device 
access" I mean that the analyzed server should host content suitable for 
consumption by heterogeneous devices: weather forecasts are generic content, 
while a web server distributing developer tools would not be generic enough 
to attract heterogeneous devices.

By AVAILABLE TOOLS I mean both manual analysis and searching the Internet, and, 
hopefully soon, the Web crawlers developed within the DeviceMap project to grab 
information from publicly available sources.
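
As a rough illustration of step 1, here is a minimal Python sketch that tallies 
User-Agent values from access logs. It assumes the logs are in Apache's 
Combined Log Format, where the User-Agent is the last quoted field; the function 
name, the sample lines and the User-Agent strings are made up for the example, 
and User-Agent values containing escaped quotes are not handled.

```python
import re
from collections import Counter

# Assumption: Combined Log Format, whose lines end with "referer" "user-agent".
UA_PATTERN = re.compile(r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$')


def count_user_agents(lines):
    """Count how often each User-Agent value appears in the given log lines."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:  # lines that do not fit the assumed format are skipped
            counts[match.group("agent")] += 1
    return counts


# Hypothetical sample lines; real input would be read from a log file.
sample = [
    '203.0.113.5 - - [03/Feb/2012:18:30:00 +0100] "GET / HTTP/1.1" 200 512 "-" "ExampleBrowser/1.0"',
    '203.0.113.9 - - [03/Feb/2012:18:31:10 +0100] "GET /a HTTP/1.1" 200 204 "-" "ExamplePhone/2.0"',
    '203.0.113.5 - - [03/Feb/2012:18:32:42 +0100] "GET /b HTTP/1.1" 404 128 "-" "ExampleBrowser/1.0"',
    'this line is not in the expected format',
]

for agent, hits in count_user_agents(sample).most_common():
    print(agent, hits)
```

The most frequent values that the available tools cannot yet map to a device 
model would then be the candidates for step 2.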

P.S. I'm trying "Clean and Build" but the process doesn't run... can you please 
fix it? ;)

-Stefano

On 03/feb/2012, at 18.30, Bertrand Delacretaz wrote:

> Hi,
> 
> Another idea discussed with Philip is using web server logs to find
> out which User-Agent values are out there.
> 
> We can probably get logs from apache.org (which gets tons of traffic,
> not sure how much mobile but that's probably growing), and we could
> ask for contributions of more suitably anonymized logs from other
> websites.
> 
> WDYT?
> -Bertrand
