Hello Chris,

At the Portuguese Web Archive we did a study of web characteristics
for the Portuguese web. I don't know if this helps you, but here is
the paper.

João Miranda, Daniel Gomes, Trends in Web characteristics (best paper
award: 2nd place), 7th Latin American Web Congress, Merida, Mexico,
November 2009
Link to the paper:
http://sobre.arquivo.pt/sobre-o-arquivo/trends-in-web-characteristics/at_download/file
Presentation: 
http://sobre.arquivo.pt/about-the-archive/presentation-trends-in-web-characteristics
About other publications from our archive:
http://sobre.arquivo.pt/about-the-archive/publications?set_language=en

Hope this is of assistance.
Cheers,
Simão Fontes

On Sat, Jan 28, 2012 at 2:01 AM, Mattmann, Chris A (388J)
<[email protected]> wrote:
> (sorry for the cross post)
>
> Hey Guys,
>
> I'm trying to find a good citation or estimate (if anyone has done one) that
> estimates the breakout (by % or some other metric) of content types out there
> on the web (with a whole web crawl or a meaningful representative dataset)
> that are non-HTML.
>
> Anyone have any ideas about this?
>
> Thanks!
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: [email protected]
> WWW:   http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
