Re: [FRIAM] Google Correlate

ERIC P. CHARLES Tue, 04 Oct 2011 08:05:01 -0700

The comic book documentation is actually pretty good. The tool grew out of the
original idea that one could predict outbreaks of sickness by tracking things
like searches for medicine. This makes sense (and works pretty well).


That said, given a sufficient number of variables, many, many things will
correlate due to chance. Undoubtedly lots of other things will also correlate
with any given data set, and most of those correlations will be nonsensical.
Thus, I'm not sure this is the type of program that can be used casually to
produce accidental amazing discoveries.
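
The chance-correlation point is easy to check with a small simulation. The
sketch below is only an illustration, not anything Google Correlate itself
does: it builds one random "target" series and a couple of thousand unrelated
random "candidate" series, then reports the strongest Pearson correlation found
among them. The sizes, the seed, and the function names are all made up for
the example.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_chance_correlation(n_points=20, n_candidates=2000, seed=0):
    """Strongest |r| between one random series and many unrelated ones."""
    rng = random.Random(seed)
    # The "search term" we pretend to care about: pure noise.
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_candidates):
        # Every candidate is generated independently of the target.
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson(target, candidate)))
    return best


if __name__ == "__main__":
    print(f"best |r| among pure noise: {best_chance_correlation():.2f}")
```

With only 20 points per series, the best match among a couple of thousand
noise series tends to look impressive even though nothing connects them, and a
real search log offers far more candidates than that.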

When I was poking around I noticed that a search for the word "Honda" had
several hits for Yamaha motorcycles high on the list. That seems like the type
of info that might be useful for those two companies to know. 

Apparently the rise of several things that seem immensely popular is also the
result of simple trends in usage (for example, overall web usage). The
popularity of xkcd tracks almost perfectly the popularity of YouTube song
searches and several Craigslist sites. That seems like the type of information
that would be useful to someone trying to tease apart the specific effects of
website popularity from general increases in usage.
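
The usage-trend effect can also be shown with made-up data. In the sketch
below (an illustration with invented numbers, not Google's method), two series
share nothing except a steady upward drift standing in for overall web usage.
Their raw correlation is very high, but correlating week-to-week changes
instead, one common way to remove a shared trend, makes it mostly vanish. All
names and parameters here are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def trending_series(rng, n_weeks, drift=1.0, noise_sd=5.0):
    """Hypothetical weekly search volume: linear growth plus independent noise."""
    return [drift * week + rng.gauss(0, noise_sd) for week in range(n_weeks)]


def first_differences(series):
    """Week-to-week changes; a steady linear trend becomes a constant offset."""
    return [b - a for a, b in zip(series, series[1:])]


def trend_demo(n_weeks=400, seed=1):
    """Return (raw r, detrended r) for two series that only share a trend."""
    rng = random.Random(seed)
    site_a = trending_series(rng, n_weeks)
    site_b = trending_series(rng, n_weeks)
    raw_r = pearson(site_a, site_b)
    detrended_r = pearson(first_differences(site_a), first_differences(site_b))
    return raw_r, detrended_r


if __name__ == "__main__":
    raw_r, detrended_r = trend_demo()
    print(f"raw r = {raw_r:.2f}, r of weekly changes = {detrended_r:.2f}")
```

Differencing is only one way to handle a shared trend; dividing each series by
a measure of total usage would be another.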

Other interesting things:
I noticed that a search for "Glenn Beck" had "Glen Beck is an idiot" as the
10th highest correlation (r = .91). The graph suggests that from 2009 onward, a
good number of the people interested in Glenn were interested because they
thought he was an idiot (the correlation before that is much weaker). I
searched as far down as r = .65 for Jon Stewart and Stephen Colbert and found
no similar correlations. 

I also noticed that Google has a pretty decent naughty word filter in place.
Not only does it mysteriously find nothing if you search for naughty words, it
doesn't put naughty things in the correlated words list. For example, trying to
correlate "glory" or "sanchez" doesn't pull up anything of interest, despite
the fact that it clearly should. Also, searches for "full facial" mostly peaked
around reports of "full face transplants". That doesn't seem useful, but I
thought it was an interesting built-in limitation.

Eric


On Mon, Oct  3, 2011 12:19 PM, Roger Critchlow <[email protected]> wrote:
>And I fail to understand why searches for "advil" would map to searches for
>"chai tea latte" and "pappardelle" with r>0.931, guess I need to read the Comic
>Book documentation.
>
>-- rec --
>
>On Mon, Oct 3, 2011 at 9:55 AM, Douglas Roberts <<#>> wrote:
>>
>>Sorry about the delay; busy weekend.
>>
>>Yes, google correlate maps search data, not actual real data, although you do
>>have the option of loading your own time series and letting google correlate
>>that for you.
>>
>>Still, I fail to understand why searches for "ibuprofen" would map to
>>searches for "gateway bible" with r=0.949.
>>
>>--Doug
>>
>>On Fri, Sep 30, 2011 at 3:10 PM, Robert J. Cordingley <<#>> wrote:
>
>    I thought the example Doug sent was a correlation between the
>    frequency of search terms, not with any actual real data.  I always
>    thought correlating time based data was difficult since as usage
>    rises in general one might expect search term frequency to rise
>    together too.  Real statisticians can chime in here.
>
>
>    'santa fe' was interesting in showing an annual cycle presumably
>    corresponding to people's interest in finding info on a tourist
>    destination.  
>
>
>    Drawing was interesting in that it seems one could always find some
>    search term frequencies to correlate with any shape curve.
>
>
>    Conclusion: it was fun.
>
>
>    Thanks
>
>    Robert C
>
>    On 9/30/11 2:36 PM, Douglas Roberts wrote:
>    That's the one I was using, Tom.
>
>    --Doug
>
>    On Fri, Sep 30, 2011 at 2:34 PM, Tom Johnson <<#>> wrote:
>    Is this any help?  <https://www.google.com/trends/correlate/>
>
>    -tj
>
>    On Fri, Sep 30, 2011 at 9:53 AM, Douglas Roberts <<#>> wrote:
>
>    Has anybody been able to get anything useful out of that thing?  Most of
>    the items I've searched for return totally bizarre results.  For example,
>    searching on ibuprofen with the intent to see any correlations with
>    influenza gives this as one of the higher-correlated results:
>
>    <http://www.google.com/trends/correlate/search?e=ibuprofen&e=gateway+bible&t=weekly>
>
>    WTF?  Does this show that on-line religion causes headaches?  Seriously,
>    has anybody found this tool to be of any practical use?
>
>    --Doug
>
>    -- 
>    Doug Roberts
>    <http://parrot-farm.net/Second-Cousins>
>
>                >============================================================
>
>                  FRIAM Applied Complexity Group listserv
>
>                  Meets Fridays 9a-11:30 at cafe at St. John's College
>
>                  lectures, archives, unsubscribe, maps at 
> <http://www.friam.org>
>
>    -- 
>    ==========================================
>    J. T. Johnson
>    Institute for Analytic Journalism   --   Santa Fe, NM USA
>    <http://www.analyticjournalism.com>
>    <http://www.jtjohnson.com>
>    ==========================================
>
>-- 
>Robert Cordingley
>Web Development
><http://cirrillian.com>
>

Eric Charles

Professional Student and
Assistant Professor of Psychology
Penn State University
Altoona, PA 16601
