[08/51] [partial] opennlp-sandbox git commit: merge from bgalitsky's own git repo

bgalitsky Wed, 16 Nov 2016 01:11:40 -0800

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/170ted_robert_thurman_on_compassion.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/170ted_robert_thurman_on_compassion.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/170ted_robert_thurman_on_compassion.txt
new file mode 100644
index 0000000..def743c
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/170ted_robert_thurman_on_compassion.txt
@@ -0,0 +1,2 @@
+
+I want to open by quoting Einstein 's wonderful statement , just so people 
will feel at ease that the great scientist of the 20th century also agrees with 
us , and also calls us to this action . He said , " A human being is a part of 
the whole , called by us , universe , a part limited in time and space . He 
experiences himself , his thoughts and feelings , as something separated from 
the rest , a kind of optical delusion of his consciousness , that separation . 
This delusion is a kind of prison for us , restricting us to our personal 
desires and to affection for a few persons nearest to us . Our task must be to 
free ourselves from this prison by widening our circle of compassion , to 
embrace all living creatures and the whole of nature in its beauty . " This 
insight of Einstein 's is uncannily close to that of Buddhist psychology , 
wherein compassion , karuna , it is called , is defined as , " The sensitivity 
to another 's suffering and the corresponding will to free the other from
  that suffering . " It pairs closely with love . Which is the will for the 
other to be happy . Which requires , of course , that one feels some happiness 
oneself and wishes to share it . This is perfect in that it clearly opposes 
self-centeredness and selfishness to compassion , the concern for others , and 
, further , it indicates that those caught in the cycle of self-concern , 
suffer helplessly , while the compassionate are more free and implicitly more 
happy . The Dalai Lama often states that compassion is his best friend . It 
helps him when he is overwhelmed with grief and despair . Compassion helps him 
turn away from the feeling of his suffering as the most absolute , most 
terrible suffering anyone has ever had and broadens his awareness of the 
sufferings of others , even of the perpetrators of his misery and the whole 
mass of beings . In fact , suffering is so huge and enormous , his own becomes 
less and less monumental . And he begins to move beyond his self-concern into 
the
  broader concern for others . And this immediately cheers him up , as his 
courage is stimulated to rise to the occasion . Thus , he uses his own 
suffering as a doorway to widening his circle of compassion . He is a very good 
colleague of Einstein 's , we must say . Now , I want to tell a story , which 
is a very famous story in the Indian and Buddhist tradition , of the great 
Saint Asanga who lived -- contemporary of Augustine in the West and was sort of 
like the Buddhist Augustine . And Asanga lived 800 years after the Buddha 's 
time . And he was discontented with the state of people 's practice of the 
Buddhist religion in India at that time . And so he said , " I 'm sick of all 
this . Nobody 's really living the doctrine . They 're talking about love and 
compassion and wisdom and enlightenment , but they are acting selfish and 
pathetic . So Buddha 's teaching has lost its momentum . I know next Buddha 
will come a few thousand years from now , but exists currently in a certain 
heaven , that 's Maitreya . So , I 'm going to go on a retreat , and I 'm going to 
meditate and pray until the Buddha Maitreya reveals himself to me , and gives 
me a teaching or something to revive the practice of compassion in the world 
today . " So he went on this retreat . And he meditated for three years and he 
did not see the future Buddha Maitreya . And he left in disgust . And as he was 
leaving , he saw a man -- a funny little man -- sitting sort of part way down 
the mountain . And he had a lump of iron . And he was rubbing it with a cloth . 
And he became became interested in that . He said , " Well what are you doing ? 
" And the man said , " I 'm making a needle . " And he said , " That 's 
ridiculous . you ca n't make a needle by rubbing a lump of iron with a cloth . 
" And the man said , " Really ? " And he showed him a dish full of needles . So 
he said , " Okay , I get the point . " He went back to his cave . He meditated 
again . Another three years , no vision . He leaves again
  . This time , he comes down . And as he 's leaving , he sees a bird making a 
nest on a cliff ledge . And where it 's landing to bring the twigs to the cliff 
, its feathers brushes the rock , and it had cut the rock in , inches , six to 
eight inches in , there was a cleft in the rock by the brushing of the feathers 
of generations of the birds . So he said , " All right . I get the point . " He 
went back . Another three years . Again , no vision of Maitreya after nine 
years . And , he again leaves , and this time water dripping , making a giant 
bowl in the rock where it drips in a stream . And so , again , he goes back . 
And after 12 there is still no vision . And he 's freaked out . And he wo n't 
even look left or right to see any encouraging vision . And he comes to the 
town . He 's a broken person . And there , in the town , he 's approached by a 
dog who comes like this -- one of these terrible dogs you can see in some poor 
countries , even in America , I think , in some areas -- 
 and he 's looking just terrible . And he becomes interested in this dog 
because it 's so pathetic , and it 's trying to attract his attention . And he 
sits down looking at the dog . And the dog 's whole hindquarters are a complete 
open sore . And some of it is like gangrenous . And there 's like maggots in 
the flesh . And it 's terrible . He thinks , " What can I do to fix up this dog 
? Well , at least I can clean this wound and wash it . " So he takes it to some 
water , he 's about to clean , then his awareness focuses on the maggots . And 
he sees the maggots , and the maggots are kind of looking a little cute . And 
they 're maggoting happily in the dog 's hindquarters there . " Well , if I 
clean the dog , I 'll kill the maggots . So how can that be ? That 's it . I 'm 
a useless person and there 's no Buddha , no Maitreya , and everything is all 
hopeless . And now I 'm going to kill the maggots ? " So , he had a brilliant 
idea . And he took a shard of something , and cut a piece of
  flesh from his thigh , and he placed it on ground . He was not really 
thinking too carefully about the ASPCA . He was just immediately caught with 
the situation . So he thought , " I will take the maggots and put them on this 
piece of flesh , then clean the dog 's wounds , and then , you know , I 'll 
figure out what to do with the maggots . " So he starts to do that . He ca n't 
grab the maggots . Apparently they wriggle around . They 're kind of hard to 
grab , these maggots . So he says , " Well , I 'll put my tongue on the dog 's 
flesh . And then the maggots will jump on my warmer tongue . The dog is kind of 
used up . And then I 'll spit them one by one down on the thing . " So he goes 
down , and he 's sticking his tongue out like this . And he had to close his 
eyes , it 's so disgusting , and the smell and everything . And then , suddenly 
, there 's a pfft , a noise like that . He jumps back and there , of course , 
is the future Buddha Maitreya . In a beautiful vision like rainbow 
lights , golden , jeweled , plasma body , like exquisite mystic vision , he 
sees . And he says , " Oh . " He bows . But , being human , he 's immediately 
thinking of his next complaint . So as he comes up from his first bow he says , 
" My Lord , I 'm so happy to see you , but where have you been for 12 years ? 
What is this ? " And Maitreya says , " I was with you . Who do you think was 
making needles and making nests and dripping on rocks for you , mister dense ? 
" ( Laughter ) " Looking for the Buddha in person . " he said . And he said , " 
You did n't have , until this moment , real compassion . And , until you have 
real compassion , you cannot recognize love . " Maitreya means love , the 
loving one , you know , in Sanskrit . And so he looked very dubious , Asanga 
did . And he said , " If you do n't believe me , just take me with you . " And 
so he took the Maitreya -- it shrunk into a globe , a ball -- took him on his 
shoulder . And he ran into town in the marketplace , and he said , 
" Rejoice . Rejoice . The future Buddha has come ahead of all 
predictions . Here he is . " And then pretty soon they started throwing rocks 
and stones at him -- It was n't Chautauqua . It was some other town -- because 
they saw a demented looking , scrawny looking yogi man , like some kind of 
hippie , with a bleeding leg and a rotten dog on his shoulder , shouting that 
the future Buddha had come . So , naturally , they chased him out of town . But 
on the edge of town , one elderly lady , a char woman in the charnel ground , 
saw a jeweled foot on a jeweled lotus on his shoulder and then the dog , but 
she saw the jewel foot of the Maitreya , and she offered a flower . So that 
encouraged him , and he went with Maitreya . With Maitreya then took him to a 
certain heaven , the way the Buddhist myth unfolds in a typical way . And 
Maitreya then kept him in heaven for five years , dictating to him five 
complicated tomes of the methodology of how you cultivate compassion . And then 
I thought I would share with you what that method is , or one of them . Famous one 
, it 's called the " Sevenfold Causal Method of Developing Compassion . " And 
it begins first by one meditating and visualizing that all beings are with one 
, and all -- even animals too -- but everyone is in human form . The animals 
are in one of their human lives . The humans are human . And then , among them 
, you think of your friends and loved ones , the circle at the table . And you 
think of your enemies , and you think of the neutral ones . And then you try to 
say , " Well , the loved ones I love . But , you know , after all , they 're 
nice to me . I had fights with them . Sometimes they were unfriendly . I got 
mad . Brothers can fight . Parents and children can fight . So , in a way , I 
like them so much because they 're nice to me . While the neutral ones I do n't 
know . They could all be just fine . And then the enemies I do n't like because 
they 're mean to me . But they are nice to somebody . 
 I could be them . " And then the Buddhists , of course , think , because we 
've all had infinite previous lives , the Buddhists think that we 've all been 
each other 's relatives , actually , and everyone , therefore all of you , in 
the Buddhist view in some previous life , although you do n't remember it and 
neither do I , have been my mother , for which I do apologize for the trouble I 
caused you . And also , actually , I 've been your mother . I 've been female , 
and I 've been every single one of you , your mother in a previous life , the 
way the Buddhists reflect . So , my mother is this life is really great . But 
all of you in a way are part of the eternal mother . You gave me that 
expression , the eternal mama , you said . That 's wonderful . So , that 's the 
way the Buddhists do it . A theist , Christian , can think that all beings , 
even my enemies , are God 's children . So , in that sense , we 're related . 
So , they first create this foundation of equality . So , we sort
  of reduce a little of the clinging to the ones we love -- just in the 
meditation -- and we open our mind to those we do n't know . And we definitely 
reduce the hostility and " I do n't want to be compassionate to them " to the 
ones we think of as the bad guys , the ones we hate and we do n't like . And we 
do n't hate anyone therefore . So we equalize . That 's very important . And 
then the next thing we do is what is called mother recognition . And that is , 
we think of every being as familiar , as family . We expand . We take the 
feeling about remembering a mama , and we defuse that to all beings in this 
meditation . And we see the mother in every being . We see that look that the 
mother has on her face , this looking at this child that is a miracle that she 
has produced from her own body , being a mammal , where she has true compassion 
, truly is the other , and identifies completely . Often the life of that other 
will be more important to her than her own life . And that 's why 
 it 's the most powerful form of -- altruism . The mother is what is the model 
of all altruism for human beings , in spiritual traditions . And so , we 
reflect until we can sort of see that motherly expression in all beings . 
People laugh at me because , you know , I used to say that I used to meditate 
on mama Cheney as my mom , when , of course , I was annoyed with him about all 
of his evil doings in Iraq . I used to meditate on George Bush . He 's quite a 
cute mom in a female form . Has his little ears and he smiles and he rocks you 
in his arms . And you think of him as nursing you . And then Saddam Hussein 's 
serious mustache is a problem . But you think of him as a mom . And this is the 
way you do it . You take any being who looks weird to you , and you see how 
they could be familiar to you . And you do that for awhile until you really 
feel that . You can feel the familiarity of all beings . Nobody seems alien . 
They 're not " other . " You reduce the feeling of otherness about 
beings . Then you move from there to remembering the kindness of mothers in 
general , if you can remember the kindness of your own mother , if you can 
remember the kindness of your spouse , or , if you are a mother yourself , how 
you were with your children . And you begin to get very sentimental , you 
cultivate sentimentality intensely . You will even weep , perhaps , with 
gratitude and kindness . And then you connect that with your feeling that 
everyone has that motherly possibility . Every being , even the most mean 
looking ones , can be motherly . And then , third , you step from there to what 
is called a feeling of gratitude . You want to repay that kindness that all 
beings have shown to you . And then the fourth step , you go to what is called 
lovely love . In each one of these you can take some weeks , or months , or 
days depending on how you do it , or you can do them in a run , this meditation 
. And then you think of how lovely beings are when they are happy , when they 
are 
 satisfied . And every being looks beautiful when they are internally feeling a 
happiness . Their face does n't look like this . When they 're angry , they 
look ugly , every being , but when they 're happy they look beautiful . And so 
you see beings in their potential happiness . And you feel a love toward them 
that you want them to be happy , even the enemy . And , actually , it 's very 
logical to want to -- we think Jesus is being unrealistic when he says love 
thine enemy . He does say that , and we think he 's being unrealistic and sort 
of spiritual and highfalutin and , " Nice for him to say it , but I ca n't do 
that . " But , actually , that 's practical . If you love your enemy that means 
you want your enemy to be happy . If your enemy was really happy , why would 
they bother to be your enemy ? How boring to run around chasing you . They 
would be relaxing somewhere having a good time . So it makes sense to want your 
enemy to be happy because they 'll stop being your enemy because 
that 's too much trouble . But anyway , that 's the lovely love . And then 
finally , the fifth step is compassion , universal compassion . And that is 
where you then look at the reality of all the beings you can think of . And you 
look at them , and you see how they are . And you realize how unhappy they are 
actually , mostly , most of the time . You see that furrowed brow in people . 
And then you realize they do n't even have compassion on themselves . They 're 
driven by this duty and this obligation . " I have to get that . I need more . 
I 'm not worthy . And I should do something . " And they 're rushing around all 
stressed out . And they think of it as somehow macho , hard discipline on 
themselves . But actually they are cruel to themselves . And , of course , they 
are cruel and ruthless toward others . And they , then , never get any positive 
feedback . And the more they succeed , and the more power they have , the more 
unhappy they are . And this is where you feel real compassion 
for them . And you then feel you must act . And it 's the motivation -- 
And the choice of action , of course , hopefully will be more practical than 
poor Asanga who was fixing the maggots on the dog , because he had that 
motivation , and whoever was in front of him , he wanted to help . But , of 
course , that is impractical . He should have founded the ASPCA in the town and 
gotten some scientific help for dogs and maggots . And I 'm sure he did that 
later . But that just indicates the state of mind , you know . And so the next 
step -- the sixth step beyond universal compassion -- which then is this thing 
where you 're linked with the needs of others in a true way , and you have 
compassion for yourself also , and you do n't -- it is n't sentimental only . 
You might be in fear of something . Some bad guy is making himself more and 
more unhappy being more and more mean to other people and getting punished in 
the future for it in various ways . And in Buddhism , they catch it in 
 the future life . Of course in theistic religion they 're punished by God or 
whatever . And materialism , they think they get out of it just by not existing 
, by dying , but they do n't . And so they get reborn as whatever , you know . 
Never mind . I wo n't get into that . But the next step is called universal 
responsibility . And that is very important -- the Charter of Compassion must 
lead us to develop through true compassion , what is called universal 
responsibility . And that means that the great teaching of his holiness , the 
Dalai Lama , that he always teaches everywhere , and he says that is the common 
religion of humanity , kindness , But kindness means universal responsibility . 
And that means whatever happens to other beings is happening to us , that we 
are responsible for that , and we should take it and do whatever we can at 
whatever little level and small level that we can do it . We absolutely must do 
that . There is no way not to do it . And then , finally , that leads 
to a new orientation in life where we live equally for ourselves and others 
, and we realize that happiness for ourselves -- and we are joyful and happy . 
One thing we must n't think is compassion makes you miserable . Compassion 
makes you happy . The first person who is happy , when you get great compassion 
, is yourself , even if you have n't done anything yet for anybody else . 
Although , the change in your mind already does something for other beings . 
They can sense this new quality in yourself , and it helps them already , and 
gives them an example . And that uncompassionate clock has just showed me that 
it 's all over . So , practice compassion , read the charter , disseminate it 
and develop it within yourself . Do n't just think , oh well , I 'm 
compassionate , or I 'm not compassionate , and sort of think you 're stuck 
there . You can develop this . You can diminish the non-compassion , the 
cruelty , the callousness , the neglect of others . Take universal 
responsibility
  for them , and then , not only will God smile and the eternal mama will smile 
, but Karen Armstrong will smile . Thank you very much . 
\ No newline at end of file


http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/171ted_rory_sutherland_life_lessons_from_an_ad_man.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/171ted_rory_sutherland_life_lessons_from_an_ad_man.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/171ted_rory_sutherland_life_lessons_from_an_ad_man.txt
new file mode 100644
index 0000000..6d0bcf8
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/171ted_rory_sutherland_life_lessons_from_an_ad_man.txt
@@ -0,0 +1,2 @@
+
+This is my first time at TED . Normally , as an advertising man , I actually 
speak at TED Evil , which is TED 's secret sister organization -- the one that 
pays all the bills . It 's held every two years in Burma . And I particularly 
remember a really good speech by Kim Jong Il on how to get teens smoking again 
. ( Laughter ) But , actually , it 's suddenly come to me after years working 
in the business , that what we create in advertising , which is intangible 
value -- you might call it perceived value , you might call it badge value , 
subjective value , intangible value of some kind -- gets rather a bad rap . If 
you think about it , if you want to live in a world in the future where there 
are fewer material goods , you basically have two choices . You can either live 
in a world which is poorer , which people in general do n't like . Or you can 
live in a world where actually intangible value constitutes a greater part of 
overall value , that actually intangible value , in many ways
  is a very , very fine substitute for using up labor or limited resources in 
the creation of things . Here is one example . This is a train which goes from 
London to Paris . The question was given to a bunch of engineers , about 15 
years ago , " How do we make the journey to Paris better ? " And they came up 
with a very good engineering solution , which was to spend six billion pounds 
building completely new tracks from London to the coast , and knocking about 40 
minutes off a three-and-half-hour journey time . Now , call me Mister Picky . I 
'm just an ad man ... ... but it strikes me as a slightly unimaginative way of 
improving a train journey merely to make it shorter . Now what is the hedonic 
opportunity cost on spending six billion pounds on those railway tracks ? Here 
is my naive advertising man 's suggestion . What you should in fact do is 
employ all of the world 's top male and female supermodels , pay them to walk 
the length of the train , handing out free Chateau Petrus for
  the entire duration of the journey . ( Laughter ) ( Applause ) Now , you 'll 
still have about three billion pounds left in change , and people will ask for 
the trains to be slowed down . ( Laughter ) Now , here is another naive 
advertising man 's question again . And this shows that engineers , medical 
people , scientific people , have an obsession with solving the problems of 
reality , when actually most problems , once you reach a basic level of wealth 
in society , most problems are actually problems of perception . So I 'll ask 
you another question . What on earth is wrong with placebos ? The seem 
fantastic to me . They cost very little to develop . They work extraordinarily 
well . They have no side effects , or if they do , they 're imaginary , so you 
can safely ignore them . ( Laughter ) So I was discussing this . And I actually 
went to the Marginal Revolution blog by Tyler Cowen . I do n't know if anybody 
knows it . Someone was actually suggesting that you can take this concept 
further , and actually produce placebo education . The point is that 
education does n't actually work by teaching you things . It actually works by 
giving you the impression that you 've had a very good education , which gives 
you an insane sense of unwarranted self confidence , which then makes you very 
, very successful in later life . So , welcome to Oxford , ladies and gentlemen 
. ( Laughter ) ( Applause ) But , actually , the point of placebo education is 
interesting . How many problems of life can be solved actually by tinkering 
with perception , rather than that tedious , hardworking and messy business of 
actually trying to change reality ? Here 's a great example from history . I 
've heard this attributed to several other kings , but doing a bit of 
historical research it seems to be Fredrick the Great . Fredrick the Great of 
Prussia was very very keen for the Germans to adopt the potato , and to eat it 
. Because he realized that if you had two sources of carbohydrate , wheat 
and potatoes , you get less price volatility in bread . And you get a far 
lower risk of famine , because you actually had two crops to fall back on , not 
one . The only problem is : potatoes , if you think about it , look pretty 
disgusting . And also , 18th century Prussians ate very , very few vegetables 
-- rather like contemporary Scottish people . ( Laughter ) So , actually , he 
tried making it compulsory . The Prussian peasantry said , " We ca n't even get 
the dogs to eat these damn things . They are absolutely disgusting and they 're 
good for nothing . " There are even records of people being executed for 
refusing to grow potatoes . So he tried plan B. He tried the marketing solution 
, which is he declared the potato as a royal vegetable . And none but the royal 
family could consume it . And he planted it in a royal potato patch , with 
guards who had instructions to guard over it , night and day , but with secret 
instructions not to guard it very well . ( Laughter ) Now 18th 
century peasants know that there is one pretty safe rule in life , which is 
if something is worth guarding , it 's worth stealing . Before long , there was 
a massive underground potato-growing operation in Germany . What he 'd 
effectively done is he 'd re-branded the potato . It was an absolute 
masterpiece . I told this story and a gentleman from Turkey came up to me and 
said , " Very , very good marketer , Fredrick the Great . But not a patch on 
Ataturk . " Ataturk , rather like Nicolas Sarkozy , was very keen to discourage 
the wearing of a veil , in Turkey , to modernize it . Now , boring people would 
have just simply banned the veil . But that would have ended up with a lot of 
awful kickback and a hell of a lot of resistance . Ataturk was a lateral 
thinker . He made it compulsory for prostitutes to wear the veil . ( Laughter ) 
( Applause ) I ca n't verify that fully . But it does not matter . There is 
your environmental problem solved , by the way , guys : All convicted child molesters 
have to drive a Porsche Cayenne . ( Laughter ) What Ataturk realized 
actually is two very fundamental things . Which is that , actually , first one 
, all value is actually relative . All value is perceived value . For those of 
you who do n't speak Spanish , jugo de naranja -- it 's actually the Spanish 
for " orange juice . " Because actually it 's not the dollar . It 's actually 
the peso in Buenos Aires . Very clever Buenos Aires street vendors decided to 
practice price discrimination to the detriment to any passing gringo tourists . 
As an advertising man , I have to admire that . But the first thing this all 
shows is that all value is subjective . Second point is that persuasion is 
often better than compulsion . These funny signs that flash your speed at you , 
some of the new ones , on the bottom right , now actually show a smiley face or 
a frowny face , to act as an emotional trigger . What 's fascinating about 
these signs is they cost about 10 percent of the running cost
  of a conventional speed camera . But they prevent twice as many accidents . 
So , the bizarre thing which is baffling to conventional , classically trained 
economists , is that a weird little smiley face has a better effect on changing 
your behavior than the threat of a £60 fine and three penalty points . Tiny 
little behavioral economics detail : in Italy , penalty points go backwards . 
You start with 12 and they take them away . Because the found that loss 
aversion is a more powerful influence on people 's behavior . In Britain we 
tend to feel , " Whoa ! Got another three ! " Not so in Italy . Another 
fantastic case of creating intangible value to replace actual or material value 
, which remember , is what , after all , the environmental movement needs to be 
about : This , again , is from Prussia , from , I think , about 1812 , 1813. 
The wealthy Prussians , to help in war against the French , were encouraged to 
give in all their jewelry . And it was replaced with replica jewelry made 
of cast iron . Here 's one : " Gold gab ich für Eisen , 1813. " The 
interesting thing is that for 50 years hence , the highest status jewelry you 
could wear in Prussia was n't made of gold or diamonds . It was made of cast 
iron . Because actually , never mind the actual intrinsic value of having gold 
jewelry . This actually had symbolic value , badge value . It said that your 
family had made a great sacrifice in the past . So , the modern equivalent 
would of course be this . ( Laughter ) But , actually , there is a thing , just 
as there are Veblen goods , where the value of the good depends on it being 
expensive and rare -- there are opposite kind of things where actually the 
value in them depends on them being ubiquitous , classless and minimalistic . 
If you think about it , Shakerism was a proto-environmental movement . Adam 
Smith talks about 18th century America where the prohibition against visible 
displays of wealth was so great , it was almost a block in the economy in New 
England , because even wealthy farmers could find nothing to spend their 
money on , without incurring the displeasure of their neighbors . It 's 
perfectly possible to create these social pressures which lead to more 
egalitarian societies . What 's also interesting , if you look at products that 
have a high component of what you might call messaging value , a high component 
of intangible value , versus their intrinsic value : They are often quite 
egalitarian . In terms of dress , denim is perhaps the perfect example of 
something which replaces material value with symbolic value . Coca-Cola . A 
bunch of you may be a load of pinkos , and you may not like the Coca-Cola 
company . But it 's worth remembering Andy Warhol 's point about Coke . What 
Warhol said about Coke is , he said , " What I really like about Coca-Cola is 
the president of the United States ca n't get a better Coke than the bum on the 
corner of the street . " Now , that is , actually , when you think about it , 
we take 
 it for granted -- it 's actually a remarkable achievement , to produce 
something that 's so democratic . Now , we basically have to change our views 
slightly . There is a basic view that real value involves making things , 
involves labor . It involves engineering . It involves limited raw materials . 
And that what we add on top is kind of false . It 's a fake version . And there 
is a reason for some suspicion and uncertainly about it . It patently veers 
toward propaganda . However , what we do have now is a much more variegated 
media ecosystem in which to kind of create this kind of value . And it 's much 
fairer . When I grew up , this was basically the media environment of my 
childhood as translated into food . You had a monopoly supplier . On the left , 
you have Rupert Murdoch , or the BBC . ( Laughter ) And on your right you have 
a dependent public which is pathetically grateful for anything you give it . ( 
Laughter ) Nowadays , the user is actually involved . This is actually what 
's called , in the digital world , " user-generated content . " Although it 
's called agriculture , in the world of food . ( Laughter ) This is actually 
called a mash-up , where you take content that someone else has produced and 
you do something new with it . In the world of food we call it cooking . This 
is food 2.0 , which is food you produce for the purpose of sharing it with 
other people . This is mobile food . British are very good at that . Fish and 
chips in newspaper , the Cornish Pastie , the pie , the sandwich . We invented 
the whole lot of them . We 're not very good at food in general . Italians do 
great food , but it 's not very portable , generally . ( Laughter ) I only 
learned this the other day . The Earl of Sandwich did n't invent the sandwich . 
He actually invented the toasty . But then , the Earl of Toasty would be a 
ridiculous name . ( Laughter ) Finally , we have contextual communication . Now 
, the reason I show you Pernod -- it 's only one example . Every country 
has a contextual alcoholic drink . In France it 's Pernod . It tastes 
great within the borders of that country . But absolute shite if you take it 
anywhere else . ( Laughter ) Unicum in Hungary , for example . The Greeks have 
actually managed to produce something called Retsina , which even tastes shite 
when you 're in Greece . ( Laughter ) But so much communication now is 
contextual that the capacity for actually nudging people , for giving them 
better information -- B. J. Fogg , at the University of Stanford , makes the 
point that actually the mobile phone is -- He 's invented the phrase , " 
persuasive technologies . " He believes the mobile phone , by being 
location-specific , contextual , timely and immediate , is simply the greatest 
persuasive technology device ever invented . Now , if we have all these tools 
at our disposal , we simply have to ask the question , and Thaler and Sunstein 
have , of how we can use these more intelligently . I 'll give you one example 
. If you had a large red button of this kind , on the wall of your home , and every 
time you pressed it it saved 50 dollars for you , put 50 dollars into your 
pension , you would save a lot more . The reason is that the interface 
fundamentally determines the behavior . Okay ? Now , marketing has done a very 
very good job of creating opportunities for impulse buying . Yet we 've never 
created the opportunity for impulse saving . If you did this , more people 
would save more . It 's simply a question of changing the interface by which 
people make decisions . And the very nature of the decisions changes . 
Obviously , I do n't want people to do this , because as an advertising man I 
tend to regard saving as just consumerism needlessly postponed . ( Laughter ) 
But if anybody did want to do that , that 's the kind of thing we need to be 
thinking about , actually : fundamental opportunities to change human behavior 
. Now , I 've got an example here from Canada . There was a young intern at 
Ogilvy Canada called Hunter Somerville , who was working in improv in Toronto , and 
got a part-time job in advertising , and was given the job of advertising 
Shreddies . Now this is the most perfect case of creating intangible added 
value , without changing the product in the slightest . Shreddies is a strange 
, square , whole-grain cereal , only available in New Zealand , Canada and 
Britain . It 's Kraft 's peculiar way of rewarding loyalty to the crown . ( 
Laughter ) In working out how you could relaunch Shreddies , he came up with 
this . Video : ( Buzzer ) Man : Shreddies is supposed to be square . ( Laughter 
) Woman : Have any of these diamond shapes gone out ? ( Laughter ) Voiceover : 
New Diamond Shreddies cereal . Same 100 percent whole-grain wheat in a 
delicious diamond shape . ( Applause ) Rory Sutherland : I 'm not sure this is 
n't the most perfect example of intangible value creation . All it requires is 
photons , neurons , and a great idea to create this thing . I would say it 
's a work of genius . But , naturally , you ca n't do this kind of thing 
without a little bit of market research . Man : So , Shreddies is actually 
producing a new product , which is something very exciting for them . So they 
are introducing new Diamond Shreddies . ( Laughter ) So I just want to get your 
first impressions when you see that , when you see the Diamond Shreddies box 
there . ( Laughter ) Woman : Were n't they square ? Woman #2 : I 'm a little 
bit confused . Woman #3 : They look like the squares to me . Man : They -- Yeah 
, it 's all in the appearance . But it 's kind of like flipping a six or a nine 
like a six . If you flip it over it looks like a nine . But a six is very 
different from a nine . Woman # 3 : Or an " M " and a " W " . Man : An " M " 
and a " W " , exactly . Man #2 : [ unclear ] You just looked like you turned it 
on its end . But when you see it like that it 's more interesting looking . Man 
: Just try both of them . Take a square one there , first . ( Laughter ) 
Man : Which one did you prefer ? Man #2 : The first one . Man : The 
first one ? ( Laughter ) Rory Sutherland : Now , naturally , a debate raged . 
There were conservative elements in Canada , unsurprisingly , who actually 
resented this intrusion . So , eventually , the manufacturers actually arrived 
at a compromise , which was the combo pack . ( Laughter ) ( Applause ) ( 
Laughter ) If you think it 's funny , bear in mind there is an organization 
called the American Institute of Wine Economics , which actually does extensive 
research into perception of things , and discovers that except for among 
perhaps five or ten percent of the most knowledgeable people , there is no 
correlation between quality and enjoyment in wine , except when you tell the 
people how expensive it is , in which case they tend to enjoy the more 
expensive stuff more . So drink your wine blind in the future . But this is 
both hysterically funny -- but I think an important philosophical point , which 
is , going forward , we need more of this kind of value . We need to spend more time 
appreciating what already exists , and less time agonizing over what else we 
can do . Two quotations to more or less end with . One of them is , " Poetry is 
when you make new things familiar and familiar things new . " Which is n't a 
bad definition of what our job is , to help people appreciate what is 
unfamiliar , but also to gain a greater appreciation , and place a far higher 
value on those things which are already existing . There is some evidence , by 
the way , that things like social networking help do that . Because they help 
people share news . They give badge value to everyday little trivial activities 
. So they actually reduce the need for actually spending great money on display 
, and increase the kind of third-party enjoyment you can get from the smallest 
, simplest things in life . Which is magic . The second one is the second G. K. 
Chesterton quote of this session , which is , " We are perishing 
for want of wonder , not for want of wonders , " which I think for anybody 
involved in technology , is perfectly true . And a final thing : When you place 
a value on things like health , love , sex and other things , and learn to 
place a material value on what you 've previously discounted for being merely 
intangible , a thing not seen , you realize you 're much much wealthier than 
you ever imagined . Thank you very much indeed . ( Applause ) 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/172ted_sean_gourley_on_the_mathematics_of_war.txt
----------------------------------------------------------------------
diff --git 
a/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/172ted_sean_gourley_on_the_mathematics_of_war.txt
 
b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/172ted_sean_gourley_on_the_mathematics_of_war.txt
new file mode 100644
index 0000000..7d554ee
--- /dev/null
+++ 
b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/172ted_sean_gourley_on_the_mathematics_of_war.txt
@@ -0,0 +1,2 @@
+
+We look around the media , as we see on the news from Iraq , Afghanistan , 
Sierra Leone , and the conflict seems incomprehensible to us . And that 's 
certainly how it seemed to me when I started this project . But as a physicist 
, I thought , well if you give me some data , I could maybe understand this . 
You know , give us a go . So as a naive New Zealander I thought , well I 'll go 
to the Pentagon . Can you get me some information ? ( Laughter ) No. So I had 
to think a little harder . And I was watching the news one night in Oxford . 
And I looked down at the chattering heads on my channel of choice . And I saw 
that there was information there . There was data within the streams of news 
that we consume . All this noise around us actually has information . So what I 
started thinking was , perhaps there is something like open source intelligence 
here . If we can get enough of these streams of information together we can 
perhaps start to understand the war . So this is exactly what I 
did . We started bringing a team together , an interdisciplinary team of 
scientists , of economists , mathematicians . We brought these guys together 
and we started to try and solve this . We did it in three steps . The first 
step we did was to collect . We did 130 different sources of information -- 
from NGO reports to newspapers and cable news . We brought this raw data in and 
we filtered it . We extracted the key bits on information to build the database 
. That database contained the timing of attacks , the location , the size and 
the weapons used . It 's all in the streams of information we consume daily , 
we just have to know how to pull it out . And once we had this we could start 
doing some cool stuff . What if we were to look at the distribution of the 
sizes of attacks ? What would that tell us ? So we started doing this . And you 
can see here on the horizontal axis you 've got the number of people killed in 
an attack or the size of the attack . And on the vertical axis you 
've got the number of attacks . So we plot data for sample on this . You see 
some sort of random distribution -- perhaps 67 attacks , one person was killed 
, or 47 attacks where seven people were killed . We did this exact same thing 
for Iraq . And we did n't know , for Iraq what we were going to find . It turns 
out what we found was pretty surprising . You take all of the conflict , all of 
the chaos , all of the noise , and out of that comes this precise mathematical 
distribution of the way attacks are ordered in this conflict . This blew our 
mind . Why should a conflict like Iraq have this as its fundamental signature ? 
Why should there be order in war ? We did n't really understand that . We 
thought maybe there is something special about Iraq . So we looked at a few 
more conflicts . We looked at Colombia , we looked at Afghanistan , and we 
looked at Senegal . And the same pattern emerged in each conflict . This was 
n't supposed to happen . These are different wars , with different 
religious factions , different political factions , and different 
socioeconomic problems . And yet the fundamental patterns underlying them are 
the same . So we went a little wider . We looked around the world at all the 
data we could get our hands on . From Peru to Indonesia , we studied this same 
pattern again . And we found that not only were the distributions these 
straight lines , but the slope of these lines , they clustered around this 
value of Alpha equals 2.5 . And we could generate an equation that could 
predict the likelihood of an attack . What we 're saying here is the 
probability of an attack killing X number of people in a country like Iraq , is 
equal to a constant , times the size of that attack , raised to the power of 
negative Alpha . And negative Alpha is the slope of that line I showed you 
before . So what ? This is data , statistics . What does it tell us about these 
conflicts ? That was a challenge we had to face as physicists . How do we 
explain this ? And what 
we really found was that Alpha if we really think about it , is the 
organizational structure of the insurgency . Alpha is the distribution of the 
sizes of attacks , which is really the distribution of the group strength 
carrying out the attacks . So we look at a process of group dynamics -- 
coalescence and fragmentation . Groups coming together . Groups breaking apart 
. And we start running the numbers on this . Can we simulate it ? Can we create 
the kind of patterns that we 're seeing in places like Iraq ? Turns out we kind 
of do a reasonable job . We can run these simulations . We can recreate this 
using a process of group dynamics to explain the patterns that we see all 
around the conflicts around the world . So what 's going on ? Why should these 
different -- seemingly different conflicts have the same patterns ? Now what I 
believe is going on is that the insurgent forces , they evolve over time . They 
adapt . And it turns out there is only one solution to fight a much stronger 
enemy . And if you do n't find that solution as an insurgent force , you 
do n't exist . So every insurgent force that is ongoing , every conflict that 
is ongoing , it 's going to look something like this . And that is what we 
think is happening . Taking it forward , how do we change it ? How do we end a 
war like Iraq ? What does it look like ? Alpha is the structure . It 's got a 
stable state at 2.5 . This is what wars look like when they continue . We 've 
got to change that . We can push it up . The forces become more fragmented . 
There is more of them , but they are weaker . Or we push it down . They 're 
more robust . There is less groups . But perhaps you can sit and talk to them . 
So this graph here , I 'm going to show you now . No one has seen this before . 
This is literally stuff that we 've come through last week . And we see the 
evolution of Alpha through time . We see it start . And we see it grow up to 
the stable state the wars around the world look like . And it stays 
there through the invasion of Falusia until the Samarra bombings in the 
Iraqi elections of '06 . And the system gets perturbed . It moves upwards to a 
fragmented state . This is when the surge happens . And depending on who you 
ask , the surge was supposed to push it up even further . The opposite happened 
. The groups became stronger . They became more robust . And so I 'm thinking , 
right , great , it 's going to keep going down . We can talk to them . We can 
get a solution . The opposite happened . It 's moved up again . The groups are 
more fragmented . And this tells me one of two things . Either we 're back 
where we started , and the surge has had no effect . Or finally the groups have 
been fragmented to the extent that we can start to think about maybe moving out 
. I do n't know what the answer is to that . But I know that we should be 
looking at the structure of the insurgency to answer that question . Thank you 
. ( Applause ) 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/173ted_siegfried_woldhek_shows_how_he_found_the_true_face_of_leonardo.txt
----------------------------------------------------------------------
diff --git 
a/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/173ted_siegfried_woldhek_shows_how_he_found_the_true_face_of_leonardo.txt
 
b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/173ted_siegfried_woldhek_shows_how_he_found_the_true_face_of_leonardo.txt
new file mode 100644
index 0000000..5871249
--- /dev/null
+++ 
b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/173ted_siegfried_woldhek_shows_how_he_found_the_true_face_of_leonardo.txt
@@ -0,0 +1,2 @@
+
+Good morning . Let 's look for a minute at the greatest icon of all , Leonardo 
da Vinci . We 're all familiar with his fantastic work -- his drawings , his 
paintings , his inventions , his writings . But we do not know his face . 
Thousands of books have been written about him , but there 's controversy , and 
it remains , about his looks . Even this well-known portrait is not accepted by 
many art historians . So what do you think ? Is this the face of Leonardo da 
Vinci or is n't it ? Let 's find out . Leonardo was a man that drew everything 
around him . He drew people , anatomy , plants , animals , landscapes , 
buildings , water , everything . But no faces ? I find that hard to believe . 
His contemporaries made faces , like the ones you see here . En face or three 
quarters . So surely a passionate drawer like Leonardo must have made 
self-portraits from time to time . So let 's try to find them . I think that if 
we were to scan all of his work and look for self-portraits , we would find 
his face looking at us . So I looked at all of his drawings , more than 700 
, and looked for male portraits . There are about 120 , you see them here . 
Which ones of these could be self-portraits ? Well , for that they have to be 
done as we just saw , en face or three-quarters . So we can eliminate all the 
profiles . It also has to be sufficiently detailed . So we can also eliminate 
the ones that are very vague or very stylized . And we know from his 
contemporaries that Leonardo was a very handsome , even beautiful man . So we 
can also eliminate the ugly ones or the caricatures . ( Laughter ) And look 
what happens -- only three candidates remain that fit the bill . And here they 
are . Yes indeed , the old man is there , as is this famous pen drawing of the 
Homo Vitruvianos . And lastly , the only portrait of a male that Leonardo 
painted , " The Musician . " Before we go into these faces , I should explain 
why I have some right to talk about them . I 've made more than 1,100 portraits 
myself for newspapers , over the course of 300 -- 30 years , sorry , 30 
years only . ( Laughter ) But there are 1,100 , and very few artists have drawn 
so many faces . So I know a little about drawing and analyzing faces . OK , now 
let 's look at these three portraits . And hold onto your seats , because if we 
zoom in on those faces , remark how they have the same broad forehead , the 
horizontal eyebrows , the long nose , the curved lips and the small , 
well-developed chin . I could n't believe my eyes when I first saw that . There 
is no reason why these portraits should look alike . All we did was look for 
portraits that had the characteristics of a self-portrait , and look , they are 
very similar . Now , are they made in the right order ? The young man should be 
made first . And as you see here from the years that they were created , it is 
indeed the case . They are made in the right order . What was the age of 
Leonardo at the time ? Does that fit ? Yes it does . He was 33 , 
38 and 63 when these were made . So we have three pictures , potentially of 
the same person of the same age as Leonardo at the time . But how do we know it 
's him , and not someone else ? Well , we need a reference . And here 's the 
only picture of Leonardo that 's widely accepted . It 's a statue made by 
Verrocchio , of David , for which Leonardo posed as a boy of 15. And if we now 
compare the face of the statue , with the face of the musician , you see the 
very same features again . The statue is the reference , and it connects the 
identity of Leonardo to those three faces . Ladies and gentlemen , this story 
has not yet been published . It 's only proper that you here at TED hear and 
see it first . The icon of icons finally has a face . Here he is -- Leonardo da 
Vinci . ( Applause ) 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/174ted_stephen_wolfram_computing_a_theory_of_everything.txt
----------------------------------------------------------------------
diff --git 
a/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/174ted_stephen_wolfram_computing_a_theory_of_everything.txt
 
b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/174ted_stephen_wolfram_computing_a_theory_of_everything.txt
new file mode 100644
index 0000000..97cf3ef
--- /dev/null
+++ 
b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/174ted_stephen_wolfram_computing_a_theory_of_everything.txt
@@ -0,0 +1,2 @@
+
+So I want to talk today about an idea . It 's a big idea . Actually , I think 
it 'll eventually be seen as probably the single biggest idea that 's emerged 
in the past century . It 's the idea of computation . Now , of course , that 
idea has brought us all of the computer technology we have today and so on . 
But there 's actually a lot more to computation than that . It 's really a very 
deep , very powerful , very fundamental idea , whose effects we 've only just 
begun to see . Well , I myself have spent the past 30 years of my life working 
on three large projects that really try to take the idea of computation 
seriously . So I started off at a young age as a physicist using computers as 
tools . Then , I started sort of drilling down , thinking about the 
computations I might want to do , trying to figure out what primitives they 
could be built up from and how they could be automated as much as possible . 
Eventually , I created a whole structure based on symbolic programming and so on 
that let me build Mathematica . And for the past 23 years , at an increasing 
rate , we 've been pouring more and more ideas and capabilities and so on into 
Mathematica , and I 'm happy to say that 's led to many good things in R and D 
and education , lots of other areas . Well , I have to admit , actually , that 
I also had a very selfish reason for building Mathematica . I wanted to use it 
myself , a bit like Galileo got to use his telescope 400 years ago . But I 
wanted to look , not at the astronomical universe , but at the computational 
universe . So we normally think of programs as being complicated things that we 
build for very specific purposes . But what about the space of all possible 
programs ? Here 's a representation of a really simple program . So , if we run 
this program , this is what we get . Very simple . So let 's try changing the 
rule for this program a little bit . Now we get another result , still very 
simple . Try changing it again . You get something a little 
bit more complicated , but if we keep running this for awhile , we find out 
that , although the pattern we get is very intricate , it has a very regular 
structure . So the question is : Can anything else happen ? Well , we can do a 
little experiment . Let 's just do a little mathematical experiment , try and 
find out . Let 's just run all possible programs of the particular type that we 
're looking at . They 're called cellular automata . You can see a lot of 
diversity in the behavior here . Most of them do very simple things . But if 
you look along all these different pictures , at rule number 30 , you start to 
see something interesting going on . So let 's take a closer look at rule 
number 30 here . So here it is . We 're just following this very simple rule at 
the bottom here , but we 're getting all this amazing stuff . It 's not at all 
what we 're used to , and I must say that , when I first saw this , it came as 
a huge shock to my intuition , and , in fact , to understand it , 
I eventually had to create a whole new kind of science . ( Laughter ) This 
science is different , more general , than the mathematics-based science that 
we 've had for the past 300 or so years . You know , it 's always seemed like a 
big mystery how nature , seemingly so effortlessly manages to produce so much 
that seems to us so complex . Well , I think we 've found its secret . It 's 
just sampling what 's out there in the computational universe and quite often 
getting things like Rule 30 or like this . And knowing that , starts to explain 
a lot of long-standing mysteries in science . It also brings up new issues 
though , like computational irreducibility . I mean , we 're used to having 
science let us predict things , but something like this is fundamentally 
irreducible . The only way to find its outcome is , effectively , just to watch 
it evolve . It 's connected to , what I call , the principle of computational 
equivalence , which tells us that even incredibly simple systems can 
do computations as sophisticated as anything . It does n't take lots of 
technology or biological evolution to be able to do arbitrary computation , 
just something that happens , naturally , all over the place . Things with 
rules as simple as these can do it . Well , this has deep implications about 
the limits of science , about predictability and controllability of things like 
biological processes or economies , about intelligence in the universe , about 
questions like free will and about creating technology . You know , working on 
this science for many years , I kept wondering , " What will be its first 
killer app ? " Well , ever since I was a kid , I 'd been thinking about 
systematizing knowledge and somehow making it computable . People like Leibniz 
had wondered about that too 300 years earlier . But I 'd always assumed that to 
make progress , I 'd essentially have to replicate a whole brain . Well , now I 
got to thinking : This scientific paradigm of mine suggests something different 
. And , by the way , I 've now got huge computation capabilities in 
Mathematica , and I 'm a CEO with some worldly resources to do large , 
seemingly crazy , projects . So I decided to just try to see how much of the 
systematic knowledge that 's out there in the world we can make computable . So 
, it 's been a big , very complex project , which I was not sure was going to 
work at all . But I 'm happy to say that it 's actually going really well . And 
last year we were able to release the first website version of Wolfram Alpha . 
It 's purpose is to be a serious knowledge engine that computes answers to 
questions . So let 's give it a try . Let 's start off with something really 
easy . Hope for the best . Very good . Okay . So far so good . ( Laughter ) Let 
's try something a little bit harder . Let 's say ... Let 's do some mathy 
thing and with luck it 'll work out the answer and try and tell us some 
interesting things things about related math . We could ask it something about 
the real world . Let 's say -- I do n't know -- What 's the GDP of Spain ? 
And it should be able to tell us that . Now we could compute something related 
to this , let 's say the GDP of Spain divided by , I do n't know , the -- hmmm 
... let 's say the revenue of Microsoft . ( Laughter ) The idea is that we can 
sort of just type this in , this kind of question in however we think of it . 
So let 's try asking a question , like a health related question . So let 's 
say we have a lab finding that -- you know , we have an LDL level of 140 for a 
male aged 50. So let 's type that in , and now Wolfram Alpha will go and use 
available public health data and try to figure out what part of the population 
that corresponds to and so on . Or let 's try asking about , I do n't know , 
the international space station . And what 's happening here is that Wolfram 
Alpha is not just looking up something ; it 's computing , in real time , where 
the international space station is right now , at this moment 
, how fast it 's going and so on . So Wolfram Alpha knows about lots and 
lots of kinds of things . It 's got by now , pretty good coverage of everything 
you might find in a standard reference library and so on . But the goal is to 
go much further and , very broadly , to democratize all of this kind of 
knowledge , and to try and be an authoritative source in all areas , to be able 
to compute answers to specific questions that people have , not by searching 
what other people may have written down before , but by using built in 
knowledge to compute fresh new answers to specific question . Now , of course , 
Wolfram Alpha is a monumentally huge , long term project with lots and lots of 
challenges . For a start , one has to curate a zillion different sources of 
facts and data , and we built quite a pipeline of Mathematica automation and 
human domain experts for doing this . But that 's just the beginning . Given 
raw facts or data to actually answer questions , one has to compute , one has 
to implement all those methods and models and algorithms and so on that 
science and other areas have built up over the centuries . Well , even starting 
from Mathematica , this is still a huge amount of work . So far , there are 
about 8 million lines of Mathematica code in Wolfram Alpha built by experts 
from many , many different fields . Well , a crucial idea of Wolfram Alpha is 
that you can just ask it questions using ordinary human language , which means 
that we 've got to be able to take all those strange utterances that people 
type into the input field and understand them . And I must say that I thought 
that step might just be plain impossible . Two big things happened . First , a 
bunch of new ideas about linguistics that came from studying the computational 
universe . And second , the realization that having actual computable knowledge 
completely changes how one can set about understanding language . And , of 
course , now with Wolfram Alpha actually out in the wild , we can 
learn from its actual usage . And , in fact , there 's been an interesting 
coevolution that 's been going on between Wolfram Alpha and its human users . 
And it 's really encouraging . Right now , if we look at web queries , more 
than 80 percent of them get handled successfully the first time . And if you 
look at things like the iPhone app , the fraction is considerably larger . So , 
I 'm pretty pleased with it all . But , in many ways , we 're still at the very 
beginning with Wolfram Alpha . I mean , everything is scaling up very nicely . 
We 're getting more confident . You can expect to see Wolfram Alpha technology 
showing up in more and more places , working both with this kind of public data 
, like on the website , and with private knowledge for people and companies and 
so on . You know , I 've realized that Wolfram Alpha actually gives one a sort 
of whole new kind of computing that one can call knowledge-based computing , in 
which one 's starting , not just from raw computation 
, but from a vast amount of built-in knowledge . And when one does that , one 
really changes the economics of delivering computational things , whether it 's 
on the web or elsewhere . You know , we have a fairly interesting situation 
right now . On the one hand , we have Mathematica , with its sort of precise , 
formal language and a huge network of carefully designed capabilities able to 
get a lot done in just a few lines . Let me show you a couple of examples here 
. So here 's a trivial piece of Mathematica programming . Here 's something 
where we 're sort of integrating a bunch of different capabilities here . Here 
we 'll just create in this line a little user interface that allows us to do 
something fun there . If you go on , that 's a slightly more complicated 
program that 's now doing all sorts of algorithmic things and creating user 
interface and so on . But it 's something that 's very precise stuff . It 's a 
precise specification with a precise formal language that causes Mathematica 
to know what to do here . Well , then on the other hand , we have 
Wolfram Alpha , with all the sort of messiness of the world and human language 
and so on built into it . So what happens when you put these things together ? 
I think it 's actually rather wonderful . With Wolfram Alpha inside Mathematica 
, you can , for example , make precise programs that call on real-world data . 
Here 's a really simple example . You can also just sort of give vague input 
and then try and have Wolfram Alpha figure out what you 're talking about . Let 
's try this here . But actually I think sort of the most exciting thing about 
this is that it really gives one the chance to democratize programming . I mean 
, anyone will be able to just sort of say what they want in plain language , 
then , the idea is , that Wolfram Alpha will be able to figure out what precise 
pieces of code can do what they 're asking for and then show them examples that 
will let them pick what they need to build up bigger
and bigger , precise programs . So , sometimes , Wolfram Alpha will be able 
to do the whole thing immediately and just give back a whole big program that 
you can then compute with . So here 's a big website where we 've been 
collecting lots of educational and other demonstrations about lots of kinds of 
things . So , I do n't know , I 'll show you one example , maybe here . This is 
just an example of one of these computable documents . This is probably a 
fairly small piece of Mathematica code that 's able to be run here . Okay . Let 
's zoom out again . So , given our new kind of science , is there a general way 
to use it to make technology ? So , with physical materials , we 're used to 
kind of going around the world and discovering that particular materials are 
useful for particular technological purposes and so on . Well , it turns out , 
we can do very much the same kind of thing in the computational universe . 
There 's an inexhaustible supply of programs out there . The challenge
is to see how to harness them for human purposes . Something like Rule 30 , 
for example , turns out to be a really good randomness generator . Other simple 
programs are good models for processes in the natural or social world . And , 
for example , Wolfram Alpha and Mathematica are actually now full of algorithms 
that we discovered by searching the computational universe . And , for example 
, this -- we go back here -- This has become surprisingly popular among 
composers finding musical forms by searching the computational universe . In a 
sense , we can use the computational universe to get mass customized creativity 
. I 'm hoping we can , for example , use that even to get Wolfram Alpha to 
routinely sort of do invention and discovery on the fly and to find all sorts 
of wonderful stuff that no engineer and no process of incremental evolution 
would ever come up with . Well , so , that leads to sort of an ultimate 
question . Could it be that someplace out there in the computational universe 
we might find our physical universe ? Perhaps there 's even some quite 
simple rule , some simple program for our universe . Well , the history of 
physics would have us believe that the rule for the universe must be pretty 
complicated . But in the computational universe we 've now seen how rules that 
are incredibly simple can produce incredibly rich and complex behavior . So 
could that be what 's going on with our whole universe ? If the rules for the 
universe are simple , it 's kind of inevitable that they have to be very 
abstract and very low level , operating , for example , far below the level of 
space or time , which makes it hard to represent things . But in at least a 
large class of cases , one can think of the universe as being like some kind of 
network , which , when it gets big enough , behaves like continuous space in 
much the same way as having lots of molecules can behave like a continuous 
fluid . Well , then the universe has to evolve by applying little rules that 
progressively update this network . And each possible rule , in a sense , 
corresponds to a candidate universe . Actually , I have n't shown these before 
, but here are a few of the candidate universes that I 've looked at . Some of 
these are hopeless universes , completely sterile , with other kinds of 
pathologies like no notion of space , no notion of time , no matter , other 
problems like that . But the exciting thing that I 've found in the last few 
years is that you actually do n't have to go very far in the computational 
universe before you start finding candidate universes that are n't obviously 
not our universe . Here 's the problem : Any serious candidate for our universe 
, is inevitably full of computational irreducibility , which means that it is 
irreducibly difficult to find out how it will really behave , and whether it 
matches our physical universe . A few years ago , I was pretty excited to 
discover that there are candidate universes with incredibly simple rules that
successfully reproduce special relativity and even general relativity and 
gravitation and at least give hints of quantum mechanics . So , will we find 
the whole of physics ? I do n't know for sure . But I think at this point it 's 
sort of almost embarrassing not to at least try . Not an easy project . One has 
got to build a lot of technology . One 's got to build a structure that 's 
probably at least as deep as existing physics . And I 'm not sure what the best 
way to organize the whole thing is . Build a team , open it up , offer prizes 
and so on . But I 'll tell you here today that I 'm committed to seeing this 
project done , to see if , within this decade , we can finally hold in our 
hands the rule for our universe and know where our universe lies in the space 
of all possible universes -- and be able to type into Wolfram Alpha " the 
theory of the universe , " and have it tell us . ( Laughter ) So I 've been 
working on the idea of computation now for more than 30 years , building
tools and methods and turning sort of intellectual ideas into millions of 
lines of code and grist for server farms and so on . With every passing year , 
I realize how much more powerful the idea of computation really is . It 's 
taken us a long way already , but there 's so much more to come . From the 
foundations of science to the limits of technology to the very definition of 
the human condition , I think computation is destined to be the defining idea 
of our future . Thank you . ( Applause ) Chris Anderson : That was astonishing 
. Stay here . I 've got a question . ( Applause ) So , that was , fair to say , 
an astonishing talk . Are you able to say in a sentence or two how this type of 
thinking could integrate at some point to things like string theory or the kind 
of things that people think of as the fundamental explanations of the universe 
? Stephen Wolfram : Well , the parts of physics that we kind of know to be true 
, things like the standard model of physics . What I 'm trying 
to do better reproduce the standard model of physics or it 's simply wrong 
. The things that people have tried to do in the last 25 years or so with 
string theory and so on have been an interesting exploration that has tried to 
get back to the standard model , but has n't quite gotten there . My guess is 
that some great simplifications of what I 'm doing may actually have 
considerable resonance with what 's been done in string theory , but that 's a 
complicated math thing that I do n't yet know how it 's going to work out . CA 
: Benoit Mandlebrot is in the audience . He has also shown how complexity can 
arise from a simple start . Does your work relate to his ? SW : I think so . I 
view Benoit Mandlebrot 's work as kind of one of the founding contributions to 
this kind of area . Benoit has been particularly interested in nested patterns 
, in fractals and so on , where the structure is something that 's kind of 
tree-like , and where there 's sort of a big branch that makes little branches 
, and even smaller branches and so on . That 's kind of one of the 
ways that you get towards true complexity . I think things like the Rule 30 
cellular automaton get us to a different level . In fact , in a very precise 
way they get us to a different level because they seem to be things that are 
capable of complexity that 's sort of as great as complexity can ever get ... I 
could go on about this at great length , but I wo n't . CA : Stephen Wolfram , 
thank you . ( Applause ) 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/175ted_tom_wujec_build_a_tower.txt
----------------------------------------------------------------------
diff --git 
a/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/175ted_tom_wujec_build_a_tower.txt
 
b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/175ted_tom_wujec_build_a_tower.txt
new file mode 100644
index 0000000..ad09734
--- /dev/null
+++ 
b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/175ted_tom_wujec_build_a_tower.txt
@@ -0,0 +1,2 @@
+
+Several years ago , here at TED , Peter Skillman introduced a design challenge 
called the marshmallow challenge . And the idea 's pretty simple . Teams of 
four have to build the tallest free-standing structure out of 20 sticks of 
spaghetti , one yard of tape , one yard of string and a marshmallow . The 
marshmallow has to be on top . And , though it seems really simple , it 's 
actually pretty hard , because it forces people to collaborate very quickly . 
And so I thought that this was an interesting idea , and I incorporated it into 
a design workshop . And it was a huge success . And since then , I 've 
conducted about 70 design workshops across the world with students and 
designers and architects , even the CTOs of the Fortune 50 , and there 's 
something about this exercise that reveals very deep lessons about the nature 
of collaboration , and I 'd like to share some of them with you . So , normally 
, most people begin by orienting themselves to the task . They talk about it , 
they figure out what it 's going to look like , they jockey for power , then they 
spend some time planning , organizing . They sketch and they lay out spaghetti 
They spend the majority of their time assembling the sticks into ever-growing 
structures and then , finally , just as they 're running out of time , someone 
takes out the marshmallow , and then they gingerly put it on top , and then 
they stand back , and Ta-da ! they admire their work . But what really happens 
, most of the time , is that the " ta-da " turns into an " uh-oh , " because 
the weight of the marshmallow causes the entire structure to buckle and to 
collapse . So there are a number of people who have a lot more " uh-oh " 
moments than others , and among the worst are recent graduates of business 
school . ( Laughter ) They lie , they cheat , they get distracted , and they 
produce really lame structures . And of course there are teams that have a lot 
more " ta-da " structures , and , among the best , are recent graduates of 
kindergarten . ( Laughter ) And it 's pretty amazing . As Peter tells us , not 
only do they produce the tallest structures , but they 're the most interesting 
structures of them all . So the question you want to ask is : How come ? Why ? 
What is it about them ? And Peter likes to say that , " None of the kids spend 
any time trying to be CEO of Spaghetti Inc. " Right . They do n't spend time 
jockeying for power . But there 's another reason as well . And the reason is 
that business students are trained to find the single right plan , right . And 
then they execute on it . And then what happens is , when they put the 
marshmallow on the top , they run out of time , and what happens ? It 's a 
crisis . Sound familiar ? Right . What kindergarteners do differently , is that 
they start with the marshmallow , and they build prototypes , successive 
prototypes , always keeping the marshmallow on top , so they have multiple 
times to fix ill built prototypes along the way . So designers recognize
this type of collaboration as the essence of the iterative process . And with 
each version , kids get instant feedback about what works and what does n't 
work . So the capacity to play in prototype is really essential , but let 's 
look at how different teams perform . So the average for most people is around 
20 inches , business schools students , about half of that , lawyers , a little 
better , but not much better than that , kindergarteners , better than most 
adults . Who does the very best ? Architects and engineers , thankfully . ( 
Laughter ) 39 inches is the tallest structure I 've seen . And why is it ? 
Because they understand triangles and self-re-enforcing geometrical patterns 
are the key to building stable structures . So CEOs , a little bit better than 
average . But here 's where it gets interesting . If you put you put an 
executive admin . on the team , they get significantly better . ( Laughter ) It 
's incredible . You know , you look around , you go , " Oh , that team 
's going to win . " You can just tell beforehand . And why is that ? Because 
they have special skills of facilitation . They manage the process , they 
understand the process . And any team who manages and pays a close attention to 
work will significantly improve the team 's performance . Specialized skills 
and facilitation skills are the combination [ that ] leads to strong success . 
If you have 10 teams that typically perform , you 'll get maybe six or so that 
have standing structures . And I tried something interesting . I thought , let 
's up the ante once . So I offered a 10,000 dollar prize of software to the 
winning team . So what do you think happened to these design students ? What 
was the result ? Here 's what happened . Not one team had a standing structure 
. If anyone had built , say , a one inch structure , they could have taken home 
the prize . So , is n't it interesting that high stakes have a strong impact . 
We did the exercise again with the same students . What do you 
think happened then ? So now they understand the value of prototyping . So 
the same team went from being the very worst to being among the very best . 
They produced the tallest structures in the least amount of time . So there 's 
deep lessons for us about the nature of incentives and success . So , you might 
ask : Why would anyone actually spend time writing a marshmallow challenge ? 
And the reason is , I help create digital tools and processes to help teams 
build cars and video games and visual effects . And what the marshmallow 
challenge does is it helps them identify the hidden assumptions . Because , 
frankly , every project has its own marshmallow , does n't it . The challenge 
provides a shared experience , a common language , common stance to build the 
right prototype . And so , this is the value of the experience , of this so 
simple exercise . And those of you who are interested , may want to go to 
marshmallowchallenge . com . It 's a blog that you can look at how to build the 
marshmallows . There 's step-by-step instructions on this . There are crazy 
examples from around the world of how people tweak and adjust the system . 
There 's world records on this as well . And the fundamental lesson , I believe 
, is that design truly is a contact sport . It demands that we bring all of our 
senses to the task , and that we apply the very best of our thinking , our 
feeling and our doing to the challenge that we have at hand . And , sometimes , 
a little prototype of this experience is all that it takes to turn us from an " 
uh-oh " moment to a " ta-da " moment . And that can make a big difference . 
Thank you very much . ( Applause ) 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/176ted_tom_wujec_on_3_ways_the_brain_creates_meaning.txt
----------------------------------------------------------------------
diff --git 
a/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/176ted_tom_wujec_on_3_ways_the_brain_creates_meaning.txt
 
b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/176ted_tom_wujec_on_3_ways_the_brain_creates_meaning.txt
new file mode 100644
index 0000000..44a13c7
--- /dev/null
+++ 
b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/176ted_tom_wujec_on_3_ways_the_brain_creates_meaning.txt
@@ -0,0 +1,2 @@
+
+Last year at TED we aimed to try to clarify the overwhelming complexity and 
richness that we experience at the conference in a project called Big Viz . And 
the Big Viz is a collection of 650 sketches that were made by two visual 
artists . David Sibbet from The Grove , and Kevin Richards from Autodesk made 
650 sketches that strive to capture the essence of each presenter 's ideas . 
And the consensus was , it really worked . These sketches brought to life the 
key ideas , the portraits , the magic moments that we all experienced last year 
. This year we were thinking , " Why does it work ? " What is it about 
animation , graphics , illustrations , that create meaning ? And this is an 
important question to ask and answer because the more we understand how the 
brain creates meaning , the better we can communicate , and I also think , the 
better we can think and collaborate together . So this year we 're going to 
visualize how the brain visualizes . Cognitive psychologists now tell us that
the brain does n't actually see the world as it is , but instead , creates a 
series of mental models through a collection of " Ah-ha moments , " or moments 
of discovery , through various processes . The processing , of course , begins 
with the eyes . Light enters , hits the back of the retina , and is circulated 
, most of which is streamed to the very back of the brain , at the primary 
visual cortex . And primary visual cortex sees just simple geometry , just the 
simplest of shapes . But it also acts like a kind of relay station that 
re-radiates and redirects information to many other parts of the brain . As 
many as 30 other parts that selectively make more sense , create more meaning 
through the kind of " Ah-ha " experiences . We 're only going to talk about 
three of them . So the first one is called the ventral stream . It 's on this 
side of the brain . And this is the part of the brain that will recognize what 
something is . It 's the " what " detector . Look at a hand . Look at
a remote control . Chair . Book . So that 's the part of the brain that is 
activated when you give a word to something . A second part of the brain is 
called the dorsal stream . And what it does is locates the object in physical 
body space . So if you look around the stage here you 'll create a kind of 
mental map of the stage . And if you closed your eyes you 'd be able to 
mentally navigate it . You 'd be activating the dorsal stream if you did that . 
The third part that I 'd like to talk about is the limbic system . And this is 
deep inside of the brain . It 's very old , evolutionarily . And it 's the part 
that feels . It 's the kind of gut center , where you see an image and you go , 
" Oh ! I have a strong or emotional reaction to whatever I 'm seeing . " So the 
combination of these processing centers help us make meaning in very different 
ways . So what can we learn about this ? How can we apply this insight ? Well , 
again , the schematic view is that the eye visually interrogates 
what we look at . The brain processes this in parallel , the figments of 
information asking a whole bunch of questions to create a unified mental model 
. So , for example , when you look at this image a good graphic invites the eye 
to dart around , to selectively create a visual logic . So the act of engaging 
, and looking at the image creates the meaning . It 's the selective logic . 
Now we 've augmented this and spatialized this information . Many of you may 
remember the magic wall that we built in conjunction with Perceptive Pixel 
where we quite literally create an infinite wall . And so we can compare and 
contrast the big ideas . So the act of engaging and creating interactive 
imagery enriches meaning . It activates a different part of the brain . And 
then the limbic system is activated when we see motion , when we see color . 
and there are primary shapes and pattern detectors that we 've heard about 
before . So the point of this is what ? We make meaning by seeing , by an act 
of visual interrogation . The lessons for us are three-fold . First , use 
images to clarify what we 're trying to communicate . Secondly make those 
images interactive so that we engage much more fully . And the third is to 
augment memory by creating a visual persistence . These are techniques that can 
be used to be -- that can be applied in a wide range of problem solving . So 
the low-tech version looks like this . And , by the way , this is the way in 
which we develop and formulate strategy within Autodesk , in some of our 
organizations and some of our divisions . What we literally do is have the 
teams draw out the entire strategic plan on one giant wall . And it 's very 
powerful because everyone gets to see everything else . There 's always a room 
, always a place to be able to make sense of all of the components in the 
strategic plan . This is a time-lapse view of it . You can ask the question , " 
Who 's the boss ? " You 'll be able to figure that out . So the act of 
collectively and collaboratively building the image transforms the collaboration . No 
Powerpoint is used in two days. But instead the entire team creates a shared 
mental model that they can all agree on and move forward on . And this can be 
enhanced and augmented with some emerging digital technology . And this is our 
great unveiling for today . And this is an emerging set of technologies that 
use large-screen displays with intelligent calculation in the background to 
make the invisible visible . Here what we can do is look at sustainability , 
quite literally . So a team can actually look at all the key components that 
heat the structure and make choices and then see the end result that is 
visualized on this screen . So making images meaningful has three components . 
The first again , is making ideas clear by visualizing them . Secondly , making 
them interactive . And then thirdly , making them persistent . And I believe 
that these three principles can be applied to solving some of the very tough 
problems that we face in the world today . Thanks so much . ( Applause ) 
\ No newline at end of file
