Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
But what is Bits Per Character evaluation testing? How does it work?
--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-M17bef63d367de5a6688fe7f4
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
Indeed, if someone removes a dozen character/word types from the training and test 
sets when using Perplexity as the evaluation, they can get a better score. To 
compare fairly on enwik8 you must either compress it losslessly, or train on it in 
full and evaluate on the full test set
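A toy illustration of this (a hypothetical unigram model on made-up data, not any real submission) shows how mapping rare words to a single <unk> token lowers measured perplexity without any real gain in prediction:

```python
import math
from collections import Counter

def unigram_perplexity(tokens):
    # MLE unigram model scored on its own corpus:
    # perplexity = 2 ** (average bits per token).
    counts = Counter(tokens)
    n = len(tokens)
    bits = -sum(c * math.log2(c / n) for c in counts.values())
    return 2 ** (bits / n)

# Toy corpus (made up for illustration): common words plus a few rare ones.
corpus = ("the cat sat on the mat " * 50).split()
corpus += ["axolotl", "quokka", "numbat", "dugong", "kakapo"]

honest = unigram_perplexity(corpus)

# "Cheating": map every rare word to one <unk> token, which merges their
# probability mass and shrinks the entropy, so perplexity drops.
counts = Counter(corpus)
mapped = [w if counts[w] > 1 else "<unk>" for w in corpus]
cheating = unigram_perplexity(mapped)

print(honest > cheating)  # True: the <unk>-mapped score looks better
```

The same trick does not help under lossless compression, because the removed distinctions would still have to be encoded somewhere to reconstruct the file.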
--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-Mad032fa4e019de2c87d66d51
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread Matt Mahoney
On Sun, Mar 22, 2020, 8:57 AM  wrote:

> 1 more question Matt:
> https://openai.com/blog/better-language-models/
> They say "enwik8 - bits per character (–) - OURS: 0.93 - LAST RECORD 0.99"
> But... enwik8 is 100MB! 0.99 bpc alone is nearly 100MB / 8 = 12.5MB. The best
> compression is 14.8MB though. What are they doing here? 0.93 bpc is 11,625,000
> bytes.
>

I don't know where OpenAI got those numbers. They certainly aren't mine.

Also, perplexity is the same as compression. The conversion is
Perplexity = 2^(bits per word).

Some language modelers improve their numbers by removing punctuation and
capitalization, splitting words into a root and suffix, and mapping rare
words outside a 20K word dictionary to a common token. In data compression
we consider that cheating.
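Matt's conversion, plus the adjustment needed to compare word-level and character-level scores, in a quick sketch (the 5.5 characters-per-word average is an assumption for illustration, not a measured figure):

```python
import math

def bits_per_word(perplexity):
    # Perplexity = 2 ** (bits per word), so bits per word = log2(perplexity).
    return math.log2(perplexity)

def perplexity_from_bits(bpw):
    return 2 ** bpw

def bits_per_char(bpw, avg_word_len):
    # To compare a word-level score with a character-level one (like the
    # enwik8 bpc numbers), divide by the average word length in characters,
    # including the separating space.
    return bpw / avg_word_len

bpw = bits_per_word(64.0)       # a perplexity of 64 = 6.0 bits per word
bpc = bits_per_char(bpw, 5.5)   # ~1.09 bits per character at 5.5 chars/word
print(bpw, round(bpc, 2))
```

Any preprocessing that shrinks the vocabulary changes what a "word" is, which is why the two communities' numbers are so hard to compare directly.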


--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-M31d8f5c1210dc5cd98b32d87
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Symbolic AI: Concepts

2020-03-22 Thread Mike Archbold
Go Dawgs!

I played collision football for 9 seasons, probably 70 games and three times
as many practices, suffered concussions and kept playing.  One time the world
turned purple. It's amazing the amount of trauma the brain can take
and still work. There's a limit, though. College football is cracking
down on targeting, and it's a good thing.




On 3/22/20, A.T. Murray  wrote:
> When I addressed the Board of Regents of the University of Washington on
> December 12, 2019 -- see
> https://www.washington.edu/regents/minutes/meeting-minutes-for-2019 -- I
> told them that I was protesting against Husky Football brain injuries
> because I had spent my adult life studying the human brain and working as
> an independent scholar in artificial intelligence (AI). [...]

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T1632b2db76dafcd5-Maac4659d70b7ee56354b3871
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
Thanks JB. So my refined conclusion is that Perplexity is worse than Lossless 
Compression, because Lossless Compression forces you to learn online, etc., which 
was amazing for me to code. The Perplexity test dataset is fine if it is genuinely 
different, but it can still be quite similar to the training data in places. And 
LC understands the very data it compresses; Perplexity works on held-out datasets 
but loses relatedness if the test data is too different, which is bad even if only 
a bit different. And if we want to train on 98% of the internet's data, we need 
test data not duplicated in the training data.

On Sunday, March 22, 2020, at 11:06 AM, Alan Grimes wrote:
> have to say about being
disassembled and re-formed as described?
We already are changing, morphing, dying, and transforming. Ageing does that to 
you. Bending over does it to you - your body deforms and isn't a perfect statue 
of metal. You lose neurons and gain new wisdom. You are ever growing. All 
change is death and birth. We fear not death, but specific fates. We long for a 
specific death lol. You'll be mad at me about this, but that's your evolutionary 
survival instinct on alert. Technically speaking, all change is death and birth. 
Prove me wrong.


--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-Mf19bd92715c66f9975476c92
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread Alan Grimes via AGI
immortal.discover...@gmail.com wrote:
>
> It's [imperative] you understand that all AI find/create patterns
> because it lets them solve unseen problems the programmer never set
> them to answer. And all of physics has patterns. The reason Earth will
> become a fractal pattern of nanobot units terraformed and completely
> organized and unified is that being cooperative is paramount: using
> distributed government and knowing where / when / what everything is in
> your home base and mind leads to a higher probability of prediction and
> survival - that's why Earth will be a fully organized fractal of
> metaloid superorganisms.

So what did the stupid smelly squishy humans have to say about being
disassembled and re-formed as described?

-- 
Clowns feed off of funny money;
Funny money comes from the FED
so NO FED -> NO CLOWNS!!! 

Powers are not rights.


--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-M0f5518396b0d3971b71fa8fe
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread James Bowery
Marcus Hutter implicitly addresses perplexity in this Hutter Prize FAQ
entry:

http://www.hutter1.net/prize/hfaq.htm#xvalid
Why aren't cross-validation or train/test-set used for evaluation?

A common
way of evaluating machine learning algorithms is to split the data into a
training set and a test set, learn e.g. the parameters of a Neural Network
(NN) on the train set and evaluate its performance on the test set. While
this method, and similarly its extension to cross-validation, can work in
practice, it is not a fool-proof method for evaluation: In the training
phase, the algorithm could *somehow* manage to "effectively" store the
information contained in the test set and use it to predict the test set
without the desired generalization capability. This can happen in a number
of ways:

   1. The test set could be very *similar* or in the extreme case identical
   to the train set, so even without access to the test set, the algorithm has
   effectively access to the information in the test set via the train set.
   For instance, if you downloaded all images from the internet and randomly
   split them into train and test set, most images would be in both sets,
   since most images appear multiple times online. Similarly if you download
   all text. Admittedly, Wikipedia should be less prone to repetition, since
   it is curated.
   2. The algorithm could *accidentally* contain test set information,
   though statistically this is very unlikely, and would only be a problem if
   HKCP received an enormous number of submissions, or contestants optimize
   their algorithms based on test-set performance.
   3. The contestant could *cheat* and simply hide the test set in the
   algorithm itself. This could be circumvented by keeping the test set
   secret, but one could never be sure whether it has leaked, a grain of doubt
   will always remain, and even if not, ...
   4. if the test set is taken from a *public* source like Wikipedia, a
   gargantuan NN could be trained on all of Wikipedia or the whole Internet.
   Limiting the size of the decompression algorithm can prevent this. Indeed
   this is the spirit of the used compression metric.

On the other hand, including the size of the decompressor rules out many
SOTA batch NNs, which are often huge; but maybe they only *appear* better
than HKCP records, due to some of (1)-(4). The solution is to train online,
or to go to larger corpora that are a more comprehensive sample of human
knowledge.
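FAQ point 1 can be sketched with a toy corpus (made-up filenames and counts, purely for illustration) in which popular items repeat many times, so a random train/test split leaks most of the test set into training:

```python
import random

# Toy "scraped" corpus where popular items repeat many times.
corpus = ["sunset.jpg"] * 40 + ["meme.png"] * 40
corpus += [f"unique_{i}.jpg" for i in range(20)]

random.seed(0)
random.shuffle(corpus)
train, test = corpus[:80], corpus[80:]

# Count test items that also occur in the training split.
train_set = set(train)
leaked = sum(item in train_set for item in test)
print(f"{leaked}/{len(test)} test items leak from the training set")
```

A model evaluated this way can look like it generalizes while mostly just recalling duplicates, which is exactly the failure mode the compression metric avoids.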





On Sun, Mar 22, 2020 at 6:46 AM  wrote:

> Also see the below link about Perplexity Evaluation for AI! As I said,
> Lossless Compression evaluation in the Hutter Prize is *the best* and see
> it really is the same thing, prediction accuracy. Except it allows errors.
>
> https://planspace.org/2013/09/23/perplexity-what-it-is-and-what-yours-is/
>
> https://www.youtube.com/watch?v=BAN3NB_SNHY
>
> [...]
>
> So which is better? I'm not sure now. Perplexity, or Lossless Compression?

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-M3e2aee20c3e5fca760ef4fcc
Delivery options: https://agi.topicbox.com/groups/agi/subscription


[agi] Symbolic AI: Concepts

2020-03-22 Thread A.T. Murray
When I addressed the Board of Regents of the University of Washington on
December 12, 2019 -- see
https://www.washington.edu/regents/minutes/meeting-minutes-for-2019 -- I
told them that I was protesting against Husky Football brain injuries
because I had spent my adult life studying the human brain and working as
an independent scholar in artificial intelligence (AI). The chairman of the
Board of Regents was at first quite talkative, interrupting me once to
state that he had graduated from our alma mater one year ahead of me, and
interrupting again to admit that he, too, had served in the U.S. Army after
graduation. But he made no further comments as I revealed that I was
loading up the Guest Workstations downstairs in the library with my various
AI programs that think in English and Latin and Russian, so that passers-by
might see the Symbolic AI in action and stop to inspect the software. Those
demos went on for a couple more months, until I made the mistake of
commandeering all seven workstations in a cluster with my AI, and someone
complained, and I was forbidden to load my software onto even a single
workstation.

The software is no good, anyway. It can think and reason with concepts, but
just barely. My main justification for working further on the hopeless
Symbolic AI project is that it gives me something to do in my twilight
years, my Goetterdaemmerung, in the foolhardy hope that somebody smarter
than me and more resourceful than me might take over a branch, a fork, of
the open-source AI project and produce a better thinking Mind and a Truer
AI than my feeble efforts have yielded. So therefore, Friends, Regents,
Countrymen, Yours Truly took the step recently of adding a Live Traffic
Report widget onto about eighty-eight AI webpages so as to see the various
worldwide places from which Netizens were making not the pilgrimage but
perhaps the boredom-driven homage not to Catalonia but to Mentifex AI. And
the results were astounding, Oh Force Majeure, Oh Avant Garde of the AGI
Anabasis. A secondary page of the Live Traffic Report was showing a
Mercator Projection of the known world with little bombshells going off in
real time when an eager AI aficionado landed on one of the 88 AI webpages.
Many visits have been from Russia and from former Soviet Republics of the
former USSR, since your AGI-nogoodnik has been posting his links in the
gotai.net Russian-language AI forum.

Assembled as we are on the most prestigious AGI Mail-List of cyberspace and
noosphere coordinates, we contemplate Symbolic AI as based on words of
natural language serving as symbols representing concepts. In our lousy, no
good, almost worthless AI software that we hope to fob off onto the better
minds and deeper pockets, a word coming into each AI Mind is either
recognized as a known concept or not recognized and therefore treated as a
New Concept.

http://ai.neocities.org/OldConcept.html -- handles old concepts.

http://ai.neocities.org/NewConcept.html -- creates new concepts.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T1632b2db76dafcd5-M87c16b3a7db3f7c7a8042512
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
1 more question Matt:
https://openai.com/blog/better-language-models/
They say "enwik8 - bits per character (–) - OURS: 0.93 - LAST RECORD 0.99"
But... enwik8 is 100MB! 0.99 bpc alone is nearly 100MB / 8 = 12.5MB. The best 
compression is 14.8MB though. What are they doing here? 0.93 bpc is 11,625,000 
bytes.
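The arithmetic here is just bits-per-character times the character count, divided by 8. A quick sketch (the 1.18 bpc figure below is simply the rate implied by a ~14.8 MB result on enwik8; note that language-model bpc numbers typically exclude the model's own size, unlike compressor records):

```python
def bpc_to_bytes(bits_per_char, num_chars):
    # Compressed size implied by a bits-per-character score.
    return bits_per_char * num_chars / 8

ENWIK8 = 100_000_000  # enwik8 is 10**8 bytes

print(bpc_to_bytes(0.99, ENWIK8))  # 12,375,000 bytes (~12.4 MB)
print(bpc_to_bytes(0.93, ENWIK8))  # 11,625,000 bytes
print(bpc_to_bytes(1.18, ENWIK8))  # 14,750,000 bytes, about the ~14.8 MB
                                   # of the best self-contained compressors
```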
--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-M39bf273e3e3f871053b98087
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
Another question Matt.
I see you have a large page on compressors, but I don't see a comparison of 
compressors' top-n predictions (generated text not in enwik8). This would show 
how nicely the better compressors generate realistic text like that in the 
dataset. I know the output probably isn't as good as GPT-2's, but can't you show 
us the offroad prediction ability of the compressors that got enwik8 down to 
50MB, 40MB, 30MB, and so on, and see how much better they ramble?
--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-Mcf2d8c17f681b2b77bf9fe33
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread stefan.reich.maker.of.eye via AGI
> Data compression won't solve AGI. 

I thought you had tried to convince us otherwise earlier... :)

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-M646707e88d06bae5f550dbc7
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
Ah, you can train a net on one dataset and then test on a different dataset, 
but you can never be sure the test dataset is on topic - it has to be 
different, lol! With Lossless Compression evaluation, the predictor also 
predicts the next token and we store the accuracy error, but it is on the same 
dataset, meaning it can fully understand that dataset, and it is safe because 
we include the code size and compressed error size and make sure the 
compression is the most one can get. Speed matters too. And working memory 
size. Cus brute force would work but is slowest
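The scoring idea above - count the decompressor's code size plus the compressed output - can be sketched with zlib standing in for a real predictor (an assumption purely for illustration; Hutter Prize entries use far stronger models):

```python
import zlib

data = b"The quick brown fox jumps over the lazy dog. " * 400

# Source of a minimal decompressor program; its length counts too.
decompressor_src = (
    b"import sys,zlib;"
    b"sys.stdout.buffer.write(zlib.decompress(open(sys.argv[1],'rb').read()))"
)

compressed = zlib.compress(data, 9)
assert zlib.decompress(compressed) == data  # must be lossless

honest_score = len(decompressor_src) + len(compressed)
copy_score = len(decompressor_src) + len(data)  # "model" that stores the data
print(honest_score < copy_score)  # True: just copying the data gains nothing
```

Because the program size is part of the score, memorizing the data verbatim cannot win, which is the safety property being described here.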

Since both evaluations test the predictor's accuracy and know the right symbol 
to predict, we see the error, but we can't know the best compression/accuracy 
possible, so the contest will never stop. With Perplexity this is true too, I 
think: it gets e.g. 90% of letters or words predicted exactly, but how many can 
it get right? 100%? Maybe if the training dataset is large enough it will do 
better, but that doesn't mean it understands as much. With compression you can 
do better the bigger the dataset, but you can at least keep the size static and 
focus on compression, aka understanding the data better. I guess with 
Perplexity you too can keep your training set static. So ya, both can keep the 
dataset the same size and improve prediction toward an unknown limit.

My conclusion is that Perplexity isn't focused on the very dataset it is 
digesting, but on a different "test" dataset, which is bad. Right Matt?
--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-M7f38f959969b1087b0d8cde5
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
Also see the below links about Perplexity Evaluation for AI! As I said, Lossless 
Compression evaluation in the Hutter Prize is *the best*, and see, it really is 
the same thing: prediction accuracy. Except Perplexity allows errors.

https://planspace.org/2013/09/23/perplexity-what-it-is-and-what-yours-is/

https://www.youtube.com/watch?v=BAN3NB_SNHY

Hmm. I assume they take words or sentences and check if the prediction is 
close/exact, then carry on. With lossless compression, it stores the 
arithmetically encoded decimal of the probability, and the resulting file size 
shows the probability error for the whole file, no matter whether your predictor 
did poorly on some parts or not - just like Perplexity. However, they don't 
consider the neural network's size; it could just copy the data. That's why they 
use a test set after/during training. The goal is the same though: make a good 
neural network predictor. The test set and compression are also a lot alike: 
both are seeing how well the model understands the data without copying the data 
directly.
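The point about arithmetic coding can be made concrete: an ideal arithmetic coder's output size is just the predictor's summed log-loss over the file. A sketch with a stand-in unigram predictor on made-up text (a real compressor updates its model online while coding):

```python
import math
from collections import Counter

def ideal_compressed_bits(text, prob):
    # An ideal arithmetic coder spends -log2 p(symbol) bits per symbol,
    # so the total file size is the predictor's summed log-loss.
    return sum(-math.log2(prob[ch]) for ch in text)

text = "abracadabra" * 100

# Stand-in predictor: a fixed unigram model fit on the text itself.
counts = Counter(text)
prob = {c: n / len(text) for c, n in counts.items()}

bits = ideal_compressed_bits(text, prob)
print(bits / 8, "bytes vs", len(text), "bytes raw")  # roughly 281 vs 1100
```

A better predictor assigns higher probability to what actually occurs, so the same formula directly yields a smaller file: prediction accuracy and compressed size are two views of one number.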

So which is better? I'm not sure now. Perplexity, or Lossless Compression?
--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-Mdd8c32dae7701a14c4a1485d
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
That's spot on JB, a 3D physics engine would help predict movies. But we must 
work with larger "objects" where we can, not atoms. Still works! All the physics 
sim would lack is the fact that the cat, seeing the mouse, is likely going TO 
jump on it, before actually springing off the floor. So you still need a text 
predictor, I think. Text (well, vision for this type) can also predict the 
cat's landing, I think. So no need for a real sim? Vision can predict at all 
levels, modeling the light only! Knowing what comes next in a movie is key, 
having a rulebase memory prior, and therefore I think an actual physics sim is 
unneeded and too precise. Look at human brains: no sim built in, yet they can 
3D-sim! And use it in the real world as your plan of action.
--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-M75ad74ee02ff3a141d37cfac
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] The limitations of the validity of compression.

2020-03-22 Thread immortal . discoveries
With image compression, and building an understanding of the world first and 
then adding language on top: humans regularly don't talk about microscopic 
levels or the walls of a bread bag etc., we say "bread" - we learn to segment 
objects in vision. Vision has noise in images and there's never an exact match 
to context like in PPM text predictors, so what we do is recognize different 
breads and group them as one representation for bread, and we can store 
different type-images and specific episodic images as well; then when we see any 
of these breads, it is an exact match, woohoo! Then we can get an entailing 
prediction of a hand approaching the bread. It's like capsule networks, balls 
linking to balls: bread>hand>mouth>chews>swallows>leaves. So image compression 
evaluation would work; the prediction for a TYPE of hand following the previous 
image (or part of the same image) would be high if we see a bread bag or coffee 
mug. You have to let ALL breads think HAND entails - that's the key. Awesome! 
Now, as I said, AGI doesn't at first need a body, text will do, but as for 
building a visual understanding from movies, it simply is only grouping breads 
as one bread "word" as a capsule (the capsule 'bread' is a visual word, 
non-text); what you really need is text, where humans have already groupified it 
for you! Bread - there's your text capsule. Text is just that! And it gives you 
the power to talk about types of bread, or atoms. This building of an 
understanding of the world first is language; all is language... words are part 
of vision really, same thing. As for ignoring low-level talk and talking about 
high-level objects: yes, in text/vision we see/do that, it's efficient, and when 
we need to we can go low-level when dealing with few things, otherwise go 
high-level. So what we learnt here is that text is basically all you need, and 
we learnt how to do image compression.

Vision still seems odd to me btw. I mean, you can imagine a large metal chess 
piece sitting on an underground lava floor with a crimson red glow and a yellow 
strap laid on top with a reflection of it. How? And how would we recognize it 
all if we saw it in real life? It seems these capsules work across different 
lighting, angles, rotations, locations, motion, distortion, colors, sizes, and 
similar-looking objects. You can recognize the chess piece, the dirt on it, the 
lava floor, the whole scene... but what about image generation? You add a large 
chess piece where you want on the lava floor, lay a yellow strap over top of it, 
but what accounts for the hiding of it? The reflections? Etc.? One object, 
being near the other, is transforming it!? Blending their context... I know it 
gets these abilities from seeing lots of movies, but the exact way it works 
boggles me still. Even if human brains don't actually generate such physics 
sims, we can on computers using 2D or 3D vision pixels/voxels. The 3D would be 
rather easy to do just this. But for 2D, I'm stuck. And how useful is it 
anyway? Say you've got a glass of red wine and a bug and a cloth that gets wet, 
the water traveling up the cloth, and also a table for the water to fall off of, 
and you run the 2D sim to get more detail than text can provide - how does 2D 
image generation know when water should fall off the table, hide behind objects, 
and share reflection with the red wine etc.? As for the usefulness of this, it 
is far more detail than text, but maybe not needed; I mean, text can talk about 
it, and physics prediction in text is just text prediction - text can even say 
the Water Reflected the Wine's red color, being near it, or fell behind it and 
disappeared but was waiting there hidden, working on a make-do tool after 
turning into a caveman - oh ya, I can daydream animals morphing like GANs do. 
That morphing can be objectified as a word, "morph", and the actual segmentation 
of cat>tiger>bear>man>moleRat>hamster>slug>worm>pencil>stick can be said as 
words. After all, text is object-ifying vision as capsules, same thing. Just 
less context than vision would show, maybe. As for the words changing each 
other, like in the sentence "red wine fell on a hamster", it blends context 
maybe to get "wine splashed and made a hamster wet". That mirrors what we see; 
our data reflects our visual capsules and their transformations. The hamster 
could be said to become crippled; now the crippled hamster is walking around - 
the "paraplegic", I could call it now instead.
--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2a0cd9d392f9ff94-M3f2e822e256fb12c8edc3a80
Delivery options: https://agi.topicbox.com/groups/agi/subscription