At 4:57 +1000 26/7/16, Andy Farkas wrote:
>A really nice and simple overview of machine learning:
>https://medium.com/@ageitgey/machine-learning-is-fun-80ea3ec3c471#.pigplooz9

To the sceptics among us, it provides a nice, easy way to develop the critique 
that machine learning badly needs.

For example:

>"Machine learning" is an umbrella term covering lots of ... generic algorithms 
>[that can tell you something interesting about a set of data without you 
>having to write any custom code specific to the problem]

This raises such questions as:
-   how big is the set?
-   what parts of the real world does the data correspond to sufficiently
    accurately to be useful?
-   how is the choice made as to which algorithm to use in which context?
-   how are the size and content of a sufficient training-set determined?
-   how are 'answers' tested against the real world?
-   how is an audit performed?


>In supervised learning, you are letting the computer work out that 
>relationship for you. And once you know what math was required to solve this 
>specific set of problems, you could answer to any other problem of the same 
>type!

If that meant "you can generate *an* answer", all would be well.

But the presumption is far too easily made that it's *the* answer.


> ... unsupervised learning is becoming increasingly important as the 
> algorithms get better because it can be used without having to label the data 
> with the correct answer

Oh dear, now the risk of blind presumption of 'truth' is no longer just a 
sceptic's fantasy.  Geitgey now believes his own mythology.

It's particularly hilarious given that his chosen example is property 
valuation, and 'value' is highly context-dependent and not a topic to which any 
notion of 'truth' / 'correct answer' applies.

And his aside addressed to sceptics like me missed the point entirely:
>Side note for pedants: There are lots of other types of machine learning 
>algorithms. But this is a pretty good place to start.


>Of course if you are reading this 50 years in the future and we've figured out 
>the algorithm for Strong AI, then this whole post will all seem a little 
>quaint.

It's quaint, but not for the reason he thinks.  The quaintness lies in the 
naive belief that there's such a thing as "the algorithm for Strong AI".


> ... you've just written a function that you don't really understand but that 
> you can prove will work

The notion that you can 'prove that something doesn't work' is tenable. 

But the notion that you can 'prove that something *does* work' is delusional.

What works on any particular training-set (no matter how large) may or may not 
work for the next instance.
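
A minimal sketch of that point, in Python, under invented assumptions: the 
`true_price` function and every number in it are hypothetical, not drawn from 
Geitgey's article.  A straight line fitted to a narrow range of house sizes 
looks accurate on the training-set, then misses badly on an instance from 
outside that range.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def true_price(size):
    # Hypothetical 'real world': invented for illustration only.
    return 1000.0 * size + 5.0 * size ** 2


def relative_error(predicted, actual):
    return abs(predicted - actual) / actual


# Training-set: house sizes 80..120 only (a narrow slice of the world).
train_sizes = list(range(80, 121, 5))
train_prices = [true_price(s) for s in train_sizes]

a, b = fit_line(train_sizes, train_prices)


def predict(size):
    return a + b * size


# Worst error on the data the line was fitted to.
train_max_err = max(
    relative_error(predict(s), true_price(s)) for s in train_sizes
)

# Error on the 'next instance', which lies outside the training range.
new_size = 400
new_err = relative_error(predict(new_size), true_price(new_size))

print(f"worst training error: {train_max_err:.1%}")
print(f"error on new instance: {new_err:.1%}")
```

Under these assumptions the worst training error is about 1%, and the error 
on the new instance is over 35%.  Nothing in the training-set signals that.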

And his very next example suggests that the algorithm that supports guesses 
about a house's market-value (a basically harmless application) is the one you 
want to embed in your autonomous car (an essentially dangerous application).


>Then you are using that equation to guess the sales price of houses you've 
>never seen before based where that house would appear on your line. It's a 
>really powerful idea and you can solve "real" problems with it.

The leap from 'guess' to 'solve' is breathtaking.


> ... it's important to remember that machine learning only works if the 
> problem is actually solvable with the data that you have

Ah, could he be about to become wise?


>For example, if you build a model that predicts home prices based on the type 
>of potted plants in each house, it's never going to work. There just isn't any 
>kind of relationship between the potted plants in each house and the home's 
>sale price. So no matter how hard it tries, the computer can never deduce a 
>relationship between the two.

But he's completely missed the point that:
(a)  there are many, many circumstances in which correlations can be found
(b)  the scheme implicitly confuses correlation with causality ("predicts")
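
Point (a) can be sketched in a few lines of Python.  The data here is pure 
random noise, and the sizes (10 houses, 2000 candidate features) are 
arbitrary assumptions chosen for illustration.  Even so, searching enough 
meaningless features against a small sample will very likely turn up one that 
correlates strongly with 'price'.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


random.seed(0)  # fixed seed so the run is repeatable

N_HOUSES = 10      # small sample (assumed for illustration)
N_FEATURES = 2000  # many candidate 'potted plant' style features

# 'Prices' and every feature are independent random numbers:
# by construction there is no real relationship to find.
prices = [random.random() for _ in range(N_HOUSES)]
features = [
    [random.random() for _ in range(N_HOUSES)]
    for _ in range(N_FEATURES)
]

# Pick whichever meaningless feature happens to correlate best.
best_r = max(abs(pearson(f, prices)) for f in features)

print(f"strongest correlation found in pure noise: {best_r:.2f}")
```

A correlation found this way 'predicts' nothing about the next house, and 
says nothing at all about cause.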


>In my mind, the biggest problem with machine learning right now is that it 
>mostly lives in the world of academia and commercial research groups

And he's got that arse-about as well.  Keeping AI, sub-sets like ML, and naive 
people like him, in laboratories is the best thing we can do.


Nope, I haven't yet read Parts 2-4, but there are some rather more useful 
things I need to do this morning than deconstruct a silly belief system.

</tetchy>


-- 
Roger Clarke                                 http://www.rogerclarke.com/
                                    
Xamax Consultancy Pty Ltd      78 Sidaway St, Chapman ACT 2611 AUSTRALIA
Tel: +61 2 6288 6916                        http://about.me/roger.clarke
mailto:[email protected]                http://www.xamax.com.au/

Visiting Professor in the Faculty of Law            University of N.S.W.
Visiting Professor in Computer Science    Australian National University
_______________________________________________
Link mailing list
[email protected]
http://mailman.anu.edu.au/mailman/listinfo/link
