Andrii (lOkadin) Zvorygin <[EMAIL PROTECTED]> wrote:
>> Even if we were able to constrain the grammar, you still have the problem that people will still make ungrammatical statements, misspell words, omit words, and so on.
>Amazing you should mention such valid points against natural languages.

This misses the point. Where are you going to get 1 GB of Lojban text to train your language model? If you require that all text pass through a syntax checker for errors, you will greatly increase the cost of generating your training data. This is not a trivial problem. It is a big part of why programmers can only write 10 lines of code per day on projects 1/1000 the size of a language model. Then when you have built the model, you will still have a system that is intolerant of errors and hard to use. Your language model needs to have a better way to deal with inconsistency than to report errors and make more work for the user.

>Lojban already exceeds many natural languages in its ability to express.

How so? In English I can use mathematical notation understood by others to express complex ideas. I could even invent a new branch of mathematics, introduce appropriate notation, and express ideas in it. There are cognitive limits to what natural language can express, such as the inability to describe a person's face (as well as a picture would), or to describe a novel odor, or to convey learned physical skills such as swimming or riding a bicycle. One could conceivably introduce notation to describe such things in any natural or artificial language, but that does not solve the problem. Your neural circuitry has limits; it allows you to connect a face to a name but not to a description. Any such notation might be usable by machines but not by humans.

-- Matt Mahoney, [EMAIL PROTECTED]

----- Original Message ----
From: Andrii (lOkadin) Zvorygin <[EMAIL PROTECTED]>
To: [email protected]
Sent: Saturday, November 25, 2006 5:01:04 AM
Subject: Re: Re: [agi] Understanding Natural Language

On 11/24/06, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> Andrii (lOkadin) Zvorygin <[EMAIL PROTECTED]> wrote:
> >I personally don't understand why everyone seems to insist on using
> >ambiguous illogical languages to express things when there are viable
> >alternative available.
>
> I think because an AGI needs to communicate in languages that people already know. I don't understand how artificial languages like Lojban contribute to this goal. We should focus our efforts instead on learning and modeling existing languages.
>
> I understand that artificial languages like Lojban and Esperanto and Attempto have simple grammars.
>I don't believe they would stay that way if they were widely used for person to person communication (as opposed to machine interfaces).

Lojban grammar is easily extensible and forwards compatible. You can add features to the language through CMAvo and GISmu. Lojban already exceeds many natural languages in its ability to express. There are very crucial parts of communication that English lacks, such as logical connectives and attitudinals.

>Languages evolve over time, both in individuals, and more slowly in social groups.

Are you implying languages evolve faster in individuals?

>A language model is not a simple set of rules.

A natural language model is not. An artificial language is constructed with rules that were created by individual humans, as opposed to groups of humans. Lojban was especially designed to be logical, unlike Esperanto. That makes such languages recreatable by individual humans and, depending on your definition, "simple".

>It is a probability distribution described by a large set of patterns such as words, word associations, grammatical structures and sentences.

The approach of a world of the blind to seeing is to feel at things. Sometimes they wonder if there is not another way.

>Each time you read or hear a message, the probabilities for the observed patterns are increased a little and new patterns are added.
>In a social setting, these probabilities tend to converge by consensus as this knowledge is shared.

I agree this is a wonderful solution to predicting what the vocabulary of a language group is.

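The update rule quoted just above (probabilities for observed patterns increase a little with each message, and new patterns are added) can be illustrated with a toy model. This is only a rough sketch under loud assumptions: it tracks single words rather than word associations or grammatical structures, it uses add-one smoothing as a stand-in for whatever a real model would do, and the class name AdaptiveUnigramModel is invented here for illustration, not taken from anyone's system in this thread.

```python
from collections import Counter


class AdaptiveUnigramModel:
    """Toy word-frequency model that is updated with every message it sees.

    Assumption: "patterns" are reduced to lowercase, whitespace-separated
    words. A real language model would track far richer patterns.
    """

    def __init__(self):
        self.counts = Counter()  # word -> number of times observed
        self.total = 0           # total number of words observed

    def observe(self, message):
        """Raise the counts of the words in a message; new words are added."""
        words = message.lower().split()
        self.counts.update(words)
        self.total += len(words)

    def probability(self, word):
        """Add-one smoothed estimate, with one extra slot for unseen words."""
        vocab_size = len(self.counts) + 1
        return (self.counts[word.lower()] + 1) / (self.total + vocab_size)


if __name__ == "__main__":
    model = AdaptiveUnigramModel()
    model.observe("the cat sat")
    print(model.probability("cat"))
    model.observe("cat")              # seeing the word again raises its estimate
    print(model.probability("cat"))
    print(model.probability("dog"))   # never seen, but still nonzero
```

Because of the smoothing, a misspelled or previously unseen word gets a small probability instead of being rejected outright, which is the soft-constraint behaviour the quoted text is arguing for.
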
>Formal definitions of artificial languages do not capture this type of knowledge, the thousands or millions of new words, idioms, shared knowledge and habits of usage.

sa'u (simply speaking) Artificial languages lack a historic/cultural user base. Do I even need to reply to that? zo'o.ui.u'i (last statement said humorously, while happy in an amused kind of way)

>
> Even if we were able to constrain the grammar, you still have the problem that people will still make ungrammatical statements, misspell words, omit words, and so on.

Amazing you should mention such valid points against natural languages.

* ungrammatical statements: If they were ungrammatical they wouldn't parse in the universal Lojban parser (all Lojban parsers can be universal Lojban parsers as long as they follow the few simple grammar rules).
* misspell words: In Lojban words have a very strict formation, mu'a (for example): GISmu are in either CCVCV or CVCCV formation; all others are also syntactically unambiguous. Additionally, words in Lojban are specifically designed not to sound similar to each other, so chances are a word still looks/sounds just like the original even when misspelled. If a parse error occurs (rare for Lojban users, usually typos) the user can always be notified.
* omit words: I gave an example of some GISmu before; basically they have predefined places, so you can always ask a specific question about omitted information by simply putting a "ma" for the SUMti (argument) which you wish to know, or "mo" for the SELbri (function).
* and so on.

>A language model must be equipped to deal with this.

go'i.ui (repetition of your statement as confirmation and happiness)

>It means evaluating lots of soft constraints from a huge database for error correction, just like we do to resolve ambiguity in natural language.

If "It" can be substituted as "Resolving ambiguity in natural languages" OR (logical connective) "Resolve ambiguity in ambiguous languages", I agree.

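The word-shape rule mentioned earlier (GISmu are in either CCVCV or CVCCV formation) is easy to check mechanically. The following is a minimal sketch, not a Lojban parser: it only tests the consonant/vowel pattern and ignores the further restrictions real GISmu have on which consonant pairs are allowed. The function names shape and has_gismu_shape are made up for this illustration.

```python
# Letters used in gismu. Lojban does not use h, q or w, and the letter y
# does not occur in gismu, so all of those are treated as "other" here.
VOWELS = set("aeiou")
CONSONANTS = set("bcdfgjklmnprstvxz")


def shape(word):
    """Return the consonant/vowel pattern of a word, e.g. 'klama' -> 'CCVCV'.

    Characters that are neither a listed vowel nor a listed consonant
    are marked with '?'.
    """
    pattern = []
    for ch in word.lower():
        if ch in VOWELS:
            pattern.append("V")
        elif ch in CONSONANTS:
            pattern.append("C")
        else:
            pattern.append("?")
    return "".join(pattern)


def has_gismu_shape(word):
    """True if the word has one of the two five-letter gismu patterns.

    Assumption: only the CCVCV / CVCCV shape is checked, not whether the
    consonant clusters are permissible or whether the word is in the
    official gismu list.
    """
    return shape(word) in {"CCVCV", "CVCCV"}


if __name__ == "__main__":
    for candidate in ["klama", "gismu", "klam", "hello"]:
        print(candidate, shape(candidate), has_gismu_shape(candidate))
```

A check like this can flag a typo that breaks the word shape, but it cannot tell which word was intended; that part still needs something like the soft constraints discussed above.
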
> -- Matt Mahoney, [EMAIL PROTECTED]

mu'omi'eLOKadin (Over to you, my name is Lokadin.)

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?list_id=303
