There are 5 messages in this issue.

Topics in this digest:

1.1. Re: Lexicon (proportion and quantity)    
    From: BPJ

2a. Re: Is there a word for this?    
    From: Jeff Sheets
2b. Re: Is there a word for this?    
    From: Gary Shannon

3a. NLP class (was RE: Is there a word for this?)    
    From: Mathieu Roy
3b. Re: NLP class (was RE: Is there a word for this?)    
    From: Nikolay Ivankov


Messages
________________________________________________________________________
1.1. Re: Lexicon (proportion and quantity)
    Posted by: "BPJ" [email protected] 
    Date: Sat Jan 26, 2013 12:19 pm ((PST))

On 2013-01-26 00:17, Alex Fink wrote:
>> Jörg << It is IMHO useful to keep a thematic dictionary of
>> your conlang where words are sorted according to fields of
>> discourse; this shows better than an alphabetically sorted
>> dictionary which fields are already well-covered and which
>> need more work.>>
>>> I agree. In fact, my conlang will be both at the same time.
>>> For example all words that are animal will start with the
>>> same letter, etc.
> And I urge caution here as well.  This line of thinking is
> common enough (especially in the Wilkins era, but also in
> things like Solresol in less straightforward form) that the
> body plan of language it yields has a name: it's a _taxonomic
> language_.  But taxonomic languages have a significant flaw in
> usability.  They give the most similar words to the most
> similar meanings, and that means that if I accidentally mishear
> you a little bit, it will often not be obvious that the word
> that got changed was an accident -- instead it will be far more
> likely to make a significant difference to the meaning,
> undetected!

It deserves being pointed out with some emphasis that
a taxonomically organized vocabulary database (which may
be of the dead-tree variety or whatever, not necessarily
maintained with a so-called database application), as
Jörg described, isn't the same as a taxonomically organized
morphology.  A Roget-style thesaurus is a taxonomically
organized vocabulary database, but the vocabulary therein
can come from any kind of language, natlang or conlang.
Roget was inspired by the taxonomic languages of Wilkins
et alii, but the idea of a taxonomically organized vocabulary
database transcends the taxonomic language, thankfully.
So at least some good came of those languages.

A good read if you're into naturalistic conlangs is
(still) Buck, C.D., 1949. A Dictionary of Selected
Synonyms in the Principal Indo-European Languages: A
Contribution to the History of Ideas, University of
Chicago Press, although Buck had his biases of
selection.

On 2013-01-26 16:06, Jörg Rhiemeier wrote:
> Correct - taxonomic languages tend to have an awfully low level
> of redundancy: a mishearing, especially one near the end of the
> word (where this is more likely than near the beginning of the
> word), is likely to yield a valid word that even makes sense in
> the context, so the error cannot easily be detected by the result
> being either a word that does not exist or one that exists but
> does not fit the context.

So maybe if one *must* make a taxonomic language one
should order the morphemes with most specific first and
initial stress! ;)
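
To make the redundancy point concrete, here is a minimal Python sketch.
The mini-lexicon is entirely invented for illustration (it is not
Wilkins's scheme or anyone's actual conlang), and the function name
final_mishearings is likewise only an assumption. It lists the valid
words a listener could land on after mishearing just the last segment,
first with the broad class at the front of the word and then with the
segment order reversed, most specific part first.

```python
# Hypothetical taxonomic mini-lexicon (all forms invented for illustration):
# 1st segment = broad class, middle segments = subclass, last = the item.
GENERAL_FIRST = {
    "zita": "dog", "zito": "wolf", "zite": "fox",
    "zipa": "cat", "zipo": "lion",
    "buta": "oak", "buto": "elm",
}

# The same words with the segment order reversed: most specific part first.
SPECIFIC_FIRST = {form[::-1]: gloss for form, gloss in GENERAL_FIRST.items()}


def final_mishearings(word, lexicon):
    """Valid words that differ from `word` only in the final segment."""
    return sorted(
        other for other in lexicon
        if other != word
        and len(other) == len(word)
        and other[:-1] == word[:-1]
    )


if __name__ == "__main__":
    # Class-first order: the slip silently turns "dog" into a near neighbour.
    print([GENERAL_FIRST[w] for w in final_mishearings("zita", GENERAL_FIRST)])
    # Specific-first order: in this toy lexicon the slip hits no valid word.
    print(final_mishearings("atiz", SPECIFIC_FIRST))
```

In a larger specific-first lexicon a final slip could still hit a real
word, but that word would belong to a different broad class, which
context is more likely to expose.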

Swedish shorthand theoretician Hans Karlgren pointed
out that if one is to map the English vowel system onto
a small inventory of vowel signs one ought to take the
Great Vowel Shift into account and e.g. use the same
sign for /ɪ/ and /aɪ/ rather than mapping the sign
for Swedish <i> to both English /ɪ/ and /i/ as
people who wrote English with Swedish Shorthand
usually did, using similar arguments.  I managed
to come up with a much better morphophonemic mapping
than his, taking phoneme cognacy into account *and*
actually having different signs for most GVS pairs,
but that's another story.

/bpj





Messages in this topic (30)
________________________________________________________________________
________________________________________________________________________
2a. Re: Is there a word for this?
    Posted by: "Jeff Sheets" [email protected] 
    Date: Sat Jan 26, 2013 12:24 pm ((PST))

Interesting choices for the names of categories/tags. I'm a bit too used to
Phrase Structure Rules to have figured out that, say, RB means adverb, or that
JJ means adjective. But how you name your tags isn't important so long as the
system is consistent.

Some other links for you. The first is a link to a Coursera online class on
natural language processing, which I think most closely matches your
interest. Might give you an idea of how academics are approaching the
problem today. NLP is without doubt an extremely difficult task for
computers.

https://www.coursera.org/course/nlangp

This next link is a crash course in Phrase Structure Rules. It gives an
idea of how I approach syntax, and also shows just how ambiguous natural
languages can be, and thus how difficult it can be to translate them
automatically with computers.

http://people.umass.edu/afarudi/Phrase%20Structure%20Rules-Kyle%20Johnson.pdf

If you have some money you can afford to spend, less than $50, I highly
recommend the following textbook:

http://www.amazon.com/Grammar-as-Science-Richard-Larson/dp/026251303X

It's a thorough read on how syntax is approached on its own, including how
to deal with features of words and phrases in a grammar (such as verbs of
motion as a feature, locative, dative, accusative in noun phrases, etc.)
and how to deal with movement of constituents in a sentence. Don't let its
cartoony diagrams fool you, either. It gets into the really complex
syntactic structures just as much as it does simplistic ones.  Available in
paperback and a kindle version, and used versions at quite a low price.

Now, I wish there was a nice, concise, definitive, and standardized list of
English phrase structure rules... but as far as I know, there is no such
compilation. Primarily because linguists don't necessarily agree on how to
deal with the really complex issues of English syntax.


On Fri, Jan 25, 2013 at 2:34 PM, Gary Shannon <[email protected]> wrote:

> This is very interesting. Thanks for posting that link. I'll have to
> spend some more time looking into that.
>
> As for my formal grammar, I'm using a parenthetical notation that
> allows me to tag/parse a sentence, and then extract both the
> production rules and the lexicon directly from a collection of tagged
> sentences. Something like this:
>
> Sentence: Bravely the wounded soldier struggled on.
>
> Tagged/parsed:
>
> SNT(RB(Bravely) SNT(ND(DT(the) NJ(JJ(wounded) NN(soldier)))
> VBP(VB(struggled) RBP(on))))
>
> Words removed:
>
> SNT(RB SNT(ND(DT NJ(JJ NN)) VBP(VB RBP)))
>
> Rules extracted:
>
> ND(DT NJ)
> NJ(JJ NN)
> SNT(ND VBP)
> SNT(RB SNT)
> VBP(VB RBP)
>
> Lexicon extracted:
>
> DT(the)
> JJ(wounded)
> NN(soldier)
> RB(bravely)
> RBP(on)
> VB(struggled)
>
> Sorted by word:
>
> bravely RB
> on RBP
> soldier NN
> struggled VB
> the DT
> wounded JJ
>
> And, of course, when the same word shows up with different parts of
> speech, all those alternatives would appear in the lexicon. My tags
> are borrowed from the Brown Corpus tag set, with several modifications
> to fit my specific application. (
> http://www.comp.leeds.ac.uk/ccalas/tagsets/brown.html )
>
> --gary
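
The extraction described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not Gary's actual
program: the names parse and extract are made up here, a bare token with
no parentheses is assumed to be a word, and words are lower-cased to
match the lexicon listing above.

```python
import re

TAGGED = ("SNT(RB(Bravely) SNT(ND(DT(the) NJ(JJ(wounded) NN(soldier))) "
          "VBP(VB(struggled) RBP(on))))")


def parse(text):
    """Parse TAG(child child ...) notation into (label, children) tuples."""
    tokens = re.findall(r"[()]|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        label = tokens[pos]
        pos += 1
        children = []
        if pos < len(tokens) and tokens[pos] == "(":
            pos += 1                      # consume "("
            while tokens[pos] != ")":
                children.append(node())
            pos += 1                      # consume ")"
        return (label, children)          # a bare word has no children

    return node()


def extract(tree, rules=None, lexicon=None):
    """Collect production rules and (tag, word) lexicon entries from a tree."""
    rules = set() if rules is None else rules
    lexicon = set() if lexicon is None else lexicon
    label, children = tree
    if len(children) == 1 and not children[0][1]:
        # Preterminal such as DT(the): record the word under its tag.
        lexicon.add((label, children[0][0].lower()))
    elif children:
        rules.add(f"{label}({' '.join(child[0] for child in children)})")
        for child in children:
            extract(child, rules, lexicon)
    return rules, lexicon


if __name__ == "__main__":
    rules, lexicon = extract(parse(TAGGED))
    print("\n".join(sorted(rules)))
    print("\n".join(f"{tag}({word})" for tag, word in sorted(lexicon)))
    # "Sorted by word" view of the same lexicon:
    print("\n".join(f"{word} {tag}"
                    for tag, word in sorted(lexicon, key=lambda e: e[1])))
```

Because rules and lexicon are sets, running extract over many tagged
sentences with the same two sets accumulates every alternative tag for
a word without duplicates.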
>
>
> On Fri, Jan 25, 2013 at 11:14 AM, Jeff Sheets <[email protected]>
> wrote:
> > I'm surprised nobody has mentioned "constituent" yet.
> >
> > http://en.wikipedia.org/wiki/Constituent_(linguistics)
> >
> > The set of all constituents then, is all phrases and single words in the
> > language. As the context becomes more known, the set of potential
> > constituents is reduced to a subset. Note, however:
> >
> > The box ___ down.
> >
> > does not offer a constituent in the technical meaning of the word, but I
> > still think that that is as close as you are likely to get. About all that
> > fits there is the subset of verbs that function and fit with the adverb
> > down, though I'm not putting too much thought into that. However, if you
> > start with:
> >
> > The box ___.
> >
> > You know that any of the following will fit:
> >
> > came in the mail today.
> > fell down.
> > is really rather large and ungainly to transport across the distance of 16
> > miles by foot both uphill and downhill.
> >
> > The context allows a much broader set of constituents. However, below
> > constituents are just the parts of speech. The reason why verbs like
> > "spoke" don't fit in the first sentence is that they lack some features.
> > Some verbs will be transitive, and thus require a direct object. Some verbs
> > are ditransitive and require both a direct and indirect object. In this
> > case, the feature is more that the verbs must describe movement.
> >
> > The box slides down.
> > The box fell down.
> > The box ran down.
> > The box jumped down.
> > The box teleported down.
> > * The box spoke down.
> > * The box thought down.
> > * The box befriended down.
> > x The box ascended down.
> >
> > That last sentence feels grammatical to me, though obviously it makes no
> > sense, but the three marked with * are very much syntactically incorrect
> > for me. The key thing is that slides, fell, ran, jumped, teleported, and
> > ascended are all verbs which have the feature of describing motion.
> >
> > One question I have is, how are you defining/describing the grammar and
> > lexicon of your language? Are you using a formal grammar notation like the
> > following?
> >
> > S -> NP VP
> > NP -> (Det) N
> > NP -> NP PP
> > NP -> Adj NP
> > PP -> Prep NP
> > VP -> V
> > VP -> VP Adv
> > etc.
> >
> > You may want to identify that adverbs like "down" must modify a verb with
> > the feature of "motion", and then for every motion verb, add that feature
> > to a list of features. Another feature you should probably have is the
> > transitivity of the verb.
> >
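
The feature idea above can be sketched as a lookup plus a subset test.
Everything here is a hypothetical illustration: the feature names, the
feature sets assigned to each verb, and the function name licensed are
assumptions, not a standard analysis of English.

```python
# Hypothetical feature lexicon: each verb maps to the features it carries.
VERBS = {
    "slides": {"motion"},
    "fell": {"motion"},
    "ran": {"motion"},
    "jumped": {"motion"},
    "teleported": {"motion"},
    "ascended": {"motion"},
    "spoke": set(),
    "thought": set(),
    "befriended": {"transitive"},
}

# Features an adverb requires of the verb it modifies (assumed here).
ADVERBS = {"down": {"motion"}}


def licensed(verb, adverb):
    """True if the verb carries every feature the adverb requires."""
    return ADVERBS[adverb] <= VERBS[verb]


if __name__ == "__main__":
    for verb in VERBS:
        mark = " " if licensed(verb, "down") else "*"
        print(f"{mark} The box {verb} down.")
```

Note that "ascended" passes the check: the test is purely syntactic, so
a sentence that makes no sense semantically is still accepted, matching
the judgment on the sentence marked with x.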
> >
> > On Tue, Jan 22, 2013 at 8:57 PM, Ralph DeCarli <[email protected]> wrote:
> >
> >> On Mon, 21 Jan 2013 20:20:30 -0600
> >> George Corley <[email protected]> wrote:
> >>
> >> >
> >> > Do you consider the instrument or other prepositional elements
> >> > inherently part of the verb?
> >> >
> >> In this specific case I consider the fork to be a data element of
> >> the 'eating' predicate, if that makes any sense. I tend to think of
> >> the language in data modeling terms.
> >>
> >> A given prepositional phrase could modify the subject, the object or
> >> the predicate, but it can't modify the entire sentence. I think this
> >> actually stems from my general fear of 'global variables'.
> >>
> >> In other words, I'm really still more of a programmer and a "data
> >> bigot" than a linguist, so my conlang (or con-patois, more
> >> accurately) is going to reflect my learned habits.
> >>
> >> Ralph
> >> --
> >>
> >> Have you heard of the new post-neo-modern art style?
> >> They haven't decided what it looks like yet.
> >>
>





Messages in this topic (21)
________________________________________________________________________
2b. Re: Is there a word for this?
    Posted by: "Gary Shannon" [email protected] 
    Date: Sat Jan 26, 2013 1:37 pm ((PST))

Interesting links. Thank you. I ordered a used copy of that book
(Grammar as Science).

Are you familiar with Link Grammars? I wrote a parser based on a link
grammar several years ago. Here's a good introduction:
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/link/pub/www/papers/ps/LG-IWPT93.pdf

--gary






Messages in this topic (21)
________________________________________________________________________
________________________________________________________________________
3a. NLP class (was RE: Is there a word for this?)
    Posted by: "Mathieu Roy" [email protected] 
    Date: Sat Jan 26, 2013 1:07 pm ((PST))

Jeff: <<Some other links for you. The first is a link to a Coursera online
class on natural language processing, which I think most closely matches
your interest. Might give you an idea of how academics are approaching the
problem today. NLP is without doubt an extremely difficult task for
computers.

https://www.coursera.org/course/nlangp>>

I was hesitating between this course and a similar one
(https://www.coursera.org/course/nlp). In your opinion, which one seems
better and why?

-Mathieu





Messages in this topic (2)
________________________________________________________________________
3b. Re: NLP class (was RE: Is there a word for this?)
    Posted by: "Nikolay Ivankov" [email protected] 
    Date: Sat Jan 26, 2013 2:22 pm ((PST))

On Sat, Jan 26, 2013 at 10:07 PM, Mathieu Roy <[email protected]> wrote:

> Jeff: <<Some other links for you. The first is a link to a Coursera online
> class on natural language processing, which I think most closely matches
> your interest. Might give you an idea of how academics are approaching the
> problem today. NLP is without doubt an extremely difficult task for
> computers.
>
> https://www.coursera.org/course/nlangp>>
>
> I was hesitating between this course and a similar one
> (https://www.coursera.org/course/nlp). In your opinion, which one seems
> better and why?
>
> -Mathieu
>

Seconding this question.





Messages in this topic (2)





------------------------------------------------------------------------
Yahoo! Groups Links

<*> To visit your group on the web, go to:
    http://groups.yahoo.com/group/conlang/

------------------------------------------------------------------------
