:>: -----Original Message-----
:>: From: IBM Mainframe Discussion List [mailto:[email protected]] On
:>: Behalf Of Steve Comstock
:>: Sent: Monday, November 19, 2012 2:00 PM
:>: To: [email protected]
:>: Subject: Re: Parsing (was: "New" way to do UCB lookups)
:>:
:>: On 11/19/2012 2:56 PM, Paul Gilmartin wrote:
:>: > On Mon, 19 Nov 2012 21:39:57 +0000, Lindy Mayfield wrote:
:>: >>
:>: >> It gets me all Lewis Carroll just thinking about it.  I cannot even
:>: >> imagine how to create something like that SQL in Finnish.  Something so
:>: >> simple as that, I cannot even think how a computer could parse it
:>: >> written in an agglutinative language.  Though I am a bear of very little
:>: >> brain, so I'm sure it could be done.  :-)
:>: >>
:>: > Wouldn't this be somewhat like FORTRAN, where the lexical analyzer
:>: > first removes _all_[1] blanks, rendering the source code maximally
:>: > agglutinative, then attempts to parse the mess so created?
:>: >
:>: > [1] Well, except in quoted or counted text strings.
:>: >
:>: >> So to bring it a bit back on to topic, English can be weird, but
:>: >> sometimes quite useful in its own way.
:>: >>
:>: > Classic Latin was written with no interword separators.
:>:
:>: Interesting. I didn't know that. Japanese is written with no
:>: interword separators also.

According to one of the folks I worked with over there, on the rare occasions
when the character sequence alone is not sufficient to determine where the
word break is, they will use a dot (think period or decimal point) to mark
the break.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
