Hey Ben,

It seems that recent college IT grads here hope to earn about 3000 RMB
(375 USD) a month, but often must settle for less.  This is based on my
rather limited knowledge.  Hopefully I will know more in the near future,
since I have been getting the word out and have a local headhunter looking
for some candidates.  One prospect who is not willing to leave his job for
short-term work responded, "you are offering too much."

>I guess the important thing is to store as much data as possible, in a
>clearly structured way.

>People can always postprocess the data using their own scripts, so long as
>the information is there and is clearly structured...

Yes, I agree with this sentiment.  I am thinking along the lines of a full
conventional citation plus other data such as location and original date of
creation.  We may indulge in a little overkill, since I have already
experienced remorse at not recording more detail in some of the early
stages.  Trial and error remains a great teacher.
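
A rough sketch of what one such record might look like, written out as XML.
This is only an illustration: the class name, the element names, and the
choice of fields beyond "citation, location, and original date of creation"
are my own assumptions, not a settled format.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict


@dataclass
class SourceRecord:
    # All field names are hypothetical, chosen only for illustration.
    text: str       # the stored text itself
    author: str     # citation: author
    title: str      # citation: title of the source
    year: str       # citation: year of publication
    location: str   # where the text was created or collected
    created: str    # original date of creation, e.g. "2000-01-01"


def to_xml(record: SourceRecord) -> str:
    """Serialize one record as a flat XML element, one child per field."""
    root = ET.Element("record")
    for name, value in asdict(record).items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")
```

Recording a field that later turns out to be unnecessary costs little, while
a field that was never recorded cannot be recovered afterwards.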

>XML or RDF type syntax is generally easy for people to work with...

XML may be the way to go.  Perhaps XML files can largely replace DBs, and a
translation from XML to a DB should be straightforward.  A relational DB
could allow associating one convun to another, thus illustrating a joke or
poem, for example.  Those types of relationships may be difficult with XML,
but could be done programmatically, at least to some extent.  This AI
business sure could consume a lot of "gurus."
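
A minimal sketch of the XML-to-DB translation, assuming a hypothetical XML
layout in which each convun carries an "id" and may name another convun it
illustrates.  The element names, attribute names, and table names are all
assumptions made for the example; SQLite stands in for whatever relational
DB might actually be used.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML layout: element and attribute names are assumptions.
SAMPLE = """
<convuns>
  <convun id="1" kind="joke">Example joke text</convun>
  <convun id="2" kind="remark" illustrates="1">Example remark text</convun>
</convuns>
""".strip()


def load(xml_text: str) -> sqlite3.Connection:
    """Copy convun elements into an in-memory SQLite DB with a link table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE convun (id INTEGER PRIMARY KEY, kind TEXT, body TEXT)")
    db.execute("CREATE TABLE link (src INTEGER, dst INTEGER, relation TEXT)")
    for el in ET.fromstring(xml_text).iter("convun"):
        cid = int(el.get("id"))
        db.execute("INSERT INTO convun VALUES (?, ?, ?)", (cid, el.get("kind"), el.text))
        # An optional attribute becomes a row in the link table.
        target = el.get("illustrates")
        if target is not None:
            db.execute("INSERT INTO link VALUES (?, ?, ?)", (cid, int(target), "illustrates"))
    db.commit()
    return db
```

Once loaded, a join over the link table answers questions such as "which
convuns illustrate a joke?", which is the kind of association that is
awkward to express in the XML alone.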

>I would definitely want each conversational unit linked to each conversation
>it was embodied in -- the full conversational history... so that the context
>could be determined....  One of the interesting things to mine from this
>dataset is how people respond to context...

I will add "Ben" to my WordNet gloss for "ambitious" :-)  . . . good point
though.  We are now able to conveniently store mind-boggling amounts of text
data.  Ella will display the entire text of Kant's Critique of Pure Reason
in a single window of your browser (it's amazing that those scrollbars never
wear out).  The one-microprocessor bottleneck is the big limitation (for me
anyway).

>On a different topic: If you plan to involve statistical NLP technology in
>the next phase of your project, that could be an interesting thing to talk
>about ... it's not something I'm working on now, but we played around with
>it a lot at Webmind Inc. ...

Thanks for the idea.  I have been meaning to take a closer look at what has
gone on at Webmind Inc.

Later . . . Kevin
