Yes, and the structure is obviously case-insensitive. More troublesome is probably that a single instance can have multiple ACADEMIC-EMPHASIS entries, which can be tricky to tidy.

Also, one would need to figure out the meaning of lines like
(DEFPROP BOSTON-COLLEGE0 T DUPLICATE)

-pd

> On 18 Jan 2018, at 18:04 , Barry Rowlingson <b.rowling...@lancaster.ac.uk> wrote:
>
> The file also has a bunch of email headers stuck in the middle of it:
>
>
> .....
>
>    (QUALITY-OF-LIFE SCALE:1-5 4)
>    (ACADEMIC-EMPHASIS HEALTH-SCIENCE)
> )
> -------
> -------
>
> From lebow...@cs.columbia.edu Mon Feb 22 20:53:02 1988
> Received: from zodiac by meridian (5.52/4.7)
> Received: from Jessica.Stanford.EDU by ads.com (5.58/1.9)
>         id AA04539; Mon, 22 Feb 88 20:59:59 PST
> Received: from Portia.Stanford.EDU by jessica.Stanford.EDU with TCP;
>         Mon, 22 Feb 88 20:58:22 PST
> Received: from columbia.edu (COLUMBIA.EDU.ARPA) by Portia.STANFORD.EDU (1.2/Ultrix2.0-B)
>         id AA11480; Mon, 22 Feb 88 20:49:53 pst
> Received: from CS.COLUMBIA.EDU by columbia.edu (5.54/1.14)
>         id AA10186; Mon, 22 Feb 88 23:48:44 EST
> Message-Id: <8802230448.aa10...@columbia.edu>
> Date: Fri 22 Jan 88 02:50:00-EST
> From: The Mailer Daemon <mai...@cs.columbia.edu>
> To: lebow...@cs.columbia.edu
> Subject: Message of 18-Jan-88 20:13:54
> Resent-Date: Mon 22 Feb 88 23:44:07-EST
> Resent-From: Michael Lebowitz <lebow...@cs.columbia.edu>
> Resent-To: soud...@portia.stanford.edu
> Resent-Message-Id: <12376918538.25.lebow...@cs.columbia.edu>
> Status: R
>
> Message undeliverable and dequeued after 3 days:
> souders%merid...@ads.arpa: Cannot connect to host
> ------------
> Date: Mon 18 Jan 88 20:13:54-EST
> From: Michael Lebowitz <lebow...@cs.columbia.edu>
> Subject: bigger file part 3
> To: souders%merid...@ads.arpa
> In-Reply-To: <8801182147.aa08...@ads.arpa>
> Message-ID: <12367705229.11.lebow...@cs.columbia.edu>
>
> (DEF-INSTANCE GEORGETOWN
>    (STATE MARYLAND)
>    (LOCATION URBAN)
>    (CONTROL PRIVATE)
>    (NO-OF-STUDENTS THOUS:10-15)
>    (MALE:FEMALE RATIO:45:55)
> ....
>
> Which dates it to 1988. Nice.
>
> Barry
>
>
> On Thu, Jan 18, 2018 at 9:20 AM, Peter Crowther <peter.crowt...@melandra.com> wrote:
>
>> That's a nice example of why Lisp is both powerful and terrifying - you're
>> looking at a Lisp *program*, not just Lisp *data*, as Lisp makes no
>> distinction between the two. You just read 'em in.
>>
>> The two definitions at the bottom are function definitions. The top one
>> defines the def-instance function. Reading that indicates that it accepts
>> an atom as a name and a list of key-value or key-range-value lists as
>> properties, where the keys may be repeated to give you multi-valued
>> attributes in your result. The bottom one defines a function for removing
>> duplicate entries of the same location.
>>
>> The rest of the file (apart from the included email headers) is a whole
>> load of calls to the def-instance function. In Lisp, you'd define the
>> functions, then just run the rest of the file.
>>
>> To my knowledge, there is no generic way to read Lisp "data" into anything
>> else, because of this quirk that data can look like anything. If anyone
>> can correct me on that, great, but I'd be somewhat surprised. Therefore,
>> as David intimated, the tools you need are generic tools for handling text,
>> and you'll have to deal with the formatting yourself. If I were doing a
>> one-off transform of this file, I'd probably reach for vi... but I'm an old
>> Unix hacker. I certainly wouldn't teach that tooling. awk or perl could
>> certainly handle it; or if you want to give students a wider view of the
>> world you might wish to try ANTLR and get them to write a grammar to parse
>> the file. The Clojure grammar
>> (https://github.com/antlr/grammars-v4/blob/master/clojure/Clojure.g4) would
>> be an interesting place to start, although Terence Parr's comment of "match
>> a bunch of crap in parentheses" would probably give a flavour of what to
>> implement. Depends what else the students are learning.
>>
>> Hope this helps rather than hinders.
>>
>> - Peter
>>
>> On 18 January 2018 at 05:25, Ranjan Maitra <mai...@email.com> wrote:
>>
>>> Thanks! I am trying to use it in R. (Actually, I try to give my students
>>> experiences with different kinds of files and I was wondering if there were
>>> tools available for such kinds of files. I don't know Lisp so I do not
>>> actually know what the lines towards the bottom of the file mean.)
>>>
>>> Many thanks for your response!
>>>
>>> Best wishes,
>>> Ranjan
>>>
>>> On Wed, 17 Jan 2018 20:59:48 -0800 David Winsemius <dwinsem...@comcast.net> wrote:
>>>
>>>>
>>>>> On Jan 17, 2018, at 8:22 PM, Ranjan Maitra <mai...@email.com> wrote:
>>>>>
>>>>> Dear friends,
>>>>>
>>>>> Is there a way to read data files written in lisp into R?
>>>>>
>>>>> Here is the file: https://archive.ics.uci.edu/ml/machine-learning-databases/university/university.data
>>>>>
>>>>> I would like to read it into R. Any suggestions?
>>>>
>>>> It's just a text file. What difficulties are you having?
>>>>>
>>>>>
>>>>> Thanks very much in advance for pointers on this and best wishes,
>>>>> Ranjan
>>>>>
>>>>> --
>>>>> Important Notice: This mailbox is ignored: e-mails are set to be
>>>>> deleted on receipt. Please respond to the mailing list if appropriate.
>>>>> For those needing to send personal or professional e-mail, please use
>>>>> appropriate addresses.
>>>>>
>>>>> ______________________________________________
>>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>> David Winsemius
>>>> Alameda, CA, USA
>>>>
>>>> 'Any technology distinguishable from magic is insufficiently advanced.'
>>>> -Gehm's Corollary to Clarke's Third Law
>>
>> [[alternative HTML version deleted]]
-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd....@cbs.dk  Priv: pda...@gmail.com
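
To make the points raised in the thread concrete (case-insensitive atoms, repeated ACADEMIC-EMPHASIS keys, mail headers wedged between records, stray DEFPROP lines), here is a minimal sketch of the "generic text tool" route. It is in Python rather than R and is not anyone's actual solution from the thread. The names `parse_forms`, `instances` and `SAMPLE` are made up for illustration, as are the EXAMPLE-U record and its LIBERAL-ARTS value; the remaining sample lines are copied from the excerpts quoted above. It assumes each DEF-INSTANCE form is closed before any mail header starts, as in the excerpt, and it simply skips DEFPROP forms rather than interpreting them.

```python
import re
from collections import defaultdict

# A token is a single parenthesis or a run of non-space, non-paren characters.
TOKEN = re.compile(r"[()]|[^\s()]+")

# Illustrative input only. EXAMPLE-U and LIBERAL-ARTS are hypothetical;
# the other lines come from the excerpts quoted in the thread.
SAMPLE = """
(def-instance Example-U
   (QUALITY-OF-LIFE SCALE:1-5 4)
   (ACADEMIC-EMPHASIS HEALTH-SCIENCE)
   (ACADEMIC-EMPHASIS LIBERAL-ARTS)
)
-------
Received: from zodiac by meridian (5.52/4.7)
Date: Mon 18 Jan 88 20:13:54-EST

(DEF-INSTANCE GEORGETOWN
   (STATE MARYLAND)
   (LOCATION URBAN))
(DEFPROP BOSTON-COLLEGE0 T DUPLICATE)
"""


def parse_forms(text):
    """Yield each balanced top-level parenthesised form as nested lists."""
    stack = []
    for tok in TOKEN.findall(text):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if not stack:
                continue  # stray close paren outside any form: ignore it
            form = stack.pop()
            if stack:
                stack[-1].append(form)  # nested list goes into its parent
            else:
                yield form  # a complete top-level form
        elif stack:
            # Upper-case atoms so that matching is case-insensitive.
            stack[-1].append(tok.upper())
        # Atoms outside any form (mail header text, dashes) are dropped.


def instances(text):
    """Map instance name -> {key: [values]}, keeping repeated keys as lists."""
    out = {}
    for form in parse_forms(text):
        if len(form) < 2 or form[0] != "DEF-INSTANCE":
            continue  # skips DEFPROP and parenthesised bits of mail headers
        props = defaultdict(list)
        for prop in form[2:]:
            if isinstance(prop, list) and prop and isinstance(prop[0], str):
                props[prop[0]].append(" ".join(str(v) for v in prop[1:]))
        out[form[1]] = dict(props)
    return out


for name, props in instances(SAMPLE).items():
    print(name, props)
```

Keeping every value in a list is one way round the repeated-key problem: single-valued attributes become length-one lists, and the multi-valued ones can be pivoted or collapsed later in whatever tool does the analysis.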