Re: dot POS files and Corpus Linguistics
On 28/04/2010 11:33, David Glasgow wrote:
> On 28 Apr 2010, at 3:17 am, use-revolution-requ...@lists.runrev.com wrote:
>> I then found out that in the case of corpus files POS means 'parts of speech'.
>> This is typical academia delighting in obscurantism.
>>
>> Now for more 'fun':
>>
>> Also bundled in the corpus are .psd files which, wait for it, are NOT
>> Adobe Photoshop files.
>>
>> PSD: Probably Something Different ???
>
> Richmond, I have no knowledge or advice that might help you. Further,
> wrangling your strange corpus is of no possible use or real interest to me.
>
> ...but somehow I'm hooked. There is something of the Pratchett about your
> posts. Please continue instalments on your progress.
>
> Good Luck,

Oh Dear! You are in trouble . . . :)

Paying Council Tax today; standing in a queue and listening to moronic proles wibbling on about the price of fish. I shall ruminate (cowlike) on POS and PSD files; whether that elevates me above the level of the proles or makes me even more moronic than they are has yet to be seen.
___
use-revolution mailing list
use-revolution@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution
Re: dot POS files and Corpus Linguistics
On 28 Apr 2010, at 3:17 am, use-revolution-requ...@lists.runrev.com wrote:
> I then found out that in the case of corpus files POS means 'parts of speech'.
> This is typical academia delighting in obscurantism.
>
> Now for more 'fun':
>
> Also bundled in the corpus are .psd files which, wait for it, are NOT
> Adobe Photoshop files.
>
> PSD: Probably Something Different ???

Richmond, I have no knowledge or advice that might help you. Further, wrangling your strange corpus is of no possible use or real interest to me.

...but somehow I'm hooked. There is something of the Pratchett about your posts. Please continue instalments on your progress.

Good Luck,

David Glasgow
Re: dot POS files and Corpus Linguistics
I think we can all agree RunRev is the best dev environment going, but suggesting dot.net is a POS may be going a little too far.

On Tue, Apr 27, 2010 at 5:20 PM, Richmond Mathewson <richmondmathew...@gmail.com> wrote:
> On 27/04/2010 22:03, stephen barncard wrote:
>> Richmond, it appears that .pos files are LOTUS NOTES, among many others
>>
>> http://file-extension.net/seeker/file_extension_pos
>>
>> http://filext.com/file-extension/POS
>>
>> http://en.wikipedia.org/wiki/Lotus_Notes
>>
>> http://www.computerfileextensions.com/file-extensions.php/POS
>>
>> FILE FORMAT:
>> http://www.x-ways.net/winhex/POS_Format_2_0.html
>
> Thank you very much for your suggestion.
>
> However, I got led up that garden path and spent some time mucking
> around with Lotus Notes.
>
> I then found out that in the case of corpus files POS means 'parts of
> speech'.
> This is typical academia delighting in obscurantism.
>
> Now for more 'fun':
>
> Also bundled in the corpus are .psd files which, wait for it, are NOT
> Adobe Photoshop files.
>
> PSD: Probably Something Different ???
Re: dot POS files and Corpus Linguistics
On 27/04/2010 22:03, stephen barncard wrote:
> Richmond, it appears that .pos files are LOTUS NOTES, among many others
>
> http://file-extension.net/seeker/file_extension_pos
>
> http://filext.com/file-extension/POS
>
> http://en.wikipedia.org/wiki/Lotus_Notes
>
> http://www.computerfileextensions.com/file-extensions.php/POS
>
> FILE FORMAT:
> http://www.x-ways.net/winhex/POS_Format_2_0.html

Thank you very much for your suggestion.

However, I got led up that garden path and spent some time mucking around with Lotus Notes.

I then found out that in the case of corpus files POS means 'parts of speech'. This is typical academia delighting in obscurantism.

Now for more 'fun':

Also bundled in the corpus are .psd files which, wait for it, are NOT Adobe Photoshop files.

PSD: Probably Something Different ???
Re: dot POS files and Corpus Linguistics
Richmond, it appears that .pos files are LOTUS NOTES, among many others

http://file-extension.net/seeker/file_extension_pos

http://filext.com/file-extension/POS

http://en.wikipedia.org/wiki/Lotus_Notes

http://www.computerfileextensions.com/file-extensions.php/POS

FILE FORMAT:
http://www.x-ways.net/winhex/POS_Format_2_0.html

On 27 April 2010 11:04, Richmond Mathewson wrote:
> Well, Yippee-doo; the good folks at the University of
> Oxford have sent me the files of the
> York-Toronto-Helsinki Parsed Corpus of Old English Prose
> (try saying that with your mouth full of cornflakes).
>
> Jolly generous considering it is normally restricted to British
> Higher Education Institutions (somehow the University of
> Plovdiv, Paisii Hilendarski doesn't fit in that category).
>
> HOWEVER; the corpus comes in .pos files which cheeses me
> off immensely; on opening them with the redoubtable
> TextWrangler they are heavily formatted in some odd fashion
> suggesting some sort of meta-tagging.
>
> The Java-based CS_2.002.74.jar, a.k.a 'CorpusSearch' doesn't run
> for some funny reason on ye olde G4 (have yet to try it on the
> Ubu-Box); but that doesn't really fuss me as ye olde academics
> have decided the parameters of their stuff in advance and my feet
> are too big for their shoes (hey; it's mixed metaphors time again).
>
> So; I am looking to build a Runrev data-miner / chewer / masticator
> / whatever; but, until I can work out what a .pos file can be opened with
> (so I can hae a keek at its formatin) the whole thing is on standby.
> Once I can see what a .pos file should look like in some sort of POS-file
> reader I can cobble together a suitably algorithmic sieve to make the
> file look like it should inside a text field prior to 'chewin the fat'.
>
> Google comes up with unintentionally witty results about 'point of sale'
> and so forth, as well as something about Arabic linguistic corpora,
> Chinese linguistic corpora and so forth (well, at least they are going
> in the right direction).
>
> Having written one of those slimy messages back, where one thanks people
> fulsomely and then shoves in the 'However'; I got a "we cannot comment on
> other methods of accessing the corpus" message. Well; at least I signed my
> name with my second name (Richmond) otherwise I would have had what the
> Americans call a 'Dear John' message . . . :)
>
> Any help re POS-file readers would be most welcome.
>
> sincerely, Richmond Mathewson.

--
Stephen Barncard
Back home in SF
dot POS files and Corpus Linguistics
Well, Yippee-doo; the good folks at the University of Oxford have sent me the files of the York-Toronto-Helsinki Parsed Corpus of Old English Prose (try saying that with your mouth full of cornflakes).

Jolly generous considering it is normally restricted to British Higher Education Institutions (somehow the University of Plovdiv, Paisii Hilendarski doesn't fit in that category).

HOWEVER; the corpus comes in .pos files which cheeses me off immensely; on opening them with the redoubtable TextWrangler they are heavily formatted in some odd fashion suggesting some sort of meta-tagging.

The Java-based CS_2.002.74.jar, a.k.a 'CorpusSearch' doesn't run for some funny reason on ye olde G4 (have yet to try it on the Ubu-Box); but that doesn't really fuss me as ye olde academics have decided the parameters of their stuff in advance and my feet are too big for their shoes (hey; it's mixed metaphors time again).

So; I am looking to build a Runrev data-miner / chewer / masticator / whatever; but, until I can work out what a .pos file can be opened with (so I can hae a keek at its formatin) the whole thing is on standby. Once I can see what a .pos file should look like in some sort of POS-file reader I can cobble together a suitably algorithmic sieve to make the file look like it should inside a text field prior to 'chewin the fat'.

Google comes up with unintentionally witty results about 'point of sale' and so forth, as well as something about Arabic linguistic corpora, Chinese linguistic corpora and so forth (well, at least they are going in the right direction).

Having written one of those slimy messages back, where one thanks people fulsomely and then shoves in the 'However'; I got a "we cannot comment on other methods of accessing the corpus" message. Well; at least I signed my name with my second name (Richmond) otherwise I would have had what the Americans call a 'Dear John' message . . . :)

Any help re POS-file readers would be most welcome.

sincerely, Richmond Mathewson.