[Haskell-cafe] Re: How would you hack it?
Achim Schneider wrote:
> Andrew Coppin <[EMAIL PROTECTED]> wrote:
>> Achim Schneider wrote:
>>> Andrew Coppin <[EMAIL PROTECTED]> wrote:
>>>
>>>> I have a file that contains several thousand words, separated by
>>>> white space. [I gather that on Unix there's a standard location for
>>>> this file?]
>>> Looking at /usr/share/dict/words, I'm assured that the proper
>>> separator is \n.
>>
>> Thanks. I did look around trying to find this, but ultimately failed.
>> (Is it a standard component, or is it installed as part of some
>> specific application?)
>
> [EMAIL PROTECTED] ~ % equery b /usr/share/dict/words
> [ Searching for file(s) /usr/share/dict/words in *... ]
> sys-apps/miscfiles-1.4.2 (/usr/share/dict/words)
> [EMAIL PROTECTED] ~ % eix miscfiles
> [I] sys-apps/miscfiles
>      Available versions:  1.4.2 {minimal}
>      Installed versions:  1.4.2(18:27:27 02/14/07)(-minimal)
>      Homepage:            http://www.gnu.org/directory/miscfiles.html
>      Description:         Miscellaneous files

On Ubuntu (and supposedly Debian):

[EMAIL PROTECTED]: ~ > dpkg -S /usr/share/dict/words
dictionaries-common: /usr/share/dict/words

Cheers
Ben
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: How would you hack it?
Henning Thielemann <[EMAIL PROTECTED]> wrote:
> Sounds like a generator for scientific articles. :-)
> Maybe
>   http://hackage.haskell.org/cgi-bin/hackage-scripts/package/markov-chain
> can be of help for you. It's also free of randomIO.

I once invented this, though ungeneralised, for a map generator of an
RTS... the river always went from left to right, at approximately the
middle of the map, its direction depending on its current offset from
that middle, its width (that is, a wide river can bend upwards and
downwards by more than one tile) and a random factor.

You can also express this using a Markov chain.

--
(c) this sig last receiving data processing entity. Inspect headers for
past copyright information. All rights reserved. Unauthorised copying,
hiring, renting, public performance and/or broadcasting of this
signature prohibited.
Re: [Haskell-cafe] Re: How would you hack it?
Achim Schneider wrote:
> If you run one over obscure academic papers, you can even generate
> publishable results. I don't have a link ready, but there was a fun
> incident involving this.

http://www.physics.nyu.edu/faculty/sokal/dawkins.html
[Haskell-cafe] Re: How would you hack it?
Andrew Coppin <[EMAIL PROTECTED]> wrote:
> Achim Schneider wrote:
> > Andrew Coppin <[EMAIL PROTECTED]> wrote:
> >
> >> I have a file that contains several thousand words, separated by
> >> white space. [I gather that on Unix there's a standard location for
> >> this file?]
> > Looking at /usr/share/dict/words, I'm assured that the proper
> > separator is \n.
>
> Thanks. I did look around trying to find this, but ultimately failed.
> (Is it a standard component, or is it installed as part of some
> specific application?)

[EMAIL PROTECTED] ~ % equery b /usr/share/dict/words
[ Searching for file(s) /usr/share/dict/words in *... ]
sys-apps/miscfiles-1.4.2 (/usr/share/dict/words)
[EMAIL PROTECTED] ~ % eix miscfiles
[I] sys-apps/miscfiles
     Available versions:  1.4.2 {minimal}
     Installed versions:  1.4.2(18:27:27 02/14/07)(-minimal)
     Homepage:            http://www.gnu.org/directory/miscfiles.html
     Description:         Miscellaneous files

> > Generate a Map Int [String] map, with the latter list being an
> > infinite list of words with that particular size.
> >
> > Now assume that you want to have a 100 character sentence. You
> > start by looking if you got any 100 character word, if yes it's
> > your sentence, if not you divide it in half (maybe offset by a
> > weighted random factor [1]) and start over again.
> >
> > You can then specify your whole document along the lines of
> >
> > (capitalise $ words 100) ++ ". " ++ (capitalise $ words 10) ++ "?"
> > ++ (capitalise $ words 20) ++ "oneone1!"
> >
> > [1] Random midpoint displacement is a very interesting topic by
> > itself.
>
> I'm not following your logic, sorry...

That's probably because I just described the points and not the rest of
the morphisms... imagine some plumbing and tape between my sentences.

Midpoint displacement is a great way to achieve randomness while still
keeping a uniform appearance. In the defining paper, which I don't have
at hand right now, an example was shown where a realistic outline of
Australia was generated from ten or so data points: if you display it
next to the actual outline, only a geographer could tell which one's the
fake.
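For readers who have not met midpoint displacement, here is a minimal one-dimensional sketch of the idea. It is not taken from the paper mentioned above. The names `refine` and `midpoint` are hypothetical, the "random" offsets are simply passed in as a list so the code stays pure and needs nothing beyond the Prelude, and halving the displacement scale at each level is an assumed (though common) choice.

```haskell
-- One refinement pass: between every pair of neighbouring points, insert
-- their average plus a scaled offset. Offsets are consumed left to right;
-- if they run out, the remaining points are left as they are.
refine :: Double -> [Double] -> [Double] -> ([Double], [Double])
refine scale (x:y:rest) (o:os) =
  let (more, os') = refine scale (y:rest) os
  in  (x : (x + y) / 2 + scale * o : more, os')
refine _ xs os = (xs, os)

-- Apply the given number of passes, halving the displacement scale each
-- time so that later passes only add fine detail (assumed schedule).
midpoint :: Int -> Double -> [Double] -> [Double] -> [Double]
midpoint levels scale offsets points
  | levels <= 0 = points
  | otherwise   =
      let (points', offsets') = refine scale points offsets
      in  midpoint (levels - 1) (scale / 2) offsets' points'
```

With all offsets zero the result is plain linear interpolation, for example `midpoint 2 1 (repeat 0) [0, 4]` gives `[0.0,1.0,2.0,3.0,4.0]`; feeding in offsets from a real random source in the range -1 to 1 would roughen that line while keeping its overall shape.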
Re: [Haskell-cafe] Re: How would you hack it?
Achim Schneider wrote:
> Andrew Coppin <[EMAIL PROTECTED]> wrote:
>> I have a file that contains several thousand words, separated by
>> white space. [I gather that on Unix there's a standard location for
>> this file?]
>
> Looking at /usr/share/dict/words, I'm assured that the proper
> separator is \n.

Thanks. I did look around trying to find this, but ultimately failed.
(Is it a standard component, or is it installed as part of some
specific application?)

As I understand it, Haskell's "words" function will work on any kind of
white space - spaces, line feeds, carriage returns, tabs, etc. - so it
should be fine. ;-)

Since I'm developing on Windows, what I actually did was have Google
find me a file online that I can download. [Remember my post a while
back? "GHC panic"? Apparently GHC doesn't like it if you try to
represent the entire 400 KB file as a single [String]...]

> Generate a Map Int [String] map, with the latter list being an
> infinite list of words with that particular size.
>
> Now assume that you want to have a 100 character sentence. You
> start by looking if you got any 100 character word, if yes it's
> your sentence, if not you divide it in half (maybe offset by a
> weighted random factor [1]) and start over again.
>
> You can then specify your whole document along the lines of
>
> (capitalise $ words 100) ++ ". " ++ (capitalise $ words 10) ++ "?"
> ++ (capitalise $ words 20) ++ "oneone1!"
>
> [1] Random midpoint displacement is a very interesting topic by
> itself.

I'm not following your logic, sorry...
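The point about "words" can be checked with a small sketch. `loadWords` and `sample` are hypothetical names introduced here for illustration; the only library functions used are `words` and `readFile` from the Prelude.

```haskell
-- Hypothetical helper: read a word list from a file, whatever mix of
-- spaces, tabs and line endings separates the entries.
loadWords :: FilePath -> IO [String]
loadWords path = fmap words (readFile path)

-- The same splitting on an in-memory string that mixes a space, a line
-- feed, a carriage return plus line feed, a tab and a double space.
sample :: [String]
sample = words "alpha beta\ngamma\r\n\tdelta  epsilon"
```

Here `sample` evaluates to `["alpha","beta","gamma","delta","epsilon"]`, so a file that uses \n as its separator should load the same way as one that uses spaces.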
Re: [Haskell-cafe] Re: How would you hack it?
On Wed, 4 Jun 2008, Achim Schneider wrote:

> Gregory Collins <[EMAIL PROTECTED]> wrote:
>
> > Andrew Coppin <[EMAIL PROTECTED]> writes:
> >
> > > Clearly, what I *should* have done is think more about a good
> > > abstraction before writing miles of code. ;-) So how would you guys
> > > do this?
> >
> > If you want text that roughly resembles English, you're better off
> > getting a corpus of real English text and running it through a Markov
> > chain. Mark Dominus has written a few blog posts about this topic
> > recently, see http://blog.plover.com/lang/finnpar.html.
>
> If you run one over obscure academic papers, you can even generate
> publishable results. I don't have a link ready, but there was a fun
> incident involving this.

A famous paper generator is http://pdos.csail.mit.edu/scigen/
[Haskell-cafe] Re: How would you hack it?
Gregory Collins <[EMAIL PROTECTED]> wrote:
> Andrew Coppin <[EMAIL PROTECTED]> writes:
>
> > Clearly, what I *should* have done is think more about a good
> > abstraction before writing miles of code. ;-) So how would you guys
> > do this?
>
> If you want text that roughly resembles English, you're better off
> getting a corpus of real English text and running it through a Markov
> chain. Mark Dominus has written a few blog posts about this topic
> recently, see http://blog.plover.com/lang/finnpar.html.

If you run one over obscure academic papers, you can even generate
publishable results. I don't have a link ready, but there was a fun
incident involving this.
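As a rough illustration of the Markov chain suggestion, here is a minimal first-order word chain. It is a sketch and not the API of the markov-chain package on Hackage. `Chain`, `buildChain`, `nextSeed` and `walk` are hypothetical names, and the small linear congruential generator is only a stand-in for a proper random source so that the example depends on nothing beyond base and Data.Map from containers.

```haskell
import qualified Data.Map as Map

-- For each word, the list of words seen directly after it in the corpus.
-- Repeated successors are kept, so frequent pairs are picked more often.
type Chain = Map.Map String [String]

buildChain :: [String] -> Chain
buildChain ws =
  Map.fromListWith (flip (++)) [(a, [b]) | (a, b) <- zip ws (drop 1 ws)]

-- Stand-in pseudo-random step (assumption: any real generator would do).
nextSeed :: Int -> Int
nextSeed s = (s * 1103515245 + 12345) `mod` 2147483648

-- Emit at most n words, starting from w and following the chain; stops
-- early when the current word has no recorded successor.
walk :: Chain -> Int -> Int -> String -> [String]
walk chain seed n w
  | n <= 0    = []
  | otherwise = w : case Map.lookup w chain of
      Just succs@(_:_) ->
        let seed' = nextSeed seed
            next  = succs !! ((seed' `div` 65536) `mod` length succs)
        in  walk chain seed' (n - 1) next
      _ -> []
```

Something like `unwords (walk (buildChain (words corpus)) 42 50 "The")` would then produce up to fifty words of corpus-flavoured text, where `corpus` is whatever English text has been read in and 42 is an arbitrary seed.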
[Haskell-cafe] Re: How would you hack it?
Andrew Coppin <[EMAIL PROTECTED]> wrote:
> I have a file that contains several thousand words, separated by
> white space. [I gather that on Unix there's a standard location for
> this file?]

Looking at /usr/share/dict/words, I'm assured that the proper
separator is \n.

> Clearly, what I *should* have done is think more about a good
> abstraction before writing miles of code. ;-) So how would you guys
> do this?

Generate a Map Int [String] map, with the latter list being an
infinite list of words with that particular size.

Now assume that you want to have a 100 character sentence. You start by
looking if you got any 100 character word, if yes it's your sentence, if
not you divide it in half (maybe offset by a weighted random factor [1])
and start over again.

You can then specify your whole document along the lines of

(capitalise $ words 100) ++ ". " ++ (capitalise $ words 10) ++ "?"
++ (capitalise $ words 20) ++ "oneone1!"

[1] Random midpoint displacement is a very interesting topic by itself.
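One way to read the scheme above is the following sketch. It is an interpretation under stated assumptions, not code from the thread: `Dict`, `mkDict`, `fill` and `sentence` are hypothetical names, the split is always exactly in half (the weighted random offset is left out), one character is reserved for the space between the two halves, and each length bucket is made infinite with `cycle`.

```haskell
import qualified Data.Map as Map
import Data.Char (toUpper)

-- Words indexed by length; every bucket is an infinite, cycled list.
type Dict = Map.Map Int [String]

mkDict :: [String] -> Dict
mkDict ws =
  Map.map cycle (Map.fromListWith (flip (++)) [(length w, [w]) | w <- ws])

-- Try to cover n characters: take a word of exactly that length if one
-- exists, otherwise split the budget in two around a space and recurse.
-- The dictionary is threaded through so each bucket advances when used.
fill :: Int -> Dict -> ([String], Dict)
fill n d
  | n <= 0    = ([], d)
  | otherwise = case Map.lookup n d of
      Just (w:ws) -> ([w], Map.insert n ws d)
      _ ->
        let a        = (n - 1) `div` 2
            (l, d')  = fill a d
            (r, d'') = fill (n - 1 - a) d'
        in  (l ++ r, d'')

capitalise :: String -> String
capitalise []     = []
capitalise (c:cs) = toUpper c : cs

-- A capitalised sentence of (at most) n characters, plus the updated dictionary.
sentence :: Int -> Dict -> (String, Dict)
sentence n d = let (ws, d') = fill n d in (capitalise (unwords ws), d')
```

With `mkDict ["a", "cat", "horse", "to"]`, `fst (sentence 7 d)` is `"Cat cat"` and `fst (sentence 4 d)` is `"A to"`. The result can come out shorter than requested when the dictionary lacks the short words needed to finish a split.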