[Haskell-cafe] Re: How would you hack it?

2008-06-05 Thread Ben Franksen
Achim Schneider wrote:
> Andrew Coppin <[EMAIL PROTECTED]> wrote:
>> Achim Schneider wrote:
>> > Andrew Coppin <[EMAIL PROTECTED]> wrote:
>> >   
>> >> I have a file that contains several thousand words, separated by
>> >> white space. [I gather that on Unix there's a standard location for
>> >> this file?]
>> > Looking at /usr/share/dict/words, I'm assured that the proper
>> > separator is \n.
>> >   
>> 
>> Thanks. I did look around trying to find this, but ultimately failed.
>> (Is it a standard component, or is it installed as part of some
>> specific application?)
>> 
> [EMAIL PROTECTED] ~ % equery b /usr/share/dict/words
> [ Searching for file(s) /usr/share/dict/words in *... ]
> sys-apps/miscfiles-1.4.2 (/usr/share/dict/words)
> [EMAIL PROTECTED] ~ % eix miscfiles
> [I] sys-apps/miscfiles
>  Available versions:  1.4.2 {minimal}
>  Installed versions:  1.4.2(18:27:27 02/14/07)(-minimal)
>  Homepage:http://www.gnu.org/directory/miscfiles.html
>  Description: Miscellaneous files

On Ubuntu (and presumably Debian):

[EMAIL PROTECTED]: ~ > dpkg -S /usr/share/dict/words
dictionaries-common: /usr/share/dict/words

Cheers
Ben

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: How would you hack it?

2008-06-05 Thread Achim Schneider
Henning Thielemann <[EMAIL PROTECTED]> wrote:

> Sounds like a generator for scientific articles. :-)
> Maybe
> http://hackage.haskell.org/cgi-bin/hackage-scripts/package/markov-chain
> can be of help for you. It's also free of randomIO.
> 
I once invented this, though ungeneralised, for a map generator of an
RTS... the river always went from left to right, at approximately
the middle of the map, its direction being dependent on its current
offset from that middle, its width (that is, a wide river can bend
upwards and downwards by more than one tile) and a random factor. You
can also express this using a Markov chain.
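
A minimal sketch of how such a river could be written as a Markov
chain, under some assumptions: the state is only the river's offset
from the middle row, the width is left out, and the pull strength
(0.1 per tile, capped at 0.3) is an arbitrary choice. The names
stepRiver and river are hypothetical, and the caller supplies the
random numbers in [0, 1).

```haskell
-- | One Markov step: the next offset depends only on the current offset
-- and one random number r in [0, 1).
stepRiver :: Int -> Double -> Int
stepRiver offset r
  | r < pDecrease         = offset - 1
  | r < pDecrease + 1 / 3 = offset
  | otherwise             = offset + 1
  where
    -- The pull back towards the middle grows with the distance from it
    -- and is capped at 0.3 (an arbitrary, illustrative value).
    pull      = max (-0.3) (min 0.3 (0.1 * fromIntegral offset))
    pDecrease = 1 / 3 + pull

-- | Offsets of the river for each map column, starting at `start`.
river :: Int -> [Double] -> [Int]
river = scanl stepRiver
```

Because each step looks only at the current offset, the river wanders
but tends to drift back to the middle of the map.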

-- 
(c) this sig last receiving data processing entity. Inspect headers for
past copyright information. All rights reserved. Unauthorised copying,
hiring, renting, public performance and/or broadcasting of this
signature prohibited. 



Re: [Haskell-cafe] Re: How would you hack it?

2008-06-05 Thread Jules Bean

Achim Schneider wrote:
> If you run one over obscure academic papers, you can even generate
> publishable results. I don't have a link ready, but there was a fun
> incident involving this.


http://www.physics.nyu.edu/faculty/sokal/dawkins.html


[Haskell-cafe] Re: How would you hack it?

2008-06-04 Thread Achim Schneider
Andrew Coppin <[EMAIL PROTECTED]> wrote:

> Achim Schneider wrote:
> > Andrew Coppin <[EMAIL PROTECTED]> wrote:
> >
> >   
> >> I have a file that contains several thousand words, separated by
> >> white space. [I gather that on Unix there's a standard location for
> >> this file?]
> > Looking at /usr/share/dict/words, I'm assured that the proper
> > separator is \n.
> >   
> 
> Thanks. I did look around trying to find this, but ultimately failed. 
> (Is it a standard component, or is it installed as part of some
> specific application?)
> 
[EMAIL PROTECTED] ~ % equery b /usr/share/dict/words
[ Searching for file(s) /usr/share/dict/words in *... ]
sys-apps/miscfiles-1.4.2 (/usr/share/dict/words)
[EMAIL PROTECTED] ~ % eix miscfiles
[I] sys-apps/miscfiles
 Available versions:  1.4.2 {minimal}
 Installed versions:  1.4.2(18:27:27 02/14/07)(-minimal)
 Homepage:http://www.gnu.org/directory/miscfiles.html
 Description: Miscellaneous files

> > Generate a Map Int [String] map, with the latter list being an
> > infinite list of words with that particular size.
> >
> > Now assume that you want to have a 100 character sentence. You
> > start by looking if you got any 100 character word, if yes it's
> > your sentence, if not you divide it in half (maybe offset by a
> > weighted random factor [1]) and start over again.
> >
> > You can then specify your whole document along the lines of
> >
> > (capitalise $ words 100) ++ ". " ++ (capitalise $ words 10) ++ "?"
> > ++ (capitalise $ words 20) ++ "oneone1!" 
> >
> > [1] Random midpoint displacement is a very interesting topic by
> > itself. 
> 
> I'm not following your logic, sorry...
>
That's probably because I just described the points and not the rest
of the morphisms... imagine some plumbing and tape between my sentences.

Midpoint displacement is a great way to achieve randomness while still
keeping a uniform appearance. In the defining paper, which I don't have
to hand right now, an example was shown where a realistic outline of
Australia was generated from ten or so data points: If you display it
next to the actual outline, only a geographer could tell which one's
the fake.
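
A minimal one-dimensional sketch of midpoint displacement, not taken
from the paper mentioned above: split a segment at its midpoint, nudge
the midpoint by a random amount, and recurse on both halves with half
the amplitude. The names nextSeed, unit, segment and
midpointDisplacement are hypothetical, and the small linear
congruential generator is only there to keep the example free of
dependencies.

```haskell
-- | A tiny linear congruential generator; any random source would do.
nextSeed :: Int -> Int
nextSeed s = (s * 1103515245 + 12345) `mod` 2147483648

-- | Map a seed in [0, 2^31) to an offset in [-1, 1).
unit :: Int -> Double
unit s = fromIntegral s / 1073741824 - 1

-- | Subdivide the segment from a to b `depth` times. Returns the heights
-- from a up to but not including b, plus the advanced seed.
segment :: Int -> Double -> Int -> Double -> Double -> ([Double], Int)
segment depth amp seed a b
  | depth <= 0 = ([a], seed)
  | otherwise  =
      let seed1       = nextSeed seed
          -- Displace the midpoint by at most `amp` in either direction.
          mid         = (a + b) / 2 + amp * unit seed1
          -- Halve the amplitude so finer detail is proportionally smaller.
          (ls, seed2) = segment (depth - 1) (amp / 2) seed1 a mid
          (rs, seed3) = segment (depth - 1) (amp / 2) seed2 mid b
      in (ls ++ rs, seed3)

-- | 2^depth + 1 heights running from a to b, both endpoints included.
midpointDisplacement :: Int -> Double -> Int -> Double -> Double -> [Double]
midpointDisplacement depth amp seed a b =
  fst (segment depth amp seed a b) ++ [b]
```

With an amplitude of 0 the result is a straight line between the two
endpoints; larger amplitudes give a rougher profile that still passes
through both of them.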



Re: [Haskell-cafe] Re: How would you hack it?

2008-06-04 Thread Andrew Coppin

Achim Schneider wrote:
> Andrew Coppin <[EMAIL PROTECTED]> wrote:
>
>> I have a file that contains several thousand words, separated by
>> white space. [I gather that on Unix there's a standard location for
>> this file?]
>
> Looking at /usr/share/dict/words, I'm assured that the proper separator
> is \n.

Thanks. I did look around trying to find this, but ultimately failed.
(Is it a standard component, or is it installed as part of some specific
application?)


As I understand it, Haskell's "words" function will work on any kind of 
white space - spaces, line feeds, carriage returns, tabs, etc. - so it 
should be fine. ;-) Since I'm developing on Windows, what I actually did 
was have Google find me a file online that I can download.
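
A small sketch of that point: Prelude's words splits on every
character for which Data.Char.isSpace holds, so a list with Unix or
Windows line endings parses the same way. The names parseWordList and
loadWordList are made up for illustration, and the file path is
whatever you downloaded.

```haskell
-- | Split the contents of a word-list file into individual words.
-- `words` breaks on spaces, tabs, \n and \r alike and drops empty pieces.
parseWordList :: String -> [String]
parseWordList = words

-- | Read a word list from disk (lazily, via readFile).
loadWordList :: FilePath -> IO [String]
loadWordList path = fmap parseWordList (readFile path)
```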


[Remember my post a while back? "GHC panic"? Apparently GHC doesn't like 
it if you try to represent the entire 400 KB file as a single [String]...]



> Generate a Map Int [String] map, with the latter list being an infinite
> list of words with that particular size.
>
> Now assume that you want to have a 100 character sentence. You start by
> looking if you got any 100 character word, if yes it's your sentence,
> if not you divide it in half (maybe offset by a weighted random
> factor [1]) and start over again.
>
> You can then specify your whole document along the lines of
>
> (capitalise $ words 100) ++ ". " ++ (capitalise $ words 10) ++ "?" ++
> (capitalise $ words 20) ++ "oneone1!"
>
> [1] Random midpoint displacement is a very interesting topic by itself.


I'm not following your logic, sorry...



Re: [Haskell-cafe] Re: How would you hack it?

2008-06-04 Thread Henning Thielemann

On Wed, 4 Jun 2008, Achim Schneider wrote:

> Gregory Collins <[EMAIL PROTECTED]> wrote:
>
> > Andrew Coppin <[EMAIL PROTECTED]> writes:
> >
> > > Clearly, what I *should* have done is think more about a good
> > > abstraction before writing miles of code. ;-) So how would you guys
> > > do this?
> >
> > If you want text that roughly resembles English, you're better off
> > getting a corpus of real English text and running it through a Markov
> > chain. Mark Dominus has written a few blog posts about this topic
> > recently, see http://blog.plover.com/lang/finnpar.html.
> >
> If you run one over obscure academic papers, you can even generate
> publishable results. I don't have a link ready, but there was a fun
> incident involving this.

A famous paper generator is
   http://pdos.csail.mit.edu/scigen/


[Haskell-cafe] Re: How would you hack it?

2008-06-04 Thread Achim Schneider
Gregory Collins <[EMAIL PROTECTED]> wrote:

> Andrew Coppin <[EMAIL PROTECTED]> writes:
> 
> > Clearly, what I *should* have done is think more about a good
> > abstraction before writing miles of code. ;-) So how would you guys
> > do this?
> 
> If you want text that roughly resembles English, you're better off
> getting a corpus of real English text and running it through a Markov
> chain. Mark Dominus has written a few blog posts about this topic
> recently, see http://blog.plover.com/lang/finnpar.html.
> 
If you run one over obscure academic papers, you can even generate
publishable results. I don't have a link ready, but there was a fun
incident involving this.
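
A minimal word-level sketch of the Markov chain idea quoted above. It
does not use the markov-chain package; it only records which words
follow which in a corpus and then walks that table. The names Chain,
buildChain and walk are hypothetical, and the caller supplies a list
of integers to make the choices (for instance from a random
generator), which also bounds the length of the output.

```haskell
import qualified Data.Map as Map

-- | For every word, the words seen directly after it in the corpus.
-- Repeated successors stay in the list, so frequent pairs are more likely.
type Chain = Map.Map String [String]

buildChain :: [String] -> Chain
buildChain ws =
  Map.fromListWith (++) [(a, [b]) | (a, b) <- zip ws (drop 1 ws)]

-- | Walk the chain from a start word. Each integer picks one successor;
-- the walk stops when the picks run out or a word has no successor.
walk :: Chain -> [Int] -> String -> [String]
walk _     []       w = [w]
walk chain (p : ps) w =
  case Map.lookup w chain of
    Just nexts@(_ : _) -> w : walk chain ps (nexts !! (p `mod` length nexts))
    _                  -> [w]
```

Something like unwords (walk (buildChain (words corpus)) picks start)
then gives text whose adjacent word pairs all occur in the corpus.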



[Haskell-cafe] Re: How would you hack it?

2008-06-04 Thread Achim Schneider
Andrew Coppin <[EMAIL PROTECTED]> wrote:

> I have a file that contains several thousand words, separated by
> white space. [I gather that on Unix there's a standard location for
> this file?]
>
Looking at /usr/share/dict/words, I'm assured that the proper separator
is \n.

> Clearly, what I *should* have done is think more about a good 
> abstraction before writing miles of code. ;-) So how would you guys
> do this?
>
Generate a Map Int [String] map, with the latter list being an infinite
list of words with that particular size.

Now assume that you want to have a 100 character sentence. You start by
looking if you got any 100 character word, if yes it's your sentence,
if not you divide it in half (maybe offset by a weighted random
factor [1]) and start over again.

You can then specify your whole document along the lines of

(capitalise $ words 100) ++ ". " ++ (capitalise $ words 10) ++ "?" ++
(capitalise $ words 20) ++ "oneone1!" 

[1] Random midpoint displacement is a very interesting topic by itself.
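
A minimal sketch of the scheme above, with several simplifications
that are assumptions rather than part of the proposal: each bucket is
a plain finite list, the first word of a bucket is always used, the
split is an exact halving with no random offset, and one character is
reserved for the space between the two halves. The names byLength,
fill, capitalise and sentence are hypothetical.

```haskell
import qualified Data.Map as Map
import Data.Char (toUpper)

-- | Group words by their length, keeping the original order per bucket.
byLength :: [String] -> Map.Map Int [String]
byLength ws = Map.fromListWith (++) [(length w, [w]) | w <- reverse ws]

-- | Find words that, joined by single spaces, are exactly n characters.
-- Use a word of length n if there is one; otherwise split n in two
-- (minus one for the space) and fill both halves the same way.
fill :: Map.Map Int [String] -> Int -> Maybe [String]
fill dict n
  | n <= 0    = Nothing
  | otherwise =
      case Map.lookup n dict of
        Just (w : _) -> Just [w]
        _ | n < 3     -> Nothing   -- too short to split into two words
          | otherwise ->
              let left  = (n - 1) `div` 2
                  right = n - 1 - left
              in (++) <$> fill dict left <*> fill dict right

-- | Upper-case the first character of a string.
capitalise :: String -> String
capitalise []       = []
capitalise (c : cs) = toUpper c : cs

-- | A capitalised sentence of exactly n characters, if one can be built.
sentence :: Map.Map Int [String] -> Int -> Maybe String
sentence dict n = fmap (capitalise . unwords) (fill dict n)
```

Picking a random word from each bucket and displacing the split point
would need a random source threaded through fill, which is left out
here to keep the sketch short.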
