hello,
i'm working on this exercise and would appreciate any comments. i want to build
a machine that scans a text and splits it into elements (stored as keys in a
hash table). the keys are then connected probabilistically, so that starting
from one word we can jump to other words at random. this lets us generate a
sentence that is reasonably independent of our own bias. the point is that if
the probabilities make sense (for example, if they are real statistics from
high-quality text, such as the work of famous writers), i hope the machine can
generate one or two entertaining sentences.

#lang racket
(require 2htdp/batch-io)
(require racket/hash)

(define input (read-file "sample.txt"))
(define data (remove-duplicates (string-split input)))

to keep it simple at first, i don't use the frequency of the elements (mostly
words) just yet. i just build a hash table that takes each element as a key; the
associated value is a dispatching rule. to begin with, the dispatching rule is
simple: the current key is connected to two other keys, each with a
probability. for example, the element "run" is connected to "instead." with
probability .48 and to "helmets" with probability .08.

(hash "run" (hash 0.48 "instead." 0.08 "helmets"))


(struct state (word dispatch) #:transparent)

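;; pick one element of lst uniformly at random
(define (random-member lst)
  (list-ref lst (random (length lst))))

;; map every word in lst to a dispatch hash with up to two random
;; probability -> next-word entries (one if the rounded keys collide)
(define (make-sample-machine lst)
  (define (make-transition)
    (hash (round-2 (random)) (random-member lst)
          (round-2 (random)) (random-member lst)))
  (foldl (lambda (word h) (hash-union h (hash word (make-transition))))
         (hash) lst))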

(define (round-n x n)
  (/ (round (* x (expt 10 n))) (expt 10 n)))
(define (round-2 x)
  (round-n x 2))

(define m (make-sample-machine data))

the machine's data looks like this:

'#hash(("spread" . #hash((0.9 . "right") (0.21 . "deep")))
       ("instead" . #hash((0.64 . "then") (0.19 . "dark")))
       ("through" . #hash((0.3 . "meadow") (0.95 . "white,")))
       ("their" . #hash((0.56 . "instead") (0.98 . "valley,")))

now i try to generate a sentence of 10 words. i guess it needs some kind of
loop, but when i write the function, it is super slow.

the idea is that we pick the first word at random; this first word has an
associated dispatching rule, and we use the probabilities in that rule to
pick the next word at random.

is the function super slow because i use too much randomisation?

(define (accumulate lst)
  (define total (apply + lst))
  (let absolute->relative ([elements lst] [so-far #i0.0])
    (cond
     [(empty? elements) '()]
     [else (define nxt (+ so-far (round-2 (/ (first elements) total))))
           (cons nxt (absolute->relative (rest elements) nxt))])))

(define (randomise accumulated-lst)
  (define r (random))
  (for/last ([p (in-naturals)] [% (in-list accumulated-lst)] #:final (< r %))
    p))

(define (generate-text m n)
  (define l (hash-count m))
  (define r (random l))
  (match-define (cons first-word dispatch) (hash-iterate-pair m r))
  (cons first-word
        (let generate ([count-down n] [next-batch dispatch])
          (cond
           [(zero? count-down) '()]
           [else
            (define proba (hash-keys dispatch))
            (define next-word
              (hash-iterate-key m (randomise (accumulate proba))))
            (define next-dispatch (hash-ref m next-word))
            (cons next-word (generate (- n 1) next-dispatch))]))))
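
a possible explanation for the slowness, offered as a sketch and not a
definitive fix: the recursive call in generate-text passes (- n 1) instead of
(- count-down 1), so for n of 2 or more the counter never reaches zero and the
loop never ends. the loop also reads the probabilities from the first word's
dispatch rather than from next-batch, and hash-iterate-key indexes into m
itself instead of looking the word up in the dispatch. the version below
assumes that every next word is also a key of the machine and that no dispatch
is empty; pick-next, generate-words and tiny-machine are made-up names for
illustration only.

```racket
#lang racket
;; sketch only: pick-next, generate-words and tiny-machine are
;; hypothetical names, not part of the code above.

;; pick a word from a dispatch hash (weight -> word). the keys are treated
;; as relative weights, so they do not have to sum to 1.
;; assumes the dispatch has at least one entry.
(define (pick-next dispatch)
  (define total (apply + (hash-keys dispatch)))
  (define r (* (random) total))
  (let loop ([pairs (hash->list dispatch)] [so-far 0.0])
    (match-define (cons weight word) (first pairs))
    (define upto (+ so-far weight))
    (if (or (< r upto) (empty? (rest pairs)))
        word
        (loop (rest pairs) upto))))

;; return a random first word followed by n generated words.
(define (generate-words m n)
  (define first-word (list-ref (hash-keys m) (random (hash-count m))))
  (let generate ([count-down n] [word first-word])
    (cond
      [(zero? count-down) (list word)]
      [else
       ;; follow the current word's own dispatch, and count down on
       ;; count-down (not on n) so that the recursion terminates
       (define next-word (pick-next (hash-ref m word)))
       (cons word (generate (- count-down 1) next-word))])))

;; a tiny hand-made machine in the same shape as the sample machine
(define tiny-machine
  (hash "run"     (hash 0.48 "instead" 0.08 "helmets")
        "instead" (hash 0.64 "run"     0.19 "helmets")
        "helmets" (hash 0.3  "run"     0.95 "instead")))

(generate-words tiny-machine 10) ; 11 words, different on each run
```

calls to random are cheap, so the amount of randomisation is unlikely to be the
bottleneck. picking the first word with list-ref over hash-keys also avoids
relying on hash-iterate-pair accepting an arbitrary integer as a position.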

this exercise is at the beginner level, so i guess someone must have done it
before. does anyone have experience with this? for example, is a hash table a
good way to represent the data? and how should i handle the case where the
sample text (and so the hash table) becomes very large?
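
on the representation question, one alternative worth considering (an untested
assumption, not a measured result): key the inner hash by the next word and
store a count, instead of keying it by probability. with probabilities as keys,
two next words that happen to share the same probability would collide on one
key. the sketch below counts adjacent word pairs; it needs the words in their
original order, i.e. (string-split input) without remove-duplicates.
count-transitions and sample-words are hypothetical names.

```racket
#lang racket
;; sketch only: count-transitions and sample-words are hypothetical names.

;; build word -> (next-word -> count) from a list of words in text order,
;; counting every adjacent pair once per occurrence.
(define (count-transitions words)
  (for/fold ([table (hash)])
            ([w (in-list words)]
             [next (in-list (if (empty? words) '() (rest words)))])
    (hash-update table w
                 (lambda (row) (hash-update row next add1 0))
                 (hash))))

(define sample-words
  (string-split "the rye fields and the meadows and the rye"))

(count-transitions sample-words)
```

the counts can be used directly as relative weights when choosing the next
word, and the table grows with the number of distinct word pairs rather than
with the length of the text.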

here is the sample text:

They had marched more than thirty kilometres since dawn, along the white, hot 
road where occasional thickets of trees threw a moment of shade, then out into 
the glare again. On either hand, the valley, wide and shallow, glittered with 
heat; dark green patches of rye, pale young corn, fallow and meadow and black 
pine woods spread in a dull, hot diagram under a glistening sky. But right in 
front the mountains ranged across, pale blue and very still, snow gleaming 
gently out of the deep atmosphere. And towards the mountains, on and on, the 
regiment marched between the rye fields and the meadows, between the scraggy 
fruit trees set regularly on either side the high road. The burnished, dark 
green rye threw off a suffocating heat, the mountains drew gradually nearer and 
more distinct. While the feet of the soldiers grew hotter, sweat ran through 
their hair under their helmets, and their knapsacks could burn no more in 
contact with their shoulders, but seemed instead to give off a cold, prickly 
sensation.

thank you,
and have a good day,
(if you read until this point)

 
