Hello PHP-list,
I'm building a script that provides a honeypot of invalid email
addresses for spambots. For this I want to offer a macro for the
templates that looks like %rand[x]-[y]%, where [x] and [y] are
integers that specify the minimum and maximum length of the random string.
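One way such a macro could be parsed is with a single regular expression. This is only a minimal sketch, assuming the macro is always written with plain digits as %rand<min>-<max>%; expandRandMacros() and its $makeString callback are hypothetical names made up for the example, and the closures need PHP 5.3 or later:

```php
<?php
// Hypothetical helper: expands every %rand<min>-<max>% macro in $template.
// $makeString is any callable that takes a length and returns a string.
function expandRandMacros($template, $makeString)
{
    return preg_replace_callback(
        '/%rand(\d+)-(\d+)%/',
        function ($m) use ($makeString) {
            $min = (int) $m[1];
            $max = (int) $m[2];
            if ($min > $max) {               // tolerate swapped bounds
                list($min, $max) = array($max, $min);
            }
            return $makeString(mt_rand($min, $max));
        },
        $template
    );
}

// Usage with a trivial stand-in generator that just repeats 'x':
echo expandRandMacros(
    'id: %rand3-3%, fake %rand stays as-is',
    function ($len) { return str_repeat('x', $len); }
);
// prints "id: xxx, fake %rand stays as-is"
```

Text that does not match the full pattern, such as a lone %rand, is left untouched, so no separate check for fake macros is needed.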
My first thoughts were about the best way to parse the %rand..% part,
but I have now reached the point where I could also use suggestions on
the random string generation.
The generator only builds very basic pseudo-words, and the only meaningful
word I have discovered so far was 'Java'.. *g
Generating a random string of length 1,000,000 takes about
13 seconds on my XP 1600+. That's quite a lot, imho, so suggestions
are very welcome.
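Regarding speed, two things in a per-character loop are cheap to change: look up the array sizes once rather than calling count() for every character, and use mt_rand(), which the PHP manual describes as faster than the libc-based rand() of older PHP versions (from PHP 7.1 on, rand() is an alias of mt_rand(), so that part no longer matters). The following is a simplified sketch under those assumptions, not a replacement for the full generator; fastRandomText() is a hypothetical name, microtime(true) needs PHP 5, and whether the changes help noticeably would have to be measured:

```php
<?php
// Hypothetical simplified generator: alternates consonants and vowels and
// inserts a space after 4 to 8 letters. It does not reproduce punctuation
// or capitalisation.
function fastRandomText($length)
{
    $vowels     = array('a', 'e', 'i', 'o', 'u');
    $consonants = array('b', 'c', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm',
                        'n', 'p', 'q', 'r', 's', 't', 'v', 'w');
    // Cache the highest indexes once instead of calling count() per character.
    $vMax = count($vowels) - 1;
    $cMax = count($consonants) - 1;

    $out     = '';
    $inWord  = 0;
    $wordLen = mt_rand(4, 8);
    for ($i = 0; $i < $length; $i++) {
        if ($inWord >= $wordLen) {           // word is full: emit a space
            $out    .= ' ';
            $inWord  = 0;
            $wordLen = mt_rand(4, 8);
        } elseif ($inWord % 2 === 0) {       // even position: consonant
            $out .= $consonants[mt_rand(0, $cMax)];
            $inWord++;
        } else {                             // odd position: vowel
            $out .= $vowels[mt_rand(0, $vMax)];
            $inWord++;
        }
    }
    return $out;
}

// Time one run; every loop pass appends exactly one character.
$start = microtime(true);
$text  = fastRandomText(200000);
printf("%d chars in %.4fs\n", strlen($text), microtime(true) - $start);
```

Timing the simplified loop next to the full generator should show how much of the 13 seconds comes from the loop overhead and how much from the word logic itself.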
My current script goes here, ready to copy'n'paste:
--------------------------------------------------------------
list($low, $high) = explode(" ", microtime());
// plain variable: $this is not available outside of a class
$timerstart = $high + $low;

function parserandstr($toparse){
    $debug = 0;
    $new = '';
    $ch = array(
        'punct' => array('.', '.', '.', '..', '!', '!!', '?!'),
        'sep' => array(', ', ' - '),
        'vocal' => array('a', 'e', 'i', 'o', 'u'),
        'cons_low' => array('x', 'y', 'z'),
        'cons_norm' => array('b', 'c', 'd', 'f', 'g', 'h', 'j', 'k',
            'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'w')
    );
    while ( ($pos = strpos($toparse, '%rand')) !== FALSE){
        if ($debug) echo '<br><br>$pos: ' . $pos;
        $new .= substr($toparse, 0, $pos);
        if ($debug) echo '<br>$new: "' . $new . '"';
        $toparse = substr($toparse, $pos + 5);
        if ($debug) echo '<br>$toparse: "' . $toparse . '"';
        $posclose = strpos($toparse, '%');
        if ($debug) echo '<br>$posclose: "' . $posclose . '"';
        // only expand when a closing % exists and the part in between is "<min>-<max>"
        if ($posclose && preg_match('/^\d+-\d+$/', substr($toparse, 0, $posclose))){
            $rlength = substr($toparse, 0, $posclose);
            if ($debug) echo '<br>$rlength: "' . $rlength . '"';
            $possep = strpos($rlength, '-');
            $minrlen = (int) substr($rlength, 0, $possep);
            $maxrlen = (int) substr($rlength, $possep + 1);
            if ($debug) echo '<br>$minrlen: "' . $minrlen . '"';
            if ($debug) echo '<br>$maxrlen: "' . $maxrlen . '"';
            $rlen = rand($minrlen, $maxrlen);
            // generate random string
            $randstr = ''; $inword = 0; $insentence = 0; $lastchar = '';
            $lastwasvocal = false; // was used uninitialized before
            for($j = 0; $j < $rlen; $j++){
                if ($inword > 3 && rand(0, 5) == 1) { // end of word: space or punctuation
                    if (rand(0,5) > 0) $char = ' ';
                    else {
                        $char = $ch['punct'][rand(0, count($ch['punct'])-1)] . ' ';
                        $j += strlen($char)-1;
                        $insentence = 0;
                    }
                    $inword = 0;
                }
                else {
                    if (!$lastwasvocal && rand(0, 10) > 6) { // vocals
                        $char = $ch['vocal'][rand(0, count($ch['vocal'])-1)];
                        $lastwasvocal = true;
                    } else {
                        do {
                            if (rand(0, 30) > 0) // normal priority consonants
                                $char = $ch['cons_norm'][rand(0, count($ch['cons_norm'])-1)];
                            else $char = $ch['cons_low'][rand(0, count($ch['cons_low'])-1)];
                        } while ($char == $lastchar);
                        $lastwasvocal = false;
                    }
                    $inword++;
                    $insentence++;
                }
                // capitalize sentence starts and, now and then, word starts
                if ($insentence == 1 || ($inword == 1 && rand(0, 30) < 10))
                    $randstr .= strtoupper($char);
                else $randstr .= $char;
                $lastchar = $char;
            }
            $new .= $randstr;
            if ($debug) echo '<br>$new: ' . $new;
            $toparse = substr($toparse, $posclose + 1);
            if ($debug) echo '<br>$toparse: "' . $toparse . '"';
        } else $new .= '%rand'; // not a valid macro: put the text back
    }
    return $new . $toparse;
}

function pre_dump($var, $desc=''){
    echo '<pre>::'.$desc.'::<br>'; var_dump($var); echo '</pre>';
}

#$s = parserandstr('random string comes here: '
#    . '%rand10-1000%. this is a fake %rand and should not be killed..');
$s = parserandstr('%rand200000-200000%');
echo '<br><br>' . $s;
echo '<br><br>' . strlen($s);
list($low, $high) = explode(" ", microtime());
$t = $high + $low;
printf("<br>loaded in: %.4fs", $t - $timerstart);
------------------------------------------------------------
--
shinE!
http://www.thequod.de ICQ#152282665
PGP 8.0 key: http://thequod.de/danielhahler.asc
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php