This is somewhat related to the recent discussion of "Binary Logit with string series".

As you may know, when gretl reads string-valued series from (e.g.) a CSV file, consecutive 1-based numeric codes are attached to the strings in their order of occurrence. In some cases that's fine, but if the string values have a "natural" ordering it may not be so fine.

Consider the following CSV (I'll call it strorder.csv):

income,working
"high","yes"
"low","yes"
"middle","no"
"middle","no"
"low","yes"
"high","yes"
"middle","no"

On importing these data, income will be coded high=1, low=2, middle=3 and working will be coded yes=1, no=2. One may want to recode income as low=1, middle=2, high=3, and also recode working as no=1, yes=2. That's not super difficult in hansl, but I wonder if it would be worthwhile to provide a built-in function to do the job: this could be a new function, or perhaps could be enabled as a variant of strvsort(), via an optional second argument.
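
To see the default coding for yourself, something like the following sketch should do. It assumes the CSV above has been saved as strorder.csv in the working directory, and it relies on strvals() returning the strings in the order of their numeric codes; the array name S is just for illustration.

<hansl>
open strorder.csv -q

# strings attached to income, ordered by numeric code (1, 2, ...)
strings S = strvals(income)
loop i=1..nelem(S)
   printf "code %d -> %s\n", i, S[i]
endloop
</hansl>

Given the order of occurrence in the file, this should list high, low, middle in that order.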

Anyway, here's a hansl prototype for such a function. The idea is that you pass in the original series along with an array in which the original string values are ordered as you want them coded, and you get back a new string-valued series.

<hansl>
set verbose off

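function series strorder (series x, strings S)
   # strings attached to x, in the order of their current numeric codes
   strings Sx = strvals(x)
   scalar ns = nelem(Sx)
   if ns != nelem(S)
      funcerr "Replacement strings array is of wrong length"
   endif
   # v[i] will hold the new code for the string that currently has code i
   matrix v = zeros(ns, 1)
   matrix chk
   loop i=1..ns
      # position(s) of the i-th original string within S
      chk = instrings(S, Sx[i])
      if rows(chk) == 0
         funcerr "Bad replacement string"
      endif
      v[i] = chk[1]
   endloop
   # map old codes 1..ns to the new codes, then attach S as the string table
   series ret = replace(x, seq(1,ns), v)
   stringify(ret, S)
   return ret
end function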

open strorder.csv -q

print "string values of income:"
eval strvals(income)
income2 = strorder(income, defarray("low", "middle", "high"))
print "string values of income2:"
eval strvals(income2)
print income income2 -o # should be identical

print "string values of working:"
eval strvals(working)
working2 = strorder(working, defarray("no", "yes"))
print "string values of working2:"
eval strvals(working2)
print working working2 -o # should be identical
</hansl>

Allin Cottrell