On 11.11.2024 at 03:04, Cottrell, Allin wrote:
On Sun, Nov 10, 2024 at 1:27 PM Sven Schreiber
<sven.schrei...@fu-berlin.de> wrote:
I'm currently looking at ch. 21 of the guide, "Cheat sheet". I'd propose
the following cleanups (which I could apply if people agree):
Following up on this earlier list, here's an update:
(and see below for a crash...)
section 21.1:
- Time averaging of panel datasets: I think it would be nice to use a
real-world dataset such as grunfeld.gdt instead of the slightly
distracting code for creating artificial data.
Here's a tested variant of what I meant:
<hansl>
open grunfeld
# how many periods (here: years) to average
newfreq = 4
# a dummy for endpoints
series endpoint = (time % newfreq == 0) # 'time' already in dataset
list X = invest value kstock # time-varying variables
# compute averages
loop foreach i X
    series $i = movavg($i, newfreq)
endloop
# drop extra observations
smpl endpoint --dummy --permanent
# restore panel structure
setobs firm year --panel-vars
print firm year X -o
</hansl>
OK with you guys to replace the old example, which uses artificial data, with this one?
section 21.2:
- Generating a dummy variable for a specific observation: Instead of
t=="Italy" one can also write obs=="Italy", which may be more intuitive
for cross-sectional data.
Already done (by Allin).
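Just to illustrate the obs=="Italy" idiom, here's a small untested sketch on
artificial data (the defarray/markers set-up is only there to give the
observations country-name markers; I hope I remember the --from-array option
correctly):
<hansl>
# artificial cross-section with country names as observation markers
nulldata 4
strings S = defarray("France", "Germany", "Italy", "Spain")
markers --from-array=S
# dummy for a single named observation
series d_italy = (obs == "Italy")
print d_italy --byobs
</hansl>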
- Generating a “subset of values” dummy: Nowadays one could use the
contains() function, I think, which would be more readable.
Here's an artificial but also tested example of what I mean:
<hansl>
nulldata 10
series src = {1,2,3,12,13,14,22,23,24,25}
matrix sel = {2,13,14,25}
series D1 = contains(src, sel)
</hansl>
So I think that the long-ish paragraph about the "clever solution" could
be deleted. Also, I'm not sure that what is then labeled as the "proper
solution" using the replace() function is actually "more proper" than
the one I gave using contains(). Opinions?
section 21.3:
- Interaction dummies (p. 194 of the A4 guide version from October):
remove the old string-substitution-based code that pre-dates the
interaction operator (^; which is also already mentioned there).
Again, is the old solution (starting with "But back in my day...")
really still needed?
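For comparison, here's roughly how the ^-based approach looks on artificial
data (untested sketch; the names gretl assigns to the generated interaction
series may differ from what one expects):
<hansl>
# sketch of the ^ (interaction) operator on lists
nulldata 40
series x = normal()
series d = (uniform() > 0.5)   # a 0/1 dummy
setinfo d --discrete
list X = x
list D = d
list XD = X ^ D   # interaction terms between x and the values of d
print XD --byobs
</hansl>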
- Realized volatility: Is this example even consistent? It starts by
talking about minutes and hours, but then switches over to seconds and
minutes. Maybe that's part of the clever trick, I don't know... Apart
from that, it seems that another trick in the cheat sheet could be
re-used here, namely "Moving functions for time series".
OK, so here's something (IMHO) much more straightforward for calculating a
per-hour volatility, using the aggregate() function:
<hansl>
nulldata 720
setobs 60 1:1 --time-series # 60 minutes per hour
series x = normal()
matrix v = aggregate(x, $obsmajor, var) # $obsmajor means hour here
print v
dataset compact 1 # yields an error!
series rv = v[,end]
</hansl>
HOWEVER, for the "dataset compact 1" line gretl tells me "not
supported", and I don't understand why. Shouldn't it be quite easy to
compact from any periodicity down to 1?
Plus, when I try the compaction in the GUI on this artificially created
dataset, the program crashes (disappears). This is 2024d on Windows.
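In the meantime a possible workaround, at least for the script case, would be
to skip the compaction and rebuild the dataset at the hourly frequency
directly. Untested sketch, assuming that the 720 minutes correspond to 12
hours and that --preserve keeps the matrix v around:
<hansl>
# workaround sketch, to be run after the aggregate() call above
nulldata 12 --preserve    # 720 minutes = 12 hours; keeps matrix v
setobs 1 1 --time-series
series rv = v[,end]       # last column of the aggregate() output
print rv --byobs
</hansl>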
- Looping over two paired lists: Can't this one be generalized by using
Lx[i] and Ly[i] instead of y$i and x$i?
Already done.
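For the record, the kind of generalization I had in mind looks like this
(artificial data, untested sketch):
<hansl>
# loop over two paired lists by index
nulldata 50
series y1 = normal()
series y2 = normal()
series x1 = normal()
series x2 = normal()
list Ly = y1 y2
list Lx = x1 x2
loop i=1..nelem(Ly)
    # Ly[i] and Lx[i] pick the i-th member of each list
    ols Ly[i] const Lx[i]
endloop
</hansl>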
- Cross-validation: Could it be that using some feature of the regls
apparatus or a contributed package (by Artur?) would be more practical
nowadays?
It mentions the leverage command; maybe that was already the answer to my
previous remark, I'm not sure.
- Is my matrix result broken? - One could now use sum() instead of
sumc(sumr()).
Already done.
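Just to spell out the point (my own untested sketch, not the cheat sheet's
text): a single NaN propagates through sum() of a matrix, so one call is
enough to flag a broken result:
<hansl>
# any missing/NaN element makes sum(A) non-finite
matrix A = mnormal(3, 3)
A[2,2] = NA   # inject a missing value
printf "sum(A) = %g (NaN signals a broken matrix)\n", sum(A)
</hansl>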
These are all good points. Let's see if we can address them.
As indicated item by item above, some of it was already addressed,
thanks for that.
cheers
sven