Re: Getting contents of file into a hash

Michael Lamertz Mon, 29 Apr 2002 05:15:51 -0700

On Mon, Apr 29, 2002 at 11:59:42AM +0100, Anders Holm wrote:
> Hi folks!
> 
> Could someone point me in the right direction on this one.
> 
> I'm trying to input /proc/cpuinfo
> on a Linux box into a hash. This file
> contains information of the processors on the running machine, what type
> they are, number of cpu's etc. and I want to use this info later.


Ok, let's look at the data:

    ---------- snip ----------
    nijushiho:~$ cat /proc/cpuinfo 
    processor       : 0
    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 8
    model name      : Pentium III (Coppermine)
    stepping        : 6
    cpu MHz         : 798.266
    cache size      : 256 KB
    fdiv_bug        : no
    hlt_bug         : no
    f00f_bug        : no
    coma_bug        : no
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 2
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse
    bogomips        : 1592.52

    nijushiho:~$ 
    ---------- snip ----------

> Now the problem I have is that I'd like to get it into a hash, and I just
> can't figure out how to do this. I can split it up at the : delimiter, but
> I'm not sure how then to get in the data in a hash properly.

You're right on track (with some exceptions, but read about that further
down the page).

    my ($key, $value) = split /:/;

will give you e.g.
    $key   "cpu family"
    $value "6"

You access the "cells" of a hash via their 'key', and you write the
value into that cell by using an assignment.

    $hash{$key} = $value

Note that most beginners consider it funny that it's written

    $hash{$key}

and not
    
    %hash{$key}

The reason for the notation is that the leftmost character describes
what you have in hand, and not what 'hash' is.  So '$hash{$key}' means

    The *SCALAR* inside the hash '%hash' at position '$key'
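
To make the sigil rule concrete, here is a small sketch. The hash contents are made-up sample values, and the slice line goes a step beyond what is described above, so treat it as an illustration only:

```perl
use strict;
use warnings;

# Sample data only; the keys mimic /proc/cpuinfo fields.
my %hash = ('cpu family' => 6, model => 8, stepping => 6);

# '$' because one element of the hash is a single scalar
my $family = $hash{'cpu family'};

# '@' because a slice of the hash is a list of values
my @pair = @hash{'model', 'stepping'};

# '%' only when you mean the hash as a whole
my $count = keys %hash;

print "family=$family pair=@pair count=$count\n";
```

In each case the leading character says what you get back, not what kind of variable 'hash' is.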


The same goes for accessing the content of the hash when reading it out
later.

    print $hash{'cpu family'}
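
Putting the pieces together, a minimal sketch of the whole loop could look like this. It reads from an in-memory sample instead of /proc/cpuinfo so it runs anywhere; the variable names and the whitespace trimming are my additions, not something the split alone gives you:

```perl
use strict;
use warnings;

# Sample input standing in for /proc/cpuinfo (assumption: in real use
# you would open the file itself and loop over that handle).
my $sample = <<'END';
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model name      : Pentium III (Coppermine)

END

open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";

my %hash;
while (my $line = <$fh>) {
    chomp $line;
    next unless $line =~ /:/;              # skip lines without a delimiter
    my ($key, $value) = split /:/, $line;
    $value = '' unless defined $value;     # nothing after the colon
    s/^\s+|\s+$//g for $key, $value;       # trim the padding on both parts
    $hash{$key} = $value;
}
close $fh;

print $hash{'cpu family'}, "\n";
```

Without the trimming step the key would be 'cpu family      ' including the padding, and the lookup at the end would find nothing.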

For more information about hashes read

    perldoc perldata

    perldoc perldsc
    perldoc perlreftut
    perldoc perlref

Start with the first one.  The other three quickly lead you to
'references', which add a layer of indirection to all your data.  Go
with the basics first, and once you've grokked those, go for the rest.

> More than likely I'm describing this completely wrong, but hey, I'm a
> newbie.. ;)

Nope, that was pretty ok.


Ok, now on to some discussion of the use of 'split' here:

*IF* you have full control over your input data, using split is just
fine.  If, however, you're relying on others to not change their output
you're on thin ice.

Of course you need to make assumptions about the data you're working
with, but you should optimize for robustness.

So if someone decides that there's a terribly cool feature for xmyrgle
CPUs called effectiveness that's to be displayed as a ratio

    xmyrgle fx          : 3:2

split will give you

    'xmyrgle fx', 3, 2

and since you only assigned that to two variables, the trailing 2 is
silently dropped and your value is truncated to 3.
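
A quick sketch to see the problem in action. The second half uses split's optional third argument (LIMIT), which is not discussed in this mail; it is shown only for comparison:

```perl
use strict;
use warnings;

my $line = 'xmyrgle fx          : 3:2';

# Plain split: the part after the second colon lands in a third field
# that the two-variable assignment never sees.
my ($key, $value) = split /:/, $line;
print "plain split: [$value]\n";       # prints "plain split: [ 3]"

# split with a LIMIT of 2 stops splitting after the first colon.
my ($key2, $value2) = split /:/, $line, 2;
print "with limit : [$value2]\n";      # prints "with limit : [ 3:2]"
```

Either way the padding around the colon is still in there and has to be trimmed separately.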

A much saner assumption would be that everything up to the first colon
will be the key and all the rest will be the value

    my ($key, $val) = /\s*              # Skip leading whitespace
                        ([^\:]*?)       # Start with everything not ':'
                                        #    minimize so whitespace gets
                                        #    eaten by next statement
                        \s* : \s*       # Allow whitespace around ':'
                        (.*?)           # Slurp in all the rest,
                                        #    minimizing again
                        \s*             # Catch trailing whitespace
                        $/x;            # x allows fancy display and
                                        #    commenting of regexp

Well, that code will break too if somebody considers it cool to use a
colon inside the name field: the key gets cut at the first colon and the
remainder ends up in the value.  Even in that case, you'll still have
all the information available.  Nothing is lost, nothing gets thrown
away except for the whitespace and the colon.
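
If you'd rather have this in a script than in a one-liner, the match could be wrapped up roughly like this. 'parse_line' is a hypothetical helper name of mine and the input lines are samples, so take it as a sketch:

```perl
use strict;
use warnings;

# parse_line is a hypothetical helper, not a standard function.
# It returns (key, value) for a "key : value" line, or an empty list
# if the line contains no colon at all.
sub parse_line {
    my ($line) = @_;
    return unless $line =~ /\s*([^:]*?)\s*:\s*(.*?)\s*$/;
    return ($1, $2);
}

my %hash;
for my $line ('cpu MHz         : 798.266',
              '    xmyrgle fx          : 3:2  ',
              '') {
    # A list assignment in scalar context yields the number of items
    # returned, so an empty list skips the line.
    my ($key, $val) = parse_line($line) or next;
    $hash{$key} = $val;
}

print "$_ => $hash{$_}\n" for sort keys %hash;
```

The empty line is skipped, and the ratio survives as '3:2' with the surrounding whitespace gone.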

Proof of concept:

    ---------- snip ----------
    nijushiho:~$ (cat /proc/cpuinfo; echo '    xmyrgle fx          : 3:2  ') |\
    >     perl -MData::Dumper -lane '/\s*([^\:]*?)\s*:\s*(.*?)\s*$/ or next;
    >         $hash{$1} = $2; 
    >         END { print Dumper(\%hash) }'
    $VAR1 = {
              'coma_bug' => 'no',
              'flags' => 'fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse',
              'cpu MHz' => '798.266',
              'wp' => 'yes',
              'f00f_bug' => 'no',
              'processor' => '0',
              'xmyrgle fx' => '3:2',                          # <=== TADAAA
              'bogomips' => '1592.52',
              'vendor_id' => 'GenuineIntel',
              'stepping' => '6',
              'cpuid level' => '2',
              'model name' => 'Pentium III (Coppermine)',
              'hlt_bug' => 'no',
              'model' => '8',
              'fpu' => 'yes',
              'cache size' => '256 KB',
              'fdiv_bug' => 'no',
              'fpu_exception' => 'yes',
              'cpu family' => '6'
            };

    nijushiho:~$ 
    ---------- snip ----------

Just some "food for thought" (tm)...

PS:  use '//x'!

-- 
                       If we fail, we will lose the war.

Michael Lamertz                        |      +49 221 445420 / +49 171 6900 310
Nordstr. 49                            |                       [EMAIL PROTECTED]
50733 Cologne                          |                 http://www.lamertz.net
Germany                                |               http://www.perl-ronin.de 
