From: Kevin Viel <[EMAIL PROTECTED]>

I have obtained the results of a query in XML format:

<?xml version="1.0"?>
<!DOCTYPE eSummaryResult PUBLIC "-//NLM//DTD eSummaryResult, 29 October 2004//EN"
"http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eSummary_041029.dtd">
<eSummaryResult>
<DocSum>
        <Id>4609</Id>
        <Item Name="Name" Type="String">MYC</Item>
        <Item Name="Description" Type="String">v-myc myelocytomatosis viral oncogene homolog (avian)</Item>
        <Item Name="Orgname" Type="String">Homo sapiens</Item>
        <Item Name="Status" Type="Integer">0</Item>
        <Item Name="CurrentID" Type="Integer">0</Item>
        <Item Name="Chromosome" Type="String">8</Item>
        <Item Name="GeneticSource" Type="String">genomic</Item>
        <Item Name="MapLocation" Type="String">8q24.12-q24.13</Item>
        <Item Name="OtherAliases" Type="String">c-Myc</Item>
        <Item Name="OtherDesignations" Type="String">avian myelocytomatosis viral oncogene homolog|myc proto-oncogene protein|v-myc avian myelocytomatosis viral oncogene homolog</Item>
        <Item Name="NomenclatureSymbol" Type="String">MYC</Item>
        <Item Name="NomenclatureName" Type="String">v-myc myelocytomatosis viral oncogene homolog (avian)</Item>
        <Item Name="NomenclatureStatus" Type="String">Official</Item>
        <Item Name="TaxID" Type="Integer">9606</Item>
        <Item Name="Mim" Type="List">
                <Item Name="int" Type="Integer">190080</Item>
        </Item>
</DocSum>
</eSummaryResult>


Jenda Krynicky kindly provided:

use XML::Rules;

my $parser = XML::Rules->new(
 rules => [
  Id => 'content',
  Item => sub {$_[1]->{Name} => $_[1]->{_content}},
  # from the <Item> tags we are interested in the content
  # and want to use the Name attribute as the key to access
  # that value. We ignore the Type attribute.
  DocSum => sub {
   # by now all the data from the <Item>s are in the %{$_[1]} hash

   # Chromosome is a string field (it may be 'X' or 'Y'), so compare with ne
   if ($_[1]->{Chromosome} ne '8' or $_[1]->{NomenclatureName} !~ /\bviral\b/) {
    # keep only the 'viral' entries on the 8th chromosome, ignore the rest
    return;
   }

   # do something with the data
   # or return the part of the data you want to keep, using whatever
   # suits you best as the key
   return $_[1]->{Name} => $_[1];
  },
  eSummaryResult => 'pass no content',
 ]
);

my $data = $parser->parse($the_xml_or_file);

print $data->{MYC}{NomenclatureName}, "\n";
__END__

I'd like to understand this better. It seems to be a reference (little arrow). Is that the same as using \@referenced_array, for instance?
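To make the question concrete, here is a minimal sketch in plain Perl (no modules) of the three notations that could be meant by "little arrow". The variable names are made up for illustration and do not come from Jenda's code. As far as I can tell, the backslash creates a reference, the thin arrow -> follows one, and the fat arrow => is just a comma that quotes the word on its left.

```perl
use strict;
use warnings;

my @genes = ('MYC', 'TP53');     # hypothetical sample data

# A backslash in front of a variable creates a reference to it.
my $genes_ref = \@genes;

# Square brackets build an anonymous array and return a reference to it.
my $anon_ref = ['MYC', 'TP53'];

# The thin arrow -> follows a reference to reach one element.
my $first  = $genes_ref->[0];
my $second = $anon_ref->[1];

# The fat arrow => is only a comma that quotes the bareword on its left.
my %gene = (Name => 'MYC', Chromosome => '8');

print "$first $second $gene{Name}\n";
```

So \@referenced_array and [ ... ] both produce an array reference, while the arrows in the parser code do different jobs depending on whether they are thin or fat.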

It seems to be a hash with the key "rules" and a four-item array as its value. The third item of this array is a hash with a subroutine, or anonymous function declaration, as its value.
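One way to check that reading without the module is to take the argument list apart by hand. The sketch below is only a stand-in: @args imitates what is passed to XML::Rules->new(), the DocSum rule is replaced by an empty placeholder, and what the module does with the list internally is not shown.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the argument list passed to XML::Rules->new().
my @args = (
    rules => [
        Id             => 'content',
        Item           => sub { $_[1]->{Name} => $_[1]->{_content} },
        DocSum         => sub { return },   # placeholder for the real rule
        eSummaryResult => 'pass no content',
    ],
);

my %opts     = @args;              # one key, 'rules', whose value is an array reference
my @rules    = @{ $opts{rules} };  # flat list: tag name, rule, tag name, rule, ...
my %rule_for = @rules;             # the same pairs viewed as a hash

printf "%d elements, %d tag names\n", scalar @rules, scalar keys %rule_for;
print ref($opts{rules}), " ", ref($rule_for{Item}), "\n";
```

Seen this way, the array holds eight elements that form four tag-name/rule pairs, and each rule is either a plain string or a code reference rather than a nested hash.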

I am wrong, correct?

  A) Correct, you were incorrect.
  B) Incorrect, you were correct.
  C) You're still buying beer.

To start with specific questions, could someone explain:

>   Item => sub {$_[1]->{Name} => $_[1]->{_content}}
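To experiment with that line in isolation, the anonymous sub can be called by hand. This is a sketch under an assumption: that the parser calls each rule with the tag name first and then a hash reference holding the tag's attributes plus its text under _content, which is what the comments in the code above suggest. The %attrs hash is made-up sample data.

```perl
use strict;
use warnings;

# The same anonymous sub as in the Item rule, stored in a variable.
my $item_rule = sub { $_[1]->{Name} => $_[1]->{_content} };

# Assumed calling convention: tag name, then a hash reference with the
# attributes and the text content of the tag.
my %attrs = (Name => 'Chromosome', Type => 'String', _content => '8');
my @pair  = $item_rule->('Item', \%attrs);

print "$pair[0] => $pair[1]\n";   # prints "Chromosome => 8"
```

If that assumption holds, each <Item> hands back a two-element list, the Name attribute and the text, which would explain how the DocSum rule ends up with a hash keyed by names such as Chromosome.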

Thanks,

Kevin


--
Kevin Viel
Department of Genetics                       [EMAIL PROTECTED]
Southwest Foundation for Biomedical Research phone:  (210)258-9884
P.O. Box 760549                              fax:    (210)258-9444
San Antonio, TX 78245-0549

Kevin Viel
PhD Candidate
Department of Epidemiology
Rollins School of Public Health
Emory University
Atlanta, GA 30322
