This looks like a good application for XML::Twig.  The SYNOPSIS for that module 
shows examples of how to set up handlers for matching and processing an XML file.
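
A minimal sketch of such a handler, assuming each 'Item' in the dump is an
<Item> element whose text is the tag name. The filter_items helper and its
arguments are hypothetical names for illustration, not part of XML::Twig:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: read $in_file, drop every <Item> whose text does
# not match $keep_re, and write what is left to $out_file.
sub filter_items {
    my ($in_file, $out_file, $keep_re) = @_;

    my $twig = XML::Twig->new(
        twig_handlers => {
            # Runs once per completed <Item>; $_ is that element.
            Item => sub { $_->delete unless $_->text =~ $keep_re },
        },
        pretty_print => 'indented',
    );

    $twig->parsefile($in_file);         # the original file is only read
    $twig->print_to_file($out_file);    # the kept elements go to a second file
    return;
}

# Usage: perl script.pl original.xml filtered.xml
filter_items(@ARGV, qr/\.Latched/) if @ARGV == 2;
```

Elements other than <Item> have no handler, so they pass through unchanged.
Attributes such as ItemCount are not recalculated.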

HTH,

Dave Clarke


________________________________
From: perl-win32-users-boun...@listserv.activestate.com 
[mailto:perl-win32-users-boun...@listserv.activestate.com] On Behalf Of Paul 
Rousseau
Sent: Wednesday, October 31, 2012 1:21 PM
To: perl Win32-users
Subject: How to Use XML::Parser to Reduce an XML file to what is wanted

Hello Users.

I have an .xml file that I want to search for specific items, ignoring the 
remaining items. I have dumped the .xml file using the tree method. Partial 
results look as follows:

$VAR1 = [
          'Session',
          [
            {},
            'Hostname',
            [
              {
                'RemoteHost' => '\\\\SCADA',
                'Remote' => '0'
              },
              'Server',
              [
                {
                  'GroupCount' => '1',
                  'Connected' => '1',
                  'Name' => 'Kepware.KEPServerEX.V5'
                },
                'Group',
                [
                  {
                    'PercentDeadband' => '0.00',
                    'Connected' => '2',
                    'TimeBias' => '-420',
                    'ItemCount' => '26283',
                    'Active' => '-1',
                    'ReqUpdateRate' => '10000',
                    'Name' => '11-30'
                  },
                  'Item',
                  [
                    {
                      'ReqDataType' => '0',
                      'AccessPath' => '',
                      'Active' => '-1'
                    },
                    0,
                    '11-30.PLC.Global.EY_01_1001'
                  ],
                  'Item',
                  [
                    {
                      'ReqDataType' => '0',
                      'AccessPath' => '',
                      'Active' => '-1'
                    },
                    0,
                    '11-30.PLC.Global.Always_Off_Bit'
                  ],

.
.
. (There are many items, so I did not include all of them.)
.
.
                  'Item',
                  [
                    {
                      'ReqDataType' => '0',
                      'AccessPath' => '',
                      'Active' => '-1'
                    },
                    0,
                    '11-30.PLC.Global.PSHH_01_1010.ClassCTimer.CTL_x.CTL_11'
                  ]
                ]
              ]
            ]
          ]
        ];

I want to be able to maintain the .xml file integrity, so I want to open the 
original and after finding what I am looking for, dump the results to a second 
file.

As an example, the file has many "Item" entries with the text,

                  'Item',
                  [
                    {
                      'ReqDataType' => '0',
                      'AccessPath' => '',
                      'Active' => '-1'
                    },
                    0,
                    '11-30.PLC.Global.ACKNOWLEDGE.Latched'
                  ],

I want to be able to parse the original .xml file, find all items that contain 
the text, '.Latched', and output the results similar to the following:


$VAR1 = [
          'Session',
          [
            {},
            'Hostname',
            [
              {
                'RemoteHost' => '\\\\SCADA',
                'Remote' => '0'
              },
              'Server',
              [
                {
                  'GroupCount' => '1',
                  'Connected' => '1',
                  'Name' => 'Kepware.KEPServerEX.V5'
                },
                'Group',
                [
                  {
                    'PercentDeadband' => '0.00',
                    'Connected' => '2',
                    'TimeBias' => '-420',
                    'ItemCount' => '26283',
                    'Active' => '-1',
                    'ReqUpdateRate' => '10000',
                    'Name' => '11-30'
                  },
                  'Item',
                  [
                    {
                      'ReqDataType' => '0',
                      'AccessPath' => '',
                      'Active' => '-1'
                    },
                    0,
                    '11-30.PLC.Global.ACKNOWLEDGE.Latched'
                  ],
                  'Item',
                  [
                    {
                      'ReqDataType' => '0',
                      'AccessPath' => '',
                      'Active' => '-1'
                    },
                    0,
                    '11-30.PLC.Global.EY_01_1001.Latched'
                  ]
                ]
              ]
            ]
          ]
        ];

I checked the Net but did not find an "extraction" example that maintains the 
.xml integrity.

I am thinking I would need logic to do the following.

1. Open the .xml file.
2. Begin parsing.
3. If the object is not 'Item', keep it. (This will keep objects such as 
'Session', 'Server', 'Group')
4. If the object is 'Item', and it contains the text, '.Latched', keep it.
5. Otherwise, ignore the 'Item'.
6. Open the output file.
7. Write out all the kept items.
8. Close both files.
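
Steps 3 to 5 can be sketched directly on the tree-style structure dumped
above. This is only an illustration under assumptions: own_text and prune are
hypothetical helper names, and the result is printed with Data::Dumper rather
than written back as XML, because XML::Parser itself does not serialize a
tree.

```perl
use strict;
use warnings;
use XML::Parser;
use Data::Dumper;

# In Tree style a content list is [ \%attrs, tag, content, tag, content, ... ],
# where a text node appears as the pseudo-tag 0 followed by a string.

# Hypothetical helper: concatenate the text directly inside one element.
sub own_text {
    my ($content) = @_;
    my $text = '';
    for (my $i = 1; $i < $#$content; $i += 2) {
        $text .= $content->[$i + 1] if $content->[$i] eq '0';
    }
    return $text;
}

# Hypothetical helper: rewrite a content list in place, dropping 'Item'
# children whose text does not match $keep_re and recursing into the rest.
sub prune {
    my ($content, $keep_re) = @_;
    my @kept = ($content->[0]);              # attribute hash comes first
    for (my $i = 1; $i < $#$content; $i += 2) {
        my ($tag, $body) = @{$content}[$i, $i + 1];
        if (ref $body) {                     # an element, not a text node
            next if $tag eq 'Item' && own_text($body) !~ $keep_re;
            prune($body, $keep_re);
        }
        push @kept, $tag, $body;
    }
    @$content = @kept;
    return;
}

# Usage: perl script.pl original.xml
if (@ARGV == 1) {
    my $tree = XML::Parser->new(Style => 'Tree')->parsefile($ARGV[0]);
    prune($tree->[1], qr/\.Latched/);        # $tree is [ 'Session', $content ]
    print Dumper($tree);
}
```

The original file is only read, so it stays intact. Turning the pruned tree
back into an .xml file would need a separate serializer.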

Any help would be greatly appreciated.

Paul Rousseau
