I managed to solve this using the following method:

        """Returns a dictionary mapping the index of each spectrum that has
        secondary scans to a list of the indexes of those scans
        """
        scans = dict()

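        # get an iterable; "start" events are requested as well, so that the
        # first event yielded belongs to the root element (with only "end"
        # events, the first element returned is the first one to be closed)
        context = cElementTree.iterparse(self.info['filename'],
                                         events=("start", "end"))

        # turn it into an iterator
        context = iter(context)

        # get the root element (first "start" event)
        event, root = next(context)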

        for event, elem in context:
            if event == "end" and elem.tag == self.XML_SPACE + "scan":
                parentId = int(elem.get('num'))
                for child in elem.findall(self.XML_SPACE + 'scan'):
                    childId = int(child.get('num'))
                    # create the list on first use, then record the child
                    scans.setdefault(parentId, []).append(childId)
                    child.clear()
                root.clear()
        return scans

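I think the trick is to act on the 'end' event, since an element is only 
fully parsed at that point, and to clear elements once they have been 
processed so that iterparse does not hold the whole tree in memory. I'm 
still not quite clear on whether this is the best way to do it.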
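
For anyone who wants to try the idea outside my class, here is a minimal
standalone sketch of the same pattern for Python 3. It is not the method
above. The namespace, the sample document and the function name
secondary_scans are made up for illustration, and it assumes the top-level
scan elements are direct children of the root.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical namespace and sample data, for illustration only.
NS = "{http://example.com/ns}"
SAMPLE = b"""<run xmlns="http://example.com/ns">
  <scan num="1"><scan num="2"/><scan num="3"/></scan>
  <scan num="4"/>
  <scan num="5"><scan num="6"/></scan>
</run>"""


def secondary_scans(source):
    """Map each scan number that has nested scans to the nested numbers."""
    scans = {}
    # Ask for "start" events too, so the first event is the root element.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)

    depth = 0  # nesting depth of the scan element currently open
    for event, elem in context:
        if elem.tag != NS + "scan":
            continue
        if event == "start":
            depth += 1
            continue
        # "end" event: the element and all of its children are complete.
        depth -= 1
        children = [int(c.get("num")) for c in elem.findall(NS + "scan")]
        if children:
            scans[int(elem.get("num"))] = children
        if depth == 0:
            # A top-level scan is finished: free its subtree and drop the
            # references the root holds to already processed children.
            elem.clear()
            root.clear()
    return scans


result = secondary_scans(io.BytesIO(SAMPLE))
print(result)  # {1: [2, 3], 5: [6]}
```

Clearing only when a top-level scan closes means a parent scan still has
its children attached when its own 'end' event arrives.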
-- 
http://mail.python.org/mailman/listinfo/python-list
