On Sunday, February 4, 2018 at 5:32:51 PM UTC-6, Stanley Denman wrote:
> On Sunday, February 4, 2018 at 4:26:24 PM UTC-6, Stanley Denman wrote:
> > I am trying to parse a Python nested list that is the result of the
> > getOutlines() function of module PyPDF2 using the pyparsing module. This
> > is the result I get. What in the world is 'expandtabs', and why is it
> > making a difference to my parse attempt?
> >
> > Python Code
> >
> > import PyPDF2,pyparsing
> > from pyparsing import Word, alphas, nums
> > pdfFileObj=open('x.pdf','rb')
> > pdfReader=PyPDF2.PdfFileReader(pdfFileObj)
> > List=pdfReader.getOutlines()
> > myparser = Word( alphas ) + Word(nums, exact=2) +"of" + Word(nums, exact=2)
> > myparser.parseString(List)
> >
> > This is the error I get:
> >
> > Traceback (most recent call last):
> >   File "<pyshell#23>", line 1, in <module>
> >     myparser.parseString(List)
> >   File "C:\python\lib\site-packages\pyparsing.py", line 1620, in parseString
> >     instring = instring.expandtabs()
> > AttributeError: 'list' object has no attribute 'expandtabs'
> >
> > Thanks so much, not getting any helpful responses from
> > https://python-forum.io.

I have found that I can use the index values in the list to print out the
section I need. So print(MyList[7]) gets me to the Section F that I want.
print(MyList[9][1]), for example, gives me a string that is the bookmark
entry for Exhibit 1F.

But these index values would presumably be different for each PDF file:
there may not always be Sections A-E, but there will always be a Section F.
In other words, the index values that get me to the right section would
differ from one PDF file to the next.
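
One way around the changing index values might be to search the nested list by bookmark title instead of by position. What follows is only a rough sketch under some assumptions: sample_outline is made-up data standing in for the result of getOutlines(), each bookmark is assumed to be a dict-like entry with a "/Title" key, and the children of a bookmark are assumed to sit in a nested list directly after it. The helper name find_children is hypothetical and not part of PyPDF2 or pyparsing.

```python
# Made-up stand-in for pdfReader.getOutlines(): each bookmark is assumed to be
# a dict-like entry with a "/Title" key, and the children of a bookmark are
# assumed to appear as a nested list right after it.
sample_outline = [
    {"/Title": "Section A"},
    [{"/Title": "Exhibit 1A"}],
    {"/Title": "Section F"},
    [{"/Title": "Exhibit 1F"}, {"/Title": "Exhibit 2F"}],
]


def find_children(outline, title_prefix):
    """Return the nested list that follows the first entry whose title starts
    with title_prefix, [] if that entry has no children, or None if no entry
    matches at all."""
    for position, item in enumerate(outline):
        if isinstance(item, list):
            # Descend into the sub-bookmarks.
            found = find_children(item, title_prefix)
            if found is not None:
                return found
        elif str(item["/Title"]).startswith(title_prefix):
            following = outline[position + 1:position + 2]
            if following and isinstance(following[0], list):
                return following[0]
            return []
    return None


section_f = find_children(sample_outline, "Section F")
# Keep only the bookmark entries themselves, skipping any deeper nested lists.
exhibit_titles = [str(entry["/Title"]) for entry in section_f
                  if not isinstance(entry, list)]
print(exhibit_titles)
```

Each element of exhibit_titles is an ordinary str, so it could be handed to myparser.parseString() one title at a time. The AttributeError in the traceback comes from passing the whole list, which has no expandtabs() method.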