On Tue, 13 Feb 2018 13:42:08 +0000, Rhodri James wrote:

> On 13/02/18 13:11, Stanley Denman wrote:
>> I am trying to perform a regex on a "string" of text that python
>> isinstance is telling me is a dictionary.  When I run the code I get
>> the following error:
>> 
>> {'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.: 
>> 05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0),
>> '/Type': '/FitB'}
>> 
>> Traceback (most recent call last):
>>    File "C:\Users\stand\Desktop\PythonSublimeText.py", line 9, in
>>    <module>
>>      x=MyRegex.findall(MyDict)
>> TypeError: expected string or bytes-like object
>> 
>> Here is the "string" of code I am working with:
>> 
>> {'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.: 
>> 05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0),
>> '/Type': '/FitB'}
>> 
>> I want to grab the name "MILANI, JOHN C" and the last date
>> "-mm/dd/yyyy" as a pair, such that if I have N strings like
>> the above I will end up with N pairs of values (name and date).  Here
>> is my code:
>>   
>> import PyPDF2,re
>> pdfFileObj=open('x.pdf','rb')
>> pdfReader=PyPDF2.PdfFileReader(pdfFileObj)
>> Result=pdfReader.getOutlines()
>> MyDict=(Result[-1][0])
>> print(MyDict)
>> print(isinstance(MyDict,dict))
>> MyRegex=re.compile(r"MILANI,")
>> x=MyRegex.findall(MyDict)
>> print(x)
> 
> As the error message says, re.findall() expects a string.  A dictionary
> is in no sense a string, so passing it in whole like that won't work.
> If you know that the name will always show up in the title field, you
> can pass just the title:
> 
>    x = MyRegex.findall(MyDict['/Title'])
> 
> Otherwise you will have to loop through all the entries in the
> dictionary:
> 
>    for entry in MyDict.values():
>      if isinstance(entry, str): # skip non-strings such as '/Page'
>        x = MyRegex.findall(entry) # ...and do something with x
> 
> I rather suspect you are going to find that the titles aren't in a very
> systematic format, though.

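One caveat when looping over the entries: the '/Page' value in the printed
dictionary is an IndirectObject, not a string, and findall() raises the same
TypeError on any non-string value. Here is a minimal sketch of filtering to
string values first. FakeIndirectObject and find_in_strings are hypothetical
names of mine; the class only stands in for the PyPDF2 object so the snippet
runs without a PDF, and the spacing inside the title is a guess because the
post was re-wrapped:

```python
import re


class FakeIndirectObject:
    """Hypothetical stand-in for PyPDF2's IndirectObject (not the real class)."""

    def __init__(self, idnum, generation):
        self.idnum = idnum
        self.generation = generation


# Shaped like the outline entry printed in the original post.
my_dict = {
    '/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  '
              '05/12/2014 - 05/28/2014 (9 pages)',
    '/Page': FakeIndirectObject(465, 0),
    '/Type': '/FitB',
}

my_regex = re.compile(r"MILANI,")


def find_in_strings(pattern, mapping):
    """Run pattern.findall() on the string values only, keyed by dict key."""
    return {
        key: pattern.findall(value)
        for key, value in mapping.items()
        if isinstance(value, str)  # skip non-string values such as '/Page'
    }


matches = find_in_strings(my_regex, my_dict)
print(matches)  # {'/Title': ['MILANI,'], '/Type': []}
```
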
For what purpose are you trying to run this regex anyway?
It is almost certainly the wrong approach for your task.
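
If a regex does turn out to fit, here is a rough sketch of pulling out the
(name, last date) pair. It assumes every title follows the same
"Src.: NAME Tmt. Dt.: start - end" layout as the one example in the post,
which may well not hold across a real set of bookmarks. TITLE_RE and
name_and_last_date are names I made up for illustration:

```python
import re

# Assumed layout: "... Src.: <name> Tmt. Dt.: <start> - <end> ..."
TITLE_RE = re.compile(
    r"Src\.:\s*(?P<name>.+?)\s+Tmt\.\s+Dt\.:\s*"
    r"(?:\d{2}/\d{2}/\d{4}\s*-\s*)?"     # optional start date and dash
    r"(?P<last_date>\d{2}/\d{2}/\d{4})"  # last (or only) date
)


def name_and_last_date(title):
    """Return (name, last_date) from a title, or None if it does not match."""
    match = TITLE_RE.search(title)
    if match is None:
        return None
    return match.group("name"), match.group("last_date")


title = ('1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  '
         '05/12/2014 - 05/28/2014 (9 pages)')
pair = name_and_last_date(title)
print(pair)  # ('MILANI, JOHN C', '05/28/2014')
```

Returning None for a non-matching title, instead of raising, makes it easy to
count how many titles break the assumed format before trusting the results.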



-- 
Larkinson's Law:
        All laws are basically false.
-- 
https://mail.python.org/mailman/listinfo/python-list
