On 20Aug2014 16:35, Dima Kulik <dexterne...@mail.ru> wrote:
Hi to all. I have a problem with parsing file.
I have txt file exported from AD and it has such structure:

DistinguishedName : CN=*** ,OU=*** ,OU=*** ,DC=*** ,DC=***,DC=***
GroupCategory     : Distribution
GroupScope        : Universal
Name              : ****
ObjectClass       : group
ObjectGUID        : 0b74b4e2-aad1-4342-a8f4-2fa7763e1d49
SamAccountName    : ****
SID               : S-1-5-21-1801674531-492894223-839522115-16421
[...]
I've tried to make little parser:

keywords = ['Name', 'Name:']
input_file=open("Mail_Groups.txt","r").readlines()
output_file=open("Out.txt","w")
for line in input_file:
    for word in line.split():

Aside from the remarks from others, I would change the way you're parsing each line. Based entirely on what you show above, I'd make the main outer loop look like this:

  for line in input_file:
      left, right = line.split(':', 1)
      label = left.strip()
      value = right.strip()

and then make decisions using "label" and "value".

Your approach breaks the line into "words" on whitespace, which has several difficulties. The example input data look like a report, and in a report a long label will often abut its colon, eg:

    HereIsALongNameLabel: info...

Your split() will be presuming the colon is spaced out.
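
To make the difference concrete, here is a small sketch using one line from your sample and the long-label example above. The variable names are just for illustration:

```python
# Two label styles: colon spaced out, and colon abutting the label.
spaced = "Name              : ****"
abutting = "HereIsALongNameLabel: info..."

# Whitespace splitting gives a different shape for each style.
spaced_words = spaced.split()      # ['Name', ':', '****']
abutting_words = abutting.split()  # ['HereIsALongNameLabel:', 'info...']
print(spaced_words)
print(abutting_words)

# Splitting once on the first colon treats both the same way.
pairs = []
for line in (spaced, abutting):
    left, right = line.split(":", 1)
    pairs.append((left.strip(), right.strip()))
print(pairs)
```

In the first case the colon is a word of its own; in the second it is glued to the label, so a test for the word "Name" behaves differently depending on the layout.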

Just splitting once on the first colon and then trimming the whitespace from the two pieces is simpler and gets you a more reliable parse.
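
One caveat: unpacking the result of line.split(':', 1) into two names raises ValueError for a line with no colon at all, such as a blank line or the "[...]" marker in your sample. A slightly more defensive sketch of the same idea uses str.partition instead. The helper name parse_line and the choice to print the "Name" field are my assumptions, not anything taken from your script:

```python
def parse_line(line):
    """Return (label, value) for a 'label : value' line, or None if no colon."""
    left, sep, right = line.partition(":")
    if not sep:
        return None  # blank line, "[...]" marker, etc.
    return left.strip(), right.strip()


# Hypothetical sample lines in the same layout as the report.
sample = [
    "GroupCategory     : Distribution",
    "",
    "Name              : ****",
    "SamAccountName    : ****",
    "[...]",
]

for line in sample:
    parsed = parse_line(line)
    if parsed is None:
        continue
    label, value = parsed
    # Exact comparison: "SamAccountName" does not match "Name".
    if label == "Name":
        print(value)
```

Because only the first colon is used as the separator, a value that itself contains colons (a time such as 16:35, say) stays in one piece.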

Cheers,
Cameron Simpson <c...@zip.com.au>

Trust the computer industry to shorten Year 2000 to Y2K. It was this
thinking that caused the problem in the first place.
- Mark Ovens <ma...@uk.radan.com>