Re: [Tutor] can anyone help me in solving this problem this is urgent

Emile van Sebille Sat, 07 Nov 2009 10:04:51 -0800

On 11/6/2009 4:24 PM surjit khakh said...

Write a python program to read a text file named “text.txt” and show the number of times each article is found in the file. Articles in the English language are the words “a”, “an”, and “the”.

Sounds like you're taking a python class. Great! It's probably the best programming language to start with.

First, it helps when asking questions if you mention what version of the language you're using. Some features and options are newer. In particular, there's a string method 'count' that isn't available in older pythons, while the replace method has been around at least ten years.

If you haven't already, the tutorial at http://docs.python.org/tutorial/index.html is a great place to start. Pay particular attention to section 3's string introduction at http://docs.python.org/tutorial/introduction.html#strings and section 7 on files, starting with http://docs.python.org/tutorial/inputoutput.html#reading-and-writing-files

Implicit in this problem is identifying words in the text file. This is tough because you need to take punctuation into account. There's a neat tool in newer pythons: assuming you've read the file contents into a variable txt, you can say set(txt) to get all the letters, numbers, punctuation marks, and whitespace-type characters embedded in the content. You'll need to know these so that you can recognize a word regardless of adjacent punctuation. In this specific case, as articles in English always precede nouns, you'll always find whitespace following an article. It would be a space except, of course, when the article ends the line and line wrap characters are included in the text file.
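To make the set(txt) idea concrete, here is a minimal sketch in Python 3. The sample string is only an assumption standing in for the contents of text.txt; it is not from the original problem.

```python
# Hypothetical sample standing in for the contents of text.txt.
txt = "The cat saw a bird.\nAn owl, the largest, flew away!"

# set(txt) yields every distinct character that appears in the text.
chars = set(txt)

# Anything that is neither a letter nor a digit is a candidate word separator.
separators = sorted(c for c in chars if not c.isalnum())
print(separators)
```

Looking at the separators first tells you which characters need to be stripped or replaced before the words can be compared.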


For example, consider the following text:

"""
SECTION 1.4. COUNTY PLANNING COMMISSION.

a. The County Planning Commission shall consist of five members. Each member of the Board of Supervisors shall recommend that a resident of his district be appointed to the Commission; provided, however, the appointments to the Commission shall require the affirmative vote of not less than a majority of the entire membership of the Board.

"""

Any a's, an's or the's in the paragraph body can be easily counted with the string count method once you have properly prepared the text.
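Why the preparation matters can be seen in a small sketch (Python 3). The sentence below is a made-up example, and the preparation shown (lowercase, punctuation replaced by spaces, single spaces between words, one space padded on each end) is just one possible choice, not the only way to do it.

```python
import string

# Hypothetical sentence, chosen because "the" also hides inside other words.
raw = "The other day, a man gave an anthem to the band."

# Counting on the raw text matches substrings: "other", "anthem", and "the".
raw_hits = raw.count("the")

# Prepare: lowercase, turn punctuation into spaces, normalize whitespace,
# and pad both ends so every word is surrounded by exactly one space.
table = str.maketrans(string.punctuation, " " * len(string.punctuation))
prepared = " " + " ".join(raw.lower().translate(table).split()) + " "

the_hits = prepared.count(" the ")
a_hits = prepared.count(" a ")
an_hits = prepared.count(" an ")
print(raw_hits, the_hits, a_hits, an_hits)
```

One caveat with this padded approach: count() finds non-overlapping matches, so two identical articles in a row ("the the") share a space and are counted once. Splitting the prepared text into a list of words and counting list items avoids that.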

I expect the an's and the's are the easy ones to count. Consider however the paragraph identifier -- "a." -- this is not an article but would likely be counted as one in most solutions. There may also be a subsequent reference to this section (e.g., see a above) or range of sections (e.g., see a-n above) that further make this a harder problem. One possible approach may involve confirming that a noun follows the article. There are dictionaries you can access, or word lists that can help. The WordNet database from Princeton appears fairly complete with 117k entries, but even there it's easy to find exceptions: "A 20's style approach"; "a late bus"; or "a fallen hero".
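The paragraph-identifier problem is easy to reproduce. The sketch below (Python 3) counts "a" in the first sentence of the example, then applies a crude heuristic that drops a single letter followed by a period at the start of a line. Both the helper name count_a and the heuristic are assumptions for illustration; the heuristic does nothing for references such as "see a above".

```python
import re
import string

# First sentence of the example, including the "a." paragraph identifier.
sample = "a. The County Planning Commission shall consist of five members."

def count_a(text):
    """Count the word 'a' after lowercasing and replacing punctuation with spaces."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return text.lower().translate(table).split().count("a")

# The naive count treats the identifier "a." as an article.
naive = count_a(sample)

# Hypothetical heuristic: remove a lone letter plus period at the start of a line.
stripped = re.sub(r"(?m)^\s*[A-Za-z]\.\s+", "", sample)
adjusted = count_a(stripped)

print(naive, adjusted)
```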

So, frankly, I expect that solutions to this problem will range from the naive through the reasonably complete to the impossible without human confirmation of complex structure and context.

For your homework, showing you can read in the file, strip out any punctuation, count the resulting occurrences, and report the results should do it.
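Those four steps could be sketched roughly as follows in Python 3. The function names are made up for illustration, and this is the naive end of the range described above: it will still count a paragraph identifier such as "a." as an article.

```python
import string

ARTICLES = ("a", "an", "the")

def count_articles(txt):
    """Return a dict mapping each article to how often it occurs as a word in txt."""
    # Replace every punctuation character with a space, then split on whitespace.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    words = txt.lower().translate(table).split()
    return {article: words.count(article) for article in ARTICLES}

def count_articles_in_file(path="text.txt"):
    """Read the whole file at path and count the articles in it."""
    with open(path) as f:
        return count_articles(f.read())

def print_report(counts):
    """Print one line per article with its count."""
    for article in ARTICLES:
        print(article, counts[article])

# Demonstration on an in-memory string instead of a real file.
print_report(count_articles("A cat and an owl saw the dog. The end."))
```

With a real file in the current directory, print_report(count_articles_in_file()) would read text.txt and print the three counts.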


Emile

