On 14Jun2018 20:01, Sharan Basappa <sharan.basa...@gmail.com> wrote:
>Can anyone explain to me the purpose of "pattern" in the line below:
>documents.append((w, pattern['class']))
>documents is declared as a list as follows:
>documents.append((w, pattern['class']))

Not without a lot more context. Where did you find this code?

>I am sorry that partial info was not sufficient.
>I am actually trying to implement my first text classification code and I am
>referring to the below URL for that:

Ah, ok. It helps to include some cut/paste of the relevant code, though the URL is a big help.

The wider context of the code you recite looks like this:

 words = []
 classes = []
 documents = []
 ignore_words = ['?']
 # loop through each sentence in our training data
 for pattern in training_data:
     # tokenize each word in the sentence
     w = nltk.word_tokenize(pattern['sentence'])
     # add to our words list
     words.extend(w)
     # add to documents in our corpus
     documents.append((w, pattern['class']))

and the training_data is defined like this:

 training_data = []
 training_data.append({"class":"greeting", "sentence":"how are you?"})
 training_data.append({"class":"greeting", "sentence":"how is your day?"})
 ... lots more ...

So training_data is a list of dicts, each dict holding a "class" and a "sentence" key. The "for pattern in training_data" loop iterates over each item of training_data, so "pattern" is simply the name bound to the current dict on each pass through the loop.

The loop body calls nltk.word_tokenize on the "sentence" part of the training item, presumably getting a list of "word" strings. The documents list gets this tuple:

 (w, pattern['class'])

added to it.
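To make the role of "pattern" concrete, here is a minimal sketch that leaves the tokenizing out. It assumes two sample entries shaped like the tutorial's training_data; the "seen" list is a name I made up purely for illustration.

```python
# Two sample items in the same shape as the tutorial's training_data.
training_data = [
    {"class": "greeting", "sentence": "how are you?"},
    {"class": "greeting", "sentence": "how is your day?"},
]

seen = []  # hypothetical helper list, only here to record what the loop saw
for pattern in training_data:
    # "pattern" is just the loop variable: it names one dict per iteration.
    assert isinstance(pattern, dict)
    # Indexing by key pulls the individual fields out of that dict.
    seen.append((pattern["sentence"], pattern["class"]))
    print(type(pattern).__name__, pattern["class"], repr(pattern["sentence"]))
```

Any other name would do just as well as "pattern"; the tutorial's author simply chose that word for "one training item".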

In this way the documents list ends up with tuples of (words, classification), with the words coming from the sentence via nltk and the classification coming straight from the training item's "class" value.

So at the end of the loop the documents list will look like:

 documents = [
   ( ['how', 'are', 'you'], 'greeting' ),
   ( ['how', 'is', 'your', 'day'], 'greeting' ),

and so forth.
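If you want to watch the list being built without installing nltk, the loop can be sketched as below. Note that simple_tokenize is a hypothetical stand-in of mine, not part of nltk or the tutorial; it just splits a sentence into words and punctuation marks with a regular expression. With this stand-in the "?" shows up as its own token, and I believe nltk.word_tokenize behaves similarly, which is probably why the tutorial keeps an ignore_words list.

```python
import re

def simple_tokenize(sentence):
    # Hypothetical stand-in for nltk.word_tokenize: runs of word characters
    # and single punctuation marks each become a separate token.
    return re.findall(r"\w+|[^\w\s]", sentence)

training_data = [
    {"class": "greeting", "sentence": "how are you?"},
    {"class": "greeting", "sentence": "how is your day?"},
]

documents = []
for pattern in training_data:
    w = simple_tokenize(pattern["sentence"])
    # One (tokens, class) tuple is appended per training item.
    documents.append((w, pattern["class"]))

# Each entry unpacks back into its token list and its class label.
for tokens, label in documents:
    print(label, tokens)
```

The append has to sit inside the loop body; if it is dedented to run after the loop, only the last sentence ends up in documents.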

Cameron Simpson <c...@cskk.id.au>
