Dear Cameron,

This is so kind of you. Thanks for taking the time to explain the code. It helped a lot. I went back and brushed up on lists and dictionaries.
At this point, I think I need to go back and brush up on Python from the start, so I will do that first.

On Friday, 15 June 2018 09:12:22 UTC+5:30, Cameron Simpson wrote:
> On 14Jun2018 20:01, Sharan Basappa <sharan.basa...@gmail.com> wrote:
> >> >Can anyone explain to me the purpose of "pattern" in the line below:
> >> >
> >> >documents.append((w, pattern['class']))
> >> >
> >> >documents is declared as a list as follows:
> >> >documents.append((w, pattern['class']))
> >>
> >> Not without a lot more context. Where did you find this code?
> >
> >I am sorry that partial info was not sufficient.
> >I am actually trying to implement my first text classification code and I am
> >referring to the below URL for that:
> >
> >https://machinelearnings.co/text-classification-using-neural-networks-f5cd7b8765c6
>
> Ah, ok. It helps to include some cut/paste of the relevant code, though the URL
> is a big help.
>
> The wider context of the code you recite looks like this:
>
>     words = []
>     classes = []
>     documents = []
>     ignore_words = ['?']
>     # loop through each sentence in our training data
>     for pattern in training_data:
>         # tokenize each word in the sentence
>         w = nltk.word_tokenize(pattern['sentence'])
>         # add to our words list
>         words.extend(w)
>         # add to documents in our corpus
>         documents.append((w, pattern['class']))
>
> and the training_data is defined like this:
>
>     training_data = []
>     training_data.append({"class":"greeting", "sentence":"how are you?"})
>     training_data.append({"class":"greeting", "sentence":"how is your day?"})
>     ... lots more ...
>
> So training data is a list of dicts, each dict holding a "class" and "sentence"
> key. The "for pattern in training_data" loop iterates over each item of the
> training_data. It calls nltk.word_tokenize on the 'sentence' part of the
> training item, presumably getting a list of "word" strings. The documents list
> gets this tuple:
>
>     (w, pattern['class'])
>
> added to it.
>
> In this way the documents list ends up with tuples of (words, classification),
> with the words coming from the sentence via nltk and the classification coming
> straight from the training item's "class" value.
>
> So at the end of the loop the documents array will look like:
>
>     documents = [
>         ( ['how', 'are', 'you'], 'greeting' ),
>         ( ['how', 'is', 'your', 'day'], 'greeting' ),
>     ]
>
> and so forth.
>
> Cheers,
> Cameron Simpson <c...@cskk.id.au>

--
https://mail.python.org/mailman/listinfo/python-list
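
For anyone following the thread, the loop described in the quoted message can be tried on its own. This is only a minimal sketch, not the article's code: it assumes a plain str.split() as a stand-in for nltk.word_tokenize so that it runs without NLTK installed, and it uses just the two training items quoted above. Because of that substitution the punctuation stays attached to the last word, which the real tokenizer would likely handle differently.

```python
# Sketch of the loop from the thread. str.split() is an assumed stand-in
# for nltk.word_tokenize so the example has no third-party dependencies.
training_data = [
    {"class": "greeting", "sentence": "how are you?"},
    {"class": "greeting", "sentence": "how is your day?"},
]

words = []      # flat list of every token seen
documents = []  # list of (tokens, class) tuples

for pattern in training_data:
    # "pattern" is simply the current dict taken from training_data
    w = pattern["sentence"].split()
    # extend() adds each token individually to the flat list
    words.extend(w)
    # append() adds one tuple pairing the token list with its class label
    documents.append((w, pattern["class"]))

# Tuple unpacking gives each part of a document its own name
for tokens, label in documents:
    print(label, tokens)
```

The name "pattern" has no special meaning here; it is just the loop variable, and any other name would work the same way.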