Re: Index of entity in List with a Condition

2018-06-12 Thread subhabangalore
On Wednesday, June 13, 2018 at 6:30:45 AM UTC+5:30, Cameron Simpson wrote:
> On 11Jun2018 13:48, Subhabrata Banerjee wrote:
> >I have the following sentence,
> >
> >"Donald Trump is the president of United States of America".
> >
> >I am trying to extract the index of 'of', not only for a single occurrence
> >but also for multiple occurrences (if they occur), from the list of words
> >of the string, made by simply splitting the sentence.
> > index1=[index for index, value in enumerate(words) if value == "of"],
> >where words=sentence.split()
> >
> >I could do this part more or less nicely.
> >
> >But I am trying to say: if the list of words has the words "United"
> >and "States" and it has "of" in the sentence, then the word just
> >before "of" is "president".
> >
> >I am confused about how I may write this; I would welcome any help.
> 
> You will probably have to drop the list comprehension and go with something 
> more elaborate.
> 
> Also, lists have an "index" method:
> 
>   >>> L = [4,5,6]
>   >>> L.index(5)
>   1
> 
> though it doesn't solve your indexing problems on its own.
> 
> I would be inclined to deconstruct the sentence into a cross linked list of 
> elements. Consider making a simple class to encapsulate the knowledge about 
> each word (totally untested):
> 
>   class Word:
>       def __init__(self, word):
>           self.word = word
> 
>   words = []
>   for index, word in enumerate(sentence.split()):
>       W = Word(word)
>       W.index = index
>       words.append(W)
>       W.wordlist = words
> 
> Now you have a list of Word objects, each of which knows its list position 
> _and_ also knows about the list itself, _and_ you have the list of Word 
> objects corresponding to your sentence words.
> 
> You'll notice we can just hang whatever attributes we like off these "Word" 
> objects: we added a .wordlist and .index on the fly. It isn't great formal 
> object design, but it makes building things up very easy.
> 
> You can add methods or properties to your class, such as ".next":
> 
>   @property
>   def next(self):
>       return self.wordlist[self.index + 1]
> 
> and so forth. That will let you write expressions about Words:
> 
>   for W in words:
>       if W.word == 'of' and W.next.word == 'the' and W.next.next.word == 'United' ...:
>           if W.previous.word != 'president':
>               ... oooh, unexpected preceding word! ...
> 
> You can see that you could also write methods like "is_preceded_by":
> 
>   def is_preceded_by(self, word2):
>       return self.previous.word == word2
> 
> and test "W.is_preceded_by('president')".
> 
> In short, write out what you would like to express. Then write methods that 
> implement the smaller parts of what you just wrote.
> 
> Cheers,
> Cameron Simpson
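
Pulling the fragments above together, a minimal runnable sketch might look like the following. The `previous` property, the `build_words` helper and the choice to return None at either end of the list are assumptions added for illustration; they are not part of the reply above.

```python
class Word(object):
    """One word of a sentence that knows its position and its neighbours."""

    def __init__(self, word, index, wordlist):
        self.word = word
        self.index = index
        self.wordlist = wordlist

    @property
    def next(self):
        # Assumption: return None instead of raising at the end of the list.
        i = self.index + 1
        return self.wordlist[i] if i < len(self.wordlist) else None

    @property
    def previous(self):
        # Assumption: return None for the first word (avoids wrapping to -1).
        i = self.index - 1
        return self.wordlist[i] if i >= 0 else None

    def is_preceded_by(self, word2):
        prev = self.previous
        return prev is not None and prev.word == word2


def build_words(sentence):
    """Split a sentence and wrap every token in a cross linked Word."""
    words = []
    for index, text in enumerate(sentence.split()):
        words.append(Word(text, index, words))
    return words


sentence = "Donald Trump is the president of United States of America"
words = build_words(sentence)
for W in words:
    if W.word == "of" and W.next is not None and W.next.word == "United":
        print("%d %s" % (W.index, W.previous.word))  # 5 president
```

Guarding the ends of the list matters here: with a plain `self.index - 1` the first word would silently wrap around to the last one.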


Index of entity in List with a Condition

2018-06-11 Thread subhabangalore
I have the following sentence,

"Donald Trump is the president of United States of America". 

I am trying to extract the index of 'of', not only for a single occurrence
but also for multiple occurrences (if they occur), from the list of words
of the string, made by simply splitting the sentence.
 index1=[index for index, value in enumerate(words) if value == "of"],
where words=sentence.split()

I could do this part more or less nicely. 

But I am trying to say: if the list of words has the words "United"
and "States" and it has "of" in the sentence, then the word just
before "of" is "president".

I am confused about how I may write this; I would welcome any help.

Thanking in advance.
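
One way to express the condition without a class is to extend the same list comprehension: keep an index only when the word is "of" and the word after it is "United", then look one position back. This is a sketch under that reading of the question; the helper name `previous_words` and its parameters are hypothetical.

```python
def previous_words(sentence, target="of", following="United"):
    """Return (index, previous word) for each `target` followed by `following`."""
    words = sentence.split()
    return [
        (i, words[i - 1])
        for i, value in enumerate(words)
        if value == target
        and 0 < i < len(words) - 1      # needs a neighbour on both sides
        and words[i + 1] == following
    ]


sentence = "Donald Trump is the president of United States of America"
words = sentence.split()
index1 = [index for index, value in enumerate(words) if value == "of"]
print(index1)                    # [5, 8]
print(previous_words(sentence))  # [(5, 'president')]
```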
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Some Issues on Tagging Text

2018-05-27 Thread subhabangalore
On Sunday, May 27, 2018 at 2:41:43 AM UTC+5:30, Cameron Simpson wrote:
> On 26May2018 04:02, Subhabrata Banerjee  wrote:
> >On Saturday, May 26, 2018 at 3:54:37 AM UTC+5:30, Cameron Simpson wrote:
> >> It sounds like you want a more general purpose parser, and that depends
> >> upon your purposes. If you're coding to learn the basics of breaking up
> >> text, what you're doing is fine and I'd stick with it. But if you're just
> >> after the outcome (tags), you could use other libraries to break up the
> >> text.
> >>
> >> For example, the Natural Language ToolKit (NLTK) will do structured
> >> parsing of text and return you a syntax tree, and it has many other
> >> facilities. Doco:
> >>
> >>   http://www.nltk.org/
> >>
> >> PyPI module:
> >>
> >>   https://pypi.org/project/nltk/
> >>
> >> which you can install with the command:
> >>
> >>   pip install --user nltk
> >>
> >> That would get you a tree structure of the corpus, which you could
> >> process more meaningfully. For example, you could traverse the tree and
> >> tag higher level nodes as you came across them, possibly then _not_
> >> traversing their inner nodes. The effect of that would be that if you hit
> >> the grammatical node:
> >>
> >>   government of Mexico
> >>
> >> you might tag that node with "ORGANISATION", and choose not to descend
> >> inside it, thus avoiding tagging "government" and "of" and so forth
> >> because you have a high level tag. Nodes not specially recognised you'd
> >> keep descending into, tagging smaller things.
> >>
> >> Cheers,
> >> Cameron Simpson
> >
> >Dear Sir,
> >
> >Thank you for your kind and valuable suggestions. Thank you for your kind 
> >time too.
> >I know NLTK and machine learning. I believe that if we use our knowledge
> >of language properly, we need machine learning the least.
> 
> I have similar beliefs: not that machine learning is not useful, but that it 
> has a tendency to produce black boxes in terms of the results it produces 
> because its categorisation rules are not overt, rather they tend to be side 
> effects of weights in a graph.
> 
> So one might end up with a useful tool, but not understand how or why it 
> works.
> 
> >So, I am trying to design a tagger without the help of machine learning,
> >using simple Python coding. I have thus set aside the standard Parts of
> >Speech (PoS) and Named Entity (NE) tagging schemes.
> >I am trying to design a basic model which, if required, may be applied to
> >any one of these problems.
> >Detecting longer phrases is slightly a problem; I am now thinking of
> >employing re.search(pattern,text). If this part is done I do not need
> >machine learning.
> >Maintaining so much data is a cumbersome issue in machine learning.
> 
> NLTK is not machine learning (I believe). It can parse the corpus for you, 
> emitting grammatical structures. So that would aid you in recognising words, 
> phrases, nouns, verbs and so forth. With that structure you can then make 
> better decisions about what to tag and how.
> 
> Using the re module is a very hazard prone way of parsing text. It can be 
> useful for finding fairly fixed text, particularly in machine generated text, 
> but it is terrible for prose.
> 
> Cheers,
> Cameron Simpson 

Dear Sir, 

Thank you for your kind time to discuss the matter. 
I am quite clear about statistics, but as I am a linguist too I feel the
modern-day craze for theories is going nowhere: many theories but hardly
anything of practical value, a bit like the post-Chomskyan linguistics
scenario. Theories of parsing are equally bad. The only advantage of
statistics is that if it is not giving results you may abandon it quickly.

I do not feel that the parsing theories of linguistics lead anywhere,
especially if the data is really big.

I am looking for patterns. For example, organizations in documents are mostly
written as all-capital acronyms, so there is no need for an ML solution for
that; a simple line of code such as
[word for word in words if word.isupper()] does the job. In the same way
there are many interesting patterns in language if you observe them. I have
made many and am making many more. All you need is some time to observe the
data patiently.
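
The acronym pattern can be tried directly. This is only a small sketch: the sample sentence, the punctuation stripping and the minimum length of two characters are assumptions added for illustration, and `find_acronyms` is a hypothetical helper name.

```python
import string


def find_acronyms(text):
    """Return whitespace-separated tokens that are entirely upper case."""
    found = []
    for token in text.split():
        word = token.strip(string.punctuation)  # "NASA," -> "NASA"
        # Assumption: skip single capitals such as "I" or "A".
        if len(word) > 1 and word.isupper():
            found.append(word)
    return found


sample = "Officials from NASA, WHO and the UN met in Geneva."
print(find_acronyms(sample))  # ['NASA', 'WHO', 'UN']
```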

NLTK is a library mainly built for student practice, but now everyone uses it.
It has many corpora and tools (most of them built with an ML based approach),
and it also has many ML libraries which you may use on user defined data as
well as on standard data.
NLTK integrates nicely with other Python based libraries like Scikit or Gensim,
or Java based ones like Stanford. The code is nicely documented, and you may
read further as proper references are mostly given.

I got good results in re earlier but I would surely check your point.

Thank you again for your kind time and a nice discussion.






Re: Some Issues on Tagging Text

2018-05-26 Thread subhabangalore
On Saturday, May 26, 2018 at 3:54:37 AM UTC+5:30, Cameron Simpson wrote:
> On 25May2018 04:23, Subhabrata Banerjee  wrote:
> >On Friday, May 25, 2018 at 3:59:57 AM UTC+5:30, Cameron Simpson wrote:
> >> On 24May2018 03:13, wrote:
> >> >I have a text as,
> >> >
> >> >"Hawaii volcano generates toxic gas plume called laze PAHOA: The eruption 
> >> >of Kilauea volcano in Hawaii sparked new safety warnings about toxic gas 
> >> >on the Big Island's southern coastline after lava began flowing into the 
> >> >ocean and setting off a chemical reaction. Lava haze is made of dense 
> >> >white clouds of steam, toxic gas and tiny shards of volcanic glass. Janet 
> >> >Babb, a geologist with the Hawaiian Volcano Observatory, says the plume 
> >> >"looks innocuous, but it's not." "Just like if you drop a glass on your 
> >> >kitchen floor, there's some large pieces and there are some very, very 
> >> >tiny pieces," Babb said. "These little tiny pieces are the ones that can 
> >> >get wafted up in that steam plume." Scientists call the glass Limu O 
> >> >Pele, or Pele's seaweed, named after the Hawaiian goddess of volcano and 
> >> >fire"
> >> >
> >> >and I want to see its tagged output as,
> >> >
> >> >"Hawaii/TAG volcano generates toxic gas plume called laze PAHOA/TAG: The 
> >> >eruption of Kilauea/TAG volcano/TAG in Hawaii/TAG sparked new safety 
> >> >warnings about toxic gas on the Big Island's southern coastline after 
> >> >lava began flowing into the ocean and setting off a chemical reaction. 
> >> >Lava haze is made of dense white clouds of steam, toxic gas and tiny 
> >> >shards of volcanic glass. Janet/TAG Babb/TAG, a geologist with the 
> >> >Hawaiian/TAG Volcano/TAG Observatory/TAG, says the plume "looks 
> >> >innocuous, but it's not." "Just like if you drop a glass on your kitchen 
> >> >floor, there's some large pieces and there are some very, very tiny 
> >> >pieces," Babb/TAG said. "These little tiny pieces are the ones that can 
> >> >get wafted up in that steam plume." Scientists call the glass Limu/TAG 
> >> >O/TAG Pele/TAG, or Pele's seaweed, named after the Hawaiian goddess of 
> >> >volcano and fire"
> >> >
> >> >To do this I generally try to take a list at the back end as,
> >> >
> >> >Hawaii
> >> >PAHOA
> [...]
> >> >and do a simple code as follows,
> >> >
> >> >def tag_text():
> >> >    corpus=open("/python27/volcanotxt.txt","r").read().split()
> >> >    wordlist=open("/python27/taglist.txt","r").read().split()
> [...]
> >> >    list1=[]
> >> >    for word in corpus:
> >> >        if word in wordlist:
> >> >            word_new=word+"/TAG"
> >> >            list1.append(word_new)
> >> >        else:
> >> >            list1.append(word)
> >> >    lst1=list1
> >> >    tagged_text=" ".join(lst1)
> >> >    print tagged_text
> >> >
> >> >get the results and hand repair unwanted tags Hawaiian/TAG goddess of
> >> >volcano/TAG.
> >> >I am looking for a better approach of coding so that I need not spend
> >> >time on hand repairing.
> >>
> >> It isn't entirely clear to me why these two taggings are unwanted.
> >> Intuitively, they seem to be either because "Hawaiian goddess" is a
> >> compound term where you don't want "Hawaiian" to get a tag, or because
> >> "Hawaiian" has already received a tag earlier in the list. Or are there
> >> other criteria?
> >>
> >> If you want to solve this problem with a programme you must first clearly
> >> define what makes an unwanted tag "unwanted". [...]
> >
> >By unwanted I did not mean anything so intricate.
> >Unwanted meant things I did not want.
> 
> That much was clear, but you need to specify in your own mind _precisely_
> what makes some things unwanted and others wanted. Without concrete criteria
> you can't write code to implement those criteria.
> 
> I'm not saying "you need to imagine code to match these things": you're
> clearly capable of doing that. I'm saying you need to have well defined
> concepts of what makes something unwanted (or, if that is easier to define,
> wanted). You can do that iteratively: start with your basic concept and see
> how well it works. When those concepts don't give you the outcome you
> desire, consider a specific example which isn't working and try to figure
> out what additional criterion would let you distinguish it from a working
> example.
> 
> >For example,
> >if my target phrases included terms like,
> >government of Mexico,
> >
> >now in my list I would have words with their tags as,
> >government
> >of
> >Mexico
> >
> >If I put these words in list it would tag
> >government/TAG of/TAG Mexico
> >
> >but would also tag all the "of" which may be
> >anywhere like haze is made of/TAG dense white,
> >clouds of/TAG steam, etc.
> >
> >Cleaning these unwanted places become a daunting task
> >to me.
> 
> Richard Damon has pointed out that you seem to want phrases instead of just 
> words.
> 
> >I have been experimenting around
> >wordlist=[("Kilauea volcano","Kilauea/TAG volcano/TAG"),("H

Re: Some Issues on Tagging Text

2018-05-25 Thread subhabangalore
On Friday, May 25, 2018 at 3:59:57 AM UTC+5:30, Cameron Simpson wrote:
> First up, thank you for a well described problem! Remarks inline below.
> 
> On 24May2018 03:13, wrote:
> >I have a text as,
> >
> >"Hawaii volcano generates toxic gas plume called laze PAHOA: The eruption of 
> >Kilauea volcano in Hawaii sparked new safety warnings about toxic gas on the 
> >Big Island's southern coastline after lava began flowing into the ocean and 
> >setting off a chemical reaction. Lava haze is made of dense white clouds of 
> >steam, toxic gas and tiny shards of volcanic glass. Janet Babb, a geologist 
> >with the Hawaiian Volcano Observatory, says the plume "looks innocuous, but 
> >it's not." "Just like if you drop a glass on your kitchen floor, there's 
> >some large pieces and there are some very, very tiny pieces," Babb said. 
> >"These little tiny pieces are the ones that can get wafted up in that steam 
> >plume." Scientists call the glass Limu O Pele, or Pele's seaweed, named 
> >after the Hawaiian goddess of volcano and fire"
> >
> >and I want to see its tagged output as,
> >
> >"Hawaii/TAG volcano generates toxic gas plume called laze PAHOA/TAG: The 
> >eruption of Kilauea/TAG volcano/TAG in Hawaii/TAG sparked new safety 
> >warnings about toxic gas on the Big Island's southern coastline after lava 
> >began flowing into the ocean and setting off a chemical reaction. Lava haze 
> >is made of dense white clouds of steam, toxic gas and tiny shards of 
> >volcanic glass. Janet/TAG Babb/TAG, a geologist with the Hawaiian/TAG 
> >Volcano/TAG Observatory/TAG, says the plume "looks innocuous, but it's not." 
> >"Just like if you drop a glass on your kitchen floor, there's some large 
> >pieces and there are some very, very tiny pieces," Babb/TAG said. "These 
> >little tiny pieces are the ones that can get wafted up in that steam plume." 
> >Scientists call the glass Limu/TAG O/TAG Pele/TAG, or Pele's seaweed, named 
> >after the Hawaiian goddess of volcano and fire"
> >
> >To do this I generally try to take a list at the back end as,
> >
> >Hawaii
> >PAHOA
> >Kilauea
> >volcano
> >Janet
> >Babb
> >Hawaiian
> >Volcano
> >Observatory
> >Babb
> >Limu
> >O
> >Pele
> >
> >and do a simple code as follows,
> >
> >def tag_text():
> >    corpus=open("/python27/volcanotxt.txt","r").read().split()
> >    wordlist=open("/python27/taglist.txt","r").read().split()
> 
> You might want to use this to compose "wordlist":
> 
>  wordlist=set(open("/python27/taglist.txt","r").read().split())
> 
> because it will make your "if word in wordlist" test O(1) instead of O(n), 
> which will matter later if your wordlist grows.
> 
> >    list1=[]
> >    for word in corpus:
> >        if word in wordlist:
> >            word_new=word+"/TAG"
> >            list1.append(word_new)
> >        else:
> >            list1.append(word)
> >    lst1=list1
> >    tagged_text=" ".join(lst1)
> >    print tagged_text
> >
> >get the results and hand repair unwanted tags Hawaiian/TAG goddess of 
> >volcano/TAG.
> >I am looking for a better approach of coding so that I need not spend time 
> >on hand repairing.
> 
> It isn't entirely clear to me why these two taggings are unwanted.
> Intuitively, they seem to be either because "Hawaiian goddess" is a compound
> term where you don't want "Hawaiian" to get a tag, or because "Hawaiian" has
> already received a tag earlier in the list. Or are there other criteria?
> 
> If you want to solve this problem with a programme you must first clearly 
> define what makes an unwanted tag "unwanted".
> 
> For example, "Hawaiian" is an adjective, and therefore will always be part
> of a compound term.
> 
> Can you clarify what makes these taggings you mention "unwanted"?
> 
> Cheers,
> 
Sir, Thank you for your kind time to write such a nice reply. 

By unwanted I did not mean anything so intricate. 
Unwanted meant things I did not want. 
For example, 
if my target phrases included terms like, 
government of Mexico, 

now in my list I would have words with their tags as,
government
of
Mexico

If I put these words in list it would tag 
government/TAG of/TAG Mexico

but would also tag all the "of" which may be
anywhere like haze is made of/TAG dense white,
clouds of/TAG steam, etc. 

Cleaning these unwanted places becomes a daunting task
for me. 

I have been experimenting with 
wordlist=[("Kilauea volcano","Kilauea/TAG volcano/TAG"),("Hawaii","Hawaii/TAG"),...]
tag=reduce(lambda a, kv: a.replace(*kv), wordlist, corpus)

This is giving me reasonably good results, but the size of the wordlist is a slight concern. 
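
A sketch of the phrase-replacement idea, written so that longer phrases are tried before shorter ones and a phrase only matches on word boundaries (so "Hawaii" does not match inside "Hawaiian"). The `PHRASES` list is a hypothetical stand-in for the real word list, and tagging every word of a matched phrase follows the "Kilauea/TAG volcano/TAG" example; neither is confirmed as the intended final design.

```python
import re

# Hypothetical phrase list; the real one would be read from taglist.txt.
PHRASES = ["government of Mexico", "Kilauea volcano", "Hawaii"]


def tag_phrases(text, phrases, tag="/TAG"):
    """Append `tag` to every word of each listed phrase found in `text`."""
    # Longest phrases first so a long phrase wins over a shorter entry.
    ordered = sorted(set(phrases), key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in ordered) + r")\b"
    )

    def add_tags(match):
        return " ".join(word + tag for word in match.group(0).split())

    return pattern.sub(add_tags, text)


text = "The government of Mexico said haze is made of dense clouds in Hawaii"
print(tag_phrases(text, PHRASES))
```

Because only the single compiled pattern is scanned over the text, the stray "of" in "made of dense" is left untagged, and a larger phrase list costs one pass rather than one `replace` call per entry.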



Some Issues on Tagging Text

2018-05-24 Thread subhabangalore
I have a text as, 

"Hawaii volcano generates toxic gas plume called laze PAHOA: The eruption of 
Kilauea volcano in Hawaii sparked new safety warnings about toxic gas on the 
Big Island's southern coastline after lava began flowing into the ocean and 
setting off a chemical reaction. Lava haze is made of dense white clouds of 
steam, toxic gas and tiny shards of volcanic glass. Janet Babb, a geologist 
with the Hawaiian Volcano Observatory, says the plume "looks innocuous, but 
it's not." "Just like if you drop a glass on your kitchen floor, there's some 
large pieces and there are some very, very tiny pieces," Babb said. "These 
little tiny pieces are the ones that can get wafted up in that steam plume." 
Scientists call the glass Limu O Pele, or Pele's seaweed, named after the 
Hawaiian goddess of volcano and fire"

and I want to see its tagged output as,

"Hawaii/TAG volcano generates toxic gas plume called laze PAHOA/TAG: The 
eruption of Kilauea/TAG volcano/TAG in Hawaii/TAG sparked new safety warnings 
about toxic gas on the Big Island's southern coastline after lava began flowing 
into the ocean and setting off a chemical reaction. Lava haze is made of dense 
white clouds of steam, toxic gas and tiny shards of volcanic glass. Janet/TAG 
Babb/TAG, a geologist with the Hawaiian/TAG Volcano/TAG Observatory/TAG, says 
the plume "looks innocuous, but it's not." "Just like if you drop a glass on 
your kitchen floor, there's some large pieces and there are some very, very 
tiny pieces," Babb/TAG said. "These little tiny pieces are the ones that can 
get wafted up in that steam plume." Scientists call the glass Limu/TAG O/TAG 
Pele/TAG, or Pele's seaweed, named after the Hawaiian goddess of volcano and 
fire"

To do this I generally try to take a list at the back end as, 

Hawaii
PAHOA
Kilauea 
volcano 
Janet 
Babb
Hawaiian 
Volcano 
Observatory
Babb 
Limu 
O 
Pele

and do a simple code as follows, 

def tag_text():
    corpus=open("/python27/volcanotxt.txt","r").read().split()
    wordlist=open("/python27/taglist.txt","r").read().split()
    list1=[]
    for word in corpus:
        if word in wordlist:
            word_new=word+"/TAG"
            list1.append(word_new)
        else:
            list1.append(word)
    lst1=list1
    tagged_text=" ".join(lst1)
    print tagged_text

I then get the results and hand repair unwanted tags such as 
Hawaiian/TAG goddess of volcano/TAG.

I am looking for a better approach of coding so that I need not spend time on 
hand repairing.

Here the corpus, i.e. volcanotxt.txt, is the untagged text given first, and 
the wordlist, i.e. taglist.txt, is the list of words given just above the code. 

I am using Python2.7.15 on MS-Windows 7.

If any one may kindly suggest a solution.

Thanks in advance. 
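
A sketch of the same word-by-word tagging, taking strings instead of file paths so it can be tried without the two files. Two things differ from the code above and are assumptions of this sketch: the word list is held in a set, and trailing punctuation is split off before the lookup, so that a token such as "PAHOA:" can still match "PAHOA".

```python
import string


def tag_text(corpus_text, taglist_text, tag="/TAG"):
    """Tag every token of corpus_text whose core word is in the tag list."""
    wordset = set(taglist_text.split())      # set: fast membership tests
    tagged = []
    for token in corpus_text.split():
        core = token.rstrip(string.punctuation)
        trailing = token[len(core):]         # e.g. ":" or ","
        if core in wordset:
            tagged.append(core + tag + trailing)
        else:
            tagged.append(token)
    return " ".join(tagged)


print(tag_text("laze PAHOA: The eruption of Kilauea volcano",
               "PAHOA Kilauea volcano"))
# laze PAHOA/TAG: The eruption of Kilauea/TAG volcano/TAG
```

Matching single words this way still tags "Hawaiian" and "volcano" in "Hawaiian goddess of volcano", which is exactly the hand repair described above; avoiding that needs phrase-level matching rather than a faster lookup.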







Re: Problem of writing long list of lists file to csv

2018-05-22 Thread subhabangalore
On Tuesday, May 22, 2018 at 3:55:58 PM UTC+5:30, Peter Otten wrote:
> 
> 
> > lst2=lst1[:4]
> > with open("my_csv.csv","wb") as f:
> >     writer = csv.writer(f)
> >     writer.writerows(lst2)
> > 
> > Here it is writing only the first four lists. 
> 
> Hint: look at the first line in the quotation above.

Thank you Sir. Sorry to disturb you. 
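
Following the hint, the slice lst1[:4] is what limits the output to four rows; passing the full list writes everything. A sketch in Python 3 syntax (on Python 2.7 the file would be opened with "wb" instead of "w" and newline=""); the generated sample lines, the temporary directory and the helper names `write_rows` and `read_rows` are made up for the example.

```python
import csv
import os
import tempfile


def write_rows(path, rows):
    """Write every row of a list of lists to a CSV file."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)


def read_rows(path):
    """Read the CSV file back as a list of lists."""
    with open(path, newline="") as f:
        return list(csv.reader(f))


# Stand-in for the 177 lines of EngText1.txt.
lines = ["This is line %d." % i for i in range(177)]
lst1 = [line.lower().replace(".", "").split() for line in lines]

path = os.path.join(tempfile.mkdtemp(), "my_csv.csv")
write_rows(path, lst1)           # the whole list, not lst1[:4]
print(len(read_rows(path)))      # 177
```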


Problem of writing long list of lists file to csv

2018-05-22 Thread subhabangalore
I have a list of lists (177 lists). 

I am trying to write them as file.

I used the following code to write it in a .csv file.

import csv
def word2vec_preprocessing():
    a1=open("/python27/EngText1.txt","r")
    list1=[]
    for line in a1:
        line1=line.lower().replace(".","").split()
        #print line1
        list1.append(line1)
    lst1=list1
    lst2=lst1[:4]
    with open("my_csv.csv","wb") as f:
        writer = csv.writer(f)
        writer.writerows(lst2)

Here it is writing only the first four lists. 

I have searched for help, and it seems this is a known issue 
without much of a fix. 
Please see the following link. 
https://stackoverflow.com/questions/30711899/python-how-to-write-list-of-lists-to-file

I have now tried pandas and json as follows, but got the same result. 
import json
import pandas as pd
my_df = pd.DataFrame(lst2)
my_df.to_csv('sbb_csv.csv', index=False, header=False)

with open('sbb1.json', 'w') as F:
    # Use the json dumps method to write the list to disk
    F.write(json.dumps(lst2))
with open('sbb1.json', 'r') as F:
    B = json.loads(F.read())

print B


I am using Python 2.7.15 (v2.7.15:ca079a3ea3, Apr 30 2018, 16:22:17) [MSC 
v.1500 32 bit (Intel)] on win32
in MS-Windows.

Please suggest what error I may be making.

Thanking in advance.






TypeError: expected string or Unicode object, NoneType found

2018-05-19 Thread subhabangalore
I wrote a small piece of following code 

import nltk
from nltk.corpus.reader import TaggedCorpusReader
from nltk.tag import CRFTagger
def NE_TAGGER():
    reader = TaggedCorpusReader('/python27/', r'.*\.pos')
    f1=reader.fileids()
    print "The Files of Corpus are:",f1
    sents=reader.tagged_sents()
    ls=len(sents)
    print "Length of Corpus Is:",ls
    train_data=sents[:300]
    test_data=sents[301:350]
    ct = CRFTagger()
    crf_tagger=ct.train(train_data,'model.crf.tagger')

This code is working fine. 
Now if I change the data size in train_data to, say, 500 or 3000, by giving 
train_data=sents[:500] or train_data=sents[:3000], it gives me the 
following error.

Traceback (most recent call last):
  File "", line 1, in <module>
    NE_TAGGER()
  File "C:\Python27\HindiCRFNERTagger1.py", line 20, in NE_TAGGER
    crf_tagger=ct.train(train_data,'model.crf.tagger')
  File "C:\Python27\lib\site-packages\nltk\tag\crf.py", line 185, in train
    trainer.append(features,labels)
  File "pycrfsuite\_pycrfsuite.pyx", line 312, in pycrfsuite._pycrfsuite.BaseTrainer.append (pycrfsuite/_pycrfsuite.cpp:3800)
  File "stringsource", line 53, in vector.from_py.__pyx_convert_vector_from_py_std_3a__3a_string (pycrfsuite/_pycrfsuite.cpp:10738)
  File "stringsource", line 15, in string.from_py.__pyx_convert_string_from_py_std__in_string (pycrfsuite/_pycrfsuite.cpp:10633)
TypeError: expected string or Unicode object, NoneType found
>>> 

I have searched the web for solutions and found the following links,
https://stackoverflow.com/questions/14219038/python-multiprocessing-typeerror-expected-string-or-unicode-object-nonetype-f
or
https://github.com/kamakazikamikaze/easysnmp/issues/50

I also reloaded Python, but did not find much help. 

I am using Python 2.7.15 (v2.7.15:ca079a3ea3, Apr 30 2018, 16:22:17) [MSC 
v.1500 32 bit (Intel)] on win32

My O/S is, MS-Windows 7.

If any body may kindly suggest a resolution. 
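
The traceback ends inside the conversion of the labels, which suggests that one of the labels passed to the trainer is None rather than a string. One possible cause (an assumption, not confirmed in this thread) is a token beyond the first 300 sentences of the .pos files that has no tag, since a word without a tag separator is read as (word, None). A sketch for locating such tokens; `find_untagged` is a hypothetical helper and the sample data is made up.

```python
def find_untagged(tagged_sents):
    """Return (sentence index, token index, word) for tokens with no tag."""
    problems = []
    for s_idx, sent in enumerate(tagged_sents):
        for t_idx, (word, tag) in enumerate(sent):
            if not tag:                      # None or empty string
                problems.append((s_idx, t_idx, word))
    return problems


sample = [
    [("Delhi", "LOC"), ("is", "O")],
    [("Ram", "PER"), ("went", None)],
]
print(find_untagged(sample))  # [(1, 1, 'went')]
```

With the real corpus the call would be along the lines of find_untagged(reader.tagged_sents()[:500]); any entries it reports point at the lines of the .pos files to repair.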


Re: Time Calculation to Tag a Sentence/File (Posting On Python-List Prohibited)

2017-06-09 Thread subhabangalore
On Saturday, June 10, 2017 at 1:53:07 AM UTC+5:30, Paul Barry wrote:
> This is a strange statement.  Python 3 doesn't even clash with Python 2, so
> I can't think of how it might cause problems with Java.  I've run 2 and 3
> on Windows 7, Vista, and 10 without any issues.
> 
> Paul.
> 
> On 9 June 2017 at 20:14,  wrote:
> 
> > On Friday, June 9, 2017 at 1:18:35 PM UTC+5:30, Lawrence D’Oliveiro wrote:
> > > 
> > > > ... (with Python2.7 on MS-Windows 7) ...
> > >
> > > Why?
> >
> > Are you asking why not Python3? My Java based colleagues say it clashes
> > with Java, so we try to work around Python2.x.
> > --
> > https://mail.python.org/mailman/listinfo/python-list
> >
> 
> 
> 
> -- 
> 
> Lecturer, Computer Networking: Institute of Technology, Carlow, Ireland.

Dear Sir,
I believe I can take your word. I'd take your word and send to my project 
manager, let me check his view now. 
Regards,
RP


Re: Time Calculation to Tag a Sentence/File (Posting On Python-List Prohibited)

2017-06-09 Thread subhabangalore
On Friday, June 9, 2017 at 1:18:35 PM UTC+5:30, Lawrence D’Oliveiro wrote:
> On Thursday, June 8, 2017 at 9:57:40 AM UTC+12, subhaba...@gmail.com wrote:
> > ... (with Python2.7 on MS-Windows 7) ...
> 
> Why?

Are you asking why not Python3? My Java based colleagues say it clashes with 
Java, so we try to work around Python2.x.


Time Calculation to Tag a Sentence/File

2017-06-07 Thread subhabangalore
I am trying to calculate the time required to tag one sentence/file by one 
trained NLTK HMM Tagger.
To do this I am writing the following code, please suggest if I need to revise 
anything here.

import nltk
from nltk.corpus.reader import TaggedCorpusReader
import time
#HMM 
reader = TaggedCorpusReader('/python27/', r'.*\.pos')
f1=reader.fileids()
print f1
sents=reader.tagged_sents()
ls=len(sents)
print "Total No of Sentences:",ls
train_sents=sents[0:40]
test_sents=sents[41:46]
#TRAINING & TESTING
hmm_tagger=nltk.HiddenMarkovModelTagger.train(train_sents)
test=hmm_tagger.test(test_sents)
appli_sent1=reader.sents(fileids='minicv.pos')[0]
print "SAMPLE INPUT:",appli_sent1
#TIME CALCULATION 
start_time = time.clock()
application=hmm_tagger.tag(appli_sent1) #I MAY REPLACE WITH ONE DOCUMENT 
print "ENTITY RECOGNIZED",application 
print "Time Taken Is:",time.clock() - start_time, "seconds"

NB: This is a toy kind example and I did not follow much of training/testing 
size parameters. 

My question is only about the time calculation part. This is not a forum for 
Machine Learning, but as there are many people here who have very high level 
knowledge of it, anyone is most welcome to give his/her valuable feedback, 
which may improve my knowledge. 

As the code is pasted here from IDLE (with Python2.7 on MS-Windows 7) I could 
not maintain proper indentation; apologies for the same. 
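
For the timing part, a sketch using timeit.default_timer, which selects a suitable timer on each platform and also works on Python 3, where time.clock was removed in version 3.8. The func argument stands for any callable such as hmm_tagger.tag; the dummy tagger below is only there so the example runs without a trained model, and `time_call` is a hypothetical helper name.

```python
from timeit import default_timer


def time_call(func, arg, repeat=5):
    """Return (result, average seconds per call) over `repeat` calls."""
    start = default_timer()
    for _ in range(repeat):
        result = func(arg)
    elapsed = default_timer() - start
    return result, elapsed / repeat


def dummy_tag(tokens):
    # Stand-in for hmm_tagger.tag: tags every token as "NN".
    return [(token, "NN") for token in tokens]


result, seconds = time_call(dummy_tag, ["Donald", "Trump"])
print("Time taken is: %.6f seconds" % seconds)
```

Averaging over several calls reduces the noise that a single very short call shows, and keeping the print statements outside the timed region avoids measuring console output along with the tagging.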


String Replacement

2017-01-23 Thread subhabangalore
I have a string like 

"Trump is $ the president of USA % Obama was $ the president of USA % Putin is 
$ the premier of Russia%"

Here, I want to extract the portions from $...%, which would be

"the president of USA", 
"the president of USA", 
"the premier of Russia"

and would work some post extraction jobs, like I may split them or annotate 
them and may replace them
back to its own position with the edited string.

In the end it may look like

"Trump is  the/DET president/NN of/PREP USA/NN  Obama was  the/DET president/NN 
of/PREP USA/NN Putin is  the/DET premier/NN of/PREP Russia/NN"

I am working around replace and re.sub 

If any one may kindly suggest.

I am using Python2.7.12 on MS-Windows 7

Thanking in Advance
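
A sketch of one way to do this with re.sub and a replacement function: the pattern captures each $...% span, the function edits the captured text, and the result is put back in the same position. The `TAGS` lookup is a made-up stand-in for whatever annotation step is applied to the extracted portions.

```python
import re

# Made-up lookup standing in for the real annotation step.
TAGS = {"the": "DET", "president": "NN", "premier": "NN",
        "of": "PREP", "USA": "NN", "Russia": "NN"}

# Non-greedy so each span stops at the first "%"; DOTALL lets a span
# continue across a line break.
SPAN = re.compile(r"\$(.*?)%", re.DOTALL)


def annotate(match):
    """Tag each word of one captured span."""
    words = match.group(1).split()
    return " ".join("%s/%s" % (w, TAGS.get(w, "UNK")) for w in words)


text = ("Trump is $ the president of USA % Obama was $ the president of USA % "
        "Putin is $ the premier of Russia%")

print([s.strip() for s in SPAN.findall(text)])
print(SPAN.sub(annotate, text))
```

SPAN.findall gives the extracted portions on their own, while SPAN.sub performs the extract, edit and replace steps in one pass, so no bookkeeping of positions is needed.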


Re: UTF-8 Encoding Error

2016-12-29 Thread subhabangalore
On Friday, December 30, 2016 at 7:16:25 AM UTC+5:30, Steve D'Aprano wrote:
> On Sun, 25 Dec 2016 04:50 pm, Grady Martin wrote:
> 
> > On 2016-12-22 22:38, wrote:
> >>I am getting the error:
> >>UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 15:
> >>invalid start byte
> > 
> > The following is a reflex of mine, whenever I encounter Python 2 Unicode
> > errors:
> > 
> > import sys
> > reload(sys)
> > sys.setdefaultencoding('utf8')
> 
> 
> This is a BAD idea, and doing it by "reflex" without very careful thought is
> just cargo-cult programming. You should not thoughtlessly change the
> default encoding without knowing what you are doing -- and if you know what
> you are doing, you won't change it at all.
> 
> The Python interpreter *intentionally* removes setdefaultencoding at startup
> for a reason. Changing the default encoding can break the interpreter, and
> it is NEVER what you actually need. If you think you want it because it
> fixes "Unicode errors", all you are doing is covering up bugs in your code.
> 
> Here is some background on why setdefaultencoding exists, and why it is
> dangerous:
> 
> https://anonbadger.wordpress.com/2015/06/16/why-sys-setdefaultencoding-will-break-code/
> 
> If you have set the Python 2 default encoding to anything but ASCII, you are
> now running a broken system with subtle bugs, including in data structures
> as fundamental as dicts.
> 
> The standard behaviour:
> 
> py> d = {u'café': 1}
> py> for key in d:
> ... print key == 'caf\xc3\xa9'
> ...
> False
> 
> 
> As we should expect: the key in the dict, u'café', is *not* the same as the
> byte-string 'caf\xc3\xa9'. But watch how we can break dictionaries by
> changing the default encoding:
> 
> py> reload(sys)
> <module 'sys' (built-in)>
> py> sys.setdefaultencoding('utf-8')  # don't do this
> py> for key in d:
> ... print key == 'caf\xc3\xa9'
> ...
> True
> 
> 
> So Python now thinks that 'caf\xc3\xa9' is a key. Or does it?
> 
> py> d['caf\xc3\xa9']
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> KeyError: 'caf\xc3\xa9'
> 
> By changing the default encoding, we now have something which is both a key
> and not a key of the dict at the same time.
> 
> 
> 
> > A relevant Stack Exchange thread awaits you here:
> > 
> > http://stackoverflow.com/a/21190382/2230956
> 
> And that's why I don't trust StackOverflow. It's not bad for answering
> simple questions, but once the question becomes more complex the quality of
> accepted answers goes down the toilet. The highest voted answer is *wrong*
> and *dangerous*.
> 
> And then there's this comment:
> 
> Until this moment I was forced to include "# -- coding: utf-8 --" at 
> the begining of each document. This is way much easier and works as
> charm
> 
> I have no words for how wrong that is. And this comment:
> 
> ty, this worked for my problem with python throwing UnicodeDecodeError
> on var = u"""vary large string"""
> 
> No it did not. There is no possible way that Python will throw that
> exception on assignment to a Unicode string literal.
> 
> It is posts like this that demonstrate how untrustworthy StackOverflow can
> be.
> 
> 
> 
> -- 
> Steve
> “Cheer up,” they said, “things could be worse.” So I cheered up, and sure
> enough, things got worse.

Thanks for your detailed comment. The code runs fine sometimes and raises
errors at other times. I would be grateful if anyone could look at how I am
approaching the problem.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: UTF-8 Encoding Error

2016-12-29 Thread subhabangalore
On Friday, December 30, 2016 at 3:35:56 AM UTC+5:30, subhaba...@gmail.com wrote:
> On Monday, December 26, 2016 at 3:37:37 AM UTC+5:30, Gonzalo V wrote:
> > Try utf-8-sig
> > On 25 Dec 2016 2:57 AM, "Grady Martin" <> wrote:
> > 
> > > On 2016-12-22 22:38,  wrote:
> > >
> > >> I am getting the error:
> > >> UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 15:
> > >> invalid start byte
> > >>
> > >
> > > The following is a reflex of mine, whenever I encounter Python 2 Unicode
> > > errors:
> > >
> > > import sys
> > > reload(sys)
> > > sys.setdefaultencoding('utf8')
> > >
> > > A relevant Stack Exchange thread awaits you here:
> > >
> > > http://stackoverflow.com/a/21190382/2230956
> > > --
> > > https://mail.python.org/mailman/listinfo/python-list
> > >
> 
> Thank you for your kind time and answers. 
> 
> I tried to open one file in default ASCII format in MS-Windows 7. 
> txtf=open("/python27/TestFiles/small_file_1.txt","r").read()
> I could write them in UTF-8 using 
> cd1=codecs.open("/python27/TestFiles/file1.pos","w", "utf-8-sig")
> cd1.write(txtf)
> 
> Here, I was getting an error as,
> UnicodeDecodeError: 'ascii' codec can't decode byte 0x96 in position 150: 
> ordinal not in range(128)
> 
> Then I used,
> >>> import sys
> >>> reload(sys)
> >>> sys.setdefaultencoding('utf8')
> 
> and then wrote 
> >>> cd1.write(txtf)
> it went fine. 
> 
> Now in my actual problem I am writing it bit differently: 
> 
> with open('textfile.txt') as f:
>     for i, g in enumerate(grouper(n, f, fillvalue=''), 1):
>         with open('/Python27/TestFiles/small_filing_{0}.pos'.format(i * n), 'w') as fout:
>             fout.writelines(g)
> 
> I am trying to fix this. 
> 
> If you may kindly suggest.

The grouper method is:

from itertools import izip_longest

def grouper(n, iterable, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

n = 3
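
A minimal sketch of the chunked write above with the encodings made explicit, so that nothing depends on the interpreter's default encoding. It assumes the input is cp1252 (the usual Windows "ANSI" code page, which is only a guess based on byte 0x96) and uses io.open, which behaves the same on Python 2.7 and Python 3. The names split_file and out_pattern are hypothetical, and islice stands in for grouper so that no fill values are needed.

```python
import io
from itertools import islice


def split_file(src, out_pattern, n=3, src_encoding="cp1252"):
    """Write every n lines of src to its own UTF-8 file; return the paths."""
    written = []
    with io.open(src, "r", encoding=src_encoding) as f:
        i = 0
        while True:
            chunk = list(islice(f, n))  # next n lines, fewer at end of file
            if not chunk:
                break
            i += 1
            path = out_pattern.format(i * n)
            with io.open(path, "w", encoding="utf-8") as fout:
                fout.writelines(chunk)
            written.append(path)
    return written
```

Because the text is decoded on read and encoded on write, there should be no need for sys.setdefaultencoding at all.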


Re: UTF-8 Encoding Error

2016-12-29 Thread subhabangalore
On Monday, December 26, 2016 at 3:37:37 AM UTC+5:30, Gonzalo V wrote:
> Try utf-8-sig
> On 25 Dec 2016 2:57 AM, "Grady Martin" <> wrote:
> 
> > On 2016-12-22 22:38,  wrote:
> >
> >> I am getting the error:
> >> UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 15:
> >> invalid start byte
> >>
> >
> > The following is a reflex of mine, whenever I encounter Python 2 Unicode
> > errors:
> >
> > import sys
> > reload(sys)
> > sys.setdefaultencoding('utf8')
> >
> > A relevant Stack Exchange thread awaits you here:
> >
> > http://stackoverflow.com/a/21190382/2230956
> > --
> > https://mail.python.org/mailman/listinfo/python-list
> >

Thank you for your kind time and answers. 

I tried to open one file in default ASCII format in MS-Windows 7. 
txtf=open("/python27/TestFiles/small_file_1.txt","r").read()
I could write them in UTF-8 using 
cd1=codecs.open("/python27/TestFiles/file1.pos","w", "utf-8-sig")
cd1.write(txtf)

Here, I was getting an error as,
UnicodeDecodeError: 'ascii' codec can't decode byte 0x96 in position 150: 
ordinal not in range(128)

Then I used,
>>> import sys
>>> reload(sys)
>>> sys.setdefaultencoding('utf8')

and then wrote 
>>> cd1.write(txtf)
it went fine. 

Now in my actual problem I am writing it bit differently: 

with open('textfile.txt') as f:
    for i, g in enumerate(grouper(n, f, fillvalue=''), 1):
        with open('/Python27/TestFiles/small_filing_{0}.pos'.format(i * n), 'w') as fout:
            fout.writelines(g)

I am trying to fix this. 

If you may kindly suggest. 




UTF-8 Encoding Error

2016-12-22 Thread subhabangalore
I am getting the error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 15: invalid 
start byte

as I try to read some files through TaggedCorpusReader, a corpus reader
class of NLTK.
My files are saved in the default MS-Windows ANSI encoding.
I am using Python2.7 on MS-Windows 7.

I have tried the following options till now, 
string.encode('utf-8').strip()
unicode(string)
unicode(str, errors='replace')
unicode(str, errors='ignore')
string.decode('cp1252')

But none of these has helped much.

I would be grateful for any suggestions.
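
Byte 0x96 cannot start a UTF-8 sequence, but in cp1252 (the usual Windows "ANSI" code page) it is an en dash. A minimal sketch, assuming the files really are cp1252: decode the raw bytes with that codec and re-encode them as UTF-8. The sample bytes are made up for illustration.

```python
raw = b"pages 10\x9620"            # bytes as read from an "ANSI" file (made-up sample)

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    error = str(exc)               # 0x96 is not a valid UTF-8 start byte

text = raw.decode("cp1252")        # 0x96 maps to U+2013 EN DASH in cp1252
utf8_bytes = text.encode("utf-8")  # now safe to hand to a UTF-8 reader
```

NLTK's corpus readers also appear to accept an encoding argument, so passing encoding='cp1252' to TaggedCorpusReader may make the re-encoding step unnecessary; that is worth checking against the installed NLTK version.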


Working around multiple files in a folder

2016-11-21 Thread subhabangalore
I have a python script where I am trying to read from a list of files in a 
folder and trying to process something. 
As I try to take out the output I am presently appending to a list.

But I am trying to write the result of individual files in individual list or 
files.

The script is as follows:

import glob
def speed_try():
    #OPENING THE DICTIONARY
    a4=open("/python27/Dictionaryfile","r").read()
    #CONVERTING DICTIONARY INTO WORDS
    a5=a4.lower().split()
    list1=[]
    for filename in glob.glob('/Python27/*.txt'):
        a1=open(filename,"r").read()
        a2=a1.lower()
        a3=a2.split()
        for word in a3:
            if word in a5:
                a6=a5.index(word)
                a7=a6+1
                a8=a5[a7]
                a9=word+"/"+a8
                list1.append(a9)
            elif word not in a5:
                list1.append(word)
            else:
                print "None"

    x1=list1
    x2=" ".join(x1)
    print x2

Till now, I have tried to experiment over the following solutions:

a) def speed_try():
  #OPENING THE DICTIONARY
  a4=open("/python27/Dictionaryfile","r").read()
  #CONVERTING DICTIONARY INTO WORDS
  a5=a4.lower().split()
  list1=[]
  for filename in glob.glob('/Python27/*.txt'):
 a1=open(filename,"r").read()
 a2=a1.lower()
 a3=a2.split()
  list1.append(a3)


x1=list1
print x1

Looks very close but I am unable to fit the if...elif...else part. 

b) import glob
def multi_filehandle():
    list_of_files = glob.glob('/Python27/*.txt')
    for file_name in list_of_files:
        FI = open(file_name, 'r')
        FI1=FI.read().split()
        FO = open(file_name.replace('txt', 'out'), 'w')
        for line in FI:
            FO.write(line)

        FI.close()
        FO.close()

I can write the output, but I am failing to process the files between
opening and writing.

I am also trying to find examples using fileinput.

I would be grateful if any of the learned members could suggest how I may
proceed.

I am using Python2.x on MS-Windows. 

The practices are scripts and not formal codes so I have not followed style 
guides.

Apology for any indentation error.

Thanking in advance.
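
One way to keep the results of each file separate is to build the list inside the loop over files and write it out before moving on. The sketch below is only an illustration under stated assumptions: the dictionary file is taken to hold alternating "word tag" tokens, the files are taken to be UTF-8, and the function names load_tags and tag_folder are hypothetical. The ".out" extension follows attempt (b).

```python
import glob
import io
import os


def load_tags(dictionary_path):
    """Read alternating 'word tag word tag ...' tokens into a dict."""
    with io.open(dictionary_path, encoding="utf-8") as f:
        tokens = f.read().lower().split()
    return dict(zip(tokens[::2], tokens[1::2]))


def tag_folder(folder, tags):
    """Tag each .txt file in folder and write the result next to it as .out."""
    results = {}
    for filename in sorted(glob.glob(os.path.join(folder, "*.txt"))):
        with io.open(filename, encoding="utf-8") as f:
            words = f.read().lower().split()
        # known words become word/tag, unknown words are kept unchanged
        tagged = [w + "/" + tags[w] if w in tags else w for w in words]
        out_name = os.path.splitext(filename)[0] + ".out"
        with io.open(out_name, "w", encoding="utf-8") as out:
            out.write(u" ".join(tagged))
        results[filename] = tagged   # one list per input file
    return results
```

The returned dict maps each input filename to its own list, so the per-file results are available both in memory and on disk.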




Re: Python String Handling

2016-11-12 Thread subhabangalore
On Saturday, November 12, 2016 at 7:34:31 AM UTC+5:30, Steve D'Aprano wrote:
> On Sat, 12 Nov 2016 09:29 am wrote:
> 
> > I have a string
> > "Hello my name is Richard"
> > 
> > I have a list of words as,
> > ['Hello/Hi','my','name','is','Richard/P']
> > 
> > I want to identify the match of 'Hello' and 'Richard'
> > in list, and replace them with 'Hello/Hi' and 'Richard/P'
> > respectively.
> > 
> > The result should look like,
> > "Hello/Hi my name is Richard/P".
> 
> Looks like you want:
> 
> 
> mystring = "Hello my name is Richard"
> words = ['Hello/Hi', 'my', 'name', 'is', 'Richard/P']
> result = " ".join(words)
> 
> assert result == "Hello/Hi my name is Richard/P"
> 
> 
> and mystring is irrelevant.
> 
> 
> 
> 
> -- 
> Steve
> “Cheer up,” they said, “things could be worse.” So I cheered up, and sure
> enough, things got worse.

Thank you all for your kind time. 
The problem is slightly more complex.

I am restating the problem. 

"Hello my name is Richard"

is a string. 

I have tagged the words Hello and Richard
as "Hello/Hi" and "Richard/P". 
After this I could get the string as a list of words
as in,
['Hello/Hi','my','name','is','Richard/P'] 

Now I want to replace the string with 
Hello/Hi my name is Richard/P

This may look like simply joining the list, but it is not, because
the list may change to, say,
['Hello/Hi','my/M','name','is/I','Richard/P'] 

I can build that list, but producing the string 

Hello/Hi my/M name is/I Richard/P

from the original sentence is hard, because the words that carry
a tag may vary. I need a general rule. 

I am trying to find the index of each word in the list, pop it,
replace it with the new value, and join the list back into
a string.

This works, but I would welcome a smarter suggestion from the
experts here.

Thanks in advance
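
A possible rule, sketched under the assumption that every tagged entry has the form word/tag and that a given word always gets the same tag: build a lookup from the bare word to its tagged form, then rebuild the sentence word by word. The function name retag is hypothetical.

```python
def retag(sentence, tagged_words):
    """Replace each word of sentence by its tagged form, if one exists."""
    # Map the bare word to its tagged form, e.g. 'Hello' -> 'Hello/Hi'.
    lookup = {}
    for item in tagged_words:
        word = item.split("/", 1)[0]
        lookup[word] = item
    # Words without an entry in the lookup are kept unchanged.
    return " ".join(lookup.get(word, word) for word in sentence.split())
```

For example, retag("Hello my name is Richard", ['Hello/Hi','my/M','name','is/I','Richard/P']) gives "Hello/Hi my/M name is/I Richard/P". If the same word can carry different tags at different positions, a lookup keyed on the word alone would not be enough.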

 




Python String Handling

2016-11-11 Thread subhabangalore
I have a string 
"Hello my name is Richard"

I have a list of words as,
['Hello/Hi','my','name','is','Richard/P']

I want to identify the match of 'Hello' and 'Richard'
in list, and replace them with 'Hello/Hi' and 'Richard/P'
respectively.

The result should look like,
"Hello/Hi my name is Richard/P".

Simple replace method may not work.

I was trying the following script. 


import fuzzywuzzy
from fuzzywuzzy import fuzz
from fuzzywuzzy import process
import itertools
def sometry():
 x1="Hello my name is Richard"
 x2=x1.split()
 x3=['Hello/Hi','my','name','is','Richard/P']
 list1=[]
 for i in x2:
  x4=process.extractOne(i, x3)
  print x4
  x5=x4[0]
  print x5
  x6=[x5 if x==i else x for x in x2]
  print x6
  list1.append(x6)

 b1=list1
 print b1
 merged = list(itertools.chain.from_iterable(b1))
 merged1=list(set(merged))
 print merged1

I am working in Python2.x on MS-Windows. 
This is a simple practice script so I have not followed style guides.
Apology for any indentation error.

I would be grateful if any of the members could give me an idea of how I may achieve it.

Thanks in Advance. 


Re: Question on List processing

2016-04-26 Thread subhabangalore
On Monday, April 25, 2016 at 10:07:13 PM UTC+5:30, Steven D'Aprano wrote:
> 
> 
> > Dear Group,
> > 
> > I have a list of tuples, as follows,
> > 
> > list1=[u"('koteeswaram/BHPERSN engaged/NA himself/NA in/NA various/NA
> [... 17 more lines of data ...]
> 
> Hi Subhabrata, and thanks for the question.
> 
> Please remember that we are offering help for free, in our own time. If you
> want help from us, you should help us to help you.
> 
> It is very unlikely that many people will spend the time to study your data
> in close enough detail to understand your requirements. Please give a
> *simplified* example. Instead of 17 lines of repetitive data, use a "toy"
> example that matches the format but without all the complicated details.
> And format it so that it is easy to read:
> 
> input = [u"('a/b/ ','A')",
>  u"('z/x/ ','B')",
>  u"('b/d/ ','C')",
>  ]
> 
> output = 
> 
> 
> 
> > I tried to make it as follows,
> [...]
> > but not helping.
> 
> What do you mean, "not helping"? What happens when you try?
> 
> Please show a *simple* example, with no more than four or five lines of
> *short, easy to read* text.
> 
> Remember, we are giving you advice and consulting for free. We are not paid
> to do this. If your questions are too difficult, boring, tedious, or
> unpleasant, we will just ignore them, so please help us to help you by
> simplifying them as much as possible.
> 
> Thank you.
> 
> 
> 
> 
> -- 
> Steven

Dear Steven,

Thank you for your kind suggestion. 
I am trying to send you a revised example.

I have a list as, 
list1=[u"('koteeswaram/BHPERSN engaged/NA ','class1')",
       u"('koteeswaram/BHPERSN is/NA ','class1')"]

I like to convert it as, 

list1=[('koteeswaram/BHPERSN engaged/NA ','class1'),
 ('koteeswaram/BHPERSN is/NA  ','class1')]

I tried to make it as follows,
list2=[]
for i in list1:
    a1=unicodedata.normalize('NFKD', i).encode('ascii','ignore')
    a2=a1.replace('"',"")
    list2.append(a2)

and,

for i in list1:
    a3=i[1:-1]
    list2.append(a3)


but I am not getting desired output. 
If any one may kindly suggest how may I approach it? 

Regards,
Subhabrata 
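
Each element of list1 is a string that happens to contain the text of a tuple, so stripping characters will not turn it into a tuple. A minimal sketch using the standard library's ast.literal_eval, which parses a string holding a Python literal; it assumes every element is a well-formed tuple literal as in the revised example.

```python
import ast

list1 = [u"('koteeswaram/BHPERSN engaged/NA ','class1')",
         u"('koteeswaram/BHPERSN is/NA ','class1')"]

# literal_eval only evaluates Python literals (strings, tuples, numbers, ...),
# so it is safer than eval() on data that comes from a file or database.
list2 = [ast.literal_eval(item) for item in list1]
```

Each element of list2 is then a real two-item tuple of (tagged sentence, class label).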


Re: Question on List processing

2016-04-26 Thread subhabangalore
On Monday, April 25, 2016 at 10:07:13 PM UTC+5:30, Steven D'Aprano wrote:
> On Tue, 26 Apr 2016 12:56 am, wrote:
> 
> > Dear Group,
> > 
> > I have a list of tuples, as follows,
> > 
> > list1=[u"('koteeswaram/BHPERSN engaged/NA himself/NA in/NA various/NA
> [... 17 more lines of data ...]
> 
> Hi Subhabrata, and thanks for the question.
> 
> Please remember that we are offering help for free, in our own time. If you
> want help from us, you should help us to help you.
> 
> It is very unlikely that many people will spend the time to study your data
> in close enough detail to understand your requirements. Please give a
> *simplified* example. Instead of 17 lines of repetitive data, use a "toy"
> example that matches the format but without all the complicated details.
> And format it so that it is easy to read:
> 
> input = [u"('a/b/ ','A')",
>  u"('z/x/ ','B')",
>  u"('b/d/ ','C')",
>  ]
> 
> output = 
> 
> 
> 
> > I tried to make it as follows,
> [...]
> > but not helping.
> 
> What do you mean, "not helping"? What happens when you try?
> 
> Please show a *simple* example, with no more than four or five lines of
> *short, easy to read* text.
> 
> Remember, we are giving you advice and consulting for free. We are not paid
> to do this. If your questions are too difficult, boring, tedious, or
> unpleasant, we will just ignore them, so please help us to help you by
> simplifying them as much as possible.
> 
> Thank you.
> 
> 
> 
> 
> -- 
> Steven

Dear Steven,

Thank you for your kind suggestion. 
I will keep it in mind.

I am trying to send you a revised example.
list1=[u"('koteeswaram/BHPERSN engaged/NA ','class1')",
       u"('koteeswaram/BHPERSN is/NA ','class1')"]

[('koteeswaram/BHPERSN engaged/NA ','class1'),
 ('koteeswaram/BHPERSN is/NA  ','class1')]

I tried to make it as follows,
list2=[]
for i in list1:
    a1=unicodedata.normalize('NFKD', i).encode('ascii','ignore')
    a2=a1.replace('"',"")
    list2.append(a2)

and,

for i in list1:
    a3=i[1:-1]
    list2.append(a3)


but I am not getting desired output. 
If any one may kindly suggest how may I approach it? 

Regards,
Subhabrata 


Question on List processing

2016-04-25 Thread subhabangalore
Dear Group,

I have a list of tuples, as follows,

list1=[u"('koteeswaram/BHPERSN engaged/NA himself/NA in/NA various/NA 
philanthropic/NA activities/NA  ','class1')", u"('koteeswaram/BHPERSN is/NA 
a/NA very/NA nice/NA person/NA  ','class1')", u"('koteeswaram/BHPERSN came/NA 
to/NA mumbai/LOC but/NA could/NA not/NA attend/NA the/ARTDEF board/NA 
meeting/NA  ','class1')", u"('the/ARTDEF people/NA of/NA the/ARTDEF company 
ABCOMP did/NA not/NA vote/NA for/NA koteeswaram/LOC  ','class2')", 
u"('the/ARTDEF director AHT of/NA the/ARTDEF company,/NA koteeswaram/BHPERSN 
had/NA been/NA advised/NA to/NA take/NA rest/NA for/NA a/NA while/NA  
','class2')", u"('animesh/BHPERSN chauhan/BHPERSN arrived/NA by/NA his/PRNM3PAS 
private/NA aircraft/NA in/NA mumbai/LOC  ','class2')", u"('animesh/BHPERSN 
chauhan/BHPERSN met/NA the/ARTDEF prime/HPLPERST minister/AHT of/NA india/LOCC 
over/NA some/NA issues/NA  ','class2')", u"('animesh/BHPERSN chauhan/BHPERSN 
is/NA trying/NA to/NA set/NA up/NA a/NA plant/NA in/NA uk/LOCC  ','class3')", 
u"('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA launch/NA a/NA 
new/ABCOMP office/AHT in/NA 
burdwan/LOC  ','class3')", u"('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA 
to/NA work/NA out/NA the/ARTDEF launch/NA of/NA a/NA new/ABCOMP product/NA 
in/NA india/LOCC  ','class3')"]

I want to make it like,

[('koteeswaram/BHPERSN engaged/NA himself/NA in/NA various/NA philanthropic/NA 
activities/NA','class1'),
 ('koteeswaram/BHPERSN is/NA a/NA very/NA nice/NA person/NA  ','class1'), 
('koteeswaram/BHPERSN came/NA to/NA mumbai/LOC but/NA could/NA not/NA attend/NA 
the/ARTDEF board/NA meeting/NA','class1'), ('the/ARTDEF people/NA of/NA 
the/ARTDEF company ABCOMP did/NA not/NA vote/NA for/NA koteeswaram/LOC  
','class2'),   ('the/ARTDEF director AHT of/NA the/ARTDEF company,/NA 
koteeswaram/BHPERSN had/NA been/NA advised/NA to/NA take/NA rest/NA for/NA a/NA 
while/NA  ','class2'), ('animesh/BHPERSN chauhan/BHPERSN arrived/NA by/NA 
his/PRNM3PAS private/NA aircraft/NA in/NA mumbai/LOC','class2'), 
('animesh/BHPERSN chauhan/BHPERSN met/NA the/ARTDEF prime/HPLPERST minister/AHT 
of/NA india/LOCC over/NA some/NA issues/NA','class2'), ('animesh/BHPERSN 
chauhan/BHPERSN is/NA trying/NA to/NA set/NA up/NA a/NA plant/NA in/NA 
uk/LOCC','class3'), ('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA 
launch/NA a/NA new/ABCOMP office/AHT in/NA burdwan/LOC','class3'),
('animesh/BHPERSN chauhan/BHPERSN is/NA trying/NA to/NA work/NA out/NA 
the/ARTDEF launch/NA of/NA a/NA new/ABCOMP product/NA in/NA 
india/LOCC','class3')]

I tried to make it as follows,
list2=[]
for i in list1:
    a1=unicodedata.normalize('NFKD', i).encode('ascii','ignore')
    a2=a1.replace('"',"")
    list2.append(a2)

and,

for i in list1:
    a3=i[1:-1]
    list2.append(a3)


but not helping.
If any one may kindly suggest how may I approach it?

Thanks in Advance,
Regards,
Subhabrata Banerjee. 





Re: Review Request of Python Code

2016-03-10 Thread subhabangalore
On Friday, March 11, 2016 at 12:22:31 AM UTC+5:30, Matt Wheeler wrote:
> On 10 March 2016 at 18:12,   wrote:
> > Matt, thank you for if...else suggestion, the data of NewTotalTag.txt
> > is like a simple list of words with unconventional tags, like,
> >
> > w1 tag1
> > w2 tag2
> > w3 tag3
> > ...
> > ...
> > w3  tag3
> >
> > like that.
> 
> I suspected so. The way your code currently works, if your input text
> contains one of the tags, e.g. 'tag1' you'll get an entry in your
> output something like 'tag1/w2'. I assume you don't want that :).
> 
> This is because you're using a single list to include all of the tags.
> Try something along the lines of:
> 
> dict_word={} #empty dictionary
> for line in dict_read.splitlines():
>     word, tag = line.split(' ')
>     dict_word[word] = tag
> 
> Notice I'm using splitlines() instead of split() to do the initial
> chopping up of your input. split() will split on any whitespace by
> default. splitlines should be self-explanatory.
> 
> I would split this and the file-open out into a separate function at
> this point. Large blobs of sequential code are not particularly easy
> on the eyes or the brain -- choose a sensible name, like
> load_dictionary. Perhaps something you could call like:
> 
> dict_word = load_dictionary("NewTotalTag.txt")
> 
> 
> You also aren't closing the file that you open at any point -- once
> you've loaded the data from it there's no need to keep the file opened
> (look up context managers).
> 
> -- 
> Matt Wheeler
> http://funkyh.at

Dear Matt,

I want the output in the format w1/tag1 ... You can find my detailed problem
statement in my reply to someone else's query. If you would like, I can write
it out again for you.

Regards,
Subhabrata
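
The load_dictionary helper suggested in the quoted reply could be sketched as follows. This is an illustration, not Matt's code: it assumes one "word tag" pair per line, uses split() on any whitespace because the sample data shows a double space in one line, and the encoding parameter is an assumption.

```python
import io


def load_dictionary(path, encoding="utf-8"):
    """Return {word: tag} from a file with one 'word tag' pair per line."""
    dict_word = {}
    with io.open(path, encoding=encoding) as f:   # file is closed on exit
        for line in f:
            parts = line.split()
            if len(parts) != 2:                   # skip blank or malformed lines
                continue
            word, tag = parts
            dict_word[word] = tag
    return dict_word
```

With this in place, dict_word = load_dictionary("NewTotalTag.txt") gives a word-to-tag mapping, and the tag can never be mistaken for a word.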


Re: Review Request of Python Code

2016-03-10 Thread subhabangalore
On Wednesday, March 9, 2016 at 9:49:17 AM UTC+5:30, subhaba...@gmail.com wrote:
> Dear Group,
> 
> I am trying to write a code for pulling data from MySQL at the backend and 
> annotating words and trying to put the results as separated sentences with 
> each line. The code is generally running fine but I am feeling it may be 
> better in the end of giving out sentences, and for small data sets it is okay 
> but with 50,000 news articles it is performing dead slow. I am using 
> Python2.7.11 on Windows 7 with 8GB RAM. 
> 
> I am trying to copy the code here, for your kind review. 
> 
> import MySQLdb
> import nltk
> def sql_connect_NewTest1():
>     db = MySQLdb.connect(host="localhost",
>                          user="*",
>                          passwd="*",
>                          db="abcd_efgh")
>     cur = db.cursor()
>     #cur.execute("SELECT * FROM newsinput limit 0,5;") #REPORTING RUNTIME ERROR
>     cur.execute("SELECT * FROM newsinput limit 0,50;")
>     dict_open=open("/python27/NewTotalTag.txt","r") #OPENING THE DICTIONARY FILE
>     dict_read=dict_open.read()
>     dict_word=dict_read.split()
>     a4=dict_word #Assignment for code.
>     list1=[]
>     flist1=[]
>     nlist=[]
>     for row in cur.fetchall():
>         #print row[2]
>         var1=row[3]
>         #print var1 #Printing lines
>         #var2=len(var1) # Length of file
>         var3=var1.split(".") #SPLITTING INTO LINES
>         #print var3 #Printing The Lines
>         #list1.append(var1)
>         var4=len(var3) #Number of all lines
>         #print "No",var4
>         for line in var3:
>             #print line
>             #flist1.append(line)
>             linew=line.split()
>             for word in linew:
>                 if word in a4:
>                     windex=a4.index(word)
>                     windex1=windex+1
>                     word1=a4[windex1]
>                     word2=word+"/"+word1
>                     nlist.append(word2)
>                     #print list1
>                     #print nlist
>                 elif word not in a4:
>                     word3=word+"/"+"NA"
>                     nlist.append(word3)
>                     #print list1
>                     #print nlist
>                 else:
>                     print "None"
> 
>     #print "###",flist1
>     #print len(flist1)
>     #db.close()
>     #print nlist
>     lol = lambda lst, sz: [lst[i:i+sz] for i in range(0, len(lst), sz)] #TRYING TO SPLIT THE RESULTS AS SENTENCES
>     nlist1=lol(nlist,7)
>     #print nlist1
>     for i in nlist1:
>         string1=" ".join(i)
>         print i
>         #print string1
> 
>
> Thanks in Advance.


Dear Group,

Thank you all, for your kind time and all suggestions in helping me.

Thank you Steve for writing the whole code. It works fully and
correctly, but speed is still an issue. We need to speed it up. 

Inada, I tried changing to 
cur = db.cursor(MySQLdb.cursors.SSCursor) but my System Admin 
said that this may not be the issue.

Freidrich, my problem is this: I have a big repository of .txt
files stored in MySQL in the backend. I have another list of words with
their possible tags. The tags are not conventional Parts of Speech (PoS)
tags, and were partly defined by others. 
The code is expected to read each file, line by line.
For each word in a line it scans the list for the appropriate
tag; if one is found it assigns it, otherwise it assigns NA.
The assignment should be in the format word/tag, so that
a string of n words looks like,
w1/tag w2/tag w3/tag w4/tag ... wn/tag, 

where tag may be a tag from the list or NA, as the case may be.

This format is used because the files are expected to be tagged
in Brown Corpus format. There is a Python library named NLTK.
If I want to save my data for use with its models, I need to
follow some specifications. I want to use the Tagged Corpus format. 

The tagged data coming out in this format should have one 
tagged sentence on each new line, or be a lattice. 

They expect the data to be saved in .pos format, which I am
not doing in this code at present; I may do that later. 

Please let me know if I need to give any more information.

Matt, thank you for if...else suggestion, the data of NewTotalTag.txt
is like a simple list of words with unconventional tags, like,

w1 tag1
w2 tag2
w3 tag3
...
...
w3  tag3

like that. 

Regards,
Subhabrata  
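
Much of the running time likely goes into "word in a4" and "a4.index(word)", each of which scans the whole list for every word of every article. A minimal sketch of the same tagging step with a dict, where a lookup takes roughly constant time on average. It assumes dict_word is a word-to-tag mapping built beforehand; the function name tag_text and the default parameter are hypothetical.

```python
def tag_text(text, dict_word, default="NA"):
    """Return one 'w1/tag w2/tag ...' string per sentence in text."""
    tagged_sentences = []
    for sentence in text.split("."):          # same naive split as the original
        words = sentence.split()
        if not words:                         # skip empty pieces after the last '.'
            continue
        tagged = ["%s/%s" % (w, dict_word.get(w, default)) for w in words]
        tagged_sentences.append(" ".join(tagged))
    return tagged_sentences
```

Because the words are grouped by sentence as they are tagged, the fixed-size chunking with lol(nlist, 7) is no longer needed, and each returned string can be written on its own line.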

  


Review Request of Python Code

2016-03-08 Thread subhabangalore
Dear Group,

I am trying to write a code for pulling data from MySQL at the backend and 
annotating words and trying to put the results as separated sentences with each 
line. The code is generally running fine but I am feeling it may be better in 
the end of giving out sentences, and for small data sets it is okay but with 
50,000 news articles it is performing dead slow. I am using Python2.7.11 on 
Windows 7 with 8GB RAM. 

I am trying to copy the code here, for your kind review. 

import MySQLdb
import nltk
def sql_connect_NewTest1():
    db = MySQLdb.connect(host="localhost",
                         user="*",
                         passwd="*",
                         db="abcd_efgh")
    cur = db.cursor()
    #cur.execute("SELECT * FROM newsinput limit 0,5;") #REPORTING RUNTIME ERROR
    cur.execute("SELECT * FROM newsinput limit 0,50;")
    dict_open=open("/python27/NewTotalTag.txt","r") #OPENING THE DICTIONARY FILE
    dict_read=dict_open.read()
    dict_word=dict_read.split()
    a4=dict_word #Assignment for code.
    list1=[]
    flist1=[]
    nlist=[]
    for row in cur.fetchall():
        #print row[2]
        var1=row[3]
        #print var1 #Printing lines
        #var2=len(var1) # Length of file
        var3=var1.split(".") #SPLITTING INTO LINES
        #print var3 #Printing The Lines
        #list1.append(var1)
        var4=len(var3) #Number of all lines
        #print "No",var4
        for line in var3:
            #print line
            #flist1.append(line)
            linew=line.split()
            for word in linew:
                if word in a4:
                    windex=a4.index(word)
                    windex1=windex+1
                    word1=a4[windex1]
                    word2=word+"/"+word1
                    nlist.append(word2)
                    #print list1
                    #print nlist
                elif word not in a4:
                    word3=word+"/"+"NA"
                    nlist.append(word3)
                    #print list1
                    #print nlist
                else:
                    print "None"

    #print "###",flist1
    #print len(flist1)
    #db.close()
    #print nlist
    lol = lambda lst, sz: [lst[i:i+sz] for i in range(0, len(lst), sz)] #TRYING TO SPLIT THE RESULTS AS SENTENCES
    nlist1=lol(nlist,7)
    #print nlist1
    for i in nlist1:
        string1=" ".join(i)
        print i
        #print string1

   
Thanks in Advance.








Problem in creating list of lists

2016-02-29 Thread subhabangalore
I have few sentences, like, 

the film was nice.  
leonardo is great. 
it was academy award.

Now I want them to be tagged with some standards which may look like,

the DT film NN was AV nice ADJ
leonardo NN is AV great ADJ
it PRP was AV academy NN award NN

I could do it but my goal is to see it as,
[[('the','DT'),('film','NN'),('was','AV'),('nice','ADJ')],
 [('leonardo','NN'),('is','AV'),('great','ADJ')],
 [('it','PRP'),('was','AV'),('academy','NN'),('award','NN')]]

that is, a list of lists, where each inner list holds the (word, tag)
tuples of one sentence. 
I can build the list of tuples for each sentence on its own, 
but I cannot get them all inside one outer list.
Since this is PoS tagging, I have not been able to use zip.

I would be grateful if anyone could suggest a solution.
Thanks in advance.
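
A minimal sketch, assuming the tagged sentences are available as strings of alternating word and tag tokens, as in the example above. The key step is to append each sentence's list of tuples to an outer list as one item. The variable names are illustrative only.

```python
tagged_lines = [
    "the DT film NN was AV nice ADJ",
    "leonardo NN is AV great ADJ",
    "it PRP was AV academy NN award NN",
]

result = []
for line in tagged_lines:
    tokens = line.split()
    # tokens alternate word, tag, word, tag, ...: pair even slots with odd slots
    pairs = list(zip(tokens[0::2], tokens[1::2]))
    result.append(pairs)   # append the whole sentence, not its items
```

Using result.extend(pairs) or result += pairs instead of append would flatten everything into a single list of tuples, which may be what was happening.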


Re: Error in Tree Structure

2016-02-29 Thread subhabangalore
On Saturday, February 27, 2016 at 9:43:56 PM UTC+5:30, Rustom Mody wrote:
> On Saturday, February 27, 2016 at 2:47:53 PM UTC+5:30, subhaba...@gmail.com 
> wrote:
> > I was trying to implement the code, 
> > 
> > import nltk
> > import nltk.tag, nltk.chunk, itertools
> > def ieertree2conlltags(tree, tag=nltk.tag.pos_tag):
> >     words, ents = zip(*tree.pos())
> >     iobs = []
> >     prev = None
> >     for ent in ents:
> >         if ent == tree.node:
> >             iobs.append('O')
> >             prev = None
> >         elif prev == ent:
> >             iobs.append('I-%s' % ent)
> >         else:
> >             iobs.append('B-%s' % ent)
> >             prev = ent
> >     words, tags = zip(*tag(words))
> >     return itertools.izip(words, tags, iobs)
> > 
> > def ieer_chunked_sents(tag=nltk.tag.pos_tag):
> >     for doc in ieer.parsed_docs():
> >         tagged = ieertree2conlltags(doc.text, tag)
> >         yield nltk.chunk.conlltags2tree(tagged)
> > 
> > 
> > from chunkers import ieer_chunked_sents, ClassifierChunker
> > from nltk.corpus import treebank_chunk
> > ieer_chunks = list(ieer_chunked_sents())
> > chunker = ClassifierChunker(ieer_chunks[:80])
> > print chunker.parse(treebank_chunk.tagged_sents()[0])
> > score = chunker.evaluate(ieer_chunks[80:])
> > print score.accuracy()
> > 
> > It is running fine. 
> > But as I am trying to rewrite the code as,
> > chunker = ClassifierChunker(list1),
> > where list1 is same value as,
> > ieer_chunks[:80]
> > only I am pasting the value as 
> > [Tree('S', [Tree('LOCATION', [(u'NAIROBI', 'NNP')]), (u',', ','), 
> > Tree('LOCATION', [(u'Kenya', 'NNP')]), (u'(', '('), Tree('ORGANIZATION', 
> > [(u'AP', 'NNP')]), (u')', ')'), (u'_', 'NNP'), Tree('CARDINAL', 
> > [(u'Thousands', 'NNP')]), (u'of', 'IN'), (u'laborers,', 'JJ'), 
> > (u'students', 'NNS'), (u'and', 'CC'), (u'opposition', 'NN'), 
> > (u'politicians', 'NNS'), (u'on', 'IN'), Tree('DATE', [(u'Saturday', 
> > 'NNP')]), (u'protested', 'VBD'), (u'tax', 'NN'), (u'hikes', 'NNS'), 
> > (u'imposed', 'VBN'), (u'by', 'IN'), (u'their', 'PRP$'), (u'cash-strapped', 
> > 'JJ'), (u'government,', 'NN'), (u'which', 'WDT'), (u'they', 'PRP'), 
> > (u'accused', 'VBD'), (u'of', 'IN'), (u'failing', 'VBG'), (u'to', 'TO'), 
> > (u'provide', 'VB'), (u'basic', 'JJ'), (u'services.', 
> > 'NN'),(u'(cm-kjd)', 'NN')])]
> > the value of whole list directly I am getting syntax error.
> 
> Dunno how literally you intend this but there is a "" near the end 
> of the list. Intended?

It is intended, as the actual list was large.
And I could most likely solve the problem by adding
from nltk.tree import Tree
which I had missed in my code.

Thank you for your kind time and discussion.

Regards,
RP


Error in Tree Structure

2016-02-27 Thread subhabangalore
I was trying to implement the code, 

import nltk
import nltk.tag, nltk.chunk, itertools
def ieertree2conlltags(tree, tag=nltk.tag.pos_tag):
    words, ents = zip(*tree.pos())
    iobs = []
    prev = None
    for ent in ents:
        if ent == tree.node:
            iobs.append('O')
            prev = None
        elif prev == ent:
            iobs.append('I-%s' % ent)
        else:
            iobs.append('B-%s' % ent)
            prev = ent
    words, tags = zip(*tag(words))
    return itertools.izip(words, tags, iobs)

def ieer_chunked_sents(tag=nltk.tag.pos_tag):
    for doc in ieer.parsed_docs():
        tagged = ieertree2conlltags(doc.text, tag)
        yield nltk.chunk.conlltags2tree(tagged)


from chunkers import ieer_chunked_sents, ClassifierChunker
from nltk.corpus import treebank_chunk
ieer_chunks = list(ieer_chunked_sents())
chunker = ClassifierChunker(ieer_chunks[:80])
print chunker.parse(treebank_chunk.tagged_sents()[0])
score = chunker.evaluate(ieer_chunks[80:])
print score.accuracy()

It is running fine. 
But as I am trying to rewrite the code as,
chunker = ClassifierChunker(list1),
where list1 is same value as,
ieer_chunks[:80]
only I am pasting the value as 
[Tree('S', [Tree('LOCATION', [(u'NAIROBI', 'NNP')]), (u',', ','), 
Tree('LOCATION', [(u'Kenya', 'NNP')]), (u'(', '('), Tree('ORGANIZATION', 
[(u'AP', 'NNP')]), (u')', ')'), (u'_', 'NNP'), Tree('CARDINAL', [(u'Thousands', 
'NNP')]), (u'of', 'IN'), (u'laborers,', 'JJ'), (u'students', 'NNS'), (u'and', 
'CC'), (u'opposition', 'NN'), (u'politicians', 'NNS'), (u'on', 'IN'), 
Tree('DATE', [(u'Saturday', 'NNP')]), (u'protested', 'VBD'), (u'tax', 'NN'), 
(u'hikes', 'NNS'), (u'imposed', 'VBN'), (u'by', 'IN'), (u'their', 'PRP$'), 
(u'cash-strapped', 'JJ'), (u'government,', 'NN'), (u'which', 'WDT'), (u'they', 
'PRP'), (u'accused', 'VBD'), (u'of', 'IN'), (u'failing', 'VBG'), (u'to', 'TO'), 
(u'provide', 'VB'), (u'basic', 'JJ'), (u'services.', 'NN'),(u'(cm-kjd)', 
'NN')])]
that is, when I paste the value of the whole list directly, I get a syntax error.
I also tried pasting it into the Python IDE outside the code; it gives a syntax 
error there too. 
If I do not paste the value and simply assign ieer_chunks[:80] to list1, there 
is no error.
I may be introducing a problem while copying and pasting the value, 
but I did not change anything there.

Is the error on the Python side or on the NLTK side? 

Thanks in advance.
I would be grateful if anyone could point out the mistake I am making and how 
I may solve it.


 



How may I change values in tuples of list of lists?

2016-02-23 Thread subhabangalore
Hi 

I am trying to use the following set of tuples in list of lists. 
I am using a Python based library named, NLTK. 

>>> import nltk
>>> from nltk.corpus import brown as bn
>>> bt=bn.tagged_sents()
>>> bt_5=bt[:5]
>>> print bt
[[(u'The', u'AT'), (u'Fulton', u'NP-TL'), (u'County', u'NN-TL'), (u'Grand', 
u'JJ-TL'), (u'Jury', u'NN-TL'), (u'said', u'VBD'), (u'Friday', u'NR'), (u'an', 
u'AT'), (u'investigation', u'NN'), (u'of', u'IN'), (u"Atlanta's", u'NP$'), 
(u'recent', u'JJ'), (u'primary', u'NN'), (u'election', u'NN'), (u'produced', 
u'VBD'), (u'``', u'``'), (u'no', u'AT'), (u'evidence', u'NN'), (u"''", u"''"), 
(u'that', u'CS'), (u'any', u'DTI'), (u'irregularities', u'NNS'), (u'took', 
u'VBD'), (u'place', u'NN'), (u'.', u'.')], [(u'The', u'AT'), (u'jury', u'NN'), 
(u'further', u'RBR'), (u'said', u'VBD'), (u'in', u'IN'), (u'term-end', u'NN'), 
(u'presentments', u'NNS'), (u'that', u'CS'), (u'the', u'AT'), (u'City', 
u'NN-TL'), (u'Executive', u'JJ-TL'), (u'Committee', u'NN-TL'), (u',', u','), 
(u'which', u'WDT'), (u'had', u'HVD'), (u'over-all', u'JJ'), (u'charge', u'NN'), 
(u'of', u'IN'), (u'the', u'AT'), (u'election', u'NN'), (u',', u','), (u'``', 
u'``'), (u'deserves', u'VBZ'), (u'the', u'AT'), (u'praise', u'NN'), 
(u'and', u'CC'), (u'thanks', u'NNS'), (u'of', u'IN'), (u'the', u'AT'), (u'City', 
u'NN-TL'), (u'of', u'IN-TL'), (u'Atlanta', u'NP-TL'), (u"''", u"''"), (u'for', 
u'IN'), (u'the', u'AT'), (u'manner', u'NN'), (u'in', u'IN'), (u'which', 
u'WDT'), (u'the', u'AT'), (u'election', u'NN'), (u'was', u'BEDZ'), 
(u'conducted', u'VBN'), (u'.', u'.')], ...]
>>> 

Now I want to change the tag values such as 'AT', 'NP-TL', 'NN-TL', etc. to 
some arbitrary ones like XX, YY, ZZ while preserving the overall structure of 
tuples in a list of lists. Please suggest how I may do it. 

I do not think it is an NLTK issue, rather a Python issue. 
I am trying to access and change the values with loops such as 
for i,j in enumerate(bt_5), but the code is getting rather long.

Could anyone kindly suggest a smart line of code? 

I am using Python2.7.11 on MS-Windows-10. My NLTK version is 3.1
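
One possible approach, as a minimal sketch: tuples cannot be changed in place, so the nested structure is rebuilt with new (word, tag) tuples. Here `sample` stands in for bt_5 and `tag_map` is a made-up mapping for illustration; neither comes from NLTK. The sketch is written so that it also runs on Python 3.

```python
# Tuples are immutable, so build new ones with the tag replaced.
# `sample` stands in for bt_5; `tag_map` is an illustrative, made-up mapping.
sample = [
    [(u'The', u'AT'), (u'Fulton', u'NP-TL'), (u'County', u'NN-TL')],
    [(u'The', u'AT'), (u'jury', u'NN')],
]
tag_map = {u'AT': u'XX', u'NP-TL': u'YY', u'NN-TL': u'ZZ'}

# Tags missing from tag_map are kept unchanged by dict.get(tag, tag).
retagged = [
    [(word, tag_map.get(tag, tag)) for (word, tag) in sent]
    for sent in sample
]
print(retagged)
```

The outer list, the inner lists and the two-element tuples keep their shape; only the second element of each tuple changes.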

Thanks in advance.

Regards,
RP.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Combing Search Engine with REST

2015-07-10 Thread subhabangalore
On Friday, July 10, 2015 at 5:36:48 PM UTC+5:30, Laura Creighton wrote:
> In a message of Fri, 10 Jul 2015 04:46:25 -0700, 
>  writes:
> >Dear Group,
> >
> >I am trying to make a search engine. I used Whoosh to do it. 
> >I want to add documents to it. This is going fine. 
> >Now, I want to add documents in the index with REST framework.
> >I could learn Flask well. 
> >My task is to use Flask to add documents (by using put/post) to index. 
> >I am slightly confused how may I do it.
> >
> >If any one of esteemed members of the group may suggest.
> >
> >Regards,
> >Subhabrata Banerjee. 
> 
> I suggest you look at
> https://pythonhosted.org/Flask-WhooshAlchemy/
> and see if it does what you want.
> 
> Laura

Hi,
Thanks. But its documentation is very sparse, whereas both Whoosh and Flask are 
well documented.
Regards,
Subhabrata. 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How to design a search engine in Python?

2015-02-22 Thread subhabangalore
On Sunday, February 22, 2015 at 2:42:48 PM UTC+5:30, Laura Creighton wrote:
> In a message of Sat, 21 Feb 2015 22:07:30 -0800,  write
> >Dear Sir,
> >
> >Thank you for your kind suggestion. Let me traverse one by one. 
> >My special feature is generally Semantic Search, but I am trying to build
> >a search engine first and then go for semantic I feel that would give me a 
> >solid background to work around the problem. 
> >
> >Regards,
> >Subhabrata. 
> 
> You may find the API docs surrounding rdelbru.github.io/SIREn/
> of interest then.
> 
> Laura Creighton

Dear Madam,

Thank you for your kind help. I will surely check it. 

Regards,
Subhabrata. 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How to design a search engine in Python?

2015-02-21 Thread subhabangalore
On Sunday, February 22, 2015 at 11:08:47 AM UTC+5:30, Denis McMahon wrote:
> On Sat, 21 Feb 2015 21:02:34 -0800, subhabangalore wrote:
> 
> > Thank you for your suggestion. But I was looking for a small tutorial of
> > algorithm of the whole engine. I would try to check it build individual
> > modules and integrate them. I was getting some in google and youtube,
> > but I tried to consult you as I do not know whether they would be fine.
> > I am trying your way, let me see how much I go. There are so many search
> > algorithms in our popular data structure books, that is not an issue but
> > how a search engine is getting done, I am thinking bit on that.
> 
> Presumably a search engine is simply a database of keyword -> result, 
> possibly with some scoring factor.
> 
> Calculating scoring factor is going to be fun.
> 
> Then of course result pages might have scoring factors too. What about a 
> search with multiple keywords. Some result pages might match more than 
> one keyword, so you might add their score for each keyword together to 
> get the ranking in that enquiry for that page.
> 
> But then pages with lots and lots of different keywords might be low 
> scoring, because searchers are looking for content, not pages of keywords.
> 
> Finally, What special, unique feature is your search engine going to have 
> that makes it better than all the existing ones?
> 
> -- 
> Denis McMahon,
Dear Sir,

Thank you for your kind suggestion. Let me go through the points one by one. 
My special feature is Semantic Search in general, but I am trying to build
a search engine first and then go for the semantic part. I feel that would give 
me a solid background to work on the problem. 

Regards,
Subhabrata. 
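
The "database of keyword -> result, possibly with some scoring factor" described above can be sketched roughly as follows. This is only a toy under stated assumptions: the three documents are invented, the scoring is a plain TF-IDF sum over the query keywords, and it is not how Whoosh or any real engine is implemented.

```python
import math
from collections import Counter, defaultdict

# Invented toy corpus: doc id -> text.
docs = {
    "d1": "python search engine tutorial",
    "d2": "search algorithms in data structure books",
    "d3": "python list index tutorial",
}

# Inverted index: keyword -> {doc_id: number of occurrences in that doc}.
index = defaultdict(dict)
for doc_id, text in docs.items():
    for term, count in Counter(text.lower().split()).items():
        index[term][doc_id] = count

def search(query):
    """Return (doc_id, score) pairs, best first, adding TF-IDF per keyword."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # keyword not present in any document
        idf = math.log(len(docs) / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties are broken by doc id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

print(search("python search"))
```

A page matching several keywords collects a score for each of them, which is the additive ranking idea from the quoted reply.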


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How to design a search engine in Python?

2015-02-21 Thread subhabangalore
On Sunday, February 22, 2015 at 10:12:39 AM UTC+5:30, Steven D'Aprano wrote:
> wrote:
> 
> > Dear Group,
> > 
> > I am trying to build a search engine in Python.
> 
> How to design a search engine in Python?
> 
> First, design a search engine.
> 
> Then, write Python code to implement that search engine.
> 
> 
> > To do this, I have read tutorials and working methodologies from web and
> > books like Stanford IR book [ http://www-nlp.stanford.edu/IR-book/]. I
> > know how to design a crawler, I know PostgresSql, I am fluent with
> > PageRank, TF-IDF, Zipf's law, etc. I came to know of
> > Whoosh[https://pypi.python.org/pypi/Whoosh/]
> 
> How does your search engine work? What does it do?
> 
> You MUST be able to describe the workings of your search engine in English,
> or the natural language of your choice. Write out the steps that it must
> take, the tasks that it must perform. This is your algorithm. Without an
> algorithm, how do you expect to write code? What will the code do?
> 
> Once you have designed your search engine algorithm, then *and only then*
> should you start to write code to implement that algorithm.
> 
> 
> 
> 
> -- 
> Steven

Dear Sir,

Thank you for your suggestion. But I was looking for a small tutorial on the 
algorithm of the whole engine. I will try to build the individual modules and 
integrate them. I found some material on Google and YouTube, but I wanted to 
consult you as I do not know whether it would be reliable. I am trying your way; 
let me see how far I get. There are many search algorithms in the popular 
data structure books, so that is not the issue; what I am thinking about is how 
a search engine as a whole gets built. 

Regards,
Subhabrata.
-- 
https://mail.python.org/mailman/listinfo/python-list


How to design a search engine in Python?

2015-02-21 Thread subhabangalore
Dear Group, 

I am trying to build a search engine in Python. 

To do this, I have read tutorials and working methodologies from the web and 
books like the Stanford IR book [http://www-nlp.stanford.edu/IR-book/]. I know 
how to design a crawler, I know PostgreSQL, and I am fluent with PageRank, 
TF-IDF, Zipf's law, etc. 
I came to know of Whoosh [https://pypi.python.org/pypi/Whoosh/].

But I am looking for a complete tutorial on how to implement it. If anybody 
could kindly direct me to one. 

I heard there are good source codes and prototypes, but I have not found them. 

Apologies if this is not a question for this room. I posted here as this is a 
room of Python bigwigs. 

Regards,
Subhabrata. 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Writing Python File at Specific Interval

2014-07-20 Thread subhabangalore
On Thursday, July 10, 2014 5:21:01 AM UTC+5:30, Denis McMahon wrote:
> On Wed, 09 Jul 2014 07:36:49 -0700, subhabangalore wrote:
> 
> > The code (a basic crawler) would run every morning or evening, on a
> > predefined time. [This part is fine].
> > 
> > In the next part, I am trying to store the daily results to a new file.
> 
> So what you want to do is store each day's results in a new file, so 
> probably you want to create a filename that looks something like an iso 
> 8601 date.
> 
> Luckily for you python has this functionality available:
> 
> https://docs.python.org/2/library/datetime.html#date-objects
> 
> $ python
> Python 2.7.3 (default, Feb 27 2014, 19:58:35) 
> [GCC 4.6.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from datetime import date
> >>> fn = date.today().isoformat() + ".log"
> >>> print fn
> 2014-07-10.log
> >>> quit()
> $
> 
> Once you have a string containing your filename, you might use:
> 
> fp = open( fn, "w" )
> fp.write( data )
> fp.close()
> 
> -- 
> Denis McMahon, denismfmcma...@gmail.com

Dear Group,
Thank you for your kind suggestion. It worked. 
Regards,
Subhabrata Banerjee.
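
For later readers, the suggestion above can be wrapped into a small helper. This is a sketch only: the names daily_filename and save_daily, the "crawl" prefix and the temporary output directory are made up for illustration, and it is written for Python 3.

```python
import os
import tempfile
from datetime import date

def daily_filename(day, prefix="crawl", ext=".log"):
    """Build a name such as crawl-2014-07-10.log from an ISO 8601 date."""
    return "{}-{}{}".format(prefix, day.isoformat(), ext)

def save_daily(data, directory, day=None):
    """Write data to a file named after the given day (default: today)."""
    day = day or date.today()
    path = os.path.join(directory, daily_filename(day))
    # The with-block closes the file even if write() raises.
    with open(path, "w") as fp:
        fp.write(data)
    return path

out_dir = tempfile.mkdtemp()  # stand-in for the crawler's real output folder
path = save_daily("results of today's crawl\n", out_dir, date(2014, 7, 10))
print(os.path.basename(path))
```

Since the name is derived from the date, each daily run gets its own file without renaming anything by hand.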
-- 
https://mail.python.org/mailman/listinfo/python-list


Writing Python File at Specific Interval

2014-07-09 Thread subhabangalore
Dear Group,

I am trying to write results to a file whose name is newly created 
each time the code runs. 

The code (a basic crawler) would run every morning 
or evening, at a predefined time. [This part is fine]. 

In the next part, I am trying to store the daily 
results in a new file. 

As I researched, I found some tips around the time module,
logging module, pythoncom, etc., but did not get any useful
lead.

If any one of the esteemed members may kindly suggest.

Regards,
Subhabrata Banerjee. 
-- 
https://mail.python.org/mailman/listinfo/python-list


Writing files at run time

2014-06-30 Thread subhabangalore
Dear Group,

In my previous post 
[https://groups.google.com/forum/#!topic/comp.lang.python/ZYjsskV5MgE] I 
was trying to discuss an issue with file writing. 

I am trying to crawl a link through urllib and store its results in 
different files. As discussed, I could work out a solution for this and, with 
your kind help, I am trying to learn some new coding styles. 

Now I have an associated issue. 

The crawler I am trying to construct would run daily, maybe at a predefined 
time. 
[I am trying to set the parameter with the "time" module]. 

At present, the file(s) in which the data are stored are named and created 
once.

The data change daily if I crawl daily newspapers. 

I generally change the names of the files by hand, sitting for a few minutes 
before a run. But this may not be the way. 

I am thinking of a smarter solution. 

If any one of the esteemed members may kindly give a hint as to how the names 
of the storage files may be changed automatically as the crawler runs every 
day, so that data may be written there and retrieved. 

Thanking you in advance,
Regards,
Subhabrata Banerjee. 
 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Writing Multiple files at a times

2014-06-30 Thread subhabangalore
On Sunday, June 29, 2014 4:19:27 PM UTC+5:30, subhaba...@gmail.com wrote:
> Dear Group,
> 
> 
> 
> I am trying to crawl multiple URLs. As they are coming I want to write them 
> as string, as they are coming, preferably in a queue. 
> 
> 
> 
> If any one of the esteemed members of the group may kindly help.
> 
> 
> 
> Regards,
> 
> Subhabrata Banerjee.

Dear Group,

Thank you for your kind suggestion. But I am not able to work out this line:
"fp = open( "scraped/body{:0>5d}.htm".format( n ), "w" ) "
please suggest.
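
A short sketch of what the format specification in that line does, with the sample numbers chosen only for illustration: in `{:0>5d}`, `0` is the fill character, `>` right-aligns the value, `5` is the minimum width and `d` formats an integer in decimal, so the number is padded with leading zeros to five digits.

```python
# "{:0>5d}": fill with "0", right-align (">"), width 5, decimal integer ("d").
template = "scraped/body{:0>5d}.htm"

names = [template.format(n) for n in (1, 42, 12345)]
for name in names:
    print(name)
```

The zero padding keeps the file names in numeric order when they are sorted alphabetically. Note that open() with mode "w" creates the file but not the "scraped" directory, which presumably has to exist beforehand.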

Regards,
Subhabrata Banerjee. 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Writing Multiple files at a times

2014-06-29 Thread subhabangalore
On Sunday, June 29, 2014 7:31:37 PM UTC+5:30, Roy Smith wrote:
> In article ,
>  Dave Angel  wrote:
> 
> > subhabangal...@gmail.com Wrote in message:
> > > Dear Group,
> > > 
> > > I am trying to crawl multiple URLs. As they are coming I want to write 
> > > them as string, as they are coming, preferably in a queue. 
> > > 
> > > If any one of the esteemed members of the group may kindly help.
> > 
> > From your subject line,  it appears you want to keep multiple files open, 
> > and write to each in an arbitrary order.  That's no problem,  up to the 
> > operating system limits.  Define a class that holds the URL information 
> > and for each instance,  add an attribute for an output file handle. 
> > 
> > Don't forget to close each file when you're done with the corresponding URL.
> 
> One other thing to mention is that if you're doing anything with 
> fetching URLs from Python, you almost certainly want to be using Kenneth 
> Reitz's excellent requests module (http://docs.python-requests.org/).  
> The built-in urllib support in Python works, but requests is so much 
> simpler to use.

Dear Group,

Sorry if I miscommunicated. 

I am opening multiple URLs with urllib.urlopen. Each URL has a huge HTML 
source. As these sources are read, I am trying to concatenate them and put them 
in one txt file as a string. 
From this big txt file I am trying to take out the HTML body of each URL 
and write and store them with attempts like,

for i, line in enumerate(file1):
    f = open("/python27/newfile_%i.txt" %i,'w')
    f.write(line)
    f.close()

Generally not much of an issue, but was thinking of some better options.
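
One possible alternative, as a sketch only: skip the big concatenated file and write each page to its own numbered file as soon as it has been read. The `pages` dict below is a made-up stand-in for the fetched HTML, and a temporary directory replaces /python27; the example is written for Python 3 and does no network access.

```python
import os
import tempfile

# Made-up stand-in for pages already fetched; a real crawler would fill this
# from urllib (or requests) responses.
pages = {
    "http://example.com/a": "<html><body>first page</body></html>",
    "http://example.com/b": "<html><body>second page</body></html>",
}

out_dir = tempfile.mkdtemp()  # illustrative output folder
written = []
for i, (url, html) in enumerate(sorted(pages.items())):
    path = os.path.join(out_dir, "newfile_%d.txt" % i)
    # One file per URL; the with-block closes it automatically.
    with open(path, "w") as f:
        f.write(html)
    written.append(path)

print(len(written))
```

This way one page never has to be cut back out of a larger file, and splitting by line (which breaks any HTML source that spans several lines) is avoided.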

Regards,
Subhabrata Banerjee. 
-- 
https://mail.python.org/mailman/listinfo/python-list


Writing Multiple files at a times

2014-06-29 Thread subhabangalore
Dear Group,

I am trying to crawl multiple URLs. As they are coming I want to write them as 
string, as they are coming, preferably in a queue. 

If any one of the esteemed members of the group may kindly help.

Regards,
Subhabrata Banerjee. 
-- 
https://mail.python.org/mailman/listinfo/python-list


Understanding Python Code[Forward_Backward_Wikipedia]

2014-06-21 Thread subhabangalore
On Friday, June 20, 2014 12:37:01 AM UTC+5:30, Ian wrote:
> On Thu, Jun 19, 2014 at 12:44 PM,   wrote:
> > Dear Group,
> > Generally most of the issues are tackled here, but as I am trying to cross 
> > check my understanding I found another question,
> >
> > f_curr[st] = e[st][x_i] * prev_f_sum
> >
> > Here, if I give one print command and see the results,
> > print "$$2",f_curr
> >
> > It is showing an iterative update like,
> > $$2 {'Healthy': 0.3},
> > $$2 {'Healthy': 0.3, 'Fever': 0.04001}
> >
> > I was trying to ask how the size is being updated, from 1 to 2 back to 1 
> > again 2... is it for any loop then which one, I tried to change but not 
> > being able
> > to if any one of the esteemed members may kindly help me.
> 
> That statement is inside the for loop that builds the f_curr dict. One
> state gets calculated on each iteration. The first time it prints, one
> state has been added. The second time it prints, two states have been
> added. You only have two states, so at that point the loop is done.
> The next time it prints, it's on the next iteration of the outer (i,
> x_i) loop and it's building a new f_curr dict. So then you see it
> adding one state and then the second state to the new dict. And so on
> and so forth until the outer loop completes.

Dear Group,

Thank you for the kind help. I could solve the other portions of the code easily.
As this algorithm is an important one, and so is its Wikipedia note, I 
changed the subject line a bit so that future users trying to solve similar 
questions may find the help.
Thank you again, especially to Ian. 

Regards,
Subhabrata Banerjee.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Understanding Python Code

2014-06-19 Thread subhabangalore
On Thursday, June 19, 2014 7:57:38 PM UTC+5:30, wrote:
> On Thursday, June 19, 2014 7:39:42 PM UTC+5:30, Ian wrote:
> > On Thu, Jun 19, 2014 at 3:48 AM, wrote:
> > > I am trying to see this line,
> > > prev_f_sum = sum(f_prev[k]*a[k][st] for k in states)
> > >
> > > a[k][st], and f_prev[k] I could take out and understood.
> > > Now as it is doing sum() so it must be over a list,
> > > I am trying to understand the number of entities in the list, thinking 
> > > whether to put len(), and see for which entities it is doing the sum.
> > 
> > It's summing a generator expression, not a list.  If it helps to
> > understand it, you could rewrite that line like this:
> > 
> > values_to_be_summed = []
> > for k in states:
> >     values_to_be_summed.append(f_prev[k]*a[k][st])
> > prev_f_sum = sum(values_to_be_summed)
> > 
> > So the number of entities in the list is len(states).
> 
> Dear Group,
> 
> Thank you for your kind answer. As I put from the error I discovered it. 
> Please see my experiment almost near to your answer. I am trying one or two 
> questions like, why it is appending only two values at a time. If you want to 
> assist you may kindly help me assist me.
> 
> Regards,
> Subhabrata Banerjee.
> 
> ***
> MY EXPERIMENT
> ***
> else:
>   for k in states:
>   print "YYY1",f_prev[k]
>   print "YYY2",a[k][st]
>   prev_f_sum1=f_prev[k]*a[k][st]
>   print "YYY3",prev_f_sum1
>   prev_f_sum2 = sum(f_prev[k]*a[k][st] for k in states)
>   print "YYY4",prev_f_sum2
> ***
Dear Group,
Most of the issues are settled now, but while cross-checking my 
understanding I found another question,

f_curr[st] = e[st][x_i] * prev_f_sum

Here, if I give one print command and see the results, 
print "$$2",f_curr

It is showing an iterative update like,
$$2 {'Healthy': 0.3},
$$2 {'Healthy': 0.3, 'Fever': 0.04001}

I wanted to ask how the size is being updated, from 1 to 2, back to 1 and again 
to 2... Is it due to some loop, and if so which one? I tried to change it but 
was not able to. If any one of the esteemed members may kindly help me.

Regards,
Subhabrata Banerjee.
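
The 1, 2, 1, 2 pattern can be reproduced with a stripped-down sketch of the two nested loops. The stored value 0.0 is only a placeholder, not a real probability; everything else mirrors the loop structure of the forward part.

```python
states = ('Healthy', 'Fever')
observations = ('normal', 'cold', 'dizzy')

sizes = []
for x_i in observations:      # outer loop: one pass per observation
    f_curr = {}               # a brand-new, empty dict on every pass
    for st in states:         # inner loop: one key is added per state
        f_curr[st] = 0.0      # placeholder; the real code stores a probability
        sizes.append(len(f_curr))

print(sizes)
```

The size drops back to 1 because `f_curr = {}` runs again at the start of each outer iteration, and it grows to 2 because the inner loop then adds one entry per state.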
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Understanding Python Code

2014-06-19 Thread subhabangalore
On Thursday, June 19, 2014 7:39:42 PM UTC+5:30, Ian wrote:
> On Thu, Jun 19, 2014 at 3:48 AM, wrote:
> > I am trying to see this line,
> > prev_f_sum = sum(f_prev[k]*a[k][st] for k in states)
> >
> > a[k][st], and f_prev[k] I could take out and understood.
> > Now as it is doing sum() so it must be over a list,
> > I am trying to understand the number of entities in the list, thinking 
> > whether to put len(), and see for which entities it is doing the sum.
> 
> It's summing a generator expression, not a list.  If it helps to
> understand it, you could rewrite that line like this:
> 
> values_to_be_summed = []
> for k in states:
>     values_to_be_summed.append(f_prev[k]*a[k][st])
> prev_f_sum = sum(values_to_be_summed)
> 
> So the number of entities in the list is len(states).

Dear Group,

Thank you for your kind answer. I had discovered the same from an error I got. 
Please see my experiment below, which is close to your answer. I still have one 
or two questions, such as why it is appending only two values at a time. If you 
would like to assist, kindly help me.
Regards,
Subhabrata Banerjee.
***
MY EXPERIMENT
***
else:
    for k in states:
        print "YYY1",f_prev[k]
        print "YYY2",a[k][st]
        prev_f_sum1=f_prev[k]*a[k][st]
        print "YYY3",prev_f_sum1
    prev_f_sum2 = sum(f_prev[k]*a[k][st] for k in states)
    print "YYY4",prev_f_sum2
***
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Understanding Python Code

2014-06-19 Thread subhabangalore
On Thursday, June 19, 2014 12:30:12 PM UTC+5:30, Ian wrote:
> On Wed, Jun 18, 2014 at 11:50 PM,   wrote:
> > Thank you for the reply. But as I checked it again I found,
> > f_prev[k] is giving values of f_curr[st] = e[st][x_i] * prev_f_sum
> > which is calculated later and again uses prev_f_sum.
> 
> f_prev is the f_curr that was calculated on the previous iteration of
> the loop.  At each iteration after the first, the script calculates
> f_curr based on the value of f_prev, that is, the old value of f_curr.
> Then it reassigns the newly computed f_curr to f_prev, making it now
> the previous, and on the next iteration it creates a new dict to store
> the next f_curr.  Does that make sense?

Dear Group,

The logic seems to be going fine. I am just trying to cross-check things once 
more, so I am generating the values to see them for myself. 

I am trying to see this line,
prev_f_sum = sum(f_prev[k]*a[k][st] for k in states)

a[k][st], and f_prev[k] I could take out and understood.
Now as it is doing sum() so it must be over a list,
I am trying to understand the number of entities in the list, thinking whether 
to put len(), and see for which entities it is doing the sum.
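
A small sketch for inspecting what sum() receives here: sum() accepts any iterable, and a generator expression has no len(), so the same expression can be materialised as a list to count and print its terms. The f_prev values are the rounded ones from the first forward step of the example, st is fixed to 'Healthy' for illustration, and print() is used so that it also runs on Python 3.

```python
states = ('Healthy', 'Fever')
a = {
    'Healthy': {'Healthy': 0.69, 'Fever': 0.3, 'E': 0.01},
    'Fever': {'Healthy': 0.4, 'Fever': 0.59, 'E': 0.01},
}
f_prev = {'Healthy': 0.3, 'Fever': 0.04}  # rounded values after 'normal'
st = 'Healthy'                            # state fixed for this illustration

# List version of the same expression: it can be counted and printed.
terms = [f_prev[k] * a[k][st] for k in states]
print(len(terms))
print(terms)

# Generator version, as in the original line; it yields the same terms.
prev_f_sum = sum(f_prev[k] * a[k][st] for k in states)
print(prev_f_sum)
```

There is one term per state, so the count is len(states), and the total is approximately 0.223, matching the "$$4" line in the printed trace.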

I am experimenting; if anyone has any idea, kindly send it.

Regards,
Subhabrata Banerjee. 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Understanding Python Code

2014-06-18 Thread subhabangalore
On Thursday, June 19, 2014 12:45:49 AM UTC+5:30, Ian wrote:
> > The questions are,
> > i) prev_f_sum = sum(f_prev[k]*a[k][st] for k in states)
> > here f_prev is called,
> > f_prev is assigned to  f_curr ["f_prev = f_curr"]
> > f_curr[st]  is again being calculated as, ["f_curr[st] = e[st][x_i] * 
> > prev_f_sum"] which again calls "prev_f_sum"
> >
> > I am slightly confused which one would be first calculated and how to 
> > proceed next?
> 
> These things that you describe as "calls" are not calls.  f_prev and
> f_curr are data structures (in this case dicts), not functions.
> Accessing "f_prev[k]" does not call f_prev or in any way cause
> f_prev[k] to be computed; it just looks up what value is recorded in
> the f_prev dict for the key k.
> 
> Python is an imperative language, not declarative.  If you want to
> know what order these things are calculated in, just follow the
> program flow.

Thank you for the reply. But as I checked it again I found,
f_prev[k] is giving values of f_curr[st] = e[st][x_i] * prev_f_sum
which is calculated later and again uses prev_f_sum.
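
To follow the program flow step by step, the forward part alone can be run in isolation. This is a sketch cut down from the posted code with the print calls removed (the function name `forward` is introduced here for illustration), written so that it also runs on Python 3.

```python
states = ('Healthy', 'Fever')
observations = ('normal', 'cold', 'dizzy')
start_probability = {'Healthy': 0.6, 'Fever': 0.4}
transition_probability = {
    'Healthy': {'Healthy': 0.69, 'Fever': 0.3, 'E': 0.01},
    'Fever': {'Healthy': 0.4, 'Fever': 0.59, 'E': 0.01},
}
emission_probability = {
    'Healthy': {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
    'Fever': {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6},
}

def forward(x, states, a_0, a, e):
    """Forward pass only: returns one dict of state values per observation."""
    fwd = []
    f_prev = {}                      # empty before the first observation
    for i, x_i in enumerate(x):
        f_curr = {}                  # fresh dict for this observation
        for st in states:
            if i == 0:
                prev_f_sum = a_0[st]     # base case: start probabilities
            else:
                # f_prev holds the finished f_curr of the previous observation
                prev_f_sum = sum(f_prev[k] * a[k][st] for k in states)
            f_curr[st] = e[st][x_i] * prev_f_sum
        fwd.append(f_curr)
        f_prev = f_curr              # rebinding: this dict becomes "previous"
    return fwd

for step in forward(observations, states, start_probability,
                    transition_probability, emission_probability):
    print(step)
```

In the first pass f_prev is never read because the i == 0 branch is taken; from the second pass onward it is only looked up, and it already holds the completed values of the previous pass.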

Regards,
Subhabrata Banerjee. 
-- 
https://mail.python.org/mailman/listinfo/python-list


Understanding Python Code

2014-06-18 Thread subhabangalore
Dear Group,

I have a Python code taken from Wikipedia 
(http://en.wikipedia.org/wiki/Forward%E2%80%93backward_algorithm).

The code is pasted below. 

>>> states = ('Healthy', 'Fever')
>>> end_state = 'E'
>>> observations = ('normal', 'cold', 'dizzy')
>>> start_probability = {'Healthy': 0.6, 'Fever': 0.4}
>>> transition_probability = {
   'Healthy' : {'Healthy': 0.69, 'Fever': 0.3, 'E': 0.01},
   'Fever' : {'Healthy': 0.4, 'Fever': 0.59, 'E': 0.01},
   }
>>> emission_probability = {
   'Healthy' : {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
   'Fever' : {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6},
   }
>>> def fwd_bkw(x, states, a_0, a, e, end_st):
    L = len(x)
    print "$$1",L

    fwd = []
    f_prev = {}
    # forward part of the algorithm
    for i, x_i in enumerate(x):
        print "$$2",i,x_i
        f_curr = {}
        for st in states:
            if i == 0:
                # base case for the forward part
                prev_f_sum = a_0[st]
                print "$$3",prev_f_sum
            else:
                prev_f_sum = sum(f_prev[k]*a[k][st] for k in states) ##?
                print "$$4",prev_f_sum

            f_curr[st] = e[st][x_i] * prev_f_sum
            print "$$5",f_curr[st]

        fwd.append(f_curr)
        f_prev = f_curr
        print "$$6",f_prev

    p_fwd = sum(f_curr[k]*a[k][end_st] for k in states)
    print "FORWARD IS:",p_fwd

    bkw = []
    b_prev = {}
    # backward part of the algorithm
    for i, x_i_plus in enumerate(reversed(x[1:]+(None,))):
        print "##1:",i,x_i_plus
        b_curr = {}
        for st in states:
            if i == 0:
                # base case for backward part
                b_curr[st] = a[st][end_st]
                print "##2:",b_curr[st]
            else:
                b_curr[st] = sum(a[st][l]*e[l][x_i_plus]*b_prev[l] for l in states) ##?
                print "##3:",b_curr
        bkw.insert(0,b_curr)
        b_prev = b_curr
        print "##4:",b_prev

    p_bkw = sum(a_0[l] * e[l][x[0]] * b_curr[l] for l in states)
    print "BACKWARD IS:",p_bkw

    # merging the two parts
    posterior = []
    for i in range(L):
        posterior.append({st: fwd[i][st]*bkw[i][st]/p_fwd for st in states})

    assert p_fwd == p_bkw
    return fwd, bkw, posterior

>>> def example():
    return fwd_bkw(observations,
                   states,
                   start_probability,
                   transition_probability,
                   emission_probability,
                   end_state)

>>> for line in example():
    print(' '.join(map(str, line)))


$$1 3
$$2 0 normal
$$3 0.6
$$5 0.3
$$3 0.4
$$5 0.04
$$6 {'Healthy': 0.3, 'Fever': 0.04001}
$$2 1 cold
$$4 0.223
$$5 0.0892
$$4 0.1136
$$5 0.03408
$$6 {'Healthy': 0.0892, 'Fever': 0.03408}
$$2 2 dizzy
$$4 0.07518
$$5 0.007518
$$4 0.0468672
$$5 0.02812032
$$6 {'Healthy': 0.007518, 'Fever': 0.0281203197}
FORWARD IS: 0.0003563832
##1: 0 None
##2: 0.01
##2: 0.01
##4: {'Healthy': 0.01, 'Fever': 0.01}
##1: 1 dizzy
##3: {'Healthy': 0.00249}
##3: {'Healthy': 0.00249, 'Fever': 0.00394}
##4: {'Healthy': 0.00249, 'Fever': 0.00394}
##1: 2 cold
##3: {'Healthy': 0.00104183998}
##3: {'Healthy': 0.00104183998, 'Fever': 0.00109578}
##4: {'Healthy': 0.00104183998, 'Fever': 0.00109578}
BACKWARD IS: 0.0003563832
{'Healthy': 0.3, 'Fever': 0.04001} {'Healthy': 0.0892, 'Fever': 
0.03408} {'Healthy': 0.007518, 'Fever': 0.0281203197}
{'Healthy': 0.00104183998, 'Fever': 0.00109578} {'Healthy': 0.00249, 
'Fever': 0.00394} {'Healthy': 0.01, 'Fever': 0.01}
{'Healthy': 0.8770110375573259, 'Fever': 0.1229889624426741} {'Healthy': 
0.623228030950954, 'Fever': 0.3767719690490461} {'Healthy': 0.2109527048413057, 
'Fever': 0.7890472951586943}
>>> 

I was trying to understand it. [It is a Machine Learning topic, which has no 
connection with the question here; 
I am just trying to put my Python confusion.]

I understood the code in general and, to understand it better,
I put a print after each place where I have questions.

But two questions still remain. I have marked the places in question with 
"##?".

The questions are,
i) prev_f_sum = sum(f_prev[k]*a[k][st] for k in states)
here f_prev is called, 
f_prev is assigned to  f_curr ["f_prev = f_curr"]
f_curr[st]  is again being calculated as, ["f_curr[st] = e[st][x_i] * 
prev_f_sum"] which again calls "prev_f_sum"

I am slightly confused about which one is calculated first and how the 
computation proceeds next.

ii) A similar thing happens again,

b_curr[st] = sum(a[st][l]*e[l][x_i_plus]*b_prev[l] for l in states)
here, b_prev is used, which is defined in
b_prev = b_curr

If any one of the esteemed members may kindly guide me to understand
this code.
Apolog

TextBlob on Windows

2014-05-23 Thread subhabangalore
Dear Group,

It seems there is a nice language processing library named TextBlob, like NLTK. 
But I am unable to install it on my Windows (MS-Windows 7) machine. I am 
using Python 2.7.

If any one of the esteemed members may kindly suggest a solution.

I tried the note at the following URL
http://stackoverflow.com/questions/20562768/trouble-installing-textblob-for-python

but it did not help much.

Thanking in Advance,
Regards,
Subhabrata Banerjee. 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Question on Debugging a code line

2014-05-11 Thread subhabangalore
On Sunday, May 11, 2014 11:50:32 AM UTC+5:30, subhaba...@gmail.com wrote:
> On Sunday, May 11, 2014 12:57:34 AM UTC+5:30, subhaba...@gmail.com wrote:
> > Dear Room,
> > 
> > I was trying to go through a code given in 
> > http://en.wikipedia.org/wiki/Forward%E2%80%93backward_algorithm[ Forward 
> > Backward is an algorithm of Machine Learning-I am not talking on that
> > I am just trying to figure out a query on its Python coding.]
> > 
> > I came across the following codes.
> > 
> > >>> states = ('Healthy', 'Fever')
> > >>> end_state = 'E'
> > >>> observations = ('normal', 'cold', 'dizzy')
> > >>> start_probability = {'Healthy': 0.6, 'Fever': 0.4}
> > >>> transition_probability = {
> >    'Healthy' : {'Healthy': 0.69, 'Fever': 0.3, 'E': 0.01},
> >    'Fever' : {'Healthy': 0.4, 'Fever': 0.59, 'E': 0.01},
> >    }
> > >>> emission_probability = {
> >    'Healthy' : {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
> >    'Fever' : {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6},
> >    }
> > 
> > def fwd_bkw(x, states, a_0, a, e, end_st):
> >     L = len(x)
> >     fwd = []
> >     f_prev = {} #THE PROBLEM 
> >     # forward part of the algorithm
> >     for i, x_i in enumerate(x):
> >         f_curr = {}
> >         for st in states:
> >             if i == 0:
> >                 # base case for the forward part
> >                 prev_f_sum = a_0[st]
> >             else:
> >                 prev_f_sum = sum(f_prev[k]*a[k][st] for k in states) ##
> > 
> >             f_curr[st] = e[st][x_i] * prev_f_sum
> > 
> >         fwd.append(f_curr)
> >         f_prev = f_curr
> > 
> >     p_fwd = sum(f_curr[k]*a[k][end_st] for k in states)
> > 
> > As this value was being called in prev_f_sum = sum(f_prev[k]*a[k][st] for k 
> > in states marked ## 
> > I wanted to know what values it is generating.
> > So, I had made the following experiment, after 
> > for i, x_i in enumerate(x): 
> > I had put print f_prev 
> > but I am not getting how f_prev is getting the values.
> > 
> > Here, 
> > x=observations,
> > states= states,
> > a_0=start_probability,
> > a= transition_probability,
> > e=emission_probability,
> > end_st= end_state
> > 
> > Am I missing any minor aspect?
> > Code is running fine. 
> > 
> > If any one of the esteemed members may kindly guide me.
> > 
> > Regards,
> > Subhabrata Banerjee.
> 
> Dear Sir,
> Thank you for your kind reply. I will check. 
> Regards,
> Subhabrata Banerjee.

Dear Sir,
Thank you. It worked. Following your reply, I wrote another similar statement 
over another set of values, and it went well.
Regards,
Subhabrata Banerjee.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Question on Debugging a code line

2014-05-10 Thread subhabangalore
On Sunday, May 11, 2014 12:57:34 AM UTC+5:30, subhaba...@gmail.com wrote:
> Dear Room,
> 
> I was trying to go through a code given in 
> http://en.wikipedia.org/wiki/Forward%E2%80%93backward_algorithm[ Forward 
> Backward is an algorithm of Machine Learning-I am not talking on that
> I am just trying to figure out a query on its Python coding.]
> 
> I came across the following codes.
> 
> >>> states = ('Healthy', 'Fever')
> >>> end_state = 'E'
> >>> observations = ('normal', 'cold', 'dizzy')
> >>> start_probability = {'Healthy': 0.6, 'Fever': 0.4}
> >>> transition_probability = {
>    'Healthy' : {'Healthy': 0.69, 'Fever': 0.3, 'E': 0.01},
>    'Fever' : {'Healthy': 0.4, 'Fever': 0.59, 'E': 0.01},
>    }
> >>> emission_probability = {
>    'Healthy' : {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
>    'Fever' : {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6},
>    }
> 
> def fwd_bkw(x, states, a_0, a, e, end_st):
>     L = len(x)
>     fwd = []
>     f_prev = {} #THE PROBLEM 
>     # forward part of the algorithm
>     for i, x_i in enumerate(x):
>         f_curr = {}
>         for st in states:
>             if i == 0:
>                 # base case for the forward part
>                 prev_f_sum = a_0[st]
>             else:
>                 prev_f_sum = sum(f_prev[k]*a[k][st] for k in states) ##
> 
>             f_curr[st] = e[st][x_i] * prev_f_sum
> 
>         fwd.append(f_curr)
>         f_prev = f_curr
> 
>     p_fwd = sum(f_curr[k]*a[k][end_st] for k in states)
> 
> As this value was being called in prev_f_sum = sum(f_prev[k]*a[k][st] for k 
> in states marked ## 
> I wanted to know what values it is generating.
> So, I had made the following experiment, after 
> for i, x_i in enumerate(x): 
> I had put print f_prev 
> but I am not getting how f_prev is getting the values.
> 
> Here, 
> x=observations,
> states= states,
> a_0=start_probability,
> a= transition_probability,
> e=emission_probability,
> end_st= end_state
> 
> Am I missing any minor aspect?
> Code is running fine. 
> 
> If any one of the esteemed members may kindly guide me.
> 
> Regards,
> Subhabrata Banerjee.

Dear Sir,
Thank you for your kind reply. I will check. 
Regards,
Subhabrata Banerjee. 
-- 
https://mail.python.org/mailman/listinfo/python-list


Question on Debugging a code line

2014-05-10 Thread subhabangalore
Dear Room,

I was trying to go through a code given in 
http://en.wikipedia.org/wiki/Forward%E2%80%93backward_algorithm [Forward 
Backward is an algorithm of Machine Learning - I am not talking on that.
I am just trying to figure out a query on its Python coding.]

I came across the following codes.

>>> states = ('Healthy', 'Fever')
>>> end_state = 'E'
>>> observations = ('normal', 'cold', 'dizzy')
>>> start_probability = {'Healthy': 0.6, 'Fever': 0.4}
>>> transition_probability = {
   'Healthy' : {'Healthy': 0.69, 'Fever': 0.3, 'E': 0.01},
   'Fever' : {'Healthy': 0.4, 'Fever': 0.59, 'E': 0.01},
   }
>>> emission_probability = {
   'Healthy' : {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
   'Fever' : {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6},
   }

def fwd_bkw(x, states, a_0, a, e, end_st):
    L = len(x)
    fwd = []
    f_prev = {} #THE PROBLEM 
    # forward part of the algorithm
    for i, x_i in enumerate(x):
        f_curr = {}
        for st in states:
            if i == 0:
                # base case for the forward part
                prev_f_sum = a_0[st]
            else:
                prev_f_sum = sum(f_prev[k]*a[k][st] for k in states) ##

            f_curr[st] = e[st][x_i] * prev_f_sum

        fwd.append(f_curr)
        f_prev = f_curr

    p_fwd = sum(f_curr[k]*a[k][end_st] for k in states)

As this value was being called in prev_f_sum = sum(f_prev[k]*a[k][st] for k in 
states marked ## 
I wanted to know what values it is generating.
So, I had made the following experiment, after 
for i, x_i in enumerate(x): 
I had put print f_prev 
but I am not getting how f_prev is getting the values.

Here, 
x=observations,
states= states,
a_0=start_probability,
a= transition_probability,
e=emission_probability,
end_st= end_state
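
The question about f_prev can be traced with a minimal sketch. It is written for Python 3 and covers only the forward part; the name forward_trace is hypothetical and not part of the Wikipedia code. It records what f_prev holds at the top of each pass of the outer loop. On the first pass f_prev is still the empty dict, which is harmless because the i == 0 branch never reads it. On every later pass it is simply the f_curr dict built in the previous pass, rebound by the line f_prev = f_curr.

```python
states = ('Healthy', 'Fever')
observations = ('normal', 'cold', 'dizzy')
start_probability = {'Healthy': 0.6, 'Fever': 0.4}
transition_probability = {
    'Healthy': {'Healthy': 0.69, 'Fever': 0.3, 'E': 0.01},
    'Fever': {'Healthy': 0.4, 'Fever': 0.59, 'E': 0.01},
}
emission_probability = {
    'Healthy': {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
    'Fever': {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6},
}


def forward_trace(x, states, a_0, a, e):
    """Forward pass only; returns f_prev as seen at the top of each pass."""
    f_prev = {}
    seen = []
    for i, x_i in enumerate(x):
        seen.append(dict(f_prev))  # snapshot, same spot as "print f_prev"
        f_curr = {}
        for st in states:
            if i == 0:
                # base case: f_prev is not read at all
                prev_f_sum = a_0[st]
            else:
                # uses the values computed in the previous pass
                prev_f_sum = sum(f_prev[k] * a[k][st] for k in states)
            f_curr[st] = e[st][x_i] * prev_f_sum
        f_prev = f_curr  # this rebinding is where f_prev gets its values
    return seen, f_curr


seen, final = forward_trace(observations, states, start_probability,
                            transition_probability, emission_probability)
for i, snapshot in enumerate(seen):
    print(i, snapshot)
```

The first snapshot is empty; the second holds roughly 0.3 for 'Healthy' and 0.04 for 'Fever', that is, the start probability times the emission probability of 'normal'.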

Am I missing any minor aspect?
Code is running fine. 

If any one of the esteemed members may kindly guide me.

Regards,
Subhabrata Banerjee. 

-- 
https://mail.python.org/mailman/listinfo/python-list


Lowest Value in List

2013-10-02 Thread subhabangalore
Dear Group,

I am trying to work out a solution to the following problem in Python. 

The Problem:
Suppose I have three lists.
Each list is having 10 elements in ascending order.
I have to construct one list having 10 elements which are of the lowest value 
among these 30 elements present in the three given lists.

The Solution:

I tried to address the issue in the following ways:

a) I took three lists, like,
list1=[1,2,3,4,5,6,7,8,9,10]
list2=[0,1,2,3,4,5,6,7,8,9]
list3=[-5,-4,-3,-2,-1,0,1,2,3,4]

I tried to make sum and convert them as set to drop the repeating elements:
set_sum=set(list1+list2+list3)
set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, -1, -5, -4, -3, -2])

In the next step I tried to convert it back to list as,
list_set=list(set_sum)
gave the value as,
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, -1, -5, -4, -3, -2]

Now, I imported heapq as, 
import heapq

and took the result as,
result=heapq.nsmallest(10,list_set)
it gave as,
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]

b) I am thinking to work out another approach.
I am taking the lists again as,

list1=[1,2,3,4,5,6,7,8,9,10]
list2=[0,1,2,3,4,5,6,7,8,9]
list3=[-5,-4,-3,-2,-1,0,1,2,3,4]

as they are in ascending order, I am trying to take first four/five elements of 
each list,like,

list1_4=list1[:4]
list2_4=list2[:4]
list3_4=list3[:4]

Now, I am trying to add them as,

list11=list1_4+list2_4+list3_4

thus, giving us the result

[1, 2, 3, 4, 0, 1, 2, 3, -5, -4, -3, -2]

Now, we are trying to sort the list of the set of the sum as,

sort_sum=sorted(list(set(list11)))

giving us the required result as,

[-5, -4, -3, -2, 0, 1, 2, 3, 4]

If taking the first 4 elements of each list gives fewer than 10 elements in the 
final result (because making a set drops the repeating numbers), we increase 
the slice size by one or two, and if the final result has more than 10 elements 
we take the first ten.

Are these approaches fine. Or should we think some other way.
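
A minimal sketch of a third option, assuming (as in approach a) that repeated values should count only once. Because the inputs are already sorted, heapq.merge can walk them lazily, itertools.groupby can drop the repeats, and itertools.islice can stop after ten values. The function name lowest_unique is hypothetical. Note also that approach (b) with slices of 4 loses -1, which sits fifth in list3, so the slice size has to grow until the result is stable.

```python
import heapq
from itertools import groupby, islice


def lowest_unique(n, *sorted_lists):
    """Return the n smallest distinct values from already-sorted lists."""
    merged = heapq.merge(*sorted_lists)                # lazy sorted merge
    unique = (value for value, _ in groupby(merged))   # collapse repeats
    return list(islice(unique, n))                     # stop after n values


list1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
list2 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
list3 = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]

print(lowest_unique(10, list1, list2, list3))
# Approach (a) from the post, written in one line, gives the same list:
print(heapq.nsmallest(10, set(list1 + list2 + list3)))
```

If repeated values should be kept instead, the groupby step can simply be left out.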

If any learned member of the group can kindly let me know how to solve this, I 
would be grateful.

Thanking in Advance,
Subhabrata. 


-- 
https://mail.python.org/mailman/listinfo/python-list


Topic Modeling LDA Gensim

2013-07-19 Thread subhabangalore
Dear Group,

I am trying to use Gensim for Topic Modeling with LDA.

I have trained LDA but now I want to test it with new documents.


Should I use 

doc_lda = lda[doc_bow]

or is it something else?

If any one of the esteemed members of the group can kindly suggest?

Thanking in Advance,
Regards,
Subhabrata
-- 
http://mail.python.org/mailman/listinfo/python-list


HTML Parser

2013-07-02 Thread subhabangalore
Dear Group,

I was looking for a good tutorial for a "HTML Parser". My intention was to 
extract tables from web pages or information from tables in web pages. 

I tried to make a search, I got HTMLParser, BeautifulSoup, etc. HTMLParser 
works fine for me, but I am looking for a good tutorial to learn it nicely.

I could not use BeautifulSoup as I did not find an .exe file. 

I am using Python 2.7 on Windows 7 SP1 (64 bit). 

I am looking for a good tutorial for HTMLParser or any similar parser which 
have an .exe file for my environment and a good tutorial.
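
A minimal sketch of table extraction with the standard-library parser, which needs no separate installer. It is written for Python 3, where the class lives in html.parser; on Python 2.7 the module is named HTMLParser and some adjustments would be needed. The class name TableExtractor and the sample HTML are hypothetical, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every table as a list of rows, each row a list of cell texts."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None    # row being built, or None outside <tr>
        self._cell = None   # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


sample = ("<table><tr><th>Name</th><th>Value</th></tr>"
          "<tr><td>alpha</td><td> 1 </td></tr></table>")
parser = TableExtractor()
parser.feed(sample)
print(parser.tables)
```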

If anyone of the learned members can kindly suggest.

Thanking You in Advance,
Regards,
Subhabrata.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Search Regular Expression

2013-06-15 Thread subhabangalore
On Sunday, June 16, 2013 12:17:18 AM UTC+5:30, ru...@yahoo.com wrote:
> On Saturday, June 15, 2013 11:54:28 AM UTC-6, subhaba...@gmail.com wrote:
> 
> > Thank you for the answer. But I want to learn a bit about interesting
> > regular expression forms; where may I? 
> > No Mark, thank you for your links but they were not sufficient.
> 
> Links to the Python reference documentation are not useful for people
> just beginning with some aspect of Python; they are for people who
> already know Python and want to look up details.  So it's no
> surprise that you did not find them useful.
> 
> > I am looking for more intriguing exercises, esp. use of "or" in
> > the pattern search. 
> 
> Have you tried searching on Google for "regular expression tutorial"?
> It gives a lot of results.  I've never tried any of them so I can't 
> recommend any one specifically but maybe you can find something 
> useful there?
> 
> There is also a Python Howto on regular expressions at
>   http://docs.python.org/3/howto/regex.html
> 
> Also, maybe the book "Regular Expressions Cookbook" would
> be useful?  It seems to have a lot of specific expressions
> for accomplishing various tasks and seems to be online for
> free at
>   http://it-ebooks.info/read/920/

Dear Group,

Thank you for the links. Yes, the HOW-TO is good. The cookbook should be good. 
The Internet changes its contents so fast: a few days back there was a very 
good Regular Expression tutorial by Alan Gauld, or there were some mail 
discussions, and I don't know where they have gone. There is one tutorial by 
Gauld, but I think I read something different.

Regards,
Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Search Regular Expression

2013-06-15 Thread subhabangalore
On Saturday, June 15, 2013 3:12:55 PM UTC+5:30, subhaba...@gmail.com wrote:
> Dear Group,
> 
> I am trying to search the following pattern in Python.
> 
> I have following strings:
> 
>  (i)"In the ocean"
>  (ii)"On the ocean"
>  (iii) "By the ocean"
>  (iv) "In this group"
>  (v) "In this group"
>  (vi) "By the new group"
>    .
> 
> I want to extract from the first word to the last word, 
> where first word and last word are varying.
> 
> I am looking to extract out:
>   (i) the
>   (ii) the 
>   (iii) the
>   (iv) this
>   (v) this
>   (vi) the new
>   .
> 
> The problem may be handled by converting the string to list and then 
> index of list. 
> 
> But I am thinking if I can use regular expression in Python.
> 
> If any one of the esteemed members can help.
> 
> Thanking you in Advance,
> 
> Regards,
> Subhabrata

Dear Group,

Thank you for the answer. But I want to learn a bit about interesting regular 
expression forms; where may I? No Mark, thank you for your links but they were 
not sufficient. I am looking for more intriguing exercises, esp. use of "or" in 
the pattern search. 

Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Search Regular Expression

2013-06-15 Thread subhabangalore
On Saturday, June 15, 2013 8:34:59 PM UTC+5:30, Mark Lawrence wrote:
> On 15/06/2013 15:31, subhabangal...@gmail.com wrote:
> >
> > Dear Group,
> >
> > I know this solution but I want to have Regular Expression option. Just 
> > learning.
> >
> > Regards,
> > Subhabrata.
> >
> 
> Start here http://docs.python.org/2/library/re.html
> 
> Would you also please read and action this, 
> http://wiki.python.org/moin/GoogleGroupsPython , thanks.
> 
> -- 
> "Steve is going for the pink ball - and for those of you who are 
> watching in black and white, the pink is next to the green." Snooker 
> commentator 'Whispering' Ted Lowe.
> 
> Mark Lawrence

Dear Group,

Suppose I want a regular expression that matches both "Sent from my iPhone" and 
"Sent from my iPod". How do I write such an expression--is the problem, 
"Sent from my iPod"
"Sent from my iPhone"

which can be written as,
re.compile("Sent from my (iPhone|iPod)")

Now, if I want to extend it slightly to also cover,

"Taken from my iPod"
"Taken from my iPhone"

how can I use "or" at the beginning of the pattern?

And in the third phase, what if the intermediate phrase,

"from my", also differs or changes?

In a nutshell I want to extract a particular group of phrases,
where, the beginning and end pattern may alter like,

(i) either from beginning Pattern B1 to end Pattern E1,
(ii) or from beginning Pattern B1 to end Pattern E2,
(iii) or from beginning Pattern B2 to end Pattern E2,
.
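
A minimal sketch of alternation at the beginning, middle and end of a pattern. Each parenthesised group lists its alternatives separated by |, and any combination of one choice per group matches. The variant "via my" is a hypothetical stand-in for a changed middle phrase, and the helper name parts is also an assumption.

```python
import re

# (B1|B2) (M1|M2) (E1|E2): one alternative from each group must match
PHRASE = re.compile(r"\b(Sent|Taken) (from my|via my) (iPhone|iPod)\b")


def parts(text):
    """Return the (beginning, middle, end) pieces, or None if no match."""
    match = PHRASE.search(text)
    return match.groups() if match else None


for line in ("Sent from my iPhone", "Taken from my iPod",
             "Taken via my iPod", "Sent from my Android"):
    print(line, "->", parts(line))
```

If only the whole phrase is wanted and not the pieces, the groups can be written as (?:...) so that they are not captured.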

Regards,
Subhabrata.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern Search Regular Expression

2013-06-15 Thread subhabangalore
On Saturday, June 15, 2013 7:58:44 PM UTC+5:30, Mark Lawrence wrote:
> On 15/06/2013 14:45, Denis McMahon wrote:
> > On Sat, 15 Jun 2013 13:41:21 +, Denis McMahon wrote:
> >
> >> first_and_last = [sentence.split()[i] for i in (0, -1)]
> >> middle = sentence.split()[1:-2]
> >
> > Bugger! That last is actually:
> >
> > sentence.split()[1:-1]
> >
> > It just looks like a two.
> >
> 
> I've a very strong sense of deja vu having round the same loop what, two 
> hours ago?  Wondering out aloud the number of times a programmer has 
> thought "That's easy, I don't need to test it".  How are the mighty fallen.
> 
> -- 
> "Steve is going for the pink ball - and for those of you who are 
> watching in black and white, the pink is next to the green." Snooker 
> commentator 'Whispering' Ted Lowe.
> 
> Mark Lawrence

Dear Group,

I know this solution but I want to have Regular Expression option. Just 
learning.

Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Pattern Search Regular Expression

2013-06-15 Thread subhabangalore
Dear Group,

I am trying to search the following pattern in Python.

I have following strings:

 (i)"In the ocean"
 (ii)"On the ocean"
 (iii) "By the ocean"
 (iv) "In this group"
 (v) "In this group"
 (vi) "By the new group"
   .

I want to extract from the first word to the last word, 
where first word and last word are varying.

I am looking to extract out:
  (i) the
  (ii) the 
  (iii) the
  (iv) this
  (v) this
  (vi) the new
  .

The problem may be handled by converting the string to list and then 
index of list. 

But I am thinking if I can use regular expression in Python.
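
A minimal sketch of one regular expression that does this, assuming the task is to drop the first and the last word and keep whatever lies between them. The names MIDDLE and middle_words are hypothetical.

```python
import re

# first word, whitespace, (captured middle), whitespace, last word
MIDDLE = re.compile(r"^\s*\S+\s+(.+?)\s+\S+\s*$")


def middle_words(sentence):
    """Return the text between the first and last word, or None."""
    match = MIDDLE.match(sentence)
    return match.group(1) if match else None


for s in ("In the ocean", "On the ocean", "By the ocean",
          "In this group", "By the new group"):
    print(repr(s), "->", repr(middle_words(s)))
```

A sentence with fewer than three words has no middle part, so the function returns None for it.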

If any one of the esteemed members can help.

Thanking you in Advance,

Regards,
Subhabrata
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Error in Python NLTK

2013-04-07 Thread subhabangalore
On Monday, April 8, 2013 1:50:38 AM UTC+5:30, subhaba...@gmail.com wrote:
> On Sunday, April 7, 2013 2:14:41 AM UTC+5:30, Dave Angel wrote:
> > On 04/06/2013 03:56 PM, subhabangal...@gmail.com wrote:
> > > Dear Group,
> > >
> > > I was using a package named NLTK in Python.
> > >
> > > I was trying to write a code given in section 3.8 of
> > >
> > > http://docs.huihoo.com/nltk/0.9.5/guides/tag.html.
> > >
> > > Here, in the >>> test = ['up', 'down', 'up'] if I put more than 3 values 
> > > and trying to write the reciprocal codes, like,
> > >
> > >  sequence = [(t, None) for t in test] and print '%.3f' % 
> > > (model.probability(sequence))
> > 
> > This 'and' operator is going to try to interpret the previous list as a 
> > boolean.  Could that be your problem?  Why aren't you putting these two 
> > statements on separate lines?  And what version of Python are you using? 
> >   If 2.x, you should get a syntax error because print is a statement. 
> > If 3.x, you should get a different error because you don't put parens 
> > around the print expression.
> > 
> > >
> > > I am getting an error as,
> > >
> > > Traceback (most recent call last): File "", line 1, in 
> > > model.probability(sequence) File 
> > > "C:\Python27\lib\site-packages\nltk\tag\hmm.py", line 228, in probability 
> > > return 2**(self.log_probability(self._transform.transform(sequence))) 
> > > File "C:\Python27\lib\site-packages\nltk\tag\hmm.py", line 259, in 
> > > log_probability alpha = self._forward_probability(sequence) File 
> > > "C:\Python27\lib\site-packages\nltk\tag\hmm.py", line 694, in 
> > > _forward_probability alpha[0, i] = self._priors.logprob(state) + \ File 
> > > "C:\Python27\lib\site-packages\nltk\probability.py", line 689, in logprob 
> > > elif self._prob_dict[sample] == 0: return _NINF ValueError: The truth 
> > > value of an array with more than one element is ambiguous. Use a.any() or 
> > > a.all()
> > >
> > > If any learned member may kindly assist me how may I solve the issue.
> > >
> > 
> > Your error display has been trashed, thanks to googlegroups.
> >  http://wiki.python.org/moin/GoogleGroupsPython
> > Try posting with a text email message, since this is a text forum.
> > 
> > Your code is also sparse.  Why do you point us to fragments on the net, 
> > when you could show us the exact code you were running when it failed? 
> > I'm guessing you're running it from the interpreter, which can be very 
> > confusing once you have to ask for help.  Please put a sample of code 
> > into a file, run it, and paste into your text email both the contents of 
> > that file and the full traceback.  thanks.
> > 
> > The email address to post on this forum is  python-list@python.org
> > 
> > -- 
> > DaveA
> 
> Dear Sir, 
> I generally solved this problem from some other angle but I would like to fix 
> this particular issue also so I am posting soon to you. 
> Regards,
> Subhabrata.

Dear Sir,
I was giving the wrong input. It was an input error on my side.

Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Splitting of string at an interval

2013-04-07 Thread subhabangalore
Dear Group,

I was looking to split a string in a particular interval, like,

If I have a string, 
string="The Sun rises in the east of  our earth"

I like to see it as, 
words=["The Sun","rises in","in the","east of","our earth"]
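
A minimal sketch, assuming "interval" means non-overlapping groups of two words. The expected list in the post repeats "in" ("rises in", "in the"), which does not follow one fixed interval, so the output of this sketch differs from it at that point. The function name chunk_words is hypothetical.

```python
def chunk_words(text, size=2):
    """Split text into groups of `size` consecutive words."""
    words = text.split()  # split() with no argument also absorbs double spaces
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


string = "The Sun rises in the east of  our earth"
print(chunk_words(string))
```

With nine words and a size of two, the last group holds a single word.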

If any one of the learned members can kindly suggest.

Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Error in Python NLTK

2013-04-07 Thread subhabangalore
On Sunday, April 7, 2013 2:14:41 AM UTC+5:30, Dave Angel wrote:
> On 04/06/2013 03:56 PM, subhabangal...@gmail.com wrote:
> > Dear Group,
> >
> > I was using a package named NLTK in Python.
> >
> > I was trying to write a code given in section 3.8 of
> >
> > http://docs.huihoo.com/nltk/0.9.5/guides/tag.html.
> >
> > Here, in the >>> test = ['up', 'down', 'up'] if I put more than 3 values 
> > and trying to write the reciprocal codes, like,
> >
> >  sequence = [(t, None) for t in test] and print '%.3f' % 
> > (model.probability(sequence))
> 
> This 'and' operator is going to try to interpret the previous list as a 
> boolean.  Could that be your problem?  Why aren't you putting these two 
> statements on separate lines?  And what version of Python are you using? 
>   If 2.x, you should get a syntax error because print is a statement. 
> If 3.x, you should get a different error because you don't put parens 
> around the print expression.
> 
> >
> > I am getting an error as,
> >
> > Traceback (most recent call last): File "", line 1, in 
> > model.probability(sequence) File 
> > "C:\Python27\lib\site-packages\nltk\tag\hmm.py", line 228, in probability 
> > return 2**(self.log_probability(self._transform.transform(sequence))) File 
> > "C:\Python27\lib\site-packages\nltk\tag\hmm.py", line 259, in 
> > log_probability alpha = self._forward_probability(sequence) File 
> > "C:\Python27\lib\site-packages\nltk\tag\hmm.py", line 694, in 
> > _forward_probability alpha[0, i] = self._priors.logprob(state) + \ File 
> > "C:\Python27\lib\site-packages\nltk\probability.py", line 689, in logprob 
> > elif self._prob_dict[sample] == 0: return _NINF ValueError: The truth value 
> > of an array with more than one element is ambiguous. Use a.any() or a.all()
> >
> > If any learned member may kindly assist me how may I solve the issue.
> >
> 
> Your error display has been trashed, thanks to googlegroups.
>  http://wiki.python.org/moin/GoogleGroupsPython
> Try posting with a text email message, since this is a text forum.
> 
> Your code is also sparse.  Why do you point us to fragments on the net, 
> when you could show us the exact code you were running when it failed? 
> I'm guessing you're running it from the interpreter, which can be very 
> confusing once you have to ask for help.  Please put a sample of code 
> into a file, run it, and paste into your text email both the contents of 
> that file and the full traceback.  thanks.
> 
> The email address to post on this forum is  python-list@python.org
> 
> -- 
> DaveA

Dear Sir, 
I generally solved this problem from some other angle but I would like to fix 
this particular issue also so I am posting soon to you. 
Regards,
Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Error in Python NLTK

2013-04-06 Thread subhabangalore
On Sunday, April 7, 2013 2:14:41 AM UTC+5:30, Dave Angel wrote:
> On 04/06/2013 03:56 PM, subhabangal...@gmail.com wrote:
> > Dear Group,
> >
> > I was using a package named NLTK in Python.
> >
> > I was trying to write a code given in section 3.8 of
> >
> > http://docs.huihoo.com/nltk/0.9.5/guides/tag.html.
> >
> > Here, in the >>> test = ['up', 'down', 'up'] if I put more than 3 values 
> > and trying to write the reciprocal codes, like,
> >
> >  sequence = [(t, None) for t in test] and print '%.3f' % 
> > (model.probability(sequence))
> 
> This 'and' operator is going to try to interpret the previous list as a 
> boolean.  Could that be your problem?  Why aren't you putting these two 
> statements on separate lines?  And what version of Python are you using? 
>   If 2.x, you should get a syntax error because print is a statement. 
> If 3.x, you should get a different error because you don't put parens 
> around the print expression.
> 
> >
> > I am getting an error as,
> >
> > Traceback (most recent call last): File "", line 1, in 
> > model.probability(sequence) File 
> > "C:\Python27\lib\site-packages\nltk\tag\hmm.py", line 228, in probability 
> > return 2**(self.log_probability(self._transform.transform(sequence))) File 
> > "C:\Python27\lib\site-packages\nltk\tag\hmm.py", line 259, in 
> > log_probability alpha = self._forward_probability(sequence) File 
> > "C:\Python27\lib\site-packages\nltk\tag\hmm.py", line 694, in 
> > _forward_probability alpha[0, i] = self._priors.logprob(state) + \ File 
> > "C:\Python27\lib\site-packages\nltk\probability.py", line 689, in logprob 
> > elif self._prob_dict[sample] == 0: return _NINF ValueError: The truth value 
> > of an array with more than one element is ambiguous. Use a.any() or a.all()
> >
> > If any learned member may kindly assist me how may I solve the issue.
> >
> 
> Your error display has been trashed, thanks to googlegroups.
>  http://wiki.python.org/moin/GoogleGroupsPython
> Try posting with a text email message, since this is a text forum.
> 
> Your code is also sparse.  Why do you point us to fragments on the net, 
> when you could show us the exact code you were running when it failed? 
> I'm guessing you're running it from the interpreter, which can be very 
> confusing once you have to ask for help.  Please put a sample of code 
> into a file, run it, and paste into your text email both the contents of 
> that file and the full traceback.  thanks.
> 
> The email address to post on this forum is  python-list@python.org
> 
> -- 
> DaveA

Thanks Dave for your kind suggestions. I am checking on them, and if I have any 
questions I will send them to the room in a detailed manner.
Regards,
Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


Error in Python NLTK

2013-04-06 Thread subhabangalore
Dear Group,

I was using a package named NLTK in Python. 

I was trying to write a code given in section 3.8 of 

http://docs.huihoo.com/nltk/0.9.5/guides/tag.html.

Here, in the >>> test = ['up', 'down', 'up'] if I put more than 3 values and 
trying to write the reciprocal codes, like,

sequence = [(t, None) for t in test] and print '%.3f' % 
(model.probability(sequence))

I am getting an error as, 

Traceback (most recent call last): File "", line 1, in 
model.probability(sequence) File 
"C:\Python27\lib\site-packages\nltk\tag\hmm.py", line 228, in probability 
return 2**(self.log_probability(self._transform.transform(sequence))) File 
"C:\Python27\lib\site-packages\nltk\tag\hmm.py", line 259, in log_probability 
alpha = self._forward_probability(sequence) File 
"C:\Python27\lib\site-packages\nltk\tag\hmm.py", line 694, in 
_forward_probability alpha[0, i] = self._priors.logprob(state) + \ File 
"C:\Python27\lib\site-packages\nltk\probability.py", line 689, in logprob elif 
self._prob_dict[sample] == 0: return _NINF ValueError: The truth value of an 
array with more than one element is ambiguous. Use a.any() or a.all()

If any learned member may kindly assist me how may I solve the issue. 

Regards, 
Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Simple Plot in Python

2013-03-16 Thread subhabangalore
On Saturday, March 16, 2013 5:12:41 PM UTC+5:30, subhaba...@gmail.com wrote:
> Dear Group,
> 
> I have two sets of values in probability, like,
> 
> x=[0.1,0.2,0.3,0.4]
> and
> y=[0.2,0.4,0.6,0.8]
> 
> And I am trying to draw a simple graph with Python.
> 
> I was trying to draw in Matplotlib but did not find much help.
> 
> If any one in the room can kindly suggest.
> 
> Thanking You in Advance,
> Regards,
> Subhabrata.

Thanks. I don't know why it slipped my eyes.
Regards,
Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


Simple Plot in Python

2013-03-16 Thread subhabangalore
Dear Group,

I have two sets of values in probability, like,

x=[0.1,0.2,0.3,0.4]
and
y=[0.2,0.4,0.6,0.8]

And I am trying to draw a simple graph with Python.

I was trying to draw in Matplotlib but did not find much help.

If any one in the room can kindly suggest.

Thanking You in Advance,
Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python and Hidden Markov Model

2013-03-07 Thread subhabangalore
On Friday, March 8, 2013 2:18:06 AM UTC+5:30, subhaba...@gmail.com wrote:
> Dear Group,
> 
> I was trying to learn Hidden Markov Model. In Python there are various 
> packages, but I was willing to do some basic calculation starting from the 
> scratch so that I can learn the model very aptly. Do you know of any thing 
> such?
> 
> Thanking you in Advance,
> Regards,
> Subhabrata.

Dear Sir,

Thank you for your kind reply. I agree with most of your points but I differ 
slightly also.

My problem is over model validation on continuous time Markov system.
Generally, I understand the theory and can run the kits like HMM.py or 
Scikit-learn.
The problem is if I can not fit the data in run time I would be at the mercy of 
the kit.
So I wanted to know the coding of the computation. 

I am specifically looking at the small python example of Forward, Backward and 
Viterbi calculation.
I tried to surf the web but it did not help much. I do not know much of 
Scientific forum.

I thought as HMM.py, NLTK, Scikit-learn are Python implementations so there 
would be a lot of people in the room who would know it. 

And I got people like you, so I can not say I am wrong!

Regards,
Subhabrata.  
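
A minimal from-scratch sketch of the Viterbi calculation mentioned above, for a discrete-time HMM only. The two-state Healthy/Fever model and all its probabilities are illustrative assumptions, and the function name viterbi is hypothetical; this is not the code of HMM.py, NLTK or Scikit-learn.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its probability) for obs."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               the state that path came from)
    best = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        prev = best[-1]
        row = {}
        for s in states:
            # pick the predecessor k that maximises the path probability
            p, k = max((prev[k][0] * trans_p[k][s], k) for k in states)
            row[s] = (p * emit_p[s][o], k)
        best.append(row)
    # backtrack from the most probable final state
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for row in reversed(best[1:]):
        path.append(row[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


states = ('Healthy', 'Fever')
start_p = {'Healthy': 0.6, 'Fever': 0.4}
trans_p = {'Healthy': {'Healthy': 0.7, 'Fever': 0.3},
           'Fever': {'Healthy': 0.4, 'Fever': 0.6}}
emit_p = {'Healthy': {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
          'Fever': {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6}}

path, prob = viterbi(('normal', 'cold', 'dizzy'), states,
                     start_p, trans_p, emit_p)
print(path, prob)
```

The forward calculation has the same loop structure with sum in place of max, which is why working one of them by hand makes the others easier to follow. For long sequences the products underflow, so real implementations usually work with logarithms.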


-- 
http://mail.python.org/mailman/listinfo/python-list


Python and Hidden Markov Model

2013-03-07 Thread subhabangalore
Dear Group,

I was trying to learn Hidden Markov Model. In Python there are various 
packages, but I was willing to do some basic calculation starting from the 
scratch so that I can learn the model very aptly. Do you know of any thing such?

Thanking you in Advance,
Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Multiple Plotting in Matplotlib

2013-02-18 Thread subhabangalore
On Monday, February 18, 2013 9:18:34 PM UTC+5:30, Nelle Varoquaux wrote:
> > Dear Group,
> >
> > I am trying to view multiple plotting files in matplotlib. My numbers range 
> > from 5 to a few hundred. I was trying to use plt.subplot(), and plt.figure(n).
> > But they did not work.
> > plt.subplot() did not work at all.
> > plt.figure(n) works till n=4. After that I am getting error messages.
> 
> Can you specify what "did not work at all" means and paste the error
> messages and the code you are using ?
> 
> Thanks,
> N
> 
> >
> > If any one of the learned members can kindly help.
> >
> > Thanking in Advance,
> > Regards,
> > Subhabrata.
> > --
> > http://mail.python.org/mailman/listinfo/python-list

Thanks Nelle. It seems there was a minor coding mistake I was doing.

Regards,
Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


Multiple Plotting in Matplotlib

2013-02-18 Thread subhabangalore
Dear Group,

I am trying to view multiple plotting files in matplotlib. My numbers range 
from 5 to a few hundred. I was trying to use plt.subplot(), and plt.figure(n).
But they did not work.
plt.subplot() did not work at all.
plt.figure(n) works till n=4. After that I am getting error messages.

If any one of the learned members can kindly help.

Thanking in Advance,
Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Forward Backward Algorithm in Python

2013-02-07 Thread subhabangalore
On Friday, February 8, 2013 2:08:35 AM UTC+5:30, Dave Angel wrote:
> On 02/07/2013 03:13 PM, subhabangal...@gmail.com wrote:
> > Dear Group,
> > If any one can kindly help me with a simple Forward Backward algorithm 
> > implementation. I tried to search in web but did not help much.
> >
> > Thanking You in Advance,
> > Regards,
> > Subhabrata.
> >
> 
> No idea what forward-backward-algorithm is.  But a simple search with 
> DuckDuckGo gives a pile of refs, including the following with Python 
> example code:
> 
> https://en.wikipedia.org/wiki/Forward%E2%80%93backward_algorithm
> 
> -- 
> DaveA

Thanks Dave, but these are not properly explained. I checked almost all.
Regards,
Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


Forward Backward Algorithm in Python

2013-02-07 Thread subhabangalore
Dear Group,
If any one can kindly help me with a simple Forward Backward algorithm 
implementation. I tried to search in web but did not help much.

Thanking You in Advance,
Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Maximum Likelihood Estimation

2013-02-01 Thread subhabangalore
On Friday, February 1, 2013 10:47:04 PM UTC+5:30, subhaba...@gmail.com wrote:
> Dear Group,
> 
> I am looking for a Python implementation of Maximum Likelihood Estimation. If 
> any one can kindly suggest. With a google search it seems 
> scipy,numpy,statsmodels have modules, but as I am not finding proper example 
> workouts I am failing to use them. 
> 
> I am using Python 2.7 on Windows 7.
> 
> Thanking You in Advance, 
> 
> Regards,
> Subhabrata

Dear Sir,
The room would take care of you. They still guide me and sometimes the way they 
rebuke you have to see. You are in a nice room, you'd learn soon. It was bot? 
If you are testing please let me know I'd like to be part of the testing, and 
if you suggest may volunteer to send queries. 
Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Maximum Likelihood Estimation

2013-02-01 Thread subhabangalore
On Friday, February 1, 2013 11:07:48 PM UTC+5:30, 8 Dihedral wrote:
> On Saturday, February 2, 2013 1:17:04 AM UTC+8, subhaba...@gmail.com wrote:
> > Dear Group,
> > 
> > I am looking for a Python implementation of Maximum Likelihood Estimation. 
> > If any one can kindly suggest. With a google search it seems 
> > scipy,numpy,statsmodels have modules, but as I am not finding proper 
> > example workouts I am failing to use them. 
> > 
> > I am using Python 2.7 on Windows 7.
> > 
> > Thanking You in Advance, 
> > 
> > Regards,
> > Subhabrata
> 
> I suggest you can google "python and symbolic 
> computation" to get some package for your need first.
> 
> Because it seems that you have to work out some 
> math formula and verify some random process first
> of your data sources with noises .

Dear Group,
Thanks. I googled and found a new package named Sympy and could generate MLE 
graphs.
Regards,
Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


Maximum Likelihood Estimation

2013-02-01 Thread subhabangalore
Dear Group,

I am looking for a Python implementation of Maximum Likelihood Estimation. If 
any one can kindly suggest. With a google search it seems 
scipy,numpy,statsmodels have modules, but as I am not finding proper example 
workouts I am failing to use them. 
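
As a small worked example of the idea, here is a minimal sketch using only the standard library, assuming the data are independent samples from a normal distribution. For that model the maximum likelihood estimates have a closed form: the sample mean, and the variance with divisor n (not n - 1). The function names and the sample data are hypothetical, and this is not the scipy or statsmodels API.

```python
import math


def gaussian_mle(samples):
    """Closed-form MLE of (mean, variance) for a normal distribution."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n  # divisor n, not n - 1
    return mu, var


def log_likelihood(samples, mu, var):
    """Log-likelihood of the samples under Normal(mu, var)."""
    n = len(samples)
    squares = sum((x - mu) ** 2 for x in samples)
    return -0.5 * n * math.log(2 * math.pi * var) - squares / (2 * var)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu, var = gaussian_mle(data)
print(mu, var, log_likelihood(data, mu, var))
```

Evaluating log_likelihood at nearby parameter values gives a quick check that the estimate really is the maximum. For models without a closed form, the same log-likelihood function would be handed to a numerical optimiser instead.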

I am using Python 2.7 on Windows 7.

Thanking You in Advance, 

Regards,
Subhabrata
-- 
http://mail.python.org/mailman/listinfo/python-list


Question on Python Conference

2013-01-19 Thread subhabangalore
Dear Group,

As far as I know, the Python Software Foundation organizes some conferences 
throughout the year. Most probably they are known as PyCon. But I have a 
different question: is it possible to attend one by video conferencing? Or, if 
I request the same, will it be granted?

Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Question on for loop

2013-01-15 Thread subhabangalore
On Friday, January 4, 2013 11:18:24 AM UTC+5:30, Steven D'Aprano wrote:
> On Thu, 03 Jan 2013 12:04:03 -0800, subhabangalore wrote:
>
> > Dear Group,
> > If I take a list like the following:
> >
> > fruits = ['banana', 'apple',  'mango'] 
> > for fruit in fruits:
> >    print 'Current fruit :', fruit
> >
> > Now,
> > if I want variables like var1,var2,var3 be assigned to them, we may
> > take, var1=banana,
> > var2=apple,
> > var3=mango
> >
> > but can we do something to assign the variables dynamically
>
> Easy as falling off a log. You can't write "var1", "var2" etc. but you 
> can write it as "var[0]", "var[1]" etc.
>
> var = ['banana', 'apple',  'mango'] 
> print var[0]  # prints 'banana'
> print var[1]  # prints 'apple'
> print var[2]  # prints 'mango'
>
> Of course "var" is not a very good variable name. "fruit" or "fruits" 
> would be better.
>
> -- 
> Steven

Actually, in many cases it is easy if you can get the variable from the list 
value. I was trying something like,
def func1(n):
    list1=["x1","x2","x3","x4","x5","x6","x7","x8","x9","x10"]
    blnk=[]
    for i in range(len(list1)):
        num1="var"+str(i)+"="+list1[i]
        blnk.append(num1)
    print blnk
Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Subgraph Drawing

2013-01-13 Thread subhabangalore
On Monday, January 14, 2013 6:05:49 AM UTC+5:30, Steven D'Aprano wrote:
> On Sun, 13 Jan 2013 12:05:54 -0800, subhabangalore wrote:
>
> > Dear Group,
> >
> > I have two questions, if I take a subseries of the matrix as in
> > eigenvalue here, provided I have one graph of the full form in G, how
> > may I show it, as if I do the nx.draw(G) it takes only the original
> > graph.
>
> Is this what you mean? If not, you will have to explain your question 
> better.
>
> L = nx.laplacian(G)
> E = numpy.linalg.eigvals(L)
> nx.draw(E)
>
> > >>> print numpy.linalg.eigvals(L)
> > [  8.e+00   2.22044605e-16   1.e+00   1.e+00
> >    1.e+00   1.e+00   1.e+00   1.e+00]
> > for more than 1000 nodes it is coming too slow on Windows 7 machine with
> > 3GB RAM.
>
> Get a faster machine. Or use fewer nodes. Or be patient and wait.
>
> Solving a graph problem with 1000 nodes is a fairly big problem for a 
> desktop PC. It will take time. Calculations don't just happen instantly, 
> the more work you have to do the longer they take.
>
> The last alternative is to ask on a specialist numpy list. But I expect 
> they will probably tell you the same thing.
>
> -- 
> Steven

Dear Steven,

Thank you for your kind effort. You got the problem right. But it gives the 
following error:
Traceback (most recent call last):
  File "", line 1, in 
    nx.draw(E)
  File "C:\Python27\lib\site-packages\networkx\drawing\nx_pylab.py", line 138, in draw
    draw_networkx(G,pos=pos,ax=ax,**kwds)
  File "C:\Python27\lib\site-packages\networkx\drawing\nx_pylab.py", line 267, in draw_networkx
    pos=nx.drawing.spring_layout(G) # default to spring layout
  File "C:\Python27\lib\site-packages\networkx\drawing\layout.py", line 241, in fruchterman_reingold_layout
    A=nx.to_numpy_matrix(G,weight=weight)
  File "C:\Python27\lib\site-packages\networkx\convert.py", line 492, in to_numpy_matrix
    nodelist = G.nodes()
AttributeError: 'numpy.ndarray' object has no attribute 'nodes'
>>> 

There are other solutions that convert the matrix back to a graph. Should I 
try one of those?
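
The traceback says that nx.draw() was handed a numpy array, while it expects a graph object. If the aim is to show only part of G, one possible route is to build a subgraph and draw that. The sketch below assumes the star graph from the original post; the list wanted and the helper show are hypothetical names, and the drawing call is left inside a function so that nothing opens a window until it is called.

```python
import networkx as nx

# The star graph from the original post: node 1 joined to nodes 2..8.
G = nx.Graph()
G.add_edges_from([(1, n) for n in range(2, 9)])

# Hypothetical choice of nodes to keep; pick them however suits the analysis.
wanted = [1, 2, 3, 4]
H = G.subgraph(wanted)   # H is itself a graph, so nx.draw accepts it

def show(graph):
    """Draw a graph in a matplotlib window (needs a working matplotlib install)."""
    import matplotlib.pyplot as plt
    nx.draw(graph)
    plt.show()
```

Calling show(H) should then draw only the chosen nodes and the edges between them, while show(G) draws the full graph.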

Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Subgraph Drawing

2013-01-13 Thread subhabangalore
Dear Group,

I have two questions. If I take a subseries of the matrix, such as the 
eigenvalues here, and I have the full graph in G, how may I show it? If I call 
nx.draw(G) it draws only the original graph. 

>>> import numpy
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> G=nx.Graph()
>>> G.add_edges_from([(1,2),(1,3),(1,3),(1,4),(1,5),(1,6),(1,7),(1,8)])
>>> L =nx.laplacian(G)
>>> print L
[[ 7. -1. -1. -1. -1. -1. -1. -1.]
 [-1.  1.  0.  0.  0.  0.  0.  0.]
 [-1.  0.  1.  0.  0.  0.  0.  0.]
 [-1.  0.  0.  1.  0.  0.  0.  0.]
 [-1.  0.  0.  0.  1.  0.  0.  0.]
 [-1.  0.  0.  0.  0.  1.  0.  0.]
 [-1.  0.  0.  0.  0.  0.  1.  0.]
 [-1.  0.  0.  0.  0.  0.  0.  1.]]
>>> print numpy.linalg.eigvals(L)
[  8.e+00   2.22044605e-16   1.e+00   1.e+00
   1.e+00   1.e+00   1.e+00   1.e+00]

Also, for more than 1000 nodes it runs too slowly on a Windows 7 machine with 
3GB RAM.
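
A graph Laplacian of an undirected graph is symmetric, so one thing that might help with speed is numpy.linalg.eigvalsh, which is intended for symmetric matrices and is usually quicker than the general eigvals routine. The sketch below builds the Laplacian of the same 8-node star graph with numpy alone; star_laplacian is a hypothetical helper, and whether this is fast enough for 1000 nodes on the machine described is not something the sketch can promise.

```python
import numpy as np

def star_laplacian(n):
    """Laplacian L = D - A of a star graph: node 0 joined to nodes 1..n-1."""
    A = np.zeros((n, n))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    D = np.diag(A.sum(axis=1))
    return D - A

L = star_laplacian(8)
# eigvalsh assumes a symmetric matrix and returns real eigenvalues in ascending order.
eigenvalues = np.linalg.eigvalsh(L)
```

The eigenvalues come back sorted, which also makes it easier to pick out the smallest or largest ones.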

If any one of the learned members can help.

Apology for any indentation error etc.

Thanking all in Advance,

Regards,
Subhabrata Banerjee.
-- 
http://mail.python.org/mailman/listinfo/python-list


For Loop in List

2013-01-13 Thread subhabangalore
Dear Group,

I have a list like,

>>> list1=[1,2,3,4,5,6,7,8,9,10,11,12]

Now, if I want to take a slice of it, I can.
It may be done as,
>>> list2=list1[:3]
>>> print list2
[1, 2, 3]

If I want to iterate over the list, I may do as,

>>> for i in list1:
        print "Iterated Value Is:",i


Iterated Value Is: 1
Iterated Value Is: 2
Iterated Value Is: 3
Iterated Value Is: 4
Iterated Value Is: 5
Iterated Value Is: 6
Iterated Value Is: 7
Iterated Value Is: 8
Iterated Value Is: 9
Iterated Value Is: 10
Iterated Value Is: 11
Iterated Value Is: 12

Now, I want to combine the iterator with a slicing condition, like

>>> for i=list2 in list1:
        print "Iterated Value Is:",i

so that I get the list in slices like,
[1,2,3]
[4,5,6]
[7,8,9]
[10,11,12]

But if I do this I get a SyntaxError. Is there a solution?
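
One common way to get such slices is to step through the start positions with range() and slice at each one. This is a minimal sketch; the helper name chunks is hypothetical.

```python
def chunks(seq, size):
    """Return consecutive slices of seq, each holding at most size items."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]

list1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
for part in chunks(list1, 3):
    print(part)
```

If the length of the list is not a multiple of the slice size, the last slice is simply shorter.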

If any of the learned members could kindly let me know, I would be grateful.

Apologies for any indentation errors, etc. 

Thanking You in Advance,

Regards,
Subhabrata 




-- 
http://mail.python.org/mailman/listinfo/python-list


Question on for loop

2013-01-03 Thread subhabangalore
Dear Group,
If I take a list like the following:

fruits = ['banana', 'apple',  'mango']
for fruit in fruits:
   print 'Current fruit :', fruit

Now, 
if I want variables like var1,var2,var3 to be assigned to them, we may take,
var1=banana,
var2=apple,
var3=mango

but can we do something to assign the variables dynamically? I was thinking
of 
var_series=['var1','var2','var3']
for var in var_series:
    for fruit in fruits:
        print var,fruits
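
A dictionary is one way to get "dynamic variable names" without creating real variables. The sketch below pairs generated names with the fruits; the names names and assigned are hypothetical.

```python
fruits = ['banana', 'apple', 'mango']

# Build the names var1, var2, ... and pair each with a fruit.
names = ['var%d' % i for i in range(1, len(fruits) + 1)]
assigned = dict(zip(names, fruits))

for name in names:
    print('%s = %s' % (name, assigned[name]))
```

A value is then looked up as assigned['var2'] rather than through a variable called var2.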

If any one can kindly suggest.

Regards,
Subhabrata

NB: Apology for some alignment mistakes,etc.

-- 
http://mail.python.org/mailman/listinfo/python-list


Graph Drawing

2013-01-02 Thread subhabangalore
Dear Group,

In the networkx module we generally build the graph as,
>>> import networkx as nx
>>> G=nx.Graph()
>>> G.add_edge(1, 2, weight=4.7 )
>>> G.add_edge(1, 3, weight=4.5 )
.

Now, if I want to retrieve the information for the traversal from 1 to 3, I can 
call 
G.edges()

but I am looking for a command or function that gives me not only the node 
names but also the weights. 
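
In networkx, G.edges(data=True) yields each edge together with its attribute dictionary, and a single edge can be looked up as G[u][v]. A short sketch using the two edges above; the names weighted and w13 are hypothetical.

```python
import networkx as nx

G = nx.Graph()
G.add_edge(1, 2, weight=4.7)
G.add_edge(1, 3, weight=4.5)

# data=True yields (node, node, attribute_dict) for every edge.
weighted = [(u, v, attrs['weight']) for u, v, attrs in G.edges(data=True)]

# A single edge can also be looked up directly.
w13 = G[1][3]['weight']
```

Here weighted holds one (node, node, weight) triple per edge, and w13 is the weight of the edge between 1 and 3.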

If anyone in the room can kindly suggest?

Regards,
Subhabrata. 

NB: Apology for any indentation error, etc.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Matplotlib/Pylab Error

2012-12-11 Thread subhabangalore
On Tuesday, December 11, 2012 2:10:07 AM UTC+5:30, subhaba...@gmail.com wrote:
> Dear Group,
> 
> I am trying to enumerate few interesting errors on pylab/matplotlib. 
> [...]

Matplotlib/Pylab Error

2012-12-10 Thread subhabangalore
Dear Group,

I am trying to enumerate a few interesting errors in pylab/matplotlib. 
If any of the learned members can kindly let me know how I should address them.

They are as follows.

i) >>> import numpy
>>> import pylab
>>> t = numpy.arange(0.0, 1.0+0.01, 0.01)
>>> s = numpy.cos(2*2*numpy.pi*t)
>>> pylab.plot(t, s)
[]
>>> pylab.show()
Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Python26\lib\lib-tk\Tkinter.py", line 1410, in __call__
    return self.func(*args)
  File "C:\Python26\Lib\site-packages\matplotlib\backends\backend_tkagg.py", line 236, in resize
    self.show()
  File "C:\Python26\Lib\site-packages\matplotlib\backends\backend_tkagg.py", line 239, in draw
    FigureCanvasAgg.draw(self)
  File "C:\Python26\Lib\site-packages\matplotlib\backends\backend_agg.py", line 421, in draw
    self.figure.draw(self.renderer)
  File "C:\Python26\Lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python26\Lib\site-packages\matplotlib\figure.py", line 898, in draw
    func(*args)
  File "C:\Python26\Lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python26\Lib\site-packages\matplotlib\axes.py", line 1997, in draw
    a.draw(renderer)
  File "C:\Python26\Lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python26\Lib\site-packages\matplotlib\axis.py", line 1045, in draw
    tick.draw(renderer)
  File "C:\Python26\Lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python26\Lib\site-packages\matplotlib\axis.py", line 239, in draw
    self.label1.draw(renderer)
  File "C:\Python26\Lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python26\Lib\site-packages\matplotlib\text.py", line 591, in draw
    ismath=ismath)
  File "C:\Python26\Lib\site-packages\matplotlib\backends\backend_agg.py", line 167, in draw_text
    font.draw_glyphs_to_bitmap(antialiased=rcParams['text.antialiased'])
TypeError: draw_glyphs_to_bitmap() takes no keyword arguments

ii) Python 2.6.1 (r261:67517, Dec  4 2008, 16:51:00) [MSC v.1500 32 bit 
(Intel)] on win32
Type "copyright", "credits" or "license()" for more information.


Personal firewall software may warn about the connection IDLE
makes to its subprocess using this computer's internal loopback
interface.  This connection is not visible on any external
interface and no data is sent to or received from the Internet.


IDLE 2.6.1  
>>> import networkx as nx
>>> G=nx.Graph()
>>> G.add_node(1)
>>> G.add_nodes_from([2,3])
>>> H=nx.path_graph(10)
>>> G.add_nodes_from(H)
>>> G.add_node(H)
>>> G.add_edge(1,2)
>>> G.draw()

Traceback (most recent call last):
  File "", line 1, in 
    G.draw()
AttributeError: 'Graph' object has no attribute 'draw'
>>> import matplotlib.pyplot as plt
>>> plt.show()
>>> nx.draw(G)
>>> plt.show()
Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Python26\lib\lib-tk\Tkinter.py", line 1410, in __call__
    return self.func(*args)
  File "C:\Python26\Lib\site-packages\matplotlib\backends\backend_tkagg.py", line 236, in resize
    self.show()
  File "C:\Python26\Lib\site-packages\matplotlib\backends\backend_tkagg.py", line 239, in draw
    FigureCanvasAgg.draw(self)
  File "C:\Python26\Lib\site-packages\matplotlib\backends\backend_agg.py", line 421, in draw
    self.figure.draw(self.renderer)
  File "C:\Python26\Lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python26\Lib\site-packages\matplotlib\figure.py", line 898, in draw
    func(*args)
  File "C:\Python26\Lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python26\Lib\site-packages\matplotlib\axes.py", line 1997, in draw
    a.draw(renderer)
  File "C:\Python26\Lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python26\Lib\site-packages\matplotlib\text.py", line 591, in draw
    ismath=ismath)
  File "C:\Python26\Lib\site-packages\matplotlib\backends\backend_agg.py", line 167, in draw_text
    font.draw_glyphs_to_bitmap(antialiased=rcParams['text.antialiased'])
TypeError: draw_glyphs_to_bitmap() takes no keyword arguments

Regards,
Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


Parsing in Python

2012-12-08 Thread subhabangalore
Dear Group, 

I am looking for a ready-made tool to resolve anaphora, preferably a 
Python-based one. I checked NLTK. It has a DRT parser, but I do not like that. 
In other parsers you have to supply a grammar, but I am looking for a 
completely built-in one. 

If anyone can kindly suggest one, I would be grateful.
 
Regards, Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Cosine Similarity

2012-12-07 Thread subhabangalore
On Friday, December 7, 2012 9:47:46 AM UTC+5:30, Miki Tebeka wrote:
> On Thursday, December 6, 2012 2:15:53 PM UTC-8, subhaba...@gmail.com wrote:
> > I am looking for some example of implementing Cosine similarity in python. 
> > I searched for hours but could not help much. NLTK seems to have a module 
> > but did not find examples. 
>
> Should be easy with numpy:
>
> import numpy as np
>
> def cos(v1, v2):
>     return np.dot(v1, v2) / (np.sqrt(np.dot(v1, v1)) * np.sqrt(np.dot(v2, v2)))
>
> HTH,
> --
> Miki

Thanks Miki. It worked. Regards,Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


Cosine Similarity

2012-12-06 Thread subhabangalore
Dear Group,

I am looking for an example of implementing cosine similarity in Python. I 
searched for hours but could not find much help. NLTK seems to have a module, 
but I did not find examples. 
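
A minimal sketch of cosine similarity using only the standard library, assuming the two inputs are equal-length sequences of numbers (for text, they would first have to be turned into such vectors, for example word counts). The function name cosine_similarity is hypothetical.

```python
import math

def cosine_similarity(v1, v2):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0 or norm2 == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm1 * norm2)
```

Vectors pointing the same way give a value close to 1, perpendicular vectors give 0, and opposite vectors give -1.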

If anyone of the learned members may kindly help out.

Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python Cluster

2012-12-04 Thread subhabangalore
On Wednesday, December 5, 2012 2:33:56 AM UTC+5:30, Miki Tebeka wrote:
> On Tuesday, December 4, 2012 11:04:15 AM UTC-8, subhaba...@gmail.com wrote:
> > >>> cl = HierarchicalClustering(data, lambda x,y: abs(x-y))
> > but now I want to visualize it if any one suggest how may I use 
> > visualization(like matplotlib or pyplot etc.) to see the data?
>
> One option is to use a scatter plot with different color per cluster. See the 
> many examples in http://matplotlib.org/gallery.html.
>
> HTH,
> --
> Miki

Thanks Miki. Good gallery; I think it will do. We can plot as we like. 
Regards,
Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


Python Cluster

2012-12-04 Thread subhabangalore
Dear Group,

I am trying to use the cluster module as,
>>> from cluster import *
>>> data = [12,34,23,32,46,96,13]
>>> cl = HierarchicalClustering(data, lambda x,y: abs(x-y))
>>> cl.getlevel(10)
[[96], [46], [12, 13, 23, 34, 32]]
>>> cl.getlevel(5)
[[96], [46], [12, 13], [23], [34, 32]]

but now I want to visualize it. If anyone could suggest how I may use a 
visualization library (like matplotlib or pyplot) to see the data?
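
One possible approach is a scatter plot with one colour per cluster. The sketch below starts from the nested list that cl.getlevel(10) printed above (copied in by hand, so the cluster module itself is not needed here) and flattens it into values and cluster numbers. The helper plot_clusters is hypothetical, and the plotting call is kept inside it so that nothing is drawn until it is called.

```python
# Cluster levels as printed by cl.getlevel(10) in the post above.
clusters = [[96], [46], [12, 13, 23, 34, 32]]

# Flatten to parallel lists: one value and one cluster number per point.
values = []
labels = []
for label, members in enumerate(clusters):
    for value in members:
        values.append(value)
        labels.append(label)

def plot_clusters(values, labels):
    """Scatter the 1-D values along the x axis, coloured by cluster number."""
    import matplotlib.pyplot as plt
    plt.scatter(values, [0] * len(values), c=labels)
    plt.yticks([])
    plt.show()
```

Calling plot_clusters(values, labels) should then show the seven numbers on a line, with points from the same cluster sharing a colour.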

Thanking in advance,
Regards,
Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


A Discussion on Python and Data Visualization

2012-12-03 Thread subhabangalore
Dear Group,

I am trying to work out a data visualization module.

Here,
I am taking a raw corpus, processing it 
linguistically (tokenization, tagging, NED recognition),
then trying to link the NEDs with Latent Semantic Analysis, Relationship 
Mining, network graph theory, or cluster analysis, and trying to visualize the 
result.

For NLP-based work I am using NLTK; LSA and Relationship Mining are also 
handled by NLTK.
For network graph theory I am using igraph/networkx, for cluster analysis I am 
using Pycluster/Cluster,
and if any extra visualization is required I am using matplotlib.

Now, am I going fine with the choice of software (I am using Python 2.7.3 on 
MS-Windows 7 with IDLE as GUI), or should I change any of them?

Do you have any suggestions?

Also, as this is the age of information visualization, I was wondering whether 
Python has a library for visual analytics that I do not know of, one that would 
extract information and visualize it.

If anyone can kindly suggest one, I would be grateful.

Thanking You in Advance,
Regards,
Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Conversion of List of Tuples

2012-12-03 Thread subhabangalore
On Tuesday, December 4, 2012 1:28:17 AM UTC+5:30, subhaba...@gmail.com wrote:
> Dear Group,
>
> I have a tuple of list as,
>
> tup_list=[(1,2), (3,4)]
> Now if I want to covert as a simple list,
>
> list=[1,2,3,4]
>
> how may I do that?
>
> If any one can kindly suggest? Googling didn't help much.
>
> Regards,
> Subhabrata.

Thanks. But the counter reads "5 posts 0 views"; could a moderator please 
check the issue?
-- 
http://mail.python.org/mailman/listinfo/python-list


Conversion of List of Tuples

2012-12-03 Thread subhabangalore
Dear Group,

I have a list of tuples as,

tup_list=[(1,2), (3,4)]
Now if I want to convert it to a simple list,

list=[1,2,3,4]

how may I do that?

If any one can kindly suggest? Googling didn't help much.
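
Two standard ways to flatten a list of tuples are itertools.chain.from_iterable and a nested list comprehension. A short sketch; the names flat and flat2 are hypothetical, chosen so that the built-in name list is not overwritten.

```python
from itertools import chain

tup_list = [(1, 2), (3, 4)]

# chain.from_iterable walks each tuple in turn and yields its items.
flat = list(chain.from_iterable(tup_list))

# The same result with a nested list comprehension.
flat2 = [item for pair in tup_list for item in pair]
```

Both give the items in their original order and work for tuples of any length.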

Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: List problem

2012-12-02 Thread subhabangalore
On Sunday, December 2, 2012 9:29:22 PM UTC+5:30, Thomas Bach wrote:
> On Sun, Dec 02, 2012 at 04:16:01PM +0100, Lutz Horn wrote:
> >
> > len([x for x in l if x[1] == 'VBD'])
> >
>
> Another way is
>
> sum(1 for x in l if x[1] == 'VBD')
>
> which saves the list creation.
>
> Regards,
>   Thomas.

Thanks. After I posted I got a solution as,
[x for x, y in enumerate(chunk_word) if "/VB" in y]
but you are smarter.
Thanks.
Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Splitting Tree

2012-12-02 Thread subhabangalore
On Sunday, December 2, 2012 5:39:32 PM UTC+5:30, subhaba...@gmail.com wrote:
> Dear Group,
>
> I am using NLTK and I used the following command,
>
> chunk=nltk.ne_chunk(tag)
> print "The Chunk of the Line Is:",chunk
>
> The Chunk of the Line Is: (S
>   ''/''
>   It/PRP
>   is/VBZ
>   virtually/RB
>   a/DT
>   homecoming/NN
>   ,/,
>   ''/''
>   said/VBD
>   (PERSON Gen/NNP Singh/NNP)
>   on/IN
>   arrival/NN)
>
> Now I am trying to split the output preferably by ",/,".
>
> But how would I split a Tree object in python.
>
> If I use command like,
> chunk_word=chunk.split()
>
> It is giving me the error as,
>
> File "C:/Python27/docstructure1.py", line 38, in document_structure1
> chunk1=chunk.split()
> AttributeError: 'Tree' object has no attribute 'split'
>
> If anyone of the learned members of the room can kindly help.
>
> Regards,
> Subhabrata.

Sorry to ask this. I converted it to a string and then split it.
-- 
http://mail.python.org/mailman/listinfo/python-list


List problem

2012-12-02 Thread subhabangalore
Dear Group,

I have a list of the following pattern,

[("''", "''"), ('Eastern', 'NNP'), ('Army', 'NNP'), ('Commander', 'NNP'), 
('Lt', 'NNP'), ('Gen', 'NNP'), ('Dalbir', 'NNP'), ('Singh', 'NNP'), ('Suhag', 
'NNP'), ('briefed', 'VBD'), ('the', 'DT'), ('Army', 'NNP'), ('chief', 'NN'), 
('on', 'IN'), ('the', 'DT'), ('operational', 'JJ'), ('preparedness', 'NN'), 
('and', 'CC'), ('the', 'DT'), ('security', 'NN'), ('scenario', 'NN'), ('in', 
'IN'), ('the', 'DT'), ('eastern', 'NN'), ('region', 'NN'), (',', ','), ("''", 
"''"), ('defence', 'NN'), ('spokesperson', 'NN'), ('Group', 'NNP'), ('Capt', 
'NNP'), ('T', 'NNP'), ('K', 'NNP'), ('Singha', 'NNP'), ('said', 'VBD'), 
('here', 'RB')]

Now, as we see, it has multiple VBD elements.
I want to recognize, count and index them all.

If anyone can kindly suggest a way, I would be grateful.
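
One way to do all three at once is enumerate() inside a list comprehension, which keeps the position of each match. The sketch below uses a shortened sample of the list above; the names tagged, vbd_hits and vbd_count are hypothetical.

```python
tagged = [('Suhag', 'NNP'), ('briefed', 'VBD'), ('the', 'DT'),
          ('Army', 'NNP'), ('chief', 'NN'), ('Singha', 'NNP'),
          ('said', 'VBD'), ('here', 'RB')]   # shortened sample of the list above

# (position, word) for every past-tense verb.
vbd_hits = [(i, word) for i, (word, tag) in enumerate(tagged) if tag == 'VBD']
vbd_count = len(vbd_hits)
```

For this sample vbd_hits pairs each VBD word with its index in the list, and vbd_count is the number of such words.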

Regards,
Subhabrata


-- 
http://mail.python.org/mailman/listinfo/python-list


Splitting Tree

2012-12-02 Thread subhabangalore
Dear Group,

I am using NLTK and I used the following command,

chunk=nltk.ne_chunk(tag)
print "The Chunk of the Line Is:",chunk


The Chunk of the Line Is: (S
  ''/''
  It/PRP
  is/VBZ
  virtually/RB
  a/DT
  homecoming/NN
  ,/,
  ''/''
  said/VBD
  (PERSON Gen/NNP Singh/NNP)
  on/IN
  arrival/NN)

Now I am trying to split the output, preferably at ",/,".

But how would I split a Tree object in Python?

If I use a command like,
chunk_word=chunk.split()

it gives me the error,

File "C:/Python27/docstructure1.py", line 38, in document_structure1
    chunk1=chunk.split()
AttributeError: 'Tree' object has no attribute 'split'
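
An NLTK Tree has no split() method, but it does have leaves(), which returns the tokens as a flat list. A plain list can then be split at the comma token. The sketch below does not import NLTK: the list leaves is typed in by hand as a stand-in for chunk.leaves(), assuming the leaves come back as (word, tag) pairs as the printed tree suggests, and split_at is a hypothetical helper.

```python
def split_at(items, separator):
    """Split a list into sub-lists at each occurrence of separator."""
    parts = [[]]
    for item in items:
        if item == separator:
            parts.append([])
        else:
            parts[-1].append(item)
    return parts

# Stand-in for chunk.leaves(): the (word, tag) pairs from the tree above.
leaves = [("''", "''"), ('It', 'PRP'), ('is', 'VBZ'), ('virtually', 'RB'),
          ('a', 'DT'), ('homecoming', 'NN'), (',', ','), ("''", "''"),
          ('said', 'VBD'), ('Gen', 'NNP'), ('Singh', 'NNP'), ('on', 'IN'),
          ('arrival', 'NN')]

before, after = split_at(leaves, (',', ','))
```

Note that flattening with leaves() drops the PERSON grouping; keeping it would mean walking the tree's children instead.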

If anyone of the learned members of the room can kindly help.

Regards,
Subhabrata. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Textmining

2012-12-01 Thread subhabangalore
On Saturday, December 1, 2012 5:13:17 AM UTC+5:30, Dave Angel wrote:
> On 11/30/2012 02:48 PM, subhabangal...@gmail.com wrote:
> > Dear Group,
> > Python has one textming library, but I am failing to install it in Windows.
> > If any one can kindly help.
> > Regards,
> > Subhabrata.
>
> Please think about what you're asking, if you want people to help you. 
> You say Python has a testming library,  But CPython's standard library
> version 3.3 does NOT have a library called testming.  Neither does 2.7
> in case you're running that one.  Now, maybe some other version of
> Python has other stuff in its standard library, or maybe it's only
> available on the Amiga port of python.  But you give no clues to which
> one it was.
>
> I repeated the search with a keyword of testmining, in case that's the
> actual name of the library.  Still in neither 2.7 nor 3.3.
>
> So I'm forced to gaze even closer into my crystal ball.  How about
>
> http://scripts.downloadroute.com/textmining-95300f9f.html
> http://www.testmine.com/
> http://webscripts.softpedia.com/script/E-Commerce/Catalogs/textmining-66084.html
> http://www.christianpeccei.com/projects/textmining/
> http://pybrary.net/pyPdf/
> http://code.activestate.com/recipes/511465/
> http://www.unixuser.org/~euske/python/pdfminer/index.html
>
> http://orange.biolab.si/
> http://www.amazon.com/Python-Text-Processing-NLTK-Cookbook/dp/1849513600?tag=duckduckgo-d-20
> http://linux.softpedia.com/get/Utilities/textmining-61802.shtml
> http://pypi.python.org/pypi/textmining
> http://orange-text.readthedocs.org/en/latest/
>
> I could call a mechanic, and tell him my car makes a funny noise, and no
> matter how hard I kick the right front tire the noise doesn't go away,
> and he's unlikely to be able to help.  He'd probably like some
> fundamental facts about the problem.  So do we.
>
> What version of Windows OS are you running?
> What version of what implementation of Python are you running?
> What library, located at what URL did you try to install?
> How did you try the installation?  What happened?  How did you know you
> failed?
>
> In many of these answers, you should paste actual program output rather
> than paraphrasing.  Certainly, if you got an exception, you should paste
> the entire stack trace.  And if you got that far, a minimal code example
> that shows the problem.
>
> -- 
>
> DaveA

Dear Group,

Python has one textmining library. 
[Sorry for the spelling mistake in the earlier post.]

As I see, it can be downloaded from,

http://pypi.python.org/pypi/textmining/1.0

I am running Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit 
(Intel)] on win32 with IDLE as GUI and Windows 7 as OS.

I do not understand how to download and use it.

If any one of the learned members in the room can kindly help with it.

Regards,
Subhabrata Banerjee.
-- 
http://mail.python.org/mailman/listinfo/python-list


Textmining

2012-11-30 Thread subhabangalore
Dear Group,
Python has one textming library, but I am failing to install it in Windows.
If any one can kindly help.
Regards,
Subhabrata.
-- 
http://mail.python.org/mailman/listinfo/python-list


Few Issues on Parsing and Visualization

2012-11-29 Thread subhabangalore
Dear Group,

I am looking for some Python-based natural language tools.

(i) Parsers (either syntactic or semantic). NLTK has them, but there I have to 
input the grammar. I am looking for a straight built-in library like the NLTK 
tagging module.

(ii) I am looking for some NER extraction tools. NLTK has one; I am looking for 
another. pyner is there, but I could not download it.

(iii) I am looking for a relation extraction tool. NLTK has one, but I am 
looking for another. 

(iv) I am looking for an information extraction library. I found GenSim but 
could not use it properly.

(v) I am looking for a visualization library. I found networkx, matplotlib and 
vtk, but I am looking for a visual analytics library.

I am looking for all these tools for use on Windows (XP/7) with Python 2.6/2.7.

Also, if anyone can kindly let me know how to use the Python binding of the 
Stanford Parser, I would be grateful.

Thanking in Advance,
Regards,
Subhabrata. 
 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Understanding Code

2012-11-16 Thread subhabangalore
On Tuesday, November 13, 2012 4:12:52 PM UTC+5:30, Peter Otten wrote:
> subhabangal...@gmail.com wrote:
>
> > Dear Group,
> > To improve my code writing I am trying to read good codes. Now, I have
> > received a code,as given below,(apology for slight indentation errors) the
> > code is running well. Now to comprehend the code, I am looking to
> > understand it completely.
> >
> > class Calculate:
> >     def __init__(self):
> >         self.prior = {}
> >         self.total = {}
> >         self.count = 0
> >     def add(self, cls, obs):
> >         self.prior[cls] = self.prior.get(cls, 0) + 1
> >         for idx, val in enumerate(obs):
> >             key = cls, idx, val
> >             self.total[key] = self.total.get(key, 0) + 1
> >         self.count += 1
> >     def discr(self, cls, obs):
> >         result = self.prior[cls]/self.count
> >         for idx, val in enumerate(obs):
> >             freq = self.total.get((cls, idx, val), 0)
> >             result *= freq/self.prior[cls]
> >         return result
> >     def classify(self, obs):
> >         candidates = [(self.discr(c, obs), c) for c in self.prior]
> >         return max(candidates)[1]
> >
> > I am not understanding many parts of it, I am understanding many parts of
> > it also.
> >
> > So I am looking for an exercise what are the things I should know to
> > understand it, (please do not give answers I would get back with the
> > answers in a week and would discuss even how to write better than this).
>
> Start with running the code for the simplest piece of the class:
>
> >>> c = Calculate()
> >>> c.add("x", [1,2,3])
>
> Then inspect the attributes:
>
> >>> c.prior
> {'x': 1}
> >>> c.total
> {('x', 2, 3): 1, ('x', 1, 2): 1, ('x', 0, 1): 1}
> >>> c.count
>
> Now read the code for Calculate.add(). Do you understand what
>
> > self.prior[cls] = self.prior.get(cls, 0) + 1
>
> does? Experiment with a dict and its get() method in the interactive 
> interpreter. Next to the loop.
>
> > for idx, val in enumerate(obs):
> >     key = cls, idx, val
> >     self.total[key] = self.total.get(key, 0) + 1
> > self.count += 1
>
> Do you understand what enumerate() does? If not read its documentation with
>
> >>> help(enumerate)
>
> Do you understand what key looks like? If you don't add a print statement
>
> > for idx, val in enumerate(obs):
> >     key = cls, idx, val
>       print key
> >     self.total[key] = self.total.get(key, 0) + 1
> > self.count += 1
>
> What does
>
> > self.total[key] = self.total.get(key, 0) + 1
>
> do? Note that this line is very similar to
>
> > self.prior[cls] = self.prior.get(cls, 0) + 1
>
> which you have studied before.
>
> > self.count += 1
>
> This like the rest of your class is left as an exercise. The routine is 
> always the same: 
>
> - break parts that you don't understand into smaller parts
> - consult the documentation on unknown classes, functions, methods, 
>   preferably with help(some_obj) or dir(some_obj)
> - run portions of the code or similar code in the interactive interpreter or 
>   with a little throw-away script.
> - add print statements to inspect variables at interesting points in your 
>   script.

Dear Sir,

Thank you for your kind guidance.
I tried to do the following exercises,

(i) On dict.get():

>>> tel = {'jack': 4098, 'sape': 4139, 'obama':3059,'blair':3301}
>>> dict.get('obama')

>>> tel.get('obama')
3059

>>> for i in tel:
        x1=tel.get(i)
        print x1


4139
3301
4098
3059
>>> 
>>> tel.get('blair',0)
3301
>>> 
>>> tel.get('blair',0)+1
3302
>>>  
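
The session above reads existing keys; the line in add() relies on the fallback value for keys that are not yet there. A minimal sketch of that counting idiom follows, written for Python 3 (print as a function), with made-up names used only for illustration:

```python
# Counting with dict.get(): a missing key is treated as 0 before adding 1.
# The names below are illustrative only; they are not from the class.
counts = {}
for name in ["obama", "blair", "obama"]:
    counts[name] = counts.get(name, 0) + 1

print(counts.get("obama"))     # 2
print(counts.get("jack", 0))   # 0, because "jack" was never counted
print(counts.get("jack"))      # None, the default when no fallback is given
```

Without the second argument, get() returns None for a missing key, and None + 1 would raise a TypeError. That seems to be why the code passes 0.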


(ii) On enumerate:
>>> list1=["Man","Woman","Gentleman","Lady","Sir","Madam"]
>>> for i,j in enumerate(list1):
        print i,j


0 Man
1 Woman
2 Gentleman
3 Lady
4 Sir
5 Madam
>>> 
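
To connect this with the class, here is a small sketch (Python 3 syntax) of how enumerate() can supply the position that goes into each key. The values of cls and obs are sample inputs chosen for illustration:

```python
# enumerate() pairs each item with its position, starting at 0.
cls = "x"          # sample class label (an assumption for this example)
obs = [1, 2, 3]    # sample observation (an assumption for this example)

# "cls, idx, val" without parentheses still builds a tuple,
# so each key is a 3-tuple of (label, position, value).
keys = [(cls, idx, val) for idx, val in enumerate(obs)]
print(keys)  # [('x', 0, 1), ('x', 1, 2), ('x', 2, 3)]
```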

(iii) Trying to check the values individually:
>>> class Calculate:
        def __init__(self):
            self.prior = {}
            self.total = {}
            self.count = 0
        def add(self, cls, obs):
            self.prior[cls] = self.prior.get(cls, 0) + 1
            for idx, val in enumerate(obs):
                key = cls, idx, val
                print key
                self.total[key] = self.total.get(key, 0) + 1
            self.count += 1


>>> x1=Calculate()
>>> x1.add("x", [1,2,3])
('x', 0, 1)
('x', 1, 2)
('x', 2, 3)
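
A further check, as a sketch only: calling add() twice with the same class shows which counters grow. This is Python 3 syntax, it covers only __init__ and add(), and it assumes that self.count += 1 sits outside the for loop (the indentation was lost in the post, so that placement is a guess). The second observation is invented for the example.

```python
class Calculate:
    def __init__(self):
        self.prior = {}   # class label -> number of observations with that label
        self.total = {}   # (label, position, value) -> how often it was seen
        self.count = 0    # number of observations added so far

    def add(self, cls, obs):
        self.prior[cls] = self.prior.get(cls, 0) + 1
        for idx, val in enumerate(obs):
            key = cls, idx, val
            self.total[key] = self.total.get(key, 0) + 1
        # Assumption: incremented once per call, not once per item.
        self.count += 1


c = Calculate()
c.add("x", [1, 2, 3])
c.add("x", [1, 5, 3])   # differs from the first observation at position 1

print(c.prior)                # {'x': 2}
print(c.total[("x", 0, 1)])   # 2
print(c.total[("x", 1, 2)])   # 1
print(c.count)                # 2
```

If self.count += 1 were inside the loop instead, the last line would print 6, so this is a quick way to tell the two readings apart.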

>>> class Calculate:

        def __init__(self):
            self.prior = {}
            self.total = {}
            self.count = 0

        def add(self, cls, obs):
            self.prior[cls] = self.prior.get(cls, 0) + 1
            for idx, val in enumerate(obs):
                key = cls, idx, val
                self.total[key] = self.total.get(key
