[Tutor] really basic py/regex

2018-03-30 Thread bruce
Hi.

Trying to quickly get re.match() to extract the groups from the string.

x="MATH 59900/40 [47490] - THE "

The regex has to return MATH, 59900, 40, and 47490

d=re.match(r'(\D+)...) gets the MATH...

But I can't see (yet) how to get the rest of what I need...

Pointers would be useful.

Thanks
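For reference, one pattern that captures all four fields. This is a sketch only: the field rules (uppercase letters, digit runs, the bracketed id) are assumptions inferred from the single sample string.

```python
import re

x = "MATH 59900/40 [47490] - THE "

# Capture: dept letters, course number, section after "/", id inside [ ]
m = re.match(r'([A-Z]+)\s+(\d+)/(\d+)\s+\[(\d+)\]', x)
print(m.groups())  # ('MATH', '59900', '40', '47490')
```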
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] prime factorisation

2018-03-12 Thread Bruce Todd Puls
I think you would do much better if you wrote pseudo code first,
i.e. write each step out in words;
code is much easier to write when following pseudo code.

Are you trying to factor prime numbers?
A prime number only factors as (the prime number and 1).


https://en.wikipedia.org/wiki/Table_of_prime_factors#1_to_100

https://www.mathsisfun.com/prime-factorization.html


[Tutor] vol 166, issue 20, 1. installing python and numpy on the Mac (OSX) (Peter Hodges)

2017-12-24 Thread Bruce Todd Puls
sudo -H python3.6 -m pip install numpy


Re: [Tutor] really basic question..

2017-08-05 Thread bruce
Lord...

Redid a search just now, and found a bunch of sites that said it's
doable.. embarrassed!

Not sure what I was looking for earlier.. need rum!



On Sat, Aug 5, 2017 at 11:44 AM, bruce <badoug...@gmail.com> wrote:
> Hey guys.
>
> A really basic question. I have the following:
>   try:
> element = WebDriverWait(driver,
> 100).until(EC.presence_of_element_located((By.ID,
> "remarketingStoreId")))
>   except TimeoutException:
> driver.close()
>
>
> I was wondering: can I do something like the following to handle
> "multiple" exceptions? I.e., have an "except" block that catches all
> issues other than the specific TimeoutException.
>
>   try:
> element = WebDriverWait(driver,
> 100).until(EC.presence_of_element_located((By.ID,
> "remarketingStoreId")))
>   except TimeoutException:
> driver.close()
>   except :
> driver.close()
>
>
> I've looked all over SO, as well as the net in general. I might have
> just missed what I was looking for though.
>
> Comments??  Thanks much.


[Tutor] really basic question..

2017-08-05 Thread bruce
Hey guys.

A really basic question. I have the following:
  try:
element = WebDriverWait(driver,
100).until(EC.presence_of_element_located((By.ID,
"remarketingStoreId")))
  except TimeoutException:
driver.close()


I was wondering: can I do something like the following to handle
"multiple" exceptions? I.e., have an "except" block that catches all
issues other than the specific TimeoutException.

  try:
element = WebDriverWait(driver,
100).until(EC.presence_of_element_located((By.ID,
"remarketingStoreId")))
  except TimeoutException:
driver.close()
  except :
driver.close()


I've looked all over SO, as well as the net in general. I might have
just missed what I was looking for though.

Comments??  Thanks much.
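Yes: a trailing handler catches everything the earlier ones did not. A sketch, with a stand-in TimeoutException class rather than selenium's own; note that a bare "except:" also works but additionally swallows KeyboardInterrupt and SystemExit, so "except Exception:" is usually the better catch-all.

```python
class TimeoutException(Exception):
    pass

def guarded(exc=None):
    try:
        if exc is not None:
            raise exc          # simulate the wait failing
        return "ok"
    except TimeoutException:
        return "timeout"       # the specific case, handled first
    except Exception:
        return "other"         # everything else lands here

print(guarded())                    # ok
print(guarded(TimeoutException()))  # timeout
print(guarded(ValueError()))        # other
```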


[Tutor] pythonic ascii decoding!

2017-07-31 Thread bruce
Hi guys.

Testing getting data from a number of different US based/targeted
websites. So the input data source, for the most part, will be "ascii".
I'm getting a few "weird" chars every now and then, and as far as I can
tell, they should be utf-8.

However, the following hasn't always worked:
s=str(s).decode('utf-8').strip()

So, is there a quick/dirty approach I can use to simply strip out the
"non-ascii" chars. I know, this might not be the "best/pythonic" way,
and that it might result in loss of some data/chars, but I can live
with it for now.

thoughts/comments ??

thanks
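One quick/dirty approach, lossy as the post accepts: encode to ASCII with errors ignored. Shown in Python 3 syntax; on Python 2 the bytes would be decoded to unicode first.

```python
# Sample string with two non-ASCII characters (e-acute and an en dash)
s = u'caf\xe9 \u2013 menu'

# Non-ASCII codepoints are simply dropped
ascii_only = s.encode('ascii', 'ignore').decode('ascii')
print(ascii_only)  # caf  menu
```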


[Tutor] basic decorator question

2017-07-24 Thread bruce
Hi.

I've seen sites discuss decorators, as functions that "wrap" and
return functions.

But, I'm sooo confused! My real question though: can a decorator have
multiple internal functions? All the examples I've seen so far have a
single internal function.

And, if a decorator can have multiple internal functions, how would
the calling sequence work?

But as a start, if you have pointers to any really "basic" step by
step sites/examples I can look at, I'd appreciate it. I suspect I'm
getting flummoxed by something simple.

thanks
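To the question itself: yes, a decorator may define any number of internal functions; only the one it returns replaces the decorated function, and the others are just ordinary nested helpers called along the way. A small sketch (the logging behavior here is invented for illustration):

```python
import functools

def logged(func):
    def format_args(args, kwargs):            # helper: never returned
        parts = [repr(a) for a in args]
        parts += ['%s=%r' % kv for kv in kwargs.items()]
        return ', '.join(parts)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):             # this one is returned
        print('calling %s(%s)' % (func.__name__, format_args(args, kwargs)))
        return func(*args, **kwargs)

    return wrapper

@logged
def add(a, b):
    return a + b

print(add(2, 3))  # prints the call line, then 5
```

The calling sequence is unchanged: callers see only `wrapper`, and `wrapper` decides when (or whether) to call the helpers and the original function.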


[Tutor] centos 7 - new setup.. weird python!

2017-07-19 Thread bruce
Hi.

Testing setting up a new CentOS 7 instance.

I ran python -v from the cmdline... and instantly got a bunch of the
following! Pretty sure this isn't correct.

Anyone able to give pointers as to what I've missed?

thanks

python -v
# installing zipimport hook
import zipimport # builtin
# installed zipimport hook
# /usr/lib64/python2.7/site.pyc matches /usr/lib64/python2.7/site.py
import site # precompiled from /usr/lib64/python2.7/site.pyc
# /usr/lib64/python2.7/os.pyc matches /usr/lib64/python2.7/os.py
import os # precompiled from /usr/lib64/python2.7/os.pyc
.
.
.


[Tutor] using sudo pip install

2017-04-20 Thread bruce
Hey guys..

Wanted to get thoughts?

On an IRC chat.. someone stated emphatically...

Never do a "sudo pip install --upgrade..."

The claim was that it could cause issues, enough to seriously
(possibly) damage the OS..

So, is this true??
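The advice is common because sudo pip writes into the same directories the distribution's package manager owns, so an upgrade can overwrite distro-managed files and break system tooling. The usual alternatives are a virtualenv or `pip install --user`. A small sketch of how code can tell whether it is running inside a virtualenv:

```python
import sys

def in_virtualenv():
    # In a venv, sys.prefix points at the venv while sys.base_prefix
    # (Python 3; sys.real_prefix under old virtualenv) still points at
    # the system installation.
    base = getattr(sys, 'base_prefix', None) or getattr(sys, 'real_prefix', sys.prefix)
    return base != sys.prefix

print(in_virtualenv())
```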


Re: [Tutor] subprocess.Popen / proc.communicate issue

2017-03-31 Thread bruce
Cameron!!!

You are 'da man!!

Read your explanation.. good stuff to recheck/test and investigate
over time

In the short term, I'll implement some tests!!

thanks!


On Thu, Mar 30, 2017 at 6:51 PM, Cameron Simpson <c...@zip.com.au> wrote:
> I wrote a long description of how .communicate can deadlock.
>
> Then I read the doco more carefully and saw this:
>
>  Warning: Use communicate() rather than .stdin.write, .stdout.read
>  or .stderr.read to avoid deadlocks due to any of the other OS
>  pipe buffers filling up and blocking the child process.
>
> This suggests that .communicate uses Threads to send and to gather data
> independently, and that therefore the deadlock situation may not arise.
>
> See what lsof and strace tell you; all my other advice stands regardless,
> and
> the deadlock description may or may not be relevant. Still worth reading and
> understanding it when looking at this kind of problem.
>
> Cheers,
> Cameron Simpson <c...@zip.com.au>
>
>
> On 31Mar2017 09:43, Cameron Simpson <c...@zip.com.au> wrote:
>>
>> On 30Mar2017 13:51, bruce <badoug...@gmail.com> wrote:
>>>
>>> Trying to understand the "correct" way to run a sys command ("curl")
>>> and to get the potential stderr. Checking Stackoverflow (SO), implies
>>> that I should be able to use a raw/text cmd, with "shell=true".
>>
>>
>> I strongly recommend avoiding shell=True if you can. It has many problems.
>> All stackoverflow advice needs to be considered with caution. However, that
>> is not the source of your deadlock.
>>
>>> If I leave the stderr out, and just use
>>>s=proc.communicate()
>>> the test works...
>>>
>>> Any pointers on what I might inspect to figure out why this hangs on
>>> the proc.communicate process/line??
>>
>>
>> When it is hung, run "lsof" on the processes from another terminal i.e.
>> lsof the python process and also lsof the curl process. That will make clear
>> the connections between them, particularly which file descriptors ("fd"s)
>> are associated with what.
>>
>> Then run "strace" on the processes. That should show you what system calls
>> are in progress in each process.
>>
>> My expectation is that you will see Python reading from one file
>> descriptor and curl writing to a different one, and neither progressing.
>>
>> Personally I avoid .communicate and do more work myself, largely to know
>> precisely what is going on with my subprocesses.
>>
>> The difficulty with .communicate is that Python must read both stderr and
>> stdout separately, but it will be doing that sequentially: read one, then
>> read the other. That is just great if the command is "short" and writes a
>> small enough amount of data to each. The command runs, writes, and exits.
>> Python reads one and sees EOF after the data, because the command has
>> exited. Then Python reads the other and collects the data and sees EOF
>> because the command has exited.
>>
>> However, if the output of the command is large on whatever stream Python
>> reads _second_, the command will stall writing to that stream. This is
>> because Python is not reading the data, and therefore the buffers fill
>> (stdio in curl plus the buffer in the pipe). So the command ("curl") stalls
>> waiting for data to be consumed from the buffers. And because it has
>> stalled, the command does not exit, and therefore Python does not see EOF on
>> the _first_ stream. So it sits waiting for more data, never reading from the
>> second stream.
>>
>> [...snip...]
>>>
>>> cmd='[r" curl -sS '
>>> #cmd=cmd+'-A  "Mozilla/5.0 (X11; Linux x86_64; rv:38.0)
>>> Gecko/20100101 Firefox/38.0"'
>>> cmd=cmd+"-A  '"+user_agent+"'"
>>> ##cmd=cmd+'   --cookie-jar '+cname+' --cookie '+cname+''
>>> cmd=cmd+'   --cookie-jar '+ff+' --cookie '+ff+''
>>> #cmd=cmd+'-e "'+referer+'"   -d "'+tt+'"  '
>>> #cmd=cmd+'-e "'+referer+'"'
>>> cmd=cmd+"-L '"+url1+"'"+'"]'
>>> #cmd=cmd+'-L "'+xx+'" '
>>
>>
>> Might I recommand something like this:
>>
>> cmd_args = [ 'curl', '-sS' ]
>> cmd_args.extend( [ '-A', user_agent ] )
>> cmd_args.extend( [ '--cookie-jar', ff, '--cookie', ff ] )
>> cmd_args.extend( [ '-L', url ] )
>>
>> and using shell=False. This totally avoids any need to "quote" strings in
>> the command, because the arguments are passed to curl directly, with no
>> shell parsing.

[Tutor] test

2017-03-30 Thread bruce
sent a question earlier.. and got a reply saying it was in the
moderation process???


[Tutor] subprocess.Popen / proc.communicate issue

2017-03-30 Thread bruce
Trying to understand the "correct" way to run a sys command ("curl")
and to get the potential stderr. Checking Stackoverflow (SO), implies
that I should be able to use a raw/text cmd, with "shell=true".

If I leave the stderr out, and just use
 s=proc.communicate()
the test works...

Any pointers on what I might inspect to figure out why this hangs on
the proc.communicate process/line??

I'm showing a very small chunk of the test, but its the relevant piece.

Thanks


.
.
.

  cmd='[r" curl -sS '
  #cmd=cmd+'-A  "Mozilla/5.0 (X11; Linux x86_64; rv:38.0)
Gecko/20100101 Firefox/38.0"'
  cmd=cmd+"-A  '"+user_agent+"'"
  ##cmd=cmd+'   --cookie-jar '+cname+' --cookie '+cname+''
  cmd=cmd+'   --cookie-jar '+ff+' --cookie '+ff+''
  #cmd=cmd+'-e "'+referer+'"   -d "'+tt+'"  '
  #cmd=cmd+'-e "'+referer+'"'
  cmd=cmd+"-L '"+url1+"'"+'"]'
  #cmd=cmd+'-L "'+xx+'" '

  try_=1
  while(try_):
proc=subprocess.Popen(cmd,
shell=True,stdout=subprocess.PIPE,stderr=subprocess.PIPE)
s,err=proc.communicate()
s=s.strip()
err=err.strip()

if(err==0):
  try_=''

.
.
.

the cmd is generated to be:
cmd=[r" curl -sS -A  'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT
6.1; Trident/5.0; yie8)'   --cookie-jar
/crawl_tmp/fetchContentDir/12f5e67c_156e_11e7_9c09_3a9e85f3c88e.lwp
--cookie /crawl_tmp/fetchContentDir/12f5e67c_156e_11e7_9c09_3a9e85f3c88e.lwp
   -L 
'http://www6.austincc.edu/schedule/index.php?op=browse=ViewSched=216F000=PCACC=2016=CC'"]




test code hangs, ctrl-C generates the following:
^CTraceback (most recent call last):
  File "/crawl_tmp/austinccFetch_cloud_test.py", line 3363, in 
ret=fetchClassSectionFacultyPage(a)
  File "/crawl_tmp/austinccFetch_cloud_test.py", line 978, in
fetchClassSectionFacultyPage
(s,err)=proc.communicate()
  File "/usr/lib64/python2.6/subprocess.py", line 732, in communicate
stdout, stderr = self._communicate(input, endtime)
  File "/usr/lib64/python2.6/subprocess.py", line 1328, in _communicate
stdout, stderr = self._communicate_with_poll(input, endtime)
  File "/usr/lib64/python2.6/subprocess.py", line 1400, in
_communicate_with_poll
ready = poller.poll(self._remaining_time(endtime))
KeyboardInterrupt



This works from the cmdline:
curl -sS -A  'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1;
Trident/5.0; yie8)'   --cookie-jar
/crawl_tmp/fetchContentDir/12f5e67c_156e_11e7_9c09_3a9e85f3c88e.lwp
--cookie /crawl_tmp/fetchContentDir/12f5e67c_156e_11e7_9c09_3a9e85f3c88e.lwp
   -L 
'http://www6.austincc.edu/schedule/index.php?op=browse=ViewSched=216F000=PCACC=2016=CC'
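Following Cameron's suggestion in the reply above, the argument-list style with shell=False removes the quoting problem entirely. A runnable sketch, with a python one-liner standing in for curl since no network is assumed here:

```python
import subprocess
import sys

# The child writes to both streams, like curl with -sS would on error.
args = [sys.executable, '-c',
        'import sys; sys.stdout.write("body"); sys.stderr.write("warn")']

proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = proc.communicate()
print(out.decode(), err.decode())  # body warn
```

With a list of arguments there is nothing to quote: each element reaches the child process verbatim.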


[Tutor] implementing sed - termination error

2016-11-01 Thread bruce
Hi

Running a test on a linux box, with python.

Trying to do a search/replace over a file, for a given string, and
replacing the string with a chunk of text that has multiple lines.

From the cmdline, using sed, no prob. However, implementing sed runs
into issues that result in a "termination error".

The error gets thrown due to the "\" of the newline. SO, and other
sites, have plenty to say about this, but I haven't run across any solution.

The test file contains 6K lines, but, the process requires doing lots
of search/replace operations, so I'm interested in testing this method
to see how "fast" the overall process is.

The following pseudo code is what I've used to test. The key point
being changing the "\n" portion to try to resolve the termination
error.


import subprocess


ll_="ffdfdfdfg"
ll2_="12112121212121212"
hash="a"

data_=ll_+"\n"+ll2_+"\n"+qq22_
print data_

cc='sed -i "s/'+hash+'/'+data_+'/g" '+dname
print cc

proc=subprocess.Popen(cc, shell=True,stdout=subprocess.PIPE)
res=proc.communicate()[0].strip()



===
error
sed: -e expression #1, char 38: unterminated `s' command
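The "unterminated `s' command" comes from the raw newlines in the replacement text: sed needs each embedded newline escaped. One way around it is to skip sed entirely and do the multi-line replacement in Python, where nothing needs escaping. A sketch on strings (the marker token "@@" and file handling are assumptions for illustration):

```python
ll_ = "ffdfdfdfg"
ll2_ = "12112121212121212"
data_ = ll_ + "\n" + ll2_      # multi-line replacement text, no escaping

text = "line1\n@@\nline3"      # stands in for the file contents
result = text.replace("@@", data_)
print(result)
```

For a real file, read it once, call .replace(), and write it back; for 6K lines that is typically faster than spawning a sed process per substitution.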


[Tutor] a bit off topic.. - more of a high level arch question!

2016-10-21 Thread bruce
Hi.

Thinking of a situation where I have two "processes" running. They
each want to operate on a list of files in the dir on a first come
first operate basis. Once a process finishes with the file, it deletes
it.

Only one process operates on a file.

I'm curious for ideas/thoughts.

As far as I can tell, using some sort of PID/Lock file is "the" way of
handling this.

ProcessA looks to see if the PIDFile is in use,
 If it is, I wait a "bit"
 if the PIDFile is "empty", I set it and proceed
   --when I finish my work, I reset the PIDFile


As long as both/all processes follow this logic,
 things should work, unless you get a "race" condition
 on the PIDFile..

Any thoughts on how you might handle this kind of situation, short of
having a master process, that forks/spawns of children, with the
master iterating through the list of files..

Thanks..
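One way to avoid the race on the lock file itself: os.open with O_CREAT | O_EXCL makes creating the lock atomic, so exactly one process can claim a given work file. A sketch; the ".lock" suffix convention is an assumption for illustration.

```python
import errno
import os
import tempfile

def try_claim(workfile):
    try:
        # O_EXCL makes this fail if the lock already exists: atomic claim
        fd = os.open(workfile + '.lock', os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except OSError as e:
        if e.errno == errno.EEXIST:
            return False       # another process holds it; wait a bit
        raise
    os.write(fd, str(os.getpid()).encode())  # record who holds it
    os.close(fd)
    return True

def release(workfile):
    os.remove(workfile + '.lock')

workfile = os.path.join(tempfile.mkdtemp(), 'job1')
print(try_claim(workfile))   # True  (first claim wins)
print(try_claim(workfile))   # False (second claim is refused)
release(workfile)
```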


[Tutor] selenium bindings...

2016-10-18 Thread bruce
Hi.

This is prob way off topic.

Looking at web examples from different sites for selenium/python
bindings. Basically, trying to get an understanding of how to get the
"page" content of a page, after an implicit/explicit wait.

I can see how to get an element, but can't see any site that describes
how to get the complete page...

As an example of getting an element...


from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("http://somedomain/url_that_delays_loading")
try:
element = WebDriverWait(driver, 10).until(
EC.presence_of_element_located((By.ID, "myDynamicElement"))
)
finally:
driver.quit()


But, as to getting the complete page, in the "try".. no clue.

Any thoughts/pointers??

thanks
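The whole rendered page is exposed as driver.page_source, a selenium WebDriver property, which can be read inside the "try" once the wait returns. No browser is available here, so a stand-in object illustrates where the access goes:

```python
# Stand-in for a selenium WebDriver; with the real thing, read
# driver.page_source after WebDriverWait(...).until(...) returns.
class FakeDriver(object):
    page_source = "<html><body>loaded</body></html>"
    def quit(self):
        pass

driver = FakeDriver()
try:
    html = driver.page_source   # the complete page as one string
finally:
    driver.quit()
print(html)
```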


[Tutor] xpath - html entities issue --

2016-10-04 Thread bruce
Hi.

Just realized I might have a prob with testing a crawl.

I get a page of data via a basic curl. The returned data is
html/charset-utf-8.

I did a quick replace ('&amp;', '&') and it replaced the '&amp;' as desired.
So the content only had '&' in it..

I then did a parseString/xpath to extract what I wanted, and realized I
have '&amp;' as representative of the '&' in the returned xpath content.

My issue: is there a way/method/etc, to only return the actual char, not
the html entity ('&amp;')?

I can provide a more comprehensive chunk of code, but minimized the post to
get to the heart of the issue. Also, I'd prefer not to use a sep parse lib.


code chunk

import libxml2dom

q1=libxml2dom

s2= q1.parseString(a.toString().strip(), html=1)
tt=s2.xpath(tpath)

tt=tt[0].toString().strip()
print "tit "+tt

-


the content of a.toString() (shortened)
.
.
.
 

Organization
Development & Change
Edition: 10th



.
.
.

the xpath results are



Organization
Development &amp; Change
Edition: 10th



As you can see, in the results of the xpath (toString())
 the & --> &amp;

I'm wondering if there's a process that can be used within the toString()
or do you really have to wrap each xpath/toString with an unescape() kind of
process to convert htmlentities to the requisite chars.

Thanks
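If wrapping the result does turn out to be necessary, the stdlib can do the entity-to-character conversion without a separate parsing lib: html.unescape on Python 3, HTMLParser().unescape on Python 2. A sketch:

```python
try:
    from html import unescape              # Python 3.4+
except ImportError:
    from HTMLParser import HTMLParser      # Python 2
    unescape = HTMLParser().unescape

# Entities in the extracted text become plain characters
print(unescape("Organization Development &amp; Change"))
# Organization Development & Change
```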


Re: [Tutor] unicode decode/encode issue

2016-09-26 Thread bruce
Hey folks. (peter!)

Thanks for the reply.

I wound up doing:

  #s=s.replace('\u2013', '-')
  #s=s.replace(u'\u2013', '-')
  #s=s.replace(u"\u2013", "-")
  #s=re.sub(u"\u2013", "-", s)
  s=s.encode("ascii", "ignore")
  s=s.replace(u"\u2013", "-")
  s=s.replace("", "-")  ##<<< this was actually in the raw content
apparently

  print repr(s)

The test no longer has the unicode 'dash'

I'll revisit and simplify later. One or two of the above lines should be
able to be removed, and still have the unicode issue resolved.

Thanks


On Mon, Sep 26, 2016 at 1:54 PM, Peter Otten <__pete...@web.de> wrote:

> bruce wrote:
>
> > Hi.
> >
> > I've got a "basic" situation that should be simple. So it must be a user
> > (me) issue!
> >
> >
> > I've got a page from a web fetch. I'm simply trying to go from utf-8 to
> > ascii. I'm not worried about any cruft that might get stripped out as the
> > data is generated from a us site. (It's a college/class dataset).
> >
> > I know this is a unicode issue. I know I need to have a much more
> > robust/pythonic/correct approach. I will later, but for now, just want to
> > resolve this issue, and get it off my plate so to speak.
> >
> > I've looked at stackoverflow, as well as numerous other sites, so I turn
> > to the group for a pointer or two...
> >
> > The unicode that I'm dealing with is u'\u2013'
> >
> > The basic things I've done up to now are:
> >
> >   s=content
> >   s=ascii_strip(s)
> >   s=s.replace('\u2013', '-')
> >   s=s.replace(u'\u2013', '-')
> >   s=s.replace(u"\u2013", "-")
> >   s=re.sub(u"\u2013", "-", s)
> >   print repr(s)
> >
> > When I look at the input content, I have :
> >
> >  u'English 120 Course Syllabus \u2013 Fall \u2013 2006'
> >
> > So, any pointers on replacing the \u2013 with a simple '-' (dash) (or I
> > could even handle just a ' ' (space)
>
> I suppose you want to replace the DASH with HYPHEN-MINUS. For that both
>
> >   s=s.replace(u'\u2013', '-')
> >   s=s.replace(u"\u2013", "-")
>
> should work (the Python interpreter sees no difference between the two).
> Let's try:
>
> >>> s = u'English 120 Course Syllabus \u2013 Fall \u2013 2006'
> >>> t = s.replace(u"\u2013", "-")
> >>> s == t
> False
> >>> s
> u'English 120 Course Syllabus \u2013 Fall \u2013 2006'
> >>> t
> u'English 120 Course Syllabus - Fall - 2006'
>
> So it looks like you did not actually try the code you posted.
>
> To remove all non-ascii codepoints you can use encode():
>
> >>> s.encode("ascii", "ignore")
> 'English 120 Course Syllabus  Fall  2006'
>
> (Note that the result is a byte string)
>
>


[Tutor] unicode decode/encode issue

2016-09-26 Thread bruce
Hi.

I've got a "basic" situation that should be simple. So it must be a user (me)
issue!


I've got a page from a web fetch. I'm simply trying to go from utf-8 to
ascii. I'm not worried about any cruft that might get stripped out as the
data is generated from a us site. (It's a college/class dataset).

I know this is a unicode issue. I know I need to have a much more
robust/pythonic/correct approach. I will later, but for now, just want to
resolve this issue, and get it off my plate so to speak.

I've looked at stackoverflow, as well as numerous other sites, so I turn to
the group for a pointer or two...

The unicode that I'm dealing with is u'\u2013'

The basic things I've done up to now are:

  s=content
  s=ascii_strip(s)
  s=s.replace('\u2013', '-')
  s=s.replace(u'\u2013', '-')
  s=s.replace(u"\u2013", "-")
  s=re.sub(u"\u2013", "-", s)
  print repr(s)

When I look at the input content, I have :

 u'English 120 Course Syllabus \u2013 Fall \u2013 2006'

So, any pointers on replacing the \u2013 with a simple '-' (dash) (or I
could even handle just a ' ' (space)

thanks
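As Peter's reply in the thread above demonstrates, the replace itself works once s really is a unicode string; a bytes-vs-unicode mismatch is the usual reason such a replace silently does nothing. A minimal check:

```python
s = u'English 120 Course Syllabus \u2013 Fall \u2013 2006'

# On a unicode string, replacing the EN DASH with HYPHEN-MINUS just works
t = s.replace(u'\u2013', u'-')
print(t)  # English 120 Course Syllabus - Fall - 2006
```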


Re: [Tutor] Unable to download , using Beautifulsoup

2016-07-29 Thread bruce
Hey Alan...

Wow APIs.. yeah.. would be cool!!!

I've worked on scraping data from lots of public sites that have no issue
with it (as long as you're kind) but have no clue/resources regarding
offering APIs.

However, yeah, if you're looking to "rip off" a site that has adverts,
prob not a cool thing to do, no matter what tools are used.



On Fri, Jul 29, 2016 at 6:59 PM, Alan Gauld via Tutor <tutor@python.org>
wrote:

> On 29/07/16 23:10, bruce wrote:
>
> > The most "complete" is the use of a headless browser. However, the
> use/implementation of a headless browser has its own share of issues.
> > Speed, complexity, etc...
>
> Walter and Bruce have jumped ahead a few steps from where I was
> heading but basically it's an increasingly common scenario where
> web pages are no longer primarily html but rather are
> Javascript programs that fetch data dynamically.
>
> A headless browser is the brute force way to deal with such issues
> but a better (purer?) way is to access the same API that the browser
> is using. Many web sites now publish RESTful APIs with web
> services that you can call directly. It is worth investigating
> whether your target has this. If so that will generally provide
> a much nicer solution than trying to drive a headless browser.
>
> Finally you need to consider whether you have the right to the
> data without running a browser? Many sites provide information
> for free but get paid by adverts. If you bypass the web screen
> (adverts) you bypass their revenue and they do not allow that.
> So you need to be sure that you are legally entitled to scrape
> data from the site or use an API.
>
> Otherwise you may be on the wrong end of a lawsuit, or at
> best be contributing to the demise of the very site you are
> trying to use.
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
>
>


Re: [Tutor] Unable to download , using Beautifulsoup

2016-07-29 Thread bruce
In following up/on what Walter said.

If the browser without cookies/javascript enabled doesn't generate the
content, you need to have a different approach.

The most "complete" is the use of a headless browser. However, the
use/implementation of a headless browser has its own share of issues.
Speed, complexity, etc...

A potentially better/useful method is to view/look at the traffic
(livehttpheaders for Firefox) to get a feel for exactly what the browser
requires. At the same time, view the subordinate jscript functions.

I've found it's often enough to craft the requisite cookies/curl functions
in order to simulate the browser data.

In a few cases though, I've run across situations where a headless browser
is the only real solution.



On Fri, Jul 29, 2016 at 3:28 AM, Crusier  wrote:

> I am using Python 3 on Windows 7.
>
> However, I am unable to download some of the data listed in the web
> site as follows:
>
> http://data.tsci.com.cn/stock/00939/STK_Broker.htm
>
> 453.IMC 98.28M 18.44M 4.32 5.33 1499.Optiver 70.91M 13.29M 3.12 5.34
> 7387.花旗环球 52.72M 9.84M 2.32 5.36
>
> When I use Google Chrome and use 'View Page Source', the data does not
> show up at all. However, when I use 'Inspect', I can able to read the
> data.
>
> '1453.IMC'
> '98.28M'
> '18.44M'
> '4.32'
> '5.33'
>
> '1499.Optiver '
> ' 70.91M'
> '13.29M '
> '3.12'
> '5.34'
>
> Please kindly explain to me if the data is hide in CSS Style sheet or
> is there any way to retrieve the data listed.
>
> Thank you
>
> Regards, Crusier
>
> from bs4 import BeautifulSoup
> import urllib
> import requests
>
>
>
>
> stock_code = ('00939', '0001')
>
> def web_scraper(stock_code):
>
> broker_url = 'http://data.tsci.com.cn/stock/'
> end_url = '/STK_Broker.htm'
>
> for code in stock_code:
>
> new_url  = broker_url + code + end_url
> response = requests.get(new_url)
> html = response.content
> soup = BeautifulSoup(html, "html.parser")
> Buylist = soup.find_all('div', id ="BuyingSeats")
> Selllist = soup.find_all('div', id ="SellSeats")
>
>
> print(Buylist)
> print(Selllist)
>
>
>
> web_scraper(stock_code)


Re: [Tutor] Counting and grouping dictionary values in Python 2.7

2016-07-13 Thread Bruce Dykes
On Fri, Jul 8, 2016 at 1:33 PM, Alan Gauld via Tutor <tutor@python.org>
wrote:

> On 08/07/16 14:22, Bruce Dykes wrote:
>
> > with it is writing the list of dictionaries to a .csv file, and to date,
> > we've been able to get by doing some basic analysis by simply using grep
> > and wc, but I need to do more with it now.
>
> I'm a big fan of using the right tool for the job.
> If you got your data in CSV have you considered using a
> spreadsheet to read the data and analyse it? They have lots
> of formulae and stats functions built in and can do really
> cool graphs etc and can read csv files natively.
>
> Python might be a better tool if you want regular identical reports, say
> on a daily basis, but for ad-hoc analysis, or at least till you know
> exactly what you need, Excel or Calc are possibly better tools.
>
>
>
We can and have used spreadsheets for small ad-hoc things, but no, we need
two things, first, as noted, a daily report with various basic analyses,
mainly totals, and percentages, and second, possibly, some near-current
alarm checks, depending. That's less important, actually, but it might be a
nice convenience. In the first instance, we want the reports to be accessed
and displayed as web pages. Now, likewise, I'm sure there's a CMS that
might make semi-quick work of this as well, but really, all I need to do is
to display some web pages and run some cgi scripts.

bkd


[Tutor] Counting and grouping dictionary values in Python 2.7

2016-07-08 Thread Bruce Dykes
I'm compiling application logs from a bunch of servers, reading the log
entries, parsing each log entry into a dictionary, and compiling all the
log entries into a single list of dictionaries. At present, all I'm doing
with it is writing the list of dictionaries to a .csv file, and to date,
we've been able to get by doing some basic analysis by simply using grep
and wc, but I need to do more with it now.

Here's what the data structures look like:

NY = ['BX01','BX02','BK01','MN01','SI01']
NJ = ['NW01','PT01','PT02']
CT = ['ST01','BP01','NH01']

sales = [
{'store':'store','date':'date','time':'time','state':'state','transid':'transid','product':'product','price':'price'},
{'store':'BX01','date':'8','time':'08:55','state':'NY','transid':'387','product':'soup','price':'2.59'},
{'store':'NW01','date':'8','time':'08:57','state':'NJ','transid':'24','product':'apples','price':'1.87'},
{'store':'BX01','date':'8','time':'08:56','state':'NY','transid':'387','product':'crackers','price':'3.44'}]

The first group of list with the state abbreviations is there to add the
state information to the compiled log, as it's not included in the
application log. The first dictionary in the list, with the duplicated key
names in the value field is there to provide a header line as the first
line in the compiled .csv file.

Now, what I need to do with this is arbitrarily count and total the values in
the dictionaries, ie the total amount and number of items for transaction
id 387, or the total number of crackers sold in NJ stores. I think the
collections library has the functions I need, but I haven't been able to
grok the example uses I've seen online. Likewise, I know I could build a
lot of what I need using regex and lists, etc, but if Python 2.7 already
has the blocks there to be used, well let's use the blocks then.

Also, is there any particular advantage to pickling the list and having two
files, one, the pickled file to be read as a data source, and the .csv file
for portability/readability, as opposed to just a single .csv file that
gets reparsed by the reporting script?

Thanks in advance
bkd
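The collections module does cover both asks: Counter for counts per key, defaultdict for running totals. A sketch against the sample rows above (with the header row assumed to be skipped first):

```python
from collections import Counter, defaultdict

sales = [
    {'store': 'BX01', 'state': 'NY', 'transid': '387', 'product': 'soup', 'price': '2.59'},
    {'store': 'NW01', 'state': 'NJ', 'transid': '24', 'product': 'apples', 'price': '1.87'},
    {'store': 'BX01', 'state': 'NY', 'transid': '387', 'product': 'crackers', 'price': '3.44'},
]

# number of items per transaction id
items = Counter(row['transid'] for row in sales)

# dollar total per transaction id
totals = defaultdict(float)
for row in sales:
    totals[row['transid']] += float(row['price'])

print(items['387'])             # 2
print(round(totals['387'], 2))  # 6.03
```

The same pattern answers the "crackers sold in NJ" question: filter the rows first, then count or total.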


[Tutor] decorators -- treat me like i'm 6.. what are they.. why are they?

2016-07-06 Thread bruce
Hi.

Saw the decorator thread earlier.. didn't want to pollute it. I know, I
could google!

But what are decorators, and why do they exist? Who decided you needed them!
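In short: a decorator is just a function that takes a function and returns a (usually wrapped) replacement; the @name line is shorthand for reassigning the name. A tiny sketch:

```python
def shout(func):
    # wrapper replaces the decorated function
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper

@shout                  # same as: greet = shout(greet)
def greet(name):
    return "hello " + name

print(greet("world"))   # HELLO WORLD
```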

Thanks!


[Tutor] simple regex question

2016-05-01 Thread bruce
Hi. I have a chunk of text code, which has multiple lines.

I'd like to do a regex, find a pattern, and in the line that matches the
pattern, mod the line. Sounds simple.

I've created a test regex. However, after spending time/google.. can't
quite figure out how to then get the "complete" line containing the
returned regex/pattern.

Pretty sure this is simple, and i'm just missing something.

my test "text" and regex are:


  s='''
ACCT2081'''


  pattern = re.compile(r'Course\S+|\S+\|')
  aa= pattern.search(s).group()
  print "sss"
  print aa

so, once I get the group, I'd like to use the returned match to then get
the complete line..

pointers/thoughts!! (no laughing!!)

thanks guys..
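One way to get the whole line once the pattern matches is to widen the pattern itself to the full line with ^/$ and re.MULTILINE. A sketch with a made-up sample string (the real HTML was stripped by the list archive):

```python
import re

# Hypothetical multi-line text standing in for the real page source.
s = "header\nCourseID|ACCT2081|spring\nfooter\n"

# Widen the pattern to the whole line: ^ and $ match at line boundaries
# when re.MULTILINE is set, and . never crosses a newline.
m = re.search(r'^.*Course\S*.*$', s, re.MULTILINE)
if m:
    print(m.group())   # CourseID|ACCT2081|spring
```

The alternative is to keep the original pattern and slice the source between the nearest newlines around m.start(), but the MULTILINE trick is usually shorter.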


[Tutor] Py/selenium bindings

2016-02-28 Thread bruce
Hi.

This might be a bit beyond the group, but I figured no harm/no foul.

I'm looking to access a site that's generated via javascript. The jscript
of the site is invoked via the browser, generating the displayed content.

I'm testing using py/selenium bindings as docs indicate that the
py/selenium/PhantomJS browser (headless) combination should invoke the
jscript, and result in the required content.

However, I can't seem to generate anything other than the initial encrypted
page/content.

Any thoughts/comments would be useful.

The test script is:

#!/usr/bin/python
#-
#
#FileName:
#udel_sel.py
#
#
#-

#test python script
import subprocess
import re
import libxml2dom
import urllib
import urllib2
import sys, string
import time
import os
import os.path
from hashlib import sha1
from libxml2dom import Node
from libxml2dom import NodeList
import hashlib
import pycurl
import StringIO
import uuid
import simplejson
import copy
from selenium import webdriver



#

if __name__ == "__main__":
# main app


  url="
http://udel.bncollege.com/webapp/wcs/stores/servlet/TBListView?storeId=37554=Y=10001=-1=%3C%3Fxml+version%3D%221.0%22%3F%3E%3Ctextbookorder%3E%3Cschool+id%3D%22289%22+%2F%3E%3Ccourses%3E%3Ccourse+num%3D%22200%22+dept%3D%22ACCT%22+sect%3D%22010%22+term%3D%222163%22%2F%3E%3C%2Fcourses%3E%3C%2Ftextbookorder%3E
"


  driver = webdriver.PhantomJS()
  driver.get(url)
  xx=driver.page_source

  print xx

  sys.exit()

---


Re: [Tutor] really basic - finding multiline chunk within larger chunk

2016-02-17 Thread bruce
hmm...

Ok. For some reason, it appears to be a whitespace issue, which is
what I thought.


The basic process that was used to get the subchunk to test for, was
to actually do a copy/cut/paste of the subtext from the master text,
and then to write the code to test.

Yeah, testing for "text" with whitespaces/multiline can be fragile.
And yeah, the text might have been from the 90s but that's irrelevant!

Thanks for confirming what I thought. Thanks also for the sample code as well.

I might just wind up stripping tabs/spaces and joining on space to pre
massage the content prior to handling it..

'ppreciate it guys/gals!
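For the archives, the pre-massage idea sketched above -- collapse all whitespace before comparing -- can be as small as this (the sample strings are made up):

```python
def squash(text):
    # Collapse every run of whitespace (tabs, newlines, spaces) to a
    # single space so layout differences can't break the comparison.
    return ' '.join(text.split())

big = "  Required \n\t   Yes \n"
sub = "Required\n   Yes"

print(squash(sub) in squash(big))   # True
```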


On Wed, Feb 17, 2016 at 12:02 AM, Danny Yoo  wrote:
> (Ah, I see that Joel Goldstick also was able to do the search
> successfully; sorry about missing your message Joel!)


[Tutor] really basic - finding multiline chunk within larger chunk

2016-02-16 Thread bruce
Hi.

I've got a test, where I have a chunk of text "a" and a subset of text
"s2a". the subset is multiline.

For some reason, can't seem to return true on the find. I've pasted in
http://fpaste.org/323521/, but the following is an example as well.
(not sure if the pseudo code listed actually shows the chunks of text
with the whitespace correctly.)

Obviously, this is an "example" not meant to be real code!!

Thoughts on what I've screwed up?

Thanks

aa='''
  
  Retail Price
  
  Less than $10
  
 

 
  
  Required
  
  
   Yes
  
  
 

 
  
  Used During
  
  Full Term
  
 

 
  
  Copies on Reserve in
Libraries
  
  
   No
  
  
 '''

---

as a test:

s2a='''
  
  Required
  
  
   Yes
  
  
 '''



  if (aa.find(s2a) > -1):
print "here ppp \n"
  else:
print "err \n"

  sys.exit()


Re: [Tutor] Creating a webcrawler

2016-01-09 Thread bruce
Hi Isac.

I'm not going to get into the pythonic stuff.. People on the list are
way better than I.  I've been doing a chunk of crawling, it's not too
bad, depending on what you're trying to accomplish and the site you're
targeting.

So, no offense, but I'm going to treat you like a 6 year old (google
it - from a movie!)

You need to back up, and analyze the site/pages/structure you're going
after. Use the tools - firefox - livehttpheaders/nettraffic/etc..
  -you want to be able to see what the exchange is between the
client/browser, as well as the server..
  -often, this gives you the clues/insight into crafting the request from
your client back to the server for the item/data you're going for...

Once you've gotten that together, setup the basic process with
wget/curl etc to get a feel for any weird issues - cert issues?
-security issues - are cookies required - etc.. A good deal of this
stuff can be resolved/checked out at this level, without jumping into
coding..

Once you're comfortable at this point, you can crank out some simple
code to go after the site you're targeting.

In the event you really have a javascript/dynamic site that you can't
handle in any other manner, you're going to need to go use a 'headless
browser' process.

There are a number of headless browser projects - I think most run on
the webkit codebase (don't quote me). Casper/phantomjs, there are also
pythonic implementations as well...

So, there you go, should/hopefully this will get you on your way!



On Fri, Jan 8, 2016 at 9:01 PM, Whom Isac  wrote:
> Hi I want to create a web-crawler but dont have any lead to choose any
> module. I have came across the Jsoup but I am not familiar with how to use
> it in 3.5 as I tried looking at a similar web crawler codes from 3.4 dev
> version.
> I just want to build that crawler to crawl through a javascript enable site
> and automatically detect a download link (for video file)
> .
> And should I be using pickles to write the data in the text file/ save file.
> Thanks


[Tutor] idle??

2016-01-08 Thread bruce
Hey guys/gals - list readers

Recently came across someone here mentioning IDLE!! -- not knowing
this. I hit google for a look.

Is IDLE essentially an ide for doing py dev? I see there's a
windows/linux (rpms) for it.

I'm running py.. I normally do $$python to pop up the py env for quick
tests.. and of course run my test scripts/apps from the cmdline via
./foo.py...

So, where does IDLE fit into this

Thanks

(and yeah, I know I could continue to look at google, and even install
the rpms to really check it out!!)

tia!!


Re: [Tutor] idle??

2016-01-08 Thread bruce
Thanks Alan...

So, as an IDE/shell.. I assume it's not quite Eclipse, but allows you
to do reasonable editing/syntax tracking/etc.. as well as run apps
within the window/shell.. I assume breakpoints as well, and a good
chunk of the rest of the usual IDE functions...

What about function completion? Where I type a function.. and it
displays a "list" of potential functions/defs? Does it provide
"function" or item hovering, where the cursor can be placed over a
function/item and information about the func, or item
(type/struct/etc..) is displayed?

Thanks again' much appreciated!!




On Fri, Jan 8, 2016 at 6:42 PM, Alan Gauld <alan.ga...@btinternet.com> wrote:
> On 08/01/16 19:07, bruce wrote:
>
>> Is IDLE essentially an ide for doing py dev? I see there's a
>> windows/linux (rpms) for it.
>
> Yes, its the official IDE for Python.
>
> There is an "unofficial" version called xidle which tends
> to get a lot of the new stuff before it makes it into the
> official release. For a long time not much happened with
> IDLE but recently there has been a bunch of activity so
> I'm hopeful we may soon see some new features appearing.
>
>> So, where does IDLE fit into this
>
> It incorporates a shell window where you can type commands
> and you can create blank editor windows(with syntax
> highlighting etc etc) from which you can save files,
> run them, debug them etc.
>
> There are some YouTube and ShowMeDo videos around and
> Danny Yoo has a short tutorial that is quite old but
> still pretty much applicable.
>
> There is official documentation on the python.org
> website too.
>
> Finally, it's not universally loved and definitely has
> some quirks but it's adequate for getting started,
> definitely better than notepad, say, on Windows.
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
>
>


Re: [Tutor] parser recommendations (was Re: Tutor Digest, Vol 142, Issue 11)

2015-12-14 Thread bruce
beautifulsoup, selenium + PhantomJS, and dryscrape

no knowledge of dryscape, never used it.

The other tools/apps are used to handle/parse html/websites.

Soup can handle xml/html as well as other input structs. Good for
being able to parse the resulting struct/dom to extract data, or to
change/modify the struct itself.

Selenium is a framework, acting as a browser env, allowing you to
'test' the site/html. It's good for certain uses regarding testing.

Phantomjs/casperjs are essentially headless browsers, allowing you to
also run/parse websites. While Soup is more for static pages, Phantom,
because it's an actual headless browser, lets you deal with
dynamic sites as well as static.




On Mon, Dec 14, 2015 at 2:56 PM, Alan Gauld  wrote:
> On 14/12/15 16:16, Crusier wrote:
>
> Please always supply a useful subject line when replying to the digest
> and also delete all irrelevant text. Some people pay by the byte and we
> have all received these messages already.
>
>> Thank you very much for answering the question. If you don't mind,
>> please kindly let me know which library I should focus on among
>> beautifulsoup, selenium + PhantomJS, and dryscrape.
>
> I don't know anything about the others but Beautiful soup
> is good for html, especially badly written/generated html.
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
>
>


Re: [Tutor] Beautiful Soup

2015-12-13 Thread bruce
Hey Crusier/ (And Others...)

For your site...

As Alan mentioned, its a mix of html/jscript/etc..

So, you're going to need (or perhaps should) to extract just the
json/struct that you need, and then go from there. I speak from
experience, as I've had to handle a number of sites that are
essentially just what you have.

Here's a basic guide to start:
--I use libxml, simplejson

fetch the page

in the page, do a split, to get the exact json (string) that you want.
-you'll do two splits: the 1st gets rid of extra pre-json stuff,
 the 2nd gets rid of extra post-json stuff that you don't need
--at this point, you should have the json string you need, or you
should be pretty close..

-now, you might need to "pretty" up what you have, as py/json only
accepts key/value pairs in a certain format (single/double quotes, etc.)

once you've gotten this far, you might actually have the json string,
in which case, you can load it directly into the json, and proceed as
you wish.

you might also find that what you have, is really a py dictionary, and
you can handle that as well!

Have fun, let us know if you have issues...
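A minimal sketch of the split-then-load approach described above; the page text and the "chartData" marker are made up for illustration:

```python
import json

# Hypothetical page source with a JSON blob buried in a script tag;
# the marker "var chartData = " is made up for illustration.
page = 'junk before var chartData = {"price": 6.79, "qty": 500}; junk after'

# 1st split drops everything before the blob,
# 2nd split drops everything after it.
blob = page.split('var chartData = ', 1)[1]
blob = blob.split(';', 1)[0]

data = json.loads(blob)
print(data['qty'])   # 500
```

On a real page you'd pick markers unique enough to survive layout changes, and fix up quoting before json.loads if the blob is really a JS object literal.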



On Sun, Dec 13, 2015 at 2:44 AM, Crusier  wrote:
> Dear All,
>
> I am trying to scrap the following website, however, I have
> encountered some problems. As you can see, I am not really familiar
> with regex and I hope you can give me some pointers to how to solve
> this problem.
>
> I hope I can download all the transaction data into the database.
> However, I need to retrieve it first. The data which I hope to
> retrieve it is as follows:
>
> "
> 15:59:59 A 500 6.790 3,395
> 15:59:53 B 500 6.780 3,390
>
> Thank you
>
> Below is my quote:
>
> from bs4 import BeautifulSoup
> import requests
> import re
>
> url = 
> 'https://bochk.etnet.com.hk/content/bochkweb/eng/quote_transaction_daily_history.php?code=6881=F=09=16=S=44c99b61679e019666f0570db51ad932=0=0'
>
> def turnover_detail(url):
> response = requests.get(url)
> html = response.content
> soup = BeautifulSoup(html,"html.parser")
> data = soup.find_all("script")
> for json in data:
> print(json)
>
> turnover_detail(url)
>
> Best Regards,
> Henry


[Tutor] ascii to/from AL32UTF8 conversion

2015-11-22 Thread bruce
Hi.

Doing a 'simple' test with linux command line curl, as well as pycurl
to fetch a page from a server.

The page has a charset of AL32UTF8.

Any way to convert this to straight ascii? Python is throwing a
notice/error on the charset in another part of the test..

The target site is US based, so there's no weird chars in it.. I
suspect that the page/system is based on legacy oracle

The metadata of the page is



I tried the usual

foo = foo.decode('utf-8')
foo = foo.decode('ascii')
etc..

but no luck.

Thanks for any pointers/help
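For the record, AL32UTF8 is just Oracle's name for standard UTF-8, so the usual ways around a stray byte like 0xa0 are errors='replace' or a latin-1 fallback. A sketch with a made-up byte string:

```python
# AL32UTF8 is Oracle's name for standard UTF-8, so decode as utf-8.
raw = b'price \xe2\x82\xac 5, plus\xa0tax'   # valid UTF-8 plus one stray 0xa0

# A strict decode raises on the stray byte...
try:
    raw.decode('utf-8')
except UnicodeDecodeError as e:
    print('strict decode failed:', e.reason)

# ...so either replace the offenders with U+FFFD...
text = raw.decode('utf-8', errors='replace')
print(text)

# ...or fall back to latin-1, which maps every byte to some character,
# then strip down to ascii if that's really what's needed.
fallback = raw.decode('latin-1')
print(fallback.encode('ascii', errors='ignore').decode('ascii'))
```

(In Python 2 the same decode calls apply to the raw str pycurl returns.)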


[Tutor] ncurses question

2015-10-30 Thread bruce
Hi.

Looking over various sites on ncurses. Curious. I see various chunks
of code for creating multiple windows.. But I haven't seen any kind of
example that shows how to 'move' or switch between multiple windows.

Anyone have any code sample, or any tutorial/site that you could point me to!

I'm thinking of putting together a simple test to be able to select
between a couple of test windows, select the given field in the
window, and then generate the results in a lower window based on
what's selected..

Just curious. Any pointers, greatly appreciated.

Thanks
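curses has no built-in notion of window focus -- "switching" is just choosing which window you draw the highlight in and read keys from. A rough sketch (run it through curses.wrapper from a real terminal; the layout here is made up):

```python
import curses

def demo(stdscr):
    # Two side-by-side windows plus a status bar; "focus" is just which
    # window we highlight and read input from.
    h, w = stdscr.getmaxyx()
    left = curses.newwin(h - 3, w // 2, 0, 0)
    right = curses.newwin(h - 3, w - w // 2, 0, w // 2)
    status = curses.newwin(3, w, h - 3, 0)
    wins, focus = [left, right], 0

    while True:
        for i, win in enumerate(wins):
            win.box()
            win.addstr(1, 2, 'window %d%s' % (i, ' *' if i == focus else ''))
            win.refresh()
        status.erase()
        status.addstr(1, 2, 'TAB switches, q quits; focus=%d' % focus)
        status.refresh()
        ch = wins[focus].getch()      # read keys from the focused window
        if ch == ord('\t'):
            focus = (focus + 1) % len(wins)
        elif ch == ord('q'):
            break

# Needs a real terminal: curses.wrapper(demo)
```

The lower "results" window from your idea would be another newwin that you erase/redraw whenever the selection in the focused window changes.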


Re: [Tutor] Scraping Wikipedia Table (The retruned file is empty)

2015-10-27 Thread bruce
my $0.02 for what it might be worth..

You have some people on the list who are straight-out beginners who
might be doing cut/copy/paste from 'bad code'. You have people coming
from other languages.. and then you have some who are trying to 'get
through' something, who aren't trying to be the dev!! And yeah, more
time, could easily (in most cases) provide an answer, but sometimes,
you just want to get a soln, and move on to the other 99 probs (no
offense jay z!!)

you guys have been a godsend at times!

thanks - keep up the good fight/work.



On Sun, Oct 25, 2015 at 9:56 PM, Alan Gauld  wrote:
> On 24/10/15 00:15, Mark Lawrence wrote:
>
>>> Looking more at the code...
>>>
>>>  > for x in range(len(drama_actor)):
>>>
>>> This looks unusual...
>>
>>
>> A better question IMHO is "where did you learn to write code like that
>> in the first place", as I've seen so many examples of this that I cannot
>> understand why people bother writing Python tutorials, as they clearly
>> don't get read?
>>
>
> I think its just a case of bad habits from other languages being
> hard to shake off. If your language doesn't have a for-each operator then
> its hard to wrap your brain around any other kind of for loop
> than one based on indexes.
>
> It's a bit like dictionaries. They are super powerful but beginners coming
> from other languages nearly always start out using
> arrays(ie lists) and trying to "index" them by searching which
> is hugely more complex, but it's what they are used too.
>
> JavaScript programmers tend to think the same about Python
> programmers who insist on writing separate functions for
> call backs rather than just embedding an anonymous function.
> But Python programmers are used to brain dead lambdas with
> a single expression so they don't tend to think about
> embedding a full function. Familiarity with an idiom makes
> it easier to stick with what you know than to try something new.
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
>
>
>


[Tutor] generate a list/dict with a dynamic name..

2015-09-27 Thread bruce
Hi.

I can do a basic
 a=[]
to generate a simple list..

i can do a="aa"+"bb"

how can i do a
 a=[]

where the name of the list would be "aabb"

in other words, generate a list/dict with a dynamically generated name

IRC replies have been "don't do it".. or it's bad.. but no one has
said you can do it this way..

just curious..

thanks
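The usual answer behind the IRC "don't do it" is: put the lists in a dict and let the computed string be the key. A sketch:

```python
# The standard answer to "a variable with a computed name" is a dict:
# the computed string is the key, the list/dict is the value.
prefix, suffix = 'aa', 'bb'
tables = {}
tables[prefix + suffix] = []        # "creates" the list named aabb

tables['aabb'].append(1)
print(tables['aabb'])    # [1]

# globals() can technically do it, but the dict keeps dynamically
# named things in their own namespace -- which is why IRC says "don't".
globals()[prefix + suffix] = ['not', 'recommended']
print(aabb)
```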


[Tutor] aws/cloud questions..

2015-08-28 Thread bruce
Evening group!

Hope we're all doing well, having fun. yada yada..!!

I'm considering taking a dive into the cloud with an app that would
be comprised of distributed machines, running py apps, talking to db
on different server(s), etc..

So, I was wondering if anyone has good docs/tutorials/walk through(s)
that you can provide, or even if someone is willing to play the role
of online mentor/tutor!!

Thanks


[Tutor] Parsing/Crawling test College Class Site.

2015-06-01 Thread bruce
Hi. I'm creating a test py app to do a quick crawl of a couple of
pages of a psoft class schedule site. Before I start asking
questions/pasting/posting code... I wanted to know if this is the kind
of thing that can/should be here..

The real issues I'm facing aren't so much pythonic as much as probably
dealing with getting the cookies/post attributes correct. There's
ongoing jscript on the site, but I'm hopeful/confident :) that if the
cookies/post is correct, then the target page can be fetched..

If this isn't the right list, let me know! And if it is, I'll start posting..

Thanks

-bd


[Tutor] trying to convert pycurl/html to ascii

2015-03-29 Thread bruce
Hi.

Doing a quick/basic pycurl test on a site and trying to convert the
returned page to pure ascii.

The page has the encoding line

<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

The test uses pycurl, and the StringIO to fetch the page into a str.

pycurl stuff
.
.
.
foo=gg.getBuffer()

-at this point, foo has the page in a str buffer.


What's happening, is that the test is getting the following kind of error/

UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position 20:
invalid start byte

The test is using python 2.6 on redhat.

I've tried different decode functions based on different
sites/articles/stackoverflow but can't quite seem to resolve the issue.

Any thoughts/pointers would be useful!

Thanks
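Since the page declares ISO-8859-1 (latin-1), decoding with that codec should be enough -- 0xa0 is a valid latin-1 byte (non-breaking space), which is exactly why the utf-8 codec chokes on it. A sketch with a made-up buffer standing in for the pycurl result:

```python
# The page declares charset=ISO-8859-1 (latin-1), so decode the raw
# bytes with that codec instead of utf-8 -- 0xa0 is a perfectly valid
# latin-1 byte (non-breaking space).
buf = b'Price\xa0List \xe9tudiant'      # made-up bytes as pycurl would return

text = buf.decode('iso-8859-1')
print(text)

# If pure ascii is really required, swap the NBSP for a space and
# drop whatever can't be represented.
ascii_text = text.replace('\xa0', ' ').encode('ascii', 'ignore').decode('ascii')
print(ascii_text)   # Price List tudiant
```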


Re: [Tutor] Installing twisted

2014-11-26 Thread bruce
Hey...

When you get this resolved.. if you don't mind.. post the soln back here!!

thanks

ps. I know, not strictly a py language issue.. but might really help
someone struggling to solve the same issue!



On Wed, Nov 26, 2014 at 7:45 PM, Gary
gwengst...@yahoo.com.dmarc.invalid wrote:
 Hi all,
 I have been trying to install the zope interface as part of the twisted 
 installation with no luck.

 Any suggestions ?


 Sent from my iPad


[Tutor] try/exception - error block

2014-08-03 Thread bruce
Hi.

I have a long running process, it generates calls to a separate py
app. The py app appears to generate errors, as indicated in the
/var/log/messages file for the abrtd daemon.. The errors are
intermittent.

So, to quickly capture all possible exceptions/errors, I decided to
wrap the entire main block of the test py func in a try/exception
block.

This didn't work, as I'm not getting any output in the err file
generated in the exception block.

I'm posting the test code I'm using. Pointers/comments would be helpful/useful.


 the if that gets run is the fac1 logic which operates on the input
packet/data..
elif (level=='collegeFaculty1'):
#getClasses(url, college, termVal,termName,deptName,deptAbbrv)
  ret=getParseCollegeFacultyList1(url,content)


Thanks.

if __name__ == "__main__":
# main app

  try:
#college=asu
#url=https://webapp4.asu.edu/catalog;
#termurl=https://webapp4.asu.edu/catalog/TooltipTerms.ext;


#termVal=2141
#
# get the input struct, parse it, determine the level
#

#cmd='cat /apps/parseapp2/asuclass1.dat'
#print cmd= +cmd
#proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
#content=proc.communicate()[0].strip()
#print content
#sys.exit()

#s=getClasses(content)

#print arg1 =,sys.argv[0]
    if(len(sys.argv) < 2):
      print "error\n"
  sys.exit()

a=sys.argv[1]
aaa=a

#
# data is coming from the parentApp.php
#data has been rawurlencode(json_encode(t))
#-reverse/split the data..
#-do the fetch,
#-save the fetched page/content if any
#-create the returned struct
#-echo/print/return the struct to the
# calling parent/call
#

##print urllib.unquote_plus(a).decode('utf8')
#print \n
#print simplejson.loads(urllib.unquote_plus(a))
z=simplejson.loads(urllib.unquote_plus(a))
##z=simplejson.loads(urllib.unquote(a).decode('utf8'))
#z=simplejson.loads(urllib2.unquote(a).decode('utf8'))

#print aa \n
print z
#print \n bb \n

#
#-passed in
#
url=str(z['currentURL'])
level=str(z['level'])
cname=str(z['parseContentFileName'])


#
# need to check the contentFname
# -should have been checked in the parentApp
# -check it anyway, return err if required
# -if valid, get/import the content into
# the content var for the function/parsing
#

##cmd='echo ${yolo_clientFetchOutputDir}/'
cmd='echo ${yolo_clientParseInputDir}/'
#print cmd= +cmd
proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
cpath=proc.communicate()[0].strip()

cname=cpath+cname
#print cn = +cname+\n
#sys.exit()


    cmd='test -e '+cname+' && echo 1'
#print cmd= +cmd
proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
c1=proc.communicate()[0].strip()

if(not c1):
  #got an error - process it, return
      print "error in parse"

#
# we're here, no err.. got content
#

#fff= sdsu2.dat
    with open(cname,"r") as myfile:
  content=myfile.read()
  myfile.close()


#-passed in
#college=louisville
#url=http://htmlaccess.louisville.edu/classSchedule/;
#termVal=4138


#print term = +str(termVal)+\n
#print url = +url+\n

#jtest()
#sys.exit()

#getTerm(url,college,termVal)


ret={} # null it out to start
if (level=='rState'):
  #ret=getTerm(content,termVal)
  ret=getParseStates(content)

elif (level=='stateCollegeList'):
#getDepts(url,college, termValue,termName)
  ret=getParseStateCollegeList(url,content)

elif (level=='collegeFaculty1'):
#getClasses(url, college, termVal,termName,deptName,deptAbbrv)
  ret=getParseCollegeFacultyList1(url,content)

elif (level=='collegeFaculty2'):
#getClasses(url, college, termVal,termName,deptName,deptAbbrv)
  ret=getParseCollegeFacultyList2(content)



#
# the idea of this section.. we have the resulting
# fetched content/page...
#

a={}
status=False
if(ret['status']==True):

  s=ascii_strip(ret['data'])
      if(((s.find("</html>") > -1) or (s.find("</HTML>") > -1)) and
      ((s.find("<html>") > -1) or (s.find("<HTML>") > -1)) and
       level=='classSectionDay'):

status=True
  #print herh
  #sys.exit()

  #
  # build the returned struct
  #
  #

  a['Status']=True
  a['recCount']=ret['count']
  a['data']=ret['data']
  a['nextLevel']=''
  a['timestamp']=''
  a['macAddress']=''
elif(ret['status']==False):
  a['Status']=False
  a['recCount']=0
  a['data']=''
  a['nextLevel']=''
  a['timestamp']=''
  a['macAddress']=''

res=urllib.quote(simplejson.dumps(a))
##print res

name=subprocess.Popen('uuidgen -t', shell=True,stdout=subprocess.PIPE)
name=name.communicate()[0].strip()

Re: [Tutor] try/exception - error block

2014-08-03 Thread bruce
chris.. my bad.. I wasnt intending to mail you personally.

Or I wouldn't have inserted the thanks guys!

 thanks guys...

 but in all that.. no one could tell me .. why i'm not getting any
 errs/exceptions in the err file which gets created on the exception!!!

 but thanks for the information on posting test code!

Don't email me privately - respond to the list :)

Also, please don't top-post.

ChrisA

On Sun, Aug 3, 2014 at 10:29 AM, bruce badoug...@gmail.com wrote:
 [...]

Re: [Tutor] try/exception - error block

2014-08-03 Thread bruce
Hi Alan.

Yep, the err file in the exception block gets created. and the weird
thing is it matches the time of the abrtd information in the
/var/log/messages log..

Just nothing in the file!



On Sun, Aug 3, 2014 at 4:01 PM, Alan Gauld alan.ga...@btinternet.com wrote:
 On 03/08/14 18:52, bruce wrote:

 but in all that.. no one could tell me .. why i'm not getting any
 errs/exceptions in the err file which gets created on the exception!!!


 Does the file actually get created?
 Do you see the print statement output - are they what you expect?

 Did you try the things Steven suggested.


except Exception, e:
  print e
  print pycolFac1 - error!! \n;
  name=subprocess.Popen('uuidgen -t',
 shell=True,stdout=subprocess.PIPE)
  name=name.communicate()[0].strip()
  name=name.replace(-,_)


 This is usually a bad idea. You are using name for the process and its
 output. Use more names...
 What about:

 uuid=subprocess.Popen('uuidgen -t',shell=True,stdout=subprocess.PIPE)
 output=uuid.communicate()[0].strip()
 name=output.replace(-,_)

  name2=/home/ihubuser/parseErrTest/pp_+name+.dat


 This would be a good place to insert a print

 print name2

  ofile1=open(name2,w+)


 Why are you using w+ mode? You are only writing.
 Keep life as simple as possible.

  ofile1.write(e)


 e is quite likely to be empty

  ofile1.write(aaa)


 Are you sure aaa exists at this point? Remember you are catching all errors
 so if an error happens prior to aaa being created this will
 fail.

  ofile1.close()


 You used the with form earlier, why not here too.
 It's considered better style...
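To illustrate that point, here is a hedged sketch of what the exception block could look like with the with form plus the traceback module (the file name and helper are made up):

```python
import traceback

def log_error(path):
    # Write the full traceback of the active exception to a file,
    # using a with-block so the file is closed even if the write fails.
    with open(path, "w") as errfile:
        errfile.write(traceback.format_exc())

try:
    1 / 0
except ZeroDivisionError:
    log_error("error.log")

print(open("error.log").read())
```

Writing `traceback.format_exc()` instead of `e` also avoids the empty-file problem: the string always contains the exception type and the line that raised it.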

 Some final comments.
 1) You call sys.exit() several times inside
 the try block. sys.exit will not be caught by your except block,
 is that what you expect?.

 2) The combination of confusing naming of variables,
 reuse of names and poor code layout and excessive commented
 code makes it very difficult to read your code.
 That makes it hard to figure out what might be going on.
 - Use sensible variable names not a,aaa,z, etc
 - use 3 or 4 level indentation not 2
 - use a version control system (RCS,CVS, SVN,...) instead
   of commenting out big blocks
 - use consistent code style
  eg with f as ... or open(f)/close(f) but not both
 - use the os module (and friends) instead of subprocess if possible

 3) Have you tried deleting all the files in the
 /home/ihubuser/parseErrTest/ folder and starting again,
 just to be sure that your current code is actually
 producing the empty files?

 4) You use tmpParseDir in a couple of places but I don't
 see it being set anywhere?


 That's about the best I can offer based on the
 information available.

 --
 Alan G
 Author of the Learn to Program web site
 http://www.alan-g.me.uk/
 http://www.flickr.com/photos/alangauldphotos

 --
 https://mail.python.org/mailman/listinfo/python-list
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


[Tutor] capturing errors/exceptions..

2014-08-01 Thread bruce
Hi.

Really basic question!!

Got a chunk of some test python, trying to figure out a quick/easy way
to capture all/any errors/exceptions that get thrown..

For the test process, I need to ensure that I capture any/all
potential errors..

-Could/Should I wrap the entire func in a try/catch when I call the
function from the parent process?
-Should I have separate try/catch blocks within the function?
-The test py app is being run from the CLI, is there a py command line
attribute that auto captures all errors?


Any thoughts..

Thanks!



A sample of the test code is:

def getParseCollegeFacultyList1(url, content):

  s=content

  s=s.replace("&nbsp;","")

  if(debug==1):
    print "s="+s

  url=url.strip(/)

  #got the page/data... parse it and get the schools..
  #use the dept list as the school

  # s contains HTML not XML text
  d = libxml2dom.parseString(s, html=1)

  ###
  #--
  #--create the output data file for the registrar/start data
  #--
  #--
  ###


#term_in=201336sel_subj=ACCT

  if(debug==1):
    print "inside parse state/college function \n"

  #---Form

  #fetch the option val/text for the depts which are used
  #as the dept abbrv/name on the master side
  #-- the school matches the dept...
  #-- this results in separate packets for each dept

  p="//a[contains(@href,'SelectTeacher') and @id='last']//attribute::href"
  ap="//a[contains(@href,'campusRatings.jsp')]//attribute::href"

  hpath="//div[@id='profInfo']/ul/li[1]//a/attribute::href"    #--get the college website
  cpath="//div[@id='profInfo']/ul/li[2]/text()"    #--get the city,state
  colpath="//h2/text()"    #--college name

  xpath="//a[contains(@title,'school id:')]/attribute::href"

  hh_ = d.xpath(hpath)
  cc_ = d.xpath(cpath)
  col_ = d.xpath(colpath)
  ap_ = d.xpath(ap)

  if(debug==1):
    print "hhl "+str(len(hh_))
    print "ccl "+str(len(cc_))


  web=""
  if (len(hh_)>0):
    web=hh_[0].textContent

  city=""
  if (len(cc_)>0):
    city=cc_[0].textContent

  colname=""
  if (len(col_)>0):
    colname=col_[0].textContent

  colname=colname.encode('ascii', 'ignore').strip()


  #
  # set up out array
  #
  ret={}
  out={}
  row={}
  jrow=""

  ndx=0

  pcount_ = d.xpath(p)
  if(len(pcount_)==0):

#at least one success/entry.. but apparently only a single page..
status=True

#count=pcount_[0].textContent.strip()
#countp=count.split('pageNo=')
#count=countp[1]
#rr=countp[0]

if(len(ap_)==1):
  idd=ap_[0].textContent.strip()
  idd=idd.split("?sid=")
  idd=idd[1].split()
  idd=idd[0].strip()

  nurl=url+"/SelectTeacher.jsp?sid="+idd+"&pageNo=1"
  #nurl=url+pageNo=1

  row={}
  row['WriteData']=True
  row['tmp5']=web
  row['tmp6']=city
  row['tmp7']=colname
  row['tmp8']=nurl

  #don't json for now
  #jrow=simplejson.dumps(row)
  jrow=row
  out[ndx]=jrow

  ndx = ndx+1

  else:
#at least one success/entry.. set the status
status=True

count=pcount_[0].textContent.strip()
countp=count.split('pageNo=')
count=countp[1]
rr=countp[0]

if(debug==1):
  print "c ="+str(count)+"\n"

for t in range(1,int(count)+1):
  nurl=url+rr+"&pageNo="+str(t)

  if(debug==1):
    print "nurl = "+nurl+"\n"

  row={}
  row['WriteData']=True
  row['tmp5']=web
  row['tmp6']=city
  row['tmp7']=colname
  row['tmp8']=nurl

  #don't json for now
  #jrow=simplejson.dumps(row)
  jrow=row
  out[ndx]=jrow

  ndx = ndx+1


  ret['data']=simplejson.dumps(out)
  ret['count']=ndx
  ret['status']=status
  return(ret)
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] capturing errors/exceptions..

2014-08-01 Thread bruce
Clarification.

The test py app is being invoked via a system function from a separate
app, and not stacktrace gets created. All I have is in the
/var/log/messages, an indication that the pyTest app generated an
error..

This is noted by the abrtd process, but I have no other data to go
on.. Which is why I'm interested in implementing some basic
capture/display all/any error approach to get a feel for what's
happening..

On Fri, Aug 1, 2014 at 10:54 AM, Steven D'Aprano st...@pearwood.info wrote:
 On Fri, Aug 01, 2014 at 10:14:38AM -0400, bruce wrote:
 Hi.

 Really basic question!!

 Got a chunk of some test python, trying to figure out a quick/easy way
 to capture all/any errors/exceptions that get thrown..

 Why do you want to do that? The answer to your question will depend on
 what you expect to do with the exception once you've caught it, and the
 answer might very well be don't do that.


 For the test process, I need to ensure that I capture any/all
 potential errors..

 Hmmm. I don't quite see the reason for this.

 If you're running by hand, manually, surely you want to see the
 exceptions so that you can fix them? If there's an exception, what do
 you expect to do next?

 If you're using the unittest module, it already captures the exceptions
 for you, no need to re-invent the wheel.


 -Could/Should I wrap the entire func in a try/catch when I call the
 function from the parent process?

 You mean something like this?

 try:
 mymodule.function_being_test(x, y, z)
 except Exception:
 do_something_with_exception()


 Sure. Sounds reasonable, if you have something reasonable to do once
 you've captured the exception.


 -Should I have separate try/catch blocks within the function?

 No. That means that the function is constrained by the testing regime.


 -The test py app is being run from the CLI, is there a py command line
 attribute that auto captures all errors?

 No. How would such a thing work? In general, once an exception occurs,
 you get a cascade of irrelevant errors:

 n = lne(some_data)  # Oops I meant len
 m = 2*n + 1  # oops, this fails because n doesn't exist
 value = something[m]  # now this fails because m doesn't exist
 ...

 Automatically recovering from an exception and continuing is not
 practical, hence Python halts after an exception unless you take steps
 to handle it yourself.
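A minimal sketch of such a top-level wrapper (the function name is made up), capturing the traceback text instead of letting the program die silently:

```python
import traceback

def risky():
    # Stand-in for the function under test.
    return int("not a number")

report = ""
try:
    risky()
except Exception:
    # Capture the full traceback as text so it can be logged,
    # written to a file, or printed later.
    report = traceback.format_exc()

print(report)
```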



 --
 Steven
 ___
 Tutor maillist  -  Tutor@python.org
 To unsubscribe or change subscription options:
 https://mail.python.org/mailman/listinfo/tutor
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


[Tutor] cdata/aml question..

2014-04-13 Thread bruce
Hi.

The following text contains sample data. I'm simply trying to parse it
using libxml2dom as the lib to extract data.

As an example, to get the name/desc

test data
<class_meta_data><departments><department><name><![CDATA[A
HTG]]></name><desc><![CDATA[American
Heritage]]></desc></department><department><name><![CDATA[ACC]]></name><desc><![CDATA[Accounting]]></desc></department>

d = libxml2dom.parseString(s, html=1)

p1="//department/name"
p2="//department/desc"

pcount_ = d.xpath(p1)
p2_ = d.xpath(p2)
print str(len(pcount_))
nba=0

for a in pcount_:
  abbrv=a.nodeValue
  print abbrv
  abbrv=a.toString()
  print abbrv
  abbrv=a.textContent
  print abbrv

neither of the above generates any of the CDATA name/desc data..

any pointers on what I'm missing???

I can/have created a quick parse/split process to get the data, but I
thought there'd be a straight forward process to extract the data
using one of the py/libs..

thanks
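For reference, the stdlib ElementTree reads CDATA sections as ordinary element text, so once the angle brackets are intact the extraction is direct (a sketch with the sample data inlined):

```python
import xml.etree.ElementTree as ET

xml = ("<class_meta_data><departments>"
       "<department><name><![CDATA[A HTG]]></name>"
       "<desc><![CDATA[American Heritage]]></desc></department>"
       "<department><name><![CDATA[ACC]]></name>"
       "<desc><![CDATA[Accounting]]></desc></department>"
       "</departments></class_meta_data>")

root = ET.fromstring(xml)
# CDATA content shows up as the element's .text attribute.
names = [n.text for n in root.iter("name")]
descs = [d.text for d in root.iter("desc")]
print(names)  # ['A HTG', 'ACC']
print(descs)  # ['American Heritage', 'Accounting']
```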
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Python Question

2014-01-11 Thread bruce
hey amy..

ok.. before we jump to coding (and forgive me if what I'm about to
type is really basic!) let's play a bit with what's called
pseudo-code.

pseudo-code is a technique to kind of put your thoughts about a
problem/approach in a hash of code/english.. kind of lets you lay out
what you're trying to solve/program.

so for your issue:

you need to think about what you're trying to do.

you want to give the user back something, based on you doing something
to the thing the user gives you.

so this means, you need some way of getting user input
you want to do something to the input, so you need some way of
capturing the input to perform the something (better known as an
operation) on the user's input..

then you want to redisplay stuff back to the user, so you're going to
need a way of displaying back to the user the data/output..

create the pseudo-code, post it, and we'll get this in no time!



On Sat, Jan 11, 2014 at 12:23 PM, Amy Davidson amydavid...@sympatico.ca wrote:
 Hey!
 So luckily with the texts that were sent to me, I was able to figure out the
 answer(yay)!

 Unfortunately I am now stuck on a different question.

 Write a function called highlight() that prompts the user for a string.
 Your code should ensure that the string is all lower case.
 Next, prompt the user for a smaller 'substring' of one or more characters.
 Then replace every occurrence of the substring in the first string with an
 upper case.
 Finally, report to the user how many changes were made (i.e., how many
 occurrences of the substring there were).”
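A minimal sketch of the core logic (the prompting via input() is left out, and the names are illustrative):

```python
def highlight(s, sub):
    # Ensure both strings are lower case, count the occurrences,
    # then replace each one with the upper-case substring.
    s = s.lower()
    sub = sub.lower()
    count = s.count(sub)
    return s.replace(sub, sub.upper()), count

print(highlight("banana", "an"))  # ('bANANa', 2)
```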
 On Jan 11, 2014, at 1:04 AM, Alex Kleider aklei...@sonic.net wrote:

 On 2014-01-10 17:57, Amy Davidson wrote:

 Hey Danny,
 I just started taking the course (introduction to Computer Science) on
 last Tuesday, so I am not to familiar. I have been doing my best to
 understand  the material by reading the text book, Learn Python the
 hard way.


 A lot of people seem to think the Hard Way is the way to go.  I disagree.
 I found that Allen Downey's book is excellent and free (although the book is
 also available in 'real' print which works better for me.)

 http://www.greenteapress.com/thinkpython/

 My copy covers Python 2.7, you use Python 3 I believe, but I doubt that that
 will be too much of a problem.  At the intro level the differences are few.

 ak

 In my quest to answer the question given to me, I have searched the
 internet high and low of other functions thus, I am familiar with the
 basic knowledge of them (i.e. starting with def) as well as examples.
 We can attempt the approach to the method that you prefer.
 Thanks for helping me, by the way.
 On Jan 10, 2014, at 5:25 PM, Danny Yoo d...@hashcollision.org wrote:

 On Fri, Jan 10, 2014 at 2:00 PM, Keith Winston keithw...@gmail.com wrote:

 Amy, judging from Danny's replies, you may be emailing him and not the
 list. If you want others to help, or to report on your progress,
 you'll need to make sure the tutor email is in your reply to:

 Hi Amy,
 Very much so.  Please try to use Reply to All if you can.
 If you're wondering why I'm asking for you to try to recall any other
 example function definitions, I'm doing so specifically because it is
 a general problem-solving technique.  Try to see if the problem that's
 stumping you is similar to things you've seen before.  Several of the
 heuristics from Polya's How to Solve It refer to this:
   http://en.wikipedia.org/wiki/How_to_Solve_It
 If you haven't ever seen any function definition ever before, then we
 do have to start from square one.  But this would be a very strange
 scenario, to be asked to write a function definition without having
 seen any previous definitions before.
 If you have seen a function before, then one approach we might take is
 try to make analogies to those previous examples.  That's an approach
 I'd prefer.

 ___
 Tutor maillist  -  Tutor@python.org
 To unsubscribe or change subscription options:
 https://mail.python.org/mailman/listinfo/tutor




 ___
 Tutor maillist  -  Tutor@python.org
 To unsubscribe or change subscription options:
 https://mail.python.org/mailman/listinfo/tutor

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


[Tutor] trying to parse an xml file

2013-12-14 Thread bruce
Hi.

Looking at a file --
http://www.marquette.edu/mucentral/registrar/snapshot/fall13/xml/BIOL_bysubject.xml

The file is generated via online/web url, and appears to be XML.

However, when I use elementtree:
  document = ElementTree.parse( '/apps/parseapp2/testxml.xml' )

I get an invalid error : not well-formed (invalid token):

I started to go through the file, to remove offending chars, but
decided there has to be a better approach. I also looked at the
underlying url/page to see what it's doing with the javascript to
parse the XML.


Anyone have any python suggestions as to how to proceed to parse out the data!

thanks


the javascript chunk ::

var dsSnapshot = new Spry.Data.XMLDataSet("xml/BIOL_bysubject.xml",
"RECORDS/RECORD");
dsSnapshot.setColumnType("nt", "html");
dsSnapshot.setColumnType("ti", "html");
dsSnapshot.setColumnType("new", "html");
dsSnapshot.setColumnType("se", "html");
dsSnapshot.setColumnType("mt", "html");
dsSnapshot.setColumnType("ex", "html");
dsSnapshot.setColumnType("in", "html");
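A small stdlib-only sketch of locating and fixing such a "not well-formed (invalid token)" error (the sample data is made up; the real feed's problem characters may differ, but bare ampersands are a common cause):

```python
import xml.etree.ElementTree as ET

broken = "<RECORDS><RECORD><ti>Biology & Society</ti></RECORD></RECORDS>"

try:
    ET.fromstring(broken)
except ET.ParseError as e:
    # ParseError carries the (line, column) of the offending token,
    # which helps locate the bad characters before cleaning them.
    print("parse failed at line %d, column %d" % e.position)

# One common fix: escape bare ampersands, then parse again.
cleaned = broken.replace(" & ", " &amp; ")
root = ET.fromstring(cleaned)
print(root.find("RECORD/ti").text)
```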
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


[Tutor] libtiff--can't find library

2012-06-29 Thread R Bruce van Dover
Presumably this is a newbie question; apologies in advance, but I have 
spent hours trying to RTFM, to no avail. Can anyone help?


I've installed pylibtiff-0.1-svn.win32.exe since I want to be able to 
read a TIFF file. But when I type (in IDLE) I get



>>> from libtiff import TIFFfile, TIFFimage

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    from libtiff import TIFFfile, TIFFimage
  File "E:\Python27\lib\site-packages\libtiff\__init__.py", line 4, in <module>
    from .libtiff import libtiff, TIFF
  File "E:\Python27\lib\site-packages\libtiff\libtiff.py", line 35, in <module>
    raise ImportError('Failed to find TIFF library. Make sure that libtiff is
installed and its location is listed in PATH|LD_LIBRARY_PATH|..')
ImportError: Failed to find TIFF library. Make sure that libtiff is installed
and its location is listed in PATH|LD_LIBRARY_PATH|..

>>> import sys
>>> print sys.path

['E:\\Python27\\Lib\\idlelib', 'E:\\Windows\\system32\\python27.zip', 
'E:\\Python27\\DLLs', 'E:\\Python27\\lib', 'E:\\Python27\\lib\\plat-win', 
'E:\\Python27\\lib\\lib-tk', 'E:\\Python27', 'E:\\Python27\\lib\\site-packages']

Libtiff is in the 'E:\\Python27\\lib\\site-packages' directory as it's 
supposed to. So is, e.g., Numpy, which imports just fine.


What am I doing wrong? FWIW, I tried the PIL package, and had the same 
problem (module not found). Why do these modules not import when Numpy, 
matplotlib, scipy, etc. import as expected?


Running Win7, 32bit, Python 2.7.1.
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


[Tutor] generating unique set of dicts from a list of dicts

2012-01-10 Thread bruce
trying to figure out how to generate a unique set of dicts from a
json/list of dicts.

initial list :::
[{"pStart1a":
{"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
"instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
"pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
"pSearch1a":
{"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
 {"pStart1":""},
 {"pStart1a":{"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
 "instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
 "pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
 "pSearch1a":
 {"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
 {"pStart1":""}]



As an example, the following is the test list:

[{"pStart1a":
{"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
"instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
"pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
"pSearch1a":
{"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
 {"pStart1":""},
 {"pStart1a":{"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
 "instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
 "pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
 "pSearch1a":
 {"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
 {"pStart1":""}]

Trying to get the following, list of unique dicts, so there aren't
duplicate dicts.
 Searched various sites/SO.. and still have a mental block.

[
  {"pStart1a":
  {"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
   "instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
   "pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
  "pSearch1a":
  {"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
  {"pStart1":""}]

I was considering iterating through the initial list, copying each
dict into a new list, and doing a basic comparison, adding the next
dict if it's not in the new list.. is there another/better way?

posted this to StackOverflow as well.  
http://stackoverflow.com/questions/8808286/simplifying-a-json-list-to-the-unique-dict-items
 

There was a potential soln that I couldn't understand.


-
The simplest approach -- using list(set(your_list_of_dicts)) won't
work because Python dictionaries are mutable and not hashable (that
is, they don't implement __hash__). This is because Python can't
guarantee that the hash of a dictionary won't change after you insert
it into a set or dict.

However, in your case, since you (don't seem to be) modifying the data
at all, you can compute your own hash, and use this along with a
dictionary to relatively easily find the unique JSON objects without
having to do a full recursive comparison of each dictionary to the
others.

First, we need a function to compute a hash of the dictionary. Rather
than trying to build our own hash function, let's use one of the
built-in ones from hashlib:

def dict_hash(d):
out = hashlib.md5()
for key, value in d.iteritems():
out.update(unicode(key))
out.update(unicode(value))
return out.hexdigest()

(Note that this relies on unicode(...) for each of your values
returning something unique -- if you have custom classes in the
dictionaries whose __unicode__ returns something like MyClass
instance, this will fail or will require modification. Also, in your
example, your dictionaries are flat, but I'll leave it as an exercise
to the reader how to expand this solution to work with dictionaries
that contain other dicts or lists.)

Since dict_hash returns a string, which is immutable, you can now use
a dictionary to find the unique elements:

uniques_map = {}
for d in list_of_dicts:
uniques[dict_hash(d)] = d
unique_dicts = uniques_map.values()

*** not sure what the uniqes is, or what/how it should be defined


thoughts/comments are welcome
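For what it's worth, `uniques` in the quoted answer appears to be a typo for `uniques_map`. A runnable sketch of the same idea, using json.dumps with sort_keys=True as the hashable key (which, unlike the flat md5 recipe, also handles nested dicts); the data is trimmed for brevity:

```python
import json

list_of_dicts = [
    {"pStart1": ""},
    {"pStart1a": {"termVal": "1122"}, "pSearch1a": {"chk": "CLASS_SRCH_WRK2_MON"}},
    {"pStart1": ""},
    {"pStart1a": {"termVal": "1122"}, "pSearch1a": {"chk": "CLASS_SRCH_WRK2_MON"}},
]

# Map a canonical string form of each dict to the dict itself;
# duplicates collapse because they produce the same key.
uniques_map = {}
for d in list_of_dicts:
    uniques_map[json.dumps(d, sort_keys=True)] = d
unique_dicts = list(uniques_map.values())
print(len(unique_dicts))  # 2
```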

thanks
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


[Tutor] list issue.. i think

2011-12-22 Thread bruce
hi.

got a test where i have multiple lists with key/values. trying to figure
out how to do a join/multiply, or whatever python calls it, where i have a
series of resulting lists/dicts that look like the following..

the number of lists/rows is dynamic..
the size of the list/rows will also be dynamic as well.

i've looked over the py docs, as well as different potential solns..

psuedo code, or pointers would be helpful.

thanks...

test data
a['a1']=['a1','a2','a3']
a['a2']=['b1','b2','b3']
a['a3']=['c1','c2','c3']

end test result::
a1:a1,a2:b1,a3:c1
a1:a2,a2:b1,a3:c1
a1:a3,a2:b1,a3:c1

a1:a1,a2:b2,a3:c1
a1:a2,a2:b2,a3:c1
a1:a3,a2:b2,a3:c1

a1:a1,a2:b3,a3:c1
a1:a2,a2:b3,a3:c1
a1:a3,a2:b3,a3:c1

a1:a1,a2:b1,a3:c2
a1:a2,a2:b1,a3:c2
a1:a3,a2:b1,a3:c2

a1:a1,a2:b2,a3:c2
a1:a2,a2:b2,a3:c2
a1:a3,a2:b2,a3:c2

a1:a1,a2:b3,a3:c2
a1:a2,a2:b3,a3:c2
a1:a3,a2:b3,a3:c2

a1:a1,a2:b1,a3:c3
a1:a2,a2:b1,a3:c3
a1:a3,a2:b1,a3:c3

a1:a1,a2:b2,a3:c3
a1:a2,a2:b2,a3:c3
a1:a3,a2:b2,a3:c3

a1:a1,a2:b3,a3:c3
a1:a2,a2:b3,a3:c3
a1:a3,a2:b3,a3:c3
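The desired result above is the Cartesian product of the value lists, which itertools.product computes directly; a sketch using the test data:

```python
import itertools

a = {}
a['a1'] = ['a1', 'a2', 'a3']
a['a2'] = ['b1', 'b2', 'b3']
a['a3'] = ['c1', 'c2', 'c3']

keys = sorted(a)  # ['a1', 'a2', 'a3']
# Each combo picks one value per key; zip them back into a dict per row.
rows = [dict(zip(keys, combo))
        for combo in itertools.product(*(a[k] for k in keys))]
print(len(rows))   # 27 rows, one per combination
print(rows[0])     # {'a1': 'a1', 'a2': 'b1', 'a3': 'c1'}
```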
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] multi-threaded/parallel processing - local tutor

2011-01-23 Thread bruce
not looking for docs.. already have code.

looking to actually talk to someone in the san fran/bay area for an in
person talk/tutor session.

thanks


2011/1/22 शंतनू shanta...@gmail.com:
 You may find following useful.

 2.6+ --- http://docs.python.org/library/multiprocessing.html
 3.x --- http://docs.python.org/dev/library/multiprocessing.html


 On 23-Jan-2011, at 11:46 AM, bruce wrote:

 Hi.

 I'm working on a project that uses python to spawn/create multiple
 threads, to run parallel processes to fetch data from websites. I'm
 looking to (if possible) go over this in person with someone in the
 San Fran area. Lunch/beer/oj can be on me!!

 It's a little too complex to try to describe here, and pasting the
 code/apps wouldn't do any good without an associated conversation.

 So, if you're in the Bay area, and you're up to some in person
 tutoring, let me know.

 Thanks guys for this list!!
 ___
 Tutor maillist  -  Tutor@python.org
 To unsubscribe or change subscription options:
 http://mail.python.org/mailman/listinfo/tutor


___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


[Tutor] multi-threaded/parallel processing - local tutor

2011-01-22 Thread bruce
Hi.

I'm working on a project that uses python to spawn/create multiple
threads, to run parallel processes to fetch data from websites. I'm
looking to (if possible) go over this in person with someone in the
San Fran area. Lunch/beer/oj can be on me!!

It's a little too complex to try to describe here, and pasting the
code/apps wouldn't do any good without an associated conversation.

So, if you're in the Bay area, and you're up to some in person
tutoring, let me know.

Thanks guys for this list!!
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


[Tutor] list of tutors for python

2011-01-21 Thread bruce
Hi guys.

Please don't slam me!! I'm working on a project, looking for a pretty
good number of pythonistas. Trying to find resources that I should
look to to find them, and thought I would try here for suggestions.

Any comments would be appreciated.

Thanks
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] [Visualpython-users] tkinter and visual with objects

2009-02-05 Thread Bruce Sherwood

The following works to produce a window with nothing displayed in it:

ball = sphere()
ball.visible = 0

Another scheme would be this:

scene.range = 1
ball = sphere(radius=1e-6)

The point is that Visual doesn't create a window unless there is 
something to display.


Bruce Sherwood

Mr Gerard Kelly wrote:

I'm trying to make this very simple program, where the idea is that you
click a tkinter button named Ball and then a ball will appear in the
visual window.

Problem is that the window itself doesn't pop up until the button is
pressed and the ball is created. I would like it to start out blank, and
then have the ball appear in it when the button is pressed.

I thought that having self.display=display() in the __init__ of the
Application would do this, but it doesn't seem to.

What do I need to add to this code to make it start out with a blank window?


from visual import *
from Tkinter import *
import sys


class Ball:
  def __init__(self):
sphere(pos=(0,0,0))

class Application:
  def __init__(self, root):


self.frame = Frame(root)
self.frame.pack()

self.display=display()

self.button=Button(self.frame, text=Ball, command=self.ball)
self.button.pack()

  def ball(self):
self.ball=Ball()

root=Tk()
app=Application(root)

root.mainloop()

--
Create and Deploy Rich Internet Apps outside the browser with Adobe(R)AIR(TM)
software. With Adobe AIR, Ajax developers can use existing skills and code to
build responsive, highly engaging applications that combine the power of local
resources and data with the reach of the web. Download the Adobe AIR SDK and
Ajax docs to start building applications today-http://p.sf.net/sfu/adobe-com
___
Visualpython-users mailing list
visualpython-us...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/visualpython-users

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] [Visualpython-users] VPython and Tkinter

2009-02-04 Thread Bruce Sherwood
In Visual 3,  there is an example program (Tk-visual.py, if I remember 
correctly) which shows a Tk window controlling actions in a separate 
Visual window.


In Visual 5, I believe that this program would still work on Windows and 
Linux, but because there seems to be no way to make this work in the 
Carbon-based Mac version, the application was removed from the set of 
examples, which are platform-independent.


Bruce Sherwood

Mr Gerard Kelly wrote:

Is there a way to make separate VPython and Tkinter windows run
simultaneously from the one program? Or to have the VPython window run
inside a Tkinter toplevel?

If I have a program that uses both, it seems that one window has to
close before the other will start running.


  

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor