Re: Fake news Detect

2020-07-18 Thread Vincent Davis
Data Skeptic has a couple of podcast episodes on this, and some of the code is open source.
https://dataskeptic.com/blog/episodes/2018/algorithmic-detection-of-fake-news

Thanks
Vincent Davis
720-301-3003
*Want to get a hold of me?*


*SMS: awesome.phone: ok...*
*email: bad!*


On Fri, Jul 17, 2020 at 11:39 PM Mike Dewhirst 
wrote:

> On 18/07/2020 6:16 am, Grant Edwards wrote:
> > On 2020-07-17, Dennis Lee Bieber  wrote:
> >> On Fri, 17 Jul 2020 16:02:15 - (UTC), Gazu 
> declaimed
> >> the following:
> >>
> >>> Hey Guys I am new to python and i am building a fake news detection
> >>> system ...
> >>  I suspect that, if anyone had done this already, it would likely be
> >> found on some source code archive (github?) -- and you'd just be
> >> duplicating the effort.
> >>
> >>  Essentially, since the core of this functionality depends upon the
> >> algorithm, YOU will have to develop the algorithm.
> > Or he could do something easier like eliminating hunger, war and
> > Covid-19.
>
> Or like changing culture to give more weight to education, integrity
> etc. We need systems to automatically identify fake news and educate
> believers. News consumers have to do it.
>
> News consumers need a system where they can go to check news items to
> see if they are credible. Without the cooperation of news conduits - to
> label news items with the source - that will be difficult.
>
> However, that doesn't mean the crowd can't check credibility. So,
> culture change is needed. No-one wants to be outed as a fake news source.
>
> Here's a project. Build an automatic news aggregation site which
> collects all news in two pages per news item. Page 1 for the item and
> page 2 for the crowd credibility assessment and naming of the apparent
> source. Should work somewhat like Wikipedia. Except editors for page 2
> would need a threshold score for being correct. Everyone can criticise
> but you lose points for being on the wrong side of history.
>
> That'll be 2 cents
>
> Mike
>
> >
> > --
> > Grant
> >
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


How to match scipy.spatial.distance results back to pandas df

2020-07-16 Thread Vincent Davis
I am trying to find points in the activity that are near points in segment.
I get the results from distance.cdist() and want to identify the points in
segment and activity that are within some value of each other, .0001 for
example.
One idea I had was to append each array row (or column) to the
corresponding row in the df after converting it to a dict so that I can
look up the point in the other df.

Any suggestions, ideas?

from scipy.spatial import distance
a = distance.cdist(segment[['Latitude', 'Longitude']],
activity[['Latitude', 'Longitude']], 'euclidean')
b = a[a < .0001]
b

array([8.83911760e-05, 6.31347765e-05, 3.89486842e-05, 2.13775583e-05,
   2.10950231e-05, 4.10487515e-05, 6.7000e-05, 9.10878697e-05,
   7.61183289e-05, 9.90050504e-05, 7.88162420e-05, 5.90931468e-05,
   4.50111097e-05, 4.97393205e-05, 6.78969808e-05, 8.52115016e-05,
...
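
One possible way to get from the distance matrix back to the rows, sketched with two small made-up frames standing in for segment and activity. Row i of the cdist result corresponds to the i-th row of segment and column j to the j-th row of activity, so np.nonzero on the boolean mask returns the matching positions. Indexing with a[a < .0001] flattens the result and throws those positions away.

```python
import numpy as np
import pandas as pd
from scipy.spatial import distance

# Hypothetical stand-in data; the real frames come from the poster's files.
segment = pd.DataFrame({"Latitude": [40.0, 40.001, 40.002],
                        "Longitude": [-105.0, -105.001, -105.002]})
activity = pd.DataFrame({"Latitude": [40.00001, 39.5, 40.00201],
                         "Longitude": [-105.00001, -104.5, -105.00201]})

cols = ["Latitude", "Longitude"]
a = distance.cdist(segment[cols], activity[cols], "euclidean")

# Positions of every (segment row, activity row) pair closer than the threshold.
seg_pos, act_pos = np.nonzero(a < 0.0001)

# One row per close pair, carrying the index labels of both frames.
matches = pd.DataFrame({
    "segment_index": segment.index[seg_pos],
    "activity_index": activity.index[act_pos],
    "distance": a[seg_pos, act_pos],
})
print(matches)
```

The matches frame can then be merged with either original frame on its index column.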


Thanks
Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Convert and analyze image data in a spreadsheet.

2020-06-14 Thread Vincent Davis
Dennis,
Thanks for your ideas. The researcher I am working with just told me the
data is wrong and that they need to send me new data, and there are other
problems with exactly what their research question is. So this goes nowhere
for now.

Thanks
Vincent Davis
720-301-3003


On Thu, Jun 11, 2020 at 1:17 PM Dennis Lee Bieber 
wrote:

>
> My previous response hasn't made it through the
> gmane<>list<>usenet<>back cycle so I can't "talk to myself"...
>
> On Thu, 11 Jun 2020 07:44:25 -0600, Vincent Davis
>  declaimed the following:
>
> >Looking for a little advise.
> >I have 6x6 color (CIELAB <
> https://en.wikipedia.org/wiki/CIELAB_color_space>)
> >data in a spreadsheet (color samples of clothing). I would like to
> >visualize this as an image on a per channel (l,a,b) and as a color image.
> I
> >can read the data from the spreadsheet. I am not sure the path a should
> >choose from there. I think the path is to read this data into a numpy
> array
> >and use scikit-image.
> >
>
> I'm going to speak blasphemy and mention the R statistics package.
> It
> has
>
> https://stat.ethz.ch/R-manual/R-devel/library/grDevices/html/convertColor.html
> which includes L*a*b* as a color space. Though it does not seem to have
> Adobe RGB...
> """
> "XYZ", "sRGB", "Apple RGB", "CIE RGB", "Lab", "Luv".
> """
>
> Also links to the main page of a source of the math used in
> converting
>
> It also mentions something that wasn't obvious in other references:
> """
> The Lab and Luv spaces describe colors of objects, and so require the
> specification of a reference ‘white light’ color. Illuminant D65 is a
> standard indirect daylight, Illuminant D50 is close to direct sunlight, and
> Illuminant A is the light from a standard incandescent bulb. Other standard
> CIE illuminants supported are B, C, E and D55. RGB colour spaces are
> defined relative to a particular reference white, and can be only
> approximately translated to other reference whites.
> """
>
> ... hence conversions from L*a*b* to, say, sRGB, will differ based upon
> what illumination reference is used!
>
>
>
> --
> Wulfraed Dennis Lee Bieber AF6VN
> wlfr...@ix.netcom.com
> http://wlfraed.microdiversity.freeddns.org/
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Convert and analyze image data in a spreadsheet.

2020-06-11 Thread Vincent Davis
Looking for a little advice.
I have 6x6 color (CIELAB <https://en.wikipedia.org/wiki/CIELAB_color_space>)
data in a spreadsheet (color samples of clothing). I would like to
visualize this as an image on a per-channel basis (L, a, b) and as a color
image. I can read the data from the spreadsheet. I am not sure which path I
should choose from there. I think the path is to read this data into a numpy
array and use scikit-image.

Questions:
1. I am not sure how to get the 3 color measurements into a color image
pixel.
2. I only kinda understand color spaces, this data is in CIELAB, do I want
to keep it in that color format? I think yes.
3. How can I visualize this data as a 6x6 color image and visualize each
color on a gray scale.
4. General hints or link of how to proceed would be helpful.
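
For questions 1 to 3, a minimal standard-library sketch of the usual CIELAB to sRGB conversion for one sample. It assumes a D65 reference white, which may not match how the samples were measured, and the function names are made up for illustration. For whole arrays, scikit-image offers skimage.color.lab2rgb, which does the same job.

```python
def lab_to_srgb(L, a, b, white=(0.95047, 1.0, 1.08883)):
    """Convert one CIELAB colour to sRGB floats in [0, 1].

    `white` is the reference white in XYZ. The default is D65, which is an
    assumption; the matrix below is also only valid for a D65 white.
    """
    def f_inv(t):
        delta = 6 / 29
        return t ** 3 if t > delta else 3 * delta ** 2 * (t - 4 / 29)

    fy = (L + 16) / 116
    fx = fy + a / 500
    fz = fy - b / 200
    x, y, z = (w * f_inv(f) for w, f in zip(white, (fx, fy, fz)))

    # XYZ (D65) to linear sRGB
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )

    def gamma(c):
        c = min(max(c, 0.0), 1.0)  # clip out-of-gamut values
        return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

    return tuple(gamma(c) for c in linear)


def lightness_as_gray(L):
    """Map the L channel (0..100) to a gray level in [0, 1]."""
    return L / 100


print(lab_to_srgb(50, 0, 0))
print(lightness_as_gray(50))
```

Each spreadsheet cell (L, a, b) becomes one RGB pixel, so a 6x6 grid of these tuples is the colour image. A single channel can be shown as grayscale by scaling it to 0..1 the same way. The a and b channels would need their own scaling, since they are signed.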

Thanks
Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: clusters of numbers

2018-12-15 Thread Vincent Davis
Why not start with a histogram?
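
A text histogram needs only the standard library. This is a rough sketch with a short made-up sample; substitute the full tuple of scores from the original post.

```python
from collections import Counter

# A short, made-up sample; replace with the full tuple from the post.
scores = (1, 1, 2, 2, 2, 7, 7, 19, 20, 20, 21)

counts = Counter(scores)
for value in range(min(scores), max(scores) + 1):
    # Counter returns 0 for values that never occur, so gaps show as empty rows.
    print(f"{value:3d} {'#' * counts[value]}")
```

Runs of long bars separated by empty rows are the candidate clusters.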

Vincent

On Sat, Dec 15, 2018 at 6:46 PM Marc Lucke  wrote:

> hey guys,
>
> I have a hobby project that sorts my email automatically for me & I want
> to improve it.  There's data science and statistical info that I'm
> missing, & I always enjoy reading about the pythonic way to do things too.
>
> I have a list of percentage scores:
>
>
> (1,11,1,7,5,7,2,2,2,10,10,1,2,2,1,7,2,1,7,5,3,8,2,6,3,2,7,2,12,3,1,2,19,3,5,1,1,7,8,8,1,5,6,7,3,14,6,1,6,7,6,15,6,3,7,2,6,23,2,7,1,21,21,8,8,3,2,20,1,3,12,3,1,2,10,16,16,15,6,5,3,2,2,11,1,14,6,3,7,1,5,3,3,14,3,7,3,5,8,3,6,17,1,1,7,3,1,2,6,1,7,7,12,6,6,2,1,6,3,6,2,1,5,1,8,10,2,6,1,7,3,5,7,7,5,7,2,5,1,19,19,1,12,5,10,2,19,1,3,19,6,1,5,11,2,1,2,5,2,5,8,2,2,2,5,3,1,21,2,3,7,10,1,8,1,3,17,17,1,5,3,10,14,1,2,14,14,1,15,6,3,2,17,17,1,1,1,2,2,3,3,2,2,7,7,2,1,2,8,2,20,3,2,3,12,7,6,5,12,2,3,11,3,1,1,8,16,10,1,6,6,6,11,1,6,5,2,5,11,1,2,10,6,14,6,3,3,5,2,6,17,15,1,2,2,17,5,3,3,5,8,1,6,3,14,3,2,1,7,2,8,11,5,14,3,19,1,3,7,3,3,8,8,6,1,3,1,14,14,10,3,2,1,12,2,3,1,2,2,6,6,7,10,10,12,24,1,21,21,5,11,12,12,2,1,19,8,6,2,1,1,19,10,6,2,15,15,7,10,14,12,14,5,11,7,12,2,1,14,10,7,10,3,17,25,10,5,5,3,12,5,2,14,5,8,1,11,5,29,2,7,20,12,14,1,10,6,17,16,6,7,11,12,3,1,23,11,10,11,5,10,6,2,17,15,20,5,10,1,17,3,7,15,5,11,6,19,14,15,7,1,2,17,8,15,10,26,6,1,2,10,6,14,12,6,1,16,6,12,10,10,14,1,6,1,6,6,12,6,6,1,2,5,10
 
,8,10,1,6,8,17,11,6,3,6,5,1,2,1,2,6,6,12,14,7,1,7,1,8,2,3,14,11,6,3,11,3,1,6,17,12,8,2,10,3,12,12,2,7,5,5,17,2,5,10,12,21,15,6,10,10,7,15,11,2,7,10,3,1,2,7,10,15,1,1,6,5,5,3,17,19,7,1,15,2,8,7,1,6,2,1,15,19,7,15,1,8,3,3,20,8,1,11,7,8,7,1,12,11,1,10,17,2,23,3,7,20,20,3,11,5,1,1,8,1,6,2,11,1,5,1,10,7,20,17,8,1,2,10,6,2,1,23,11,11,7,2,21,5,5,8,1,1,10,12,15,2,1,10,5,2,2,5,1,2,11,10,1,8,10,12,2,12,2,8,6,19,15,8,2,16,7,5,14,2,1,3,3,10,16,20,5,8,14,8,3,14,2,1,5,16,16,2,10,8,17,17,10,10,11,3,5,1,17,17,3,17,5,6,7,7,12,19,15,20,11,10,2,6,6,5,5,1,16,16,8,7,2,1,3,5,20,20,6,7,5,23,14,3,10,2,2,7,10,10,3,5,5,8,14,11,14,14,11,19,5,5,2,12,25,5,2,11,8,10,5,11,10,12,10,2,15,15,15,5,10,1,12,14,8,5,6,2,26,15,21,15,12,2,8,11,5,5,16,5,2,17,3,2,2,3,15,3,8,10,7,10,3,1,14,14,8,8,8,19,10,12,3,8,2,20,16,10,6,15,6,1,12,12,15,15,8,11,17,7,7,7,3,10,1,5,19,11,7,12,8,12,7,5,10,1,11,1,6,21,1,1,10,3,8,5,6,5,20,25,17,5,2,16,14,11,1,17,10,14,5,16,5,2,7,3,8,17,7,19,12,6,5,1,3,12,43,11,8,11,5,19,10,5,11,7,20,6,12,35,5,3,
 
17,10,2,12,6,5,21,24,15,5,10,3,15,1,12,6,3,17,3,2,3,5,5,14,11,8,1,8,10,5,25,8,7,2,6,3,11,1,11,7,3,10,7,12,10,8,6,1,1,17,3,1,1,2,19,6,10,2,2,7,5,16,3,2,11,10,7,10,21,3,5,2,21,3,14,6,7,2,24,3,17,3,21,8,5,11,17,5,6,10,5,20,1,12,2,3,20,6,11,12,14,6,6,1,14,15,12,15,6,20,7,7,19,3,7,5,16,12,6,7,2,10,3,2,11,8,6,6,5,1,11,1,15,21,14,6,3,2,2,5,6,1,3,5,3,6,20,1,15,12,2,3,3,7,1,16,5,24,10,7,1,12,16,8,26,16,15,10,19,11,6,6,5,6,5)
>
>   & I'd like to know know whether, & how the numbers are clustered.  In
> an extreme & illustrative example, 1..10 would have zero clusters;
> 1,1,1,2,2,2,7,7,7 would have 3 clusters (around 1,2 & 7);
> 17,22,20,45,47,51,82,84,83  would have 3 clusters. (around 20, 47 &
> 83).  In my set, when I scan it, I intuitively figure there's lots of
> numbers close to 0 & a lot close to 20 (or there abouts).
>
> I saw info about k-clusters but I'm not sure if I'm going down the right
> path.  I'm interested in k-clusters & will teach myself, but my priority
> is working out this problem.
>
> Do you know the name of the algorithm I'm trying to use?  If so, are
> there python libraries like numpy that I can leverage?  I imagine that I
> could iterate from 0 to 100% using that as an artificial mean, discard
> values that are over a standard deviation away, and count the number of
> scores for that mean; then at the end of that I could set a threshold
> for which the artificial mean would be kept something like (no attempt
> at correct syntax:
>
> means = {}
> deviation = 5
> threshold = int(0.25 * len(scores))  # scores is the list of percentages
> for i in range(100):
>     count = 0
>     for j in scores:
>         if abs(j - i) <= deviation:  # count scores within the deviation
>             count += 1
>     if count > threshold:
>         means[i] = count
>
> That algorithm is entirely untested & I think it could work, it's just I
> don't want to reinvent the wheel.  Any ideas kindly appreciated.
>
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: plot / graph connecting re ordered lists

2018-01-23 Thread Vincent Davis
On Tue, Jan 23, 2018 at 4:15 PM Dennis Lee Bieber <wlfr...@ix.netcom.com>
wrote:

> On Tue, 23 Jan 2018 13:51:55 -0700, Vincent Davis
> <vinc...@vincentdavis.net> declaimed the following:
>
> >Looking for suggestions. I have an ordered list of names these names will
> >be reordered. I am looking to make a plot, graph, with the two origins of
>
> IE: you have two lists with the same items in different orders...
>
> >the names in separate columns and a line connecting them to visually
> >represent how much they have moved in the reordering.
> >Surely there is some great example code for this on the net an am not
> >finding a clean example.
> >
>
> Determine positions:
>
> pos = []
> for p, name in enumerate(first_list):
> np = second_list.index(name)
> pos.append( (name, p, np) )
>
> for (name, p, np) in pos:
> draw_line((1,p) , (2, np))
> label( (1, p), name)
>
> Exact details of graphics package and scaling left as an exercise


Actually, it's recommendations for a graphing package, and an example using it
for such a graph, that I am most interested in. I know how to relate the
names on the two lists.


> --
> Wulfraed Dennis Lee Bieber AF6VN
> wlfr...@ix.netcom.comHTTP://wlfraed.home.netcom.com/
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
Sent from mobile app. Vincent Davis 720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


plot / graph connecting re ordered lists

2018-01-23 Thread Vincent Davis
Looking for suggestions. I have an ordered list of names; these names will
be reordered. I am looking to make a plot (graph) with the two orderings of
the names in separate columns and a line connecting them to visually
represent how much they have moved in the reordering.
Surely there is some great example code for this on the net, but I am not
finding a clean example.
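
One way to draw this kind of before/after chart, sketched with matplotlib as one possible package. The names and both orderings below are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt

before = ["Ann", "Bob", "Cy", "Dee"]   # hypothetical original order
after = ["Cy", "Ann", "Dee", "Bob"]    # hypothetical new order

fig, ax = plt.subplots(figsize=(4, 4))
for old_rank, name in enumerate(before):
    new_rank = after.index(name)
    # One line per name, from its old position (x=0) to its new one (x=1).
    ax.plot([0, 1], [old_rank, new_rank], marker="o")
    ax.text(-0.05, old_rank, name, ha="right", va="center")
    ax.text(1.05, new_rank, name, ha="left", va="center")

ax.set_xlim(-0.5, 1.5)
ax.set_xticks([0, 1])
ax.set_xticklabels(["before", "after"])
ax.set_yticks([])
ax.invert_yaxis()  # rank 0 at the top
```

Calling fig.savefig with a file name, or plt.show() in an interactive session, renders the figure. Steep lines mark the names that moved the most.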

Thanks
Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: x=something, y=somethinelse and z=crud all likely to fail - how do i wrap them up

2016-02-01 Thread Vincent Davis
On Sat, Jan 30, 2016 at 9:58 PM, Veek. M <vek.m1...@gmail.com> wrote:

> Is there some other nice way to wrap this stuff up?
> I can't do:
> try:
>  x=
>  y=
>  z=
> except:
>

I happen to have just been doing something similar. You can put x, y, z in a
list and loop over it. In my case a dict was better.
See the example here.
https://github.com/vincentdavis/USAC_data/blob/master/tools.py#L24
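
The dict-and-loop idea can be sketched roughly as follows. The helper name run_all and the three sample tasks are made up for illustration and are not taken from the linked file.

```python
def run_all(tasks):
    """Call each zero-argument callable; collect results and errors separately."""
    results, errors = {}, {}
    for name, func in tasks.items():
        try:
            results[name] = func()
        except Exception as exc:  # narrow this to the exceptions you expect
            errors[name] = exc
    return results, errors


tasks = {
    "x": lambda: int("42"),
    "y": lambda: int("not a number"),   # raises ValueError
    "z": lambda: {}["missing"],         # raises KeyError
}
results, errors = run_all(tasks)
print(results)
print(errors)
```

Each assignment gets its own try block without repeating the boilerplate, and a failure in one does not stop the others.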

Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


[issue25989] documentation version switcher is broken fro 2.6, 3.2, 3.3

2016-01-01 Thread Vincent Davis

New submission from Vincent Davis:

From the documentation pages for Python 2.7, 3.4, 3.5 and 3.6 it is possible
to select another Python version in the breadcrumb at the top left of the
page. This is not available for Python 2.6, 3.2 and 3.3.

See related issue which is closed.
https://bugs.python.org/issue25113

I posted this on pythondotorg but I guess this is a cpython issue not a website 
issue. https://github.com/python/pythondotorg/issues/868

Berker Peksag's response to the report:
"The version switcher is activated via a versionswitcher option in Doc/Makefile 
in CPython codebase. Docs are generated daily by using that Makefile, but 2.6, 
3.2 and 3.3 are in security-fix-only mode (which means they won't even get 
documentation fixes) so the daily build script skips generating docs for those 
versions."

--
assignee: docs@python
components: Documentation
messages: 257317
nosy: Vincentdavis, docs@python
priority: normal
severity: normal
status: open
title: documentation version switcher is broken fro 2.6, 3.2, 3.3
type: behavior

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue25989>
___



unicodedata with chr() not the same between python 3.4 and 3.5

2015-12-22 Thread Vincent Davis
​I was expecting the code below to be the same between python3.4 and 3.5. I
need a mapping between the integers and unicode that is consistent between
3.4 and 3.5.

Python 3.5
>>> import unicodedata
>>> u = ''.join(chr(i) for i in range(65536) if unicodedata.category(chr(i)) in ('Lu', 'Ll'))[945:965]
>>> u
'ԡԢԣԤԥԦԧԨԩԪԫԬԭԮԯԱԲԳԴԵ'

Python 3.4
>>> import unicodedata
>>> u = ''.join(chr(i) for i in range(65536) if unicodedata.category(chr(i)) in ('Lu', 'Ll'))[945:965]
>>> u
'ԢԣԤԥԦԧԱԲԳԴԵԶԷԸԹԺԻԼԽԾ'

As you can see, they are not the same.


'ԡԢԣԤԥԦԧԨԩԪԫԬԭԮԯԱԲԳԴԵ'
'ԢԣԤԥԦԧԱԲԳԴԵԶԷԸԹԺԻԼԽԾ'
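
The difference most likely comes from the Unicode database bundled with each interpreter. One output contains letters such as Ԩ that the other lacks, which is what happens when a newer database assigns categories to code points that were previously unassigned. A mapping that does not depend on the database can be built from explicitly listed code point ranges instead of a category query. The sketch below does that; the function name and the chosen ranges are illustrative assumptions.

```python
import unicodedata

# The Unicode database version bundled with this interpreter.
print(unicodedata.unidata_version)


def letters_in_ranges(ranges):
    """Return the characters in explicitly listed, inclusive code point ranges."""
    return "".join(chr(cp) for start, stop in ranges
                   for cp in range(start, stop + 1))


# Hypothetical choice: basic Latin letters plus Armenian capitals U+0531..U+0556.
ALPHABET = letters_in_ranges([(0x41, 0x5A), (0x61, 0x7A), (0x0531, 0x0556)])
index_of = {ch: i for i, ch in enumerate(ALPHABET)}

print(len(ALPHABET), ALPHABET[945 % len(ALPHABET)])
```

Because the ranges are fixed in the source, the integer-to-character mapping is the same on every Python version.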




Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Catogorising strings into random versus non-random

2015-12-21 Thread Vincent Davis
On Mon, Dec 21, 2015 at 7:25 AM, Vlastimil Brom <vlastimil.b...@gmail.com>
wrote:

> > baby lions at play
> > saturday_morning12
> > Fukushima
> > ImpossibleFork
> >
> >
> > (note that some use underscores, others spaces, and some CamelCase) while
> > others are completely meaningless (or mostly so):
> >
> >
> > xy39mGWbosjY
> > 9sjz7s8198ghwt
> > rz4sdko-28dbRW00u
>

My first thought is to search Google for each word or phrase and count the
results (Google gives a count). For example, if you search for
"xy39mGWbosjY" there is one result as of now, which is an archive of this
thread. If you search for any given word or even a phrase, for example
"baby lions at play", you get a much larger set of results (~500). I assume
there are many ways to search Google with Python; this looks like one:
https://pypi.python.org/pypi/google

Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Try: rather than if :

2015-12-14 Thread Vincent Davis
On Mon, Dec 14, 2015 at 4:14 PM, Cameron Simpson <c...@zip.com.au> wrote:

> First, notice that the code inside the try/except _only_ fetches the
> attribute.  Your version calls the "write" attribute, and also accesses
> handle.name. Either of those might also emit AttributeError, and should
> probably not be silently caught.
>

​I think the intent of the original code was to check if handle had the
attribute "name", I don't think the attribute "write" was the issue.

So then possibly this, based on your suggestion:
try:
    write = handle.write
except AttributeError:
    raise
try:
    name = handle.name
    write("# Report_file: %s\n" % name)
except AttributeError:
    pass
write("\n")


Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Try: rather than if :

2015-12-14 Thread Vincent Davis
On Mon, Dec 14, 2015 at 4:53 PM, Ian Kelly  wrote:

>
> Except that catching an exception just to immediately re-raise it is
> silly. This would be better:
>
> try:
>     name = handle.name
> except AttributeError:
>     pass
> else:
>     handle.write("# Report_file: %s\n" % name)


​Ya that would be silly.

Thanks​ everyone for the education.
-- 
https://mail.python.org/mailman/listinfo/python-list


Try: rather than if :

2015-12-14 Thread Vincent Davis
In the code below try is used to check if handle has the attribute name. It
seems an if statement could be used. Is there a reason one way would be
better than another?

def write_header(self):
    handle = self.handle
    try:
        handle.write("# Report_file: %s\n" % handle.name)
    except AttributeError:
        pass
    handle.write("\n")

The specific use case where I noticed this was
https://github.com/biopython/biopython/blob/master/Bio/AlignIO/EmbossIO.py#L38
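
For comparison, a small sketch of a third variant using getattr with a default. It keeps the attribute lookup separate from the write call, so an AttributeError raised inside write is not swallowed. NamedBuffer is a made-up stand-in for a real file object.

```python
import io


def write_header(handle):
    """Write the report-file header line only when the handle has a name."""
    name = getattr(handle, "name", None)  # None when the attribute is missing
    if name is not None:
        handle.write("# Report_file: %s\n" % name)
    handle.write("\n")


class NamedBuffer(io.StringIO):
    """Hypothetical stand-in for a real file object, which has a .name."""
    name = "report.txt"


plain, named = io.StringIO(), NamedBuffer()
write_header(plain)
write_header(named)
print(repr(plain.getvalue()))
print(repr(named.getvalue()))
```

An if hasattr(handle, "name") test would behave the same way. Which of the three reads best is largely a matter of taste.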

Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: manually build a unittest/doctest object.

2015-12-08 Thread Vincent Davis
On Tue, Dec 8, 2015 at 2:06 AM, Peter Otten <__pete...@web.de> wrote:

> But why would you want to do that?


Thanks Peter, I want to do that because I want to test jupyter notebooks.
​The notebook is in JSON and I can get the source and result out but it was
unclear to me how to stick this into a test. doctest seemed the simplest
but maybe there is a better way.

I also tried something like:
assert exec("""print('hello word')""") == 'hello word'
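
A note on that last attempt: exec() always returns None, so comparing its return value with the expected output can never pass. One possible approach is to capture what the code prints and compare that instead. The helper name below is made up for illustration.

```python
import contextlib
import io


def run_and_capture(source):
    """Execute source in a fresh namespace and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


output = run_and_capture("print('hello world')")
assert output == "hello world\n"
```

The same helper could be called from a unittest method with assertEqual, one call per notebook cell.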


Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: manually build a unittest/doctest object.

2015-12-08 Thread Vincent Davis
On Tue, Dec 8, 2015 at 2:06 AM, Peter Otten <__pete...@web.de> wrote:

> >>> import doctest
> >>> example = doctest.Example(
> ... "print('hello world')\n",
> ... want="hello world\n")
> >>> test = doctest.DocTest([example], {}, None, None, None, None)
> >>> runner = doctest.DocTestRunner(verbose=True)
> >>> runner.run(test)
> Trying:
> print('hello world')
> Expecting:
> hello world
> ok
> TestResults(failed=0, attempted=1)
>

And now, how do I do a multi-line statement?

>>> import doctest
>>> example = doctest.Example("print('hello')\nprint('world')", want="hello\nworld")
>>> test = doctest.DocTest([example], {}, None, None, None, None)
>>> runner = doctest.DocTestRunner(verbose=True)
>>> runner.run(test)

Trying:
print('hello')
print('world')
Expecting:
hello
world
**
Line 1, in None
Failed example:
print('hello')
print('world')
Exception raised:
Traceback (most recent call last):
  File "/Users/vincentdavis/anaconda/envs/py35/lib/python3.5/doctest.py",
line 1320, in __run
    compileflags, 1), test.globs)
  File "", line 1
print('hello')
 ^
SyntaxError: multiple statements found while compiling a single statement
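
As the traceback suggests, each doctest.Example is compiled as a single interactive statement, so two print calls in one Example fail. A likely workaround, sketched below, is to give each statement its own Example and put them in the same DocTest. The test name "two_prints" is arbitrary.

```python
import doctest

# One Example per statement; they share the DocTest's globals.
examples = [
    doctest.Example("print('hello')\n", want="hello\n", lineno=0),
    doctest.Example("print('world')\n", want="world\n", lineno=1),
]
test = doctest.DocTest(examples, {}, "two_prints", None, 0, None)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

A compound statement such as a for loop with its body still counts as one statement, so it can stay in a single Example.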



Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: manually build a unittest/doctest object.

2015-12-08 Thread Vincent Davis
On Tue, Dec 8, 2015 at 7:30 AM, Laura Creighton <l...@openend.se> wrote:

> >--
> >https://mail.python.org/mailman/listinfo/python-list
>
> Check out this:
> https://pypi.python.org/pypi/pytest-ipynb
>

Thanks Laura, I think I read the description as saying I could run unit tests
on source code from a Jupyter notebook. Reading closer, this seems like it
will work.
Not that I mind learning more about how doctests work ;-)


Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


manually build a unittest/doctest object.

2015-12-07 Thread Vincent Davis
If I have a string that is python code, for example
mycode = "print('hello world')"
myresult = "hello world"
How can I "manually" build a unittest (doctest) and test that I get myresult?

I have attempted to build a doctest but that is not working.
e = doctest.Example(source="print('hello world')/n", want="hello world\n")
t = doctest.DocTestRunner()
t.run(e)

Thanks
Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: shorten "compress" long integer to short ascii.

2015-11-20 Thread Vincent Davis
On Thu, Nov 19, 2015 at 9:27 PM, Paul Rubin <no.email@nospam.invalid> wrote:

> You can't improve much.  A decimal digit carries log(10,2)=3.32 bits
> of information.  A reasonable character set for Twitter-style links
> might have 80 or so characters (upper/lower alphabetic, digits, and
> a dozen or so punctuation characters), or log(80,2)=
>

Where do I find out more about how to calculate the information per digit?
Lots of nice little tricks you used below. Thanks for sharing.
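
On the information-per-digit question, a short worked sketch. A symbol drawn from an alphabet of N characters carries log2(N) bits, so the number of symbols needed for an n-bit integer is roughly n / log2(N), rounded up. The function names are made up for illustration.

```python
import math


def bits_per_symbol(alphabet_size):
    """Information carried by one symbol from an alphabet of this size."""
    return math.log2(alphabet_size)


def symbols_needed(n_bits, alphabet_size):
    """Upper bound on the symbols needed to write any n_bits-bit integer."""
    return math.ceil(n_bits / bits_per_symbol(alphabet_size))


print(bits_per_symbol(10))   # decimal digits
print(bits_per_symbol(83))   # the 83-character alphabet quoted below
print(symbols_needed(300, 10), symbols_needed(300, 83))
```

For a 300-bit integer this gives 91 decimal digits against 48 characters from the 83-symbol alphabet, a ratio of about 0.53. That is the best any encoding over that alphabet can do for arbitrary integers.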


> Here is my shortened version:
>
>   import string
>   from functools import reduce  # reduce is not a builtin in Python 3
>
>   # alphabet here is 83 chars
>   alphabet = string.ascii_lowercase + \
>       string.ascii_uppercase + '!"#$%&\'()*+,-./:;<=>?@[]^_`{|}~'
>   alphabet_size = len(alphabet)
>
>   decoderdict = dict((b, a) for a, b in enumerate(alphabet))
>
>   def encoder(integer):
>       a, b = divmod(integer, alphabet_size)
>       if a == 0: return alphabet[b]
>       return encoder(a) + alphabet[b]
>
>   def decoder(code):
>       return reduce(lambda n, d: n*alphabet_size + decoderdict[d], code, 0)
>
>   def test():
>       n = 92928729379271
>       short = encoder(n)
>       backagain = decoder(short)
>       nlen = len(str(n))
>       print(nlen, len(short), float(len(short))/nlen)
>       assert n == backagain, (n, short, backagain)
>
>   test()
>




Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


shorten "compress" long integer to short ascii.

2015-11-19 Thread Vincent Davis
My goal is to shorten a long integer into a shorter set of characters.
Below is what I have, which gets me about a 45-50% reduction. Any suggestions
on how to improve upon this?
I am not limited to ascii but I didn't see how going to utf8 would help.
The resulting string needs to be something I could type/paste into twitter,
for example.

On a side note, string.punctuation contains "\\". What is \\ ?

import string
import random

# Random int to shorten
r = random.getrandbits(300)
lenofr = len(str(r))

# 83 symbols, one for each two-digit string "10".."92"
l = (string.ascii_lowercase + string.ascii_uppercase +
     '!"#$%&\'()*+,-./:;<=>?@[]^_`{|}~')
n = [str(x) for x in list(range(10,93))]
decoderdict = dict(zip(l, n))
encoderdict = dict(zip(n, l))

def encoder(integer):
    s = str(integer)
    ls = len(s)
    p = 0
    code = ""
    while p < ls:
        if s[p:p+2] in encoderdict.keys():
            code = code + encoderdict[s[p:p+2]]
            p += 2
        else:
            code = code + s[p]
            p += 1
    return code

def decoder(code):
    integer = ""
    for c in code:
        if c.isdigit():
            integer = integer + c
        else:
            integer = integer + decoderdict[c]
    return int(integer)

short = encoder(r)
backagain = decoder(short)

print(lenofr, len(short), len(short)/lenofr, r==backagain)
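
On the side question about string.punctuation: the "\\" seen in its repr is a single backslash character. Python doubles it when displaying the string because a lone backslash starts an escape sequence. A quick check:

```python
import string

backslash = "\\"          # two characters in the source, one in the string
print(len(backslash))
print(backslash)          # shows the character itself
print(repr(backslash))    # shows the escaped form used in source code
print(backslash in string.punctuation, len(string.punctuation))
```

So the alphabet could include the backslash too, written as '\\' in the source.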
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Public key encryption example.

2015-11-18 Thread Vincent Davis
Found an example, needs a little updating but then it works (appears to) in
python 3.5.
http://coding4streetcred.com/blog/post/Asymmetric-Encryption-Revisited-(in-PyCrypto)

Vincent Davis
720-301-3003

On Wed, Nov 18, 2015 at 5:04 PM, Chris Angelico <ros...@gmail.com> wrote:

> On Thu, Nov 19, 2015 at 10:56 AM, Paul Rubin <no.email@nospam.invalid>
> wrote:
> > Vincent Davis <vinc...@vincentdavis.net> writes:
> >> I am looking for the "simplest" example of sending(encrypting) and
> >> receiving(decrypting) using public key encryption. I am think of
> something
> >> along the lines of having all the keys in local files and saving and
> >> reading the message from a local file.
> >
> > It's very easy to make mistakes doing stuff like that.  Your simplest
> > bet is to shell out to GPG or something comparable.
>
> It's not that hard to pull up a library. I've never done it in Python,
> though.
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Public key encryption example.

2015-11-18 Thread Vincent Davis
This might be a "Let me Google that for you" question; I tried.
I am looking for the "simplest" example of sending (encrypting) and
receiving (decrypting) using public key encryption. I am thinking of something
along the lines of having all the keys in local files and saving and
reading the message from a local file.

Possibly using the cryptography library's elliptic-curve support:
https://cryptography.io/en/latest/hazmat/primitives/asymmetric/ec/#elliptic-curve-signature-algorithms

Surely there is an example out there?

Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: write csv to object and read into pandas

2015-10-15 Thread Vincent Davis
That worked, Thanks!

Vincent Davis
720-301-3003

On Thu, Oct 15, 2015 at 6:11 AM, Peter Otten <__pete...@web.de> wrote:

> Oscar Benjamin wrote:
>
> > On 15 October 2015 at 09:16, Peter Otten <__pete...@web.de> wrote:
> >>
> >> def preprocess(filename):
> >>     with open(filename) as f:
> >>         for row in csv.reader(f):
> >>             # do stuff
> >>             yield row
> >>
> >> rows = preprocess("pandas.csv")
> >
> > Take the with statement outside of the generator and do something like:
>
> When will I ever learn :(
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


write csv to object and read into pandas

2015-10-14 Thread Vincent Davis
I have a csv file I have to make some changes to before I read it into
pandas. Currently I open the csv, read each row, make changes and save it to
a new file. Then I read it into pandas with pandas.read_csv(). How do I skip
writing the file to disk? Using python3.5.

This is what I am doing now.

with open(infile, "r") as fin:
    with open(outfile, "w") as fout:
        writer = csv.writer(fout)
        for row in csv.reader(fin):
            # do stuff to the row
            writer.writerow(row)

df = pandas.read_csv(outfile)
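
One way to skip the intermediate file, sketched with only the standard library: write the modified rows into an io.StringIO buffer and rewind it. pandas.read_csv accepts file-like objects, so the buffer can go where the path went. The strip() step is a placeholder for whatever changes the rows really need.

```python
import csv
import io


def preprocess(text):
    """Rewrite CSV text row by row into an in-memory buffer."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in csv.reader(io.StringIO(text)):
        row = [cell.strip() for cell in row]  # placeholder for "do stuff to the row"
        writer.writerow(row)
    buffer.seek(0)  # rewind so the next reader starts at the top
    return buffer


buffer = preprocess("a, b\n1, 2\n")
print(repr(buffer.getvalue()))
# With pandas installed, the buffer can be passed where a path would go:
# df = pandas.read_csv(buffer)
```

With a real file, the inner csv.reader would wrap the open file handle instead of a second StringIO.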

Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: assertRaises() help

2015-05-27 Thread Vincent Davis
On Wed, May 27, 2015 at 4:55 PM, Cameron Simpson c...@zip.com.au wrote:

 First, test your test by hand running:

  to_datetime('2015-02-29', coerce=False)

 _Does_ it raise ValueError?


​Well that was not expected.​ Thanks


Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


assertRaises() help

2015-05-27 Thread Vincent Davis
I am trying to add a test to pandas. In the first case I assert that I get
a NaT value; in the second I want to test that I get a ValueError.
test_day_not_in_month_coerce_true() works.
I am trying to duplicate it with coerce=False, which will give a ValueError,
but I can't get the tests to work.


class TestDaysInMonth(tm.TestCase):
    def test_day_not_in_month_coerce_true(self):
        self.assertTrue(isnull(to_datetime('2015-02-29', coerce=True)))
        self.assertTrue(isnull(to_datetime('2015-02-29', format="%Y-%m-%d",
                                           coerce=True)))
        self.assertTrue(isnull(to_datetime('2015-02-32', format="%Y-%m-%d",
                                           coerce=True)))
        self.assertTrue(isnull(to_datetime('2015-04-31', format="%Y-%m-%d",
                                           coerce=True)))

    def test_day_not_in_month_coerce_false(self):
        self.assertRaises(ValueError, to_datetime, '2015-02-29',
                          coerce=False)

what I get is

FAIL: test_day_not_in_month_coerce_false
(pandas.tests.test_tseries.TestDaysInMonth)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/vmd/GitHub/pandas_vmd/pandas/tests/test_tseries.py", line 747, in test_day_not_in_month_coerce_false
    self.assertRaises(ValueError, to_datetime, '2015-02-29', coerce=False)
  File "/Users/vmd/GitHub/pandas_vmd/pandas/util/testing.py", line 1576, in assertRaises
    _callable(*args, **kwargs)
  File "/Users/vmd/GitHub/pandas_vmd/pandas/util/testing.py", line 1640, in __exit__
    raise AssertionError("{0} not raised.".format(name))
AssertionError: ValueError not raised.

From the docs, maybe I should be using a with statement.
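
For reference, the callable form and the with-statement form of assertRaises check the same thing, so switching forms would not change the result. "ValueError not raised" means the call returned normally. The sketch below uses datetime.strptime as a stand-in for the function under test, since it does raise ValueError for an impossible date; parse_day is a made-up name.

```python
import datetime
import unittest


def parse_day(text):
    """Stand-in for the function under test; raises ValueError on impossible dates."""
    return datetime.datetime.strptime(text, "%Y-%m-%d")


class TestDaysInMonth(unittest.TestCase):
    def test_callable_form(self):
        self.assertRaises(ValueError, parse_day, "2015-02-29")

    def test_context_manager_form(self):
        with self.assertRaises(ValueError):
            parse_day("2015-02-29")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestDaysInMonth)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Running the call by hand outside the test is a quick way to confirm whether it raises at all.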


Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: getting fieldnames from Dictreader before reading lines

2015-05-09 Thread Vincent Davis
Not sure what I was doing wrong, it seems to work now.

Vincent Davis
720-301-3003

On Sat, May 9, 2015 at 4:46 PM, Vincent Davis vinc...@vincentdavis.net
wrote:

 I am reading a file with Dictreader and writing a new file. I want use the
 fieldnames in the Dictwriter from the reader. See below How should I be
 doing this?

 See how I am using reader.fieldnames in the the Dictwriter. I get an error
 (below)

 with open(readfile, 'r', encoding='utf-8', errors='ignore', newline='') as
 csvread:
 reader = DictReader(csvread)
 with open(writefile, 'w') as csvwrite:
 writer = DictWriter(csvwrite, delimiter=',',
 fieldnames=reader.fieldnames)
 for line in reader:
 pass

 ValueError                                Traceback (most recent call last)
 <ipython-input-13-0dac622bb8a9> in <module>()
 ----> 1 reader.fieldnames()

 /Users/vmd/anaconda/envs/py34/lib/python3.4/csv.py in fieldnames(self)
      94         if self._fieldnames is None:
      95             try:
 ---> 96                 self._fieldnames = next(self.reader)
      97             except StopIteration:
      98                 pass

 ValueError: I/O operation on closed file.



 Thanks
 Vincent
 ​ Davis​


-- 
https://mail.python.org/mailman/listinfo/python-list


getting fieldnames from Dictreader before reading lines

2015-05-09 Thread Vincent Davis
I am reading a file with DictReader and writing a new file. I want to use the
fieldnames from the reader in the DictWriter. See below. How should I be
doing this?

See how I am using reader.fieldnames in the DictWriter. I get an error
(below).

with open(readfile, 'r', encoding='utf-8', errors='ignore', newline='') as
csvread:
reader = DictReader(csvread)
with open(writefile, 'w') as csvwrite:
writer = DictWriter(csvwrite, delimiter=',',
fieldnames=reader.fieldnames)
for line in reader:
pass

ValueError                                Traceback (most recent call last)
<ipython-input-13-0dac622bb8a9> in <module>()
----> 1 reader.fieldnames()

/Users/vmd/anaconda/envs/py34/lib/python3.4/csv.py in fieldnames(self)
     94         if self._fieldnames is None:
     95             try:
---> 96                 self._fieldnames = next(self.reader)
     97             except StopIteration:
     98                 pass

ValueError: I/O operation on closed file.
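
A guess at what happened, based on the traceback. reader.fieldnames is a property, and its first access reads the header row from the file, so it has to happen while the input file is still open. The traceback shows it being accessed (and called) on its own after the file was closed. A minimal sketch with in-memory buffers standing in for the two files:

```python
import csv
import io

source = io.StringIO("name,score\nada,1\n")   # stands in for the open input file
target = io.StringIO()                        # stands in for the open output file

reader = csv.DictReader(source)
# First access reads the header row, so the input must still be open here.
writer = csv.DictWriter(target, fieldnames=reader.fieldnames)
writer.writeheader()
for row in reader:
    writer.writerow(row)

print(reader.fieldnames)
print(repr(target.getvalue()))
```

With real files, nesting the second with block inside the first keeps the input open for as long as the reader is in use.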



Thanks
Vincent
​ Davis​
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: getting fieldnames from Dictreader before reading lines

2015-05-09 Thread Vincent Davis
On Sat, May 9, 2015 at 5:55 PM, Dave Angel da...@davea.name wrote:

 1) you're top-posting, putting your response  BEFORE the stuff you're
responding to.


I responded to my own email, seemed ok to top post on myself saying it was
resolved.


 2) both messages are in html, which thoroughly messed up parts of your
error messages.

I am posting from google mail (not google groups). Kindly let me know if
this email is also html.



Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


date from day (count) of year

2015-04-24 Thread Vincent Davis
How does one get the date given the day of a year?

>>> dt.datetime.now().timetuple().tm_yday
114

How would I get the date of the 114th day of 2014?

Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: date from day (count) of year

2015-04-24 Thread Vincent Davis
On Fri, Apr 24, 2015 at 8:01 AM, Ian Kelly ian.g.ke...@gmail.com wrote:

  dt.date(2014, 1, 1) + dt.timedelta(114 - 1)
 datetime.date(2014, 4, 24)
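
The same arithmetic can be wrapped in a small helper. This is only a sketch; the function names are made up, and the second variant assumes the `%j` (day of year) directive of `strptime` is acceptable for the use case:

```python
import datetime as dt


def date_from_day_of_year(year, day_of_year):
    """Day 1 is January 1st, so add (day_of_year - 1) days to it."""
    return dt.date(year, 1, 1) + dt.timedelta(days=day_of_year - 1)


def date_from_day_of_year_strptime(year, day_of_year):
    """Alternative: let strptime parse the day-of-year field (%j)."""
    text = "{} {}".format(year, day_of_year)
    return dt.datetime.strptime(text, "%Y %j").date()
```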


​Thanks!​
Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: using DictReader() with .decode('utf-8', 'ignore')

2015-04-14 Thread Vincent Davis
 Which DictReader? Do you mean the one in the csv module? I will assume so.

​yes.​



 # untested
 with open(dfile, 'r', encoding='utf-8', errors='ignore', newline='') as f:
     reader = csv.DictReader(f)
     for row in reader:
         print(row['fieldname'])


What you have seems to work; now I need to go find my strange symbols that
are not 'utf-8' and see what happens.
I thought that I had to open with 'rb' to use encoding?


Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: using DictReader() with .decode('utf-8', 'ignore')

2015-04-14 Thread Vincent Davis
On Tue, Apr 14, 2015 at 7:48 AM, Steven D'Aprano 
steve+comp.lang.pyt...@pearwood.info wrote:

 with open(dfile, 'rb') as f:
     for line in f:
         try:
             s = line.decode('utf-8', 'strict')
         except UnicodeDecodeError as err:
             print(err)

 If you need help deciphering the errors, please copy and paste them here
 and
 we'll see what we can do.


Below are the errors. I knew about these and I think the correct encoding
is windows-1252. I will paste some code and output at the end of this email
that prints the offending column in the line. These are very likely errors,
and so I want to remove them. I am reading this csv into a django sqlite3 db.
What is strange to me is that using

with open(dfile, 'r', encoding='utf-8', errors='ignore', newline='')

does not seem to remove these; it seems to correctly save them to the db,
which I don't understand.


'utf-8' codec can't decode byte 0xa6 in position 368: invalid start byte
'utf-8' codec can't decode byte 0xac in position 223: invalid start byte
'utf-8' codec can't decode byte 0xa6 in position 1203: invalid start byte
'utf-8' codec can't decode byte 0xa2 in position 44: invalid start byte
'utf-8' codec can't decode byte 0xac in position 396: invalid start byte

import chardet
with open('DATA/ATSDTA_ATSP600.csv', 'rb') as f:
    for line in f:
        code = chardet.detect(line)
        #if code == {'confidence': 0.5, 'encoding': 'windows-1252'}:
        if code != {'encoding': 'ascii', 'confidence': 1.0}:
            print(code)
            win = line.decode('windows-1252').split(',')  # windows-1252
            norm = line.decode('utf-8', 'ignore').split(',')
            ascii = line.decode('ascii', 'ignore').split(',')
            ascii2 = line.decode('ISO-8859-1').split(',')

            # print only the columns where the decodings disagree
            for w, n, a, a2 in zip(win, norm, ascii, ascii2):
                if w != n:
                    print(w, n, a, a2)
            print(win[0])

​## Output​

{'encoding': 'windows-1252', 'confidence': 0.5}
¦¦   
040543
{'encoding': 'windows-1252', 'confidence': 0.5}
LEASE GREGPRU D ¬ETERSPM  LEASE GREGPRU D ETERSPM
   LEASE GREGPRU D ETERSPM  LEASE
GREGPRU D ¬ETERSPM 
979643
{'encoding': 'windows-1252', 'confidence': 0.5}
¦¦   
986979
{'encoding': 'windows-1252', 'confidence': 0.5}
WELLS FARGO ¢ COMPANYWELLS FARGO  COMPANY
   WELLS FARGO  COMPANYWELLS
FARGO ¢ COMPANY   
994946
{'encoding': 'windows-1252', 'confidence': 0.5}
OSSOSSO¬¬O  OSSOSSOO  OSSOSSOO  OSSOSSO¬¬O 
996535



Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


using DictReader() with .decode('utf-8', 'ignore')

2015-04-14 Thread Vincent Davis
I had been reading in a file like so (Python 3):
with open(dfile, 'rb') as f:
    for line in f:
        line = line.decode('utf-8', 'ignore').split(',')

How can I accomplish decode('utf-8', 'ignore') when reading with
 DictReader()
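
One way to get the same effect with the csv module, sketched under the assumption that dropping undecodable bytes is really what is wanted: open the file in text mode and let `open()` do the decoding with `errors='ignore'`. The helper name is made up for illustration.

```python
import csv


def read_rows_ignoring_bad_bytes(path, encoding="utf-8"):
    """Return all rows as dicts, silently dropping bytes that do not decode."""
    # Text mode + errors="ignore" plays the role of
    # line.decode(encoding, 'ignore') from the byte-oriented loop.
    with open(path, "r", encoding=encoding, errors="ignore", newline="") as f:
        return list(csv.DictReader(f))
```

If the data is really windows-1252, passing `encoding="cp1252"` keeps the characters instead of discarding them.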


Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python regex exercise

2015-04-04 Thread Vincent Davis
On Sat, Apr 4, 2015 at 5:51 PM, Thomas 'PointedEars' Lahn 
pointede...@web.de wrote:

  Do anyone have good links to python regex or other python problems for
  beginners but with solution.
 
  Please mail me.


​I recently found​ this
https://regex101.com/#python


Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


[issue21360] mailbox.Maildir should ignore files named with a leading dot

2015-01-02 Thread Vincent Davis

Changes by Vincent Davis vinc...@vincentdavis.com:


--
nosy: +Vincentdavis

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21360
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue15933] flaky test in test_datetime

2015-01-02 Thread Vincent Davis

Vincent Davis added the comment:

Rather than dealing with the time delta, how about getting the time twice and
checking that we are between the two, or that at least once we have the same day?
i.e.

ts1 = time()
today = self.theclass.today()
ts2 = time()
todayagain1 = self.theclass.fromtimestamp(ts1)
todayagain2 = self.theclass.fromtimestamp(ts2)
# This would then cover all the cases. We could separate these cases; I don't
# see the need for a loop.
self.assertTrue(today == todayagain1 or today == todayagain2
                or todayagain1 <= today <= todayagain2)
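
Outside the test suite, the bracketing idea can be tried on its own. This is only a sketch, with `datetime.date` standing in for `self.theclass` and a made-up function name; it assumes the system clock does not jump backwards between the calls:

```python
import datetime
import time


def today_is_bracketed(cls=datetime.date):
    """True if cls.today() lies between two surrounding timestamps."""
    ts1 = time.time()
    today = cls.today()
    ts2 = time.time()
    before = cls.fromtimestamp(ts1)
    after = cls.fromtimestamp(ts2)
    # Covers the same-day case and a midnight rollover between the calls.
    return before <= today <= after
```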

--
nosy: +Vincentdavis

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15933
___



[issue20544] Use specific asserts in operator tests

2015-01-02 Thread Vincent Davis

Vincent Davis added the comment:

Looks like this is ready to be applied and closed or just closed.

--
nosy: +Vincentdavis

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20544
___



[issue18983] Specify time unit for timeit CLI

2015-01-02 Thread Vincent Davis

Vincent Davis added the comment:

Anything else need to be done on this patch?

--
nosy: +Vincentdavis

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue18983
___



Re: suggestions for VIN parsing

2014-12-29 Thread Vincent Davis
On Mon, Dec 29, 2014 at 7:47 AM, Denis McMahon denismfmcma...@gmail.com
wrote:

 K .. D would be the appropriate month prefixes for the 1981 model year,
 but if both the 1981 and 1982 model years used DA as a year prefix, there
 would be some prefixes that appeared twice, in the 1981 model year and
 the 1982 model year.


Ah, I had not looked closely at that yet. I found a different, more
extensive site.
http://www.britishonly.com/tech/joust/techtiptriumphmf.htm


Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: suggestions for VIN parsing

2014-12-29 Thread Vincent Davis
On Sun, Dec 28, 2014 at 11:50 PM, Rick Johnson rantingrickjohn...@gmail.com
 wrote:

 3. I see that you are utilizing regexps to aid in the logic,
 and although i agree that regexps are overkill for this
 problem (since it could technically be solved with string
 methods) if *I* had to solve this problem, i would use the
 power of regexps -- although i would use them more wisely ;-)

 I have not studied the data thoroughly, but just by grazing
 over the code you posted i can see a few distinct patterns
 that emerge from the VIN data-set. Here is a description of
 the patterns:

 \d+n
 \d+na
 d\d+
 du\d+

 and the last pattern being all digits:

 \d+

 Even though your verbose-run-on-conditional would most
 likely execute faster, i prefer to write code (when
 performance is not mission critical!) in the most readable
 and maintainable fashion. And in order to achieve that goal,
 you always want to keep the main logic as succinct as
 possible whist encapsulating the difficult bits in suitably
 abstracted structures.


​Rick,
Thanks for your suggestions, I was just starting version 2 and wanted to do
something like you suggest.
Another question. I want to change the logic so that rather than return THE
match it returns all matches. I want to do this for 2 reasons: 1, it would
act as a kind of test; if I only expect one match and I get more than one, I
likely have a problem; 2, I found a more extensive (maybe better) list of
frame numbers http://www.britishonly.com/tech/joust/techtiptriumphmf.htm, and
I could see some overlapping, although I have not looked real close yet.
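
A rough sketch of the "return all matches" idea, using the five patterns quoted above. The labels are made up for illustration, and matching is assumed to be case-insensitive and anchored to the whole string; a real table would map each pattern (plus a number range) to a model year.

```python
import re

# Hypothetical labels; the regular expressions are the ones listed above.
PATTERNS = [
    ("digits_n", r"\d+n"),
    ("digits_na", r"\d+na"),
    ("d_digits", r"d\d+"),
    ("du_digits", r"du\d+"),
    ("digits", r"\d+"),
]
COMPILED = [(label, re.compile(p, re.IGNORECASE)) for label, p in PATTERNS]


def matching_patterns(vin):
    """Return the labels of every pattern that matches the whole string."""
    return [label for label, rx in COMPILED if rx.fullmatch(vin)]
```

A result with more than one label would flag overlapping rules, and an empty list would flag a number that no rule covers.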



Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: suggestions for VIN parsing

2014-12-28 Thread Vincent Davis
 'tp1959'
        elif g[0][0] == '0' and 29364 <= int(g[0]) <= 30424:  # tp1960: 029364 - 030424
            return 'tp1960'
        else:
            return None
    else:
        return None

vin_test_list = ['101n', '500n', '234na', '15809NA', '25000', '32303',
                 '44135', '56700', '70930', '0100', 'H11512', 'D15789', 'DU101']
for vin in vin_test_list:
    print(vin_to_year2(vin))


Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


suggestions for VIN parsing

2014-12-25 Thread Vincent Davis
I would like to parse the VIN, frame and engine numbers found on this page
(below). I don't really know any regex; I have looked a little at
pyparsing. I have some other similar numbers too. I am looking for
suggestions: which tool should I learn, and how should I approach this?
http://www.britishspares.com/41.php

Thanks
Vincent
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: suggestions for VIN parsing

2014-12-25 Thread Vincent Davis
These are vintage motorcycles, so the VINs are not like modern VINs;
these are frame numbers and engine numbers.
I don't want to parse the page, I want a function that given a VIN (frame
or engine number) returns the year the bike was made.


Vincent Davis
720-301-3003

On Thu, Dec 25, 2014 at 5:56 PM, Dan Stromberg drsali...@gmail.com wrote:

 On Thu, Dec 25, 2014 at 4:02 PM, Vincent Davis vinc...@vincentdavis.net
 wrote:
  I would like to parse the VIN, frame and engine numbers found on this
 page
  (below). I don't really know any regex, I have looked a little at
 pyparsing.
  I have some other similar numbers to. I am looking for suggestions, which
  tool should I learn, how should I approach this.
  http://www.britishspares.com/41.php

 I don't see any VIN numbers there offhand (they perhaps don't belong
 on the public internet since they can sometimes be used to make a car
 key), but most people parse HTML using Python via lxml or
 BeautifulSoup.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: suggestions for VIN parsing

2014-12-25 Thread Vincent Davis
Tim and Ben,
Thanks for your input, I am working on it now and will come back when I
have questions.
Any comment on using pyparsing vs. regex?

Vincent Davis
720-301-3003

On Thu, Dec 25, 2014 at 7:18 PM, Ben Finney ben+pyt...@benfinney.id.au
wrote:

 Vincent Davis vinc...@vincentdavis.net writes:

  I don't want to parse the page, I what a function that given a VIN
  (frame or engine number) returns the year the bike was made.

 So the page has a collection of tables which the reader can use to
 manually look up the VIN, or elements of the VIN, to determine the range
 of dates of manufacture.

 Your problem is to come up with a suitable data structure to map VIN to
 date range.

 The core of this, in Python, is going to be a ‘dict’ instance. You need
 to represent “part of a VIN” as the key, and “range of dates” as the
 resulting value.


 For the value, a “date range” is expressed simply by a 2-item tuple of
 dates::

 import datetime
 date_range = (datetime.date(1979, 8, 1), datetime.date(1980, 7, 31))

 If you want something a little more expressive, make a namedtuple to
 name the items in the tuple::

 import datetime
 import collections

 DateRange = collections.namedtuple('DateRange', ['begin', 'end'])

 date_range = DateRange(
 begin=datetime.date(1979, 8, 1),
 end=datetime.date(1980, 7, 31))


 Given that a VIN is (despite the number) not a number, but instead a
 string of characters, I would recommend using “string prefix” as the
 key.

 To match a VIN, iterate through the keys and attempt a match against the
 prefix; if a match is found, the date range is obtained simply by
 getting the corresponding value from the dictionary.

 However, you have some entries in those tables with “prefix ranges”. You
 can extrapolate from what I wrote here to come up with a method for
 matching within a range of prefixes.
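
The prefix-keyed dictionary described above might look like the following sketch. The prefixes "XA" and "XB" and the second date range are invented purely for illustration; only the namedtuple shape and the first date range come from the example above.

```python
import collections
import datetime

DateRange = collections.namedtuple("DateRange", ["begin", "end"])

# Hypothetical table: the prefixes are placeholders, not real frame prefixes.
PREFIX_TO_RANGE = {
    "XA": DateRange(datetime.date(1979, 8, 1), datetime.date(1980, 7, 31)),
    "XB": DateRange(datetime.date(1980, 8, 1), datetime.date(1981, 7, 31)),
}


def lookup_date_ranges(vin, table=PREFIX_TO_RANGE):
    """Return the date range of every prefix that the VIN starts with."""
    vin = vin.upper()
    return [rng for prefix, rng in table.items() if vin.startswith(prefix)]
```

Returning a list rather than a single value keeps overlapping prefixes visible instead of silently picking one.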


 I *strongly* recommend keeping the data set small until you come up with
 a working means to store and look up the information. While you do so,
 feel free to post (small!) code examples here to show your working.

 Good hunting.

 --
  \ “The Vatican is not a state.… a state must have territory. This |
   `\ is a palace with gardens, about as big as an average golf |
 _o__) course.” —Geoffrey Robertson, 2010-09-18 |
 Ben Finney

 --
 https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


speed up pandas calculation

2014-07-30 Thread Vincent Davis
I know this is a general Python list and I am asking about pandas, but this
question is probably not a great fit for Stack Overflow.
I have a list of files (~80 files, ~30,000 rows) I need to process. With my
current code it takes minutes for each file. Any suggestions for a faster
way? I am trying to stick with pandas for educational purposes. Any
suggestions would be great. If you are curious, you can find the data file I
am using here: http://www.nber.org/nhamcs/data/nhamcsopd2010.csv

drugs_current = {'CITALOPRAM': 4332,
 'ESCITALOPRAM': 4812,
 'FLUOXETINE': 236,
 'FLUVOXAMINE': 3804,
 'PAROXETINE': 3157,
 'SERTRALINE': 880,
 'METHYLPHENIDATE': 900,
 'DEXMETHYLPHENIDATE': 4777,
 'AMPHETAMINE-DEXTROAMPHETAMINE': 4035,
 'DEXTROAMPHETAMINE': 804,
 'LISDEXAMFETAMINE': 6663,
 'METHAMPHETAMINE': 805,
 'ATOMOXETINE': 4827,
 'CLONIDINE': 44,
 'GUANFACINE': 717}

drugs_98_05 = { 'SERTRALINE': 56635,
'CITALOPRAM': 59829,
'FLUOXETINE': 80006,
'PAROXETINE_HCL': 57150,
'FLUVOXAMINE': 57064,
'ESCITALOPRAM': 70466,
'DEXMETHYLPHENIDATE': 70427,
'METHYLPHENIDATE': 70374,
'METHAMPHETAMINE': 53485,
'AMPHETAMINE1': 70257,
'AMPHETAMINE2': 70258,
'AMPHETAMINE3': 50265,
'DEXTROAMPHETAMINE1': 70259,
'DEXTROAMPHETAMINE2': 70260,
'DEXTROAMPHETAMINE3': 51665,
'COMBINATION_PRODUCT': 51380,
'FIXED_COMBINATION': 51381,
'ATOMOXETINE': 70687,
'CLONIDINE1': 51275,
'CLONIDINE2': 70357,
'GUANFACINE': 52498
   }

df = pd.read_csv('nhamcsopd2010.csv', index_col='PATCODE',
                 low_memory=False)
col_init = list(df.columns.values)
keep_col = ['PATCODE', 'PATWT', 'VDAY', 'VMONTH', 'VYEAR', 'MED1', 'MED2',
            'MED3', 'MED4', 'MED5']
for col in col_init:
    if col not in keep_col:
        del df[col]
if f[-3:] == 'csv' and f[-6:-4] in ('93', '94', '95', '96', '97', '98',
                                    '99', '00', '91', '02', '03', '04', '05'):
    drugs = drugs_98_05
elif f[-3:] == 'csv' and f[-6:-4] in ('06', '08', '09', '10'):
    drugs = drugs_current
for n in drugs:
    df[n] = df[['MED1','MED2','MED3','MED4','MED5']].isin([drugs[n]]).any(1)


Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: speed up pandas calculation

2014-07-30 Thread Vincent Davis
On Wed, Jul 30, 2014 at 6:28 PM, Vincent Davis vinc...@vincentdavis.net
wrote:

 The real slow part seems to be
 for n in drugs:
 df[n] =
 df[['MED1','MED2','MED3','MED4','MED5']].isin([drugs[n]]).any(1)


​I was wrong, this is fast, it was selecting the columns that was slow.
Using

keep_col = ['PATCODE', 'PATWT', 'VDAYR', 'VMONTH', 'MED1', 'MED2', 'MED3',
            'MED4', 'MED5']
df = df[keep_col]

took the time down from 19 sec to 2 sec.
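
Putting the two steps together, a hedged sketch: select the wanted columns in one slice, then add one boolean column per drug. The function name and the tolerance for missing columns are assumptions, and `any(axis=1)` is written with the keyword because newer pandas versions no longer accept the positional form.

```python
import pandas as pd

MED_COLS = ["MED1", "MED2", "MED3", "MED4", "MED5"]


def flag_drugs(df, drugs, keep_col):
    """Keep only keep_col, then add a True/False column for each drug code."""
    # One slice instead of deleting unwanted columns one at a time.
    present = [c for c in keep_col if c in df.columns]
    out = df[present].copy()
    meds = out[[c for c in MED_COLS if c in out.columns]]
    for name, code in drugs.items():
        # True where any of the MED columns holds this drug's code.
        out[name] = meds.isin([code]).any(axis=1)
    return out
```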


Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: speed up pandas calculation

2014-07-30 Thread Vincent Davis
On Wed, Jul 30, 2014 at 5:57 PM, Skip Montanaro skip.montan...@gmail.com
wrote:

  df = pd.read_csv('nhamcsopd2010.csv', index_col='PATCODE',
                   low_memory=False)
  col_init = list(df.columns.values)
  keep_col = ['PATCODE', 'PATWT', 'VDAY', 'VMONTH', 'VYEAR', 'MED1',
              'MED2', 'MED3', 'MED4', 'MED5']
  for col in col_init:
      if col not in keep_col:
          del df[col]

 I'm no pandas expert, but a couple things come to mind. First, where is
 your code slow (profile it, even with a few well-placed prints)? If it's in
 read_csv there might be little you can do unless you load those data
 repeatedly, and can save a pickled data frame as a caching measure. Second,
 you loop over columns deciding one by one whether to keep or toss a column.
 Instead try

 df = df[keep_col]

 Third, if deleting those other columns is costly, can you perhaps just
 ignore them?

 Can't be more investigative right now. I don't have pandas on Android. :-)


So the df = df[keep_col] is not fast but it is not that slow. You made me
think of a solution to that part: just slice and copy. The only gotcha is
that the columns in keep_col have to actually exist.
keep_col = ['PATCODE', 'PATWT', 'VDAYR', 'VMONTH', 'MED1', 'MED2', 'MED3',
            'MED4', 'MED5']
df = df[keep_col]

The real slow part seems to be
for n in drugs:
    df[n] = df[['MED1','MED2','MED3','MED4','MED5']].isin([drugs[n]]).any(1)



Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


bz2.decompress as file handle

2014-05-18 Thread Vincent Davis
I have a file compressed with bz2 and a function that expects a file handle.
When I decompress the bz2 file I get a string (binary), not a file handle.

Here is what I have that does not work. There is no error (that's a separate
issue); CelFile.read just fails to read the data (string).

from Bio.Affy import CelFile
from bz2 import decompress

with open('Tests/Affy/affy_v3_ex.CEL.bz2', 'rb') as handle:
    cel_data = decompress(handle.read())

c = CelFile.read(cel_data)



​​
​Thanks​
Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: bz2.decompress as file handle

2014-05-18 Thread Vincent Davis
Well after posting, I think I figured it out.
The key is to use StringIO to get a file handle on the string. The fact
that it is binary just complicates it a little.

with open('Tests/Affy/affy_v3_ex.CEL.bz2', 'rb') as handle:
    cel_data = StringIO(decompress(handle.read()).decode('ascii'))

Vincent Davis
720-301-3003


On Sun, May 18, 2014 at 8:32 PM, Tim Chase python.l...@tim.thechases.comwrote:

 On 2014-05-18 19:53, Vincent Davis wrote:
  I have a file compressed with bz2 and a function that expects a
  file handle. When I decompress the bz2 file I get a string (binary)
  not a file handle.

  from bz2 import decompress,
 
  with open('Tests/Affy/affy_v3_ex.CEL.bz2', 'rb') as handle:
  cel_data = decompress(handle.read())

 When I try (without the Bio.Affy which isn't part of the stdlib), I
 get correct bytes from this:

 tim@bigbox:~$ echo hello world  test.txt
 tim@bigbox:~$ bzip2 -9 test.txt
 tim@bigbox:~$ python3
 Python 3.2.3 (default, Feb 20 2013, 14:44:27)
 [GCC 4.7.2] on linux2
 Type help, copyright, credits or license for more information.
  from bz2 import decompress
  with open('test.txt.bz2', 'rb') as f:
 ... data = decompress(f.read())
 ...
  data
 b'hello world\n'


  c = CelFile.read(cel_data)

 So either you have bad data in the file to begin with, or your
 CelFile.read() function has a bug in it.

 -tkc




-- 
https://mail.python.org/mailman/listinfo/python-list


Re: bz2.decompress as file handle

2014-05-18 Thread Vincent Davis
On Sun, May 18, 2014 at 9:44 PM, Ian Kelly ian.g.ke...@gmail.com wrote:

 You can just use bz2.open:

  with bz2.open('test.txt.bz2', 'rt', encoding='ascii') as f:
 ... print(f.read())
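
Both routes to a text file handle can be put side by side. A minimal sketch with made-up helper names; `bz2.open` needs Python 3.3 or later, and ASCII is assumed only because the thread uses it:

```python
import bz2
import io


def open_bz2_text(path, encoding="ascii"):
    """Open a .bz2 file directly as a text-mode file handle."""
    return bz2.open(path, "rt", encoding=encoding)


def bz2_bytes_to_text_handle(compressed, encoding="ascii"):
    """Wrap already-read compressed bytes in an in-memory text handle."""
    return io.StringIO(bz2.decompress(compressed).decode(encoding))
```

The first form streams from disk; the second holds the whole decompressed text in memory.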


Thanks, I like that better than my solution.


Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Merge/append CSV files with different headers

2014-03-24 Thread Vincent Davis
I have several csv files I need to append (vertically). They have different
but overlapping headers. For example:
file1 headers ['a', 'b', 'c']
file2 headers ['d', 'e']
file3 headers ['c', 'd']

Is there a better way than this?
import csv
def merge_csv(fileList, newFileName):
    allHeaders = set([])
    for afile in fileList:
        with open(afile, 'rb') as csvfilesin:
            eachheader = csv.reader(csvfilesin, delimiter=',').next()
            allHeaders.update(eachheader)
    print(allHeaders)
    with open(newFileName, 'wb') as csvfileout:
        outfile = csv.DictWriter(csvfileout, allHeaders)
        outfile.writeheader()
        for afile in fileList:
            print('***'+afile)
            with open(afile, 'rb') as csvfilesin:
                rows = csv.DictReader(csvfilesin, delimiter=',')
                for r in rows:
                    print(allHeaders.issuperset(r.keys()))
                    outfile.writerow(r)

Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Merge/append CSV files with different headers

2014-03-24 Thread Vincent Davis
Thanks for the feedback.

Vincent Davis
720-301-3003


On Mon, Mar 24, 2014 at 1:44 PM, Chris Angelico ros...@gmail.com wrote:

 On Tue, Mar 25, 2014 at 4:50 AM, Vincent Davis vinc...@vincentdavis.net
 wrote:
  I have several csv file I need to append (vertically). They have
 different
  but overlapping headers. For example;
  file1 headers ['a', 'b', 'c']
  file2 headers ['d', 'e']
  file3 headers ['c', 'd']
 
  Is there a better way than this

 Summary of your code:

 1) Build up a set of all headers used, by opening each file and
 reading the headers.
 2) Go through each file a second time and write them out.

 That seems like the best approach, broadly. You might be able to
 improve it a bit (it might be tidier to open each file once, but since
 you're using two different CSV readers, it'd probably not be), but by
 and large, I'd say you have the right technique. Your processing time
 here is going to be dominated by the actual work of copying.

 The only thing you might want to consider is order. The headers all
 have a set order to them, and it'd make sense to have the output come
 out as ['a', 'b', 'c', 'd', 'e'] - the first three from the first
 file, then adding in everything from subsequent files in the order
 they were found. Could be done easily enough by using 'in' and
 .append() on a list, rather than using a set. But if that doesn't
 matter to you, or if something simple like sort the headers
 alphabetically will do, then I think you basically have what you
 want.
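
The ordered-header variant suggested above could be sketched as follows. This is written for Python 3 (text mode with `newline=''`), unlike the Python 2 style code in the question, and the helper names are made up:

```python
import csv


def collect_headers(paths):
    """Collect column names in first-seen order, using a list and 'in'."""
    headers = []
    for path in paths:
        with open(path, newline="") as f:
            for name in next(csv.reader(f), []):
                if name not in headers:
                    headers.append(name)
    return headers


def merge_csv(paths, out_path):
    """Append all files vertically; missing columns are left empty."""
    headers = collect_headers(paths)
    with open(out_path, "w", newline="") as out:
        writer = csv.DictWriter(out, fieldnames=headers, restval="")
        writer.writeheader()
        for path in paths:
            with open(path, newline="") as f:
                for row in csv.DictReader(f):
                    writer.writerow(row)
    return headers
```

With the three example files, the output header comes out as a, b, c, d, e.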

 ChrisA
 --
 https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: generate De Bruijn sequence memory and string vs lists

2014-01-24 Thread Vincent Davis
On Fri, Jan 24, 2014 at 2:29 AM, Gregory Ewing
greg.ew...@canterbury.ac.nzwrote:

 If all you want is a mapping between a sequence of
 length n and compact representation of it, there's
 a much simpler way: just convert it to a base-k
 integer, where k is the size of the alphabet.

 The resulting integer won't be any larger than an
 index into the de Bruijn sequence would be, and
 you can easily recover the original sequence from
 its encoding without needing any kind of lookup
 table.
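
The base-k mapping described above amounts to a few lines of arithmetic. A small sketch, assuming symbols are the integers 0..k-1 (function names are illustrative):

```python
def encode_base_k(symbols, k):
    """Read a sequence of symbols (each 0..k-1) as one base-k integer."""
    value = 0
    for s in symbols:
        if not 0 <= s < k:
            raise ValueError("symbol out of range for alphabet size k")
        value = value * k + s
    return value


def decode_base_k(value, k, n):
    """Recover the length-n symbol sequence from its base-k integer."""
    symbols = [0] * n
    for i in range(n - 1, -1, -1):
        # Peel off the least significant base-k digit each time.
        value, symbols[i] = divmod(value, k)
    return symbols
```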


True, but "all you want is a mapping" is not quite right. I actually plan
to plot frequency (the number of times an observed subsequence overlaps a
value in the De Bruijn sequence). The way the subsequences overlap is
important to me, and I don't see a way to go from base-k (or any other base) to
the index location in the De Bruijn sequence, i.e. a decoding algorithm.


Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: generate De Bruijn sequence memory and string vs lists

2014-01-24 Thread Vincent Davis
On Fri, Jan 24, 2014 at 2:23 AM, Peter Otten __pete...@web.de wrote:

 Then, how do you think Python /knows/ that it has to repeat the code 10
 times on my slow and 100 times on your fast machine? It runs the bench
 once, then 10, then 100, then 1000 times -- until there's a run that takes
 0.2 secs or more. The total expected minimum time without startup overhead
 is then


​Ah, I did not know about the calibration. That and I did not notice the
100 on my machine vs 10 on yours.​


Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Vincent Davis
For reference, Wikipedia entry for De Bruijn sequence
http://en.wikipedia.org/wiki/De_Bruijn_sequence

At the above link is a Python algorithm for generating De Bruijn sequences.
It works fine but outputs a list of integers [0, 0, 0, 1, 0, 1, 1, 1] and I
would prefer a string '00010111'. This can be accomplished by changing the
last line from
    return sequence
to
    return ''.join([str(i) for i in sequence])
See de_bruijn_1 below.

The other option would be to manipulate strings directly (kind of).
I butchered the original algorithm to do this. See de_bruijn_2 below. But
it is much slower and ugly.

I want to make a few large De Bruijn sequences, hopefully on the
order of de_bruijn(4, 50) to de_bruijn(4, 100) (wishful thinking?). I don't
know the limits (memory or time) for the current algorithms. I think I
will hit the memory maxsize limit at about 4^31. The system I will be using
has 64GB RAM.
The size of a De Bruijn sequence is k^n.

My questions:
1, de_bruijn_2 is ugly, any suggestions to do it better?
2, de_bruijn_2 is significantly slower than de_bruijn_1. Speedups?
3, Any thoughts on which is more memory efficient during computation?

 1 
def de_bruijn_1(k, n):
    """
    De Bruijn sequence for alphabet size k (0,1,2...k-1)
    and subsequences of length n.
    From wikipedia Sep 22 2013
    """
    a = [0] * k * n
    sequence = []
    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    sequence.append(a[j])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(int(a[t - p]) + 1, k):
                a[t] = j
                db(t + 1, t)
    db(1, 1)
    #return sequence  #original
    return ''.join([str(i) for i in sequence])

d1 = de_bruijn_1(4, 8)

 2 
def de_bruijn_2(k, n):
    global sequence
    global a  # needed so that db()'s "global a" refers to this string
    a = '0' * k * n
    sequence = ''
    def db(t, p):
        global sequence
        global a
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    sequence = sequence + a[j]
        else:
            a = a[:t] + a[t - p] + a[t+1:]
            db(t + 1, p)
            for j in range(int(a[t - p]) + 1, k):
                a = a[:t] + str(j) + a[t+1:]
                db(t + 1, t)
        return sequence
    db(1, 1)
    return sequence

d2 = de_bruijn_2(4, 8)


Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Vincent Davis
On Thu, Jan 23, 2014 at 10:18 AM, Dave Angel da...@davea.name wrote:

 If memory size is your issue,  why not make the function a
  generator,  by replacing the append with a yield?


​One more thought on the generator. I have an idea for how to use the
generator, but I still need two things: chunks of size n from de_bruijn(k, n),
and the same ordering as found in de_bruijn(k, n).
I am not really sure how to modify the algorithm to do that. Any ideas? I
won't have time to think hard about that until later.
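
One possible shape for this, sketched under the assumption that Python 3.3+ (`yield from`) is available: turn the recursive algorithm into a generator and slide a window of size n over its output. The names `de_bruijn_iter` and `windows` are made up, and the windows here do not wrap around the end of the sequence.

```python
from collections import deque


def de_bruijn_iter(k, n):
    """Yield the symbols of the De Bruijn sequence one at a time."""
    a = [0] * (n + 1)  # only indices 0..n are ever used

    def db(t, p):
        if t > n:
            if n % p == 0:
                yield from a[1:p + 1]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def windows(symbols, n):
    """Yield each run of n consecutive symbols, in sequence order."""
    window = deque(maxlen=n)
    for s in symbols:
        window.append(s)
        if len(window) == n:
            yield tuple(window)
```

Only the current window and the small work array are held in memory, so the full sequence is never stored.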


Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Vincent Davis
On 1/23/14, 10:18 AM, Dave Angel wrote:
 (something about your message seems to make it unquotable)

Not sure why the message was not quotable. I sent it using gmail.

On 1/23/14, 10:18 AM, Dave Angel wrote:
 64gig is 4^18, so you can forget about holding a string of size 4^50

I guess I will have to buy more memory or be happy with less, 4**17 would
be ok.

On 1/23/14, 10:18 AM, Dave Angel wrote:
 If memory size is your issue,  why not make the function a
   generator,  by replacing the append with a yield?

I plan to use the sequence as an index to count occurrences of sequences of
length n. A generator is equivalent to using itertools.permutations (I
think that's the right itertool). My thought is that I don't have to store
each individual (sub)sequence since the De Bruijn sequence contains all of
them, i.e. it is a compact representation of every sequence generated by
itertools.permutations.


Vincent Davis



On Thu, Jan 23, 2014 at 10:18 AM, Dave Angel da...@davea.name wrote:

  Vincent Davis vinc...@vincentdavis.net Wrote in message:
 
 (something about your message seems to make it unquotable)

 64gig is 4^18, so you can forget about holding a string of size 4^50

 If memory size is your issue,  why not make the function a
  generator,  by replacing the append with a yield?


 --
 DaveA

 --
 https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Vincent Davis
On Thu, Jan 23, 2014 at 12:02 PM, Peter Otten __pete...@web.de wrote:

 I just noted that the first Python loop can be eliminated:

 def debruijn(k, n):
     a = k * n * bytearray([0])
     sequence = bytearray()
     extend = sequence.extend # factor out method lookup
     def db(t, p):
         if t > n:
             if n % p == 0:
                 extend(a[1: p+1])
         else:
             a[t] = a[t - p]
             db(t + 1, p)
             for j in xrange(a[t - p] + 1, k):
                 a[t] = j
                 db(t + 1, t)
     db(1, 1)
     return sequence.translate(_mapping)


I am not really sure what _mapping should be. The code above does not run
because of
NameError: global name '_mapping' is not defined
I tried to get the bytearray sequence to convert to ASCII but don't know how
to.
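
A guess at what such a table could look like, not necessarily what was intended: a 256-entry translation table that maps the byte values 0..k-1 to the ASCII digits '0'..'9'. The sketch uses `bytes.maketrans` (Python 3) and only handles k up to 10; the helper names are made up.

```python
import string


def digit_table(k):
    """Translation table sending byte value i (0 <= i < k) to ASCII digit i."""
    if not 1 <= k <= 10:
        raise ValueError("this sketch only covers alphabets of up to 10 symbols")
    return bytes.maketrans(bytes(range(k)), string.digits[:k].encode("ascii"))


def to_text(raw, k):
    """Convert a bytes/bytearray of small integers to a digit string."""
    return bytes(raw).translate(digit_table(k)).decode("ascii")
```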


Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Vincent Davis
On Thu, Jan 23, 2014 at 2:36 PM, Mark Lawrence breamore...@yahoo.co.ukwrote:

 FTR string.maketrans is gone from Python 3.2+.  Quoting from
 http://docs.python.org/dev/whatsnew/3.2.html#porting-to-python-3-2 The
 previously deprecated string.maketrans() function has been removed in favor
 of the static methods bytes.maketrans() and bytearray.maketrans(). This
 change solves the confusion around which types were supported by the string
 module. Now, str, bytes, and bytearray each have their own maketrans and
 translate methods with intermediate translation tables of the appropriate
 type.


​Thanks for pointing this out Mark, ​I will soon be running this on 3.3+


Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: generate De Bruijn sequence memory and string vs lists

2014-01-23 Thread Vincent Davis
On Thu, Jan 23, 2014 at 3:15 PM, Peter Otten __pete...@web.de wrote:

 $ python -m timeit -s 'from debruijn_compat import debruijn as d' 'd(4, 8)'
 10 loops, best of 3: 53.5 msec per loop
 $ python -m timeit -s 'from debruijn_compat import debruijn_bytes as d'
 'd(4, 8)'
 10 loops, best of 3: 22.2 msec per loop
 $ python3 -m timeit -s 'from debruijn_compat import debruijn as d' 'd(4,
 8)'
 10 loops, best of 3: 68 msec per loop
 $ python3 -m timeit -s 'from debruijn_compat import debruijn_bytes as d'
 'd(4, 8)'
 10 loops, best of 3: 21.7 msec per loop


Excellent Peter!
I have a question, the times reported don't make sense to me, for example
$ python3 -m timeit -s 'from debruijn_compat import debruijn_bytes as d'
'd(4, 8)'
100 loops, best of 3: 10.2 msec per loop
This took ~4 secs (stopwatch), which is much more than 10*.0102. Why is this?

$ python3 -m timeit -s 'from debruijn_compat import debruijn_bytes as d'
'd(4, 11)'
10 loops, best of 3: 480 msec per loop​
This took ~20 secs vs .480*10

d(4, 14) takes about 24 seconds (one run)
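
The gap between the per-loop figure and the stopwatch can be approximated with a little arithmetic. A rough sketch, assuming the command-line timeit of that era calibrates by timing batches of 10, 100, 1000, ... loops until one batch takes at least 0.2 s, and then repeats the chosen batch 3 times ("best of 3"). The function name and the starting batch size are assumptions, newer Python versions use a different schedule, and interpreter startup and the setup import are ignored.

```python
def estimate_timeit_wall_time(per_loop_s, repeat=3, threshold_s=0.2, start=10):
    """Return (loops per batch, estimated total seconds) for one timeit run."""
    number = start
    total = 0.0
    while True:
        elapsed = number * per_loop_s   # one calibration batch
        total += elapsed
        if elapsed >= threshold_s:
            break
        number *= 10
    total += repeat * number * per_loop_s  # the timed repeats
    return number, total
```

Under these assumptions, 10.2 msec per loop gives 100 loops and about 4.2 s in total, and 480 msec per loop gives 10 loops and about 19.2 s, which is close to the stopwatch readings.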

Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flip a graph

2014-01-04 Thread Vincent Davis
You might think about using an array to represent the canvas. Start with
it filled with " " and then for each point change it to "X".
Then print the rows of the array.

You can make the array/canvas arbitrarily large and then plot multiple
different paths onto the same array.
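
A minimal sketch of that canvas idea, assuming integer (time, height) points; the function names plot and render are invented for this example:

```python
def plot(points, width, height, canvas=None):
    """Mark each (x, y) point with 'X' on a grid of spaces; row 0 is the top row."""
    if canvas is None:
        canvas = [[" "] * width for _ in range(height)]
    for x, y in points:
        canvas[height - 1 - y][x] = "X"  # flip y so larger heights print higher up
    return canvas


def render(canvas):
    """Join the rows with newlines so the whole canvas prints in one call."""
    return "\n".join("".join(row) for row in canvas)


measurement_dict = {0: 10, 1: 9, 2: 9, 3: 8, 4: 8, 5: 7, 6: 6,
                    7: 4, 8: 5, 9: 3, 10: 2, 11: 1, 12: 0}
canvas = plot(measurement_dict.items(), width=13, height=11)
print(render(canvas))
```

Passing an existing canvas back in as the canvas argument draws a second path on top of the first.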


Vincent Davis
720-301-3003


On Sat, Jan 4, 2014 at 9:15 AM, Jason Friedman jsf80...@gmail.com wrote:

 I am teaching Python to a class of sixth-graders as part of an after-school
 enrichment.  These are average students.  We wrote a non-GUI rocket
 lander program:  you have a rocket some distance above the ground, a
 limited amount of fuel and a limited burn rate, and the goal is to have the
 rocket touch the ground below some threshold velocity.

 I thought it would be neat, after a game completes, to print a graph
 showing the descent.

 Given these measurements:
 measurement_dict = { # time, height
 0: 10,
 1: 9,
 2: 9,
 3: 8,
 4: 8,
 5: 7,
 6: 6,
 7: 4,
 8: 5,
 9: 3,
 10: 2,
 11: 1,
 12: 0,
 }

 The easiest solution is to have the Y axis be time and the X axis distance
 from the ground, and the code would be:

 for t, y in measurement_dict.items():
     print("X" * y)

 That output is not especially intuitive, though.  A better visual would be
 an X axis of time and Y axis of distance:

 max_height = max(measurement_dict.values())
 max_time = max(measurement_dict.keys())
 for height in range(max_height, 0, -1):
     row = list(" " * max_time)
     for t, y in measurement_dict.items():
         if y >= height:
             row[t] = 'X'
     print("".join(row))

 My concern is whether the average 11-year-old will be able to follow such
 logic.  Is there a better approach?

 --
 https://mail.python.org/mailman/listinfo/python-list


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flip a graph

2014-01-04 Thread Vincent Davis
When printing the rows of the array/canvas you might add \n to the end of
each row and print the canvas all at once rather than a print statement for
each row.

Vincent Davis
720-301-3003



-- 
https://mail.python.org/mailman/listinfo/python-list


lookup xpath (other?) to value in html

2013-12-31 Thread Vincent Davis
I have about 255 data fields that I am trying to verify on thousands of
webpages.
For example:
value: 255,000
sqft: 1800

Since I have the correct answer for several pages, I would like to look up
the location (xpath?) of each data/field value in the page so that I can
check other pages.

Any suggestions?

Vincent Davis
720-301-3003
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: lookup xpath (other?) to value in html

2013-12-31 Thread Vincent Davis

 I'm not sure what you are looking for.  Do you have a sample web page,
 and can you show us the output you'd like to see from that webpage?
 Have you looked at http://www.crummy.com/software/BeautifulSoup/?


For example, this URL:
http://jeffco.us/ats/displaygeneral.do?sch=001690
The land sqft is 11082.
Google Chrome gives me the xpath to that data as:
//*[@id="content"]/p[1]/table[4]/tbody/tr[2]/td[8]

What I would like to do (using Python) is: given 11082, at what xpath can
that be found? (There may be more than one.)
The examples I can find using Google go the other way: given an xpath, what
is the value (the opposite of what I want).
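
One way the reverse lookup could be sketched with only the standard library: walk the HTML, keep a stack of open tags with their position among same-named siblings, and record the stack whenever a text node equals the target. The class and function names are invented for this sketch, it assumes reasonably well-nested markup, and it reports paths as they appear in the raw source (browsers insert elements such as tbody, so Chrome's path can differ). lxml's getpath() on a parsed tree is likely the sturdier choice for real pages.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class PathFinder(HTMLParser):
    """Collect an XPath-like location for every text node equal to target."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.stack = []     # (tag, 1-based index among same-tag siblings)
        self.counts = [{}]  # one counter dict per open nesting level
        self.matches = []

    def handle_starttag(self, tag, attrs):
        seen = self.counts[-1]
        seen[tag] = seen.get(tag, 0) + 1
        if tag in VOID_TAGS:
            return
        self.stack.append((tag, seen[tag]))
        self.counts.append({})

    def handle_endtag(self, tag):
        # Pop only when the closing tag matches the innermost open element.
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()
            self.counts.pop()

    def handle_data(self, data):
        if data.strip() == self.target:
            self.matches.append(
                "/" + "/".join(f"{tag}[{i}]" for tag, i in self.stack))


def find_paths(html, target):
    finder = PathFinder(target)
    finder.feed(html)
    return finder.matches
```

Running find_paths() over a page whose value is already known gives candidate paths that can then be tried on the other pages.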

Vincent Davis


On Tue, Dec 31, 2013 at 6:45 PM, Jason Friedman jsf80...@gmail.com wrote:

  I have a about 255 data fields that I am trying to verify on thousands of
  webpages.
  For example:
  value: 255,000
  sqft: 1800
  
  Since I have the correct answer for several pages I would like to lookup
 get
  the location (xpath?) of the data/field value in the page so that I can
  check other pages.

 I'm not sure what you are looking for.  Do you have a sample web page,
 and can you show us the output you'd like to see from that webpage?
 Have you looked at http://www.crummy.com/software/BeautifulSoup/?

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: lookup xpath (other?) to value in html

2013-12-31 Thread Vincent Davis

 Which Chrome extension are you using to get that path?

Built in: right click on the source > Copy XPath.

Ya, that gets the square footage and I like how you did it. Are you interested
in doing that for all the information on the page and also the historical pages?
;-)
Since I have the data for some of the pages (I got this from the county on
a CD), I thought defining the xpath would be easier using bs4 or
http://lxml.de/




Vincent Davis
720-301-3003


On Tue, Dec 31, 2013 at 10:30 PM, Jason Friedman jsf80...@gmail.com wrote:

  For example this URL;
  http://jeffco.us/ats/displaygeneral.do?sch=001690
  The the land sqft is 11082.
  Google Chrome gives me the xpath to that data as;
  //*[@id=content]/p[1]/table[4]/tbody/tr[2]/td[8]
 
  What I would like to do (using python) is given 11082 at what xpath can
 that
  be found? (may be more that one)
  The examples I can find using google refer to, given xpath what is the
 value
  (the opposite of what I want)

 Which Chrome extension are you using to get that path?

 Are you always interested in the square footage?  Here is a solution
 using Beautiful Soup:

 $ cat square-feet.py
 #!/usr/bin/env python
 import bs4
 import requests
 import sys
 url = sys.argv[1]
 request = requests.get(url)
 soup = bs4.BeautifulSoup(request.text)
 is_sqft_mark_found, is_total_mark_found = False, False
 for line in soup.get_text().splitlines():
     if line.startswith("Land Sqft"):
         is_sqft_mark_found = True
         continue
     elif is_sqft_mark_found and line.startswith("Total"):
         is_total_mark_found = True
         continue
     elif is_total_mark_found:
         print(line.strip() + " total square feet.")
         break

 $ python3 square-feet.py http://jeffco.us/ats/displaygeneral.do?sch=001690
 11082 total square feet.
 --
 https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Getting updates and restarting a long running url request.

2013-12-26 Thread Vincent Davis
On Wed, Dec 25, 2013 at 11:24 PM, Jason Friedman jsf80...@gmail.com wrote:

 Could you keep track of success?

 result_dict = dict()
 for id in taxid_list:
     result_dict[id] = False
 while not all(result_dict.values()):  # continue if not every ID was successful
     for id in taxid_list:
         if result_dict[id]:
             continue  # We were already successful with this ID
         try:
             this_result = get_BLAST(id)
             result_dict[id] = True
         except:
             print("A warning.")


Thanks for your response.
Would this not keep submitting additional (duplicate) BLAST queries?

  try:
      this_result = get_BLAST(id)
      result_dict[id] = True


It seems like I need to use threading, and there appears to be no way to
know whether NCBIWWW.qblast is still waiting on results. It will either give a
result or possibly produce an error, I suppose, if for example I lost
the connection to the internet, but I am not really sure about that.

That said, after some more research I found this thread:
http://lists.open-bio.org/pipermail/biopython/2013-April/008507.html
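
A rough sketch of the threading idea, not taken from the linked thread: run each request in a worker thread, wait up to a deadline, and retry on an error or a timeout. The name call_with_retries and the default numbers are assumptions for illustration. One caveat applies: a timed-out worker cannot be killed from Python, it is only abandoned, so the stalled request may still finish in the background.

```python
import concurrent.futures
import time


def call_with_retries(func, *args, timeout=600, attempts=3, pause=5):
    """Run func(*args) in a worker thread, retrying after an error or a timeout."""
    for attempt in range(1, attempts + 1):
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(func, *args)
        try:
            return future.result(timeout=timeout)
        except concurrent.futures.TimeoutError:
            print(f"attempt {attempt}: no answer after {timeout} s")
        except Exception as exc:
            print(f"attempt {attempt}: failed with {exc!r}")
        finally:
            pool.shutdown(wait=False)  # do not block on a stalled worker
        time.sleep(pause)
    raise RuntimeError(f"gave up after {attempts} attempts")
```

With the get_BLAST function from the original post this would be called as call_with_retries(get_BLAST, taxid, queryseq), so a query is resubmitted only after the previous attempt has failed or gone silent.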



Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Getting updates and restarting a long running url request.

2013-12-22 Thread Vincent Davis
I am using biopython's NCBIWWW.qblast, which sends a request to the NCBI
website and waits for a result. The relevant code can be found at the link
below, starting at about line 151. Basically it is a while loop waiting for the
blast query.
http://biopython.org/DIST/docs/api/Bio.Blast.NCBIWWW-pysrc.html

My problem: I am submitting about 75 requests (one at a time and with
delays) and they can each take minutes to complete. I think sometimes the
request/response/query fails, which results in me needing to restart
the process.

I am looking for suggestions on how to monitor and restart the process if I
think it has failed.

I am using the following code to submit the query:
from Bio.Blast import NCBIWWW, NCBIXML

def get_BLAST(taxid, queryseq, args=None):
    '''
    Input taxid to BLAST queryseq against
    '''
    # Restrict the search to one organism via an Entrez query.
    e_query = "txid" + taxid + " [ORGN]"
    #, other_advanced='-G 4 -E 1'
    blast_result = NCBIWWW.qblast("blastn", "nt", queryseq, megablast=True,
        entrez_query=e_query, word_size='11', other_advanced='-G 5 -E 2')
    return NCBIXML.read(blast_result)


Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Using pythons smtp server

2013-12-13 Thread Vincent Davis

 You don't send mail using an SMTP server.  You receive mail using an
 SMTP server.
 ​​


Um, maybe. I guess it is a matter of perspective.

Let me rephrase my question. I want to send an email using Python but do
not want to use an external service. Does Python have the ability to send
emails without installing additional software or using an external
server/service?
Maybe I am wrong, but I thought examples like s = smtplib.SMTP('localhost')
are using a local (outside of Python) SMTP server, like postfix.




Vincent Davis
720-301-3003


On Fri, Dec 13, 2013 at 9:40 AM, Grant Edwards invalid@invalid.invalidwrote:

 On 2013-12-13, Vincent Davis vinc...@vincentdavis.net wrote:

  I have an app that generates a file one a day and would like to email
  it using pythons SMTP server.

 You don't send mail using an SMTP server.  You receive mail using an
 SMTP server.  You send mail using an SMTP client.

  http://docs.python.org/2/library/smtpd.html#smtpd.SMTPServer
  The documentation is kinda sparse and I cant seem to find any good
 examples.
 
  Basically what I want to do; when my app runs it would initiate a SMTP
  server, send the attachment and shutdown the SMTP after.

 https://www.google.com/search?q=python+send+email+smtp

 --
 Grant Edwards   grant.b.edwardsYow! The PINK SOCKS were
   at   ORIGINALLY from 1952!!
   gmail.comBut they went to MARS
around 1953!!
 --
 https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Using pythons smtp server

2013-12-13 Thread Vincent Davis
Obviously I don't really know how this works. I have used Python to send
email using my SMTP server (whatever that may be: gmail, postfix...),
but I don't want to do that. After a little more research I think what I
need to do is look up the MX record for the domain of the address I want to
send the email to, then submit the email to that server using smtplib.SMTP.
Do I have that right?


Vincent Davis
720-301-3003


On Fri, Dec 13, 2013 at 10:24 AM, Dennis Lee Bieber
wlfr...@ix.netcom.comwrote:

 On Thu, 12 Dec 2013 18:01:58 -0700, Vincent Davis
 vinc...@vincentdavis.net declaimed the following:

 I have an app that generates a file one a day and would like to email it
 using pythons SMTP server.
 http://docs.python.org/2/library/smtpd.html#smtpd.SMTPServer
 The documentation is kinda sparse and I cant seem to find any good
 examples.
 
 Basically what I want to do; when my app runs it would initiate a SMTP
 server, send the attachment and shutdown the SMTP after.
 

 I suspect you don't want the server per se -- that's more a unit
 for
 receiving SMTP mail (sure, you can start it, but then you have to send the
 email to IT so it can relay it to the next server in the line).

 Look into the smtplib module (section 20.12 in the v2.7.2
 documentation) in order to send email TO a mail server
 --
 Wulfraed Dennis Lee Bieber AF6VN
 wlfr...@ix.netcom.comHTTP://wlfraed.home.netcom.com/

 --
 https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Using pythons smtp server

2013-12-13 Thread Vincent Davis
Grant, Chris
Thanks !!!
I guess in the end this is a bad idea, (for my purposes) I should just use
my gmail account smtp server.
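
For the record, a minimal sketch of that route with the standard library: build the message with the attachment, then hand it to an existing SMTP server as a client. The function names, subject line and body text are placeholders invented here; the host, port and credentials in the usage note are assumptions to check against the provider's own settings.

```python
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_report_mail(sender, recipient, attachment_path):
    """Create a message carrying one file as a generic binary attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Daily report"
    msg.set_content("The daily file is attached.")
    path = Path(attachment_path)
    msg.add_attachment(path.read_bytes(), maintype="application",
                       subtype="octet-stream", filename=path.name)
    return msg


def send_mail(msg, host, port, user, password):
    """Act as an SMTP client: connect, upgrade to TLS, log in, send."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(user, password)
        server.send_message(msg)
```

A call such as send_mail(msg, "smtp.gmail.com", 587, user, password) would cover the Gmail case, assuming the account permits SMTP logins.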

Vincent Davis
720-301-3003


On Fri, Dec 13, 2013 at 11:15 AM, Chris Angelico ros...@gmail.com wrote:

 On Sat, Dec 14, 2013 at 4:13 AM, Vincent Davis vinc...@vincentdavis.net
 wrote:
  Let me rephrase my question. I want to send an email using python but do
 not
  want to use an external service. Does python have the ability to send
 emails
  without installing additional software or using an external
 server/service?

 Any SMTP server you install has to do one of three things with the
 mail you give it:

 1) Accept it locally. Presumably the wrong thing to do here.
 2) Deliver it to the authoritative SMTP server for the domain.
 3) Deliver it to an intermediate server.

 (Edit: Your next mail shows that you understand that, as looking up
 the MX record is what I was going to say here.)

 So if you want to avoid using an external intermediate server, you
 need to find and talk to the authoritative server. Now, this is where
 another big consideration comes in. What envelope From address are you
 going to use? Is your own IP address allowed to send mail for that
 domain? If not, you may be forced to use the legitimate server for
 that domain. There are other concerns, too; if you don't have a nice
 name to announce in the HELO, you might find your mail treated as
 spam. But if you deal with all that, then yes, the only thing you need
 to do is look up the MX record and pick the best server. (And then
 deal with other concerns like coping with that one being down, which
 is the advantage of having a local mail queue. But sometimes that
 doesn't matter, like if you're sending to yourself for notifications.)

 ChrisA
 --
 https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


Using pythons smtp server

2013-12-12 Thread Vincent Davis
I have an app that generates a file once a day and would like to email it
using Python's SMTP server.
http://docs.python.org/2/library/smtpd.html#smtpd.SMTPServer
The documentation is kinda sparse and I can't seem to find any good examples.

Basically, what I want to do: when my app runs it would start an SMTP
server, send the attachment, and shut the server down afterwards.

Vincent Davis
-- 
https://mail.python.org/mailman/listinfo/python-list


? get negative from prod(x) when x is positive integers

2013-06-28 Thread Vincent Davis
I have a list of lists of integers. The lists are long so I can't really
show an actual example of one of the lists, but I know that they contain
only the integers 1, 2, 3, 4. So, for example:
s2 = [[1,2,2,3,2,1,4,4],[2,4,3,2,3,1]]

I am calculating the product, sum, max, min of each list in s2 but I
get negative or 0 for the product for a lot of the lists. (I am doing this
in ipython.)

for x in s2:
    print('len = ', len(x), 'sum = ', sum(x), 'prod = ', prod(x), 'max = ',
          max(x), 'min = ', min(x))

...

('len = ', 100, 'sum = ', 247, 'prod = ', 0, 'max = ', 4, 'min = ', 1)
('len = ', 100, 'sum = ', 230, 'prod = ', -4611686018427387904, 'max =
', 4, 'min = ', 1)
('len = ', 100, 'sum = ', 261, 'prod = ', 0, 'max = ', 4, 'min = ', 1)

.

('prod =', 0, 'max =', 4, 'min =', 1)
('prod =', 1729382256910270464, 'max =', 4, 'min =', 1)
('prod =', 0, 'max =', 4, 'min =', 1)




What's going on?



Vincent Davis
720-301-3003
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ? get negative from prod(x) when x is positive integers

2013-06-28 Thread Vincent Davis
@Joshua
"You are using numpy.prod()"
Wow. Since sum([1,2,3,4]) worked, I tried prod([1,2,3,4]) and got the right
answer, so I just used that. Confusing that it would use numpy.prod(); I
realize now there is no built-in Python prod(). At no point do I import numpy in
my code. This seems to be a result of using ipython, or at least how I am
using it: "ipython notebook --pylab inline".
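
The zeros and negative values can be reproduced without numpy. The sketch below computes the exact product with Python integers and then keeps only what a signed 64-bit register would hold; wrap_int64 is a helper invented here to imitate fixed-width arithmetic, not a numpy function. Once the factors contribute 64 or more powers of two (every 2 adds one, every 4 adds two), the low 64 bits are all zero, which matches the 0 results for 100-element lists of 1..4.

```python
from functools import reduce
from operator import mul


def prod(values):
    """Exact product using Python's arbitrary-precision integers."""
    return reduce(mul, values, 1)


def wrap_int64(n):
    """Reduce an exact integer to the value a signed 64-bit register would keep."""
    n &= (1 << 64) - 1
    return n - (1 << 64) if n >= (1 << 63) else n


for factors in ([2] * 62, [2] * 63, [2] * 64):
    exact = prod(factors)
    print(len(factors), exact, wrap_int64(exact))
```

On Python 3.8 and later, math.prod gives the same exact result as the prod defined above.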

Thanks

Vincent Davis
720-301-3003


On Fri, Jun 28, 2013 at 4:04 PM, Joshua Landau
joshua.landau...@gmail.comwrote:

 On 28 June 2013 15:38, Vincent Davis vinc...@vincentdavis.net wrote:
  I have a list of a list of integers. The lists are long so i cant really
  show an actual example of on of the lists, but I know that they contain
 only
  the integers 1,2,3,4. so for example.
  s2 = [[1,2,2,3,2,1,4,4],[2,4,3,2,3,1]]
 
  I am calculating the product, sum, max, min of each list in s2 but I
 get
  negative or 0 for the product for a lot of the lists. (I am doing this in
  ipython)
 
  for x in s2:
  print('len = ', len(x), 'sum = ', sum(x), 'prod = ', prod(x), 'max =
 ',
  max(x), 'min = ', min(x))
 
  ...
 
  ('len = ', 100, 'sum = ', 247, 'prod = ', 0, 'max = ', 4, 'min = ', 1)
  ('len = ', 100, 'sum = ', 230, 'prod = ', -4611686018427387904, 'max =
 ', 4,
  'min = ', 1)
  ('len = ', 100, 'sum = ', 261, 'prod = ', 0, 'max = ', 4, 'min = ', 1)
 
  .
 
  ('prod =', 0, 'max =', 4, 'min =', 1)
  ('prod =', 1729382256910270464, 'max =', 4, 'min =', 1)
  ('prod =', 0, 'max =', 4, 'min =', 1)
 
  
 
 
  Whats going on?

 Let me guess.
 These are your lists (sorted):

 [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
 4, 4, 4, 4, 4, 4, 4, 4]

 [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
 4, 4, 4, 4, 4, 4, 4, 4]

 [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3,
 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4,
 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
 4, 4, 4, 4, 4, 4, 4, 4]

 You are using numpy.prod()

 Numpy.prod overflows:

 >>> numpy.prod([-9223372036854775808, 2])
 ... 0

 You want to use something that doesn't, such as:

 def prod(iter):
     p = 1
     for elem in iter:
         p *= elem
     return p

 and then you get your correct products:

 8002414661101704746694488837062656
 3907429033741066770846918377472
 682872717747345471717929714096013312

-- 
http://mail.python.org/mailman/listinfo/python-list


get each pair from a string.

2012-10-21 Thread Vincent Davis
I am looking for a good way to get every pair from a string. For example,
input:
x = 'apple'
output:
'ap'
'pp'
'pl'
'le'

I am not seeing an obvious way to do this without multiple for loops, but
maybe there is not :-)
In the end I am going to want to get triples, quads... also.

Thanks
Vincent
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: get each pair from a string.

2012-10-21 Thread Vincent Davis
@Emile,
I feel a little stupid; in my mind it was more difficult than in reality.

x = 'apple'
for f in range(len(x)-1):
    print(x[f:f+2])

@Ian,
Thanks for that, I was just looking into that. I wonder which is faster; I
have a large set of strings to process. I'll try some timings if I get a
chance later today.
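
Both approaches generalise from pairs to triples and quads with one extra parameter. A small sketch: ngrams is a name made up here for the slicing version, and nwise follows the tee-based recipe quoted elsewhere in this thread, with zip standing in for Python 2's izip.

```python
from itertools import tee


def ngrams(text, n=2):
    """Every run of n consecutive characters, by slicing (sequences only)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def nwise(iterable, n=2):
    """The same windows as tuples, for any iterable, not just sliceable ones."""
    iters = tee(iterable, n)
    for i, it in enumerate(iters):
        for _ in range(i):
            next(it, None)  # advance the i-th copy by i items
    return zip(*iters)


print(ngrams("apple"))
print(ngrams("apple", 3))
print(["".join(window) for window in nwise("apple", 3)])
```

For timings on a large set of strings, timeit can run both on real input; slicing avoids building a tuple per window, which is why it is expected to be quicker on plain strings.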


Thanks again!
Vincent




On Mon, Oct 22, 2012 at 12:45 AM, Emile van Sebille em...@fenx.com wrote:

 On 10/21/2012 11:33 AM, Vincent Davis wrote:

 I am looking for a good way to get every pair from a string. For example,
 input:
 x = 'apple'
 output
 'ap'
 'pp'
 'pl'
 'le'

 I am not seeing a obvious way to do this without multiple for loops, but
 maybe there is not :-)
 In the end I am going to what to get triples, quads... also.


 How far have you gotten?  Show us the loops you're trying now and any
 errors you're getting.

 Emile



 --
 http://mail.python.org/mailman/listinfo/python-list

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: get each pair from a string.

2012-10-21 Thread Vincent Davis
@vbr
Thats interesting. I would never have come up with that.

Vincent



On Sun, Oct 21, 2012 at 3:48 PM, Vlastimil Brom vlastimil.b...@gmail.comwrote:

 vbr
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: get each pair from a string.

2012-10-21 Thread Vincent Davis
To All,
I appreciate the range of answers and the time each of you take to think
about and answer my question. Whether or not I use them I find them all
educational.
Thanks again.

Vincent



On Mon, Oct 22, 2012 at 2:03 AM, Emile van Sebille em...@fenx.com wrote:

 On 10/21/2012 12:06 PM, Ian Kelly wrote:

 On Sun, Oct 21, 2012 at 12:58 PM, Vincent Davis
 vinc...@vincentdavis.net wrote:

 x = 'apple'
 for f in range(len(x)-1):
  print(x[f:f+2])

 @Ian,
 Thanks for that I was just looking in to that. I wonder which is faster I
 have a large set of strings to process. I'll try some timings if I get a
 chance later today.


 The solution you came up with is probably faster, but less general --
 it will only work on sliceable sequences like strings, not arbitrary
 iterables.


 So the simple loop is the right answer for sliceable sequences like
 strings, but not if your code needs to deal with arbitrary iterables such
 as those that the standard library authors are expected to handle.

 So, as OP's a self confessed newbie asking about slicing, why provide an
 example requiring knowledge of tee, enumerate, next and izip?


 def nwise(iterable, n=2):
     iters = tee(iterable, n)
     for i, it in enumerate(iters):
         for _ in range(i):
             next(it, None)
     return izip(*iters)

 It's good that the standard library provides these tools as a convenience,
 but when all you need is a derringer, why reach for a howitzer?

 Emile


 --
 http://mail.python.org/mailman/listinfo/python-list

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Understanding and dealing with an exception

2012-10-14 Thread Vincent Davis
Yes afile is the file name and extension, ifile is the full file name and
path.

Thanks
Vincent

On Sunday, October 14, 2012, MRAB wrote:

 On 2012-10-14 05:23, Vincent Davis wrote:

 I am working on a script to find bad image files. I am using PIL
 and specifically image.verify() I have a set of known to be bad image
 files to test. I also what to be able to test any file for example a
 .txt and deal with the exception.
 Currently my code is basically

 try:
     im = Image.open(ifile)
     try:
         print(im.verify())
     except:
         print('Pil image.verify() failed: ' + afile)
 except IOError:
     print('PIL cannot identify image file: ' + afile)
 except:
     print(ifile)
     print("Unexpected error doing PIL.Image.open():", sys.exc_info()[0])
     raise

  [snip]
 I notice that you have both ifile and afile. Is that correct?

 --
 http://mail.python.org/mailman/listinfo/python-list



-- 
Vincent Davis
720-301-3003
-- 
http://mail.python.org/mailman/listinfo/python-list


Understanding and dealing with an exception

2012-10-13 Thread Vincent Davis
I am working on a script to find bad image files. I am using PIL,
specifically image.verify(). I have a set of known-bad image files
to test. I also want to be able to test any file, for example a .txt, and
deal with the exception.
Currently my code is basically:

try:
    im = Image.open(ifile)
    try:
        print(im.verify())
    except:
        print('Pil image.verify() failed: ' + afile)
except IOError:
    print('PIL cannot identify image file: ' + afile)
except:
    print(ifile)
    print("Unexpected error doing PIL.Image.open():", sys.exc_info()[0])
    raise

I have a lot of files that raise an IOError. I would expect this error for
any non-image file.
I have yet to see image.verify() fail; all failures have been IOError.

Then I got this error (below), which to me looks like a bug in PIL?
The file seems OK when I open it for editing in an external editor.

So my question: I don't want to re-raise this exception, thereby stopping the
script, nor record the image as bad or good. That could lead to
false positives or negatives.
Then again, I assume it would be possible to get this error because the file
is corrupt.
I am not really sure how to deal with this. Any advice?

fixed-width.psd
('Unexpected error doing PIL.Image.open():', <type 'exceptions.OverflowError'>)

OverflowError: Python int too large to convert to C long
File "/Volumes/Hafnium/Google Drive/bad images/untitled-2.py", line 21, in <module>
  im = Image.open(ifile)
File "/Library/Frameworks/Python.framework/Versions/7.0/lib/python2.7/site-packages/PIL/Image.py", line 1965, in open
  return factory(fp, filename)
File "/Library/Frameworks/Python.framework/Versions/7.0/lib/python2.7/site-packages/PIL/ImageFile.py", line 91, in __init__
  self._open()
File "/Library/Frameworks/Python.framework/Versions/7.0/lib/python2.7/site-packages/PIL/PsdImagePlugin.py", line 123, in _open
  self.layers = _layerinfo(self.fp)
File "/Library/Frameworks/Python.framework/Versions/7.0/lib/python2.7/site-packages/PIL/PsdImagePlugin.py", line 230, in _layerinfo
  t = _maketile(file, m, bbox, 1)
File "/Library/Frameworks/Python.framework/Versions/7.0/lib/python2.7/site-packages/PIL/PsdImagePlugin.py", line 266, in _maketile
  bytecount = read(channels * ysize * 2)

Vincent
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Understanding and dealing with an exception

2012-10-13 Thread Vincent Davis
Oops, I was going to make note of the file size. 1.2MB

Vincent



On Sat, Oct 13, 2012 at 10:31 PM, Chris Angelico ros...@gmail.com wrote:

 On Sun, Oct 14, 2012 at 3:23 PM, Vincent Davis vinc...@vincentdavis.net
 wrote:
  OverflowError: Python int too large to convert to C long
  line 266, in _maketile
bytecount = read(channels * ysize * 2)

 Is the file over 2GB? Might be a limitation, more than a bug, and one
 that could possibly be raised by using a 64-bit build.

 Alternatively, you could deem them invalid for exceeding your file
 size limit (either before passing to PIL, or on catching this
 exception).

 ChrisA
 --
 http://mail.python.org/mailman/listinfo/python-list

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Understanding and dealing with an exception

2012-10-13 Thread Vincent Davis
I can open it and all looks good using Pixelmator (I don't have
Photoshop installed). I don't think there is anything wrong with the image.

Part of my question is a result of being new to actually using exceptions
in my programs, and dealing with exceptions is a primary part of what I
need to do with this program. When I get an exception that seems to be an
issue with PIL (i.e. not my program or a problem with the image), I am not
sure what the right or conventional way to deal with it is.
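
One common pattern, sketched here as a suggestion rather than the thread's conclusion, is to sort outcomes into three buckets instead of two: the expected failure (IOError) means "not an image", any other exception means "suspect, needs a human look", and only a clean run counts as good. The function name classify and the bucket labels are invented; the opener is passed in so the logic stays independent of PIL.

```python
def classify(path, open_and_verify):
    """Return 'ok', 'not-an-image' or 'suspect' for one file."""
    try:
        open_and_verify(path)
    except IOError:
        return "not-an-image"  # the failure PIL raises for unreadable files
    except Exception:
        return "suspect"       # anything unexpected, e.g. the OverflowError here
    return "ok"


# With PIL installed, the opener could be written as (assumed usage):
#   from PIL import Image
#   status = classify(ifile, lambda p: Image.open(p).verify())
```

Keeping "suspect" as its own category avoids both the false positive and the false negative, and the script keeps running because nothing is re-raised.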

Vincent


On Sat, Oct 13, 2012 at 10:49 PM, Chris Angelico ros...@gmail.com wrote:

 On Sun, Oct 14, 2012 at 3:36 PM, Vincent Davis vinc...@vincentdavis.net
 wrote:
  Oops, I was going to make note of the file size. 1.2MB

 Then I'd definitely declare the file bad; I don't know what the valid
 ranges for channels and ysize are, but my reading of that is that your
 file's completely corrupt, maybe even malicious. PIL probably ought to
 check these things, so there may be a tracker issue coming from this,
 but I'd be inclined to declare any thrown exception as meaning it's a
 bad file. Call it failed a security check perhaps.

 ChrisA
 --
 http://mail.python.org/mailman/listinfo/python-list

-- 
http://mail.python.org/mailman/listinfo/python-list


Check email header for RFC 822 standard and match emails between imap servers.

2011-08-18 Thread Vincent Davis
The short story: I have been attempting to use the Google Migration assistant
to migrate emails from one Google account to another, about 80,000 emails.
I have two problems.
1. Many emails fail to transfer because of errors like: Invalid RFC 822
Message: Date header "Mon Feb 05 22:07:16 2007" is invalid. See the
full error below.
2. I need to delete the emails from the old account that have been
transferred.
And the two questions are:
1:
I am able to connect and get an email, but I am not clear how I would check
that the header is valid or identify the problem.
grl = imaplib.IMAP4_SSL('imap.gmail.com', 993)
grl.login('n...@domain.com', 'password')
grl.fetch(17006, 'uid')
# So now I have an email. How do I check the header? I know how to view it
# but not how to check it against RFC 822.
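
A hedged sketch of one way to test the Date header. The standard library's email.utils parser is deliberately lenient, so a stricter check is written here with strptime against the usual "Mon, 05 Feb 2007 22:07:16 -0700" shape. The helper names are made up, the check only accepts numeric zones such as -0700 (it would reject "GMT"), and it relies on the default C locale for day and month names. normalize_date shows how a lenient parse plus format_datetime could rewrite a bad header.

```python
from datetime import datetime
from email.utils import format_datetime, parsedate_to_datetime

# With and without the optional day name.
STRICT_FORMATS = ("%a, %d %b %Y %H:%M:%S %z", "%d %b %Y %H:%M:%S %z")


def is_strict_date(value):
    """True if value looks like 'Mon, 05 Feb 2007 22:07:16 -0700'."""
    for fmt in STRICT_FORMATS:
        try:
            datetime.strptime(value.strip(), fmt)
            return True
        except ValueError:
            pass
    return False


def normalize_date(value):
    """Parse a Date header leniently and re-emit it in the standard layout."""
    return format_datetime(parsedate_to_datetime(value))


header = "Mon Feb 05 22:07:16 2007"
print(is_strict_date(header), normalize_date(header))
```

The rejected header from the migration log is in C asctime() order (month before day, year last, no zone), which is why a strict validator refuses it.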

2:
I want to delete all the emails that have been transferred, but I don't know
which ones those are on the old account; it is only on the new account that
transferred emails get a label. How should I compare emails? The uid is
different on each server, so I think using the TO:, FROM: and DATE:/TIME:
would work.
How do I compare emails in this way?
How do I get the TO:, FROM:, DATE:, TIME: from one email to search for the same
email in the other account?
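
A sketch of a comparison key, under the assumption that most messages carry a Message-ID header, which stays the same when a message is copied between servers; From/To/Date serve as the fallback. The function name message_key is invented. The raw header bytes could be fetched with something like grl.fetch(num, '(BODY.PEEK[HEADER])'), which is an assumed usage to verify against the imaplib documentation.

```python
from email import message_from_bytes


def message_key(raw_header_bytes):
    """Build a server-independent key from a message's raw header block."""
    msg = message_from_bytes(raw_header_bytes)
    message_id = msg.get("Message-ID")
    if message_id:
        return ("id", message_id.strip())
    # Fallback when no Message-ID is present.
    return ("fields", msg.get("From", ""), msg.get("To", ""), msg.get("Date", ""))
```

Collecting the keys from the new account into a set then makes "has this old message been transferred?" a single membership test per message.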


Sample error from google migration app.

2011-08-16T16:47:47.141-06:00  808 E:Network
ExchangeMigration!WinHttp::ExecuteHttpRequestIStreamResponse @ 696 (
gmetan...@domain.com gmetan...@glsworld.com) Response:

Invalid RFC 822 Message: Date header "Mon Feb 05 22:07:16 2007" is
invalid.

2011-08-16T16:47:47.141-06:00  808 E:Migration
ExchangeMigration!EmailUploader::HandleStatus @ 462
(gmetan...@domain.comgmetan...@glsworld.com)
Permanent Message Failure, skipping the message!.

2011-08-16T16:47:47.328-06:00  808 E:Migration
ExchangeMigration!IMAPMessageWrapper::GetMessageSentTime @ 154 (
gmetan...@domain.com gmetan...@glsworld.com) Failed with 0x80004005, last
successful line = 151.

2011-08-16T16:47:47.328-06:00  808 E:Migration
ExchangeMigration!IMAPMessageWrapper::GetMessageReceivedTime @ 170 (
gmetan...@domain.com gmetan...@glsworld.com) Failed with 0x80004001, last
successful line = 168.

2011-08-16T16:47:47.328-06:00  808 E:Migration
ExchangeMigration!GetMessageDescription @ 198
(gmetan...@domain.comgmetan...@glsworld.com)
Sent: 2011-08-16T22:47:47.000Z. Received: 2011-08-16T22:47:47.000Z. Size:
196114. Subject RE: ppt templates.


-- 
Thanks
Vincent Davis
720-301-3003
-- 
http://mail.python.org/mailman/listinfo/python-list


set a breakpoint in malloc_error_break to debug?

2011-04-06 Thread Vincent Davis
Not sure what is going on here. The set wset is large, I am sure, but ... Is
this something I am doing wrong?

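# Imports assumed from the names used below.
import os
from os import stat
from os.path import isfile

def walked_dir(adir):
    wdirset = set()
    for dirpath, dirnames, filenames in os.walk(adir):
        for name in filenames:
            if isfile(dirpath+'/'+name):
                fullfilename = dirpath+'/'+name
                the_stats = stat(fullfilename)
                # One entry per file; each entry repeats the whole
                # directory listing.
                wdirset.add((dirpath, tuple(dirnames), tuple(filenames),
                             fullfilename, name, the_stats))
        if not len(filenames):
            # Record directories that contain no files.
            wdirset.add((dirpath, tuple(dirnames), tuple(filenames), None,
                         None, None))
    return wdirset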

directory = '/Users/vmd/Dropbox'
wset = walked_dir(directory)


>>> wset
Python(19914) malloc: *** mmap(size=1536290816) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Python(19914) malloc: *** mmap(size=1536290816) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Python(19914) malloc: *** mmap(size=1536290816) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Traceback (most recent call last):
  File "<string>", line 1, in <fragment>
MemoryError:


Python 2.7.1 |EPD 7.0-2 (32-bit)| (r271:86832, Dec  3 2010, 15:41:32)
[GCC 4.0.1 (Apple Inc. build 5488)]
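
One possible reading of the failure, offered as a guess rather than a diagnosis: every entry stores tuple(dirnames) and tuple(filenames) again, so a directory with n files holds roughly n copies of an n-item listing, and echoing wset at the prompt then has to build one very large repr string in a 32-bit process. A leaner sketch under that assumption keeps one small record per file and yields records instead of building the whole set. The name walk_files and the choice of size and modification time as the recorded fields are hypothetical.

```python
import os


def walk_files(top):
    """Yield (path, size, mtime) for each regular file under top."""
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                info = os.stat(path)
                # Store only the fields needed, not the directory listing.
                yield path, info.st_size, info.st_mtime
```

Iterating with "for path, size, mtime in walk_files(directory)" processes one file at a time; printing len() of a collected result, rather than the collection itself, avoids building a huge string at the prompt.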

-- 
Thanks
Vincent Davis
720-301-3003


Standard config file format

2011-04-05 Thread Vincent Davis
I am working on a program to monitor directory file changes and would
like a configuration file. This file would specify email addresses and file and
directory locations. Is there a preferred format to use with Python?
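
One option that ships with Python is the INI-style format read by the standard configparser module (named ConfigParser in Python 2). The sketch below is written for Python 3; the section names, option names, addresses and paths are all made up for illustration, and storing lists as comma-separated values is just one convention.

```python
import configparser

# Hypothetical file contents; in practice this would live in its own file.
SAMPLE = """\
[notify]
emails = admin@example.com, ops@example.com

[watch]
directories = /data/incoming, /data/reports
"""


def split_list(value):
    """Turn a comma-separated option value into a list of strings."""
    return [item.strip() for item in value.split(',') if item.strip()]


def load_settings(text):
    # interpolation=None keeps '%' characters in paths literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        'emails': split_list(parser.get('notify', 'emails')),
        'directories': split_list(parser.get('watch', 'directories')),
    }
```

To load from disk instead of a string, parser.read(path) can replace read_string.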
-- 
Thanks
Vincent Davis


Re: Download an attachment from an IMAP email

2011-02-04 Thread Vincent Davis
hgc


On Thu, Feb 3, 2011 at 6:52 PM, Kushal Kumaran
kushal.kumaran+pyt...@gmail.com wrote:

 On Fri, Feb 4, 2011 at 3:44 AM, Vincent Davis vinc...@vincentdavis.net
 wrote:
  I have a few emails I am trying to download from my google account. I seem
  to be getting the message but each of these messages has an attachment. I
  don't understand what I need to do to get and save the attachment to a local
  file.
  Here is what I have so far.
  M = imaplib.IMAP4_SSL(IMAP_SERVER, IMAP_PORT)
  rc, resp = M.login('x@', 'X')
  print rc, resp
  M.select('[Gmail]/All Mail')
  M.search(None, 'FROM', 'some...@logitech.com')
  #M.fetch(121, '(body[header.fields (subject)])')
  M.fetch(121, '(RFC822)')

 Take a look at the email module.  The message_from_string() function
 can convert the string representation of the email (as obtained by
 M.fetch(121, '(RFC822)')) into a message object.
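
A minimal sketch of that suggestion, written for Python 3, where imaplib returns bytes and email.message_from_bytes() is the counterpart of message_from_string(). The function name save_attachments is hypothetical, and treating "any part with a filename" as an attachment is an assumption that may not fit every message.

```python
import email
import os


def save_attachments(raw_message, target_dir):
    """Write every message part that carries a filename into target_dir."""
    msg = email.message_from_bytes(raw_message)
    saved = []
    for part in msg.walk():
        filename = part.get_filename()
        if not filename:
            continue  # body text and multipart containers have no filename
        payload = part.get_payload(decode=True)  # undoes base64 and similar
        if payload is None:
            continue
        # basename() guards against path components in the supplied name.
        path = os.path.join(target_dir, os.path.basename(filename))
        with open(path, 'wb') as handle:
            handle.write(payload)
        saved.append(path)
    return saved
```

With the session from the question, the raw message would typically come from "typ, data = M.fetch('121', '(RFC822)')" followed by "raw = data[0][1]".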


Thanks
Vincent


 --
 regards,
 kushal




-- 
Thanks
Vincent Davis
720-301-3003


Download an attachment from an IMAP email

2011-02-03 Thread Vincent Davis
I have a few emails I am trying to download from my google account. I seem
to be getting the message but each of these messages has an attachment. I
don't understand what I need to do to get and save the attachment to a local
file.
Here is what I have so far.
M = imaplib.IMAP4_SSL(IMAP_SERVER, IMAP_PORT)
rc, resp = M.login('x@', 'X')
print rc, resp
M.select('[Gmail]/All Mail')
M.search(None, 'FROM', 'some...@logitech.com')
#M.fetch(121, '(body[header.fields (subject)])')
M.fetch(121, '(RFC822)')

-- 
Thanks
Vincent Davis
720-301-3003


get python bit version as in (32 or 64)

2010-10-19 Thread Vincent Davis
How do I get the bit version of the installed Python? In my case, the OS X
Python 2.7 binary is installed. I know it runs as 64-bit because I can see it in
Activity Monitor, but how do I ask Python?
>>> sys.version
'2.7 (r27:82508, Jul  3 2010, 21:12:11) \n[GCC 4.0.1 (Apple Inc. build 5493)]'

-- 
Thanks
Vincent Davis


Re: get python bit version as in (32 or 64)

2010-10-19 Thread Vincent Davis
On Tue, Oct 19, 2010 at 3:29 PM, Philip Semanchuk phi...@semanchuk.com wrote:

 On Oct 19, 2010, at 5:18 PM, Vincent Davis wrote:

 How do I get the bit version of the installed python. In my case, osx
 python2.7 binary installed. I know it runs 64 bt as I can see it in
 activity monitor. but how do I ask python?
 sys.version
 '2.7 (r27:82508, Jul  3 2010, 21:12:11) \n[GCC 4.0.1 (Apple Inc. build 5493)]'


 I don't think there's an official way to do this. The canonical way appears 
 to be to test the value of sys.maxint and see whether or not it is a 32- or 
 64-bit long.

 See here for more details:

 http://stackoverflow.com/questions/1405913/how-do-i-determine-if-my-python-shell-is-executing-in-32bit-or-64bit-mode.

Great thanks
Vincent



 Cheers
 Philip




-- 
Thanks
Vincent Davis
720-301-3003


Re: get python bit version as in (32 or 64)

2010-10-19 Thread Vincent Davis
On Tue, Oct 19, 2010 at 3:55 PM, Philip Semanchuk phi...@semanchuk.com wrote:

 On Oct 19, 2010, at 5:38 PM, Hexamorph wrote:

 On 19.10.2010 23:18, Vincent Davis wrote:
 How do I get the bit version of the installed python. In my case, osx
 python2.7 binary installed. I know it runs 64 bt as I can see it in
 activity monitor. but how do I ask python?
 sys.version
 '2.7 (r27:82508, Jul  3 2010, 21:12:11) \n[GCC 4.0.1 (Apple Inc. build 5493)]'


 In [1]: import platform

 In [2]: platform.architecture()
 Out[2]: ('32bit', 'ELF')

 In [3]:


 Looks a lot better than my suggestion!
Yes, that looks like the right way of doing it. Interesting, though, that
platform.machine() returns i386 and not something about 64.
>>> print platform.machine()
i386
>>> print platform.architecture()
('64bit', '')
>>> import sys; sys.maxint
9223372036854775807
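
For comparison, a small sketch of two other checks that ask the interpreter itself rather than the platform: the size of a C pointer via struct, and sys.maxsize (available in Python 2.6+ and 3.x, whereas sys.maxint exists only in Python 2). The function names are hypothetical.

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width of the running interpreter, in bits."""
    return struct.calcsize('P') * 8


def is_64bit():
    # On a 32-bit build sys.maxsize is 2**31 - 1; on a 64-bit build it is
    # 2**63 - 1.
    return sys.maxsize > 2**32
```

These describe the build of the running interpreter, while platform.machine() reports the machine type as given by the operating system, which may explain why the two can disagree.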

Thanks
Vincent







-- 
Thanks
Vincent Davis
720-301-3003


Python script to install python

2010-07-08 Thread Vincent Davis
I would like to have a python script that would download the most
recent svn of python, configure, make, install and cleanup after
itself. I am not replacing the python version I would be using to run
the script.
I was struggling to get this to work and I assume someone else has
done it better.  Any pointers?

Thanks
Vincent


Re: Python script to install python

2010-07-08 Thread Vincent Davis
On Thu, Jul 8, 2010 at 9:11 AM, Daniel Fetchinson
fetchin...@googlemail.com wrote:
 I would like to have a python script that would download the most
 recent svn of python, configure, make, install and cleanup after
 itself. I am not replacing the python version I would be using to run
 the script.
 I was struggling to get this to work and I assume someone else has
 done it better.  Any pointers?

 Assuming you are on linux I recommend not using a python script for
 this but rather a shell script. From a python script you would most of
 the time be calling shell commands anyway. In a shell script you would
 do something like this:

 
 #!/bin/bash

 svn checkout 
 cd whatever
 ./configure --whatever-options-you-like
 make
 # you probably want to run this as root
 make install
 # you probably don't want to be root anymore
 cd ..
 rm -rf whatever
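
For anyone who still wants the same steps driven from Python, here is a rough sketch using subprocess. It mirrors the shell outline above and nothing more: the repository URL is left as a parameter because the outline does not give one, the function names are hypothetical, and privilege handling for "make install" is not addressed.

```python
import subprocess


def build_steps(source_url, checkout_dir, configure_options=()):
    """Return the commands as (argv, working_directory) pairs."""
    return [
        (['svn', 'checkout', source_url, checkout_dir], None),
        (['./configure'] + list(configure_options), checkout_dir),
        (['make'], checkout_dir),
        # May need elevated privileges, as noted in the shell version.
        (['make', 'install'], checkout_dir),
        (['rm', '-rf', checkout_dir], None),
    ]


def run_steps(steps, runner=subprocess.check_call):
    """Run each command in order; check_call raises if a step fails."""
    for argv, cwd in steps:
        runner(argv, cwd=cwd)
```

Because check_call raises on a non-zero exit status, a failed configure or make stops the sequence before the install and cleanup steps run.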

 

Ok I'll take your advice and just use a shell script. I am on osx by the way.

Thanks
Vincent

 If you are on windows I assume a similar strategy is best.

 Cheers,
 Daniel



 --
 Psss, psss, put it down! - http://www.cafepress.com/putitdown


Installing or adding python dev installed python.

2010-06-19 Thread Vincent Davis
I have several versions of python installed and some I have built from
source which seems to install the python-dev on osx. I know that on
ubuntu python-dev is an optional install. The main python version I
use is the enthought distribution. Can I install the python-dev tools
with this? How? Is there a good place for me to better understand what
python-dev is and how to get it installed on osx?

Thanks
Vincent


Re: getting up arrow in terminal to scroll thought history of python commands

2010-06-14 Thread Vincent Davis
On Sun, Jun 13, 2010 at 6:24 PM, Irmen de Jong irmen-nosp...@xs4all.nl wrote:
 On 14-6-2010 1:19, Vincent Davis wrote:

 I just installed 2.6 and 3.1 from current maintenance source on Mac
 OSx. When I am running as an interactive terminal session the up arrow
 does not scroll thought the history of the py commands I have entered
 I just get ^[[A. When I install from a compiled source it works fine.
 Whats the fix for this?

 Thanks
 Vincent

 I'm guessing you don't have the readline module.

 Compile and install GNU Readline, then type 'make' again in your Python
 source tree. It should now no longer report a missing 'readline' module.
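
A quick way to test that guess from inside the interpreter that shows the problem is to try importing the module. This is a minimal sketch; the function name has_readline is hypothetical.

```python
def has_readline():
    """Report whether this interpreter was built with the readline module."""
    try:
        import readline  # noqa: F401  (imported only to test availability)
    except ImportError:
        return False
    return True
```

If this returns False in the source-built interpreter and True in the one where the up arrow works, the missing module is the likely difference between the two.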

What exactly do you mean by 'make' again in your Python source tree?

Thanks
Vincent


 -irmen


Re: getting up arrow in terminal to scroll thought history of python commands

2010-06-14 Thread Vincent Davis
On Mon, Jun 14, 2010 at 6:49 AM, Thomas Jollans tho...@jollans.com wrote:
 On 06/14/2010 02:37 PM, Vincent Davis wrote:
 On Sun, Jun 13, 2010 at 6:24 PM, Irmen de Jong irmen-nosp...@xs4all.nl 
 wrote:
 On 14-6-2010 1:19, Vincent Davis wrote:

 I just installed 2.6 and 3.1 from current maintenance source on Mac
 OSx. When I am running as an interactive terminal session the up arrow
 does not scroll thought the history of the py commands I have entered
 I just get ^[[A. When I install from a compiled source it works fine.
 Whats the fix for this?

 Thanks
 Vincent

 I'm guessing you don't have the readline module.

 Compile and install GNU Readline, then type 'make' again in your Python
 source tree. It should now no longer report a missing 'readline' module.

 What exactly do you mean by 'make' again in your Python source tree.

 You installed Python from source didn't you? At some point you'll have
 to invoke make, unless some tool did that for you.

 Anyway, make sure readline is installed, and then recompile Python.

So I should run
./configure
make install
again?
Will this overwrite other py packages I have installed?

Vincent





 Thanks
 Vincent


 -irmen


getting up arrow in terminal to scroll thought history of python commands

2010-06-13 Thread Vincent Davis
I just installed 2.6 and 3.1 from current maintenance source on Mac
OS X. When I am running an interactive terminal session, the up arrow
does not scroll through the history of the py commands I have entered;
I just get ^[[A. When I install from a compiled source it works fine.
What's the fix for this?

Thanks
Vincent


Re: getting up arrow in terminal to scroll thought history of python commands

2010-06-13 Thread Vincent Davis
On Sun, Jun 13, 2010 at 5:28 PM, Gerry Reno gr...@verizon.net wrote:
 sounds like your keymapping got messed with.

 you could just:
 set -o vi
 python
 ESC, Ctrl-j
 and now ESC-k and ESC-j will take you back and forth in history (std vi
 editing)

This is done within python? Let me make sure I am clear. This is only an
issue within the interactive python for the python dist I have built
from source, not other pythons or the terminal in general. I'll look into the
commands you suggested more, but ESC-k and ESC-j don't sound very
appealing to me.

Thanks
Vincent

 -Gerry



 Jun 13, 2010 07:22:40 PM, vinc...@vincentdavis.net wrote:

 I just installed 2.6 and 3.1 from current maintenance source on Mac
 OSx. When I am running as an interactive terminal session the up arrow
 does not scroll thought the history of the py commands I have entered
 I just get ^[[A. When I install from a compiled source it works fine.
 Whats the fix for this?

 Thanks
 Vincent


Re: lambda question

2010-06-12 Thread Vincent Davis
On Fri, Jun 11, 2010 at 10:11 PM, Ian Kelly ian.g.ke...@gmail.com wrote:
 On Fri, Jun 11, 2010 at 9:31 PM, Vincent Davis vinc...@vincentdavis.net 
 wrote:
 Starting with an example.
 In [23]: x = [1,2,3,4,4,4,5,5,3,2,2,]
 In [24]: y = set(x)
 In [25]: y
 Out[25]: set([1, 2, 3, 4, 5])
 In [26]: y2 = len(set(x))
 In [27]: y2
 Out[27]: 5

 How would I do the above y2 = len(set(x)) but have len(set()) in a
 dictionary. I know how to do ..
 In [30]: d = dict(s=set)
 In [32]: d['s'](x)
 Out[32]: set([1, 2, 3, 4, 5])

 but not sure how to add the len() and thought maybe the answer in a
 lambda function.
 I know I could def a function but would prefer to keep it all on one line.

 d = dict(s=lambda x: len(set(x)))
 d['s'](x)
 5
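
The same idea extends to a small table of named callables, mixing plain functions and lambdas. This is an illustrative sketch; the key names and the use of max are made up for the example.

```python
x = [1, 2, 3, 4, 4, 4, 5, 5, 3, 2, 2]

# Each value is a callable; a lambda is only needed where two calls are
# composed, as with len(set(...)).
summaries = {
    'unique': set,
    'n_unique': lambda seq: len(set(seq)),
    'largest': max,
}

# Apply every entry to the same data and collect the results by name.
results = dict((name, func(x)) for name, func in summaries.items())
```

Here results['n_unique'] is 5, matching the y2 value in the original question.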

I must have been half asleep, I thought for sure I tried that. Well it
works great this morning :)

Thanks
Vincent

 Cheers,
 Ian


lambda question

2010-06-11 Thread Vincent Davis
Starting with an example.
In [23]: x = [1,2,3,4,4,4,5,5,3,2,2,]
In [24]: y = set(x)
In [25]: y
Out[25]: set([1, 2, 3, 4, 5])
In [26]: y2 = len(set(x))
In [27]: y2
Out[27]: 5

How would I do the above y2 = len(set(x)) but have len(set()) in a
dictionary? I know how to do ..
In [30]: d = dict(s=set)
In [32]: d['s'](x)
Out[32]: set([1, 2, 3, 4, 5])

but I am not sure how to add the len() and thought maybe the answer is a
lambda function.
I know I could def a function but would prefer to keep it all on one line.

Thanks
Vincent


Re: Python Jobs

2010-06-09 Thread Vincent Davis
On Wed, Jun 9, 2010 at 2:21 PM, Michael Chambliss em...@mchambliss.com wrote:
 I use Python for my own entertainment and for quick jobs, but haven't been
 able to use it professionally up to this point.  As a former Perl developer
 and someone that's currently required to code in Java I'm starting to wish I
 had this opportunity.  Can anyone comment on the Python job market?  If
 you're currently employed writing Python apps, I'd be particularly
 interested in knowing any of the following:
 - Your location - country, state or city, whatever you care to provide
 - Your focus - Product Development (web sites/apps), Education, R&D/Science,
 IT/Sys Admin, etc
 - Your company size
 - Your compensation relative to the .NET/Java developers you know -
 generally higher/lower?

 In my area (Denver, CO) I predominantly see Java positions, followed closely
 by .NET.  I'll occasionally see something pop up related to PHP or Ruby web
 development but hardly ever Python, so I'm just curious if I'm looking in
 the wrong places.
 Thanks for any input!
 -Mike

You might take a look at Front Range Pythoneers. There is a mailing list
and, I think, monthly meetups.  I see some job posts come across the list
now and then.
http://www.meetup.com/frpythoneers/

I am also in the Denver area and have been meaning to go to one of the meetups.

Vincent

