[issue34870] Core dump when Python VSCode debugger is attached

2018-10-02 Thread Per Lundberg


New submission from Per Lundberg :

My code has recently started triggering a core dump in the Python executable 
when the VSCode debugger is attached. This doesn't happen right away; it seems 
to happen more or less _after_ the program is done executing (I just placed a 
breakpoint and stepped it through).

The program in question is this: 
https://github.com/hiboxsystems/trac-to-gitlab/blob/master/migrate.py

To help in the debugging of this, I installed python2.7-dbg and gdb-python2 on 
my Debian machine, and re-ran the script using this version. Here is the GDB 
output when analyzing the backtrace:

$ gdb /usr/bin/python2.7-dbg core
GNU gdb (Debian 8.1-4+b1) 8.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/python2.7-dbg...done.
[New LWP 19749]
[New LWP 19744]
[New LWP 19747]
[New LWP 19754]
[New LWP 19751]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/python2.7-dbg -m ptvsd --host localhost --port 
43959 migrate.py --only'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  PyEval_EvalFrameEx (f=0x7f815c002310, throwflag=0) at ../Python/ceval.c:3347
3347            if (tstate->frame->f_exc_type != NULL)
[Current thread is 1 (Thread 0x7f815bfff700 (LWP 19749))]


The python backtrace looks like this:

(gdb) py-bt
Traceback (most recent call first):
  File "/usr/lib/python2.7/threading.py", line 371, in wait
    self._acquire_restore(saved_state)
  File "/usr/lib/python2.7/Queue.py", line 177, in get
    self.not_empty.wait(remaining)
  File "/home/per/.vscode/extensions/ms-python.python-2018.8.0/pythonFiles/experimental/ptvsd/ptvsd/_vendored/pydevd/_pydevd_bundle/pydevd_comm.py", line 458, in _on_run
    cmd = self.cmdQueue.get(1, 0.1)
  File "/home/per/.vscode/extensions/ms-python.python-2018.8.0/pythonFiles/experimental/ptvsd/ptvsd/_vendored/pydevd/_pydevd_bundle/pydevd_comm.py", line 319, in run
    self._on_run()
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 774, in __bootstrap
    self.__bootstrap_inner()


And the C-level backtrace:

(gdb) bt 
#0  PyEval_EvalFrameEx (f=Frame 0x7f815c002310, for file 
/usr/lib/python2.7/threading.py, line 371, in wait (), throwflag=0)
at ../Python/ceval.c:3347
#1  0x5624534af42c in PyEval_EvalCodeEx (co=0x7f816216e7d0, 
globals={'current_thread': None, '_BoundedSemaphore': None, 
'currentThread': None, '_Timer': None, '_format_exc': None, 'Semaphore': None, 
'_deque': None, 'activeCount': None, '_profile_hook': None, '_sleep': None, 
'_trace_hook': None, 'ThreadError': None, '_enumerate': None, 
'_start_new_thread': None, 'BoundedSemaphore': None, '_shutdown': None, 
'__all__': None, '_original_start_new_thread': None, '_Event': None, 
'active_count': None, '__package__': None, '_Condition': None, '_RLock': None, 
'_test': None, 'local': None, '__doc__': None, 'Condition': None, '_Verbose': 
None, '_DummyThread': None, 'Thread': None, 'warnings': None, '__builtins__': 
{'bytearray': None, 'IndexError': None, 'all': None, 'help': None, 'vars': 
None, 'SyntaxError': None, 'unicode': None, 'UnicodeDecodeError': None, 
'memoryview': None, 'isinstance': None, 'copyright': None, 'NameError': None, 
'BytesWarning': None, 'dict': None, 'input': None, 'oct': None, 'bin': None, 
'SystemExit': None, 'StandardError': No
 ne, 'format': None, 'repr': None, 'sor...(truncated), locals=0x0, 
args=0x562454463068, argcount=2, 
kws=0x562454463078, kwcount=0, defs=0x7f8162116408, defcount=1, 
closure=0x0) at ../Python/ceval.c:3604
#2  0x5624534b23a7 in fast_function (func=, pp_stack=0x7f815bffd3e8, n=2, na=2, nk=0)
at ../Python/ceval.c:4467
#3  0x5624534b1f8a in call_function (pp_stack=0x7f815bffd3e8, oparg=1) at 
../Python/ceval.c:4392
#4  0x5624534ac45d in PyEval_EvalFrameEx (
f=Frame 0x562454462eb0, for file /usr/lib/python2.7/Queue.py, line 177, in 
get (self=, maxsize=0, all_tasks_done=<_Condition(_Verbose__verbose=False, 
_Condition__lock=, acquire=, 
_Condition__waiters=[], release=) at remote 0x7f81

[issue25144] 3.5 Win install fails with "TARGETDIR"

2017-11-20 Thread Per Fryking

Per Fryking <fryk...@gmail.com> added the comment:

Got the same issue with the 3.6 installer from python.org.

The thing is that I can't elevate the privileges to administrator, so I'm
stuck.

Uploading the log. Running Windows 7.

--
nosy: +Per Fryking
Added file: https://bugs.python.org/file47278/Python 3.6.3 
(32-bit)_20171120135800.log




[issue22544] Inconsistent cmath.log behaviour

2015-04-25 Thread Per Brodtkorb

Per Brodtkorb added the comment:

This is not only a problem for division. It also applies to multiplication as 
exemplified here:

In [16]: complex(0, inf) + 1          # expect 1 + infj
Out[16]: (1+infj)

In [17]: (complex(0, inf) + 1) * 1    # expect 1 + infj
Out[17]: (nan+infj)

In [18]: complex(inf, 0) + 1j         # expect inf + 1j
Out[18]: (inf+1j)

In [19]: (complex(inf, 0) + 1j) * 1   # expect inf + 1j
Out[19]: (inf+nanj)
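
For context on where the nan comes from: complex multiplication is done
component-wise, and inf * 0 is nan in IEEE arithmetic, so the imaginary part
picks up a nan even though both operands look harmless. A quick illustration
in plain Python (not tied to any particular fix):

# (inf+1j) * (1+0j) is evaluated component-wise:
#   real = inf*1 - 1*0 = inf
#   imag = inf*0 + 1*1 = nan + 1 = nan
inf = float("inf")
print inf * 1 - 1 * 0   # inf
print inf * 0 + 1 * 1   # nan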

--
nosy: +pbrod
versions: +Python 2.7




[issue14507] Segfault with starmap and izip combo

2012-04-05 Thread Per Myren

New submission from Per Myren progr...@gmail.com:

The following code crashes with a segfault on Python 2.7.2:

from operator import add
from itertools import izip, starmap

a = b = [1]
for i in xrange(10):
    a = starmap(add, izip(a, b))

list(a)


It also crashes with Python 3.2.2:

from operator import add
from itertools import starmap

a = b = [1]
for i in range(10):
    a = starmap(add, zip(a, b))

list(a)

--
components: Library (Lib)
messages: 157576
nosy: progrper
priority: normal
severity: normal
status: open
title: Segfault with starmap and izip combo
type: crash
versions: Python 2.7, Python 3.2




[issue6715] xz compressor support

2011-11-30 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

Ah, I thought he had reused most of the original C code in _lzmamodule.c
rather than replacing it with Python code, but I see now that that's not the
case (only slight fragments ;).

Oh well, I thought I'd still earned a note with some slight credit at least,
but I guess I won't go postal or anything over the lack of either.. :p

--




[issue6715] xz compressor support

2011-11-29 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

Not meaning to sound petty, but wouldn't it be common etiquette to keep the
original copyright notice from the original code intact..?

--




[issue12013] file /usr/local/lib/python3.1/lib-dynload/_socket.so: symbol inet_aton: referenced symbol not found

2011-11-16 Thread Per Rosengren

Per Rosengren per.roseng...@gmail.com added the comment:

On Linux:
nm -C /lib/libc.so.6 |grep ' inet_aton'
000cbce0 W inet_aton

This means that when Python is built with GCC (as on Linux), inet_aton
is in the system libc.

If you build with GCC on Solaris, inet_aton will be taken from the GCC lib dir.
You need to put that GCC lib dir in your LD_LIBRARY_PATH when you run Python.

--
nosy: +Per.Rosengren




[issue12394] packaging: generate scripts from callable (dotted paths)

2011-06-24 Thread Per Cederqvist

Changes by Per Cederqvist ce...@lysator.liu.se:


--
nosy: +ceder




[issue11817] berkeley db 5.1 support

2011-04-09 Thread Per Øyvind Karlsen

New submission from Per Øyvind Karlsen peroyv...@mandriva.org:

This patch adds support for Berkeley DB >= 5.1.

--
components: Extension Modules
files: Python-2.7.1-berkeley-db-5.1.patch
keywords: patch
messages: 133442
nosy: proyvind
priority: normal
severity: normal
status: open
title: berkeley db 5.1 support
versions: Python 2.7
Added file: http://bugs.python.org/file21601/Python-2.7.1-berkeley-db-5.1.patch




[issue11817] berkeley db 5.1 support

2011-04-09 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

forgot some additional config checks in setup.py in previous patch..

--




[issue11817] berkeley db 5.1 support

2011-04-09 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

sloppysloppy...

fix previous patch

--
Added file: http://bugs.python.org/file21602/Python-2.7.1-berkeley-db-5.1.patch




[issue6715] xz compressor support

2010-10-31 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

I've uploaded a new version of the patch to 
http://codereview.appspot.com/2724043/ now.

I'd be okay on doing maintenance directly against the CPython repository btw. :)

--




[issue6715] xz compressor support

2010-10-31 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

LZMAFile, LZMACompressor & LZMADecompressor are all inspired by and written to
be as similar as possible to bz2's, for easier use & maintenance. I must admit
that I haven't really put much thought into alternate ways to implement them
beyond monkey see, monkey do.. ;)
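
For readers who haven't used the bz2-style incremental API being mirrored
here, a minimal sketch of that pattern using the stdlib bz2 module (this is
only an illustration of the shape of the API, not the patch's actual code):

import bz2

data = "hello world" * 100

# one-shot compression/decompression
packed = bz2.compress(data)
assert bz2.decompress(packed) == data

# incremental compression, feeding chunks as they arrive
compressor = bz2.BZ2Compressor()
out = compressor.compress(data[:500])
out += compressor.compress(data[500:])
out += compressor.flush()
assert bz2.decompress(out) == data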

LZMAOptions is a bit awkwardly written, but it doesn't serve documentation
purposes only: it also exposes the max/min values etc. to Python (i.e. as used
by its regression tests), and they are also used when processing the various
compression options passed in.

IMO it does serve a useful purpose, but it certainly wouldn't hurt to be
rewritten in some better way...

--




[issue6715] xz compressor support

2010-10-31 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

Hehe, don't feel guilty on my part at least, I had already implemented it like
this long before. :p

I guess I could rewrite it following these suggestions, but I probably won't be 
able to finish it in time for 3.2 beta.

--




[issue6715] xz compressor support

2010-10-28 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

All fixed now. :)

--




[issue6715] xz compressor support

2010-10-28 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

Here's a patch with the latest code generated against py3k branch, it comes 
with Doc/library/lzma.rst as well now.

--
keywords: +patch
Added file: http://bugs.python.org/file19405/py3k-lzmamodule.patch




[issue6715] xz compressor support

2010-10-28 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

here's Lib/test/teststring.lzma, required by the test suite.

--
Added file: http://bugs.python.org/file19406/teststring.lzma




[issue6715] xz compressor support

2010-10-28 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

here's Lib/test/teststring.xz, required by the test suite.

--
Added file: http://bugs.python.org/file19407/teststring.xz




[issue6715] xz compressor support

2010-10-27 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

I've (finally) finalized the api and prepared pyliblzma to be ready for 
inclusion now.

The code can be found in the 'py3k' branch referred to earlier.

Someone else (don't remember who:p) volunteered for writing the PEP earlier, so 
I leave it up to that person to write the PEP, I won't be able to get around to 
do so myself in the near future..

--




[issue6715] xz compressor support

2010-05-29 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

I've ported pyliblzma to py3k now and also implemented the missing
functionality I mentioned earlier. For anyone interested in my progress, the
branch is found at:
https://code.launchpad.net/~proyvind/pyliblzma/py3k

I still need to fix some memory leaks (a side effect of the new
PyUnicode/PyBytes change that I'm not 100% comfortable with yet ;) and various
memory errors reported by valgrind, but things are starting to look quite nice
already. :)

--




[issue4015] [patch] make installed scripts executable on windows

2010-05-28 Thread Per

Per pybugs.pho...@safersignup.com added the comment:

On POSIX the interpreter is read from the first line of a file.
On Windows the interpreter is read from the registry under
HKEY_CLASSES_ROOT\.file-extension.

So the correct way to associate an interpreter with a file is to invent a
file extension for every interpreter.
Just as there are /usr/bin/python, /usr/bin/python3 and /usr/bin/python3.1 on
POSIX, there should be .py, .py3 and .py31 on Windows!

I attached an example registry patch to register extensions for 2.5, 2.6 and 3.1.
If you want to use it, you need to adjust the paths!

I propose changing all Python Windows installers to install versioned
extensions.

If you want a switcher application, it should read the first line of the
script and match it against ".*/python(.*)$". That way the default POSIX
"#!/usr/bin/python3.1" can be kept unchanged. With that regex, the app path
can be read from
HKEY_LOCAL_MACHINE\SOFTWARE\Python\PythonCore\<regex-match>\InstallPath\.

BTW.
It would be nice if Python would call itself "Python 3.1" instead of "python"
in the "Open with..." list! The current naming is problematic if you install
more than one Python version.

--
nosy: +phobie
Added file: http://bugs.python.org/file17481/hklm_python_extensions.reg




[issue5689] please support lzma compression as an extension and in the tarfile module

2010-05-26 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

if you're already looking at issue6715, then I don't get why you're asking.. ;)

quoting from msg106433:
For my code, feel free to use your own/any other license you'd like or even 
public domain (if the license of bz2module.c that much of it's derived from 
permits of course)!

The reason why I picked LGPLv3 in the past was simply just because liblzma at 
the time was licensed under it, so I just picked the same for simplicity.
I've actually already dual-licensed it under the python license in addition on 
the project page though, but I just forgot updating the module's metadata as 
well before I released 0.5.3 last month..

Martin: For LGPL (or even GPL for that matter, disregarding linking 
restrictions) libraries you don't have to distribute the sources of those 
libraries at all (they're already made available by others, so that would be 
quite overly redundant, uh?;). LGPL actually doesn't even care at all about the 
license of your software as long as you only dynamically link against it.

I don't really get what the issue would be even if liblzma were still LGPL, it 
doesn't prohibit you from distributing a dynamically linked library along with 
python either if necessary (which of course would be of convenience on 
win32..)..

tsktsk, discussions about python module for xz compression should anyways be 
kept at issue6715 as this one is about the tarfile module ;p

--




[issue6715] xz compressor support

2010-05-26 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

Yeah, I guess I anyways can just break the current API right away to make it 
compatible with future changes, I've already figured since long ago how it 
should look like. It's not like I have to implement the actual functionality to 
ensure compatibility, no-op works like charm. ;)

--




[issue5689] please support lzma compression as an extension and in the tarfile module

2010-05-25 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

I'm the author of the pyliblzma module, and if desired, I'd be happy to help 
out adapting pyliblzma for inclusion with python.
Most of it's code is based on bz2module.c, so it shouldn't be very far away 
from being good 'nuff.
What I see as required is:
* clean out use of C99 types etc.
* clean up the LZMAOptions class (this is the biggest difference from the bz2 
module, as the filter supports a wide range of various options, everything 
related such as parsing, api documentation etc. was placed in it's own class, 
I've yet to receive any feedback on this decission or find any remote 
equivalents out there to draw inspiration from;)
* While most of the liblzma API has been implemented, support for 
multiple/alternate filters still remains to be implemented. When done it will 
also cause some breakage with the current pyliblzma API.

I plan on doing these things sooner or later anyways, it's pretty much just a 
matter of motivation and priorities standing in the way, actual interest from 
others would certainly have a positive effect on this. ;)

For other alternatives to the LGPL liblzma, you really don't have any, keep in 
mind that LZMA is merely the algorithm, while xz (and LZMA_alone, used for 
'.lzma', now obsolete, but still supported) are the actual format you want 
support for. The LZMA SDK does not provide any compatibility for this.

--
nosy: +proyvind




[issue5689] please support lzma compression as an extension and in the tarfile module

2010-05-25 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

ps: pylzma uses the LZMA SDK, which is not what you want.
pyliblzma (not the same module;) OTOH uses liblzma, which is the library used 
by xz/lzma utils

You'll find it available at http://launchpad.net/pyliblzma

--




[issue6715] xz compressor support

2010-05-25 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

Ooops, I kinda should've commented on this issue here in stead, rather than in 
issue5689, so I'll just copy-paste it here as well:

I'm the author of the pyliblzma module, and if desired, I'd be happy to help 
out adapting pyliblzma for inclusion with python.
Most of it's code is based on bz2module.c, so it shouldn't be very far away 
from being good 'nuff.
What I see as required is:
* clean out use of C99 types etc.
* clean up the LZMAOptions class (this is the biggest difference from the bz2 
module, as the filter supports a wide range of various options, everything 
related such as parsing, api documentation etc. was placed in it's own class, 
I've yet to receive any feedback on this decission or find any remote 
equivalents out there to draw inspiration from;)
* While most of the liblzma API has been implemented, support for 
multiple/alternate filters still remains to be implemented. When done it will 
also cause some breakage with the current pyliblzma API.

I plan on doing these things sooner or later anyways, it's pretty much just a 
matter of motivation and priorities standing in the way, actual interest from 
others would certainly have a positive effect on this. ;)

For other alternatives to the LGPL liblzma, you really don't have any, keep in 
mind that LZMA is merely the algorithm, while xz (and LZMA_alone, used for 
'.lzma', now obsolete, but still supported) are the actual format you want 
support for. The LZMA SDK does not provide any compatibility for this.

--
nosy: +proyvind




[issue6715] xz compressor support

2010-05-25 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

ah, you're right, I forgot that the license for the library had changed as well 
(motivated by attempt of pleasing BSD people IIRC;), in the past the library 
was LGPL while only the 'xz' util was public domain..

For my code, feel free to use your own/any other license you'd like or even 
public domain (if the license of bz2module.c that much of it's derived from 
permits of course)!

I guess everyone should be happy now then. :)

Btw. for review, I think the code already available should be pretty much good 
'nuff for an initial review. Some feedback on things not derived from 
bz2module.c would be nice, especially the LZMAOptions class would be nice as 
it's where most of the remaining work required for adding additional filters 
support. Would kinda blow if I did the work using an approach that would be 
dismissed as utterly rubbish. ;)

Oh well, it's out there available for anyone already, I probably 
won't(/shouldn't;) have time for it in a month at least, do as you please 
meanwhile. :)

--




Re: creating pipelines in python

2009-11-25 Thread per
Thanks to all for your replies.  i want to clarify what i mean by a
pipeline.  a major feature i am looking for is the ability to chain
functions or scripts together, where the output of one script -- which
is usually a file -- is required for another script to run.  so one
script has to wait for the other.  i would like to do this over a
cluster, where some of the scripts are distributed as separate jobs on
a cluster but the results are then collected together.  so the ideal
library would have easy facilities for expressing things like this:
scripts X and Y run independently, but script Z depends on the output
of X and Y (which is such and such file or file flag).

is there a way to do this? i prefer not to use a framework that
requires control of the clusters etc. like Disco, but something that's
lightweight and simple. right now ruffus seems most relevant but i am
not sure -- are there other candidates?

thank you.

On Nov 23, 4:02 am, Paul Rudin paul.nos...@rudin.co.uk wrote:
 per perfr...@gmail.com writes:
  hi all,

  i am looking for a python package to make it easier to create a
  pipeline of scripts (all in python). what i do right now is have a
  set of scripts that produce certain files as output, and i simply have
  a master script that checks at each stage whether the output of the
  previous script exists, using functions from the os module. this has
  several flaws and i am sure someone has thought of nice abstractions
  for making these kind of wrappers easier to write.

  does anyone have any recommendations for python packages that can do
  this?

 Not entirely what you're looking for, but the subprocess module is
 easier to work with for this sort of thing than os. See e.g. 
 http://docs.python.org/library/subprocess.html#replacing-shell-pipeline
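
A very small sketch of the file-dependency idea described at the top of this
message, using only the stdlib (the script and output file names are invented;
real cluster submission would replace the subprocess calls):

import os
import subprocess

def run_step(cmd, output_file):
    # run a step only if its output file doesn't exist yet, and wait for it
    if not os.path.exists(output_file):
        subprocess.check_call(cmd)
    return output_file

# X and Y are independent; Z runs only once both of their outputs exist
x_out = run_step(["python", "script_x.py"], "x_output.txt")
y_out = run_step(["python", "script_y.py"], "y_output.txt")
run_step(["python", "script_z.py", x_out, y_out], "z_output.txt")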



creating pipelines in python

2009-11-22 Thread per
hi all,

i am looking for a python package to make it easier to create a
pipeline of scripts (all in python). what i do right now is have a
set of scripts that produce certain files as output, and i simply have
a master script that checks at each stage whether the output of the
previous script exists, using functions from the os module. this has
several flaws and i am sure someone has thought of nice abstractions
for making these kinds of wrappers easier to write.

does anyone have any recommendations for python packages that can do
this?

thanks.


efficiently splitting up strings based on substrings

2009-09-05 Thread per
I'm trying to efficiently split strings based on what substrings
they are made up of.
i have a set of strings that are comprised of known substrings.
For example, a, b, and c are substrings that are not identical to each
other, e.g.:
a = "0" * 5
b = "1" * 5
c = "2" * 5

Then my_string might be:

my_string = a + b + c

i am looking for an efficient way to solve the following problem.
suppose i have a short string x that is a substring of my_string.  I want to
split the string x into blocks based on what substrings (i.e. a, b, or c)
chunks of x fall into.

to illustrate this, suppose x = "00111". Then I can detect where x starts in
my_string using my_string.find(x).  But I don't know how to partition x into
blocks depending on the substrings.  What I want to get out in this case is:
"00", "111".  If x were "00122", I'd want to get out "00", "1", "22".

is there an easy way to do this?  i can't simply split x on a, b, or c
because these might
not be contained in x.  I want to avoid doing something inefficient
like looking at all substrings
of my_string etc.

i wouldn't mind using regular expressions for this but i cannot think
of an easy regular
expression for this problem.  I looked at the string module in the
library but did not see
anything that seemed related but i might have missed it.

any help on this would be greatly appreciated.  thanks.


Re: efficiently splitting up strings based on substrings

2009-09-05 Thread per
On Sep 5, 6:42 pm, Rhodri James rho...@wildebst.demon.co.uk wrote:
 On Sat, 05 Sep 2009 22:54:41 +0100, per perfr...@gmail.com wrote:
  I'm trying to efficiently split strings based on what substrings
  they are made up of.
  i have a set of strings that are comprised of known substrings.
  For example, a, b, and c are substrings that are not identical to each
  other, e.g.:
  a = 0 * 5
  b = 1 * 5
  c = 2 * 5

  Then my_string might be:

  my_string = a + b + c

  i am looking for an efficient way to solve the following problem.
  suppose i have a short
  string x that is a substring of my_string.  I want to split the
  string x into blocks based on
  what substrings (i.e. a, b, or c) chunks of s fall into.

  to illustrate this, suppose x = 00111. Then I can detect where x
  starts in my_string
  using my_string.find(x).  But I don't know how to partition x into
  blocks depending
  on the substrings.  What I want to get out in this case is: 00,
  111.  If x were 00122,
  I'd want to get out 00,1, 22.

  is there an easy way to do this?  i can't simply split x on a, b, or c
  because these might
  not be contained in x.  I want to avoid doing something inefficient
  like looking at all substrings
  of my_string etc.

  i wouldn't mind using regular expressions for this but i cannot think
  of an easy regular
  expression for this problem.  I looked at the string module in the
  library but did not see
  anything that seemd related but i might have missed it.

 I'm not sure I understand your question exactly.  You seem to imply
 that the order of the substrings of x is consistent.  If that's the
 case, this ought to help:

  >>> import re
  >>> x = "00122"
  >>> m = re.match(r"(0*)(1*)(2*)", x)
  >>> m.groups()
  ('00', '1', '22')
  >>> y = "00111"
  >>> m = re.match(r"(0*)(1*)(2*)", y)
  >>> m.groups()
  ('00', '111', '')

 You'll have to filter out the empty groups for yourself, but that's
 no great problem.

 --
 Rhodri James *-* Wildebeest Herder to the Masses

The order of the substrings is consistent but what if it's not 0, 1, 2
but a more complicated string? e.g.

a = "1030405", b = "1babcf", c = "fUUIUP"

then the substring x might be "4051ba", in which case using a regexp
with (1*) will not work since both the a and b substrings begin with the
character "1".

your solution works if that weren't a possibility, so what you wrote
is definitely the kind of solution i am looking for. i am just not
sure how to solve it in the general case where the substrings might be
similar to each other (but not similar enough that you can't tell
where the substring came from).



Re: efficiently splitting up strings based on substrings

2009-09-05 Thread per
On Sep 5, 7:07 pm, Rhodri James rho...@wildebst.demon.co.uk wrote:
 On Sat, 05 Sep 2009 23:54:08 +0100, per perfr...@gmail.com wrote:
  On Sep 5, 6:42 pm, Rhodri James rho...@wildebst.demon.co.uk wrote:
  On Sat, 05 Sep 2009 22:54:41 +0100, per perfr...@gmail.com wrote:
   I'm trying to efficiently split strings based on what substrings
   they are made up of.
   i have a set of strings that are comprised of known substrings.
   For example, a, b, and c are substrings that are not identical to each
   other, e.g.:
   a = 0 * 5
   b = 1 * 5
   c = 2 * 5

   Then my_string might be:

   my_string = a + b + c

   i am looking for an efficient way to solve the following problem.
   suppose i have a short
   string x that is a substring of my_string.  I want to split the
   string x into blocks based on
   what substrings (i.e. a, b, or c) chunks of s fall into.

   to illustrate this, suppose x = 00111. Then I can detect where x
   starts in my_string
   using my_string.find(x).  But I don't know how to partition x into
   blocks depending
   on the substrings.  What I want to get out in this case is: 00,
   111.  If x were 00122,
   I'd want to get out 00,1, 22.

   is there an easy way to do this?  i can't simply split x on a, b, or c
   because these might
   not be contained in x.  I want to avoid doing something inefficient
   like looking at all substrings
   of my_string etc.

   i wouldn't mind using regular expressions for this but i cannot think
   of an easy regular
   expression for this problem.  I looked at the string module in the
   library but did not see
   anything that seemd related but i might have missed it.

  I'm not sure I understand your question exactly.  You seem to imply
  that the order of the substrings of x is consistent.  If that's the
  case, this ought to help:

   >>> import re
   >>> x = "00122"
   >>> m = re.match(r"(0*)(1*)(2*)", x)
   >>> m.groups()
   ('00', '1', '22')
   >>> y = "00111"
   >>> m = re.match(r"(0*)(1*)(2*)", y)
   >>> m.groups()
   ('00', '111', '')

  You'll have to filter out the empty groups for yourself, but that's
  no great problem.

  The order of the substrings is consistent but what if it's not 0, 1, 2
  but a more complicated string? e.g.

  a = 1030405, b = 1babcf, c = fUUIUP

  then the substring x might be 4051ba, in which case using a regexp
  with (1*) will not work since both a and b substrings begin with the
  character 1.

 Right.  This looks approximately nothing like what I thought your
 problem was.  Would I be right in thinking that you want to match
 substrings of your potential substrings against the string x?

 I'm sufficiently confused that I think I'd like to see what your
 use case actually is before I make more of a fool of myself.

 --
 Rhodri James *-* Wildebeest Herder to the Masses

it's exactly the same problem, except there are no constraints on the
strings.  so the problem is, like you say, matching the substrings
against the string x. in other words, finding out where x aligns to
the ordered substrings abc, and then determine what chunk of x belongs
to a, what chunk belongs to b, and what chunk belongs to c.

so in the example i gave above, the substrings are: a = "1030405", b =
"1babcf", c = "fUUIUP", so abc = "10304051babcffUUIUP"

given a substring like "4051ba", i'd want to split it into the chunks a,
b, and c. in this case, i'd want the result to be: ["405", "1ba"] --
i.e. "405" is the chunk of x that belongs to a, and "1ba" the chunk
that belongs to b. in this case, there are no chunks of c.  if x
instead were "4051babcffUU", the right output is: ["405", "1babcf",
"fUU"], which are the corresponding chunks of a, b, and c that make up
x respectively.

i'm not sure how to approach this. any ideas/tips would be greatly
appreciated. thanks again.
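
One way to do the alignment-and-split described above without regular
expressions is to find where x lands in the concatenation and intersect that
span with the known boundaries of a, b and c. A rough sketch; it assumes x
occurs only once in the concatenation, since find() returns the first match:

def split_by_pieces(x, pieces):
    whole = "".join(pieces)
    start = whole.find(x)              # where x begins in abc
    if start == -1:
        return None
    end = start + len(x)
    chunks = []
    offset = 0
    for piece in pieces:
        lo = max(start, offset)
        hi = min(end, offset + len(piece))
        if lo < hi:                    # x overlaps this piece
            chunks.append(whole[lo:hi])
        offset += len(piece)
    return chunks

print split_by_pieces("4051ba", ["1030405", "1babcf", "fUUIUP"])
# -> ['405', '1ba']
print split_by_pieces("4051babcffUU", ["1030405", "1babcf", "fUUIUP"])
# -> ['405', '1babcf', 'fUU']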


allowing output of code that is unittested?

2009-07-15 Thread per
hi all,

i am using the standard unittest module to unit test my code. my code
contains several print statements which i noticed are suppressed when i
call my unit tests using:

if __name__ == '__main__':
suite = unittest.TestLoader().loadTestsFromTestCase(TestMyCode)
unittest.TextTestRunner(verbosity=2).run(suite)

is there a way to allow all the print statements in the code that is
being run by the unit test functions to be printed to stdout?  i want
to be able to see the output of the tested code, in addition to the
output of the unit testing framework.

thank you.
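
For what it's worth, the text test runner writes its own report to sys.stderr
by default, while print statements in the tested code go to sys.stdout, so
the two streams can be separated at the shell. A small self-contained sketch
(the test case here is a made-up stand-in for TestMyCode):

import sys
import unittest

class TestMyCode(unittest.TestCase):
    def test_something(self):
        print "output from the code under test"   # goes to stdout
        self.assertEqual(1, 1)

if __name__ == '__main__':
    suite = unittest.TestLoader().loadTestsFromTestCase(TestMyCode)
    # the runner reports to stderr by default, so e.g.
    #     python test_mycode.py > code_output.txt
    # keeps the code's print output apart from the test report
    unittest.TextTestRunner(stream=sys.stderr, verbosity=2).run(suite)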


fastest native python database?

2009-06-17 Thread per
hi all,

i'm looking for a native python package to run a very simple
database. i was originally using cPickle with dictionaries for my problem,
but i was making dictionaries out of very large text files (around
1000MB in size) and pickling was simply too slow.

i am not looking for fancy SQL operations, just very simple database
operations (doesn't have to be SQL style) and my preference is for a
module that just needs python and doesn't require me to run a separate
database like Sybase or MySQL.

does anyone have any recommendations? the only candidates i've seen
are SnakeSQL and buzhug... any thoughts/benchmarks on these?

any info on this would be greatly appreciated. thank you
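
Since Python 2.5 the sqlite3 module ships in the standard library, so an
SQLite-backed store needs no separate database server. A minimal sketch of
the simple key/value use described above (the file name is made up):

import sqlite3

conn = sqlite3.connect("cache.db")
conn.execute("CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)")
conn.executemany("INSERT OR REPLACE INTO kv VALUES (?, ?)",
                 [("alpha", "1"), ("beta", "2")])
conn.commit()
print conn.execute("SELECT value FROM kv WHERE key = ?", ("alpha",)).fetchone()[0]
conn.close()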


Re: fastest native python database?

2009-06-17 Thread per
i would like to add to my previous post that if an option like SQLite
with a python interface (pysqlite) would be orders of magnitude faster
than native python options, i'd prefer that. but if that's not the
case, a pure python solution without dependencies on other things
would be the best option.

thanks for the suggestion, will look into gadfly in the meantime.

On Jun 17, 11:38 pm, Emile van Sebille em...@fenx.com wrote:
 On 6/17/2009 8:28 PM per said...

  hi all,

  i'm looking for a native python package to run a very simple data
  base. i was originally using cpickle with dictionaries for my problem,
  but i was making dictionaries out of very large text files (around
  1000MB in size) and pickling was simply too slow.

  i am not looking for fancy SQL operations, just very simple data base
  operations (doesn't have to be SQL style) and my preference is for a
  module that just needs python and doesn't require me to run a separate
  data base like Sybase or MySQL.

 You might like gadfly...

 http://gadfly.sourceforge.net/gadfly.html

 Emile



  does anyone have any recommendations? the only candidates i've seen
  are snaklesql and buzhug... any thoughts/benchmarks on these?

  any info on this would be greatly appreciated. thank you





generating random tuples in python

2009-04-20 Thread per
hi all,

i am generating a list of random tuples of numbers between 0 and 1
using the rand() function, as follows:

mylist = []                       # rand() here is e.g. numpy.random.rand
for i in range(0, n):
    rand_tuple = (rand(), rand(), rand())
    mylist.append(rand_tuple)

when i generate this list, some of the random tuples might be
very close to each other, numerically. for example, i might get:

(0.553, 0.542, 0.654)

and

(0.581, 0.491, 0.634)

so the two tuples are close to each other in that all of their numbers
have similar magnitudes.

how can i maximize the amount of numeric distance between the
elements of
this list, but still make sure that all the tuples have numbers
strictly
between 0 and 1 (inclusive)?

in other words i want the list of random numbers to be arbitrarily
different (which is why i am using rand()) but as different from other
tuples in the list as possible.

thank you for your help



Re: generating random tuples in python

2009-04-20 Thread per
On Apr 20, 11:08 pm, Steven D'Aprano
ste...@remove.this.cybersource.com.au wrote:
 On Mon, 20 Apr 2009 11:39:35 -0700, per wrote:
  hi all,

  i am generating a list of random tuples of numbers between 0 and 1 using
  the rand() function, as follows:

  for i in range(0, n):
    rand_tuple = (rand(), rand(), rand()) mylist.append(rand_tuple)

  when i generate this list, some of the random tuples might be very close
  to each other, numerically. for example, i might get:
 [...]
  how can i maximize the amount of numeric distance between the elements
  of
  this list, but still make sure that all the tuples have numbers strictly
  between 0 and 1 (inclusive)?

 Well, the only way to *maximise* the distance between the elements is to
 set them to (0.0, 0.5, 1.0).

  in other words i want the list of random numbers to be arbitrarily
  different (which is why i am using rand()) but as different from other
  tuples in the list as possible.

 That means that the numbers you are generating will no longer be
 uniformly distributed, they will be biased. That's okay, but you need to
 describe *how* you want them biased. What precisely do you mean by
 maximizing the distance?

 For example, here's one strategy: you need three random numbers, so
 divide the complete range 0-1 into three: generate three random numbers
 between 0 and 1/3.0, called x, y, z, and return [x, 1/3.0 + y, 2/3.0 + z].

 You might even decide to shuffle the list before returning them.

 But note that you might still happen to get (say) [0.332, 0.334, 0.668]
 or similar. That's the thing with randomness.

 --
 Steven

i realize my example in the original post was misleading. i don't want
to maximize the difference between individual members of a single
tuple -- i want to maximize the difference between distinct tuples. in
other words, it's ok to have (.332, .334, .38), as long as the other
tuple is, say, (.52, .6, .9), which is very different from (.332, .334,
.38).  i want the members of a given tuple to be arbitrary, e.g.
something like (rand(), rand(), rand()), but i want the tuples to be very
different from each other.

to be more formal, by very different i mean i would be happy if they were
maximally distant in ordinary euclidean space... so if you just plot
the 3-tuples on x, y, z axes i want them all to be very different from each
other.  i realize this is obviously biased and that the tuples are not
uniformly distributed -- that's exactly what i want...

any ideas on how to go about this?

thank you.
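
One simple way to keep the tuples random but spread out is a greedy
best-candidate scheme: each round, generate several random candidates and
keep the one farthest (in euclidean distance) from everything chosen so far.
A rough sketch, using the stdlib random module in place of rand():

import random

def dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def spread_tuples(n, tries=50):
    points = [(random.random(), random.random(), random.random())]
    while len(points) < n:
        best, best_d = None, -1.0
        for _ in range(tries):
            cand = (random.random(), random.random(), random.random())
            d = min(dist(cand, p) for p in points)  # distance to nearest chosen point
            if d > best_d:
                best, best_d = cand, d
        points.append(best)
    return points

print spread_tuples(5)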


loading program's global variables in ipython

2009-03-22 Thread per
hi all,

i have a file that declares some global variables, e.g.

myglobal1 = 'string'
myglobal2 = 5

and then some functions. i run it using ipython as follows:

[1] %run myfile.py

i notice then that myglobal1 and myglobal2 are not imported into
python's interactive namespace. i'd like them to be -- how can i do
this?

 (note my file does not contain a __name__ == '__main__' clause.)

thanks.


splitting a large dictionary into smaller ones

2009-03-22 Thread per
hi all,

i have a very large dictionary object that is built from a text file
that is about 800 MB -- it contains several million keys.  ideally i
would like to pickle this object so that i wouldn't have to parse this
large file to compute the dictionary every time i run my program.
however currently the pickled file is over 300 MB and takes a very
long time to write to disk - even longer than recomputing the
dictionary from scratch.

i would like to split the dictionary into smaller ones, containing
only hundreds of thousands of keys, and then try to pickle them. is
there a way to easily do this? i.e. is there an easy way to make a
wrapper for this such that i can access this dictionary as just one
object, but underneath it's split into several? so that i can write
my_dict[k] and get a value, or set my_dict[m] to some value without
knowing which sub dictionary it's in.

if there aren't known ways to do this, i would greatly appreciate any
advice/examples on how to write this data structure from scratch,
reusing as much of the dict() class as possible.

thanks.
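
A bare-bones sketch of the wrapper idea: hash each key to one of several
ordinary dicts, so lookups stay my_dict[k] while each (smaller) shard can be
pickled to its own file. Names and file layout here are invented for
illustration:

import cPickle

class ShardedDict(object):
    def __init__(self, num_shards=10):
        self.shards = [dict() for _ in range(num_shards)]

    def _shard(self, key):
        # the same key always lands in the same sub-dictionary
        return self.shards[hash(key) % len(self.shards)]

    def __getitem__(self, key):
        return self._shard(key)[key]

    def __setitem__(self, key, value):
        self._shard(key)[key] = value

    def __contains__(self, key):
        return key in self._shard(key)

    def dump(self, prefix):
        # pickle each shard to its own, much smaller, file
        for i, shard in enumerate(self.shards):
            f = open("%s.%d.pkl" % (prefix, i), "wb")
            cPickle.dump(shard, f, cPickle.HIGHEST_PROTOCOL)
            f.close()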



Re: splitting a large dictionary into smaller ones

2009-03-22 Thread per
On Mar 22, 10:51 pm, Paul Rubin http://phr...@nospam.invalid wrote:
 per perfr...@gmail.com writes:
  i would like to split the dictionary into smaller ones, containing
  only hundreds of thousands of keys, and then try to pickle them.

 That already sounds like the wrong approach.  You want a database.

fair enough - what native python database would you recommend? i
prefer not to install anything commercial or anything other than
python modules
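
For a pure-stdlib option, the shelve module already behaves like a dictionary
persisted on disk and may be worth trying before anything fancier (the file
name is invented):

import shelve

db = shelve.open("big_dict.shelf")
db["some key"] = [1, 2, 3]      # each value is pickled individually on assignment
print db["some key"]
db.close()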


parsing tab separated data efficiently into numpy/pylab arrays

2009-03-13 Thread per
hi all,

what's the most efficient / preferred python way of parsing tab-
separated data into arrays? for example, if i have a file containing
two columns, one corresponding to names and the other to numbers:

col1\t col 2
joe\t  12.3
jane   \t 155.0

i'd like to parse into an array() such that i can do: mydata[:, 0] and
mydata[:, 1] to easily access all the columns.

right now i can iterate through the file, parse it manually using the
split('\t') command and construct a list out of it, then convert it to
arrays. but there must be a better way?

also, my first column is just a name, and so it is variable in length
-- is there still a way to store it as an array so i can access
mydata[:, 0] to get all the names (as a list)?

thank you.
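
With a reasonably recent numpy, genfromtxt can read a mixed name/number file
into a structured array in one call, which avoids the manual split('\t') loop.
A sketch assuming a headerless two-column file called mydata.txt:

import numpy as np

data = np.genfromtxt("mydata.txt", delimiter="\t", dtype=None,
                     names=["name", "value"])
print data["name"]          # all the names, as an array of strings
print data["value"].mean()  # the numeric column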


[issue5411] add xz compression support to distutils

2009-03-09 Thread Per Øyvind Karlsen

Per Øyvind Karlsen peroyv...@mandriva.org added the comment:

hmm, I'm unsure about how this should be done..


I guess such a test would belong in Lib/distutils/test_dist.py, but I'm
uncertain about how it should be done, ie. should it be a test for doing
'bdist', 'bdist_rpm' and 'sdist' for each of the formats supported? I
cannot seem to find any tests for the currently supported formats and
such tests would introduce dependencies on the tools used to compress
with these formats..

--
message_count: 2.0 -> 3.0




speeding up reading files (possibly with cython)

2009-03-07 Thread per
hi all,

i have a program that essentially loops through a text file that's
about 800 MB in size containing tab-separated data... my program
parses this file and stores its fields in a dictionary of lists.

for line in file:
  split_values = line.strip().split('\t')
  # do stuff with split_values

currently, this is very slow in python, even if all i do is break up
each line using split() and store its values in a dictionary, indexing
by one of the tab separated values in the file.

is this just an overhead of python that's inevitable? do you guys
think that switching to cython might speed this up, perhaps by
optimizing the main for loop?  or is this not a viable option?

thank you.
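
One stdlib thing worth timing before reaching for Cython is the csv module,
whose parsing loop is implemented in C; whether it actually beats a plain
str.split depends on the data, so profile both (the file name is invented):

import csv

f = open("data.txt", "rb")
for split_values in csv.reader(f, delimiter="\t"):
    # split_values holds essentially the same fields as line.strip().split('\t')
    pass
f.close()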


[issue5411] add xz compression support to distutils

2009-03-03 Thread Per Øyvind Karlsen

New submission from Per Øyvind Karlsen peroyv...@mandriva.org:

Here's a patch that adds support for xz compression:
http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/cooker/python/current/SOURCES/Python-2.6.1-distutils-xz-support.patch?view=log

--
assignee: tarek
components: Distutils
messages: 83072
nosy: proyvind, tarek
severity: normal
status: open
title: add xz compression support to distutils
type: feature request
versions: Python 2.6




setting PYTHONPATH to override system wide site-packages

2009-02-28 Thread per
hi all,

i recently installed a new version of a package using python setup.py
install --prefix=/my/homedir on a system where i don't have root
access. the old package still resides in /usr/lib/python2.5/site-
packages/ and i cannot erase it.

i set my python path as follows in ~/.cshrc

setenv PYTHONPATH /path/to/newpackage

but whenever i go to python and import the module, the version in site-
packages is loaded. how can i override this setting and make it so
python loads the version of the package that's in my home dir?

 thanks.
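
A quick way to check which copy actually wins, independent of the shell
environment, is to put the new directory at the very front of sys.path before
the first import and then look at the module's __file__ (the path and module
name below are placeholders):

import sys
sys.path.insert(0, "/path/to/newpackage")

import mymodule
print mymodule.__file__      # shows which installation was picked up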




Re: setting PYTHONPATH to override system wide site-packages

2009-02-28 Thread per
On Feb 28, 11:24 pm, Carl Banks pavlovevide...@gmail.com wrote:
 On Feb 28, 7:30 pm, per perfr...@gmail.com wrote:

  hi all,

  i recently installed a new version of a package using python setup.py
  install --prefix=/my/homedir on a system where i don't have root
  access. the old package still resides in /usr/lib/python2.5/site-
  packages/ and i cannot erase it.

  i set my python path as follows in ~/.cshrc

  setenv PYTHONPATH /path/to/newpackage

  but whenever i go to python and import the module, the version in site-
  packages is loaded. how can i override this setting and make it so
  python loads the version of the package that's in my home dir?

 What happens when you run the command print sys.path from the Python
 prompt?  /path/to/newpackage should be the second item, and shoud be
 listed in front of the site-packages dir.

 What happens when you run print os.environ['PYTHONPATH'] at the
 Python interpreter?  It's possible that the sysadmin installed a
 script that removes PYTHONPATH environment variable before invoking
 Python.  What happens when you type which python at the csh prompt?

 What happens when you type ls /path/to/newpackage at your csh
 prompt?  Is the module you're trying to import there?

 You approach should work.  These are just suggestions on how to
 diagnose the problem; we can't really help you figure out what's wrong
 without more information.

 Carl Banks

hi,

i am setting it programmatically now, using:

import sys
sys.path = []

sys.path now looks exactly like what it looked like before, except the
second element is my directory. yet when i do

import mymodule
print mymodule.__version__

i still get the old version...

any other ideas?


Re: setting PYTHONPATH to override system wide site-packages

2009-02-28 Thread per
On Feb 28, 11:53 pm, per perfr...@gmail.com wrote:
 On Feb 28, 11:24 pm, Carl Banks pavlovevide...@gmail.com wrote:



  On Feb 28, 7:30 pm, per perfr...@gmail.com wrote:

   hi all,

   i recently installed a new version of a package using python setup.py
   install --prefix=/my/homedir on a system where i don't have root
   access. the old package still resides in /usr/lib/python2.5/site-
   packages/ and i cannot erase it.

   i set my python path as follows in ~/.cshrc

   setenv PYTHONPATH /path/to/newpackage

   but whenever i go to python and import the module, the version in site-
   packages is loaded. how can i override this setting and make it so
   python loads the version of the package that's in my home dir?

  What happens when you run the command print sys.path from the Python
  prompt?  /path/to/newpackage should be the second item, and shoud be
  listed in front of the site-packages dir.

  What happens when you run print os.environ['PYTHONPATH'] at the
  Python interpreter?  It's possible that the sysadmin installed a
  script that removes PYTHONPATH environment variable before invoking
  Python.  What happens when you type which python at the csh prompt?

  What happens when you type ls /path/to/newpackage at your csh
  prompt?  Is the module you're trying to import there?

  You approach should work.  These are just suggestions on how to
  diagnose the problem; we can't really help you figure out what's wrong
  without more information.

  Carl Banks

 hi,

 i am setting it programmatically now, using:

 import sys
 sys.path = []

 sys.path now looks exactly like what it looked like before, except the
 second element is my directory. yet when i do

 import mymodule
 print mymodule.__version__

 i still get the old version...

 any other ideas?

in case it helps, it gives me this warning when i try to import the
module

/usr/lib64/python2.5/site-packages/pytz/__init__.py:29: UserWarning:
Module dateutil was already imported from /usr/lib64/python2.5/site-
packages/dateutil/__init__.pyc, but /usr/lib/python2.5/site-packages
is being added to sys.path
  from pkg_resources import resource_stream


optimizing large dictionaries

2009-01-15 Thread Per Freem
hello

i have an optimization question about python. i am iterating through
a file and counting the number of repeated elements. the file has on
the order of tens of millions of elements...

i create a dictionary that maps elements of the file that i want to
count to their number of occurrences. so i iterate through the file and
for each line extract the elements (simple text operation) and see if it
has an entry in the dict:

for line in file:
    try:
        elt = MyClass(line)    # extract elt from line...
        my_dict[elt] += 1
    except KeyError:
        my_dict[elt] = 1

i am using try/except since it is supposedly faster (though i am not
sure
about this? is this really true in Python 2.5?).

the only 'twist' is that my elt is an instance of a class (MyClass)
with 3 fields, all numeric. the class is hashable, and so my_dict[elt]
works well. the __repr__ and __hash__ methods of my class simply return
the str() representation of self, while __str__ just turns every numeric
field into a concatenated string:

class MyClass:

    def __str__(self):
        return "%s-%s-%s" % (self.field1, self.field2, self.field3)

    def __repr__(self):
        return str(self)

    def __hash__(self):
        return hash(str(self))


is there anything that can be done to speed up this simple code? right
now it is taking well over 15 minutes to process, on a 3 GHz machine
with lots of RAM (though this is all taking CPU power, not RAM, at this
point.)

any general advice on how to optimize large dicts would be great too

thanks for your help.
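
As a baseline to compare against, collections.defaultdict (available since
2.5) removes the try/except from the inner loop, and hashing the tuple of
fields directly avoids building a throwaway string for every element. A
hedged sketch, mirroring the loop above:

from collections import defaultdict

counts = defaultdict(int)
for line in file:
    elt = MyClass(line)
    counts[elt] += 1        # missing keys start at 0, no KeyError handling needed

# in MyClass, hashing the field tuple skips the string formatting entirely
# (with __eq__ comparing the same tuple):
#     def __hash__(self):
#         return hash((self.field1, self.field2, self.field3))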


Re: optimizing large dictionaries

2009-01-15 Thread Per Freem
thanks to everyone for the excellent suggestions. a few follow up q's:

1] is Try-Except really slower? my dict actually has two layers, so
my_dict[aKey][bKey]. the aKeys are few (less than 100), whereas the
bKeys are the ones that are in the millions.  so in that case,
doing a Try-Except on aKey should be very efficient, since often it
will not fail, whereas if I do "if aKey in my_dict", that statement
will get executed for each aKey. can someone say definitively whether
Try-Except is faster or not? My benchmarks aren't conclusive and i
hear it both ways from several people (though the majority thinks
Try-Except is faster).

2] is there an easy way to have nested defaultdicts? ie i want to say
that my_dict = defaultdict(defaultdict(int)) -- to reflect the fact
that my_dict is a dictionary, whose values are dictionary that map to
ints. but that syntax is not valid.

3] more importantly, is there likely to be a big improvement for
splitting up one big dictionary into several smaller ones? if so, is
there a straight forward elegant way to implement this? the way i am
thinking is to just fix a number of dicts and populate them with
elements. then during retrieval, try the first dict, if that fails,
try the second, if not the third, etc... but i can imagine how that's
more likely to lead to bugs / debugging give the way my code is setup
so i am wondering whether it is really worth it.
if it can lead to a factor of 2 difference, i will definitely
implement it -- does anyone have experience with this?
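
On question 2] above: defaultdict wants a zero-argument factory, so wrapping
the inner defaultdict in a lambda (or a named function) gives the two-layer
behaviour:

from collections import defaultdict

my_dict = defaultdict(lambda: defaultdict(int))
my_dict["aKey"]["bKey"] += 1     # both levels spring into existence on demand
print my_dict["aKey"]["bKey"]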

On Jan 15, 5:58 pm, Steven D'Aprano st...@remove-this-
cybersource.com.au wrote:
 On Thu, 15 Jan 2009 23:22:48 +0100, Christian Heimes wrote:
  is there anything that can be done to speed up this simply code? right
  now it is taking well over 15 minutes to process, on a 3 Ghz machine
  with lots of RAM (though this is all taking CPU power, not RAM at this
  point.)

  class MyClass(object):
      # a new style class with slots saves some memory
      __slots__ = ("field1", "field2", "field3")

 I was curious whether using slots would speed up attribute access.

 >>> class Parrot(object):
 ...     def __init__(self, a, b, c):
 ...             self.a = a
 ...             self.b = b
 ...             self.c = c
 ...
 >>> class SlottedParrot(object):
 ...     __slots__ = 'a', 'b', 'c'
 ...     def __init__(self, a, b, c):
 ...             self.a = a
 ...             self.b = b
 ...             self.c = c
 ...

 >>> p = Parrot(23, "something", [1, 2, 3])
 >>> sp = SlottedParrot(23, "something", [1, 2, 3])

 >>> from timeit import Timer
 >>> setup = "from __main__ import p, sp"
 >>> t1 = Timer('p.a, p.b, p.c', setup)
 >>> t2 = Timer('sp.a, sp.b, sp.c', setup)
 >>> min(t1.repeat())
 0.83308887481689453
 >>> min(t2.repeat())
 0.62758088111877441

 That's not a bad improvement. I knew that __slots__ was designed to
 reduce memory consumption, but I didn't realise they were faster as well.

 --
 Steven

--
http://mail.python.org/mailman/listinfo/python-list


efficient interval containment lookup

2009-01-12 Thread Per Freem
hello,

suppose I have two lists of intervals, one significantly larger than
the other.
For example listA = [(10, 30), (5, 25), (100, 200), ...] might contain
thousands
of elements while listB (of the same form) might contain hundreds of
thousands
or millions of elements.
I want to count how many intervals in listB are contained within every
interval in listA. For example, if listA = [(10, 30), (600, 800)] and
listB = [(20, 25), (12, 18)] is the input, then the output should be
that (10, 30) has 2 intervals from listB contained within it, while
(600, 800) has 0. (Elements of listB can be contained within many
intervals in listA, not just one.)

What is an efficient way to do this?  One simple way is:

for a_range in listA:
    for b_range in listB:
        if is_within(b_range, a_range):
            counts[a_range] += 1  # accumulate a counter here ('counts' assumed initialized to zeros)

where is_within simply checks if the first argument is within the
second.

I'm not sure if it's more efficient to have the iteration over listA
be on the outside or listB.  But perhaps there's a way to index this
that makes things more efficient?  I.e. a smart way of indexing listA
such that I can instantly get all of its elements that are within some
element
of listB, maybe?  Something like a hash, where this look up can be
close to constant time rather than an iteration over all lists... if
there's any built-in library functions that can help in this it would
be great.

any suggestions on this would be awesome. thank you.
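
(One illustrative possibility, not from the thread: sort listB once by
start point, then for each interval in listA use bisect to narrow the
candidates before the exact containment check.  This is a sketch, not a
tuned implementation.)

from bisect import bisect_left, bisect_right

def count_contained(listA, listB):
    # for each (c, d) in listA, count intervals (a, b) in listB
    # with c <= a and b <= d
    listB = sorted(listB)
    starts = [a for a, b in listB]
    counts = {}
    for c, d in listA:
        lo = bisect_left(starts, c)    # first candidate with a >= c
        hi = bisect_right(starts, d)   # candidates with a <= d
        counts[(c, d)] = sum(1 for a, b in listB[lo:hi] if b <= d)
    return counts

print count_contained([(10, 30), (600, 800)], [(20, 25), (12, 18)])
# e.g. {(10, 30): 2, (600, 800): 0}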
--
http://mail.python.org/mailman/listinfo/python-list


Re: efficient interval containment lookup

2009-01-12 Thread Per Freem
thanks for your replies -- a few clarifications and questions. the
is_within operation is containment, i.e. (a,b) is within (c,d) iff a
>= c and b <= d. Note that I am not looking for intervals that
overlap... this is why interval trees seem to me to not be relevant,
as the overlapping interval problem is way harder than what I am
trying to do. Please correct me if I'm wrong on this...

Scott Daniels, I was hoping you could elaborate on your comment about
bisect. I am trying to use it as follows: I try to grid my space
(since my intervals have an upper and lower bound) into segments (e.g.
of 100) and then I take these bins and put them into a bisect list,
so that it is sorted. Then when a new interval comes in, I try to
place it within one of those bins.  But this is getting messy: I don't
know if I should place it there by its beginning number or end
number.  Also, if I have an interval that overlaps my boundaries --
i.e. (900, 1010) when my first interval is (0, 1000), I may miss some
items from listB when i make my count.  Is there an elegant solution
to this?  Gridding like you said seemed straight forward but now it
seems complicated..

I'd like to add that this is *not* a homework problem, by the way.

On Jan 12, 4:05 pm, Robert Kern robert.k...@gmail.com wrote:
 [Apologies for piggybacking, but I think GMane had a hiccup today and missed 
 the
 original post]

 [Somebody wrote]:

  suppose I have two lists of intervals, one significantly larger than
  the other.
  For example listA = [(10, 30), (5, 25), (100, 200), ...] might contain
  thousands
  of elements while listB (of the same form) might contain hundreds of
  thousands
  or millions of elements.
  I want to count how many intervals in listB are contained within every
  listA. For example, if listA = [(10, 30), (600, 800)] and listB =
  [(20, 25), (12, 18)] is the input, then the output should be that (10,
  30) has 2 intervals from listB contained within it, while (600, 800)
  has 0. (Elements of listB can be contained within many intervals in
  listA, not just one.)

 Interval trees.

 http://en.wikipedia.org/wiki/Interval_tree

 --
 Robert Kern

 I have come to believe that the whole world is an enigma, a harmless enigma
   that is made terrible by our own mad attempt to interpret it as though it 
 had
   an underlying truth.
    -- Umberto Eco

--
http://mail.python.org/mailman/listinfo/python-list


Re: efficient interval containment lookup

2009-01-12 Thread Per Freem
On Jan 12, 10:58 pm, Steven D'Aprano
ste...@remove.this.cybersource.com.au wrote:
 On Mon, 12 Jan 2009 14:49:43 -0800, Per Freem wrote:
  thanks for your replies -- a few clarifications and questions. the
  is_within operation is containment, i.e. (a,b) is within (c,d) iff a
 >= c and b <= d. Note that I am not looking for intervals that
  overlap... this is why interval trees seem to me to not be relevant, as
  the overlapping interval problem is way harder than what I am trying to
  do. Please correct me if I'm wrong on this...

 To test for contained intervals:
 a >= c and b <= d

 To test for overlapping intervals:

 not (b < c or a > d)

 Not exactly what I would call way harder.

 --
 Steven

hi Steven,

i found an implementation (which is exactly how i'd write it based on
the description) here: 
http://hackmap.blogspot.com/2008/11/python-interval-tree.html

when i use this however, it comes out either significantly slower or
equal to a naive search. my naive search just iterates through a
smallish list of intervals and for each one says whether they overlap
with each of a large set of intervals.

here is the exact code i used to make the comparison, plus the code at
the link i have above:

class Interval():
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

import random
import time

num_ints = 3
init_intervals = []
for n in range(0, num_ints):
    start = int(round(random.random()*1000))
    end = start + int(round(random.random()*500+1))
    init_intervals.append(Interval(start, end))

num_ranges = 900
ranges = []
for n in range(0, num_ranges):
    start = int(round(random.random()*1000))
    end = start + int(round(random.random()*500+1))
    ranges.append((start, end))

#print init_intervals
tree = IntervalTree(init_intervals)
t1 = time.time()
for r in ranges:
    tree.find(r[0], r[1])
t2 = time.time()
print "interval tree: %.3f" %((t2-t1)*1000.0)

t1 = time.time()
for r in ranges:
    naive_find(init_intervals, r[0], r[1])
t2 = time.time()
print "brute force: %.3f" %((t2-t1)*1000.0)

on one run, i get:
interval tree: 8584.682
brute force: 8201.644

is there anything wrong with this implementation? it seems very right
to me but i am no expert. any help on this would be really helpful.
--
http://mail.python.org/mailman/listinfo/python-list


Re: efficient interval containment lookup

2009-01-12 Thread Per Freem
i forgot to add, my naive_find is:

def naive_find(intervals, start, stop):
    results = []
    for interval in intervals:
        if interval.start >= start and interval.stop <= stop:
            results.append(interval)
    return results

On Jan 12, 11:55 pm, Per Freem perfr...@yahoo.com wrote:
 On Jan 12, 10:58 pm, Steven D'Aprano



 ste...@remove.this.cybersource.com.au wrote:
  On Mon, 12 Jan 2009 14:49:43 -0800, Per Freem wrote:
   thanks for your replies -- a few clarifications and questions. the
    is_within operation is containment, i.e. (a,b) is within (c,d) iff a
   >= c and b <= d. Note that I am not looking for intervals that
   overlap... this is why interval trees seem to me to not be relevant, as
   the overlapping interval problem is way harder than what I am trying to
   do. Please correct me if I'm wrong on this...

   To test for contained intervals:
   a >= c and b <= d

   To test for overlapping intervals:

   not (b < c or a > d)

  Not exactly what I would call way harder.

  --
  Steven

 hi Steven,

 i found an implementation (which is exactly how i'd write it based on
 the description) 
 here:http://hackmap.blogspot.com/2008/11/python-interval-tree.html

 when i use this however, it comes out either significantly slower or
 equal to a naive search. my naive search just iterates through a
 smallish list of intervals and for each one says whether they overlap
 with each of a large set of intervals.

 here is the exact code i used to make the comparison, plus the code at
 the link i have above:

 class Interval():
     def __init__(self, start, stop):
         self.start = start
         self.stop = stop

 import random
 import time
 num_ints = 3
 init_intervals = []
 for n in range(0,
 num_ints):
     start = int(round(random.random()
 *1000))
     end = start + int(round(random.random()*500+1))
     init_intervals.append(Interval(start, end))
 num_ranges = 900
 ranges = []
 for n in range(0, num_ranges):
   start = int(round(random.random()
 *1000))
   end = start + int(round(random.random()*500+1))
   ranges.append((start, end))
 #print init_intervals
 tree = IntervalTree(init_intervals)
 t1 = time.time()
 for r in ranges:
   tree.find(r[0], r[1])
 t2 = time.time()
 print interval tree: %.3f %((t2-t1)*1000.0)
 t1 = time.time()
 for r in ranges:
   naive_find(init_intervals, r[0], r[1])
 t2 = time.time()
 print brute force: %.3f %((t2-t1)*1000.0)

 on one run, i get:
 interval tree: 8584.682
 brute force: 8201.644

 is there anything wrong with this implementation? it seems very right
 to me but i am no expert. any help on this would be relly helpful.

--
http://mail.python.org/mailman/listinfo/python-list


Re: efficient interval containment lookup

2009-01-12 Thread Per Freem
hi brent, thanks very much for your informative reply -- didn't
realize this about the size of the interval.

thanks for the bx-python link.  could you (or someone else) explain
why the size of the interval makes such a big difference? i don't
understand why it affects efficiency so much...

thanks.

On Jan 13, 12:24 am, brent bpede...@gmail.com wrote:
 On Jan 12, 8:55 pm, Per Freem perfr...@yahoo.com wrote:



  On Jan 12, 10:58 pm, Steven D'Aprano

  ste...@remove.this.cybersource.com.au wrote:
   On Mon, 12 Jan 2009 14:49:43 -0800, Per Freem wrote:
thanks for your replies -- a few clarifications and questions. the
 is_within operation is containment, i.e. (a,b) is within (c,d) iff a
    >= c and b <= d. Note that I am not looking for intervals that
overlap... this is why interval trees seem to me to not be relevant, as
the overlapping interval problem is way harder than what I am trying to
do. Please correct me if I'm wrong on this...

    To test for contained intervals:
    a >= c and b <= d

    To test for overlapping intervals:

    not (b < c or a > d)

   Not exactly what I would call way harder.

   --
   Steven

  hi Steven,

  i found an implementation (which is exactly how i'd write it based on
  the description) 
  here:http://hackmap.blogspot.com/2008/11/python-interval-tree.html

  when i use this however, it comes out either significantly slower or
  equal to a naive search. my naive search just iterates through a
  smallish list of intervals and for each one says whether they overlap
  with each of a large set of intervals.

  here is the exact code i used to make the comparison, plus the code at
  the link i have above:

  class Interval():
      def __init__(self, start, stop):
          self.start = start
          self.stop = stop

  import random
  import time
  num_ints = 3
  init_intervals = []
  for n in range(0,
  num_ints):
      start = int(round(random.random()
  *1000))
      end = start + int(round(random.random()*500+1))
      init_intervals.append(Interval(start, end))
  num_ranges = 900
  ranges = []
  for n in range(0, num_ranges):
    start = int(round(random.random()
  *1000))
    end = start + int(round(random.random()*500+1))
    ranges.append((start, end))
  #print init_intervals
  tree = IntervalTree(init_intervals)
  t1 = time.time()
  for r in ranges:
    tree.find(r[0], r[1])
  t2 = time.time()
  print interval tree: %.3f %((t2-t1)*1000.0)
  t1 = time.time()
  for r in ranges:
    naive_find(init_intervals, r[0], r[1])
  t2 = time.time()
  print brute force: %.3f %((t2-t1)*1000.0)

  on one run, i get:
  interval tree: 8584.682
  brute force: 8201.644

  is there anything wrong with this implementation? it seems very right
  to me but i am no expert. any help on this would be relly helpful.

 hi, the tree is inefficient when the interval is large. as the size of
 the interval shrinks to much less than the expanse of the tree, the
 tree will be faster. changing 500 to 50 in both cases in your script,
 i get:
 interval tree: 3233.404
 brute force: 9807.787

 so the tree will work for limited cases. but it's quite simple. check
 the tree in 
 bx-python:http://bx-python.trac.bx.psu.edu/browser/trunk/lib/bx/intervals/opera...
 for a more robust implementation.
 -brentp


--
http://mail.python.org/mailman/listinfo/python-list


RE: listdir reports [Error 1006] The volume for a file has been externally altered so that the opened file is no longer valid

2009-01-08 Thread Per Olav Kroka
FYI: the '/*.*' is part of the error message returned. 

-Original Message-
From: ch...@rebertia.com [mailto:ch...@rebertia.com] On Behalf Of Chris
Rebert
Sent: Wednesday, January 07, 2009 6:40 PM
To: Per Olav Kroka
Cc: python-list@python.org
Subject: Re: listdir reports [Error 1006] The volume for a file has been
externally altered so that the opened file is no longer valid

 PS: Why does the listdir() function add '*.*' to the path?

Don't know what you're talking about. It doesn't do any globbing or add
*.* to the path. Its exclusive purpose is to list the contents of a
directory, so /in a sense/ it does add *.*, but then not adding *.*
would make the function completely useless given its purpose.

 PS2: Why does the listdir() function add '/*.*' to the path on windows

 and not '\\*.*' ?

You can use either directory separator (\ or /) with the Python APIs on
Windows. r"c:\WINDOWS\" works just as well as "c:/WINDOWS/".

Cheers,
Chris

--
Follow the path of the Iguana...
http://rebertia.com
--
http://mail.python.org/mailman/listinfo/python-list


[issue3810] os.chdir() et al: is the path str or bytes?

2008-09-08 Thread Per Cederqvist

New submission from Per Cederqvist [EMAIL PROTECTED]:

The documentation at
http://docs.python.org/dev/3.0/library/os.html#os.chdir doesn't specify
if the path argument to os.chdir() should be a str or a bytes, or if
maybe both are acceptable.  This is true for most of the
file-manipulating functions in the os module.

os.listdir() talks about Unicode objects.  It should probably talk about
bytes and str instead.
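
(For illustration, not part of the report: this is how Python 3 ended up
behaving on a POSIX system -- both types are accepted, and the argument
type determines the result type.)

import os

os.chdir("/tmp")            # str path is accepted
os.chdir(b"/tmp")           # bytes path is accepted as well
print(os.listdir("."))      # str argument  -> list of str names
print(os.listdir(b"."))     # bytes argument -> list of bytes names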

--
assignee: georg.brandl
components: Documentation
messages: 72820
nosy: ceder, georg.brandl
severity: normal
status: open
title: os.chdir() et al: is the path str or bytes?
versions: Python 3.0

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue3810
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue2315] TimedRotatingFileHandler does not account for daylight savings time

2008-03-17 Thread Per Cederqvist

New submission from Per Cederqvist [EMAIL PROTECTED]:

If TimedRotatingFileHandler is instructed to roll over the log at
midnight or on a certain weekday, it needs to consider when daylight
savings time starts and ends. The current code just blindly adds
self.interval to self.rolloverAt, totally ignoring that sometimes it
should add 23 or 25 hours instead of 24 hours.

(I suspect that the implementation would be simpler if you use the
datetime module, rather than attempt to patch the existing code.)
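
(An illustrative sketch of the datetime-based approach, not the actual
patch: compute "the next local midnight" instead of adding 24*3600
seconds, and let mktime() resolve daylight saving time.)

import datetime
import time

def next_midnight_rollover(now=None):
    # 'now' is seconds since the epoch; returns the epoch time of the
    # next local midnight, whether the day has 23, 24 or 25 hours
    if now is None:
        now = time.time()
    today = datetime.datetime.fromtimestamp(now)
    next_day = (today + datetime.timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0)
    # timetuple() leaves tm_isdst at -1, so mktime() works out DST itself
    return time.mktime(next_day.timetuple())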

--
components: Library (Lib)
messages: 63622
nosy: ceder
severity: normal
status: open
title: TimedRotatingFileHandler does not account for daylight savings time
type: behavior
versions: Python 2.5

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2315
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue2316] TimedRotatingFileHandler names files incorrectly if nothing is logged during an interval

2008-03-17 Thread Per Cederqvist

New submission from Per Cederqvist [EMAIL PROTECTED]:

If nothing is logged during an interval, the TimedRotatingFileHandler
will give bad names to future log files.

The enclosed example program sets up a logger that rotates the log every
second.  It then logs a few messages with sleeps of 1, 2, 4, 1 and 1
seconds between the messages.  The log files will have names that
increase by one second per log file, but the content of the last file
will have been generated during a different second.

An example run produced the message

  2008-03-17 09:16:06: 1 sec later

in a log file named badlogdir/logfile.2008-03-17_09-16-02.

This problem was likely introduced in revision 42066.  The root cause is
that self.rolloverAt is increased by self.interval in doRollover - but
if nothing was logged for a while, it should be increased by a multiple
of self.interval.
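
(A sketch of the kind of fix described above -- illustrative only, not
the committed patch: advance rolloverAt by whole intervals until it lies
in the future, so skipped intervals cannot shift the file names.)

import time

def compute_next_rollover(rollover_at, interval, now=None):
    # advance by a multiple of 'interval' rather than a single step
    if now is None:
        now = time.time()
    while rollover_at <= now:
        rollover_at += interval
    return rollover_at

# inside doRollover() the handler would then do something like:
#   self.rolloverAt = compute_next_rollover(self.rolloverAt, self.interval)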

--
messages: 63624
nosy: ceder
severity: normal
status: open
title: TimedRotatingFileHandler names files incorrectly if nothing is logged 
during an interval

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2316
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue2316] TimedRotatingFileHandler names files incorrectly if nothing is logged during an interval

2008-03-17 Thread Per Cederqvist

Per Cederqvist [EMAIL PROTECTED] added the comment:

The attached program will generate log messages with a timestamp that
are logged into a file with an unexpected extension.

To run:

  mkdir badlogdir
  python badlogger.py

Running the program takes about 9 seconds.

Added file: http://bugs.python.org/file9687/badlogger.py

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2316
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue2316] TimedRotatingFileHandler names files incorrectly if nothing is logged during an interval

2008-03-17 Thread Per Cederqvist

Changes by Per Cederqvist [EMAIL PROTECTED]:


--
components: +Library (Lib)
type:  - behavior
versions: +Python 2.5

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2316
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue2317] TimedRotatingFileHandler logic for removing files wrong

2008-03-17 Thread Per Cederqvist

New submission from Per Cederqvist [EMAIL PROTECTED]:

There are three issues with log file removal in the
TimedRotatingFileHandler class:

 - Removal will stop working in the year 2100, as the code assumes that
   timestamps start with ".20".

 - If you run an application with backupCount set to a high number, and
   then restarts it with a lower number, the code will still not remove
   as many log files as you expect.  It will never remove more than one
   file when it rotates the log.

 - It assumes that no other files match baseFilename + ".20*", so
   make sure that you don't log to both "log" and
   "log.20th.century.fox" in the same directory!

Suggested fix: use os.listdir() instead of glob.glob(), filter all
file names using a proper regexp, sort the result, and use a while
loop to remove files until the result is small enough.  To reduce the
risk of accidentally removing an unrelated file, the filter regexp
should be based on the logging interval, just as the filename is.

My suggested fix means that old files may not be removed if you change
the interval.  I think that is an acceptable behavior, but it should
probably be documented to avoid future bug reports on this subject. :-)
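
(An illustrative sketch of that suggestion -- the names are assumed, and
this is not the code that went into the logging module: list the
directory, keep only names that match the handler's own timestamp
pattern, sort, and return everything older than the newest backupCount
files.)

import os
import re

def files_to_delete(base_filename, backup_count,
                    suffix_re=r"^\d{4}-\d{2}-\d{2}$"):
    # base_filename is the live log file, e.g. "/var/log/app/logfile";
    # suffix_re here matches a daily rotation suffix -- the real handler
    # would build it from the configured interval
    dir_name, base_name = os.path.split(base_filename)
    prefix = base_name + "."
    pattern = re.compile(suffix_re)
    candidates = []
    for name in os.listdir(dir_name or "."):
        if name.startswith(prefix) and pattern.match(name[len(prefix):]):
            candidates.append(os.path.join(dir_name, name))
    candidates.sort()
    if backup_count >= len(candidates):
        return []
    return candidates[:len(candidates) - backup_count]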

--
components: Library (Lib)
messages: 63626
nosy: ceder
severity: normal
status: open
title: TimedRotatingFileHandler logic for removing files wrong
type: behavior
versions: Python 2.5

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2317
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue2318] TimedRotatingFileHandler: rotate every month, or every year

2008-03-17 Thread Per Cederqvist

New submission from Per Cederqvist [EMAIL PROTECTED]:

In my curent project, I would like to rotate log files on the 1st of
every month.  The TimedRotatingFileHandler class cannot do this, even
though it tries to be very generic.

I imagine that other projects would like to rotate the log file every
year.  That can also not be done.
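
(For what it is worth, computing a monthly rollover point is short with
the datetime module -- an illustrative sketch, not an existing
TimedRotatingFileHandler option.)

import datetime

def first_of_next_month(d):
    # midnight on the 1st of the month after date/datetime d
    year = d.year + (1 if d.month == 12 else 0)
    month = 1 if d.month == 12 else d.month + 1
    return datetime.datetime(year, month, 1)

print first_of_next_month(datetime.datetime(2008, 3, 17))  # 2008-04-01 00:00:00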

--
components: Library (Lib)
messages: 63627
nosy: ceder
severity: normal
status: open
title: TimedRotatingFileHandler: rotate every month, or every year
type: feature request

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2318
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



Program eating memory, but only on one machine?

2007-01-22 Thread Per B. Sederberg
Hi Everybody:

I'm having a difficult time figuring out a memory use problem.  I
have a python program that makes use of numpy and also calls a small C
module I wrote because part of the simulation needed to loop and I got
a massive speedup by putting that loop in C.  I'm basically
manipulating a bunch of matrices, so nothing too fancy.

That aside, when the simulation runs, it typically uses a relatively
small amount of memory (about 1.5% of my 4GB of RAM on my linux
desktop) and this never increases.  It can run for days without
increasing beyond this, running many many parameter set iterations.
This is what happens both on my Ubuntu Linux machine with the
following Python specs:

Python 2.4.4c1 (#2, Oct 11 2006, 20:00:03)
[GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.version.version
'1.0rc1'

and also on my Apple MacBook with the following Python specs:

Python 2.4.3 (#1, Apr  7 2006, 10:54:33)
[GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.version.version
'1.0.1.dev3435'



Well, that is the case on two of my test machines, but not on the one
machine that I really wish would work, my lab's cluster, which would
give me 20-fold increase in the number of processes I could run.  On
that machine, each process is using 2GB of RAM after about 1 hour (and
the cluster MOM eventually kills them).  I can watch the process eat
RAM at each iteration and never relinquish it.  Here's the Python spec
of the cluster:

Python 2.4.4 (#1, Jan 21 2007, 12:09:48)
[GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-49)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.version.version
'1.0.1'

It also showed the same issue with the April 2006 2.4.3 release of python.

I have tried using the gc module to force garbage collection after
each iteration, but no change.  I've done many newsgroup/google
searches looking for known issues, but none found.  The only major
difference I can see is that our cluster is stuck on a really old
version of gcc with the RedHat Enterprise that's on there, but I found
no suggestions of memory issues online.

So, does anyone have any suggestions for how I can debug this problem?
 If my program ate up memory on all machines, then I would know where
to start and would blame some horrible programming on my end.  This
just seems like a less straightforward problem.

Thanks for any help,
Per
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Program eating memory, but only on one machine?

2007-01-22 Thread Per B.Sederberg

Wolfgang Draxinger wdraxinger at darkstargames.de writes:
 
  So, does anyone have any suggestions for how I can debug this
  problem?
 
 Have a look at the version numbers of the GCC used. Probably
 something in your C code fails if it interacts with GCC 3.x.x.
 It's hardly Python eating memory, this is probably your C
 module. GC won't help here, since then you must add this into
 your C module.
 
   If my program ate up memory on all machines, then I would know
  where to start and would blame some horrible programming on my
  end. This just seems like a less straightforward problem.
 
 GCC 3.x.x brings other runtime libs, than GCC 4.x.x, I would
 check into that direction.
 

Thank you for the suggestions.  Since my C module is such a small part of the
simulations, I can just comment out the call to that module completely (though I
am still loading it) and fill in what the results would have been with random
values.  Sadly, the program still eats up memory on our cluster.

Still, it could be something related to compiling Python with the older GCC.

I'll see if I can make a really small example program that eats up memory on our
cluster.  That way we'll have something easy to work with.

Thanks,
Per



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Program eating memory, but only on one machine? (Solved, sort of)

2007-01-22 Thread Per B.Sederberg
Per B.Sederberg persed at princeton.edu writes:

 I'll see if I can make a really small example program that eats up memory on
 our cluster.  That way we'll have something easy to work with.

Now this is weird.  I figured out the bug and it turned out that every time you
call numpy.setmember1d in the latest stable release of numpy it was using up a
ton of memory and never releasing it.

I replaced every instance of setmember1d with my own method below and I have
zero increase in memory.  It's not the most efficient of code, but it gets the
job done...


from numpy import zeros

def ismember(a, b):
    # boolean mask over a: True where the element also occurs in b
    ainb = zeros(len(a), dtype=bool)
    for item in b:
        ainb = ainb | (a == item)
    return ainb
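
A quick usage example of the workaround above (illustrative, with
made-up data):

from numpy import array

a = array([1, 2, 3, 4, 5])
b = array([2, 5, 9])
mask = ismember(a, b)
print mask        # a boolean array: [False  True False False  True]
print mask.sum()  # 2 -> two elements of a also occur in b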

I'll now go post this problem on the numpy forums.

Best,
Per




-- 
http://mail.python.org/mailman/listinfo/python-list


(question) How to use python get access to google search without query quota limit

2006-05-05 Thread Per
I am doing a Natural Language Processing project for academic use.

I think google's rich retrieval information and query segmentation might
be of help. I downloaded the google api, but there is a query limit
(1000/day). How can I write python code to simulate browser-like
activity so as to submit more than 10k queries in one day?

applying for more than 10 licence keys and switching between them when a
query-quota exception is raised is not a neat idea...

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: (question) How to use python get access to google search without query quota limit

2006-05-05 Thread Per
Yeah, Thanks Am,

I can be considered an advanced google user, presumably.. But I am
not an advanced programmer yet.

If everyone can generate an unlimited number of queries, soon the
user-query data, which I believe is google's biggest advantage, will be
in chaos. Can they simply ignore some queries from a certain licence key
or the like, so that they can keep their user-query statistics normal
and yet give cranky queriers a reasonable response?

-- 
http://mail.python.org/mailman/listinfo/python-list


Is there such an idiom?

2006-03-19 Thread Per
http://jaynes.colorado.edu/PythonIdioms.html

Use dictionaries for searching, not lists. To find items in common
between two lists, make the first into a dictionary and then look for
items in the second in it. Searching a list for an item is linear-time,
while searching a dict for an item is constant time. This can often let
you reduce search time from quadratic to linear.

Is this correct?
s = [1,2,3,4,5,...]
t = [4,5,6,7,8,...]
how to find whether there is/are common item(s) between the two lists in
linear time?
how to find the number of common items between the two lists in linear
time?
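
(An illustrative answer, not part of the original post: building a set
from one list makes each membership test constant time on average, so
both questions can be answered in roughly linear time.)

s = [1, 2, 3, 4, 5]
t = [4, 5, 6, 7, 8]

common = set(s) & set(t)   # intersection of the two sets
print bool(common)         # True -> there is at least one common item
print len(common)          # 2    -> number of distinct common items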

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Is there such an idiom?

2006-03-19 Thread Per
Thanks Ron,
 surely set is the simplest way to understand the question, to see
whether there is a non-empty intersection. But I did the following
thing in a silly way, still not sure whether it is going to be linear
time.
def foo():
    l = [...]
    s = [...]
    dic = {}
    for i in l:
        dic[i] = 0
    k = 0
    while k < len(s):
        if s[k] in dic:
            return True
        else:
            pass
        k += 1
    if k == len(s):
        return False


I am still a rookie, and partly just migrated from Haskell...
I am not clear about how making one of the lists a dictionary is
helpful
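
(A short sketch of why the dictionary helps, not from the thread:
membership tests against a dict are hash lookups, so scanning the second
list stays linear instead of quadratic.)

def has_common_item(l, s):
    # build the lookup table once: roughly O(len(l))
    dic = dict.fromkeys(l)
    # each 'in' test is an average constant-time hash lookup,
    # so this loop is roughly O(len(s))
    for item in s:
        if item in dic:
            return True
    return False

print has_common_item([1, 2, 3], [7, 8, 3])  # True
print has_common_item([1, 2, 3], [7, 8, 9])  # False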

-- 
http://mail.python.org/mailman/listinfo/python-list


serial port server cnhd38

2005-02-22 Thread per . bergstrom
To whom it may concern,
The serial port server 'cnhd38' has been terminated (on whose
initiative, I don't know).
It affects the users of the (at least) following nodes:
cnhd36, cnhd44, cnhd45, cnhd46, cnhd47.
The new terminal server to use is called 'msp-t01'. The port
numbers that are of interest for the nodes mentioned above are
as follows:
port 17: this port is shared between:
  cnhd44/etm4 serial port (via riscwatch), currently connected here.
  cnhd36/console port
port 18: this port goes to cnhd44/console port
port 19: this port goes to cnhd45/console port
port 20: this port goes to cnhd47/console port
port 21: this port goes to cnhd46/console port
To connect to a port, just enter the following command:
telnet msp-t01 <prefix><portnumber>
... an extra enter should give you the prompt.
<prefix> is always 20
<portnumber> is the port number...
example, connect to cnhd47/console port:
telnet msp-t01 2020
/Per
--
http://mail.python.org/mailman/listinfo/python-list


test

2004-12-23 Thread Per Erik Stendahl
sdfdsafasd
--
http://mail.python.org/mailman/listinfo/python-list