[issue19395] lzma hangs for a very long time when run in parallel using python's multiprocessing module?

2013-10-25 Thread cantor

New submission from cantor:

import lzma
from functools import partial
import multiprocessing


def run_lzma(data,c):
return c.compress(data)


def split_len(seq, length):
return [str.encode(seq[i:i+length]) for i in range(0, len(seq), length)]



def lzma_mp(sequence,threads=3):
  lzc = lzma.LZMACompressor()
  blocksize = int(round(len(sequence)/threads))
  strings = split_len(sequence, blocksize)
  lzc_partial = partial(run_lzma,c=lzc)
  pool=multiprocessing.Pool()
  lzc_pool = list(pool.map(lzc_partial,strings))
  pool.close()
  pool.join()
  out_flush = lzc.flush()
  return b''.join(lzc_pool + [out_flush])

sequence = 'AJKGJFKSHFKLHALWEHAIHWEOIAH IOAHIOWEHIOHEIOFEAFEASFEAFWEWEWFQWEWQWQGEWQFEWFDWEWEGEFGWEG'


lzma_mp(sequence,threads=3)
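For reference, a sketch of a variant that avoids sharing one compressor across processes (an assumption about the intent, not the original poster's code): each chunk is compressed as an independent xz stream with a fresh compressor per worker, so the output is a concatenation of streams rather than byte-identical to a single lzma.compress() call, but it decompresses to the same data.

```python
import lzma
import multiprocessing


def split_len(seq, length):
    # Split a str into fixed-size byte chunks.
    return [seq[i:i + length].encode() for i in range(0, len(seq), length)]


def compress_chunk(chunk):
    # A fresh compressor per chunk: nothing picklable or stateful is
    # shared between worker processes.
    return lzma.compress(chunk)


def lzma_mp(sequence, threads=3):
    blocksize = max(1, len(sequence) // threads)
    chunks = split_len(sequence, blocksize)
    with multiprocessing.Pool(threads) as pool:
        return b"".join(pool.map(compress_chunk, chunks))


if __name__ == "__main__":
    data = "AJKGJFKSHFKLHALWEHAIHWEOIAH " * 100
    compressed = lzma_mp(data)
    # lzma.decompress handles concatenated xz streams.
    assert lzma.decompress(compressed) == data.encode()
```

The trade-off is a slightly larger output (one stream header per chunk) and a compression ratio that drops as chunks shrink, since each stream starts with an empty dictionary.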

--
components: ctypes
messages: 201278
nosy: cantor
priority: normal
severity: normal
status: open
title: lzma hangs for a very long time when run in parallel using python's multiprocessing module?
type: behavior
versions: Python 3.3

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19395
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19395] lzma hangs for a very long time when run in parallel using python's multiprocessing module?

2013-10-25 Thread cantor

cantor added the comment:

lzma

--




[issue19395] lzma hangs for a very long time when run in parallel using python's multiprocessing module?

2013-10-25 Thread cantor

Changes by cantor cantorma...@gmail.com:


--
nosy: +nadeem.vawda




[issue19395] lzma hangs for a very long time when run in parallel using python's multiprocessing module?

2013-10-25 Thread cantor

Changes by cantor cantorma...@gmail.com:


--
components:  -ctypes




[issue19395] unpickled LZMACompressor is crashy

2013-10-25 Thread cantor

cantor added the comment:

Just to mention that map() (i.e. the non-parallel version) works:

import lzma
from functools import partial
import multiprocessing

def run_lzma(data,c):
return c.compress(data)


def split_len(seq, length):
return [str.encode(seq[i:i+length]) for i in range(0, len(seq), length)]



sequence = 'AJKGJFKSHFKLHALWEHAIHWEOIAH IOAHIOWEHIOHEIOFEAFEASFEAFWEWEWFQWEWQWQGEWQFEWFDWEWEGEFGWEG'
threads=3
blocksize = int(round(len(sequence)/threads))
strings = split_len(sequence, blocksize)


#map works

lzc = lzma.LZMACompressor()
out = list(map(lzc.compress,strings))
out_flush = lzc.flush()
result = b''.join(out + [out_flush])
lzma.compress(str.encode(sequence)) == result
# returns True

# map with the use of partial function works as well 
lzc = lzma.LZMACompressor()
lzc_partial = partial(run_lzma,c=lzc)
out = list(map(lzc_partial,strings))
out_flush = lzc.flush()
result = b''.join(out + [out_flush])
lzma.compress(str.encode(sequence)) == result

--




[issue19395] unpickled LZMACompressor is crashy

2013-10-25 Thread cantor

cantor added the comment:

In Python 2.7.3 this kind of works; however, it is less efficient than plain
lzma.compress().

from threading import Thread
from backports import lzma
from functools import partial
import multiprocessing


class CompressClass(Thread):
  def __init__ (self,data,c):
Thread.__init__(self)
self.exception=False
self.data=data
    self.datacompressed=''
self.c=c
  def getException(self):
return self.exception   
  def getOutput(self):
return self.datacompressed
  def run(self):
self.datacompressed=(self.c).compress(self.data)


def split_len(seq, length):
return [seq[i:i+length] for i in range(0, len(seq), length)]



def launch_multiple_lzma(data,c):
print 'cores'
present=CompressClass(data,c) 
present.start()  
present.join()
return present.getOutput()


def threaded_lzma_map(sequence,threads):
  lzc = lzma.LZMACompressor()
  blocksize = int(round(len(sequence)/threads))
  lzc_partial = partial(launch_multiple_lzma,c=lzc)
  lzlist = map(lzc_partial,split_len(sequence, blocksize))
  #pool=multiprocessing.Pool()
  #lzclist = pool.map(lzc_partial,split_len(sequence, blocksize))
  #pool.close()
  #pool.join()
  out_flush = lzc.flush()
  res = ''.join(lzlist + [out_flush])
  return res 

sequence = 'AJKGJFKSHFKLHALWEHAIHWEOIAH IOAHIOWEHIOHEIOFEAFEASFEAFWEWEWFQWEWQWQGEWQFEWFDWEWEGEFGWEG'

lzma.compress(sequence) == threaded_lzma_map(sequence,threads=16)

Any way this could be improved?

--




[issue19395] unpickled LZMACompressor is crashy

2013-10-25 Thread cantor

cantor added the comment:

Python 3.3 version - tried this code and got a slightly faster processing time
than when running lzma.compress() on its own. Could this be improved upon?

import lzma
from functools import partial
from threading import Thread

def split_len(seq, length):
return [str.encode(seq[i:i+length]) for i in range(0, len(seq), length)]


class CompressClass(Thread):
  def __init__ (self,data,c):
Thread.__init__(self)
self.exception=False
self.data=data
    self.datacompressed=b''
self.c=c
  def getException(self):
return self.exception   
  def getOutput(self):
return self.datacompressed
  def run(self):
self.datacompressed=(self.c).compress(self.data)


def launch_multiple_lzma(data,c):
present=CompressClass(data,c) 
present.start()  
present.join()
return present.getOutput()


def threaded_lzma_map(sequence,threads):
  lzc = lzma.LZMACompressor()
  blocksize = int(round(len(sequence)/threads))
  lzc_partial = partial(launch_multiple_lzma,c=lzc)
  lzlist = list(map(lzc_partial,split_len(sequence, blocksize)))
  out_flush = lzc.flush()
  return b''.join(lzlist + [out_flush])

sequence = 'AJKGJFKSHFKLHALWEHAIHWEOIAH IOAHIOWEHIOHEIOFEAFEASFEAFWEWEWFQWEWQWQGEWQFEWFDWEWEGEFGWEG'

threaded_lzma_map(sequence,threads=16)
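Note that the snippet above starts each thread and joins it immediately, so chunks are still compressed one after another. A sketch of a version where the work can actually overlap (an editorial assumption, not the poster's code): to the extent that CPython's _lzma extension releases the GIL while compressing, independent per-chunk compressors submitted to a thread pool can run on multiple cores. As in the earlier multiprocessing sketch, each chunk becomes its own xz stream, so the result is not byte-identical to a single lzma.compress() call.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor


def compress_chunk(chunk: bytes) -> bytes:
    # Independent compressor per chunk; no state shared between threads.
    return lzma.compress(chunk)


def threaded_lzma(data: bytes, threads: int = 4) -> bytes:
    blocksize = max(1, len(data) // threads)
    chunks = [data[i:i + blocksize] for i in range(0, len(data), blocksize)]
    with ThreadPoolExecutor(max_workers=threads) as ex:
        # ex.map preserves chunk order in its results.
        return b"".join(ex.map(compress_chunk, chunks))
```

The concatenated streams decompress back to the original data with lzma.decompress(), which handles multiple concatenated xz streams.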

--




[issue18038] Unhelpful error message on invalid encoding specification

2013-05-22 Thread Max Cantor

New submission from Max Cantor:

When you specify a nonexistent encoding at the top of a file, like so for 
example:

# -*- coding: fakefakefoobar -*-

The following exception occurs:

SyntaxError: encoding problem: with BOM

This is very unhelpful, especially in cases where you might have made a typo in 
the encoding.
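A minimal reproduction of the report (an editorial sketch; the exact wording of the error varies by Python version, and later 3.x releases report the unknown encoding by name):

```python
# A source file declaring a nonexistent encoding cookie.
source = b"# -*- coding: fakefakefoobar -*-\nx = 1\n"

try:
    compile(source, "<demo>", "exec")
except SyntaxError as exc:
    # On the reporter's Python 2.7 this read
    # "encoding problem: with BOM", which hides the typo.
    print("SyntaxError:", exc)
```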

--
components: Library (Lib)
messages: 189840
nosy: Max.Cantor
priority: normal
severity: normal
status: open
title: Unhelpful error message on invalid encoding specification
type: behavior
versions: Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue18038
___



[issue17769] python-config --ldflags gives broken output when statically linking Python with --as-needed

2013-04-16 Thread Max Cantor

New submission from Max Cantor:

On certain Linux distributions such as Ubuntu, the linker is invoked by default
with --as-needed, which has an undesirable side effect when linking static
libraries: it is bad at detecting required symbols, and the order of libraries
on the command line becomes significant.

Right now, on my Ubuntu 12.10 system with a custom 32-bit version of Python, I 
get the following command output:

mcantor@hpmongo:~$ /opt/pym32/bin/python-config --ldflags
-L/opt/pym32/lib/python2.7/config -lpthread -ldl -lutil -lm -lpython2.7 
-Xlinker -export-dynamic

When linking a project with those flags, I get the following error:

/usr/bin/ld: /opt/pym32/lib/python2.7/config/libpython2.7.a(dynload_shlib.o): 
undefined reference to symbol 'dlopen@@GLIBC_2.1'
/usr/bin/ld: note: 'dlopen@@GLIBC_2.1' is defined in DSO 
/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../i386-linux-gnu/libdl.so so try 
adding it to the linker command line
/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../i386-linux-gnu/libdl.so: could not 
read symbols: Invalid operation
collect2: error: ld returned 1 exit status

To resolve the error, I moved -ldl and -lutil *AFTER* -lpython2.7, so the 
relevant chunk of my gcc command line looked like this:

-L/opt/pym32/lib/python2.7/config -lpthread -lm -lpython2.7 -ldl -lutil 
-Xlinker -export-dynamic

I have no idea why --as-needed has such an unpleasant side effect when static 
libraries are being used, and it's arguable from my perspective that this 
behavior is the real bug. However it's equally likely that there's a good 
reason for that behavior, like it causes a slowdown during leap-years on Apple 
IIs or something. So here I am. python-config ought to respect the quirks of 
--as-needed when outputting its ldflags.
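The reordering the reporter applied by hand can be sketched as a small post-processing step over the python-config output (a hypothetical helper; the function name and the set of deferred libraries are editorial assumptions):

```python
def reorder_ldflags(flags: str, defer=("-ldl", "-lutil")) -> str:
    """Move the deferred libraries after the -lpython* entry so that
    --as-needed can still resolve symbols (e.g. dlopen) that the
    static libpython archive references."""
    toks = flags.split()
    kept = [t for t in toks if t not in defer]
    deferred = [t for t in toks if t in defer]
    out = []
    for t in kept:
        out.append(t)
        if t.startswith("-lpython"):
            out.extend(deferred)
            deferred = []
    out.extend(deferred)  # fallback if no -lpython flag was present
    return " ".join(out)
```

Applied to the flags quoted above, this yields the working order the reporter found: -lpthread -lm -lpython2.7 -ldl -lutil.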

--
components: Build, Cross-Build
messages: 187121
nosy: Max.Cantor
priority: normal
severity: normal
status: open
title: python-config --ldflags gives broken output when statically linking Python with --as-needed
type: behavior
versions: Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17769
___