On 5/1/2014 3:49 PM, Tom Graves wrote:
> Hello,
>
> I am trying to use python (2.6.6) to read a jar file that contains
> python files.  I'm simply setting
> PYTHONPATH=spark-assembly-1.0.0-SNAPSHOT-hadoop2.4.0.jar.
> Unfortunately it fails to read the python files from the jar file and
> if run in verbose mode just shows:
>
> import zipimport # builtin
> # installed zipimport hook
> # zipimport: found 0 names in spark-assembly-1.0.0-SNAPSHOT-hadoop2.4.0.jar
>
> I was messing around and noticed that if I reduce the number of files
> and directories in the jar to below 65536 then it works:
>
> import zipimport # builtin
> # installed zipimport hook
> # zipimport: found 65452 names in pyspark.jar
>
> Is this a known limitation

It is definitely not documented, which it should be if intentional.

> or is this perhaps fixed in newer version or

Install 3.4 and try it. Or go to http://bugs.python.org, click search,
enter 'zipimport' in the title box, change 'open' in the status box to
'dont care', and look at the 44 titles returned (I did not see anything
that looked relevant).

All I know is that in the discussion about a ziplib issue, someone
mentioned using zips with hundreds of thousands of files. But zipimport
could have an additional limitation.
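
One plausible cause, which nothing in this thread confirms, is the
classic zip format itself: its end-of-central-directory record stores
the entry count in a 16-bit field, so an archive with more than 65535
entries needs the ZIP64 extension, which a given reader may or may not
handle. A minimal sketch (Python 3 syntax; the helper names are made up
for illustration) to check which side of that limit an archive falls on:

```python
import zipfile

# The classic (non-ZIP64) end-of-central-directory record stores the
# entry count in a 16-bit field, so it cannot represent more than 65535.
ZIP_CLASSIC_MAX_ENTRIES = 0xFFFF


def count_entries(archive_path):
    """Return the number of entries (files and directories) in a zip or jar."""
    with zipfile.ZipFile(archive_path) as archive:
        return len(archive.namelist())


def exceeds_classic_limit(archive_path):
    """Return True if the archive has more entries than classic zip allows."""
    return count_entries(archive_path) > ZIP_CLASSIC_MAX_ENTRIES
```

If exceeds_classic_limit() returns True for the jar that fails and False
for the one that works, that would at least be consistent with the
65536 threshold reported above.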

> is there a work around?

Multiple archives?
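
One way to read that suggestion, sketched here under assumptions: leave
the big jar alone and copy just the Python sources into a second, much
smaller archive to put on PYTHONPATH. This assumes the .py files alone
number well under 65536 and that nothing else in the jar is needed for
import; the function name and the suffix filter are hypothetical. The
syntax is Python 3, so run it with a newer interpreter than 2.6.

```python
import zipfile


def copy_python_entries(src_path, dest_path, suffixes=(".py",)):
    """Copy entries whose names end with one of `suffixes` into a new archive.

    Returns the number of entries written to dest_path.
    """
    count = 0
    with zipfile.ZipFile(src_path) as src:
        with zipfile.ZipFile(dest_path, "w", zipfile.ZIP_DEFLATED) as dest:
            for name in src.namelist():
                # str.endswith accepts a tuple of candidate suffixes.
                if name.endswith(suffixes):
                    dest.writestr(name, src.read(name))
                    count += 1
    return count
```

Filtering by suffix rather than cutting the entry list into fixed-size
chunks keeps each package in one archive; a package split across two
archives on the path would generally not import as a single package.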

> Note, I'm not subscribed to the mailing list so please copy me in
> response if possible.

--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list
