subject:"Spark Python with SequenceFile containing numpy deserialized data in str form"

Re: Spark Python with SequenceFile containing numpy deserialized data in str form

2015-08-30 Thread Peter Aberline

Hi,

I saw the posting about storing NumPy values in sequence files:
http://mail-archives.us.apache.org/mod_mbox/spark-user/201506.mbox/%3cCAJQK-mg1PUCc_hkV=q3n-01ioq_pkwe1g-c39ximco3khqn...@mail.gmail.com%3e

I’ve had a go at implementing this and opened a pull request at https://github.com/apach

Re: Spark Python with SequenceFile containing numpy deserialized data in str form

2015-06-08 Thread Sam Stoelinga

Update: As a workaround I now use saveAsPickleFile instead, which handles everything correctly: the value stays in byte format. I noticed that Python gets messy because str and bytes are the same type in Python 2.7, and I wonder whether Python 3 would have the same problem. I would still like to use a cross langu
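For context, saveAsPickleFile stores the RDD as pickled Python objects. The following is a minimal sketch in plain Python 3, without Spark, of why a pickle round trip leaves a binary value as bytes. The record (file name and payload) is hypothetical and only for illustration; it does not show whether the SequenceFile path behaves differently under Python 3.

```python
import pickle

# Hypothetical record: a filename key and a binary value (not from the thread).
record = ("features/img_0001.npz", b"\x93NUMPY\x00\xff binary payload")

# Pickling produces bytes, and unpickling restores each element's type.
blob = pickle.dumps(record)
key, value = pickle.loads(blob)

print(type(key).__name__, type(value).__name__)
```

In Python 3, str and bytes are distinct types, so the key comes back as text and the value as bytes without any implicit decoding step.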

Re: Spark Python with SequenceFile containing numpy deserialized data in str form

2015-06-08 Thread Sam Stoelinga

Update: Converting the value to a bytearray before storing it in the RDD is not a solution either. Reading the RDD back fails when the value was stored as a Python bytearray:

Traceback (most recent call last):
  File "/vagrant/python/kmeans.py", line 24, in <module>
    features = sc.sequenceFile(featu

Spark Python with SequenceFile containing numpy deserialized data in str form

2015-06-08 Thread Sam Stoelinga

Hi all,

I'm storing an RDD as a SequenceFile with the following content:

key = filename (string)
value = Python str from numpy.savez (not unicode)

To make sure the whole NumPy array gets stored, I first have to serialize it with:

def serialize_numpy_array(numpy_array):
    output = io.BytesIO()
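The function in the post is cut off after its first statement. As a rough sketch of the approach it describes (writing the array with numpy.savez into an in-memory buffer), the round trip could look like the code below. Everything after `output = io.BytesIO()` is an assumption, and `deserialize_numpy_array` is a hypothetical helper that does not appear in the thread.

```python
import io

import numpy as np


def serialize_numpy_array(numpy_array):
    """Pack one array into the bytes of an in-memory .npz archive."""
    output = io.BytesIO()
    # Assumed continuation: savez names positional arrays arr_0, arr_1, ...
    np.savez(output, numpy_array)
    return output.getvalue()


def deserialize_numpy_array(payload):
    """Hypothetical inverse: read the array back from the archive bytes."""
    with np.load(io.BytesIO(payload)) as archive:
        return archive["arr_0"]


original = np.arange(6, dtype=np.float64).reshape(2, 3)
payload = serialize_numpy_array(original)
restored = deserialize_numpy_array(payload)

print(type(payload).__name__, restored.shape, restored.dtype)
```

The payload is raw binary, so it only survives if whatever stores it keeps it as bytes. Decoding it as text at any point would likely corrupt the archive.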
