read back the original data. Will try converting the str to bytearray
before storing it to a sequencefile.
Thanks,
Sam Stoelinga
)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)
On Tue, Jun 9, 2015 at 11:04 AM, Sam Stoelinga sammiest...@gmail.com
wrote:
Hi all,
I'm storing an rdd as sequencefile with the following content:
key=filename(string) value=python str from numpy.savez(not unicode
language usable SequenceFile instead of
using Picklefile though, so if anybody has pointers, I would appreciate that :)
On Tue, Jun 9, 2015 at 11:35 AM, Sam Stoelinga sammiest...@gmail.com
wrote:
Update: Using bytearray before storing to RDD is not a solution either.
This happens when trying to read
.
On Fri, Jun 5, 2015 at 2:17 PM, Sam Stoelinga sammiest...@gmail.com wrote:
Yea should have emphasized that. I'm running the same code on the same VM.
It's a VM with spark in standalone mode and I run the unit test directly on
that same VM. So OpenCV is working correctly on that same machine
2, 2015 at 5:06 AM, Davies Liu dav...@databricks.com wrote:
Could you run the single thread version in worker machine to make sure
that OpenCV is installed and configured correctly?
On Sat, May 30, 2015 at 6:29 AM, Sam Stoelinga sammiest...@gmail.com
wrote:
I've verified the issue lies
:
Please file a bug here: https://issues.apache.org/jira/browse/SPARK/
Could you also provide a way to reproduce this bug (including some
datasets)?
On Thu, Jun 4, 2015 at 11:30 PM, Sam Stoelinga sammiest...@gmail.com
wrote:
I've changed the SIFT feature extraction to SURF feature extraction
Please ignore this whole thread. It's working out of nowhere. I'm not sure
what was the root cause. After I restarted the VM the previous SIFT code
also started working.
On Fri, Jun 5, 2015 at 10:40 PM, Sam Stoelinga sammiest...@gmail.com
wrote:
Thanks Davies. I will file a bug later with code
?
If the bytes that came from sequenceFile() are broken, it's easy to crash a
C library in Python (OpenCV).
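One way to test that suspicion is to store a checksum next to each value before writing and compare it after reading back. The sketch below is only an illustration in plain Python: the helper names with_digest and is_intact are hypothetical, and it does not touch Spark or OpenCV itself.

```python
import hashlib


def with_digest(payload: bytes):
    """Pair the raw bytes with a SHA-256 hex digest computed before storage."""
    return payload, hashlib.sha256(payload).hexdigest()


def is_intact(payload, digest: str) -> bool:
    """Return True if the bytes read back still match the stored digest.

    bytes(payload) lets the check accept either bytes or bytearray.
    """
    return hashlib.sha256(bytes(payload)).hexdigest() == digest


# Hypothetical binary value, standing in for an encoded image or array.
original = b"\x00\x01binary\xff"
value, digest = with_digest(original)

print(is_intact(value, digest))       # unchanged bytes
print(is_intact(value[:-1], digest))  # truncated bytes
```

If the check fails on the values read back, the data was altered somewhere between writing and reading, and it should not be handed to a C library before that is sorted out.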
On Thu, May 28, 2015 at 8:33 AM, Sam Stoelinga sammiest...@gmail.com
wrote:
Hi sparkers,
I am working on a PySpark application which uses the OpenCV library. It
runs
fine when running
.COLOR_BGR2GRAY)
sift = cv2.xfeatures2d.SIFT_create()
kp, descriptors = sift.detectAndCompute(gray, None)
return (imgfilename, test)
And corresponding tests.py:
https://gist.github.com/samos123/d383c26f6d47d34d32d6
On Sat, May 30, 2015 at 8:04 PM, Sam Stoelinga sammiest...@gmail.com
wrote:
This is the error message taken from STDERR of the worker log:
https://gist.github.com/samos123/3300191684aee7fc8013
I would like pointers or tips on how to debug this further. It would be nice
to know the reason why the worker crashed.
Thanks,
Sam Stoelinga
org.apache.spark.SparkException: Python worker exited
.
Looking forward to hearing you point out my stupidity or provide workarounds
that could make Spark KMeans work well on large datasets.
Regards,
Sam Stoelinga
PM, Jeetendra Gangele gangele...@gmail.com
wrote:
How are you passing the feature vectors to KMeans?
Are they in 2-D space or a 1-D array?
Did you try using Streaming KMeans?
Will you be able to paste the code here?
On 29 April 2015 at 17:23, Sam Stoelinga sammiest...@gmail.com wrote:
Hi Sparkers,
I
Guys, great feedback by pointing out my stupidity :D
Rows and columns got intermixed hence the weird results I was seeing.
Ignore my previous issues; I will reformat my data first.
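For anyone hitting the same thing: KMeans treats each vector it receives as one point, so data laid out with features as rows and samples as columns has to be transposed first. A minimal sketch in plain Python, with made-up numbers and a hypothetical transpose helper (this is not the code from the thread):

```python
def transpose(rows):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(col) for col in zip(*rows)]


# Hypothetical data laid out the wrong way round:
# 2 features (rows) by 3 samples (columns).
features_by_sample = [
    [1.0, 2.0, 3.0],
    [10.0, 20.0, 30.0],
]

# After transposing, each inner list is one sample with 2 coordinates.
samples = transpose(features_by_sample)
print(len(samples), len(samples[0]))
```

Checking the number of points and the length of each one before training is a cheap way to catch this kind of mix-up.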
On Wed, Apr 29, 2015 at 8:47 PM, Sam Stoelinga sammiest...@gmail.com
wrote:
I'm mostly using example code, see here