GitHub user darose commented on the pull request:
https://github.com/apache/spark/pull/629#issuecomment-42672273
Thanks for the suggestions. I don't think this is a deployment issue
though. I don't have any spark/shark remnants installed from packages on the
client machine. (I don't even have an /opt/cloudera directory - my Cloudera
packages seem to get installed under /usr/lib/hadoop, /usr/lib/hive, etc.).
Rather, I was manually deploying the binaries I built to /usr/lib/spark and
/usr/lib/shark, and I've been completely removing those directory trees each
time I do a new build.
The same goes for the Hadoop cluster machines: these are Amazon EC2 AMIs
that I'm building from a pristine Ubuntu 13.10 base, so there are no
spark/shark remnants present before I start.
So I'm fairly certain this is an issue of me building incorrectly. I think
what I did to build was:
* grab the current master branch of spark & shark
* update SparkBuild.scala to downgrade the version from 1.0-SNAPSHOT to
0.9.1
* update SharkBuild.scala to use jets3t 0.9.0
* build both with sbt assembly and sbt package
* when done, copy the versions of spark-core_2.10-0.9.1.jar,
spark-bagel_2.10-0.9.1.jar, spark-mllib_2.10-0.9.1.jar, and
spark-repl_2.10-0.9.1.jar that were generated during the spark build and use
them to replace the corresponding jars in lib_managed in the shark build. (My
thinking here was that the shark build was pulling those jars from maven, and
that perhaps the "class incompatible" issue was being caused by spark and shark
using different versions of those jars.)
But the result was the json4s issue I posted above.
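For anyone trying to reproduce the jar-replacement step in the list above, here is a minimal sketch of it in Python. It is not the exact procedure used here, only an illustration under assumptions: the function names, the idea of searching both trees recursively, and the example paths are all hypothetical, and the real layout of lib_managed may differ. Only the four jar names come from the steps above.

```python
import shutil
from pathlib import Path

# Jar names from the steps above: spark-<module>_2.10-0.9.1.jar
MODULES = ["core", "bagel", "mllib", "repl"]
SCALA_VERSION = "2.10"
SPARK_VERSION = "0.9.1"


def jar_name(module, scala=SCALA_VERSION, version=SPARK_VERSION):
    """Build the file name of a Spark module jar, e.g. spark-core_2.10-0.9.1.jar."""
    return f"spark-{module}_{scala}-{version}.jar"


def replace_spark_jars(spark_root, shark_lib_managed, modules=MODULES, dry_run=False):
    """Overwrite same-named jars under shark's lib_managed with locally built ones.

    Assumption: both trees are searched recursively by file name, because the
    exact directory layout is not known here. If the Spark tree contains more
    than one jar with the same name, the first in sorted order is used.
    Returns a list of (source, destination) pairs that were (or would be) copied.
    """
    spark_root = Path(spark_root)
    shark_lib_managed = Path(shark_lib_managed)
    replaced = []
    for module in modules:
        name = jar_name(module)
        built = sorted(spark_root.rglob(name))
        if not built:
            raise FileNotFoundError(f"{name} not found under {spark_root}")
        source = built[0]
        for target in sorted(shark_lib_managed.rglob(name)):
            if not dry_run:
                shutil.copy2(source, target)  # overwrite the managed jar in place
            replaced.append((source, target))
    return replaced
```

Calling `replace_spark_jars("spark", "shark/lib_managed", dry_run=True)` (hypothetical paths) first would list which files are going to be overwritten without touching anything, which makes it easier to spot a jar that was never replaced.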
In any case, as this is a build/deployment issue, probably best for me to
take this off GitHub. Would be very grateful, though, if you might be able to
assist in getting a working spark/shark build for us. Our company is a
Cloudera support customer, so I'll try following up through those channels. If
you don't mind, I'll suggest that the support team get in touch with you about
this, as you're obviously the most well-versed on the issue.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---