I believe that when I set up Oozie with the setup script (where you specify 
the Hadoop version and so on), I also added extra jars there: the Cassandra 
jars and some of their dependencies, plus cassandra.yaml, cassandra-env.sh, 
and possibly the topology properties file.  Then, with the configuration 
outlined on the Cassandra wiki page that you posted, I just used the built-in 
Pig support and it worked fine.  You might try a simple test case that reads 
from and writes to Cassandra, and look for errors either in the job setup 
(the single-mapper job that Oozie creates to initialize the job) or in the 
job itself.

The specific jars from Cassandra that I added as additional jars were:
cassandra-all
cassandra-thrift
guava
high-scale-lib
libthrift
log4j
snakeyaml
commons-io
plus cassandra.yaml, cassandra-env.sh, and the cassandra-topology.properties 
file (if you are using the PropertyFileSnitch)

I reference those jars in the LIBEXT_JARS environment variable and then run:
bin/oozie-setup.sh prepare-war -jars $LIBEXT_JARS -extjs ./ext-2.2.zip
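
In case it helps, here is a minimal sketch of one way LIBEXT_JARS could be 
assembled. It assumes that -jars accepts a ':'-separated list of jar paths 
(check the usage text of your oozie-setup.sh). The function name 
build_libext_jars, the CASSANDRA_LIB path, and the jar-name prefixes are 
hypothetical, so adjust them to the file names in your install.

```shell
#!/usr/bin/env bash
# Sketch only: build a ':'-separated jar list for LIBEXT_JARS.
# build_libext_jars is a hypothetical helper, not part of Oozie or Cassandra.

# Usage: build_libext_jars DIR PREFIX...
# Prints the ':'-joined paths of every DIR/PREFIX*.jar that exists.
build_libext_jars() {
  local dir=$1; shift
  local result="" prefix jar
  for prefix in "$@"; do
    for jar in "$dir"/"$prefix"*.jar; do
      [ -e "$jar" ] || continue          # glob matched nothing, skip it
      result="${result:+$result:}$jar"   # add ':' only between entries
    done
  done
  printf '%s\n' "$result"
}

# Example (assumed path and prefixes; adjust to the jars you actually have):
# CASSANDRA_LIB=/path/to/cassandra/lib
# LIBEXT_JARS=$(build_libext_jars "$CASSANDRA_LIB" \
#   cassandra-all cassandra-thrift guava high-scale-lib \
#   libthrift log4j snakeyaml commons-io)
# bin/oozie-setup.sh prepare-war -jars "$LIBEXT_JARS" -extjs ./ext-2.2.zip
```

Prefixes that match no file are silently skipped, so it is worth echoing 
$LIBEXT_JARS before running the setup script to confirm nothing is missing.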

Hopefully that helps,

Jeremy

On 28 Nov 2013, at 15:31, Miguel Angel Martin junquera 
<mianmarjun.mailingl...@gmail.com> wrote:

> Hi Jeremy,
> 
> I have not tested it yet; so far I have only run the Pig examples from the
> Oozie project, without Cassandra.
> 
> pig_cassandra adds the Cassandra Pig library jars to the PIG_CLASSPATH
> environment variable and then calls the original pig shell script at
> PIG_HOME/bin/pig. Up to now, I have launched Pig scripts with pig_cassandra
> directly.
> 
> I do not know, and have not seen, how Oozie launches Pig; I suppose that
> Oozie launches PIG_HOME/bin/pig.
> 
> If you are using this configuration and your Pig scripts that use Cassandra
> work fine, I suppose the trick is to put the Cassandra jars, their
> dependencies, and any other UDFs or libraries used by the Pig scripts in
> the Oozie sharelib or in the lib folder of the job.
> 
> 
> On the other hand, I do not know whether I also have to configure something
> like this:
> 
> http://wiki.apache.org/cassandra/HadoopSupport#Oozie
> 
> I am using Cassandra 1.2.10, Oozie 4.0.0 and Pig 0.11.1.
> 
> I will test these options and see whether they work.
> 
> Thanks in advance
> 
> 2013/11/28 Jeremy Hanna <jeremy.hanna1...@gmail.com>
> 
>> If I remember correctly when I configured pig, cassandra, and oozie to
>> work together, I just used vanilla pig but gave it the jars it needed.
>> 
>> What problem are you running into that prevents you from doing this?
>> 
>> Jeremy
>> 
>> On 28 Nov 2013, at 12:56, Miguel Angel Martin junquera <
>> mianmarjun.mailingl...@gmail.com> wrote:
>> 
>>> Hi all,
>>> 
>>> What is the best way to integrate the Cassandra Pig extension with Oozie?
>>> 
>>> Can Oozie be configured to use pig-cassandra instead of pig?
>>> 
>>> Some ideas I am considering are:
>>> 
>>> launching a Shell job that runs ./pig-cassandra script.pig,
>>> or changing the values of environment variables,
>>> or changing the original pig script to include the pig-cassandra code, etc.
>>> 
>>> Thanks and regards
>> 
>> 
