I don't know about Cassandra, but you should probably fix the log4j
configuration for Cassandra so that you get the stack trace:

log4j:WARN No appenders could be found for logger
(com.datastax.bdp.hadoop.cfs.CassandraFileSystem).
log4j:WARN Please initialize the log4j system properly.
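
Those warnings mean log4j 1.2 found no configuration, so the logger has
nowhere to write and the stack trace is lost. Below is a minimal sketch of
a log4j.properties that sends INFO and above to the console. It assumes
log4j 1.2 (the version the warning's FAQ link refers to); the appender name
"stdout" is arbitrary, and where the file has to live in your DSE/Oozie
install is something you would need to check. By default log4j 1.2 looks for
log4j.properties on the classpath, or at the location given by
-Dlog4j.configuration.

```properties
# Assumed minimal log4j 1.2 setup: route every logger to one console appender
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
# Timestamp, level, short logger name, message; any exception's stack trace
# is printed by the appender right after the message line
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p %c{1} - %m%n
```

With something like that picked up by the process, the "Fatal configuration
error" line should be followed by the actual stack trace instead of just
"See log for stacktrace".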

On Thu, Nov 6, 2014 at 10:49 AM, Deepak Reddy <[email protected]> wrote:

> I hadn't checked the catalina.out file before.
>
> Here is the log detail:
>
> log4j:WARN No appenders could be found for logger
> (com.datastax.bdp.hadoop.cfs.CassandraFileSystem).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
> more info.
> DC or rack not found in snitch properties, check your configuration in:
> cassandra-rackdc.properties
> Fatal configuration error; unable to start server.  See log for stacktrace.
> Nov 6, 2014 12:17:53 AM org.apache.coyote.http11.Http11Protocol pause
> INFO: Pausing Coyote HTTP/1.1 on http-11000
> Nov 6, 2014 12:17:54 AM org.apache.catalina.core.StandardService stop
> INFO: Stopping service Catalina
> Nov 6, 2014 12:17:54 AM org.apache.catalina.core.StandardWrapper unload
> INFO: Waiting for 1 instance(s) to be deallocated
> Nov 6, 2014 12:17:55 AM org.apache.catalina.core.StandardWrapper unload
> INFO: Waiting for 1 instance(s) to be deallocated
> Nov 6, 2014 12:17:56 AM org.apache.catalina.core.StandardWrapper unload
> INFO: Waiting for 1 instance(s) to be deallocated
> Nov 6, 2014 12:17:56 AM org.apache.catalina.loader.WebappClassLoader
> clearReferencesJdbc
> SEVERE: The web application [/oozie] registered the JDBC driver
> [com.mysql.jdbc.Driver] but failed to unregister it when the web
> application was stopped. To prevent a memory leak, the JDBC Driver has been
> forcibly unregistered.
> Nov 6, 2014 12:17:56 AM org.apache.catalina.loader.WebappClassLoader
> clearReferencesThreads
> SEVERE: The web application [/oozie] appears to have started a thread
> named [FileWatchdog] but has failed to stop it. This is very likely to
> create a memory leak.
> Nov 6, 2014 12:17:56 AM org.apache.catalina.loader.WebappClassLoader
> clearReferencesThreads
> SEVERE: The web application [/oozie] is still processing a request that
> has yet to finish. This is very likely to create a memory leak. You can
> control the time allowed for requests to finish by using the unloadDelay
> attribute of the standard Context implementation.
> Nov 6, 2014 12:17:56 AM org.apache.catalina.loader.WebappClassLoader
> checkThreadLocalMapForLeaks
> SEVERE: The web application [/oozie] created a ThreadLocal with key of
> type [org.apache.oozie.util.XLog$Info$1] (value
> [org.apache.oozie.util.XLog$Info$1@76bcfa38]) and a value of type [
> org.apache.oozie.util.XLog.Info] (value
> [org.apache.oozie.util.XLog$Info@70cfaf6e]) but failed to remove it when
> the web application was stopped. This is very likely to create a memory
> leak.
> Nov 6, 2014 12:17:56 AM org.apache.catalina.loader.WebappClassLoader
> checkThreadLocalMapForLeaks
> SEVERE: The web application [/oozie] created a ThreadLocal with key of
> type [org.apache.hadoop.io.Text$1] (value
> [org.apache.hadoop.io.Text$1@5306989e]) and a value of type
> [sun.nio.cs.UTF_8.Encoder] (value [sun.nio.cs.UTF_8$Encoder@26c94114])
> but failed to remove it when the web application was stopped. This is very
> likely to create a memory leak.
> Nov 6, 2014 12:17:56 AM org.apache.catalina.loader.WebappClassLoader
> checkThreadLocalMapForLeaks
> SEVERE: The web application [/oozie] created a ThreadLocal with key of
> type [org.apache.oozie.util.XLog$Info$1] (value
> [org.apache.oozie.util.XLog$Info$1@76bcfa38]) and a value of type [
> org.apache.oozie.util.XLog.Info] (value
> [org.apache.oozie.util.XLog$Info@66cf9bf0]) but failed to remove it when
> the web application was stopped. This is very likely to create a memory
> leak.
> Nov 6, 2014 12:17:56 AM org.apache.catalina.loader.WebappClassLoader
> checkThreadLocalMapForLeaks
> SEVERE: The web application [/oozie] created a ThreadLocal with key of
> type [java.lang.ThreadLocal] (value [java.lang.ThreadLocal@4806de4b]) and
> a value of type [org.apache.oozie.util.Instrumentation.Cron] (value
> [org.apache.oozie.util.Instrumentation$Cron@3aec32de]) but failed to
> remove it when the web application was stopped. This is very likely to
> create a memory leak.
>
> cat /etc/dse/cassandra/cassandra-rackdc.properties
>
> # These properties are used with GossipingPropertyFileSnitch and will
> # indicate the rack and dc for this node
> dc=DC2
> rack=RAC1
>
> Am I missing something?
>
> Thanks,
> Deepak
>
> -----Original Message-----
> From: Shwetha GS [mailto:[email protected]]
> Sent: Wednesday, November 05, 2014 9:03 PM
> To: [email protected]
> Subject: Re: oozie crashing on cassandra cluster
>
> Is there any error in catalina.out?
>
> On Thu, Nov 6, 2014 at 9:55 AM, Deepak Reddy <[email protected]>
> wrote:
>
> > I am trying to set up Oozie on a Cassandra cluster with the following
> > changes.
> >
> > When I try to run an example job, the Oozie server freezes up and stops
> > responding.
> >
> > Can you give me more info on why this is happening?
> >
> > Thanks,
> > Deepak
> >
> > The details of my server are below.
> >
> > oozie-site.xml
> >
> > <property>
> >     <name>oozie.service.HadoopAccessorService.supported.filesystems</name>
> >     <value>hdfs,hftp,webhdfs,cfs</value>
> >     <description>
> >         Enlist the different filesystems supported for federation.
> >         If wildcard "*" is specified, then ALL file schemes will be allowed.
> >     </description>
> > </property>
> >
> > <property>
> >     <name>cassandra.thrift.address</name>
> >     <value>sjc-prd-dt21</value>
> > </property>
> > <property>
> >     <name>cassandra.thrift.port</name>
> >     <value>9160</value>
> > </property>
> > <property>
> >     <name>cassandra.partitioner.class</name>
> >     <value>org.apache.cassandra.dht.RandomPartitioner</value>
> > </property>
> > <property>
> >     <name>cassandra.consistencylevel.read</name>
> >     <value>LOCAL_QUORUM</value>
> > </property>
> > <property>
> >     <name>cassandra.consistencylevel.write</name>
> >     <value>LOCAL_QUORUM</value>
> > </property>
> > <property>
> >     <name>cassandra.range.batch.size</name>
> >     <value>1024</value>
> > </property>
> > <property>
> >     <name>mapreduce.fileoutputcommitter.marksuccessfuljobs</name>
> >     <value>false</value>
> > </property>
> >
> > hadoop-site.xml
> >
> > <property>
> >     <name>fs.cfs.impl</name>
> >     <value>com.datastax.bdp.hadoop.cfs.CassandraFileSystem</value>
> > </property>
> > <property>
> >     <name>cassandra.thrift.address</name>
> >     <value>sjc-prd-dt21</value>
> > </property>
> > <property>
> >     <name>cassandra.thrift.port</name>
> >     <value>9160</value>
> > </property>
> > <property>
> >     <name>fs.default.name</name>
> >     <value>cfs://sjc-prd-dt21:9160</value>
> > </property>
> >
> > After starting the Oozie server, I run an example job as follows:
> >
> > /usr/local/oozie/bin/oozie job -oozie http://sjc-prd-dt21:11000/oozie
> > -config examples/apps/no-op/job.properties -run
> >
> > The job.properties file looks like this:
> >
> > nameNode=cfs://sjc-prd-dt21:9160
> > jobTracker=sjc-prd-dt22:8012
> >
> > oozie.wf.application.path=${nameNode}/user/cassandra/examples/apps/no-op
> >
> > In the Oozie logs I see the following before Oozie freezes up:
> >
> > 2014-11-04 14:20:20,022 DEBUG HadoopAccessorService:545 -
> > USER[cassandra] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Checking if
> > filesystem cfs is supported
> > 2014-11-04 14:20:20,030 DEBUG UserGroupInformation:146 - hadoop login
> > 2014-11-04 14:20:20,031 DEBUG UserGroupInformation:95 - hadoop login
> > commit
> > 2014-11-04 14:20:20,032 DEBUG UserGroupInformation:125 - using local
> > user:UnixPrincipal: root
> > 2014-11-04 14:20:20,034 DEBUG UserGroupInformation:493 - UGI
> > loginUser:root
> > 2014-11-04 14:20:20,036 DEBUG UserGroupInformation:1143 -
> > PriviledgedAction as:cassandra via root
> > from:org.apache.oozie.service.HadoopAccessorService.createFileSystem(HadoopAccessorService.java:420)
> > 2014-11-04 14:20:20,047 DEBUG FileSystem:1381 - Creating filesystem
> > for cfs://sjc-prd-dt21:9160/user/cassandra/examples/apps/no-op
> > 2014-11-04 14:20:20,714 INFO
> > StatusTransitService$StatusTransitRunnable:539 - USER[-] GROUP[-]
> > Acquired lock for [org.apache.oozie.service.StatusTransitService]
> > 2014-11-04 14:20:20,714 INFO PauseTransitService:539 - USER[-]
> > GROUP[-] Acquired lock for
> > [org.apache.oozie.service.PauseTransitService]
> > 2014-11-04 14:20:20,715 INFO
> > StatusTransitService$StatusTransitRunnable:539 - USER[-] GROUP[-]
> > Running coordinator status service first instance
> > 2014-11-04 14:20:20,957 INFO
> > StatusTransitService$StatusTransitRunnable:539 - USER[-] GROUP[-]
> > Running bundle status service first instance
> > 2014-11-04 14:20:20,959 INFO
> > CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:539 -
> > USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-]
> > CoordMaterializeTriggerService - Curr Date= Tue Nov 04 14:25:20 EST
> > 2014, Num jobs to materialize = 0
> > 2014-11-04 14:20:20,977 DEBUG
> > ActionCheckerService$ActionCheckRunnable:545
> > - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] QUEUING [] for
> > potential checking
> > 2014-11-04 14:20:20,987 INFO
> > StatusTransitService$StatusTransitRunnable:539 - USER[-] GROUP[-]
> > Released lock for [org.apache.oozie.service.StatusTransitService]
> > 2014-11-04 14:20:21,029 DEBUG PurgeXCommand:545 - USER[-] GROUP[-]
> > TOKEN[-] APP[-] JOB[-] ACTION[-] Execute command [purge] key [null]
> > 2014-11-04 14:20:21,030 DEBUG PurgeXCommand:545 - USER[-] GROUP[-]
> > TOKEN[-] APP[-] JOB[-] ACTION[-] STARTED Purge to purge Workflow Jobs
> > older than [30] days, Coordinator Jobs older than [7] days, and
> > Bundlejobs older than [7] days.
> > 2014-11-04 14:20:21,030 DEBUG PurgeXCommand:545 - USER[-] GROUP[-]
> > TOKEN[-] APP[-] JOB[-] ACTION[-] ENDED Purge deleted [0] workflows,
> > [0] coordinators, [0] bundles
> > 2014-11-04 14:20:21,033 INFO PauseTransitService:539 - USER[-]
> > GROUP[-] Released lock for
> > [org.apache.oozie.service.PauseTransitService]
> > 2014-11-04 14:20:21,047 DEBUG RecoveryService$RecoveryRunnable:545 -
> > USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] QUEUING [ WF_ACTIONS
> > 0, COORD_ACTIONS : 0, COORD_READY_JOBS : 0, BUNDLE_ACTIONS : 0] for
> > potential recovery
> > 2014-11-04 14:20:23,624 INFO Services:539 - Shutdown
> >
>
> --
> _____________________________________________________________
> The information contained in this communication is intended solely for the
> use of the individual or entity to whom it is addressed and others
> authorized to receive it. It may contain confidential or legally privileged
> information. If you are not the intended recipient you are hereby notified
> that any disclosure, copying, distribution or taking any action in reliance
> on the contents of this information is strictly prohibited and may be
> unlawful. If you have received this communication in error, please notify
> us immediately by responding to this email and then delete it from your
> system. The firm is neither liable for the proper and complete transmission
> of the information contained in this communication nor for any delay in its
> receipt.
>
