Eric/Ari - Thanks... got it working.

On Jan 28, 2010, at 3:01 PM, Ariel Rabkin wrote:
> You can also try defining fs.default.name in chukwa-demux-conf.xml
>
> On Thu, Jan 28, 2010 at 1:36 PM, Eric Yang <ey...@yahoo-inc.com> wrote:
>> Echo on what Ari said. Make sure your hdfs-site.xml is in HADOOP_CONF_DIR
>> or in the class path. Demux uses this file to determine location of your
>> HDFS.
>>
>> Regards,
>> Eric
>>
>>
>> On 1/28/10 11:58 AM, "Ariel Rabkin" <asrab...@gmail.com> wrote:
>>
>>> We don't use demux at my site, so I'd love to have Eric or Jerome jump
>>> in here. But that said:
>>>
>>> I believe the typical way to set this up is to have conf/chukwa-env.sh
>>> define HADOOP_CONF_DIR; the filesystem is then specified via the
>>> Hadoop configuration. (fs.default.name) You shouldn't need to change
>>> chukwa-demux-conf.
>>>
>>> In re processSinkFiles -- What version of Chukwa are you using? In
>>> Chukwa 0.3, the only formal release we've done so far, there's no
>>> processSinkFiles.sh, and the line in start-data-processors that
>>> references it has been commented out. You don't need it; references
>>> to it are a historical artifact that should go away in the next
>>> release.
>>>
>>> --Ari
>>>
>>> On Thu, Jan 28, 2010 at 11:15 AM, Corbin Hoenes <cor...@tynt.com> wrote:
>>>> I'm having some difficulty with the demux part of setting up chukwa. I
>>>> assume I am supposed to run the start-data-processors.sh script to startup
>>>> all the map reduce jobs that handle demux and archiving.
>>>>
>>>> My goal is to pull the logs we are collecting out of the sink files and into
>>>> something we can start to run our pig scripts on.
>>>>
>>>> When I run start-data-processors it gives me this though:
>>>>
>>>> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/chukwa/demuxProcessing/mrInput
>>>> at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:190)
>>>> at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:44)
>>>> at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:201)
>>>> at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:851)
>>>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:822)
>>>> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:771)
>>>> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1290)
>>>> at org.apache.hadoop.chukwa.extraction.demux.Demux.run(Demux.java:192)
>>>>
>>>> Which seems like I need to configure it to try to connect to hdfs rather
>>>> than file:/
>>>>
>>>> Only docs I've found are here:
>>>> http://hadoop.apache.org/chukwa/docs/current/admin.html
>>>> Is there a guide to configuring chukwa-demux-conf.xml?
>>>>
>>>> I also noticed start-data-processors.sh tries to start processSinkFiles.sh
>>>> which doesn't exist for me--do I need to get this script?
>>>>
>
> --
> Ari Rabkin asrab...@gmail.com
> UC Berkeley Computer Science Department
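
For anyone finding this thread later, here is a minimal sketch of the fs.default.name setting discussed above. The host name and port are placeholders, not values from this thread, and which file should hold the property depends on your setup: the thread mentions the Hadoop configuration found through HADOOP_CONF_DIR, or chukwa-demux-conf.xml as an alternative.

```xml
<!-- Sketch only: namenode.example.com:9000 is a hypothetical address;
     substitute the URI of your own HDFS namenode. -->
<configuration>
  <property>
    <!-- Default filesystem URI. Without it, paths such as
         /chukwa/demuxProcessing/mrInput resolve against file:/ -->
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
</configuration>
```

With this in place, the demux input path from the stack trace should be looked up in HDFS rather than on the local filesystem.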