Hi Amit,

For your problem (1): there is an error in your HDFS sink configuration. The property names below are wrong, so Flume ignores them and falls back to the sink's default SequenceFile format; that is why the file is getting stored as a sequence file:

agent1.sinks.HDFS.hdfs.file.Type = DataStream
agent1.sinks.HDFS.hdfs.file.Format = Text
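The "SEQ^F" prefix in the file content you pasted is in fact the SequenceFile magic header ("SEQ" plus the version byte 0x06), which confirms this. If it helps, here is a small sketch you could run against a local copy of the file (e.g. after `hadoop fs -get`); the function name is my own, not a Flume or Hadoop API:

```python
def is_sequence_file(path):
    """Return True if the file starts with Hadoop's SequenceFile magic bytes."""
    # Every Hadoop SequenceFile begins with the 3 bytes "SEQ" followed by a
    # single version byte (0x06 in Hadoop 1.x, matching the "SEQ^F" you pasted).
    with open(path, "rb") as f:
        return f.read(3) == b"SEQ"
```

A plain-text file written by the sink with fileType = DataStream will not carry this header.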
You need to correct it as below:

agent1.sinks.HDFS.hdfs.fileType = DataStream
agent1.sinks.HDFS.hdfs.writeFormat = Text

I hope this solves your first problem.

----------------------------------------
Thanks & Regards,
Ashutosh Sharma
----------------------------------------

From: Amit Handa [[email protected]]
Sent: Friday, July 06, 2012 6:44 PM
To: [email protected]
Subject: Re: flume ng error while going for hdfs sink

Hi,

@Mike, thanks for your reply.

1) After executing the Flume NG agent and the Avro client, the file is created in HDFS. Today I used the same Flume NG setup with Hadoop 1.0.1. The problem I am now facing is that I send a normal text file through the Avro client, but the file content in HDFS comes out as shown below. I want the content of this file in HDFS to be in normal text format.

HDFS file content:
"SEQ^F!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable^@^@^@^@^@^@^UªG^Oòá~v¾z/<87>^[~ð^@^@^@)^@^@^@^H^@^@^A8[<8e>)Ú^@^@^@^]We are modifying the file now^@^@^@

The text file content given through the Avro client is:
We are modifying the file now

Kindly provide your inputs to resolve this issue. My flume.conf file content is as follows:

# Define a memory channel called ch1 on agent1
agent1.channels.ch1.type = memory

# Define an Avro source called avro-source1 on agent1 and tell it
# to bind to 0.0.0.0:41414. Connect it to channel ch1.
agent1.sources.avro-source1.channels = ch1
agent1.sources.avro-source1.type = avro
agent1.sources.avro-source1.selector.type = replicating
agent1.sources.avro-source1.bind = 0.0.0.0
agent1.sources.avro-source1.port = 41414

# Define an HDFS sink that simply logs all events it receives
# and connect it to the other end of the same channel.
agent1.sinks.HDFS.channel = ch1
agent1.sinks.HDFS.type = hdfs
agent1.sinks.HDFS.hdfs.path = hdfs://localhost:54310/user/hadoop-node1/flumeTest
agent1.sinks.HDFS.hdfs.file.Type = DataStream
agent1.sinks.HDFS.hdfs.file.Format = Text

# Finally, now that we've defined all of our components, tell
# agent1 which ones we want to activate.
agent1.channels = ch1
agent1.sources = avro-source1
agent1.sinks = HDFS

2) On the Flume NG side I am still getting a security-related IOException when I start flume-ng using the above configuration file. The exception log on the flume-ng side is:

2012-07-06 11:14:42,957 (conf-file-poller-0) [DEBUG - org.apache.hadoop.security.Groups.<init>(Groups.java:59)] Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
2012-07-06 11:14:42,961 (conf-file-poller-0) [DEBUG - org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)] java.io.IOException: config()
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:214)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:187)
    at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:239)
    at org.apache.hadoop.security.KerberosName.<clinit>(KerberosName.java:83)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:212)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:187)
    at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:239)
    at org.apache.flume.sink.hdfs.HDFSEventSink.authenticate(HDFSEventSink.java:516)
    at org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:239)
    at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
    at
org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.loadSinks(PropertiesFileConfigurationProvider.java:373)
    at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:223)
    at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:123)
    at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
    at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:202)

With Regards,
Amit Handa

On Fri, Jul 6, 2012 at 12:21 AM, Mike Percy <[email protected]> wrote:

On Thu, Jul 5, 2012 at 12:28 AM, Amit Handa <[email protected]> wrote:

Hi All,

While trying to run Flume NG using the HDFS sink and the Avro client, I am getting an IOException. Kindly help in resolving this issue. The exception log is as follows:

2012-07-05 12:01:32,789 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:70)] Creating instance of sink HDFS typehdfs
2012-07-05 12:01:32,816 (conf-file-poller-0) [DEBUG - org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)] java.io.IOException: config()
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:214)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:187)
    at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:239)
    ....

Nothing is wrong with this; you are running at DEBUG level and Hadoop is giving you debug-level output. If you don't want DEBUG-level messages from Hadoop while running Flume at DEBUG level, you will need to add something like:

log4j.logger.org.apache.hadoop = INFO

to your log4j.properties file.
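For context, the relevant part of Flume's conf/log4j.properties would end up looking something like this; the root-logger line is only an illustration of a typical existing setting (keep whatever your file already has), and only the org.apache.hadoop line is the addition:

```properties
# Existing root logger setting (example only; yours may differ)
log4j.rootLogger=DEBUG, LOGFILE

# Addition: cap Hadoop's own classes at INFO so their DEBUG chatter is suppressed
log4j.logger.org.apache.hadoop=INFO
```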
Are you experiencing any problems with your setup?

Regards,
Mike

This e-mail may contain confidential information and/or copyright material. This e-mail is intended for the use of the addressee only. If you receive this e-mail by mistake, please either delete it without reproducing, distributing or retaining copies thereof, or notify the sender immediately.
