Oops, I just noticed that this was already suggested by Ashutosh Sharma.

On Fri, Jul 6, 2012 at 4:18 AM, Will McQueen <[email protected]> wrote:
> Hi Amit,
>
> Try changing:
>
>     agent1.sinks.HDFS.hdfs.file.Type = DataStream
>
> to:
>
>     agent1.sinks.HDFS.hdfs.fileType = DataStream
>
> Otherwise the fileType is SequenceFile by default.
>
> Cheers,
> Will
>
> On Fri, Jul 6, 2012 at 2:44 AM, Amit Handa <[email protected]> wrote:
>
>> Hi,
>>
>> @Mike, thanks for your reply.
>>
>> 1) After starting the Flume NG agent and the Avro client, a file is created in HDFS.
>> I used the same flume-ng setup today with Hadoop 1.0.1.
>> The problem I am now facing is that I send a normal text file through the Avro client, but the file content inside HDFS comes out as shown below. I want this file content in HDFS to be in normal text format.
>>
>> HDFS file content:
>>
>>     SEQ^F!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable^@^@^@^@^@^@^UªG^Oòá~v¾z/<87>^[~ð^@^@^@)^@^@^@^H^@^@^A8[<8e>)Ú^@^@^@^]We are modifying the file now^@^@^@
>>
>> The txt file content given through the AvroClient is:
>>
>>     We are modifying the file now
>>
>> Kindly provide your inputs to resolve this issue.
>> My flume.conf file content is as follows:
>>
>>     # Define a memory channel called ch1 on agent1
>>     agent1.channels.ch1.type = memory
>>
>>     # Define an Avro source called avro-source1 on agent1 and tell it
>>     # to bind to 0.0.0.0:41414. Connect it to channel ch1.
>>     agent1.sources.avro-source1.channels = ch1
>>     agent1.sources.avro-source1.type = avro
>>     agent1.sources.avro-source1.selector.type = replicating
>>     agent1.sources.avro-source1.bind = 0.0.0.0
>>     agent1.sources.avro-source1.port = 41414
>>
>>     # Define a hdfs sink that simply logs all events it receives
>>     # and connect it to the other end of the same channel.
>>     agent1.sinks.HDFS.channel = ch1
>>     agent1.sinks.HDFS.type = hdfs
>>     agent1.sinks.HDFS.hdfs.path = hdfs://localhost:54310/user/hadoop-node1/flumeTest
>>     agent1.sinks.HDFS.hdfs.file.Type = DataStream
>>     agent1.sinks.HDFS.hdfs.file.Format = Text
>>
>>     # Finally, now that we've defined all of our components, tell
>>     # agent1 which ones we want to activate.
>>     agent1.channels = ch1
>>     agent1.sources = avro-source1
>>     agent1.sinks = HDFS
>>
>> 2) On the Flume NG side I am still getting a security-related IOException when I start flume-ng using the above configuration file. The exception log on the flume-ng side is:
>>
>>     2012-07-06 11:14:42,957 (conf-file-poller-0) [DEBUG - org.apache.hadoop.security.Groups.<init>(Groups.java:59)] Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
>>     2012-07-06 11:14:42,961 (conf-file-poller-0) [DEBUG - org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)] java.io.IOException: config()
>>         at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)
>>         at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:214)
>>         at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:187)
>>         at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:239)
>>         at org.apache.hadoop.security.KerberosName.<clinit>(KerberosName.java:83)
>>         at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:212)
>>         at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:187)
>>         at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:239)
>>         at org.apache.flume.sink.hdfs.HDFSEventSink.authenticate(HDFSEventSink.java:516)
>>         at org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:239)
>>         at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
>>         at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.loadSinks(PropertiesFileConfigurationProvider.java:373)
>>         at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:223)
>>         at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:123)
>>         at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
>>         at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:202)
>>
>> With Regards,
>> Amit Handa
>>
>> On Fri, Jul 6, 2012 at 12:21 AM, Mike Percy <[email protected]> wrote:
>>
>>> On Thu, Jul 5, 2012 at 12:28 AM, Amit Handa <[email protected]> wrote:
>>>
>>>> Hi All,
>>>>
>>>> While trying to run Flume NG using the HDFS sink and the Avro client, I am getting an IOException. Kindly help in resolving this issue.
>>>>
>>>> The exception log is as follows:
>>>>
>>>>     2012-07-05 12:01:32,789 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:70)] Creating instance of sink HDFS typehdfs
>>>>     2012-07-05 12:01:32,816 (conf-file-poller-0) [DEBUG - org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)] java.io.IOException: config()
>>>>         at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)
>>>>         at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:214)
>>>>         at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:187)
>>>>         at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:239)
>>>>         ...
>>>
>>> Nothing is wrong with this; you are running at DEBUG level and Hadoop is giving you debug-level output. If you don't want to get DEBUG-level messages from Hadoop while running Flume at DEBUG level, then you will need to add something like:
>>>
>>>     log4j.logger.org.apache.hadoop = INFO
>>>
>>> to your log4j.properties file.
>>>
>>> Are you experiencing any problems with your setup?
>>>
>>> Regards,
>>> Mike
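For what it's worth, the garbled HDFS content quoted above also confirms the diagnosis: the "SEQ...LongWritable...BytesWritable" prefix is the header of a Hadoop SequenceFile, which is exactly what the sink writes when hdfs.fileType is left at its default (the misspelled hdfs.file.Type key is simply ignored). With the fix applied, the sink section would look something like the sketch below. Note that hdfs.file.Format is, as far as I can tell, not a documented key either; with DataStream the event body is written out as-is, so I've dropped that line here rather than guess at a replacement:

    # flume.conf (sketch) -- HDFS sink writing events as plain text
    agent1.sinks.HDFS.channel = ch1
    agent1.sinks.HDFS.type = hdfs
    agent1.sinks.HDFS.hdfs.path = hdfs://localhost:54310/user/hadoop-node1/flumeTest
    # camelCase "fileType", not "file.Type"; unrecognized keys fall back
    # silently to the default, which is SequenceFile
    agent1.sinks.HDFS.hdfs.fileType = DataStream

After restarting the agent with this change, newly rolled files in the hdfs.path directory should contain the raw event bodies instead of SequenceFile records.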
