Looks like your agent was set up properly. Can you increase the heap and try
again? The startup line in your log shows the JVM running with -Xmx20m, which
is very small for a memory channel. You can raise it by setting -Xmx through
JAVA_OPTS in the flume-env.sh file; try 1G or higher. Also, I assume the file
you are tailing is actually being written to? I strongly suggest using the
AsyncHBaseSink instead of the HBaseSink on HostB.
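
A minimal sketch of that heap change, assuming the agent is launched with
--conf ./conf so that bin/flume-ng sources conf/flume-env.sh. The 1 GB value is
only the starting point suggested above, not a tuned setting:

```shell
# conf/flume-env.sh
# Sourced by bin/flume-ng at startup; replaces the launcher's small default
# heap (the startup line in the quoted log shows -Xmx20m).
# -Xmx1g follows the "1G or higher" suggestion and is a starting point only.
export JAVA_OPTS="-Xmx1g"
```

After restarting the agent, the "+ exec ... java" line printed at startup
should show the new -Xmx value in place of -Xmx20m.
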
Thanks,
Hari
--
Hari Shreedharan
On Thursday, October 4, 2012 at 3:19 PM, Kumar, Suresh wrote:
> Yes, my HBase has the table and column family. If I run the /etc/passwd test
> using the flume-ng client, the table gets populated.
>
> Here is the log from the source agent. There is not much in the sink log
> except for the lines below, which seem to be benign.
>
> Thanks,
> Suresh
>
> 2012-10-04 14:59:05,622 (lifecycleSupervisor-1-0-SendThread(localhost:2181))
> [DEBUG -
> org.apache.zookeeper.client.ZooKeeperSaslClient.clientTunneledAuthenticationInProgress(ZooKeeperSaslClient.java:515)]
> Could not retrieve login configuration: java.lang.SecurityException: Unable
> to locate a login configuration
> 2012-10-04 14:59:08,414 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:188)]
> Checking file:conf/flume.conf for changes
>
> source agent log:
>
>
> $ bin/flume-ng agent --conf ./conf/ -f conf/flume.conf
> -Dflume.root.logger=DEBUG,console -n agent3
>
> + exec /usr/lib/jvm/java-6-sun/bin/java -Xmx20m
> -Dflume.root.logger=DEBUG,console -cp '/opt/flume/conf:/opt/flume/lib/*'
> -Djava.library.path= org.apache.flume.node.Application -f conf/flume.conf -n
> agent3
> 2012-10-04 15:09:30,778 (main) [INFO -
> org.apache.flume.lifecycle.LifecycleSupervisor.start(LifecycleSupervisor.java:67)]
> Starting lifecycle supervisor 1
> 2012-10-04 15:09:30,791 (main) [INFO -
> org.apache.flume.node.FlumeNode.start(FlumeNode.java:54)] Flume node starting
> - agent3
> 2012-10-04 15:09:30,799 (lifecycleSupervisor-1-1) [INFO -
> org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.start(DefaultLogicalNodeManager.java:203)]
> Node manager starting
> 2012-10-04 15:09:30,801 (lifecycleSupervisor-1-0) [INFO -
> org.apache.flume.conf.file.AbstractFileConfigurationProvider.start(AbstractFileConfigurationProvider.java:67)]
> Configuration provider starting
> 2012-10-04 15:09:30,810 (lifecycleSupervisor-1-1) [INFO -
> org.apache.flume.lifecycle.LifecycleSupervisor.start(LifecycleSupervisor.java:67)]
> Starting lifecycle supervisor 10
> 2012-10-04 15:09:30,813 (lifecycleSupervisor-1-1) [DEBUG -
> org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.start(DefaultLogicalNodeManager.java:207)]
> Node manager started
> 2012-10-04 15:09:30,819 (lifecycleSupervisor-1-0) [DEBUG -
> org.apache.flume.conf.file.AbstractFileConfigurationProvider.start(AbstractFileConfigurationProvider.java:86)]
> Configuration provider started
> 2012-10-04 15:09:30,819 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:188)]
> Checking file:conf/flume.conf for changes
> 2012-10-04 15:09:30,821 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:195)]
> Reloading configuration file:conf/flume.conf
> 2012-10-04 15:09:30,839 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:912)]
> Added sinks: avro-sink Agent: agent3
> 2012-10-04 15:09:30,840 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:998)]
> Processing:avro-sink
> 2012-10-04 15:09:30,840 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1002)]
> Created context for avro-sink: hostname
> 2012-10-04 15:09:30,841 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:998)]
> Processing:avro-sink
> 2012-10-04 15:09:30,841 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:998)]
> Processing:avro-sink
> 2012-10-04 15:09:30,841 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:998)]
> Processing:avro-sink
> 2012-10-04 15:09:30,841 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:295)]
> Starting validation of configuration for agent: agent3,
> initial-configuration: AgentConfiguration[agent3]
> SOURCES: {tail={ parameters:{command=tail -F /var/log/auth.log,
> channels=MemoryChannel-1, type=exec} }}
> CHANNELS: {MemoryChannel-1={ parameters:{type=memory} }}
> SINKS: {avro-sink={ parameters:{port=41414, hostname=sig-flume, type=avro,
> channel=MemoryChannel-1} }}
>
> 2012-10-04 15:09:30,854 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateChannels(FlumeConfiguration.java:450)]
> Created channel MemoryChannel-1
> 2012-10-04 15:09:30,883 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSinks(FlumeConfiguration.java:655)]
> Creating sink: avro-sink using AVRO
> 2012-10-04 15:09:30,885 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:353)]
> Post validation configuration for agent3
> AgentConfiguration created without Configuration stubs for which only basic
> syntactical validation was performed[agent3]
> SOURCES: {tail={ parameters:{command=tail -F /var/log/auth.log,
> channels=MemoryChannel-1, type=exec} }}
> CHANNELS: {MemoryChannel-1={ parameters:{type=memory} }}
> SINKS: {avro-sink={ parameters:{port=41414, hostname=sig-flume, type=avro,
> channel=MemoryChannel-1} }}
>
> 2012-10-04 15:09:30,885 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:117)]
> Channels:MemoryChannel-1
>
> 2012-10-04 15:09:30,885 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:118)]
> Sinks avro-sink
>
> 2012-10-04 15:09:30,885 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:119)]
> Sources tail
>
> 2012-10-04 15:09:30,885 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:122)]
> Post-validation flume configuration contains configuration for agents:
> [agent3]
> 2012-10-04 15:09:30,886 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.loadChannels(PropertiesFileConfigurationProvider.java:249)]
> Creating channels
> 2012-10-04 15:09:30,886 (conf-file-poller-0) [DEBUG -
> org.apache.flume.channel.DefaultChannelFactory.create(DefaultChannelFactory.java:68)]
> Creating instance of channel MemoryChannel-1 type memory
> 2012-10-04 15:09:31,013 (conf-file-poller-0) [INFO -
> org.apache.flume.instrumentation.MonitoredCounterGroup.<init>(MonitoredCounterGroup.java:68)]
> Monitoried counter group for type: CHANNEL, name: MemoryChannel-1,
> registered successfully.
> 2012-10-04 15:09:31,013 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.loadChannels(PropertiesFileConfigurationProvider.java:273)]
> created channel MemoryChannel-1
> 2012-10-04 15:09:31,014 (conf-file-poller-0) [DEBUG -
> org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:74)]
> Creating instance of source tail, type exec
> 2012-10-04 15:09:31,037 (conf-file-poller-0) [INFO -
> org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:70)]
> Creating instance of sink: avro-sink, type: avro
> 2012-10-04 15:09:31,045 (conf-file-poller-0) [INFO -
> org.apache.flume.instrumentation.MonitoredCounterGroup.<init>(MonitoredCounterGroup.java:68)]
> Monitoried counter group for type: SINK, name: avro-sink, registered
> successfully.
> 2012-10-04 15:09:31,046 (conf-file-poller-0) [INFO -
> org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.startAllComponents(DefaultLogicalNodeManager.java:106)]
> Starting new configuration:{ sourceRunners:{tail=EventDrivenSourceRunner: {
> source:org.apache.flume.source.ExecSource{name:tail,state:IDLE} }}
> sinkRunners:{avro-sink=SinkRunner: {
> policy:org.apache.flume.sink.DefaultSinkProcessor@e949f69 counterGroup:{
> name:null counters:{} } }}
> channels:{MemoryChannel-1=org.apache.flume.channel.MemoryChannel{name:
> MemoryChannel-1}} }
> 2012-10-04 15:09:31,046 (conf-file-poller-0) [INFO -
> org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.startAllComponents(DefaultLogicalNodeManager.java:113)]
> Starting Channel MemoryChannel-1
> 2012-10-04 15:09:31,049 (conf-file-poller-0) [INFO -
> org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.startAllComponents(DefaultLogicalNodeManager.java:128)]
> Waiting for channel: MemoryChannel-1 to start. Sleeping for 500 ms
> 2012-10-04 15:09:31,052 (lifecycleSupervisor-1-0) [INFO -
> org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:82)]
> Component type: CHANNEL, name: MemoryChannel-1 started
> 2012-10-04 15:09:31,550 (conf-file-poller-0) [INFO -
> org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.startAllComponents(DefaultLogicalNodeManager.java:141)]
> Starting Sink avro-sink
> 2012-10-04 15:09:31,552 (lifecycleSupervisor-1-0) [INFO -
> org.apache.flume.sink.AvroSink.start(AvroSink.java:239)] Starting AvroSink
> avro-sink { host: sig-flume, port: 41414 }...
> 2012-10-04 15:09:31,552 (lifecycleSupervisor-1-0) [INFO -
> org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:82)]
> Component type: SINK, name: avro-sink started
> 2012-10-04 15:09:31,552 (conf-file-poller-0) [INFO -
> org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.startAllComponents(DefaultLogicalNodeManager.java:152)]
> Starting Source tail
> 2012-10-04 15:09:31,554 (lifecycleSupervisor-1-0) [INFO -
> org.apache.flume.sink.AvroSink.createConnection(AvroSink.java:178)] Avro sink
> avro-sink: Building RpcClient with hostname: sig-flume, port: 41414
> 2012-10-04 15:09:31,561 (lifecycleSupervisor-1-1) [INFO -
> org.apache.flume.source.ExecSource.start(ExecSource.java:155)] Exec source
> starting with command:tail -F /var/log/auth.log
> 2012-10-04 15:09:31,586 (lifecycleSupervisor-1-1) [DEBUG -
> org.apache.flume.source.ExecSource.start(ExecSource.java:173)] Exec source
> started
> 2012-10-04 15:09:31,626 (lifecycleSupervisor-1-0) [DEBUG -
> org.apache.flume.api.NettyAvroRpcClient.configure(NettyAvroRpcClient.java:420)]
> Batch size string = null
> 2012-10-04 15:09:32,684 (lifecycleSupervisor-1-0) [DEBUG -
> org.apache.flume.sink.AvroSink.createConnection(AvroSink.java:192)] Avro sink
> avro-sink: Created RpcClient: NettyAvroRpcClient { host: sig-flume, port:
> 41414 }
> 2012-10-04 15:09:32,685 (lifecycleSupervisor-1-0) [INFO -
> org.apache.flume.sink.AvroSink.start(AvroSink.java:253)] Avro sink avro-sink
> started.
> 2012-10-04 15:09:32,689 (SinkRunner-PollingRunner-DefaultSinkProcessor)
> [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:143)]
> Polling sink runner starting
> 2012-10-04 15:10:01,565 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:188)]
> Checking file:conf/flume.conf for changes
> 2012-10-04 15:10:31,567 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(Abstr
>
>
> From: Hari Shreedharan [mailto:[email protected]]
> Sent: Thursday, October 04, 2012 3:02 PM
> To: [email protected]
> Subject: Re: Flume Source and Sink in different hosts
>
> Can you also send the logs of both agents? Does your HBase cluster have the
> said table, with that column family?
>
> Also, are you sure the files are not getting rotated out? You should use tail
> -F so that the source keeps working even when the files get rotated.
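
To illustrate the rotation point, here is a small sketch, assuming GNU coreutils
tail. The scratch paths and sleep intervals are illustrative assumptions. It
rotates a file underneath a running tail -F and records what tail emits:

```shell
#!/bin/sh
# Sketch: tail -F follows the file *name*, so it reopens the path after rotation.
# tail -f follows the open file descriptor and keeps reading the renamed file.
dir=$(mktemp -d)            # scratch directory (hypothetical paths)
log="$dir/auth.log"
seen="$dir/seen.txt"

echo "before rotation" > "$log"
tail -F "$log" > "$seen" 2>/dev/null &
tail_pid=$!
sleep 1                     # give tail time to open the file

mv "$log" "$log.1"          # simulate log rotation
echo "after rotation" > "$log"
sleep 3                     # give tail time to notice the new file

kill "$tail_pid"
wait "$tail_pid" 2>/dev/null || true
cat "$seen"                 # everything tail emitted during the run
```

Replacing -F with -f in the sketch is a quick way to see the failure mode: tail
stays attached to the renamed file and never reports lines written to the new
one.
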
>
> Hari
>
> --
> Hari Shreedharan
>
> On Thursday, October 4, 2012 at 2:53 PM, Kumar, Suresh wrote:
> >
> > Hello:
> >
> > I have just downloaded and built flume-ng (apache-flume-1.3.0-SNAPSHOT).
> >
> > My goal is to collect log data from HostA (source) and send it to HostB
> > (sink). My initial test (sending /etc/passwd from HostA to HostB) worked
> > fine, and I was also able to load the passwd file into HBase on HostB.
> >
> > Now I want to load a continuous stream of log data (using tail -f), but I
> > was not able to replicate the above process. Flume started fine on HostA,
> > but I do not see any data being received by HostB or in my HBase.
> >
> > What is wrong with my configuration?
> >
> > Thanks,
> > Suresh
> >
> > Here is my flume.conf in HostA
> >
> > agent3.sources = tail
> > agent3.channels = MemoryChannel-1
> > agent3.sinks = avro-sink
> >
> > # Define source flow
> > agent3.sources.tail.type = exec
> > agent3.sources.tail.command = tail -f /var/log/auth.log
> > agent3.sources.tail.channels = MemoryChannel-1
> >
> > # What kind of channel
> > agent3.channels.MemoryChannel-1.type = memory
> >
> > # avro sink properties
> > agent3.sinks.avro-sink.type = avro
> > agent3.sinks.avro-sink.channel = MemoryChannel-1
> > agent3.sinks.avro-sink.hostname = hostb
> > agent3.sinks.avro-sink.port = 41414
> >
> > Here is my flume.conf in HostB
> >
> > # Define a memory channel called ch1 on agent1
> > agent1.channels.ch1.type = memory
> >
> > # Define an Avro source called avro-source1 on agent1 and tell it
> > # to bind to 0.0.0.0:41414. Connect it to channel ch1.
> > agent1.sources.avro-source1.channels = ch1
> > agent1.sources.avro-source1.type = avro
> > agent1.sources.avro-source1.bind = 0.0.0.0
> > agent1.sources.avro-source1.port = 41414
> >
> > # Define a logger sink that simply logs all events it receives
> > # and connect it to the other end of the same channel.
> > agent1.sinks.log-sink1.channel = ch1
> > agent1.sinks.log-sink1.type = logger
> >
> > # Finally, now that we've defined all of our components, tell
> > # agent1 which ones we want to activate.
> > agent1.channels = ch1
> > agent1.sources = avro-source1
> > #agent1.sources = avro-source1
> > agent1.sinks = sink1
> >
> > agent1.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
> > agent1.sinks.sink1.channel = ch1
> > agent1.sinks.sink1.table = flumedemo
> > agent1.sinks.sink1.columnFamily = testing
> > agent1.sinks.sink1.column = foo
> > agent1.sinks.sink1.serializer = org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> > agent1.sinks.sink1.serializer.payloadColumn = col1
> > agent1.sinks.sink1.serializer.keyType = timestamp
> > agent1.sinks.sink1.serializer.rowPrefix = 1
> > agent1.sinks.sink1.serializer.suffix = timestamp
> > agent1.sinks.sink1.serializer.payloadColumn = pcol
> > agent1.sinks.sink1.serializer.incrementColumn = icol