David,

Does this look related? https://issues.apache.org/jira/browse/AMBARI-9009

On Wed, Dec 24, 2014 at 3:20 PM, David Novogrodsky <[email protected]> wrote:

> All,
>
> I have run Flume agents on a pseudo-distributed VM from Cloudera,
> ingesting tweets from Twitter. When I paste the same configurations
> into the Flume section of Ambari, I do not get any data from Twitter.
> The screen in Ambari says the agents are running, but when I go to the
> directory, I see no files:
>
> [root@namenode PBX]# hadoop fs -ls /user/flume/tweets
> [root@namenode PBX]# hadoop fs -ls /user/flume/tweets
> [root@namenode PBX]# hadoop fs -ls /user/flume/tweets/
> [root@namenode PBX]#
>
> I have attached the cluster parameters in a PDF.
>
> Here is the URL I am using to add the configuration to the Flume agents:
> http://namenode.localdomain.com:8080/#/main/services/FLUME/configs
>
> Here is the configuration for the Twitter agent:
>
> # defining the source for the agent for Twitter
> TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
> TwitterAgent.sources.Twitter.channels = MemoryChannel
> TwitterAgent.sources.Twitter.consumerKey = (just removing for security)
> TwitterAgent.sources.Twitter.accessToken = (removing)
> TwitterAgent.sources.Twitter.accessTokenSecret = (removing)
> TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics, bigdata, cloudera, data science, data scientist, business intelligence, mapreduce, data warehouse, data warehousing, mahout, hbase, nosql, newsql, businessintelligence, cloudcomputing
> TwitterAgent.sources.Twitter.maxBatchSize = 10
> TwitterAgent.sources.Twitter.maxBatchDurationMillis = 200
>
> # defining the interceptors
> TwitterAgent.sources.Twitter.interceptors = i1
> TwitterAgent.sources.Twitter.interceptors.i1.type = timestamp
>
> # defining the sink for the agent
> TwitterAgent.sinks.HDFS.channel = MemoryChannel
> TwitterAgent.sinks.HDFS.type = hdfs
> TwitterAgent.sinks.HDFS.hdfs.path = /user/flume/tweets/%Y/%m/%d
> TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
> TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
> TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
> TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
> TwitterAgent.sinks.HDFS.hdfs.rollCount = 100000
> TwitterAgent.sinks.HDFS.hdfs.rollInterval = 6000
> TwitterAgent.sinks.HDFS.hdfs.filePrefix = events-
>
> # defining the channel for the agent
> TwitterAgent.channels.MemoryChannel.type = memory
> TwitterAgent.channels.MemoryChannel.capacity = 10000
> TwitterAgent.channels.MemoryChannel.transactionCapacity = 10000
>
> David Novogrodsky
> [email protected]
> http://www.linkedin.com/in/davidnovogrodsky
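
One more thing that may be worth ruling out, separately from the JIRA above: the configuration as quoted does not show the lines that list the agent's components, and Flume only instantiates the sources, channels and sinks that are named in those lists. They may simply have been trimmed from the email. If they are actually missing from what was pasted into Ambari, a minimal sketch of the addition, reusing the agent and component names from the quoted configuration, would be:

```properties
# Component lists for the agent named TwitterAgent.
# Assumption: these three lines are absent from the pasted config;
# the names on the right must match the ones used in the
# TwitterAgent.sources.*, .channels.* and .sinks.* keys.
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemoryChannel
TwitterAgent.sinks = HDFS
```

The quoted source section also has no consumerSecret entry next to consumerKey. If that was not just removed along with the other credentials, the Twitter source would likely need it as well.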
