I noticed that TsProcessor uses the timestamp as the key when putting logs into HBase. However, my logs arrive so fast that several of them share the same timestamp, like this:

2012-01-20 20:03:14,041 [INFO] [communication thread] [org.apache.hadoop.mapred.LocalJobRunner.statusUpdate()] 10 threads, 28 requests, 0 errors, 0 forbidden, 0.6 pages/s, 80 kb/s,
2012-01-20 20:03:14,852 [INFO] [Thread-274] [jcrawler.fetch.mapreduce.FetchMapper.doWork()] -activeThreads=10, spinWaiting=7, fetchQueues.totalSize=649
2012-01-20 20:03:14,852 [INFO] [Thread-274] [jcrawler.fetch.mapreduce.FetchMapper.feedQueueManager()] feeding 649 input urls ...
2012-01-20 20:03:14,852 [INFO] [Thread-274] [jcrawler.fetch.mapreduce.FetchMapper.logHeapUsage()] Fetcher feeding queue manager. Heap usage: 327668152 out of 932118528 bytes.

I think that because of this, the entries with the same timestamp are being reduced to a single one, so only one log is kept for a given timestamp. Any idea how to fix this?

Thanks,
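
To illustrate what I suspect is happening, here is a minimal sketch. It does not use Chukwa or HBase at all: a plain TreeMap stands in for the table, and the class name, method names, and the "timestamp-sequence" key format are all hypothetical, made up for this example. It only shows that keying by timestamp alone keeps one entry per timestamp, while appending a per-timestamp sequence number keeps every entry.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RowKeyCollisionDemo {

    // Each record is {timestamp, message}. Keyed by timestamp only:
    // a later record with the same timestamp replaces the earlier one.
    static Map<String, String> storeByTimestamp(List<String[]> records) {
        Map<String, String> table = new TreeMap<>();
        for (String[] r : records) {
            table.put(r[0], r[1]);
        }
        return table;
    }

    // Hypothetical key scheme (an assumption, not Chukwa's behavior):
    // timestamp plus a zero-padded counter of how many records with
    // that timestamp have already been seen.
    static Map<String, String> storeByTimestampAndSequence(List<String[]> records) {
        Map<String, String> table = new TreeMap<>();
        Map<String, Integer> seen = new HashMap<>();
        for (String[] r : records) {
            int seq = seen.merge(r[0], 1, Integer::sum) - 1; // 0 for the first record
            table.put(String.format("%s-%04d", r[0], seq), r[1]);
        }
        return table;
    }

    public static void main(String[] args) {
        List<String[]> records = Arrays.asList(
            new String[] {"2012-01-20 20:03:14,041", "statusUpdate"},
            new String[] {"2012-01-20 20:03:14,852", "doWork"},
            new String[] {"2012-01-20 20:03:14,852", "feedQueueManager"},
            new String[] {"2012-01-20 20:03:14,852", "logHeapUsage"});

        System.out.println(storeByTimestamp(records).size());            // prints 2
        System.out.println(storeByTimestampAndSequence(records).size()); // prints 4
    }
}
```

With the four log lines above, the timestamp-only map ends up with two entries and the sequence-suffixed map with four. Whether something like this can be plugged into TsProcessor, or whether there is a better supported way, is exactly what I am asking.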