Hi, I looked at hadoop-consumer, which fetches data directly from the Kafka broker. But from what I understand, it works off a min and max offset, and the map tasks complete once they reach the max offset for a given topic.
In our use case we would not know the max offset beforehand. Instead, we want the map tasks to keep reading data from a min offset and roll over every 30 minutes. At the 30-minute mark we would regenerate the offsets to be used for the next run (a rough sketch of the loop we have in mind is below). Any suggestions would be helpful.

regards,
rks
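
P.S. To make the requirement concrete, here is a rough sketch of the read/roll-over loop we have in mind. The MessageSource interface, its fetchFrom/nextOffset methods, and the last.offset checkpoint file are all hypothetical placeholders, not the actual hadoop-consumer or Kafka APIs; the only point is the time-bounded loop and the offset hand-off between runs.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical stand-in for whatever consumer API is actually used.
interface MessageSource {
    // Return the messages available starting at the given offset (possibly empty).
    List<String> fetchFrom(long offset) throws IOException;
    // Return the offset to resume from after processing the given batch.
    long nextOffset(long offset, List<String> batch);
}

public class RollingOffsetReader {
    private static final long WINDOW_MILLIS = 30L * 60L * 1000L;      // 30-minute roll-over window
    private static final Path OFFSET_FILE = Paths.get("last.offset"); // checkpoint read by the next run

    public static void run(MessageSource source) throws IOException {
        long offset = loadOffset();   // min offset for this run (0 on the very first run)
        long deadline = System.currentTimeMillis() + WINDOW_MILLIS;

        // Keep consuming from the current offset until the 30-minute window closes.
        while (System.currentTimeMillis() < deadline) {
            List<String> batch = source.fetchFrom(offset);
            if (batch.isEmpty()) {
                sleepQuietly(1000);   // nothing new yet; back off briefly
                continue;
            }
            for (String message : batch) {
                process(message);
            }
            offset = source.nextOffset(offset, batch);
        }

        // Persist the last offset; it becomes the min offset of the next run.
        saveOffset(offset);
    }

    private static long loadOffset() throws IOException {
        return Files.exists(OFFSET_FILE)
                ? Long.parseLong(Files.readString(OFFSET_FILE).trim())
                : 0L;
    }

    private static void saveOffset(long offset) throws IOException {
        Files.writeString(OFFSET_FILE, Long.toString(offset));
    }

    private static void process(String message) {
        System.out.println(message);  // placeholder for the real map-side processing
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

In the real job the local file checkpoint would of course be replaced by whatever offset storage the job already uses.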