Hi kafka gurus, This might sound a little off the track, but I don't know where else to go. I tried the Confluent google group but it seems to be quite unresponsive/inactive. Please bear with me. Many thanks in advance!
I am trying to copy data from Kafka into Hive tables using kafka-hdfs-connector provided by Confluent. While I am able to do it successfully I was wondering how to bucket the incoming data based on time interval. For example, I would like to have a new partition created every 5 minutes. I tried io.confluent.connect.hdfs.partitioner.TimeBasedPartitioner with partition.duration.ms but I think I am doing it the wrong way. I see only one partition in the Hive table with all the data going into that particular partition. Something like this : hive> show partitions test; OK partition year=2016/month=03/day=15/hour=19/minute=03 And all the avro objects are getting copied into this partition. Instead, I would like to have something like this : hive> show partitions test; OK partition year=2016/month=03/day=15/hour=19/minute=03 year=2016/month=03/day=15/hour=19/minute=08 year=2016/month=03/day=15/hour=19/minute=13 Initially connector will create the path year=2016/month=03/day=15/hour=19/minute=03 and will continue to copy all the incoming data into this directory for next 5 minutes, and at the start of 6th minute it should create a new path, i.e year=2016/month=03/day=15/hour=19/minute=08 and copy the data for next 5 minutes into this directory, and so on. This is how my config file looks like : name=hdfs-sink connector.class=io.confluent.connect.hdfs.HdfsSinkConnector tasks.max=1 topics=test hdfs.url=hdfs://localhost:9000 flush.size=3 partitioner.class=io.confluent.connect.hdfs.partitioner.TimeBasedPartitioner partition.duration.ms=300000 path.format='year'=YYYY/'month'=MM/'day'=dd/'hour'=HH/'minute'=MM/ locale=en timezone=GMT logs.dir=/kafka-connect/logs topics.dir=/kafka-connect/topics hive.integration=true hive.metastore.uris=thrift://localhost:9083 schema.compatibility=BACKWARD It would be really helpful if someone could point me in the right direction. I would be glad to share more details in case it's required. Don't want to make this email look like one that never ends. Thank you so much for your valuable time! [image: http://] Tariq, Mohammad about.me/mti [image: http://] <http://about.me/mti>
