You should read the message value as a byte array rather than as a string. Another approach is to set Kafka compression to GZIP on the producer, which gives a similar result.
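Both approaches can be sketched roughly as follows. This is only an illustration, not a tested fix for the setup in the quoted message. It assumes Python 3 and the kafka-python client, and it assumes each message value holds one complete gzip stream. The console producer sends one message per input line, so that may not hold for a .gz file piped in on stdin. The function names and the sample payload are hypothetical.

```python
import gzip
import io


def gunzip_payload(payload, encoding="utf-8"):
    """Decompress one gzip-compressed message value (bytes) and decode it as text."""
    # BytesIO wraps the raw bytes so GzipFile can read them like a file.
    with gzip.GzipFile(fileobj=io.BytesIO(payload)) as gz:
        return gz.read().decode(encoding)


def consume_gzipped(topic, brokers):
    """Approach 1: the application gzips each payload, the consumer gunzips the bytes."""
    from kafka import KafkaConsumer  # kafka-python, imported lazily

    # No value_deserializer is set, so message.value arrives as raw bytes.
    consumer = KafkaConsumer(topic, bootstrap_servers=brokers)
    for message in consumer:
        print(gunzip_payload(message.value))


def make_compressing_producer(brokers):
    """Approach 2: let the Kafka client compress record batches with gzip."""
    from kafka import KafkaProducer  # kafka-python, imported lazily

    # With client-side compression the consumer needs no gzip code at all,
    # because the consumer client decompresses the batches transparently.
    return KafkaProducer(bootstrap_servers=brokers, compression_type="gzip")


if __name__ == "__main__":
    # Local round trip without a broker; the JSON sample is made up.
    sample = gzip.compress('{"symbol": "ABC"}'.encode("utf-8"))
    print(gunzip_payload(sample))
```

With the second approach you would send the uncompressed JSON as the message value, and the consumer would read message.value directly without any gzip handling.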
-----Original Message-----
From: mayur shah <mayurshah3...@gmail.com>
Sent: Monday, May 21, 2018 1:50 AM
To: users@kafka.apache.org; d...@kafka.apache.org
Subject: Kafka consumer to unzip stream of .gz files and read

Hi Team,

Greetings! I am facing an issue with a Kafka consumer written in Python and hope you can help us resolve it.

Kafka consumer to unzip stream of .gz files and read
<https://stackoverflow.com/questions/50232186/kafka-consumer-to-unzip-stream-of-gz-files-and-read>

The Kafka producer is sending .gz files, but I am not able to decompress and read the files at the consumer end. I get the error "IOError: Not a gzipped file".

Producer -

    bin/kafka-console-producer.sh --broker-list localhost:9092 --topic Airport < ~/Downloads/stocks.json.gz

Consumer -

    import sys
    import gzip
    import StringIO
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(KAFKA_TOPIC, bootstrap_servers=KAFKA_BROKERS)

    try:
        for message in consumer:
            f = StringIO.StringIO(message.value)
            gzip_f = gzip.GzipFile(fileobj=f)
            unzipped_content = gzip_f.read()
            content = unzipped_content.decode('utf8')
            print (content)
    except KeyboardInterrupt:
        sys.exit()

Error at consumer -

    Traceback (most recent call last):
      File "consumer.py", line 18, in <module>
        unzipped_content = gzip_f.read()
      File "/usr/lib64/python2.6/gzip.py", line 212, in read
        self._read(readsize)
      File "/usr/lib64/python2.6/gzip.py", line 255, in _read
        self._read_gzip_header()
      File "/usr/lib64/python2.6/gzip.py", line 156, in _read_gzip_header
        raise IOError, 'Not a gzipped file'
    IOError: Not a gzipped file

Regards,
Mayur