Shaofeng SHI created KYLIN-3369:
-----------------------------------

             Summary: Reduce the data size sink from Kafka topic to HDFS
                 Key: KYLIN-3369
                 URL: https://issues.apache.org/jira/browse/KYLIN-3369
             Project: Kylin
          Issue Type: Improvement
          Components: Streaming
            Reporter: Shaofeng SHI

When building a cube from a Kafka topic, the first step is to sink the Kafka data to HDFS. The current implementation persists every field of each message to disk, although in many cases only a couple of fields are needed for cubing. This wastes network bandwidth and disk space.
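
To illustrate the idea, here is a minimal sketch of projecting a message down to the fields a cube needs before it is written out. It is not Kylin's implementation: it assumes messages are JSON objects, and the function name `project_message`, the sample message, and the field list `cube_fields` are all hypothetical.

```python
import json
from typing import Any, Dict, List


def project_message(raw: str, needed_fields: List[str]) -> Dict[str, Any]:
    """Parse a JSON message and keep only the fields the cube needs.

    Fields absent from the message are emitted as None so that every
    projected record has the same set of keys.
    """
    record = json.loads(raw)
    return {name: record.get(name) for name in needed_fields}


# Hypothetical message and column list, for illustration only.
message = json.dumps({
    "order_id": 1001,
    "country": "CN",
    "amount": 25.5,
    "user_agent": "Mozilla/5.0 (example)",
    "session_blob": "x" * 200,
})
cube_fields = ["country", "amount"]

projected = project_message(message, cube_fields)

# Compare the bytes that would be persisted with and without projection.
full_size = len(message.encode("utf-8"))
projected_size = len(json.dumps(projected).encode("utf-8"))
print(full_size, projected_size)
```

In this sketch the projected record carries only the two cubing fields, so the bytes written per message shrink roughly in proportion to the fields dropped. Where the projection would actually run (in the consumer, or in the job that writes to HDFS) is left open here.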