Shaofeng SHI created KYLIN-3369:
-----------------------------------
Summary: Reduce the data size sink from Kafka topic to HDFS
Key: KYLIN-3369
URL: https://issues.apache.org/jira/browse/KYLIN-3369
Project: Kylin
Issue Type: Improvement
Components: Streaming
Reporter: Shaofeng SHI
When building a cube from a Kafka topic, the first step is to sink the Kafka
data to HDFS. The current implementation persists every field of each message
to disk. In many cases, however, only a few fields are needed for cubing, so
the current behavior wastes network bandwidth and disk space.
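
To illustrate the idea, here is a minimal sketch of projecting a parsed message down to the columns a cube needs before the row is written out. It is not Kylin's actual code: the class name FieldProjection, the method projectRow, the column names, and the comma delimiter are all hypothetical, and the message is assumed to be already parsed into a map of field name to value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FieldProjection {

    // Hypothetical helper: keeps only the columns the cube needs, in the given order,
    // and joins them into one delimited row.
    static String projectRow(Map<String, String> message,
                             List<String> neededColumns,
                             String delimiter) {
        List<String> values = new ArrayList<>();
        for (String column : neededColumns) {
            // A missing field becomes an empty string so column positions stay aligned.
            values.add(message.getOrDefault(column, ""));
        }
        return String.join(delimiter, values);
    }

    public static void main(String[] args) {
        // Assumed example message with more fields than the cube uses.
        Map<String, String> message = new LinkedHashMap<>();
        message.put("order_id", "1001");
        message.put("order_time", "2018-05-10 12:00:00");
        message.put("amount", "25.5");
        message.put("user_agent", "Mozilla/5.0");
        message.put("comment", "a long free-text field");

        // Only two of the five fields are written out.
        List<String> needed = Arrays.asList("order_time", "amount");
        System.out.println(projectRow(message, needed, ","));
    }
}
```

In this sketch only the listed columns reach the output row, so unused fields such as user_agent and comment never travel over the network or land on HDFS.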