[
https://issues.apache.org/jira/browse/TAJO-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356174#comment-14356174
]
Hyunsik Choi edited comment on TAJO-1388 at 3/11/15 11:58 PM:
--------------------------------------------------------------
I have some questions. I'm not a Kafka expert, so please let me know if I have
misunderstood something.
Kafka deals with unbounded streams of data. When batch processing systems
are integrated with Kafka, they seem to make use of the offset values of each
topic. How about you? Could you share your plan for dealing with Kafka
topic streams?
It would be great if you could share a design draft of your work.
> [Umbrella] Kafka Storage Integration.
> -------------------------------------
>
> Key: TAJO-1388
> URL: https://issues.apache.org/jira/browse/TAJO-1388
> Project: Tajo
> Issue Type: New Feature
> Components: storage
> Reporter: YeonSu Han
> Assignee: YeonSu Han
> Labels: kafka_storage
>
> Apache Kafka is one of the most widely used message queueing systems. If we
> can use Kafka as Tajo storage, the range of analyses available to Tajo users
> will broaden, for example to real-time analysis.
> For this, I propose a 'Kafka storage'. Please review my proposal and give your
> opinion.
> * Table Creation
> {code:sql}
> CREATE [EXTERNAL] TABLE [IF NOT EXISTS] <table_name> [(<column_name>
> <data_type>, ... )]
> USING kafka WITH
> ('kafka.topic'='<kafka_topic_name>', 'kafka.zk'='<kafka_zookeeper_info>', [other
> options])
> {code}
> ** Use the 'kafka' keyword in the USING clause to create a Kafka table in Tajo.
> ** The Kafka topic named by the 'kafka.topic' property is mapped to the Tajo
> table.
> * Column mapping of Kafka messages
> ** Delimited line mapping (default)
> ** JSON mapping
> ** ...
> * Concept
> ** A Kafka topic corresponds to a table.
> ** A Kafka partition corresponds to a file.
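
The concept above, together with the question about offsets, might be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposed implementation: `KafkaFragment`, `plan_fragments`, and `parse_delimited` are hypothetical names, the `events` topic and its offsets are made up, and the `|` delimiter is an assumed default. It treats each partition as a file-like scan unit bounded by an offset range captured once at planning time, so a batch query sees a fixed snapshot of an unbounded topic. No Kafka client is used; the offsets are passed in as plain data.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class KafkaFragment:
    """Hypothetical scan unit: one Kafka partition bounded by an offset range."""
    topic: str
    partition: int
    start_offset: int  # inclusive
    end_offset: int    # exclusive


def plan_fragments(topic: str,
                   offsets: Dict[int, Tuple[int, int]]) -> List[KafkaFragment]:
    """Turn per-partition (start, end) offsets into bounded fragments.

    `offsets` is assumed to be captured once at query-planning time, so the
    scan covers a fixed range even though the topic keeps growing.
    """
    fragments = []
    for partition in sorted(offsets):
        start, end = offsets[partition]
        if end > start:  # skip partitions with no messages in the range
            fragments.append(KafkaFragment(topic, partition, start, end))
    return fragments


def parse_delimited(message: str, columns: List[str],
                    delimiter: str = "|") -> Dict[str, Optional[str]]:
    """Map one delimited message onto named columns; missing fields are None."""
    fields: List[Optional[str]] = list(message.split(delimiter))
    fields += [None] * (len(columns) - len(fields))
    return dict(zip(columns, fields))


# Made-up offsets for a hypothetical three-partition topic.
fragments = plan_fragments("events", {0: (0, 120), 1: (40, 40), 2: (5, 17)})
row = parse_delimited("1|click|2015-03-11", ["id", "action", "ts"])
```

Fixing the end offset at planning time is one possible way to give a batch engine a bounded input; whether the actual design does this is exactly what the comment above asks.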