[ 
https://issues.apache.org/jira/browse/TAJO-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357460#comment-14357460
 ] 

Jakob Homan commented on TAJO-1388:
-----------------------------------

A couple questions:
* Kafka is agnostic to the byte content of messages, so different serdes are 
used for encoding/decoding data.  Won't the serde also need to be specified, 
and won't it end up defining the column mapping of the Kafka message?
* How will this approach work with log compaction, where Kafka can delete 
older messages that have the same key as later messages?
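
To make the compaction concern concrete, here is a minimal sketch in plain Python (no Kafka client involved). The `Record` shape and the `compact` helper are hypothetical names used only for illustration, and real Kafka compaction runs asynchronously per log segment, so this only models the end state: after compaction, only the latest message per key survives, and the surviving offsets are no longer contiguous.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shape for illustration: (offset, key, value).
Record = Tuple[int, str, bytes]


def compact(log: Iterable[Record]) -> List[Record]:
    """Model the end state of log compaction: keep the latest record per key."""
    latest: Dict[str, Record] = {}
    for record in log:
        _, key, _ = record
        latest[key] = record  # a later record replaces an earlier one with the same key
    # Surviving records keep their original offsets and relative order.
    return sorted(latest.values(), key=lambda r: r[0])


log = [
    (0, "user-1", b"a"),
    (1, "user-2", b"b"),
    (2, "user-1", b"c"),  # same key as offset 0, so offset 0 may be removed
]
compacted = compact(log)
```

A table scan that runs before compaction would see three rows, while a scan afterwards would see two, so the same query over the same offset range can return different results at different times.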

Samza, which is generally used for processing Kafka streams, is currently 
working on adding a higher-level language layer (SAMZA-390).  It looks like 
this approach is coming at the same problem from the opposite direction.  It 
may be worth keeping an eye out for areas of cooperation.

> [Umbrella] Kafka Storage Integration.
> -------------------------------------
>
>                 Key: TAJO-1388
>                 URL: https://issues.apache.org/jira/browse/TAJO-1388
>             Project: Tajo
>          Issue Type: New Feature
>          Components: storage
>            Reporter: YeonSu Han
>            Assignee: YeonSu Han
>              Labels: kafka_storage
>
> Apache Kafka is one of the most widely used message queueing systems. If we 
> can use Kafka as Tajo storage, the range of analyses available to Tajo users 
> will be broadened, for example to realtime analysis. 
> For this, I propose 'Kafka storage'. Please review my proposal and give your 
> opinion.
> * Table Creation
> {code:sql}
> CREATE [EXTERNAL] TABLE [IF NOT EXISTS] <table_name> [(<column_name> <data_type>, ... )]
> using kafka with ('kafka.topic'='<kafka_topic_name>', 'kafka.zk'='<kafka_zookeeper_info>', [other options])
> {code}
> ** Use the "kafka" keyword in the "using" clause to create a Kafka table in Tajo.
> ** The Kafka topic name is mapped to the Tajo table name through the 
> 'kafka.topic' property.
> * Column mapping of Kafka messages
> ** Delimited line mapping (default)
> ** JSON mapping
> ** ...
> * Concept
> ** A Kafka topic corresponds to a Tajo table.
> ** A Kafka partition corresponds to a file.
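
As a rough illustration of the two column mappings named in the proposal, the sketch below turns raw message bytes into a row of column values. It is plain Python and not Tajo code: `delimited_mapper`, `json_mapper`, the `|` delimiter and the UTF-8 encoding are all assumptions made for the example, since the proposal does not specify them.

```python
import json
from typing import Callable, List

# Hypothetical helper types for illustration only.
Row = List[object]
Mapper = Callable[[bytes], Row]


def delimited_mapper(columns: List[str], delimiter: str = "|",
                     encoding: str = "utf-8") -> Mapper:
    """Map a delimited text message to columns by position."""
    def decode(payload: bytes) -> Row:
        fields: Row = list(payload.decode(encoding).split(delimiter))
        fields = fields[:len(columns)]  # ignore extra trailing fields
        # Pad missing trailing fields with None (NULL).
        return fields + [None] * (len(columns) - len(fields))
    return decode


def json_mapper(columns: List[str], encoding: str = "utf-8") -> Mapper:
    """Map a JSON object message to columns by field name."""
    def decode(payload: bytes) -> Row:
        obj = json.loads(payload.decode(encoding))
        return [obj.get(name) for name in columns]  # absent fields become None
    return decode


delimited_row = delimited_mapper(["id", "name", "city"])(b"1|alice")
json_row = json_mapper(["id", "name"])(b'{"id": 1, "name": "alice"}')
```

Either mapper is effectively a deserializer for the message value, which is why the serde used by the producer and the column mapping chosen for the table have to agree.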



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
