[ https://issues.apache.org/jira/browse/TAJO-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Byunghwa Yun updated TAJO-1388:
-------------------------------
    Description: 
Apache Kafka is one of the most widely used message queueing systems. If Tajo 
can use Kafka as storage, the range of analyses available to Tajo users 
broadens, for example to real-time analysis. 
To this end, I propose 'Kafka storage'. Please review my proposal and share 
your opinion.

* Table Creation
{code:sql}
CREATE TABLE [IF NOT EXISTS] <table_name> [(column_list)] TABLESPACE kafka_cluster1
using kafka with ('kafka.topic'='<kafka_topic_name>',[other options])

CREATE EXTERNAL TABLE [IF NOT EXISTS] <table_name> (column_list)
using kafka with ('kafka.topic'='<kafka_topic_name>',[other options])
LOCATION 'kafka://host1:9092,host2:9092,host3:9092'
{code}

** Use the "kafka" keyword in the "using" clause to create a Kafka table in Tajo.
** The Kafka topic name is mapped to the Tajo table name through the 
'kafka.topic' property.

* storage-site.json
{code:json}
{
  "spaces": {
    "kafka_cluster1":
    { "uri": "kafka://host1:9092,host2:9092,host3:9092" }
  }
}
{code}

* Column mapping of Kafka messages
** Delimited line mapping (default)
** JSON mapping
** ...
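
To make the two mappings concrete, here is a minimal Python sketch of how one message could be turned into a row. It is only an illustration and not Tajo code: the column list, the function names, the '|' delimiter, and the choice to fill missing fields with None are all assumptions that the proposal does not specify.

```python
import json

# Hypothetical column list for illustration; not taken from the proposal.
COLUMNS = ["id", "name", "score"]


def map_delimited(message: str, delimiter: str = "|") -> dict:
    """Delimited line mapping: pair fields with columns by position."""
    fields = message.split(delimiter)
    # Assumption: missing trailing fields become None, extra fields are dropped.
    padded = fields + [None] * (len(COLUMNS) - len(fields))
    return dict(zip(COLUMNS, padded))


def map_json(message: str) -> dict:
    """JSON mapping: pair object keys with columns by name."""
    obj = json.loads(message)
    # Assumption: keys absent from the message become None.
    return {col: obj.get(col) for col in COLUMNS}
```

In this sketch the delimited mapping is positional and leaves every value as a string, while the JSON mapping is by name and keeps the JSON types. How values are converted to the declared column types is left open here.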

* Concept
** A Kafka topic corresponds to a table.
** A Kafka partition corresponds to a file.
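
One way to read the concept above is that a scan of a table becomes one unit of work per partition, in the same way a file-based table is scanned file by file. The Python sketch below illustrates that reading only. The Fragment type, the plan_fragments function, and the use of offset ranges to bound each unit are assumptions and are not part of the proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """Hypothetical scan unit: one Kafka partition plays the role of one file."""
    topic: str
    partition: int
    start_offset: int  # inclusive
    end_offset: int    # exclusive


def plan_fragments(topic: str, partition_offsets: dict) -> list:
    """Build one fragment per partition.

    partition_offsets maps a partition id to an (earliest, latest) offset pair.
    Assumption: the offsets are captured once when the query is planned.
    """
    return [
        Fragment(topic, partition, start, end)
        for partition, (start, end) in sorted(partition_offsets.items())
    ]
```

For example, plan_fragments("events", {0: (10, 100), 1: (0, 50)}) yields two fragments, one for each partition of the hypothetical "events" topic.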

  was:
Apache Kafka is one of the most widely used message queueing systems. If Tajo 
can use Kafka as storage, the range of analyses available to Tajo users 
broadens, for example to real-time analysis. 
To this end, I propose 'Kafka storage'. Please review my proposal and share 
your opinion.

* Table Creation
{code:sql}
CREATE TABLE [IF NOT EXISTS] <table_name> [(column_list)] TABLESPACE kafka_cluster1
using kafka with ('kafka.topic'='<kafka_topic_name>',[other options])

CREATE EXTERNAL TABLE [IF NOT EXISTS] <table_name> (column_list)
using kafka with ('kafka.topic'='<kafka_topic_name>',[other options])
LOCATION 'kafka://host1:9092,host2:9092,host3:9092'
{code}

* storage-site.json
{code:json}
{
  "spaces": {
    "kafka_cluster1":
    { "uri": "kafka://host1:9092,host2:9092,host3:9092" }
  }
}
{code}
** Use the "kafka" keyword in the "using" clause to create a Kafka table in Tajo.
** The Kafka topic name is mapped to the Tajo table name through the 
'kafka.topic' property.

* Column mapping of Kafka messages
** Delimited line mapping (default)
** JSON mapping
** ...

* Concept
** A Kafka topic corresponds to a table.
** A Kafka partition corresponds to a file.


> [Umbrella] Kafka Storage Integration.
> -------------------------------------
>
>                 Key: TAJO-1388
>                 URL: https://issues.apache.org/jira/browse/TAJO-1388
>             Project: Tajo
>          Issue Type: New Feature
>          Components: Storage
>            Reporter: YeonSu Han
>            Assignee: Byunghwa Yun
>              Labels: kafka_storage
>         Attachments: Kafka _Storage_Ingegration_draft.pdf
>
>
> Apache Kafka is one of the most widely used message queueing systems. If Tajo 
> can use Kafka as storage, the range of analyses available to Tajo users 
> broadens, for example to real-time analysis. 
> To this end, I propose 'Kafka storage'. Please review my proposal and share 
> your opinion.
> * Table Creation
> {code:sql}
> CREATE TABLE [IF NOT EXISTS] <table_name> [(column_list)] TABLESPACE kafka_cluster1
> using kafka with ('kafka.topic'='<kafka_topic_name>',[other options])
> CREATE EXTERNAL TABLE [IF NOT EXISTS] <table_name> (column_list)
> using kafka with ('kafka.topic'='<kafka_topic_name>',[other options])
> LOCATION 'kafka://host1:9092,host2:9092,host3:9092'
> {code}
> ** Use the "kafka" keyword in the "using" clause to create a Kafka table in Tajo.
> ** The Kafka topic name is mapped to the Tajo table name through the 
> 'kafka.topic' property.
> * storage-site.json
> {code:json}
> {
>   "spaces": {
>     "kafka_cluster1":
>     { "uri": "kafka://host1:9092,host2:9092,host3:9092" }
>   }
> }
> {code}
> * Column mapping of Kafka messages
> ** Delimited line mapping (default)
> ** JSON mapping
> ** ...
> * Concept
> ** A Kafka topic corresponds to a table.
> ** A Kafka partition corresponds to a file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
