[
https://issues.apache.org/jira/browse/FLUME-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Roshan Naik updated FLUME-1734:
-------------------------------
Description:
Hive 0.13 introduced Streaming support, which is itself transactional in
nature and fits well with Flume's transaction model.
A short overview of Hive's Streaming support, on which this sink is based, can
be found here:
https://issues.apache.org/jira/secure/attachment/12639278/package.html
This jira is for creating a sink that continuously streams data into Hive
tables using the above APIs. The primary goal is that data streamed by the
sink should be queryable immediately (using, say, Hive or Pig) without
requiring additional post-processing steps on the part of users. The sink
should manage the periodic creation of new partitions if needed.
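To illustrate why the two transaction models fit together, here is a minimal sketch of how one batch could be delivered. It is not taken from the attached patches and does not use the real Flume or Hive Streaming classes: InMemoryChannel, InMemoryTable and processBatch are hypothetical stand-ins invented for this example. The assumption is that events are released from the channel only after the table-side commit succeeds, and are returned to the channel otherwise.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-ins only; none of these types are Flume or Hive classes.
class TxnPairingSketch {

    /** Stand-in for a channel: events taken in a transaction come back on rollback. */
    static class InMemoryChannel {
        private final Deque<String> queue = new ArrayDeque<>();
        private final List<String> taken = new ArrayList<>();

        InMemoryChannel(List<String> events) { queue.addAll(events); }

        String take() {
            String e = queue.poll();
            if (e != null) taken.add(e);
            return e;
        }

        void commit() { taken.clear(); }

        void rollback() {
            // Put taken events back at the head, preserving their order.
            for (int i = taken.size() - 1; i >= 0; i--) queue.addFirst(taken.get(i));
            taken.clear();
        }

        int size() { return queue.size(); }
    }

    /** Stand-in for a transactional table writer: rows are visible only after commit. */
    static class InMemoryTable {
        private final List<String> visible = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();
        boolean failNextCommit = false; // lets the demo simulate a failed commit

        void write(String row) { pending.add(row); }

        void commit() {
            if (failNextCommit) {
                failNextCommit = false;
                throw new IllegalStateException("simulated commit failure");
            }
            visible.addAll(pending);
            pending.clear();
        }

        void abort() { pending.clear(); }

        List<String> visibleRows() { return new ArrayList<>(visible); }
    }

    /** Delivers up to batchSize events; returns the number delivered, or 0 on rollback. */
    static int processBatch(InMemoryChannel channel, InMemoryTable table, int batchSize) {
        int count = 0;
        try {
            String event;
            while (count < batchSize && (event = channel.take()) != null) {
                table.write(event);
                count++;
            }
            table.commit();   // rows become queryable here
            channel.commit(); // only now are the events released from the channel
            return count;
        } catch (RuntimeException e) {
            table.abort();      // discard the uncommitted rows
            channel.rollback(); // events return to the channel for a later retry
            return 0;
        }
    }

    public static void main(String[] args) {
        InMemoryChannel channel = new InMemoryChannel(List.of("a", "b", "c"));
        InMemoryTable table = new InMemoryTable();

        table.failNextCommit = true;
        System.out.println(processBatch(channel, table, 2)); // failed batch, rolled back
        System.out.println(processBatch(channel, table, 2)); // retry of the same events
        System.out.println(table.visibleRows());
    }
}
```

In this sketch the table commit happens before the channel commit, so a failure between the two would redeliver the batch rather than lose it. Whether the actual sink orders the commits this way is not stated in this issue.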
was: Create a sink that would stream data into HCatalog partitions. The
primary goal being that once the data is loaded into Hadoop, it should be
automatically queryable (using say Hive or Pig) without requiring additional
post processing steps on behalf of the users. Sink should manage the creation
of new partitions and committing them periodically.
> Create a Hive Sink based on the new Hive Streaming support
> ----------------------------------------------------------
>
> Key: FLUME-1734
> URL: https://issues.apache.org/jira/browse/FLUME-1734
> Project: Flume
> Issue Type: New Feature
> Components: Sinks+Sources
> Affects Versions: v1.2.0
> Reporter: Roshan Naik
> Assignee: Roshan Naik
> Labels: features
> Attachments: FLUME-1734.draft.1.patch, FLUME-1734.draft.2.patch,
> FLUME-1734.v1.patch, FLUME-1734.v2.patch
>