[
https://issues.apache.org/jira/browse/GEODE-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anthony Baker closed GEODE-10.
------------------------------
> HDFS Integration
> ----------------
>
> Key: GEODE-10
> URL: https://issues.apache.org/jira/browse/GEODE-10
> Project: Geode
> Issue Type: New Feature
> Components: hdfs
> Reporter: Dan Smith
> Assignee: Ashvin
> Fix For: 1.0.0-incubating.M1
>
> Attachments: GEODE-HDFSPersistence-Draft-060715-2109-21516.pdf
>
>
> The ability to persist data on HDFS had been under development for GemFire. It
> was part of the latest code drop, GEODE-8. As part of this feature we are
> proposing some changes to the HdfsStore management API (see attached doc for
> details).
> # The current API has nested configuration for compaction and async queue.
> This nested structure forces users to execute multiple steps to manage a
> store. It also does not seem to be consistent with other management APIs.
> # Some member names in the current API are confusing.
>
> HDFS Integration: Geode as a transactional layer that microbatches data out
> to Hadoop. This capability makes Geode a NoSQL store that can sit on top of
> Hadoop and parallelize the process of moving data from the in-memory tier
> into Hadoop, making it very useful for capturing and processing fast data
> while making it available for Hadoop jobs relatively quickly. The key
> requirements being met here are:
> # Ingest data into HDFS in parallel
> # Cache bloom filters and allow fast lookups of individual elements
> # Have programmable policies for deciding what stays in memory
> # Roll files in HDFS
> # Index data that is in memory
> # Have expiration policies that allow the transactional set to decay out
> older data
> # Solution needs to support replicated and partitioned regions
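
For readers unfamiliar with the Bloom filter requirement in the list above, here is a minimal sketch of the general idea in Java. It is not Geode's implementation: the class name `SimpleBloomFilter`, its methods, and the hashing scheme are all hypothetical and chosen only for illustration.

```java
import java.util.BitSet;

/** Hypothetical, simplified Bloom filter; an illustration, not Geode code. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;       // number of bits in the filter
    private final int hashCount;  // number of bit positions set per key

    SimpleBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    /** Derives the i-th bit position from the key using simple double hashing. */
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1; // odd step so successive positions differ
        return Math.floorMod(h1 + i * h2, size);
    }

    /** Records the key by setting all of its bit positions. */
    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    /** Returns false only if the key was definitely never added. */
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }
}
```

A `false` result means the key was never added, so a lookup could skip the corresponding file entirely. A `true` result may be a false positive, so the file would still need to be read to confirm.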