[ https://issues.apache.org/jira/browse/CONNECTORS-286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774561#comment-13774561 ]
Karl Wright edited comment on CONNECTORS-286 at 9/23/13 2:03 PM:
-----------------------------------------------------------------

Jan: I will follow up with this to see if Warthog is replaceable by Gora. Warthog is quite full-featured, and a simple columnar rendering is not going to help much. But more importantly, the whole way you use the database has to change for an async key-value store. If it doesn't, you are not significantly better off; that is my conclusion based on playing with Warthog locally.

It all boils down to how you build a high-performance queue using the underlying abstraction. ManifoldCF uses btrees extensively here, especially the ability to read index-ordered tuples from a table. If Gora does not support that and is nevertheless a SQL-like paradigm, it's not possible to use it for ManifoldCF. The alternative, which is to abandon all pretense of SQL constructs, means that we'd have to find a name/value store way of representing a priority queue efficiently. This might be a fun project in its own right - it's what I thought we'd do for ManifoldCF 2.0.

was (Author: kwri...@metacarta.com):
Jan: I will follow up with this to see if Warthog is replaceable by Gora. Warthog is quite full-featured, and a simple columnar rendering is not going to help is much. But more importantly, the whole way you use the database has to change for an async key-value store. If it doesn't you are not significantly better off, is my conclusion based on playing with Warthog locally. It all boils down to how you build a high-performance queue using the underlying abstraction. ManifoldCF uses btrees extensively here, especially the ability to read index-ordered tuples from a table. If Gora does not support that and is nevertheless a SQL-like paradigm, it's not possible to use it for ManifoldCF. The alternative, which is to abandon all pretense of sql constructs, means that we'd have to find an name/value store way of representing a priority queue efficiently.
This might be a fun project in its own right - it's what I thought we'd do for ManifoldCF 2.0.

> Get ManifoldCF to run on top of a key/value store like Voldemort, for
> potential massive scalability improvements and speed gains
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-286
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-286
>             Project: ManifoldCF
>          Issue Type: New Feature
>          Components: Framework core
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF next
>
>
> ManifoldCF's reliance on a relational database limits its throughput and
> scalability. I am now convinced it is possible to build all the structures
> we need within a distributed key-value store like Voldemort, which has the
> nice side effect of permitting massive scaling. I envision there will be
> several layers to this project, some of which may have broader utility in the
> open-source community at large:
> (1) An atomic serialization layer, which adds serialization capabilities to
> a non-transactional substrate;
> (2) A transaction layer, which uses atomic serialization to build a notion of
> light transactions;
> (3) A table and index layer, which defines SQL-like concepts of tables and
> btree indexes on top of the transaction layer, via a Java API;
> (4) A generic "database abstraction" layer, which is capable of representing
> both standard SQL databases and this NoSQL variant, so that ManifoldCF
> can support both models.
> This is obviously a major development task, and as such is not envisioned to
> be completed by the next standard release. Work will indeed need to be done
> in a branch.
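
To make the "index-ordered tuples" point from the comment concrete, here is a minimal sketch of a priority queue laid out on a store that can scan keys in sorted order. Everything in it is an assumption for illustration: the class name KvPriorityQueueSketch, the composite key format, and the use of java.util.TreeMap as an in-process stand-in for the store. It is not the Gora, Voldemort, Warthog, or ManifoldCF API, and it does not attempt the atomic serialization or transaction layers listed in the issue description.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: a priority queue on top of an ordered key/value store.
// The TreeMap is only a single-process stand-in for a store with sorted key scans.
class KvPriorityQueueSketch {
    // Keys sort lexicographically, so the store's key order doubles as queue order.
    private final TreeMap<String, String> store = new TreeMap<>();

    // Composite key: zero-padded priority first, then the document id as a tie-breaker.
    // Assumes non-negative priorities; a lower number is served first.
    static String key(long priority, String docId) {
        if (priority < 0) {
            throw new IllegalArgumentException("priority must be non-negative");
        }
        return String.format(Locale.ROOT, "%019d:%s", priority, docId);
    }

    void enqueue(long priority, String docId) {
        store.put(key(priority, docId), docId);
    }

    // Reads the smallest key in index order and removes it; returns null when empty.
    String poll() {
        Map.Entry<String, String> first = store.pollFirstEntry();
        return first == null ? null : first.getValue();
    }

    public static void main(String[] args) {
        KvPriorityQueueSketch queue = new KvPriorityQueueSketch();
        queue.enqueue(5, "doc-b");
        queue.enqueue(1, "doc-c");
        queue.enqueue(5, "doc-a");
        // Drain the queue in key order.
        for (String doc = queue.poll(); doc != null; doc = queue.poll()) {
            System.out.println(doc);
        }
    }
}
```

The sketch depends entirely on the store returning keys in sorted order. A store that offers only get and put by exact key would need some other structure to find the next entry, which is the difficulty the comment raises.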