[
https://issues.apache.org/jira/browse/S2GRAPH-50?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
DOYUNG YOON updated S2GRAPH-50:
-------------------------------
Description:
I think we should offer a choice between `Tall` and `Wide` row schemas for
IndexEdge. The key difference between the two is as follows.
# Wide. All adjacent edges are stored in a single row with wide columns, and we
use a get request to fetch them. This is how IndexEdge is currently stored.
# Tall. Adjacent edges are stored on multiple `consecutive` rows, and we use a
scanner to scan through them.
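As a rough illustration of the two layouts, here is a toy in-memory model in Python. It is only a sketch: the dict and sorted list stand in for an HBase table, and the names `wide_get`, `tall_scan`, and the `\x00` key separator are assumptions made for this example, not S2Graph's actual key encoding or API.

```python
from bisect import bisect_left

# Hypothetical sample data: (source vertex, target vertex) pairs.
edges = [("v1", "a"), ("v1", "b"), ("v1", "c"), ("v2", "x")]

# Wide: one row per source vertex, one column qualifier per adjacent edge.
wide = {}
for src, tgt in edges:
    wide.setdefault(src, {})[tgt] = b""

def wide_get(src):
    """Single point lookup ('get') that returns every column of one row."""
    return sorted(wide.get(src, {}))

# Tall: one row per edge. The row key is source + separator + target,
# so edges of the same source end up on consecutive rows.
SEP = "\x00"  # assumed separator, chosen only for this sketch
tall = sorted(src + SEP + tgt for src, tgt in edges)

def tall_scan(src):
    """Prefix scan: seek to the first row of the source, read until the prefix ends."""
    prefix = src + SEP
    i = bisect_left(tall, prefix)
    out = []
    while i < len(tall) and tall[i].startswith(prefix):
        out.append(tall[i][len(prefix):])
        i += 1
    return out
```

Both functions return the same adjacent edges; the difference is that the Wide layout answers with one row lookup while the Tall layout walks a contiguous range of rows.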
Once S2GRAPH-17 is resolved, I think the only thing we have to do is provide
`IndexEdgeSerializer` and `IndexEdgeDeserializer` for the Tall row schema on
HBase. I think this is a trivial task, since we already have all the primitives
for it.
The hard part would be changing the client interface.
Currently, queries support `offset` and `limit` for pagination. If we use a
scanner, there is no easy way to support `offset`.
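To make the `offset` problem concrete, here is a small Python sketch over a sorted list of row keys standing in for a Tall table. It is an illustration under assumptions, not the S2Graph client interface: `page_by_offset`, `page_by_cursor`, and the `\x00` separator are hypothetical names. A scanner can only start at a row key, so emulating `offset` means reading and discarding rows, whereas a cursor (the last key the client saw) lets the scan start in the right place.

```python
from bisect import bisect_left

SEP = "\x00"  # assumed separator, chosen only for this sketch
# Ten hypothetical edges of vertex "v1", one row each, in sorted order.
rows = sorted("v1" + SEP + "%03d" % i for i in range(10))

def scan(start_key, prefix):
    """Yield row keys >= start_key that share the prefix, in key order."""
    i = bisect_left(rows, start_key)
    while i < len(rows) and rows[i].startswith(prefix):
        yield rows[i]
        i += 1

def page_by_offset(src, offset, limit):
    """Emulate offset: read and discard `offset` rows, then keep `limit` rows."""
    prefix = src + SEP
    page, read = [], 0
    for key in scan(prefix, prefix):
        read += 1
        if read > offset:
            page.append(key[len(prefix):])
        if len(page) == limit:
            break
    return page, read  # `read` counts every row the scanner touched

def page_by_cursor(src, last_seen, limit):
    """Resume right after the last key seen, so no rows are discarded."""
    prefix = src + SEP
    # Appending a zero byte gives the smallest key strictly after last_seen.
    start = prefix if last_seen is None else prefix + last_seen + "\x00"
    page, read = [], 0
    for key in scan(start, prefix):
        read += 1
        page.append(key[len(prefix):])
        if len(page) == limit:
            break
    return page, read
```

In this toy model, fetching three edges at offset 6 touches nine rows, while resuming from the cursor `005` touches only three. The cost is that the client has to carry a cursor instead of a numeric offset, which is the interface change mentioned above.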
I think it is worth trying the Tall row schema and benchmarking it against the
Wide row schema. I also think this would be very beneficial for others who are
interested in implementing other storage backends such as RocksDB or LevelDB
(including myself).
I will follow up with benchmarks for both `Tall` and `Wide` rows, and then we
can decide which schema should be the default. What do others think?
> Provide new HBase Storage Schema
> --------------------------------
>
> Key: S2GRAPH-50
> URL: https://issues.apache.org/jira/browse/S2GRAPH-50
> Project: S2Graph
> Issue Type: New Feature
> Reporter: DOYUNG YOON
> Assignee: DOYUNG YOON
> Labels: benchmark, schema, serde
> Original Estimate: 168h
> Remaining Estimate: 168h