[
https://issues.apache.org/jira/browse/S2GRAPH-50?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
DOYUNG YOON updated S2GRAPH-50:
-------------------------------
Description:
I think we should offer a choice between `Tall` and `Wide` row schemas for
IndexEdge. The key difference between the two is as follows.
1. Wide.
Adjacent edges are stored in a single row with wide columns, and we use a get
request to fetch them. This is how IndexEdge is currently stored.
2. Tall.
Adjacent edges are stored on multiple `consecutive` rows, and we use a scanner
to scan through them.
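To make the two layouts concrete, here is a minimal in-memory sketch. It is not S2Graph or HBase code: the dict-based tables, the `SEP` separator, and all function names are assumptions made up for illustration. It only shows that a get on one wide row and a prefix scan over consecutive tall rows can return the same adjacent edges.

```python
from bisect import bisect_left

# Hypothetical in-memory stand-ins for an HBase table; not S2Graph code.
SEP = b"\x00"  # assumed separator between source vertex and edge qualifier


def wide_put(table, src, qualifier, value):
    # Wide: one row per source vertex, one column per adjacent edge.
    table.setdefault(src, {})[qualifier] = value


def wide_get(table, src):
    # A single get returns every column of the row, sorted by qualifier.
    return sorted(table.get(src, {}).items())


def tall_put(table, src, qualifier, value):
    # Tall: one row per edge; the row key is source vertex + qualifier.
    table[src + SEP + qualifier] = value


def tall_scan(table, src):
    # A scanner walks the sorted row keys that share the source prefix
    # and stops at the first key outside that prefix.
    prefix = src + SEP
    keys = sorted(table)
    start = bisect_left(keys, prefix)
    edges = []
    for key in keys[start:]:
        if not key.startswith(prefix):
            break
        edges.append((key[len(prefix):], table[key]))
    return edges
```

In this toy model the wide table holds one row per source vertex while the tall table holds one row per edge, which is the trade-off the benchmark would measure.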
Once S2GRAPH-17 is resolved, I think the only thing we have to do is provide an
`IndexEdgeSerializer` and `IndexEdgeDeserializer` for the Tall row schema on
HBase. I think this is a fairly trivial task, since we already have all the
primitives for it.
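As a rough illustration of what a Tall row serializer and deserializer pair could look like, here is a sketch under stated assumptions. The `IndexEdge` fields, the 17-byte key layout, and the function names are all hypothetical and do not reflect the actual S2Graph classes; the only point is that each edge maps to its own row key, and that big-endian unsigned fields keep the byte order of keys aligned with the numeric order of their values.

```python
import struct
from typing import NamedTuple


class IndexEdge(NamedTuple):
    # Hypothetical fields, chosen only for this sketch.
    src_id: int      # source vertex id
    index_seq: int   # which index this entry belongs to
    sort_value: int  # value the index is ordered by, e.g. a timestamp
    tgt_id: int      # target vertex id
    props: bytes     # opaque, already-encoded edge properties


# Assumed layout: 4-byte src, 1-byte index, 8-byte sort value, 4-byte target.
# ">" means big-endian with no padding, so byte-wise ordering of the packed
# key matches field-by-field numeric ordering.
_KEY = struct.Struct(">IBQI")


def serialize_tall(edge):
    # Tall schema: the whole edge identity goes into the row key.
    row_key = _KEY.pack(edge.src_id, edge.index_seq, edge.sort_value, edge.tgt_id)
    return row_key, edge.props


def deserialize_tall(row_key, value):
    # Inverse of serialize_tall: rebuild the edge from one scanned row.
    src_id, index_seq, sort_value, tgt_id = _KEY.unpack(row_key)
    return IndexEdge(src_id, index_seq, sort_value, tgt_id, value)
```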
The hard part would be changing the client interface.
Currently, queries support `offset` and `limit` for pagination. If we use a
scanner, there is no easy way to support `offset`.
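The pagination problem can be sketched as follows. This is an illustration over a plain sorted list of row keys, not HBase client code, and both function names are made up. The first function emulates `offset` on a scanner, which still has to read and discard the skipped rows. The second shows cursor-style paging (resuming after the last row key of the previous page), which is one possible alternative rather than anything decided in this issue, and it would require the client interface change mentioned above.

```python
from bisect import bisect_right
from itertools import islice


def page_by_offset(sorted_keys, offset, limit):
    # Emulating `offset` with a scanner: the first `offset` rows are
    # still read from the scan before being thrown away.
    scanned = list(islice(sorted_keys, offset + limit))
    return scanned[offset:], len(scanned)


def page_by_cursor(sorted_keys, last_key, limit):
    # Cursor alternative: start the scan just after the last row key
    # returned by the previous page, so only `limit` rows are read.
    start = 0 if last_key is None else bisect_right(sorted_keys, last_key)
    page = sorted_keys[start:start + limit]
    return page, len(page)
```

Both functions return the page together with the number of rows read, so the cost difference is visible: the offset variant reads `offset + limit` rows, the cursor variant reads at most `limit`.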
I think it is worth trying the Tall row schema and benchmarking it against the
Wide row schema. I also think this would be very beneficial for others who are
interested in implementing other storage backends such as RocksDB or LevelDB
(including myself).
I will follow up with a benchmark of both `Tall` and `Wide` rows, and then we
can decide which schema should be the default. What do others think?
> Provide new HBase Storage Schema
> --------------------------------
>
> Key: S2GRAPH-50
> URL: https://issues.apache.org/jira/browse/S2GRAPH-50
> Project: S2Graph
> Issue Type: New Feature
> Reporter: DOYUNG YOON
> Assignee: DOYUNG YOON
> Labels: benchmark, schema, serde
> Original Estimate: 168h
> Remaining Estimate: 168h