[jira] [Updated] (S2GRAPH-50) Provide new HBase Storage Schema

DOYUNG YOON (JIRA) Sun, 28 Feb 2016 21:55:58 -0800

     [ https://issues.apache.org/jira/browse/S2GRAPH-50?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


DOYUNG YOON updated S2GRAPH-50:
-------------------------------
    Description: 
I think we should offer a choice between `Tall` and `Wide` row schemas for 
IndexEdge. The key difference between the two is as follows.

1. Wide. 
Adjacent edges are stored in a single row as wide columns, and a get request 
retrieves them. This is how IndexEdge is currently stored.

2. Tall. 
Adjacent edges are stored in multiple `consecutive` rows, and a scanner scans 
through them. 
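
To make the two layouts concrete, here is a minimal sketch using a toy in-memory sorted table in place of HBase. The `SortedTable` class, the `"<source>|<target>"` key layout, and the vertex names are all assumptions made for illustration; they are not S2Graph's actual storage format.

```python
import bisect


class SortedTable:
    """Toy stand-in for an HBase table: rows are kept sorted by row key."""

    def __init__(self):
        self._keys = []   # sorted row keys
        self._rows = {}   # row key -> {column qualifier: value}

    def put(self, row, column, value):
        if row not in self._rows:
            bisect.insort(self._keys, row)
            self._rows[row] = {}
        self._rows[row][column] = value

    def get(self, row):
        """Point lookup of one row; columns come back sorted by qualifier."""
        return sorted(self._rows.get(row, {}).items())

    def scan(self, prefix):
        """Range scan over consecutive rows that share a key prefix."""
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            yield key, self._rows[key]


edges = [("a", "b"), ("a", "c"), ("a", "d"), ("z", "a")]

wide = SortedTable()
tall = SortedTable()
for src, tgt in edges:
    # Wide: one row per source vertex, one column per adjacent edge.
    wide.put(src, tgt, b"")
    # Tall: one row per edge, with the source vertex as the key prefix.
    tall.put(src + "|" + tgt, "e", b"")

# Wide reads adjacent edges with a single get.
wide_neighbors = [col for col, _ in wide.get("a")]
# Tall reads adjacent edges by scanning the consecutive rows.
tall_neighbors = [key.split("|", 1)[1] for key, _ in tall.scan("a|")]
```

Both reads return the same neighbors; what differs is whether the edges live in the columns of one row or in the keys of many adjacent rows.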

Once S2GRAPH-17 is resolved, I think the only thing we have to do is provide 
`IndexEdgeSerializer` and `IndexEdgeDeserializer` for the Tall row schema on 
HBase. I think this is a fairly trivial task, since we already have all the 
primitives for it. 
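
As a rough illustration of what a Tall row serializer and deserializer pair has to do, the sketch below packs an edge into a row key and unpacks it again. The field layout (source id, label id, target id), the field widths, and the function names are hypothetical; the real `IndexEdgeSerializer` and `IndexEdgeDeserializer` would follow S2Graph's own encoding.

```python
import struct

# Hypothetical Tall row key layout (not S2Graph's real format):
#   source id (8 bytes) | label id (4 bytes) | target id (8 bytes)
# Big-endian unsigned fields keep byte-wise key order equal to numeric order,
# so all edges of one source vertex and label land on consecutive rows.
_KEY = struct.Struct(">QIQ")
_PREFIX = struct.Struct(">QI")


def serialize_tall_row_key(src_id, label_id, tgt_id):
    """Encode one edge as a fixed-width, sortable row key."""
    return _KEY.pack(src_id, label_id, tgt_id)


def deserialize_tall_row_key(row_key):
    """Decode a row key back into (src_id, label_id, tgt_id)."""
    return _KEY.unpack(row_key)


def scan_prefix(src_id, label_id):
    """Key prefix shared by every edge of one source vertex and label."""
    return _PREFIX.pack(src_id, label_id)
```

A scanner started at `scan_prefix(src_id, label_id)` would then visit exactly the rows for that vertex and label, in target-id order under this assumed layout.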

The hard part would be changing the client interface.

Currently, queries support `offset` and `limit` for pagination. If we use a 
scanner, there is no easy way to support `offset`. 
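
The pagination problem can be sketched as follows. With a scanner, `offset` can only be emulated by reading and discarding rows, so the cost grows with the offset. One possible alternative, which is an assumption here and not something this issue proposes, is cursor-style paging that resumes from the last row key seen. The key layout and function names are hypothetical.

```python
import bisect

# Sorted row keys of a hypothetical Tall table: "<source>|<target>".
ROW_KEYS = sorted(f"a|{i:03d}" for i in range(10)) + ["b|000"]


def scan(prefix, start_after=None):
    """Yield row keys with the given prefix, optionally resuming after a key."""
    if start_after is None:
        start = bisect.bisect_left(ROW_KEYS, prefix)
    else:
        start = bisect.bisect_right(ROW_KEYS, start_after)
    for key in ROW_KEYS[start:]:
        if not key.startswith(prefix):
            break
        yield key


def page_by_offset(prefix, offset, limit):
    """Emulate `offset` on a scanner by discarding the first `offset` rows."""
    rows_read = 0
    page = []
    for key in scan(prefix):
        rows_read += 1
        if rows_read > offset:
            page.append(key)
        if len(page) == limit:
            break
    return page, rows_read


def page_by_cursor(prefix, cursor, limit):
    """Resume after the last key of the previous page; nothing is discarded."""
    page = []
    for key in scan(prefix, start_after=cursor):
        page.append(key)
        if len(page) == limit:
            break
    next_cursor = page[-1] if page else None
    return page, next_cursor


offset_page, rows_read = page_by_offset("a|", offset=6, limit=3)
cursor_page, next_cursor = page_by_cursor("a|", cursor="a|005", limit=3)
```

Both calls return the same three rows, but the offset version reads nine rows to produce them. Cursor paging avoids that cost, at the price of a different client interface, which is the hard part noted above.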

I think it is worth trying the Tall row schema and benchmarking it against the 
Wide row schema. I also think this would be very beneficial for others who are 
interested in implementing other storage backends such as RocksDB or LevelDB 
(including myself). 

I will follow up with benchmarks on both `Tall` and `Wide` rows; then we can 
decide which schema should be the default. What do others think? 


> Provide new HBase Storage Schema
> --------------------------------
>
>                 Key: S2GRAPH-50
>                 URL: https://issues.apache.org/jira/browse/S2GRAPH-50
>             Project: S2Graph
>          Issue Type: New Feature
>            Reporter: DOYUNG YOON
>            Assignee: DOYUNG YOON
>              Labels: benchmark, schema, serde
>   Original Estimate: 168h
>  Remaining Estimate: 168h



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
