Re: hbase rowkey design

2016-05-16 Thread Heng Chen
In my company, we calculate UV/PV offline in batch and update the results every day. If you do it online, url + timestamp could be the rowkey.
2016-05-16 18:13 GMT+08:00 齐忠:
> Yes, like Google Analytics.
>
> 2016-05-16 17:48 GMT+08:00 Heng Chen:
> > You want
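A minimal sketch of the "url + timestamp" rowkey suggested above, in Python. The `\x00` separator, the 8-byte big-endian timestamp encoding, and the function names `make_rowkey` and `split_rowkey` are assumptions made for illustration; the thread does not specify any of them.

```python
import struct
from typing import Tuple

# Assumption: a zero byte separates the URL from the timestamp. It sorts
# before every printable character, so all rows of one URL stay together
# even when another URL starts with the same characters.
SEPARATOR = b"\x00"


def make_rowkey(url: str, request_time: int) -> bytes:
    """Build a rowkey of the form url + separator + fixed-width timestamp."""
    # Big-endian unsigned 64-bit, so byte order matches numeric order.
    return url.encode("utf-8") + SEPARATOR + struct.pack(">Q", request_time)


def split_rowkey(rowkey: bytes) -> Tuple[str, int]:
    """Recover (url, request_time) from a key built by make_rowkey."""
    url_bytes = rowkey[:-9]   # everything before the separator
    ts_bytes = rowkey[-8:]    # the trailing 8-byte timestamp
    (request_time,) = struct.unpack(">Q", ts_bytes)
    return url_bytes.decode("utf-8"), request_time


if __name__ == "__main__":
    early = make_rowkey("http://www.aaa.com?a=b=d=f", 1463387280)
    late = make_rowkey("http://www.aaa.com?a=b=d=f", 1463387380)
    # Rows for the same URL sort in time order.
    print(early < late)
    print(split_rowkey(late))
```

With this layout, a scan over one URL prefix returns its events in chronological order. Note that two events with the same URL and the same second (the sample log in this thread has such a pair with visitid 1 and 2) would map to the same key, so a real design would probably need an extra component such as the visitid.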

Re: hbase rowkey design

2016-05-16 Thread 齐忠
Yes, like Google Analytics.
2016-05-16 17:48 GMT+08:00 Heng Chen:
> You want to calculate UV/PV online?
>
> 2016-05-16 16:46 GMT+08:00 齐忠:
>> I have a very large log (50T per day).
>>
>> My log events are as follows:
>>
>> url,visitid,requesttime

Re: hbase rowkey design

2016-05-16 Thread Heng Chen
You want to calculate UV/PV online?
2016-05-16 16:46 GMT+08:00 齐忠:
> I have a very large log (50T per day).
>
> My log events are as follows:
>
> url,visitid,requesttime
>
> http://www.aaa.com?a=b=d=f, 1, 1463387380
> http://www.aaa.com?a=b=d=fa, 1, 1463387280

hbase rowkey design

2016-05-16 Thread 齐忠
I have a very large log (50T per day).
My log events are as follows:
url,visitid,requesttime
http://www.aaa.com?a=b=d=f, 1, 1463387380
http://www.aaa.com?a=b=d=fa, 1, 1463387280
http://www.aaa.com?a=b=d=fa, 2, 1463387280
http://www.aaa.com?a=b=d=fab, 2, 1463387280
http://www.aaa.com?a=b=d=f, 1,
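To make the log format concrete, here is a small Python sketch that parses events of the form `url,visitid,requesttime` and counts page views (PV) and unique visitors (UV) per URL, the metrics discussed in the replies to this message. It uses only the four complete sample events from the message; the function names `parse_event` and `pv_uv` are hypothetical, and treating `visitid` as the visitor identity is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# The four complete sample events from the message (the fifth is truncated).
SAMPLE_EVENTS = [
    "http://www.aaa.com?a=b=d=f, 1, 1463387380",
    "http://www.aaa.com?a=b=d=fa, 1, 1463387280",
    "http://www.aaa.com?a=b=d=fa, 2, 1463387280",
    "http://www.aaa.com?a=b=d=fab, 2, 1463387280",
]


def parse_event(line: str) -> Tuple[str, str, int]:
    """Split one log line into (url, visitid, requesttime)."""
    # Split from the right so a comma inside the URL does not break parsing.
    url, visit_id, request_time = (f.strip() for f in line.rsplit(",", 2))
    return url, visit_id, int(request_time)


def pv_uv(lines: Iterable[str]) -> Dict[str, Tuple[int, int]]:
    """Return {url: (page_views, unique_visitors)} for the given lines."""
    views = defaultdict(int)
    visitors = defaultdict(set)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        url, visit_id, _ = parse_event(line)
        views[url] += 1
        # Assumption: visitid identifies a visitor.
        visitors[url].add(visit_id)
    return {url: (views[url], len(visitors[url])) for url in views}


if __name__ == "__main__":
    for url, (pv, uv) in sorted(pv_uv(SAMPLE_EVENTS).items()):
        print(url, "PV =", pv, "UV =", uv)
```

This in-memory version only illustrates the computation. At 50T per day the same grouping would have to run as a distributed batch or streaming job.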

Re: hbase rowkey design ways

2015-08-23 Thread Ted Yu
Have you read the following?
http://hbase.apache.org/book.html#rowkey.design
Cheers
On Sun, Aug 23, 2015 at 8:01 AM, jackiehbaseuser jackiehbaseu...@126.com wrote:
> Hi,
> How many ways are there to design an HBase rowkey? Please give some examples.
> Thank you very much!
> Best regards!
> qiguo

Hbase RowKey design schema

2013-08-29 Thread Wasim Karani
I am using HBase to store webtable content, the way Google uses Bigtable. For reference, see the Google Bigtable PDF document. My question is about the RowKey and how we should form it. What Google does is save the URL in reverse order, as you can see in the PDF document (com.cnn.www), so that all the links
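The reversed form mentioned above (`com.cnn.www`) can be sketched in Python as follows. Only the reversal of the hostname labels comes from the message; appending the path and query after the reversed host, and the function names `reverse_host` and `webtable_rowkey`, are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def reverse_host(host: str) -> str:
    """Reverse the dot-separated labels: www.cnn.com -> com.cnn.www."""
    return ".".join(reversed(host.lower().split(".")))


def webtable_rowkey(url: str) -> str:
    """Build a rowkey from the reversed hostname plus path and query."""
    parts = urlsplit(url)
    key = reverse_host(parts.hostname or "")
    # Assumption: the rest of the URL follows the reversed host unchanged.
    key += parts.path
    if parts.query:
        key += "?" + parts.query
    return key


if __name__ == "__main__":
    urls = [
        "http://www.cnn.com/index.html",
        "http://www.bbc.co.uk/",
        "http://money.cnn.com/markets",
    ]
    # Sorted keys place pages from the same domain next to each other.
    for key in sorted(webtable_rowkey(u) for u in urls):
        print(key)
```

Because HBase stores rows in sorted key order, keys built this way keep pages from the same domain (here the two cnn.com hosts) in adjacent rows, which is what makes per-domain scans cheap.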

Re: Hbase RowKey design schema

2013-08-29 Thread Shahab Yunus
What advantage will you gain by compressing? Less space? But then it will add compression/decompression performance overhead. A trade-off, but an especially significant one, as space is cheap and redundancy is OK with such data stores. Having said that, more importantly, what are your read