Hi,
That was a sample use case to illustrate my question. This is my actual
use case.
Visits to our website are recorded as logs. We process these logs with
Apache Pig and insert Pig's output directly into an HBase table (test)
using HBaseStorage. This runs every morning. The data consists of the
following columns:
Customerid | Name | visitedurl | timestamp | location | companyname
I have only one column family (test_family).
For now I generate a random number for each row and insert it as the row
key for that table. For example, I have the following data to insert:
1725 | xxx | www.something.com | 127987834 | india | zzzz
1726 | yyy | www.some.com | 128389478 | UK | yyyy
In that case I use 1 as the row key for the first row, 2 for the second,
and so on.
Note: the same customer id is repeated on different days, so I chose a
random number as the row key.
When I query the table with
scan 'test', {FILTER=>"SingleColumnValueFilter('test_family','Customerid',=,'binary:1002')"}
it takes more than 2 minutes to return the results.
Please suggest a way to bring this down to 1 to 2 seconds, since I am
using it for real-time analytics.
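
To make the question concrete, here is a minimal sketch of the idea raised
earlier in this thread: put the customer id at the front of the row key and
read by prefix. It is plain Python with no HBase client. The names
make_row_key and prefix_scan, the "-" separator, and the zero-padded widths
are all assumptions for illustration, and the sorted list only stands in
for the way HBase keeps rows ordered by row key.

```python
from bisect import bisect_left

SEP = "-"  # assumed separator, not something fixed by HBase


def make_row_key(customer_id: int, timestamp: int) -> str:
    """Hypothetical composite key: zero-padded customer id, then timestamp.

    Zero padding keeps string order consistent with numeric order.
    """
    return f"{customer_id:010d}{SEP}{timestamp:013d}"


def prefix_scan(sorted_keys, customer_id):
    """Return every key for one customer from a list sorted by key.

    Jumps to the first key with the customer's prefix and stops at the
    first key without it, instead of checking every row.
    """
    prefix = f"{customer_id:010d}{SEP}"
    start = bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches


# The two sample rows from this mail, plus one made-up later visit by
# customer 1725 to show repeated ids on different days.
rows = sorted([
    make_row_key(1725, 127987834),
    make_row_key(1726, 128389478),
    make_row_key(1725, 128400000),  # made-up timestamp
])
hits = prefix_scan(rows, 1725)
```

With keys like these, a read for one customer would be a scan bounded by
the row-key prefix rather than a SingleColumnValueFilter evaluated against
every row of the table. Whether that reaches 1 to 2 seconds on my data is
something I would still have to measure.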
Thanks
On Tue, Dec 1, 2015 at 3:40 PM, Heng Chen <[email protected]> wrote:
> So, maybe we can use 1212 + customerId as rowKey.
> btw, what is 1212 used for?
>
> 2015-12-01 17:49 GMT+08:00 Rajeshkumar J <[email protected]>:
>
> > Hi chen,
> >
> > yes I have customerid column to represent each customers
> >
> >
> >
> > On Tue, Dec 1, 2015 at 3:11 PM, Heng Chen <[email protected]>
> > wrote:
> >
> > > Hm.., is there anything unique like userId to represent one people?
> > >
> > >
> > > 2015-12-01 16:33 GMT+08:00 Rajeshkumar J <[email protected]>:
> > >
> > > > Is there any other way to store only the id? There may be new
> > > > rows with the same name, like
> > > >
> > > > 1212 | xxxx | 20
> > > > 1212 | yyyy | 21
> > > > 1212 | xxxx | 22
> > > >
> > > >
> > > > On Tue, Dec 1, 2015 at 1:59 PM, Heng Chen <[email protected]>
> > > > wrote:
> > > >
> > > > > Yeah, if you want to get all records about 1212, just scan rows
> > > > > with prefix 1212
> > > > >
> > > > > 2015-12-01 16:27 GMT+08:00 Rajeshkumar J <[email protected]>:
> > > > >
> > > > > > so you want me to design row-key value by appending name column
> > > > > > value to the rowkey
> > > > > >
> > > > > > On Tue, Dec 1, 2015 at 1:19 PM, Heng Chen <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > So, why not
> > > > > > >
> > > > > > > 1212-xxx 20
> > > > > > > 1212-yyy 21
> > > > > > > 1212-zzz 22
> > > > > > >
> > > > > > > 2015-12-01 15:33 GMT+08:00 Rajeshkumar J <[email protected]>:
> > > > > > >
> > > > > > > > Hi
> > > > > > > >
> > > > > > > > I meant like below is this possible
> > > > > > > >
> > > > > > > > Rowkey | column family
> > > > > > > >
> > > > > > > > Name | Age
> > > > > > > >
> > > > > > > > 1212 | xxxx | 20
> > > > > > > > 1212 | yyyy | 21
> > > > > > > > 1212 | zzzz | 22
> > > > > > > >
> > > > > > > > On Tue, Dec 1, 2015 at 12:03 PM, Heng Chen <[email protected]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > why not
> > > > > > > > >
> > > > > > > > > 1212 | 10, 11, 12, 13, 14, 15, 16, 27, 28 ?
> > > > > > > > >
> > > > > > > > > 2015-12-01 14:29 GMT+08:00 Rajeshkumar J <[email protected]>:
> > > > > > > > >
> > > > > > > > > > Hi Ted,
> > > > > > > > > >
> > > > > > > > > > This is my use case. I have to store values like this.
> > > > > > > > > > Is it possible?
> > > > > > > > > >
> > > > > > > > > > RowKey | Values
> > > > > > > > > >
> > > > > > > > > > 1212 | 10,11,12
> > > > > > > > > >
> > > > > > > > > > 1212 | 13, 14, 15
> > > > > > > > > >
> > > > > > > > > > 1212 | 16,27,28
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Mon, Nov 30, 2015 at 10:40 PM, Ted Yu <[email protected]>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Have you read http://hbase.apache.org/book.html#rowkey.design ?
> > > > > > > > > > >
> > > > > > > > > > > bq. we can store more than one row for a row-key value.
> > > > > > > > > > >
> > > > > > > > > > > Can you clarify your intention / use case ? If row key is
> > > > > > > > > > > the same, key values would be in the same row.
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Nov 30, 2015 at 8:30 AM, Rajeshkumar J <[email protected]>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi,
> > > > > > > > > > > >
> > > > > > > > > > > > I am new to Apache HBase. I understand that when we
> > > > > > > > > > > > insert a row key that is already present in a table,
> > > > > > > > > > > > the new value is either discarded or the existing
> > > > > > > > > > > > value is updated. I also came across row versions,
> > > > > > > > > > > > through which we can store different versions of a
> > > > > > > > > > > > row key based on timestamp. Please correct me if I am
> > > > > > > > > > > > wrong. I also need to know whether there is any way
> > > > > > > > > > > > we can store more than one row for a row-key value.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks