Hi

That was only a sample use case to illustrate my question. Here is my
actual use case.

Visits by customers to our website are recorded as logs. We process these
logs every morning with Apache Pig, and the Pig output is inserted directly
into an HBase table (test) using HBaseStorage. The data consists of the
following columns:

Customerid | Name | visitedurl | timestamp | location | companyname

I have only one column family (test_family)

As of now I generate a random number for each row and use it as the row
key for that table. For example, I have the following data to be inserted
into the table:

1725 | xxx | www.something.com | 127987834 | india | zzzz
1726 | yyy | www.some.com | 128389478 | UK | yyyy

In that case I will use 1 as the row key for the first row, 2 for the
second one, and so on.

Note: the same id will be repeated on different days, so I chose a random
number as the row key.
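
One alternative to a random number, following the suggestion in the quoted
thread to put the customer id in the row key, is a composite key of customer
id plus visit timestamp. The sketch below is only an illustration in Python,
not HBase client code: make_row_key is a hypothetical helper, and the
10-digit zero padding and the "-" separator are my assumptions. The same id
on different days then still gives distinct keys, although two visits by one
customer in the same second would collide.

```python
# Hypothetical helper: build a composite row key from the customer id
# and the visit timestamp (both taken from the sample data above).
def make_row_key(customer_id: int, timestamp: int) -> bytes:
    # Zero-pad both parts (assumed width: 10 digits) so that byte-wise
    # ordering of the keys matches numeric ordering of the values.
    return f"{customer_id:010d}-{timestamp:010d}".encode("ascii")


# Sample rows: Customerid, Name, visitedurl, timestamp, location, companyname
rows = [
    (1725, "xxx", "www.something.com", 127987834, "india", "zzzz"),
    (1726, "yyy", "www.some.com", 128389478, "UK", "yyyy"),
]

# One key per row; the customer id is field 0 and the timestamp is field 3.
keys = [make_row_key(row[0], row[3]) for row in rows]
```

Because the customer id comes first, all visits of one customer end up next
to each other in key order.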

When querying data from the table, I use

scan 'test',
{FILTER=>"SingleColumnValueFilter('test_family','Customerid',=,'binary:1002')"}

and it takes more than 2 minutes to return the results.

Please suggest a way to bring this down to 1 to 2 seconds, since I am
using it for real-time analytics.
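
As far as I understand, a scan with a SingleColumnValueFilter has to examine
every row in the table, while rows are stored sorted by row key. The Python
sketch below simulates that difference on an in-memory sorted list. It is not
HBase code, and filter_scan, prefix_scan and the key layout (zero-padded
customer id, "-", timestamp) are hypothetical names and assumptions used only
for illustration.

```python
import bisect


def filter_scan(table, customer_id):
    """Check the Customerid value of every row, like a full scan with a value filter."""
    return [key for key, row in table if row["Customerid"] == customer_id]


def prefix_scan(keys, prefix):
    """Binary-search to the first key with the prefix, then stop at the first mismatch."""
    start = bisect.bisect_left(keys, prefix)
    matches = []
    for key in keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        matches.append(key)
    return matches


# (customer id, timestamp) pairs; 1002 is the id from the query above.
visits = [(1002, 127987834), (1725, 127987834), (1002, 128389478), (1726, 128389478)]

# Simulated table: rows sorted by an assumed composite row key.
table = sorted(
    ((f"{cid:010d}-{ts:010d}".encode("ascii"), {"Customerid": str(cid)})
     for cid, ts in visits),
    key=lambda entry: entry[0],
)
keys = [key for key, _ in table]

by_filter = filter_scan(table, "1002")            # touches all 4 rows
by_prefix = prefix_scan(keys, b"0000001002-")     # touches only the matching range
```

Both calls return the same two keys, but the prefix version only reads the
contiguous range for that customer. With such a key layout, the corresponding
HBase shell scan would presumably be bounded with STARTROW and STOPROW
instead of a value filter.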

Thanks

On Tue, Dec 1, 2015 at 3:40 PM, Heng Chen <[email protected]> wrote:

> So, maybe we can use 1212 + customerId as rowKey.
> btw, what is 1212 used for?
>
> 2015-12-01 17:49 GMT+08:00 Rajeshkumar J <[email protected]>:
>
> > Hi chen,
> >
> > Yes, I have a customerid column that identifies each customer.
> >
> >
> >
> > On Tue, Dec 1, 2015 at 3:11 PM, Heng Chen <[email protected]>
> > wrote:
> >
> > > Hm, is there anything unique, like a userId, that identifies one person?
> > >
> > >
> > > 2015-12-01 16:33 GMT+08:00 Rajeshkumar J <[email protected]>:
> > >
> > > > Is there any other way to store only the id? There may be new rows
> > > > with the same name, like
> > > >
> > > > 1212  |   xxxx | 20
> > > > 1212  | yyyy  |  21
> > > > 1212  | xxxx | 22
> > > >
> > > >
> > > > On Tue, Dec 1, 2015 at 1:59 PM, Heng Chen <[email protected]>
> > > > wrote:
> > > >
> > > > > Yeah, if you want to get all records about 1212, just scan rows
> > > > > with prefix 1212
> > > > >
> > > > > 2015-12-01 16:27 GMT+08:00 Rajeshkumar J <[email protected]>:
> > > > >
> > > > > > So you want me to design the row key by appending the name
> > > > > > column value to the row key?
> > > > > >
> > > > > > On Tue, Dec 1, 2015 at 1:19 PM, Heng Chen <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > So, why not
> > > > > > >
> > > > > > > 1212-xxx    20
> > > > > > > 1212-yyy    21
> > > > > > > 1212-zzz    22
> > > > > > >
> > > > > > > 2015-12-01 15:33 GMT+08:00 Rajeshkumar J <[email protected]>:
> > > > > > >
> > > > > > > > Hi
> > > > > > > >
> > > > > > > >   I meant something like the following. Is this possible?
> > > > > > > >
> > > > > > > > Rowkey | column family
> > > > > > > >
> > > > > > > >                Name | Age
> > > > > > > >
> > > > > > > > 1212     |   xxxx | 20
> > > > > > > > 1212     |  yyyy | 21
> > > > > > > > 1212  | zzzz | 22
> > > > > > > >
> > > > > > > > On Tue, Dec 1, 2015 at 12:03 PM, Heng Chen <[email protected]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > why not
> > > > > > > > >
> > > > > > > > > 1212 | 10, 11, 12, 13, 14, 15, 16, 27,  28 ?
> > > > > > > > >
> > > > > > > > > 2015-12-01 14:29 GMT+08:00 Rajeshkumar J <[email protected]>:
> > > > > > > > >
> > > > > > > > > > Hi Ted,
> > > > > > > > > >
> > > > > > > > > >   This is my use case. I have to store values like this. Is it
> > > > > > > > > > possible?
> > > > > > > > > >
> > > > > > > > > > RowKey | Values
> > > > > > > > > >
> > > > > > > > > > 1212   | 10,11,12
> > > > > > > > > >
> > > > > > > > > > 1212  | 13, 14, 15
> > > > > > > > > >
> > > > > > > > > > 1212  | 16,27,28
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Mon, Nov 30, 2015 at 10:40 PM, Ted Yu <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > > Have you read http://hbase.apache.org/book.html#rowkey.design ?
> > > > > > > > > > >
> > > > > > > > > > > bq. we can store more than one row for a row-key value.
> > > > > > > > > > >
> > > > > > > > > > > Can you clarify your intention / use case? If the row key is the
> > > > > > > > > > > same, key values would be in the same row.
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Nov 30, 2015 at 8:30 AM, Rajeshkumar J <
> > > > > > > > > > > [email protected]>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi,
> > > > > > > > > > > >
> > > > > > > > > > > >   I am new to Apache HBase. I know that when we try to insert a
> > > > > > > > > > > > row key value that is already present in a table, the new value is
> > > > > > > > > > > > either discarded or updated. I also came across row versions,
> > > > > > > > > > > > through which we can store different versions of a row key based
> > > > > > > > > > > > on timestamp. Can anyone correct me if I am wrong? I also need to
> > > > > > > > > > > > know whether there is any way to store more than one row for a
> > > > > > > > > > > > row-key value.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
