Hi Jean,

Please find my reply inline.

On Tue, Jan 29, 2013 at 1:40 PM, Jean-Marc Spaggiari <[email protected]> wrote:

> Hi Anil,
>
> I think it really depends on the way you want to use the pagination.
>
Absolutely true!

>
> Do you need to be able to jump to page X? Are you ok if you miss a
> line or 2? Is your data growing fast? Or slowly? Is it ok if your
> page indexes are a day old? Do you need to paginate over 300 columns?
> Or just 1? Do you need to always have the exact same number of entries
> in each page?
>
No, I don't need to be able to jump to page X.
I don't think that missing lines will be acceptable.
I need to filter the rows on non-rowkey attributes.
It won't be ok if my page indexes are 1 day old.
I need to paginate on the basis of various filters based on columns and/or
the rowkey, so the number of combinations is quite large.

>
> For my usecase I need to be able to jump to the page X and I don't
> have any content. I have hundreds of millions of lines. Only the rowkey
> matters for me and I'm fine if sometimes I have 50 entries displayed,
> and sometimes only 45. So I'm thinking about calculating which row is
> the first one for each page, and storing that separately. Then I just
> need to run the MR daily.
>
Hmm, yeah, it might work for you.

>
> It's not a perfect solution I agree, but this might do the job for me.
> I'm totally open to all other ideas which might do the job too.
>
There is no such thing as a "perfect" solution. If the implementation is
able to fulfill your business needs, then go for it.

>
> JM
>
> 2013/1/29, anil gupta <[email protected]>:
> > Yes, your suggested solution only works on RowKey based pagination. It will
> > fail when you start filtering on the basis of columns.
> >
> > Still, I would say it's comparatively easier to maintain this at
> > Application level rather than creating tables for pagination.
> >
> > What if you have 300 columns in your schema? Will you create 300 tables?
> > What about handling of pagination when filtering is done based on multiple
> > columns ("and" and "or" conditions)?
> >
> > On Tue, Jan 29, 2013 at 1:08 PM, Jean-Marc Spaggiari <[email protected]> wrote:
> >
> >> No, no killer solution here ;)
> >>
> >> But I'm still thinking about that because I might have to implement
> >> some pagination options soon...
> >>
> >> As you are saying, it's only working on the row-key, but if you want
> >> to do the same thing on non-rowkey, you might have to create a
> >> secondary index table...
> >>
> >> JM
> >>
> >> 2013/1/27, anil gupta <[email protected]>:
> >> > That's alright. I thought that you had come up with a killer solution,
> >> > so I got curious to hear your ideas. ;)
> >> > It seems like your below mentioned solution will not work on filtering on
> >> > non row-key columns, since when you are deciding the page numbers you are
> >> > only considering the rowkey.
> >> >
> >> > Thanks,
> >> > Anil
> >> >
> >> > On Fri, Jan 25, 2013 at 6:58 PM, Jean-Marc Spaggiari <[email protected]> wrote:
> >> >
> >> >> Hi Anil,
> >> >>
> >> >> I don't have a solution. I never thought about that ;) But I was
> >> >> thinking about something like you create a 2nd table where you place
> >> >> the row number (4 bytes) then the row key. To go directly to a
> >> >> specific page, you query by the number, find the key, and you know
> >> >> where to start your scan in the main table.
> >> >>
> >> >> The issue is probably the number for each line, since with a MR you
> >> >> don't know where you are from the beginning. But you can build
> >> >> something where you store the line number from the beginning of the
> >> >> region, then when all regions are parsed you can reconstruct the total
> >> >> numbering... That should work...
> >> >>
> >> >> JM
> >> >>
> >> >> 2013/1/25, anil gupta <[email protected]>:
> >> >> > Inline...
> >> >> >
> >> >> > On Fri, Jan 25, 2013 at 9:17 AM, Jean-Marc Spaggiari <[email protected]> wrote:
> >> >> >
> >> >> >> Hi Anil,
> >> >> >>
> >> >> >> The issue is that all the other subsequent page starts should be moved
> >> >> >> too...
> >> >> >>
> >> >> > Yes, this is a possibility. Hence the developer has to take care of this
> >> >> > case. It might also be possible that the pageSize is not a hard limit on
> >> >> > the number of results (more like a hint or suggestion on size). I would
> >> >> > say it varies by use case.
> >> >> >
> >> >> >>
> >> >> >> so if you want to jump directly to page n, you might be totally
> >> >> >> shifted because of all the data inserted in the meantime...
> >> >> >>
> >> >> >> If you want a real complete pagination feature, you might want to have
> >> >> >> a coprocessor or a MR updating another table referring to the
> >> >> >> pages....
> >> >> >>
> >> >> > Well, the solution depends on the use case. I will be doing pagination in
> >> >> > HBase for a RESTful service, but till now I am unable to find any reason
> >> >> > why this can't be done at application level.
> >> >> > Are you suggesting to use MR for paging in HBase? If yes, how?
> >> >> > How would you use another table for pagination? What would you store in
> >> >> > the extra table?
> >> >> >
> >> >> >>
> >> >> >> JM
> >> >> >>
> >> >> >> 2013/1/25, anil gupta <[email protected]>:
> >> >> >> > Hi Vijay,
> >> >> >> >
> >> >> >> > I've done paging in HBase by using Scan only (no pagination filter) as
> >> >> >> > Mohammad has explained. However, it was just experimental stuff. It
> >> >> >> > works, but Jean raised a very good point.
> >> >> >> > Find my answer inline to fix the problem that Jean reported.
> >> >> >> >
> >> >> >> > On Fri, Jan 25, 2013 at 4:38 AM, Jean-Marc Spaggiari <[email protected]> wrote:
> >> >> >> >
> >> >> >> >> Hi Vijay,
> >> >> >> >>
> >> >> >> >> If, while the user is scrolling forward, you store the key of each
> >> >> >> >> page, then you will be able to go back to a specific page, and jump
> >> >> >> >> forward back up to where he was.
> >> >> >> >>
> >> >> >> >> The only issue is that, if while the user is scrolling the table,
> >> >> >> >> someone inserts a row between the last of a page, and the first of the
> >> >> >> >> next page, you will never see this row.
> >> >> >> >>
> >> >> >> >> Let's take this example.
> >> >> >> >>
> >> >> >> >> You have 10 items per page.
> >> >> >> >>
> >> >> >> >> 010 020 030 040 050 060 070 080 090 100 is the first page.
> >> >> >> >> 110 120 130 140 150 160 170 180 190 200 is the second one.
> >> >> >> >>
> >> >> >> >> Now, if someone inserts 101... It will be just after 100 and before
> >> >> >> >> 110.
> >> >> >> >>
> >> >> >> > Anil: Instead of scanning from 010 to 100, scan from 010 to 110. Then we
> >> >> >> > won't have this problem. In other words, use
> >> >> >> > startRow(firstRowKeyofPage(N)) and stopRow(firstRowKeyofPage(N+1)). This
> >> >> >> > would fix it. Also, in that case the number of results might exceed the
> >> >> >> > pageSize, so you might need to handle this logic.
> >> >> >> >
> >> >> >> >>
> >> >> >> >> When you display 10 rows starting at 010 you will stop just
> >> >> >> >> before 101... And for the next page you will start at 110... And 101
> >> >> >> >> will never be displayed...
> >> >> >> >>
> >> >> >> >> HTH
> >> >> >> >>
> >> >> >> >> JM
> >> >> >> >>
> >> >> >> >> 2013/1/25, Mohammad Tariq <[email protected]>:
> >> >> >> >> > Hello sir,
> >> >> >> >> >
> >> >> >> >> > While paging through, store the startkey of the current page of 25 rows
> >> >> >> >> > in a separate byte[]. Now, if you want to come back to this page when you
> >> >> >> >> > are at the next page do a range query where startkey would be the rowkey
> >> >> >> >> > you had stored earlier and the endkey would be the startrowkey of current
> >> >> >> >> > page. You have to store just one rowkey each time you show a page using
> >> >> >> >> > which you could come back to this page when you are at the next page.
> >> >> >> >> >
> >> >> >> >> > However, this approach will fail in a case where your user would like to go
> >> >> >> >> > to a particular previous page.
> >> >> >> >> >
> >> >> >> >> > Warm Regards,
> >> >> >> >> > Tariq
> >> >> >> >> > https://mtariq.jux.com/
> >> >> >> >> > cloudfront.blogspot.com
> >> >> >> >> >
> >> >> >> >> > On Fri, Jan 25, 2013 at 10:28 AM, Vijay Ganesan <[email protected]> wrote:
> >> >> >> >> >
> >> >> >> >> >> I'm displaying rows of data from a HBase table in a data grid UI. The grid
> >> >> >> >> >> shows 25 rows at a time i.e. it is paginated. User can click on
> >> >> >> >> >> Next/Previous to paginate through the data 25 rows at a time.
> >> >> >> >> >> I can implement Next easily by setting a HBase
> >> >> >> >> >> org.apache.hadoop.hbase.filter.PageFilter and setting startRow on the
> >> >> >> >> >> org.apache.hadoop.hbase.client.Scan to be the row id of the next batch's
> >> >> >> >> >> row that is sent to the UI with the previous batch. However, I can't seem
> >> >> >> >> >> to be able to do the same with Previous. I can set the endRow on the Scan
> >> >> >> >> >> to be the row id of the last row of the previous batch but since HBase
> >> >> >> >> >> Scans are always in the forward direction, there is no way to set a
> >> >> >> >> >> PageFilter that can get 25 rows ending at a particular row. The only option
> >> >> >> >> >> seems to be to get *all* rows up to the end row and filter out all but the
> >> >> >> >> >> last 25 in the caller, which seems very inefficient. Any ideas on how this
> >> >> >> >> >> can be done efficiently?
> >> >> >> >> >>
> >> >> >> >> >> --
> >> >> >> >> >> -Vijay
> >> >> >> >
> >> >> >> > --
> >> >> >> > Thanks & Regards,
> >> >> >> > Anil Gupta
> >> >> >
> >> >> > --
> >> >> > Thanks & Regards,
> >> >> > Anil Gupta
> >> >
> >> > --
> >> > Thanks & Regards,
> >> > Anil Gupta
> >
> > --
> > Thanks & Regards,
> > Anil Gupta

--
Thanks & Regards,
Anil Gupta
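
A minimal sketch of the page-boundary scan discussed in this thread: scan from the first row key of page N up to, but not including, the first row key of page N+1. It is a plain Python simulation over a sorted list of row keys, not HBase client code. The helper names scan, page_start_keys and fetch_page are hypothetical and exist only for illustration, and the precomputed list of page start keys stands in for whatever the application (or a daily job) would store.

```python
from bisect import bisect_left


def scan(rows, start_row, stop_row=None):
    """Simulate a forward range scan: start_row inclusive, stop_row exclusive."""
    lo = bisect_left(rows, start_row)
    hi = len(rows) if stop_row is None else bisect_left(rows, stop_row)
    return rows[lo:hi]


def page_start_keys(rows, page_size):
    """Return the first row key of every page (assumed to be stored separately)."""
    return rows[::page_size]


def fetch_page(rows, starts, n):
    """Scan from the first key of page n up to the first key of page n + 1."""
    stop = starts[n + 1] if n + 1 < len(starts) else None
    return scan(rows, starts[n], stop)


# Row keys from the example in the thread: 010, 020, ..., 200.
rows = [f"{k:03d}" for k in range(10, 201, 10)]
starts = page_start_keys(rows, 10)  # ['010', '110']

# Someone inserts 101 after the page boundaries were recorded.
rows.insert(bisect_left(rows, "101"), "101")

first_page = fetch_page(rows, starts, 0)
second_page = fetch_page(rows, starts, 1)
```

With the keys from the example, first_page holds 11 rows and ends with 101, so the late insert is displayed, at the cost of a page that exceeds the nominal size of 10. The stored start keys also let the application go back to any page it has already recorded.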
