Hi,

1. This is not a great advantage in itself. But suppose you want to index, for example, users (described by firstname, lastname, age) and run queries on every combination of those fields: then you have about 2^3 = 8 indexes (without ordering). Because of paging, each index can need as many as 3 tables (we will describe this in the technical presentation). So, without ordering, you have 8*3 = 24 additional tables for 1 data table. I would rather have 1 data table and 3 index tables. It is simply clearer to me, but if you like, you can have another table (or 3 tables) for each index.
2. At this stage we don't. It is an interesting feature, but I'm not sure whether it is possible to ensure transactions.

Regards,
Chriss

-----Original Message-----
From: Ding, Hui [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 01, 2008 7:18 PM
To: [email protected]
Subject: RE: Pigi project

This sounds really interesting. A few more questions if I may:

1. What do you see as the advantage of having one index table that contains everything, rather than having separate index tables?
2. Do you ensure that updates to the main table and the index table are done in one transaction?

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 01, 2008 1:48 AM
To: [email protected]
Subject: Re: Pigi project

stack wrote:
> Hey Antoni & Krzysztof:
>
> Couple of things:
>
> + How does it work? The indices in particular? (I suppose I'm interested in seeing the technical presentation).
> + Why the name Pigi?
> + What features do you need in hbase to support Pigi?
> + What Jim said regards the list (unless you wanted just two of us to see it first?).
> + Multivalue fields? Is that cells in hbase-speak?
> + Distributed object cache? How? Sounds great.

Hi,

We will prepare a short technical presentation, but for now I'll try to answer your questions:

1) How does it work?

The idea is based on the fact that row identifiers in an HBase table are sorted lexicographically. For every 1:n relation, Pigi maintains an additional table (the index table).
Every row added to a child table causes a row to be inserted into each index defined for that child object. The index table contains ordered child object identifiers. The order comes from specially constructed row identifiers in the index table, each of which contains:

- index name
- parent object id
- optional index parameters (for example: color of the car)
- optional ordering parameters (if we want to order results)
- child object id

Because the index name is part of that id, many indexes can share one index table (so in fact there is no need to create another table for every index). Pigi helps to create and maintain this kind of index. Otherwise the user has to do it manually (probably individually for each 1:n relation).

Indexes: our framework creates an additional table and puts all the data it needs there. Indexing is realised by preparing a complex rowId. For example, we have objects:

- UserVO with fields: id, name, surname
- CarVO with fields: id, userId, color

Each user can have many cars, and one car has only one owner. We want to execute queries:

- find all cars by userId
- find all cars by userId and color

The framework maintains 2 indexes:

- cars by userId, where the rowId in the index table will contain userId data.
- cars by userId and color, where the rowId in the index table will contain userId and color data.

Indexes are ordered lexicographically, so for a descending index the rowId will be "reversed".

When we want to change the color of a car, we only have to notify the framework about the change in the CarVO object. The framework will update all indexes of this object.

2) Why the name Pigi?

There is no specific reason... :-)

3) What features do you need in hbase to support Pigi?

Only the Java API: we use only scanners and simple gets; we don't use filters.

4) Multivalue fields? Is that cells in hbase-speak?

5) Distributed object cache? How? Sounds great.

In the future we will need to write a distributed cache (something like TreeCache) or use some existing solution.
We need it to reduce reads from HBase, as with Hibernate or any other cache.

Antony
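
For readers following the thread, the composite rowId idea from the answer to question 1 can be sketched roughly as below. This is only an illustration under stated assumptions, not Pigi's actual code: the separator character, the index names ("carsByUser", "carsByUserColor"), the helper names, and the in-memory SortedTable class (a stand-in for a table whose row ids are kept in lexicographic order) are all hypothetical. No real HBase API is used.

```python
from bisect import bisect_left

# Assumed separator: it sorts before any printable character, so a shorter
# field value always sorts before a longer one sharing the same prefix.
SEP = "\x00"


def index_row_id(index_name, parent_id, child_id, params=(), ordering=()):
    """Build a row id: index name, parent id, params, ordering, child id."""
    return SEP.join([index_name, parent_id, *params, *ordering, child_id])


class SortedTable:
    """Hypothetical in-memory stand-in for a table with sorted row ids."""

    def __init__(self):
        self._rows = []

    def put(self, row_id):
        i = bisect_left(self._rows, row_id)
        if i == len(self._rows) or self._rows[i] != row_id:
            self._rows.insert(i, row_id)

    def delete(self, row_id):
        i = bisect_left(self._rows, row_id)
        if i < len(self._rows) and self._rows[i] == row_id:
            del self._rows[i]

    def scan_prefix(self, prefix):
        """Yield row ids starting with prefix, in sorted order."""
        i = bisect_left(self._rows, prefix)
        while i < len(self._rows) and self._rows[i].startswith(prefix):
            yield self._rows[i]
            i += 1


def car_index_rows(car):
    """Row ids for both indexes of one car; they share one index table."""
    return [
        index_row_id("carsByUser", car["userId"], car["id"]),
        index_row_id("carsByUserColor", car["userId"], car["id"],
                     params=(car["color"],)),
    ]


def add_car(table, car):
    for row in car_index_rows(car):
        table.put(row)


def update_car(table, old_car, new_car):
    """Re-index a changed car: drop the old index rows, insert the new ones."""
    for row in car_index_rows(old_car):
        table.delete(row)
    for row in car_index_rows(new_car):
        table.put(row)


def find_cars(table, index_name, user_id, color=None):
    """Return child (car) ids found by a prefix scan over the index table."""
    parts = [index_name, user_id] + ([color] if color is not None else [])
    prefix = SEP.join(parts) + SEP
    return [row.rsplit(SEP, 1)[1] for row in table.scan_prefix(prefix)]
```

A query is then just a prefix scan: find_cars(table, "carsByUser", "u1") reads every row that starts with the index name and the user id, and the car id is the last field of each row id. A descending index would need the ordering field encoded so that it sorts in reverse, which this sketch leaves out.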
