Hi JG,

There cannot be duplicate insertions because, in my case, a row represents a set and the qualifier values represent the elements of the set. So whenever I insert a value, I have to check whether the value already exists. A new value goes under a new qualifier. Do you think this is an appropriate schema design?

Regards,
Raghava.

On Tue, Jun 1, 2010 at 1:12 PM, Jonathan Gray <[email protected]> wrote:

> Do you expect a very high percentage to be duplicates or just some?
>
> An alternate approach is to just perform the insertions. Writes are faster
> than reads, so sometimes it's best to just insert. This will create an
> additional version, but if you aren't relying on versions then it will have
> little impact.
>
> If a majority of stuff will be duplicate, then maybe consider something
> different. Just remember that requiring reads before each write is going to
> significantly slow everything down.
>
> > -----Original Message-----
> > From: Raghava Mutharaju [mailto:[email protected]]
> > Sent: Tuesday, June 01, 2010 9:49 AM
> > To: [email protected]
> > Subject: Re: question on Filtering and checkAndPut()
> >
> > Thank you JG.
> >
> > >>> is checking if the values there are the same as the ones you are trying to insert?
> >
> > Yes, that is right. I am doing this because there could be duplicate
> > values generated. In the current iteration of MR, I could generate a
> > value which was already present in that row/qualifier combination (it is
> > sufficient if the value is in any column).
> >
> > Regards,
> > Raghava.
> >
> > On Tue, Jun 1, 2010 at 12:32 PM, Jonathan Gray <[email protected]> wrote:
> >
> > > And for checkAndPut, from the javadoc:
> > >
> > > "Atomically checks if a row/family/qualifier value match the expectedValue.
> > > If it does, it adds the put."
> > >
> > > This can be used a number of ways. It sounds like what you're describing
> > > is checking if the values there are the same as the ones you are trying to
> > > insert? This wouldn't make much sense, why would you re-insert the same
> > > value? You specify a row, family, qualifier, and value. You also specify a
> > > Put.
> > >
> > > checkAndPut is an example of an atomic operation. I may want to only
> > > insert certain data if the value I expect is there at the time I am
> > > inserting. Think about updating account balances, state transitions, data
> > > processing, etc. You may read some data at an earlier point in time, do
> > > some processing, and then insert. When you do the insert, you only want it
> > > to happen if something else hasn't gone in during your process time and
> > > modified the data that was there.
> > >
> > > JG
> > >
> > > > -----Original Message-----
> > > > From: Raghava Mutharaju [mailto:[email protected]]
> > > > Sent: Tuesday, June 01, 2010 1:47 AM
> > > > To: [email protected]
> > > > Cc: [email protected]
> > > > Subject: question on Filtering and checkAndPut()
> > > >
> > > > Hi all,
> > > >
> > > > Can the following type of value filter be possible -- Within a row,
> > > > irrespective of the columns (qualifiers), the presence of a value
> > > > should be checked. If that value is present then the row along with
> > > > all the columns should be fetched.
> > > >
> > > > SingleColumnValueFilter requires that we specify the name of the
> > > > qualifier, but here I would like to check the value across all the
> > > > qualifiers of the row. ValueFilter can be used but it does not return
> > > > all the columns if there is a match - it only returns the matched
> > > > column along with the row. So I want something which is a mix of
> > > > both. Is this possible?
> > > >
> > > > Can someone please explain the functionality of the checkAndPut()
> > > > method in HTable? I couldn't get it from the api doc. When I came
> > > > across this method, my guess was that it would check for duplicate
> > > > values -- for the given (row, family, qualifier) combination whether
> > > > the given value is the same as the value mentioned in put (for the
> > > > same combination).
> > > >
> > > > Thank you.
> > > >
> > > > Regards,
> > > > Raghava.
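
Editorial sketch: the compare-then-write behaviour JG describes for checkAndPut (apply the write only if the cell still holds the value you expected) can be modelled in a few lines. This is an in-memory illustration only, not the HBase client API. The class CheckAndPutSketch, its method signatures, the single implicit column family, and the treatment of a null expected value as "cell must be absent" are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/**
 * In-memory model of compare-and-put semantics.
 * Hypothetical names throughout; this is NOT the HBase HTable API.
 */
class CheckAndPutSketch {
    // row -> (qualifier -> value); one implicit column family is assumed.
    private final Map<String, Map<String, String>> rows = new HashMap<>();

    /** Returns the stored value, or null if the cell does not exist. */
    synchronized String get(String row, String qualifier) {
        Map<String, String> cells = rows.get(row);
        return cells == null ? null : cells.get(qualifier);
    }

    /** Unconditional write (the "just insert" approach). */
    synchronized void put(String row, String qualifier, String value) {
        rows.computeIfAbsent(row, r -> new HashMap<>()).put(qualifier, value);
    }

    /**
     * Writes newValue only if the cell currently equals expected.
     * The check and the write happen under one lock, so no other writer
     * can slip in between them. A null expected value means "cell absent"
     * in this sketch (an assumption of the model).
     */
    synchronized boolean checkAndPut(String row, String qualifier,
                                     String expected, String newValue) {
        if (!Objects.equals(get(row, qualifier), expected)) {
            return false; // someone changed the cell since we read it
        }
        put(row, qualifier, newValue);
        return true;
    }

    public static void main(String[] args) {
        CheckAndPutSketch table = new CheckAndPutSketch();
        table.put("account-1", "balance", "100");

        // We read the balance, then another writer modifies it.
        String seen = table.get("account-1", "balance");
        table.put("account-1", "balance", "80");

        // Our update was computed from the stale value, so it is rejected.
        boolean applied = table.checkAndPut("account-1", "balance", seen, "150");
        System.out.println("stale update applied: " + applied);

        // Re-read and retry against the current value.
        String fresh = table.get("account-1", "balance");
        applied = table.checkAndPut("account-1", "balance", fresh, "130");
        System.out.println("fresh update applied: " + applied);
    }
}
```

In this model the first update is rejected and the retry succeeds, which is the account-balance scenario from the thread: the guard exists to detect concurrent modification, not to detect duplicate values.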
