I think this makes sense for isset_bitmap_ and owned_strings_bitmap_. I have the source checked out from github and will try to build and run the regression tests so I can play with this idea.
In general though it seems I only want to copy the primary keys (any given RowPtr might have other columns). I think I can come up with a way to do that but it will take some more learning... Thanks for the help! On Thu, Feb 23, 2017 at 2:18 AM, Todd Lipcon <[email protected]> wrote: > On Thu, Feb 16, 2017 at 8:36 AM, Paul Brannan <[email protected] > > wrote: > >> Thank you for the quick response! >> >> I do understand the desire to not present in the API the appearance of a >> feature that isn't really there. As an example, I was reading KUDU-1291 >> yesterday and realized that since predicates are specified by name and not >> column index, it's easy to accidentally construct an inefficient scan by >> omitting the first component of the key. It's not surprising given how >> kudu works, but it's not obvious just by looking at the API. >> >> I would like to contribute a patch, though I would like to learn more >> about kudu before I feel comfortable. In this case, for example, my naive >> solution may work, but isn't very robust -- since it has to switch on the >> data type of each cell, it will fail when new data types are added. >> > > I don't think that's too bad - we have lots of places where we have to add > a new case label when a new type is added, so what's one more? :) > > >> I have a hunch it's possible to copy row_data_ directly somehow, but I >> have no idea how to correctly set isset_bitmap_ or owned_strings_bitmap_. >> > > "isset_bitmap_" should probably be set based on the projected columns of > the scanner. > "owned_stirngs_bitmap_" indicates which strings in the row are "owned" by > the PartialRow rather than "referenced". I think to make this API safe, > we'd probably want to clone all of the string data, which would imply > setting owned_strings_bitmap_ to 1 bits for those strings. > > Hope that helps > -Todd > > >> >> On Wed, Feb 15, 2017 at 1:32 PM, Todd Lipcon <[email protected]> wrote: >> >>> Hi Paul, >>> >>> You're correct that there is no predicate-based delete currently >>> available, so you have to scan and then feed the results of the scan back >>> into your desired mutations/deletes. This is intentional, since right now >>> we don't have multi-row transactional capabilities, and a "delete by >>> predicate" API would probably give the false impression that it is >>> transactional. >>> >>> I also think you're right that there isn't a nice way of propagating a >>> RowResult into a PartialRow as you need to do here. It seems you're a C++ >>> programmer -- any interest in contributing a patch to make this >>> transformation a bit easier? >>> >>> -Todd >>> >>> >>> >>> On Wed, Feb 15, 2017 at 10:27 AM, Paul Brannan < >>> [email protected]> wrote: >>> >>>> I want to delete all rows that match a particular partial key. For >>>> example, if my schema includes columns "foo", "bar", and "baz" in its >>>> primary key, I want to be able to delete all rows with "foo=16" and >>>> "bar=32", regardless of the value of baz. If I attempt to apply a >>>> KuduDelete without specifying "baz", I get an error "Illegal state: Key not >>>> specified". >>>> >>>> The best I have come up with so far is to do a scan and copy the data >>>> cell-by-cell from the RowPtr returned by the scan into the KuduPartialRow >>>> used by the delete; I don't see any good way in the interface to copy row >>>> data from one to the other without copying cell-by-cell. The code looks >>>> something like: >>>> >>>> for (auto idx : primary_key_column_indexes) { >>>> switch(schema.Column(idx).type()) { >>>> case KuduColumnSchema::INT16: // GetInt16/SetInt16 >>>> case KuduColumnSchema::INT32: // GetInt32/SetInt32 >>>> case KuduColumnSchema::STRING: // GetString/SetString >>>> // and so on... >>>> } >>>> } >>>> >>>> Is there a better way? >>>> >>>> >>> >>> >>> -- >>> Todd Lipcon >>> Software Engineer, Cloudera >>> >> >> > > > -- > Todd Lipcon > Software Engineer, Cloudera >
