I meant a gist with an example of client API usage where the keys are
getting dropped, so that we can reproduce what you're seeing.

-david

On Fri, Sep 9, 2016 at 8:51 AM, Jordan Birdsell <[email protected]>
wrote:

> Right, what I'm saying is that if I do include the key in my projection, the
> schema does not maintain it as a key. The issue isn't so much that I can't
> apply predicates to the key column; it's that if I wanted to create a
> projection and then use it to create a table, I'd have to rebuild the
> schema (i.e., the schema returned is effectively useless for creating new
> tables). This pattern of creating tables from projections is pretty common
> in dataframe-like libraries in Python.
>
> gist of offending code with comment on issue:
> https://gist.github.com/jtbirdsell/e376a7fa21f3b1893efa7e1ddac408d7
>
>
> On Fri, Sep 9, 2016 at 11:38 AM David Alves <[email protected]> wrote:
>
> > Wait, if you _do_ set a projection on the scanner that does not include the
> > keys, then they won't be returned (and won't appear in the projection's
> > schema).
> > Note that this does not mean that you can't set predicates on the key; it's
> > just that they'll be evaluated server side, but the key won't actually be
> > returned.
> > Maybe I'm misunderstanding what you're saying?
> > Care to post a gist with the offending code?
> >
> > -david
> >
> >
> >
> > On Fri, Sep 9, 2016 at 8:26 AM, Jordan Birdsell <[email protected]> wrote:
> >
> > > Hey David,
> > >
> > > Yep, I'm sure. Taking a look at the scan_configuration class, the issue
> > > seems to be here:
> > >
> > > Status ScanConfiguration::SetProjectedColumnIndexes(const vector<int>& col_indexes) {
> > > ....
> > >   RETURN_NOT_OK(s->Reset(cols, 0));
> > > ....
> > >
> > > In the SetProjectedColumnIndexes method (which is also used by
> > > SetProjectedColumnNames), we're setting the schema without the index.
> > >
> > > There are probably a couple of ways to address this:
> > >
> > >    1. Check if all key columns are in the projection, and if so, maintain
> > >    the key.
> > >    2. Provide an optional parameter to set the key to the user's
> > >    preference for the new projection. This would be beneficial for cases
> > >    where the user may want to create a new table based on their projection.
> > >
> > > Thoughts?
> > >
> > > Jordan
> > >
> > > On Fri, Sep 9, 2016 at 11:08 AM David Alves <[email protected]> wrote:
> > >
> > > > Hi Jordan
> > > >
> > > >   KuduScanner::GetProjectionSchema returns the schema of the projection
> > > > that was previously set on the scanner. If you don't set a projection, it
> > > > should indeed return all the columns.
> > > >   Are you sure you didn't set a projection (with SetProjectedColumnNames
> > > > or SetProjectedColumnIndexes) that excluded the key?
> > > >
> > > > Best
> > > > David
> > > >
> > > > On Fri, Sep 9, 2016 at 5:16 AM, Jordan Birdsell <[email protected]> wrote:
> > > >
> > > > > Hey folks,
> > > > >
> > > > > I was doing some work on KUDU-854
> > > > > <https://issues.apache.org/jira/browse/KUDU-854> and, when testing the
> > > > > KuduScanner::GetProjectionSchema method call, found that the key was
> > > > > being dropped, which makes this much more challenging to test. Any ideas
> > > > > whether it is intended to drop the key information in a scanner
> > > > > projection? I would imagine this could prevent functionality like
> > > > > creating new tables from a projection.
> > > > >
> > > > > Thanks,
> > > > > Jordan
> > > > >
> > > >
> > >
> >
>
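
For reference, here is a rough sketch of the first option suggested in the thread (keep the key when every key column is in the projection). It is not Kudu code: ProjectedKeyColumnCount is a hypothetical helper, and it assumes that the table's key columns sit at indexes 0 through num_key_columns - 1 of the table schema and that a schema's key columns have to come first, in key order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the Kudu client API.
// Assumption: the table's key columns occupy indexes [0, num_key_columns)
// of the table schema, and a schema's key columns must come first, in order.
// Returns the number of key columns the projected schema could keep:
// num_key_columns if the projection starts with every key column in key
// order, otherwise 0 (no key, which is the behavior the thread describes).
int ProjectedKeyColumnCount(const std::vector<int>& col_indexes,
                            int num_key_columns) {
  assert(num_key_columns >= 0);
  if (col_indexes.size() < static_cast<std::size_t>(num_key_columns)) {
    return 0;  // Too few columns to contain the whole key.
  }
  for (int i = 0; i < num_key_columns; ++i) {
    if (col_indexes[i] != i) {
      return 0;  // A key column is missing or out of order.
    }
  }
  return num_key_columns;
}
```

In the snippet quoted above, a count like this would presumably take the place of the literal 0 passed to Reset. That reading of the second argument is inferred from the thread and has not been checked against the Kudu source.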
