Oh, I see what you mean. It's not that the keys are getting dropped, it's that they're not marked as keys. This arguably makes sense on a projection: for instance, you might want the keys returned at the end of the projection, while table schemas (at least for now) require that they are present at the beginning. If you really want to create a new table based on an existing one, you could get the schema from KuduTable. That one should be complete.
-david

On Fri, Sep 9, 2016 at 8:51 AM, Jordan Birdsell <[email protected]> wrote:

> Right, what I'm saying is, if I do include the key in my projection, the
> schema does not maintain it as a key. The issue isn't so much that I can't
> apply predicates to the key column, it's that if I wanted to create a
> projection and then use that projection to create a table, I'd have to
> rebuild the schema (i.e., the schema returned is effectively useless for
> creating new tables). This pattern of creating tables from projections is
> pretty common in dataframe-like libraries in Python.
>
> gist of offending code with comment on issue:
> https://gist.github.com/jtbirdsell/e376a7fa21f3b1893efa7e1ddac408d7
>
> On Fri, Sep 9, 2016 at 11:38 AM David Alves <[email protected]> wrote:
>
> > Wait, if you _do_ set a projection on the scanner that does not include
> > the keys, then they won't be returned (and won't appear on the
> > projection's schema).
> > Note that this does not mean that you can't set predicates on the key,
> > it's just that they'll be evaluated server side, but the key won't
> > actually be returned.
> > Maybe I'm misunderstanding what you're saying?
> > Care to post a gist with the offending code?
> >
> > -david
> >
> > On Fri, Sep 9, 2016 at 8:26 AM, Jordan Birdsell <[email protected]> wrote:
> >
> > > Hey David,
> > >
> > > Yep, I'm sure. Taking a look at the scan_configuration class, the
> > > issue seems to be here:
> > >
> > > Status ScanConfiguration::SetProjectedColumnIndexes(const vector<int>& col_indexes) {
> > >   ....
> > >   RETURN_NOT_OK(s->Reset(cols, 0));
> > >   ....
> > >
> > > In the SetProjectedColumnIndexes method (which is also used by
> > > SetProjectedColumnNames), we're setting the schema without the index.
> > >
> > > There are probably a couple of ways to address this:
> > >
> > > 1. Check if all key columns are in the projection, and if so,
> > >    maintain the key.
> > > 2. Provide an optional parameter to set the key to the user's
> > >    preference for the new projection. This would be beneficial for
> > >    cases where the user may want to create a new table based on
> > >    their projection.
> > >
> > > Thoughts?
> > >
> > > Jordan
> > >
> > > On Fri, Sep 9, 2016 at 11:08 AM David Alves <[email protected]> wrote:
> > >
> > > > Hi Jordan
> > > >
> > > > KuduScanner::GetProjectionSchema returns the schema of the
> > > > projection that was previously set on the scanner. If you don't set
> > > > a projection it should indeed return all the columns.
> > > > Are you sure you didn't set a projection (with
> > > > SetProjectedColumnNames or SetProjectedColumnIndexes) that excluded
> > > > the key?
> > > >
> > > > Best
> > > > David
> > > >
> > > > On Fri, Sep 9, 2016 at 5:16 AM, Jordan Birdsell <[email protected]> wrote:
> > > >
> > > > > Hey folks,
> > > > >
> > > > > I was doing some work on KUDU-854
> > > > > <https://issues.apache.org/jira/browse/KUDU-854> and when testing
> > > > > the KuduScanner::GetProjectionSchema method call, found that the
> > > > > key was being dropped, which makes this much more challenging to
> > > > > test. Any ideas if it is intended to drop the key information in a
> > > > > scanner projection? I would imagine this could prevent
> > > > > functionality like creating new tables from a projection.
> > > > >
> > > > > Thanks,
> > > > > Jordan
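For illustration, option 1 from the thread (keep the key only when the projection still contains every key column) could look roughly like the sketch below. It also applies the constraint mentioned in the top reply that key columns must sit at the beginning of a schema. `ToySchema` and `ProjectKeepingKey` are hypothetical stand-ins written for this example. They are not the Kudu `Schema` class or any Kudu API, and the comparison to `Reset(cols, 0)` assumes the second argument is the number of key columns.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a schema (NOT Kudu's Schema class):
// the first num_key_columns entries of `columns` form the key.
struct ToySchema {
  std::vector<std::string> columns;
  std::size_t num_key_columns;
};

// Projects `src` onto `col_indexes`. The key is kept only when the projection
// starts with every key column of `src` in its original order; otherwise the
// result has no key (assumed to match what Reset(cols, 0) does in the thread).
ToySchema ProjectKeepingKey(const ToySchema& src,
                            const std::vector<std::size_t>& col_indexes) {
  ToySchema out{{}, 0};
  for (std::size_t idx : col_indexes) {
    assert(idx < src.columns.size());  // caller must pass valid column indexes
    out.columns.push_back(src.columns[idx]);
  }

  // The key survives only if projected columns 0..k-1 are source columns 0..k-1.
  bool key_is_prefix = src.num_key_columns > 0 &&
                       col_indexes.size() >= src.num_key_columns;
  for (std::size_t i = 0; key_is_prefix && i < src.num_key_columns; ++i) {
    key_is_prefix = (col_indexes[i] == i);
  }

  out.num_key_columns = key_is_prefix ? src.num_key_columns : 0;
  return out;
}
```

With a source schema of `{"id", "ts", "value"}` and two key columns, projecting `{0, 1, 2}` or `{0, 1}` keeps the two-column key, while `{2, 0, 1}` (keys at the end) or `{0, 2}` (a key column missing) yields a schema with no key.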
