One of the workarounds we have used is to split the table into two on the Kudu side, with the same primary key in both, and then write an Impala view that joins the two. That has hidden the column limit from our analytics consumers.
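
As a rough illustration of that split-and-join idea, here is a minimal Python sketch that divides a wide column list into two groups and generates the Impala view statement joining them. All table, view, and column names (`wide_part1`, `wide_part2`, `wide_v`, `id`, `c1`...) are hypothetical and not taken from our actual schema, and it assumes a single-column primary key.

```python
# Hypothetical sketch: every table, view, and column name here is made up.
MAX_COLUMNS = 300  # the per-table column limit discussed in this thread


def split_columns(key, columns, limit=MAX_COLUMNS):
    """Split the non-key columns into two groups that each stay under the
    limit once the shared primary key column is added to both tables."""
    per_table = limit - 1  # reserve one slot for the primary key
    if len(columns) > 2 * per_table:
        raise ValueError("too many columns for a two-way split")
    half = (len(columns) + 1) // 2
    return columns[:half], columns[half:]


def build_view_sql(view, left_table, right_table, key, left_cols, right_cols):
    """Build an Impala CREATE VIEW statement joining both tables on the key."""
    select_list = [f"l.{key}"]
    select_list += [f"l.{c}" for c in left_cols]
    select_list += [f"r.{c}" for c in right_cols]
    return (
        f"CREATE VIEW {view} AS\n"
        f"SELECT {', '.join(select_list)}\n"
        f"FROM {left_table} l\n"
        f"JOIN {right_table} r ON l.{key} = r.{key}"
    )


if __name__ == "__main__":
    # 395 non-key columns plus one key column gives a 396-column logical table.
    cols = [f"c{i}" for i in range(1, 396)]
    left, right = split_columns("id", cols)
    print(build_view_sql("wide_v", "wide_part1", "wide_part2", "id", left, right))
```

Note that an inner join only returns keys present in both tables, so whatever writes the data has to insert each row into both halves.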

Not sure if that works for you, but it is something we have done.

On Thu, Aug 30, 2018, 12:28 PM Adar Lieber-Dembo <[email protected]> wrote:

> Check out this older post by Todd Lipcon about the 300 column limit:
>
> http://mail-archives.apache.org/mod_mbox/kudu-user/201706.mbox/%3CCADY20s7iT7%2BrVZNhagnNFUjk7-nNMxJK6%2BnHV%2B2SzpHXKFxvmw%40mail.gmail.com%3E
>
> There are probably other folks who run with over 300 columns in their
> schemas, but it's not something Kudu developers are actively testing.
> You're certainly welcome to try it, and I'm pleased that your 400
> column schema is working well for you, but just be aware that you may
> run into unforeseen issues.
>
> On Thu, Aug 30, 2018 at 10:05 AM Roberto Cerioni, Paulo
> <[email protected]> wrote:
> >
> > Hello,
> >
> > We have noticed that Kudu has a limitation of 300 columns per table, but
> > unfortunately we have a table with 396 columns in our system, which
> > violates this. We already tried increasing the column limit to 400 and
> > ingesting data into that table; we haven't noticed any issues, and the
> > performance looked good to us.
> >
> > What is the underlying reason for this limit? Could you please share
> > some information about the known implications of exceeding this limit and
> > the circumstances in which they can become more apparent? Do you have an
> > example of a test case (be it regarding performance or functionality) that
> > fails due to the use of tables with more than 300 columns?
> >
> > Is it on the Kudu roadmap to provide support for a higher number of
> > columns in the future? Are there any workarounds to use Kudu with more than
> > 300 columns besides setting a higher maximum value?
> >
> > Thanks,
> > Paulo.
