Hi Todd,

We are still on Kudu 1.5, and I used Kudu client 1.7.
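
In case it helps to match the row errors in the quoted output to rows: the "primary key=[-128, 0, 0, 1]" values look like the encoded form of the int32 key. Below is a minimal sketch for decoding them, assuming the key is stored as 4 big-endian bytes with the sign bit flipped (my understanding of how Kudu encodes signed integer keys, not something I verified against the 1.7 client source). The class and method names are made up for illustration and are not part of the Kudu API.

```java
import java.nio.ByteBuffer;

public class KuduKeyDecoder {

    /**
     * Decodes a single int32 primary key as printed in a row error.
     * Assumption: the key is 4 big-endian bytes with the sign bit flipped,
     * so that byte-wise comparison matches signed integer ordering.
     */
    static int decodeInt32Key(byte[] encoded) {
        if (encoded.length != 4) {
            throw new IllegalArgumentException("expected 4 bytes, got " + encoded.length);
        }
        // ByteBuffer reads big-endian by default; XOR restores the original sign bit.
        return ByteBuffer.wrap(encoded).getInt() ^ Integer.MIN_VALUE;
    }

    public static void main(String[] args) {
        // The three encoded keys reported in the row errors.
        byte[][] reported = {
            {-128, 0, 0, 1},
            {-128, 0, 0, 2},
            {-128, 0, 0, 6},
        };
        for (byte[] key : reported) {
            System.out.println("key=" + decodeInt32Key(key));
        }
    }
}
```

Under that assumption the three errors map to keys 1, 2 and 6, which is how I read them in my notes in the output.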

Thanks,
Boris

On Fri, Nov 16, 2018, 17:07 Todd Lipcon <[email protected]> wrote:

> Hi Boris,
>
> This is interesting. Just so we're looking at the same code, what version
> of the kudu-client dependency have you specified, and what version of the
> server?
>
> -Todd
>
> On Fri, Nov 16, 2018 at 1:12 PM Boris Tyukin <[email protected]>
> wrote:
>
>> Hey guys,
>>
>> I am playing with Kudu Java client (wow it is fast), using mostly code
>> from Kudu Java example.
>>
>> While learning about exceptions during row inserts, I stumbled upon
>> something I could not explain.
>>
>> If I insert 10 rows into a brand new Kudu table (AUTO_FLUSH_BACKGROUND
>> mode) and I make one row "bad" intentionally (one column cannot be
>> NULL), I actually get 3 rows that cannot be inserted into Kudu, not 1 as I
>> expected.
>>
>> But if I do session.flush() after every single insert, I get only one
>> error row (but this defeats the purpose of AUTO_FLUSH_BACKGROUND mode).
>>
>> Any ideas? We cannot afford to lose data and need to track all rows
>> which cannot be inserted.
>>
>> AUTO_FLUSH mode works much better and I do not have an issue like above,
>> but then it is way slower than AUTO_FLUSH_BACKGROUND.
>>
>> My code is below. It is in Groovy, but I think you will get the idea :)
>> https://gist.github.com/boristyukin/8703d2c6ec55d6787843aa133920bf01
>>
>> Here is output from my test code that hopefully illustrates my confusion
>> - out of 10 rows inserted, 9 should be good and 1 bad, but it turns out
>> Kudu flagged 3 as bad:
>>
>> Created table kudu_groovy_example
>> Inserting 10 rows in AUTO_FLUSH_BACKGROUND flush mode ...
>> (int32 key=1, string value="value 1", unixtime_micros
>> dt_tm=2018-11-16T20:57:03.469000Z)
>> (int32 key=2, string value=NULL) BAD ROW
>> (int32 key=3, string value="value 3", unixtime_micros
>> dt_tm=2018-11-16T20:57:03.595000Z)
>> (int32 key=4, string value=NULL, unixtime_micros
>> dt_tm=2018-11-16T20:57:03.596000Z)
>> (int32 key=5, string value="value 5", unixtime_micros
>> dt_tm=2018-11-16T20:57:03.597000Z)
>> (int32 key=6, string value=NULL, unixtime_micros
>> dt_tm=2018-11-16T20:57:03.597000Z)
>> (int32 key=7, string value="value 7", unixtime_micros
>> dt_tm=2018-11-16T20:57:03.598000Z)
>> (int32 key=8, string value=NULL, unixtime_micros
>> dt_tm=2018-11-16T20:57:03.602000Z)
>> (int32 key=9, string value="value 9", unixtime_micros
>> dt_tm=2018-11-16T20:57:03.603000Z)
>> (int32 key=10, string value=NULL, unixtime_micros
>> dt_tm=2018-11-16T20:57:03.603000Z)
>> 3 errors inserting rows - why 3???? only 1 expected to be bad...
>> there were errors inserting rows to Kudu
>> the first few errors follow:
>> ??? key 1 and 6 supposed to be fine!
>> Row error for primary key=[-128, 0, 0, 1], tablet=null, server=null,
>> status=Invalid argument: No value provided for required column:
>> dt_tm[unixtime_micros NOT NULL] (error 0)
>> Row error for primary key=[-128, 0, 0, 2], tablet=null, server=null,
>> status=Invalid argument: No value provided for required column:
>> dt_tm[unixtime_micros NOT NULL] (error 0)
>> Row error for primary key=[-128, 0, 0, 6], tablet=null, server=null,
>> status=Invalid argument: No value provided for required column:
>> dt_tm[unixtime_micros NOT NULL] (error 0)
>> Rows counted in 485 ms
>> Table has 7 rows - ??? supposed to be 9!
>> INT32 key=4, STRING value=NULL, UNIXTIME_MICROS
>> dt_tm=2018-11-16T20:57:03.596000Z
>> INT32 key=8, STRING value=NULL, UNIXTIME_MICROS
>> dt_tm=2018-11-16T20:57:03.602000Z
>> INT32 key=9, STRING value=value 9, UNIXTIME_MICROS
>> dt_tm=2018-11-16T20:57:03.603000Z
>> INT32 key=3, STRING value=value 3, UNIXTIME_MICROS
>> dt_tm=2018-11-16T20:57:03.595000Z
>> INT32 key=10, STRING value=NULL, UNIXTIME_MICROS
>> dt_tm=2018-11-16T20:57:03.603000Z
>> INT32 key=5, STRING value=value 5, UNIXTIME_MICROS
>> dt_tm=2018-11-16T20:57:03.597000Z
>> INT32 key=7, STRING value=value 7, UNIXTIME_MICROS
>> dt_tm=2018-11-16T20:57:03.598000Z
>>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
