Re: CDH 5.5 - Kudu error not enough space remaining in buffer for op

2016-05-18 Thread Todd Lipcon
On Wed, May 18, 2016 at 3:42 PM, Abhi Basu <9000r...@gmail.com> wrote:
> Todd:
>
> Thanks for the update. So Kudu is not designed to be a common storage
> system for long-term and streaming data/random access? Just curious.
>
I'd say it is, but right now we are focusing on more common use cases t

Re: CDH 5.5 - Kudu error not enough space remaining in buffer for op

2016-05-18 Thread Abhi Basu
Todd:
Thanks for the update. So Kudu is not designed to be a common storage system for long-term and streaming data/random access? Just curious.
On Wed, May 18, 2016 at 3:38 PM, Todd Lipcon wrote:
> Hm, so each of the strings is about 27 bytes, so each row is 27KB. So, a
> batch size of 500 is

Re: CDH 5.5 - Kudu error not enough space remaining in buffer for op

2016-05-18 Thread Todd Lipcon
Hm, so each of the strings is about 27 bytes, so each row is 27KB. So, a batch size of 500 is still >13MB. I'd start with something very low like 10, and work your way up. That said, this is definitely not in the "standard" use cases for which Kudu has been designed. I'd also recommend using comp
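The arithmetic in the message above can be checked with a short sketch. The 1000-column count comes from elsewhere in this thread; the helper name `batch_payload_bytes` is hypothetical, decimal units (1 KB = 1000 bytes) are assumed to match the "27KB" figure, and per-row and per-op overhead is ignored, so treat the output as a rough lower bound rather than what Kudu actually measures.

```python
# Hypothetical helper: estimates the raw payload of one insert batch.
# Assumes every column holds a value of roughly the same size and
# ignores any per-row or per-operation overhead added by the client.
def batch_payload_bytes(columns, bytes_per_value, batch_size):
    return columns * bytes_per_value * batch_size


# Figures from the thread: ~1000 string columns of ~27 bytes each.
row_bytes = batch_payload_bytes(1000, 27, 1)
print(f"one row: ~{row_bytes / 1e3:.0f} KB")

# Compare the batch size that failed (500) with the suggested start (10).
for batch_size in (500, 10):
    total = batch_payload_bytes(1000, 27, batch_size)
    print(f"batch_size={batch_size}: ~{total / 1e6:.2f} MB")
```

Under these assumptions a batch of 500 rows carries about 13.5 MB, while a batch of 10 carries about 0.27 MB, which is why starting very low and working upward is the suggested approach.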

Re: CDH 5.5 - Kudu error not enough space remaining in buffer for op

2016-05-18 Thread Abhi Basu
Query: describe kudu_db.chr22_kudu
+-------+--------+---------+
| name  | type   | comment |
+-------+--------+---------+
| pos   | int    |         |
| id    | string |         |
| chrom | string |         |
| ref   | string |         |
| alt   | str

Re: CDH 5.5 - Kudu error not enough space remaining in buffer for op

2016-05-18 Thread Todd Lipcon
What are the types of your 1000 columns? Maybe an even smaller batch size is necessary.
-Todd
On Wed, May 18, 2016 at 10:41 AM, Abhi Basu <9000r...@gmail.com> wrote:
> I have tried with batch_size=500 and still get same error. For your
> reference are attached info that may help diagnose.
>
> Er

Re: CDH 5.5 - Kudu error not enough space remaining in buffer for op

2016-05-18 Thread Abhi Basu
I have tried with batch_size=500 and still get the same error. For your reference, I have attached info that may help diagnose.
Error: Error while applying Kudu session.: Incomplete: not enough space remaining in buffer for op (required 46.7K, 7.00M already used
Config settings: Kudu Tablet Server Bloc
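The two numbers in the error message give a rough idea of how many operations fit before the client buffer fills. The sketch below is only an illustration: it assumes "K" and "M" are binary units (1024 and 1024^2 bytes), that the buffer limit is close to the 7.00M already used, and that every operation is about as large as the one that failed. The function name `ops_that_fit` is hypothetical.

```python
# Hypothetical helper: how many equally sized operations fit in a buffer.
def ops_that_fit(buffer_bytes, op_bytes):
    return int(buffer_bytes // op_bytes)


# Values read from the error message; binary units are an assumption.
used_bytes = 7.00 * 1024 ** 2   # "7.00M already used"
op_bytes = 46.7 * 1024          # "required 46.7K"

fit = ops_that_fit(used_bytes, op_bytes)
print(f"about {fit} ops of this size fit in the buffer")
print("batch_size=500 fits" if 500 <= fit else "batch_size=500 overflows")
```

With these assumptions only about 150 operations of this size fit, which is consistent with batch_size=500 still producing the error.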

Re: Spark on Kudu

2016-05-18 Thread Chris George
There is some code in review that needs some more refinement. It will allow upsert/insert from a dataframe using the datasource api. It will also allow the creation and deletion of tables from a dataframe.
http://gerrit.cloudera.org:8080/#/c/2992/
Example usages will look something like:
http://ge

Re: Spark on Kudu

2016-05-18 Thread Benjamin Kim
Can someone tell me what the state is of this Spark work? Also, does anyone have any sample code on how to update/insert data in Kudu using DataFrames?
Thanks,
Ben
> On Apr 13, 2016, at 8:22 AM, Chris George wrote:
>
> SparkSQL cannot support these type of statements but we may be able to
>

Re: CDH 5.5 - Kudu error not enough space remaining in buffer for op

2016-05-18 Thread William Berkeley
Both options are more or less the same idea: the point is that you need fewer rows going in per batch so you don't go over the batch size limit. Follow what Todd said, as he explained it more clearly and suggested a better way.
-Will
On Wed, May 18, 2016 at 10:45 AM, Abhi Basu <9000r...@gmail.com> wrote

Re: CDH 5.5 - Kudu error not enough space remaining in buffer for op

2016-05-18 Thread Abhi Basu
Thanks for the updates. I will give both options a try and report back. If you are interested in testing with such datasets, I can help.
Thanks,
Abhi
On Wed, May 18, 2016 at 6:25 AM, Todd Lipcon wrote:
> Hi Abhi,
>
> Will is right that the error is client-side, and probably happening
> becaus

Re: CDH 5.5 - Kudu error not enough space remaining in buffer for op

2016-05-18 Thread Todd Lipcon
Hi Abhi,
Will is right that the error is client-side, and probably happening because your rows are so wide. Impala typically will batch 1000 rows at a time when inserting into Kudu, so if each of your rows is 7-8KB, that will overflow the max buffer size that Will mentioned. This seems quite probab