Hi Andrey,

Can you double-check how much memory is actually given to Kudu? That's
--memory_limit_hard_bytes. Providing us with a full kudu-tserver log would
also be useful, as long as it starts with the line "Tablet server
non-default flags".
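In case it helps while gathering that log: here is a minimal sketch of pulling the effective value out of the startup banner. It assumes the banner lists non-default flags one per line in `--flag=value` form (as glog-based Kudu daemons do); the sample log fragment below is made up for illustration.

```python
import re

def find_memory_limit(log_text: str):
    """Return the --memory_limit_hard_bytes value found in kudu-tserver
    log text, or None if the flag is not present.

    Assumes the startup banner prints non-default flags one per line
    as --flag=value.
    """
    match = re.search(r"--memory_limit_hard_bytes=(\d+)", log_text)
    return int(match.group(1)) if match else None

# Synthetic example (values are made up, not from a real cluster):
sample = """\
Tablet server non-default flags:
--fs_data_dirs=/data/1/kudu,/data/2/kudu
--memory_limit_hard_bytes=17179869184
"""
print(find_memory_limit(sample))  # 17179869184 bytes, i.e. 16 GiB
```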
Without more data about your situation it's going to be really hard to help
you.

Thx,

J-D

On Thu, Aug 10, 2017 at 4:46 AM, Andrey Kuznetsov <[email protected]> wrote:

> Hi Jean-Daniel,
>
> Nice to hear from you!
>
> I use Kudu 1.3, and I hope Kudu has enough memory (about 256 GB on each
> node). I have played with the threads parameter, but it doesn't make much
> of a difference; it is still extremely slow.
>
> Best regards,
> ANDREY KUZNETSOV
> Software Engineering Team Leader, Assessment Global Discipline Head (Java)
> Office: +7 482 263 00 70 x 42766  Cell: +7 920 154 05 72
> Email: [email protected]
> Tver, Russia  epam.com
>
> CONFIDENTIALITY CAUTION AND DISCLAIMER
> This message is intended only for the use of the individual(s) or
> entity(ies) to which it is addressed and contains information that is
> legally privileged and confidential. If you are not the intended
> recipient, or the person responsible for delivering the message to the
> intended recipient, you are hereby notified that any dissemination,
> distribution or copying of this communication is strictly prohibited. All
> unintended recipients are obliged to delete this message and destroy any
> printed copies.
>
> From: Jean-Daniel Cryans [mailto:[email protected]]
> Sent: Wednesday, August 9, 2017 10:52 PM
> To: [email protected]
> Cc: Special SBER-BPOC Team <[email protected]>
> Subject: Re: [kudu] import from hdfs
>
> Hi Andrey,
>
> Which version of Kudu and Impala are you using? Just that can make a huge
> difference.
>
> Apart from that, make sure Kudu has enough memory (no memory back
> pressure), that you have enough maintenance manager threads (1/3 to 1/4 of
> the number of disks), and that your partitioning favors good load
> distribution.
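[Editor's note: the maintenance-manager rule of thumb quoted above (1/3 to 1/4 of the number of disks) can be sketched as a quick calculation. The relevant tserver flag is --maintenance_manager_num_threads; the helper name and the 1/3 choice below are illustrative, not prescribed by the thread.]

```python
def suggested_mm_threads(num_data_disks: int) -> int:
    """Rule of thumb from the thread: set
    --maintenance_manager_num_threads to roughly 1/3 to 1/4 of the
    number of data disks. Uses 1/3 here, and never suggests fewer
    than 1 thread (the Kudu default)."""
    return max(1, num_data_disks // 3)

# e.g. a tserver node with 12 data disks:
print(suggested_mm_threads(12))  # 4
```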
>
> But TBH writing to Parquet will remain faster than writing to Kudu,
> because Kudu isn't just dropping the rows into a file; it has to do more
> work than that.
>
> Hope this helps,
>
> J-D
>
> On Wed, Aug 9, 2017 at 9:05 AM, Andrey Kuznetsov <[email protected]> wrote:
>
> Hi folks,
>
> I have a problem with HDFS-to-Kudu performance. I created an external
> table over CSV data and ran "insert as select" from it into a Kudu table
> and into a Parquet table. Importing into the Parquet table is 3x faster
> than into Kudu. Do you know any tips/tricks to increase the import
> performance? I am actually importing 8 TB of data, so this is critical
> for me.
>
> Best regards,
> ANDREY KUZNETSOV
