Yep, I’ve added a few GB ☺
But it has had minimal effect on import performance.

Best regards,
ANDREY KUZNETSOV
Software Engineering Team Leader, Assessment Global Discipline Head (Java)

Office: +7 482 263 00 70 x 42766
Cell: +7 920 154 05 72   Email: [email protected]
Tver, Russia   epam.com

CONFIDENTIALITY CAUTION AND DISCLAIMER
This message is intended only for the use of the individual(s) or entity(ies) 
to which it is addressed and contains information that is legally privileged 
and confidential. If you are not the intended recipient, or the person 
responsible for delivering the message to the intended recipient, you are 
hereby notified that any dissemination, distribution or copying of this 
communication is strictly prohibited. All unintended recipients are obliged to 
delete this message and destroy any printed copies.

From: Jean-Daniel Cryans [mailto:[email protected]]
Sent: Wednesday, August 16, 2017 9:39 PM
To: [email protected]
Cc: Special SBER-BPOC Team <[email protected]>
Subject: Re: [kudu] import from hdfs

Huh, this is confusing. How much memory did you say you have per node? You 
mentioned 256GB, but I'm not sure what that refers to anymore because I see you 
gave 400GB to Kudu in there.

Also, why a single disk? Is HDFS using more than one?

On Tue, Aug 15, 2017 at 9:40 AM, Andrey Kuznetsov <[email protected]> wrote:
Hi Jean-Daniel,
No problem, you can find a screenshot in the attachment.
I can't provide the log for security reasons, sorry…

From: Jean-Daniel Cryans [mailto:[email protected]]
Sent: Thursday, August 10, 2017 6:55 PM
To: [email protected]
Cc: Special SBER-BPOC Team <[email protected]>
Subject: Re: [kudu] import from hdfs

Hi Andrey,

Can you double check how much memory is actually given to Kudu? That's 
--memory_limit_hard_bytes. Providing us with a full kudu-tserver log could be 
useful, as long as it starts with this line "Tablet server non-default flags".

Without more data about your situation it's going to be really hard to help you.

Thx,

J-D

On Thu, Aug 10, 2017 at 4:46 AM, Andrey Kuznetsov <[email protected]> wrote:
Hi Jean-Daniel,
Nice to hear from you :)

I use Kudu 1.3, and I hope Kudu has enough memory (about 256GB per node).
I have played with the threads parameter, but it makes little difference:
the import is still extremely slow…

From: Jean-Daniel Cryans [mailto:[email protected]]
Sent: Wednesday, August 9, 2017 10:52 PM
To: [email protected]
Cc: Special SBER-BPOC Team <[email protected]>
Subject: Re: [kudu] import from hdfs

Hi Andrey,

Which version of Kudu and Impala are you using? Just that can make a huge 
difference.

Apart from that, make sure Kudu has enough memory (no memory back pressure), 
you have enough maintenance manager threads (1/3 or 1/4 the number of disks), 
and that your partitioning favors good load distribution.
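
A rough sketch of the 1/3 to 1/4 rule of thumb in Python, assuming the flag 
in question is kudu-tserver's --maintenance_manager_num_threads. The helper 
names are hypothetical, and rounding down with a floor of one thread is an 
assumption rather than something stated above:

```python
from typing import Tuple


def suggested_maintenance_threads(num_data_disks: int) -> Tuple[int, int]:
    """Return (low, high) thread counts: about 1/4 to 1/3 of the data disks.

    Rounding down and never going below one thread are assumptions.
    """
    if num_data_disks < 1:
        raise ValueError("num_data_disks must be at least 1")
    low = max(1, num_data_disks // 4)
    high = max(low, num_data_disks // 3)
    return low, high


def tserver_flag(num_threads: int) -> str:
    """Format the thread count as a kudu-tserver command-line flag."""
    return f"--maintenance_manager_num_threads={num_threads}"


if __name__ == "__main__":
    low, high = suggested_maintenance_threads(12)
    print(tserver_flag(low), "to", tserver_flag(high))
```

For a node with 12 data disks this suggests somewhere between 3 and 4 
maintenance manager threads; treat it as a starting point to measure against, 
not a tuned value.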

But TBH writing to Parquet will remain faster than writing to Kudu, because 
Kudu isn't just dropping the rows into a file and has to do more than that.

Hope this helps,

J-D

On Wed, Aug 9, 2017 at 9:05 AM, Andrey Kuznetsov <[email protected]> wrote:
Hi folks,
I have a problem with HDFS-to-Kudu import performance. I created an external 
table over CSV data and ran “insert as select” from it into both a Kudu table 
and a Parquet table.
Importing into the Parquet table is 3x faster than into Kudu. Do you know any 
tips or tricks to speed up the import?
I am importing 8TB of data, so this is critical for me.
