Hi,
So it seems this warning is not the cause of our problem (we have a performance
problem; it looks like IGFS loads the network too heavily).
Thank you for your answer.

Best regards,
ANDREY KUZNETSOV
Software Engineering Team Leader, Assessment Global Discipline Head (Java)

Office: +7 482 263 00 70 x 42766   Cell: +7 920 154 05 72   Email: [email protected]
Tver, Russia   epam.com

CONFIDENTIALITY CAUTION AND DISCLAIMER
This message is intended only for the use of the individual(s) or entity(ies) 
to which it is addressed and contains information that is legally privileged 
and confidential. If you are not the intended recipient, or the person 
responsible for delivering the message to the intended recipient, you are 
hereby notified that any dissemination, distribution or copying of this 
communication is strictly prohibited. All unintended recipients are obliged to 
delete this message and destroy any printed copies.

From: Bharath Vissapragada [mailto:[email protected]]
Sent: Tuesday, September 19, 2017 7:20 PM
To: dev@impala <[email protected]>
Cc: Special SBER-BPOC Team <[email protected]>
Subject: Re: FW: Letter about problems with Impala and IGFS.

In Impala's context, disk-ID corresponds to the ID of a local disk (on a data 
node) hosting a particular block replica of a given file. I'm not familiar with 
the internals of IGFS but from a quick read [1], it looks like an in-memory FS. 
So, I don't think the idea of "disk ID" makes sense.

To fix this, I think we need to make some changes on the Impala side to skip 
loading disk IDs in such cases (patches are welcome :)).

FWIW, we did something similar while integrating the S3/ADLS filesystems, 
where there is no concept of block replicas: we just synthesized dummy 
metadata based on file range splits [2].
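
To illustrate the idea of synthesizing dummy metadata from file range splits, 
here is a minimal Java sketch. It is not the code in HdfsTable.java: the class 
SyntheticBlockSketch, the FileRange type, the synthesizeRanges method and the 
use of -1 as an "unknown disk" marker are all hypothetical names and 
assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class SyntheticBlockSketch {
    // Assumption for this sketch: -1 marks "no known local disk".
    static final int UNKNOWN_DISK_ID = -1;

    /** Hypothetical stand-in for per-block metadata; not an Impala class. */
    static final class FileRange {
        final long offset;
        final long length;
        final int diskId;

        FileRange(long offset, long length, int diskId) {
            this.offset = offset;
            this.length = length;
            this.diskId = diskId;
        }
    }

    /**
     * Splits a file of fileLength bytes into consecutive ranges of at most
     * rangeSize bytes. Every range gets UNKNOWN_DISK_ID because the
     * filesystem is assumed to expose no block replicas or local disks.
     */
    static List<FileRange> synthesizeRanges(long fileLength, long rangeSize) {
        if (fileLength < 0 || rangeSize <= 0) {
            throw new IllegalArgumentException("invalid file length or range size");
        }
        List<FileRange> ranges = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += rangeSize) {
            long length = Math.min(rangeSize, fileLength - offset);
            ranges.add(new FileRange(offset, length, UNKNOWN_DISK_ID));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 300-byte file with 128-byte ranges yields three ranges.
        for (FileRange r : synthesizeRanges(300L, 128L)) {
            System.out.println(r.offset + " " + r.length + " " + r.diskId);
        }
    }
}
```

In a sketch like this the scheduler would still get ranges to hand out, but 
nothing would try to look up a per-disk location for them.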

[1] https://ignite.apache.org/features/igfs.html
[2] https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L292

On Tue, Sep 19, 2017 at 4:28 AM, Andrey Kuznetsov <[email protected]> wrote:
Hi folks,
We have a problem integrating Impala with IGFS. A SELECT from tables on IGFS 
produces a warning:

WARNINGS: Unknown disk id.  This will negatively affect performance.
Check your hdfs settings to enable block location metadata. (1 of 2 similar).

Is this a problem with IGFS? Can we enable "block location metadata" on IGFS?

Best regards,
ANDREY KUZNETSOV
Software Engineering Team Leader

Office: +7 482 263 00 70 x 42766   Cell: +7 920 154 05 72   Email: [email protected]
Tver, Russia   epam.com

