Hi,

So it seems this is not our problem (we do have a performance problem; it looks like IGFS loads the network too heavily).

Thank you for your answer,
Best regards,
ANDREY KUZNETSOV
Software Engineering Team Leader, Assessment Global Discipline Head (Java)
Office: +7 482 263 00 70 x 42766
Cell: +7 920 154 05 72
Email: [email protected]
Tver, Russia
epam.com <http://www.epam.com/>

CONFIDENTIALITY CAUTION AND DISCLAIMER
This message is intended only for the use of the individual(s) or entity(ies) to which it is addressed and contains information that is legally privileged and confidential. If you are not the intended recipient, or the person responsible for delivering the message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. All unintended recipients are obliged to delete this message and destroy any printed copies.

From: Bharath Vissapragada [mailto:[email protected]]
Sent: Tuesday, September 19, 2017 7:20 PM
To: dev@impala <[email protected]>
Cc: Special SBER-BPOC Team <[email protected]>
Subject: Re: FW: Letter about problems with Impala and IGFS.

In Impala's context, a disk ID is the ID of a local disk (on a data node) hosting a particular block replica of a given file. I'm not familiar with the internals of IGFS, but from a quick read [1] it looks like an in-memory FS, so I don't think the idea of a "disk ID" makes sense there. To fix this, I think we need to make some Impala-side changes to skip loading disk IDs in such cases (patches are welcome :)). FWIW, we did something similar while integrating the S3/ADLS filesystems, where there is no concept of block replicas and we just synthesized dummy metadata based on file range splits [2].
[1] https://ignite.apache.org/features/igfs.html
[2] https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L292

On Tue, Sep 19, 2017 at 4:28 AM, Andrey Kuznetsov <[email protected]> wrote:

Hi folks,

We have a problem integrating Impala with IGFS. Selecting from tables on IGFS causes a warning:

WARNINGS: Unknown disk id. This will negatively affect performance. Check your hdfs settings to enable block location metadata. (1 of 2 similar).

Is this a problem with IGFS? Can we enable <block location metadata> on IGFS?

Best regards,
ANDREY KUZNETSOV
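
For readers following the thread, the idea mentioned in the reply (synthesizing dummy block metadata from file range splits when a filesystem has no real block replicas or disk IDs) might look roughly like the sketch below. This is not Impala's actual code: the class name `SyntheticBlocks`, the `Block` type, the `split` method, and the `UNKNOWN_DISK_ID` value of -1 are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; all names here are hypothetical, not Impala's API. */
final class SyntheticBlocks {
    /** Hypothetical placeholder meaning "no local disk is known for this block". */
    static final int UNKNOWN_DISK_ID = -1;

    /** A synthetic block: a byte range of a file with no real replica location. */
    static final class Block {
        final long offset;
        final long length;
        final int diskId;

        Block(long offset, long length, int diskId) {
            this.offset = offset;
            this.length = length;
            this.diskId = diskId;
        }
    }

    /**
     * Cuts a file of fileLength bytes into consecutive ranges of at most
     * blockSize bytes and tags each range with UNKNOWN_DISK_ID instead of
     * asking the filesystem for block locations.
     */
    static List<Block> split(long fileLength, long blockSize) {
        if (fileLength < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and blockSize > 0");
        }
        List<Block> blocks = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            long length = Math.min(blockSize, fileLength - offset);
            blocks.add(new Block(offset, length, UNKNOWN_DISK_ID));
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Example: a 250-byte file cut into ranges of at most 100 bytes.
        for (Block b : split(250L, 100L)) {
            System.out.println("offset=" + b.offset + " length=" + b.length + " diskId=" + b.diskId);
        }
    }
}
```

A scheduler consuming such blocks would treat every range as remote, so the "Unknown disk id" condition becomes an expected state for that filesystem rather than something to warn about.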
