Lars Volker has posted comments on this change. (
http://gerrit.cloudera.org:8080/11517 )
Change subject: [WIP] IMPALA-6932: Speed up scans for sequence datasets with
many files
......................................................................
Patch Set 3:
> > This can't be tested on hdfs since there are no "remote" blocks in
> > the minicluster. So all the scan ranges of a file are added to the
> > appropriate local disk queue once the header is processed.
>
> This came up in a conversation between me and Joe today as well.
> Replication in HDFS is per file, so we should be able to "hdfs put"
> with appropriate options to induce a remote block, even in the
> minicluster. Unfortunately, it doesn't seem to work with the
> following sequence:
>
> $ impala-shell.sh -q 'create table t (x string)'
> $ yes | head > /tmp/f
> $ hadoop fs -D dfs.replication=1 -put /tmp/f /test-warehouse/t
> $ impala-shell.sh -i localhost:21002 -q 'set num_nodes=1;
> invalidate metadata t; select * from t limit 2; profile' | grep -i
> BytesReadShortCircuit
>
> Impala seems to be doing short-circuit-read on all the impalad's
> (presumably because the datanode somewhat reasonably decides things
> are indeed local).
>
> Anyway--this surprised me so I figured I'd mention it.

The scheduler treats all backend *hosts* the same. In particular, all reads on
the minicluster are assigned to "localhost". Backend hosts that run multiple
impalads get some special handling: we assign scan ranges to those impalads
round-robin, without considering the actual size of each scan range
(scheduler.cc:890). This path is only used during testing and is not supported
in production deployments, so we don't try to be very sophisticated.
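
To illustrate the round-robin point, here is a minimal Python sketch. It is not
the code in scheduler.cc; the function name, file names, and lengths are made
up for illustration. It hands scan ranges to the impalads on one host in turn
and ignores the range length, so the bytes per impalad can end up uneven:

```python
from collections import defaultdict
from itertools import cycle


def assign_round_robin(scan_ranges, backends):
    """Assign (file_name, length) scan ranges to backends in turn.

    Hypothetical illustration only: the length of each range is
    deliberately ignored, mirroring the size-agnostic behavior
    described above.
    """
    assignment = defaultdict(list)
    next_backend = cycle(backends)
    for scan_range in scan_ranges:
        assignment[next(next_backend)].append(scan_range)
    return dict(assignment)


if __name__ == "__main__":
    # Backend ports of a three-impalad minicluster; file names and
    # lengths are invented for this example.
    backends = [22000, 22001, 22002]
    ranges = [("a", 100), ("b", 1), ("c", 1), ("d", 100)]
    for port, assigned in assign_round_robin(ranges, backends).items():
        total_bytes = sum(length for _, length in assigned)
        print(port, [name for name, _ in assigned], total_bytes)
```

With these made-up inputs the first backend receives both large ranges, which
is the kind of imbalance a size-aware assignment would avoid.
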

On second thought, I hoped we could provoke a remote read by adding multiple
files that reside only on the first datanode and then scanning all of them. I
tried this and it didn't work. The HDFS file browser shows "localhost" as the
location of each file, which makes me think that HDFS does not distinguish
between the datanodes and instead figures out how to perform a short-circuit
read directly.

The scheduler itself makes the right assignments:
I1029 21:19:09.685812 24983 scheduler.cc:995] ScanRangeAssignment:
server=TNetworkAddress {
  01: hostname (string) = "lv-desktop",
  02: port (i32) = 22000,
}
I1029 21:19:09.685825 24983 scheduler.cc:1001] node_id=0
ranges=TScanRangeParams {
  01: scan_range (struct) = TScanRange {
    01: hdfs_file_split (struct) = THdfsFileSplit {
      01: file_name (string) = "f3",
      02: offset (i64) = 0,
      03: length (i64) = 20,
      04: partition_id (i64) = 0,
      05: file_length (i64) = 20,
      06: file_compression (i32) = 0,
      07: mtime (i64) = 1540872321692,
    },
  },
  02: volume_id (i32) = 2,
  03: is_cached (bool) = false,
  04: is_remote (bool) = false,
}
I1029 21:19:09.685854 24983 scheduler.cc:995] ScanRangeAssignment:
server=TNetworkAddress {
  01: hostname (string) = "lv-desktop",
  02: port (i32) = 22002,
}
I1029 21:19:09.685863 24983 scheduler.cc:1001] node_id=0
ranges=TScanRangeParams {
  01: scan_range (struct) = TScanRange {
    01: hdfs_file_split (struct) = THdfsFileSplit {
      01: file_name (string) = "f2",
      02: offset (i64) = 0,
      03: length (i64) = 20,
      04: partition_id (i64) = 0,
      05: file_length (i64) = 20,
      06: file_compression (i32) = 0,
      07: mtime (i64) = 1540872318716,
    },
  },
  02: volume_id (i32) = 1,
  03: is_cached (bool) = false,
  04: is_remote (bool) = false,
}
I1029 21:19:09.685868 24983 scheduler.cc:995] ScanRangeAssignment:
server=TNetworkAddress {
  01: hostname (string) = "lv-desktop",
  02: port (i32) = 22001,
}
I1029 21:19:09.685875 24983 scheduler.cc:1001] node_id=0
ranges=TScanRangeParams {
  01: scan_range (struct) = TScanRange {
    01: hdfs_file_split (struct) = THdfsFileSplit {
      01: file_name (string) = "f",
      02: offset (i64) = 0,
      03: length (i64) = 20,
      04: partition_id (i64) = 0,
      05: file_length (i64) = 20,
      06: file_compression (i32) = 0,
      07: mtime (i64) = 1540872285223,
    },
  },
  02: volume_id (i32) = 0,
  03: is_cached (bool) = false,
  04: is_remote (bool) = false,
}
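
In case it helps anyone reading along, here is a small Python sketch that pulls
the backend port, file name, volume id, and is_remote flag out of a dump like
the one above. The helper is hypothetical and not part of Impala; it assumes
the field order shown in this log (port, then file_name, volume_id, is_remote).

```python
import re

# Patterns for the fields of interest in the Thrift debug output.
_PORT = re.compile(r'port \(i32\) = (\d+)')
_FILE = re.compile(r'file_name \(string\) = "([^"]*)"')
_VOLUME = re.compile(r'volume_id \(i32\) = (-?\d+)')
_REMOTE = re.compile(r'is_remote \(bool\) = (true|false)')


def summarize_assignments(log_text):
    """Return one dict per scan range: port, file_name, volume_id, is_remote."""
    rows = []
    port = None      # port of the most recent ScanRangeAssignment server
    current = None   # scan range currently being collected
    for line in log_text.splitlines():
        if m := _PORT.search(line):
            port = int(m.group(1))
        elif m := _FILE.search(line):
            current = {"port": port, "file_name": m.group(1)}
        elif current is not None and (m := _VOLUME.search(line)):
            current["volume_id"] = int(m.group(1))
        elif current is not None and (m := _REMOTE.search(line)):
            # is_remote is the last field of a range in this log format.
            current["is_remote"] = m.group(1) == "true"
            rows.append(current)
            current = None
    return rows


if __name__ == "__main__":
    sample = (
        '02: port (i32) = 22000,\n'
        '01: file_name (string) = "f3",\n'
        '02: volume_id (i32) = 2,\n'
        '04: is_remote (bool) = false,\n'
    )
    for row in summarize_assignments(sample):
        print(row)
```

Run over the three assignments above, this lists f3 on port 22000, f2 on port
22002, and f on port 22001, each with is_remote set to false.
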
--
To view, visit http://gerrit.cloudera.org:8080/11517
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I211e2511ea3bb5edea29f1bd63e6b1fa4c4b1965
Gerrit-Change-Number: 11517
Gerrit-PatchSet: 3
Gerrit-Owner: Pooja Nilangekar <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Philip Zeyliger <[email protected]>
Gerrit-Reviewer: Pooja Nilangekar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Tue, 30 Oct 2018 04:22:08 +0000
Gerrit-HasComments: No