Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15914 )

Change subject: IMPALA-9435: Usability enhancements for data cache access trace
......................................................................


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/15914/4/be/src/runtime/io/data-cache-trace-replayer.cc
File be/src/runtime/io/data-cache-trace-replayer.cc:

http://gerrit.cloudera.org:8080/#/c/15914/4/be/src/runtime/io/data-cache-trace-replayer.cc@35
PS4, Line 35: // This provides a command-line utility for replaying data cache 
trace files.
would be nice to include an example of how to invoke this from the command line.


http://gerrit.cloudera.org:8080/#/c/15914/4/be/src/runtime/io/data-cache-trace-replayer.cc@51
PS4, Line 51: means the cache may 2TB
nit: needs revising


http://gerrit.cloudera.org:8080/#/c/15914/4/be/src/runtime/io/data-cache-trace-replayer.cc@174
PS4, Line 174:    LOG(INFO) << "Cache hit statistics from the original trace:";
             :     DumpStatisticsToLog(original_trace_stats);
             :     LOG(INFO) << "Cache hit statistics from the replay:";
             :     DumpStatisticsToLog(replay_stats);
my understanding of this tool is that it takes the trace from a data cache 
configured with a specific eviction algorithm (e.g. LRU) and then replays it 
using a different eviction algorithm (e.g. LRFU), and them dumps the cache hit 
stats for the "original trace" (e.g. LRU) and the "replay trace" (e.g. LRFU). 
is that correct? might be worth mentioning that somewhere in the top level 
comments.


http://gerrit.cloudera.org:8080/#/c/15914/4/be/src/runtime/io/data-cache-trace.h
File be/src/runtime/io/data-cache-trace.h:

http://gerrit.cloudera.org:8080/#/c/15914/4/be/src/runtime/io/data-cache-trace.h@50
PS4, Line 50: class DataCacheTrace {
> Its a little weird how this class is basically a wrapper around several oth
at the very least, it would be good to separate out the Replayer code from the 
code that actually does the tracing (e.g. the code that writes to trace files). 
its a bit confusing to put them together.


http://gerrit.cloudera.org:8080/#/c/15914/4/be/src/runtime/io/data-cache-trace.cc
File be/src/runtime/io/data-cache-trace.cc:

http://gerrit.cloudera.org:8080/#/c/15914/4/be/src/runtime/io/data-cache-trace.cc@251
PS4, Line 251:   uint128 hash = util_hash::CityHash128(reinterpret_cast<const 
char*>(filename.data()),
if we are doing this because of security concerns, should we use a secure hash 
function like SHA? I guess this is the same function we use for 
'data_cache_anonymize_trace' so maybe this isn't an issue.


http://gerrit.cloudera.org:8080/#/c/15914/4/be/src/runtime/io/data-cache-trace.cc@375
PS4, Line 375:
nit: extra new line



--
To view, visit http://gerrit.cloudera.org:8080/15914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0f84204d8e5145f5fa8d4851d9c19ac317db168e
Gerrit-Change-Number: 15914
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Jun 2020 18:55:38 +0000
Gerrit-HasComments: Yes

Reply via email to