Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15104 )
Change subject: IMPALA-9228: ORC scanner reads rows into scratch batch ...................................................................... Patch Set 4: (7 comments) http://gerrit.cloudera.org:8080/#/c/15104/4/be/src/exec/hdfs-columnar-scanner.h File be/src/exec/hdfs-columnar-scanner.h: http://gerrit.cloudera.org:8080/#/c/15104/4/be/src/exec/hdfs-columnar-scanner.h@40 PS4, Line 40: (). nit: "...in the derived classes." http://gerrit.cloudera.org:8080/#/c/15104/4/be/src/exec/hdfs-columnar-scanner.h@45 PS4, Line 45: ; nit: you could add "= nullptr" to make the default value of it obvious. And then you don't need to init it in the constructor. http://gerrit.cloudera.org:8080/#/c/15104/4/be/src/exec/hdfs-orc-scanner.h File be/src/exec/hdfs-orc-scanner.h: http://gerrit.cloudera.org:8080/#/c/15104/4/be/src/exec/hdfs-orc-scanner.h@258 PS4, Line 258: has at least a : /// collection amongst its children. nit: sounds weird to me, maybe just "has a collection amongst..." http://gerrit.cloudera.org:8080/#/c/15104/4/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15104/4/be/src/exec/hdfs-orc-scanner.cc@703 PS4, Line 703: DCHECK_EQ(scratch_batch_->total_allocated_bytes(), 0); We hit this DCHECK when we come from the loop of AssembleRows(). We can also have allocated bytes if the scratch batch was compacted by the row batch, in that case the scratch batch don't transfer the ownership of 'tuple_mem_pool'. I think it's enough to only have this DCHECK Close(). http://gerrit.cloudera.org:8080/#/c/15104/4/be/src/exec/orc-column-readers.h File be/src/exec/orc-column-readers.h: http://gerrit.cloudera.org:8080/#/c/15104/4/be/src/exec/orc-column-readers.h@130 PS4, Line 130: to implement "curiously recurring template pattern". nit: to implement static polymorphism via the "curiously recurring template pattern". http://gerrit.cloudera.org:8080/#/c/15104/4/be/src/exec/orc-column-readers.h@161 PS4, Line 161: derived->ReadValueHelper(row_idx + i, tuple, pool) If you'd add the 'final' specifier to the ReadValue() implementations then the compiler could devirtualize these calls, i.e. you wouldn't need ReadValueHelper. http://gerrit.cloudera.org:8080/#/c/15104/4/be/src/exec/orc-column-readers.h@469 PS4, Line 469: virtual Status ReadValueBatch(ScratchTupleBatch* scratch_batch, MemPool* pool) You need to rename this to make clang-tidy happy as it hides virtual function with the same name. -- To view, visit http://gerrit.cloudera.org:8080/15104 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56db0325dee283d73742ebbae412d19693fac0ca Gerrit-Change-Number: 15104 Gerrit-PatchSet: 4 Gerrit-Owner: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Fri, 28 Feb 2020 15:20:01 +0000 Gerrit-HasComments: Yes
