[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 ) Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. Patch Set 16: (10 comments) http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@54 PS15, Line 54: DCHECK(buffered_indices_.empty()); > Ok to leave for now but we've generally been moving towards using the recen Done http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@59 PS15, Line 59: / This > DCHECK_EQ(dict_bytes_cnt_, 0); Done http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@77 PS15, Line 77: /// indices. Used to size the buffer passed to WriteData(). > To add to what Tim said, this code didn't actually change, so it is nice to Done http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@77 PS15, Line 77: /// indices. Used to size the buffer passed to WriteData(). > We generally use the more concise one-line formatting for very short single Done http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@123 PS15, Line 123: > formatting nit: * goes on the left with the type name. Done http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@152 PS15, Line 152: } : num_call_track_++; : } : } : }; > I think we can omit this destructor, since DictEncoderBase does the same th Done http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@168 PS15, Line 168: /// th > nit: inline isn't necessary. It isn't harmful but can be confusing to have Done http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@269 PS15, Line 269: > nit: * should be on left. Done http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@296 PS15, Line 296: : template : class DictDecoder : public DictDecoderBase { : public: : // > I think we can omit this destructor, as DictDecoderBase does this. Done http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@330 PS15, Line 330: /// It > nit: unnecessary inline Done -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 16 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 10 Nov 2017 06:22:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#16). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch uses the memory tracker of HdfsScanner to track the memory used by dictionary in DictDecoder class. Similary it uses memory tracker of HdfsTableSink to track the memory used by dictionary in DictEncoder class. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 5 files changed, 142 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/16 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 16 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8215 to look at the new patch set (#6). Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. IMPALA-5142 EventSequence displays negative elapsed time. EventSequence::EventList which stores the Event in increasing time order may have at times Events that are not stored in increasing time order. This may happen when concurrently EventSequence::MarkEvent() is called for different Events in the same EventSequence. The incorrect time order of Event in EventList results in a negative value being displayed, which is the issue reported in this JIRA. The issue can be fixed either be re-ordering the lock in EventSequence::MarkEvent() or by sorting the EventList before printing. Reordering of lock in EventSequence::MarkEvent() will involve holding the lock across clock_gettime() so that approach is not taken. This patch fixes the issue by sorting the EventList in GetEvents() as this function is expected to return sorted list of events based on time. Testing: Ran all the front-end/backend and end-end tests. Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 --- M be/src/util/runtime-profile-counters.h 1 file changed, 6 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/8215/6 -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 6 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan HechtGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8215 ) Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/8215/5/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/8215/5/be/src/util/runtime-profile-counters.h@327 PS5, Line 327:[](Event const , Event const ) { : return event1.second < event2.second; > rather than creating a tmp eventlist_sorted, you should create a temp for t I'm dropping this logic to call the sort() all the time to make the result more predictable each time this function gets called and also since the event list will be small so sorting won't be that expensive. http://gerrit.cloudera.org:8080/#/c/8215/5/be/src/util/runtime-profile-counters.h@330 PS5, Line 330: == false > our style uses ! rather than == false Since I'm not checking the event list to be sorted so I won't need this anymore. http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h@326 PS4, Line 326: bool eventlist_sorted = std::is_sorted(events_.begin(), events_.end(), : [](Event const , Event const ) { : return event1.second < event2.second; : }); : if (eventlist_sorted == false) { : std::sort(events_.begin(), events_.end(), : [](Event const , Event const ) { : return event1.second < event2.second; : }); : > the flipside is that this becomes a cliff in the sense that normally it wil Done as suggested kept the functionality simple, removed the logic of having an extra check. -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 5 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan HechtGerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Wed, 01 Nov 2017 23:01:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8215 to look at the new patch set (#5). Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. IMPALA-5142 EventSequence displays negative elapsed time. EventSequence::EventList which stores the Event in increasing time order may have at times Events that are not stored in increasing time order. This may happen when concurrently EventSequence::MarkEvent() is called for different Events in the same EventSequence. The incorrect time order of Event in EventList results in a negative value being displayed, which is the issue reported in this JIRA. The issue can be fixed either be re-ordering the lock in EventSequence::MarkEvent() or by sorting the EventList before printing. Reordering of lock in EventSequence::MarkEvent() will involve holding the lock across clock_gettime() so that approach is not taken. This patch fixes the issue by sorting the EventList in GetEvents() as this function is expected to return sorted list of events based on time. Testing: Ran all the front-end/backend and end-end tests. Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 --- M be/src/util/runtime-profile-counters.h 1 file changed, 14 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/8215/5 -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 5 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan HechtGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8215 ) Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h@326 PS4, Line 326: bool eventlist_sorted = true; : int64_t prev = 0L; : for (const Event& event: events_) { : if (event.second < prev) { : eventlist_sorted = false; : break; : } : prev = event.second; : } : > why bother with this? the list is pretty short and this isn't a critical pa Since the list is sorted in most of the times calling sort() may be expensive (if in future # of events becomes large), hence I'll use the std::is_sorted() instead. http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h@336 PS4, Line 336: [](Event const , > nit: how about breaking the line before the [] instead of in the middle of Done -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 4 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan HechtGerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Wed, 01 Nov 2017 22:24:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#15). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch uses the memory tracker of HdfsScanner to track the memory used by dictionary in DictDecoder class. Similary it uses memory tracker of HdfsTableSink to track the memory used by dictionary in DictEncoder class. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 5 files changed, 140 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/15 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 15 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 ) Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/util/dict-encoding.h@63 PS12, Line 63: /// dict_encoded_size() bytes. : virtual void Writ > Close() should call ClearIndices() as one of its steps, ClearIndices() stil My bad didn't realize it, now that's fixed. -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 14 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 01 Nov 2017 20:44:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#14). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch uses the memory tracker of HdfsScanner to track the memory used by dictionary in DictDecoder class. Similary it uses memory tracker of HdfsTableSink to track the memory used by dictionary in DictEncoder class. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 5 files changed, 134 insertions(+), 15 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/14 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 14 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#13). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch uses the memory tracker of HdfsScanner to track the memory used by dictionary in DictDecoder class. Similary it uses memory tracker of HdfsTableSink to track the memory used by dictionary in DictEncoder class. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 5 files changed, 137 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/13 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 13 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 ) Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. Patch Set 12: (5 comments) http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-scanner.h File be/src/exec/hdfs-parquet-scanner.h: http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-scanner.h@352 PS12, Line 352: MemTracker* dict_mem_tracker() { return scan_node_->mem_tracker(); } > See comment in HdfsParquetTableWriter. Also, see how we pass the mem_tracke Done http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.h File be/src/exec/hdfs-parquet-table-writer.h: http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.h@81 PS12, Line 81: /// Memory tracker to track the memory used by encoder. : MemTracker* dict_mem_tracker(); > When I'm reading the code, I'm thinking that this is hiding more than it is Removed the function made it in line where the memtracker is used so it's more explicit http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.cc@179 PS12, Line 179: dict_encoder_base_->ClearIndices(); : dict_encoder_base_->Close(); > When Close() does a call to ClearIndices(), this can be simplified to just implemented the Close() logic inside ClearIndices() http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.cc@322 PS12, Line 322:plain_encoded_value_size_, :parent_->dict_mem_tracker())); > Nit: Indentation in this case should be even with the "D" in new DictEncode aligned the indentation with D in next line. http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/util/dict-encoding.h@63 PS12, Line 63: void Close() { : ReleaseBytes(); > I think Close() should do ClearIndices(). implemented the logic of Close() inside ClearIndices() -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 12 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 01 Nov 2017 01:07:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8215 to look at the new patch set (#4). Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. IMPALA-5142 EventSequence displays negative elapsed time. EventSequence::EventList which stores the Event in increasing time order may have at times Events that are not stored in increasing time order. This may happen when concurrently EventSequence::MarkEvent() is called for different Events in the same EventSequence. The incorrect time order of Event in EventList results in a negative value being displayed, which is the issue reported in this JIRA. The issue can be fixed either be re-ordering the lock in EventSequence::MarkEvent() or by sorting the EventList before printing. Reordering of lock in EventSequence::MarkEvent() will involve holding the lock across clock_gettime() so that approach is not taken. This patch fixes the issue by sorting the EventList in GetEvents() as this function is expected to return sorted list of events based on time. Testing: Ran all the front-end/backend and end-end tests. Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 --- M be/src/util/runtime-profile-counters.h 1 file changed, 19 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/8215/4 -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 4 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan HechtGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8215 to look at the new patch set (#3). Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. IMPALA-5142 EventSequence displays negative elapsed time. EventSequence::EventList which stores the Event in increasing time order may have at times Events that are not stored in increasing time order. This may happen when concurrently EventSequence::MarkEvent() is called for different Events in the same EventSequence. The incorrect time order of Event in EventList results in a negative value being displayed, which is the issue reported in this JIRA. The issue can be fixed either be re-ordering the lock in EventSequence::MarkEvent() or by sorting the EventList before printing. Reordering of lock in EventSequence::MarkEvent() will involve holding the lock across clock_gettime() so that approach is not taken. This patch fixes the issue by sorting the EventList in GetEvents() as this function is expected to return sorted list of events based on time. Testing: Ran all the front-end/backend and end-end tests. Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 --- M be/src/util/runtime-profile-counters.h 1 file changed, 19 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/8215/3 -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 3 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan HechtGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8215 ) Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/8215/2/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/8215/2/be/src/util/runtime-profile.cc@658 PS2, Line 658: event_sequence.second->GetEvents(); : if (last == 0 && events.size() > 0) last = events.back().second; > that looks like it could be inconsistent with the sorted order Introduced a fix to check and fix the event order in event list in GetEvents() http://gerrit.cloudera.org:8080/#/c/8215/2/be/src/util/runtime-profile.cc@672 PS2, Line 672: > runtime-profile-test.cc has a test case that expects GetEvents() to be in o The events are printed in the list order and as they are printed the value of current is decremented from previous to get the elapsed time. So we are getting a list of events that are not in order. I've moved the sorting code in GetEvents() as per your suggestion which seems to be more plausible. -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 2 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan HechtGerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Wed, 25 Oct 2017 18:33:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8215 to look at the new patch set (#2). Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. IMPALA-5142 EventSequence displays negative elapsed time. EventSequence::EventList which stores the Event in increasing time order may have at times Events that are not stored in increasing time order. This may happen when concurrently EventSequence::MarkEvent() is called for different Events in the same EventSequence. The incorrect time order of Event in EventList results in a negative value being displayed, which is the issue reported in this JIRA. The issue can be fixed either be re-ordering the lock in EventSequence::MarkEvent() or by sorting the EventList before printing. This patch fixes the issue by sorting the EventList before printing.Since reordering the lock in EventSequence::MarkEvent() will involve holding the lock across clock_gettime(). Testing: Ran all the front-end/backend and end-end tests. Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 --- M be/src/util/runtime-profile.cc 1 file changed, 7 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/8215/2 -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 2 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan HechtGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#12). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch uses the memory tracker of HdfsScanner to track the memory used by dictionary in DictDecoder class. Similary it uses memory tracker of HdfsTableSink to track the memory used by dictionary in DictEncoder class. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 7 files changed, 146 insertions(+), 15 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/12 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 12 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#11). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch uses the memory tracker of HdfsScanner to track the memory used by dictionary in DictDecoder class. Similary it uses memory tracker of HdfsTableSink to track the memory used by dictionary in DictEncoder class. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 8 files changed, 160 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/11 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 11 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 ) Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. Patch Set 10: (5 comments) http://gerrit.cloudera.org:8080/#/c/8034/10/be/src/exec/hdfs-parquet-scanner.h File be/src/exec/hdfs-parquet-scanner.h: http://gerrit.cloudera.org:8080/#/c/8034/10/be/src/exec/hdfs-parquet-scanner.h@352 PS10, Line 352: return mem_tracker_; > if this is the same mem tracker in scan_node_ perhaps we can just do I tried that earlier but I was getting the below error for parquet table writer, so in order to have similar change in scanner and writer went this route Error: - hdfs-parquet-table-writer.h:83:57: error: invalid use of incomplete type ‘class impala::HdfsTableSink’ http://gerrit.cloudera.org:8080/#/c/8034/10/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8034/10/be/src/exec/hdfs-parquet-scanner.cc@337 PS10, Line 337: if (mem_tracker_ != nullptr) { : mem_tracker_ = nullptr; : } > nit: one line Done http://gerrit.cloudera.org:8080/#/c/8034/10/be/src/exec/parquet-column-readers.h File be/src/exec/parquet-column-readers.h: http://gerrit.cloudera.org:8080/#/c/8034/10/be/src/exec/parquet-column-readers.h@356 PS10, Line 356: if (dict_decoder_base_ != nullptr) dict_decoder_base_->Close(); > instead of dict_decoder_base_ we can just use the virtual function GetDicti GetDictionaryDecoder() returns dict_decoder_base_ it should be the same calling the function GetDictionaryDecoder() or accessing member variable dict_decoder_base_ directly. http://gerrit.cloudera.org:8080/#/c/8034/10/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: http://gerrit.cloudera.org:8080/#/c/8034/10/be/src/util/dict-encoding.h@206 PS10, Line 206: > nit: extra line Done http://gerrit.cloudera.org:8080/#/c/8034/10/be/src/util/dict-encoding.h@324 PS10, Line 324: > nit: extra line Done -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 10 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 20 Oct 2017 23:04:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#10). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch uses the memory tracker of HdfsScanner to track the memory used by dictionary in DictDecoder class. Similary it uses memory tracker of HdfsTableSink to track the memory used by dictionary in DictEncoder class. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 8 files changed, 166 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/10 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 10 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 ) Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. Patch Set 9: (5 comments) http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-scanner.cc@348 PS8, Line 348: f the column. > We should be using Close() here instead of unregistering every tracker (thi Removed the Close() since I'm using the MemTracker of scan_node_->mem_tracker so won't be using 1000's of memtrackers now http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h@133 PS8, Line 133: di > nit: inline here and nearby isn't necessary for a couple of reason. First, Removed inline http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h@184 PS8, Line 184: Node(const T& v, const NodeIndex& n) : value(v), next(n) { } > Nit: check isn't necessary - Release() does this for you. Removed the check http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h@194 PS8, Line 194: enum { INVALID_INDEX = 4 }; > The TODO should be to use TryConsume(). The asynchronous checks are not the Changed the TODO to use TryConsume() http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h@251 PS8, Line 251: / This function > I agree with Joe's comment. All codepaths should call Close(), so we can ju Added a DCHECK found during the testing that some of the column readers do not call Close, so I've added ReleaseBytes() so that accounting of memory used for dictionary is release when DictDecoder is destroyed. -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 9 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Oct 2017 05:28:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8215 ) Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8215/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8215/1//COMMIT_MSG@22 PS1, Line 22: value. > That's right since the code not synchronized we can read a garbage start_ w In the thread doing ClientRequestState::ClientRequestState() a) When the function AddEventSequence() is called, an EventSequence with the name is looked for, if found then the EventSequence is returned else a new event sequence gets created. b) Then after the query_events->Start(0 resets the monotonic time for the EventSequence, so say in this case a matching name for "Query Timeline" was found In another thread which is done doing AdmissionController::AdmitQuery(QuerySchedule* schedule) in the mean time will have ScopedEvent destroyed for the same EventSequence and since they are not serialized we may hit upon this issue. /// Utility class to mark an event when the object is destroyed. class ScopedEvent { . /// Mark the event when the object is destroyed ~ScopedEvent() { event_sequence_->MarkEvent(label_); ---> } }; -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 1 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan HechtGerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Tue, 17 Oct 2017 21:55:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8215 ) Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8215/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8215/1//COMMIT_MSG@22 PS1, Line 22: value. > but the code isn't synchronized at all, so you could (in theory) read a gar That's right since the code not synchronized we can read a garbage start_ which is the case here.I was wondering wether there will be performance overhead if lock is used to seialize. So instead of printing a negative value I'm printing 0. The Start() is called from ClientRequestState() ClientRequestState::ClientRequestState() { query_events_ = summary_profile_->AddEventSequence("Query Timeline");··· query_events_->Start(); } whereas ElapsedTime() RuntimeProfile::EventSequence::MarkEvent() { sw_.ElapsedTime(); } -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 1 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan HechtGerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Tue, 17 Oct 2017 03:48:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#9). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch introduces memory tracker to track the memory used by dictionary used in DictDecoder class and DictEncoder class when a parquet file is read from or written to. Memory tracker is introduced at the class HdfsParquetScanner level to track all the memory used by dictonary in DictDecoders. Similarly memory tracker is introduced in class HdfsParquetTableWriter to track the memory used by dictionary in DictEncoders. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 8 files changed, 170 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/9 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 9 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 ) Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. Patch Set 8: (8 comments) http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-scanner.cc@274 PS8, Line 274: // Before closing the Column readers, the accounted bytes in mem_tracker for : // dict_decoder_ is released. : if (mem_tracker_ != nullptr) { : for (ParquetColumnReader* col_reader : column_readers_) col_reader->Cleanup(); : } > I think this code won't be necessary if col_reader->Close() does the dictio Removed this code and wrapped it up with the col_reader->Close() http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-scanner.cc@348 PS8, Line 348: CloseAndUnregisterFromParent > Same question here as elsewhere: Do we really need CloseAndUnregisterFromPa I found that MemTracker::Close() is not unregistering with the parent MemTracker , so I found CloseAndUnregisterFromParent to be more appropriate. http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-table-writer.cc@178 PS8, Line 178: dict_encoder_base_->ClearIndices(); > This should call the new base Close() function (which should do ClearIndice Done http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-table-writer.cc@1057 PS8, Line 1057: CloseAndUnregisterFromParent(); > I think we want to limit use of CloseAndUnregisterFromParent() to cases tha I checked the MemTracker::Close(), it does not unregister the mem_tracker_ from it's parent, Since this mem_tracker_ should not be used beyond this point I have reset it to NULL http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.h File be/src/exec/parquet-column-readers.h: http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.h@234 PS8, Line 234: /// Delete the bytes used for memory tracking. : virtual void Cleanup() {} > I think this can be folded into Close() and removed. Done http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.cc@269 PS8, Line 269: virtual void Cleanup() { : dict_decoder_.Close(); : } > I think this can be folded into Close() without a separate Cleanup() functi Done http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h@107 PS8, Line 107: DictEncoder(MemPool* pool, int encoded_value_size, MemTracker* mem_tracker) : : DictEncoderBase(pool), buckets_(HASH_TABLE_SIZE, Node::INVALID_INDEX), : encoded_value_size_(encoded_value_size), : dict_mem_tracker_(mem_tracker), : dict_bytes_cnt_(0) { } > As I understand it, the DictEncoderBase/DictEncoder split is that DictEncod Moved the functionality to the base class as you suggested http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h@251 PS8, Line 251: ReleaseBytes(); > Is there any codepath that should legitimately use this? If not, DCHECK tha Checking the dict_bytes_cnt_ to be zero seems to be tricky say even if it is not used in a function it can have a non zero value. -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 8 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Fri, 13 Oct 2017 20:45:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Addressing comments.
Pranay Singh has abandoned this change. ( http://gerrit.cloudera.org:8080/8263 ) Change subject: Addressing comments. .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/8263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: Iac2047ad6a548f4c9b1edb51189fe0970507d6dd Gerrit-Change-Number: 8263 Gerrit-PatchSet: 1 Gerrit-Owner: Pranay Singh
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#8). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch introduces memory tracker to track the memory used by dictionary used in DictDecoder class and DictEncoder class when a parquet file is read from or written to. Memory tracker is introduced at the class HdfsParquetScanner level to track all the memory used by dictonary in DictDecoders. Similarly memory tracker is introduced in class HdfsParquetTableWriter to track the memory used by dictionary in DictEncoders. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 8 files changed, 157 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/8 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 8 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] Addressing comments.
Pranay Singh has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8263 Change subject: Addressing comments. .. Addressing comments. Change-Id: Iac2047ad6a548f4c9b1edb51189fe0970507d6dd --- M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/parquet-column-readers.h M be/src/util/dict-encoding.h 3 files changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/8263/1 -- To view, visit http://gerrit.cloudera.org:8080/8263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iac2047ad6a548f4c9b1edb51189fe0970507d6dd Gerrit-Change-Number: 8263 Gerrit-PatchSet: 1 Gerrit-Owner: Pranay Singh
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#7). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch introduces memory tracker to track the memory used by dictionary used in DictDecoder class and DictEncoder class when a parquet file is read from or written to. Memory tracker is introduced at the class HdfsParquetScanner level to track all the memory used by dictonary in DictDecoders. Similarly memory tracker is introduced in class HdfsParquetTableWriter to track the memory used by dictionary in DictEncoders. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 8 files changed, 158 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/7 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 7 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8056 ) Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/8056/8//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8056/8//COMMIT_MSG@15 PS8, Line 15: Ran the query_test/test_scanners_fuzz.py in a loop (5 times) > OK I'll run overnight and check tomorrow morning what happens. Ran for 4 hours in a loop didn't see impalad crashing. -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-Change-Number: 8056 Gerrit-PatchSet: 8 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 05 Oct 2017 19:39:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 ) Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-scanner.h File be/src/exec/hdfs-parquet-scanner.h: http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-scanner.h@485 PS6, Line 485: /// Tracks all the memory allocation used by dictionary in DictDecoder. : struct DictMemTrack { : std::unique_ptr mem_tracker; : : DictMemTrack(MemTracker *parent_memtrack):mem_tracker(new MemTracker(-1, : "Decoder parquet dict", parent_memtrack, false)) { }; : : MemTracker* GetMemTracker() { : return mem_tracker.get(); : } : : ~DictMemTrack() { : mem_tracker->CloseAndUnregisterFromParent(); : } : }; > I dont think we need to add a new struct for this, you can just call CloseA I had added a new structure to have the creation /destruction of memtracker in one place and also the CloseAndUnregisterFromParent() has to be called after releasing all bytes owned by MemTracker in the DictDecoder class (which happens in it's destructor), since the destructor of member variables are called later than the owner making a separate struct/class for DictMemTrack solved the problem. http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-table-writer.cc@113 PS6, Line 113: // Bytes used for memory tracking by dictionary in DictEncoder is decremented. : virtual void Cleanup() { : if (dict_encoder_base_ != nullptr) { : dict_encoder_base_->Cleanup(); : } : } > you can move this to the BaseColumnWriter::Close() method This can't be done because I'm not tracking the bytes at the BaseColumnWriter http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-table-writer.cc@767 PS6, Line 767: // Below code will decrement the bytes used for memory tracking by dictionary in : // DictEncoder class for each ColumnWriter. : for (int i = 0; i < columns_.size(); i++) { : if (columns_[i] != nullptr) { :columns_[i]->Cleanup(); : } : } > similarly you can move this to the HdfsParquetTableWriter::Close() method I'll try moving the functionality in Close() for HdfsParquetTableWriter/Reader http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/util/dict-encoding.h@138 PS6, Line 138: SizeofDict > I think we can use a more precise name here, SizeOfDict might be confused w I've changed it to be DictByteSize http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/util/dict-encoding.h@438 PS6, Line 438: bytes_alloc += sizeof(value); > can you switch to what joe suggested earlier, that is, instead of bytes_all This looks fine what is the issue with this? -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 6 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Thu, 05 Oct 2017 01:58:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8056 ) Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/8056/8//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8056/8//COMMIT_MSG@15 PS8, Line 15: Ran the query_test/test_scanners_fuzz.py in a loop (5 times) > I was thinking more like overnight (or a few hours). It's worth doing just OK I'll run overnight and check tomorrow morning what happens. -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-Change-Number: 8056 Gerrit-PatchSet: 8 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 05 Oct 2017 01:32:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Pranay Singh has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8215 Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. IMPALA-5142 EventSequence displays negative elapsed time. EventSequence may display a negative elapsed time if ClientRequestState is done while EventSequence::MarkEvent() is in progress. The time is calculated based on POSIX clock_gettime() which uses CLOCK_MONOTONIC that will always give a time in increasing time order. In this case the cause of negative delay is start time becoming greater than end time, as the difference of latter and the former is used to compute the delay. The negative delay case can happen if EventSequence::Start() is called on the same EventSequence for which EventSequence::MarkEvent() is in progress.It can be serialized by reordering the lock in EventSequence::MarkEvent() but that may be a performance overhead, calling ElapsedTime() with lock held.So this patch fixes the issue by returning a 0 delay value when such an issue ever happens rather than returning a negative delay value. Testing: --- Ran all the front-end/backend and end-end tests. Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 --- M be/src/util/stopwatch.h 1 file changed, 6 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/8215/1 -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 1 Gerrit-Owner: Pranay Singh
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Hello Joe McDonnell, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8056 to look at the new patch set (#8). Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Testing --- Ran the query_test/test_scanners_fuzz.py in a loop (5 times) and there was no impalad crash seen. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 50 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/8 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-Change-Number: 8056 Gerrit-PatchSet: 8 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Hello Joe McDonnell, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8056 to look at the new patch set (#7). Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Testing --- Ran the query_test/test_scanners_fuzz.py in a loop (5 times) and there was no impalad crash seen. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 50 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/7 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-Change-Number: 8056 Gerrit-PatchSet: 7 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8056 ) Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. Patch Set 6: (4 comments) http://gerrit.cloudera.org:8080/#/c/8056/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8056/6//COMMIT_MSG@12 PS6, Line 12: > Can you mention what testing you did? The fuzz tests are randomised we shou Done http://gerrit.cloudera.org:8080/#/c/8056/6/tests/query_test/test_scanners_fuzz.py File tests/query_test/test_scanners_fuzz.py: http://gerrit.cloudera.org:8080/#/c/8056/6/tests/query_test/test_scanners_fuzz.py@101 PS6, Line 101: """Parquet tables in default schema are compressed, so in order > It's weird that parquet/none means parquet/snappy. Unsure what the history Added a comment stating that parquet/none means default SNAPPY compression is set. http://gerrit.cloudera.org:8080/#/c/8056/6/tests/query_test/test_scanners_fuzz.py@116 PS6, Line 116: " select * from functional_parquet.{1}".format(fq_tbl_name, orig_tbl_name)) > Long lines > 90 chars here and just below. Fixed it. http://gerrit.cloudera.org:8080/#/c/8056/6/tests/query_test/test_scanners_fuzz.py@140 PS6, Line 140: self.execute_query("create table %s.%s like %s.%s" % (fuzz_db, fuzz_table, src_db, src_table)) > Long line Done -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-Change-Number: 8056 Gerrit-PatchSet: 6 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 03 Oct 2017 23:45:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#6). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch introduces memory tracker to track the memory used by dictionary used in DictDecoder class and DictEncoder class when a parquet file is read from or written to. Memory tracker is introduced at the class HdfsParquetScanner level to track all the memory used by dictonary in DictDecoders. Similarly memory tracker is introduced in class HdfsParquetTableWriter to track the memory used by dictionary in DictEncoders. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/parquet-column-readers.cc M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 7 files changed, 170 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/6 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 6 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#5). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch introduces memory tracker to track the memory used by dictionary used in DictDecoder class and DictEncoder class when a parquet file is read from or written to. Memory tracker is introduced at the class HdfsParquetScanner level to track all the memory used by dictonary in DictDecoders. Similarly memory tracker is introduced in class HdfsParquetTableWriter to track the memory used by dictionary in DictEncoders. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/parquet-column-readers.cc M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 7 files changed, 171 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/5 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 5 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 ) Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@113 PS3, Line 113: > Remove trailing spaces. This also applies to lines that don't have any code Done http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@164 PS3, Line 164: mtrackp > For class fields, the convention is to end the field name with an underscor Done http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@412 PS3, Line 412: ConsumeBytes(sizeof(value)); > using num_values * size of type might not work when dealing with variable s Done http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@412 PS3, Line 412: ConsumeBytes(sizeof(value)); > The parquet DictionaryPageHeader contains a num_values field. Look where we Ok I'll allocate a local variable to track the bytes consumed. http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@412 PS3, Line 412: ConsumeBytes(sizeof(value)); > For strings, the StringValue is a pointer and a length. In the case of the Done http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-test.cc File be/src/util/dict-test.cc: http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-test.cc@39 PS3, Line 39: tracker > can you add test cases that verify that the encoder/decoder is keeping trac I've added the checks now -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 3 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Tue, 03 Oct 2017 21:45:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#4). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch introduces memory tracker to track the memory used by dictionary used in DictDecoder class and DictEncoder class when a parquet file is read from or written to. Memory tracker is introduced at the class HdfsParquetScanner level to track all the memory used by dictonary in DictDecoders. Similarly memory tracker is introduced in class HdfsParquetTableWriter to track the memory used by dictionary in DictEncoders. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/parquet-column-readers.cc M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 7 files changed, 171 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/4 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 4 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Hello Joe McDonnell, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8056 to look at the new patch set (#6). Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 43 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/6 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-Change-Number: 8056 Gerrit-PatchSet: 6 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8056 ) Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/8056/5/tests/query_test/test_scanners_fuzz.py File tests/query_test/test_scanners_fuzz.py: http://gerrit.cloudera.org:8080/#/c/8056/5/tests/query_test/test_scanners_fuzz.py@103 PS5, Line 103: with no compression. > Nit: I would emphasize that this has a new table. Something like: "into a n Done http://gerrit.cloudera.org:8080/#/c/8056/5/tests/query_test/test_scanners_fuzz.py@110 PS5, Line 110: for orig_tbl_name in tbl_list: : src_table_name = "parquet_uncomp_src_" + orig_tbl_name : dst_table_name = "parquet_uncomp_dst_" + orig_tbl_name > Nit: Done -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-Change-Number: 8056 Gerrit-PatchSet: 5 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 03 Oct 2017 00:57:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#3). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch introduces memory tracker to track the memory used by dictionary used in DictDecoder class and DictEncoder class when a parquet file is read from or written to. Memory tracker is introduced at the class ExecNode level to track all the memory used by dictonary in DictDecoders. Similarly memory tracker is introduced in class DataSink to track the memory used by dictionary in DictEncoders. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/parquet-column-readers.cc M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 8 files changed, 93 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/3 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 3 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Hello Joe McDonnell, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8056 to look at the new patch set (#5). Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 42 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/5 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-Change-Number: 8056 Gerrit-PatchSet: 5 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8056 ) Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/8056/2/tests/query_test/test_scanners_fuzz.py File tests/query_test/test_scanners_fuzz.py: http://gerrit.cloudera.org:8080/#/c/8056/2/tests/query_test/test_scanners_fuzz.py@97 PS2, Line 97: if table_format.file_format != 'parquet': pytest.sk > Another option is to keep the clone code in run_fuzz_test, but change the a Done http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py File tests/query_test/test_scanners_fuzz.py: http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py@101 PS4, Line 101: """ Clone an existing parquet table with codec as none in the : unique database. This cloned table is passed to run_fuzz_test : which clones the table and corrupts the table. The test later : checks that there is no crash while performing SQL queries on : a corrupt table. : """ > I think this comment should focus on why this test is different from the ot Done http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py@111 PS4, Line 111: db_name = unique_database > I would prefer to emphasize that the source and destination are the unique_ Done http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py@117 PS4, Line 117: functional_parquet.alltypes > Can we extend this to do fuzzing on decimal_tbl as well? I was thinking thi Done http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py@118 PS4, Line 118: .format(fq_tbl_name)) > This indentation is a bit awkward. I don't think .format should be on its o Moved the format in the same line as select. -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-Change-Number: 8056 Gerrit-PatchSet: 4 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Tue, 26 Sep 2017 17:29:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Hello Joe McDonnell, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8056 to look at the new patch set (#4). Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 46 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/4 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-Change-Number: 8056 Gerrit-PatchSet: 4 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-1649 Pass precision to Print TCounterType::DOUBLE VALUE
Pranay Singh has abandoned this change. ( http://gerrit.cloudera.org:8080/7725 ) Change subject: IMPALA-1649 Pass precision to Print TCounterType::DOUBLE_VALUE .. Abandoned After talking to Tim and Joe, it appears that TUnit::DOUBLE_VALUE is needed and cannot be removed. The use of TUnit::DOUBLE_VALUE can be seen in the function RuntimeProfile::PrintChildCounters(), which calls PrettyPrinter::Print() for printing different TUnit types. We can't differentiate between the counters that need to be displayed with a decimal precision of FIXED_PRECISION from the other TUnit::UNIT types, if we remove TUnit::DOUBLE_VALUE . So we do need TUnit::DOUBLE_VALUE to have counters that need to be displayed as a floating point values with FIXED_PRECISION. -- To view, visit http://gerrit.cloudera.org:8080/7725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: Icc63d8a59bdf175341097df143798fb1a957d93f Gerrit-Change-Number: 7725 Gerrit-PatchSet: 2 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has uploaded a new patch set (#3). Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 59 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/3 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has uploaded a new patch set (#5). Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 59 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/5 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has uploaded a new patch set (#4). Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 59 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/4 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has uploaded a new patch set (#3). Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 24 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/3 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has uploaded a new patch set (#3). Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 24 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/3 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has uploaded a new patch set (#3). Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 23 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/3 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has posted comments on this change. Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/8056/2/tests/query_test/test_scanners_fuzz.py File tests/query_test/test_scanners_fuzz.py: PS2, Line 97: fq_tbl_name = "functional_parquet" + "." + tbl_name > Do not create a table in the default schemas. Instead, with the changes I d I'll use the functional-parquet to create the cloned table which will be passed to fuzz_test PS2, Line 129: self.execute_query("create table %s.%s like %s" % (unique_database, table, table)) : fuzz_table_location = get_fs_path("/test-warehouse/{0}.db/{1}".format( : unique_database, table)) > Pull this logic out into its own function e.g. clone_table. run_fuzz_test s I'll retain the old functionality as per your latest comments -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1649 Pass precision to Print TCounterType::DOUBLE VALUE
Pranay Singh has posted comments on this change. Change subject: IMPALA-1649 Pass precision to Print TCounterType::DOUBLE_VALUE .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/7725/2/be/src/util/pretty-printer.h File be/src/util/pretty-printer.h: PS2, Line 142: case TUnit::DOUBLE_VALUE: { : double output = static_cast(value); : ss << std::setprecision(precision) << output << " "; > I thought of removing TUnit::DOUBLE_VALUE from the code and replacing it wi I see that below values displayed in Runtime profile marked as --> are of type TUnit::DOUBLE changing over to TUnit::UNIT will affect their display. HDFS_SCAN_NODE (id=0):(Total: 179.334ms, non-child: 179.334ms, % non-child: 100.00%) - AverageHdfsReadThreadConcurrency: 0.00 --> - AverageScannerThreadConcurrency: 0.00 ---> Hdfs split stats (:<# splits>/): 0:24/478.45 KB - AverageThreadTokens: 0.00 --> - BloomFilterBytes: 0 File Formats: TEXT/NONE:24 - AverageHdfsReadThreadConcurrency: 0.00 --> - AverageScannerThreadConcurrency: 0.00 --> TUnit::DOUBLE -- To view, visit http://gerrit.cloudera.org:8080/7725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icc63d8a59bdf175341097df143798fb1a957d93f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has uploaded a new patch set (#2). Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 14 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/2 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has uploaded a new patch set (#2). Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 14 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/2 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has posted comments on this change. Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/8056/1/tests/query_test/test_scanners_fuzz.py File tests/query_test/test_scanners_fuzz.py: Line 68 > Avoid unrelated whitespace diffs. Thanks I was able to fix this issue PS1, Line 96: fq_tbl_name = "functional_parquet" + "." + tbl_name > I'm wary of creating tables in our default schemas. This won't get cleaned Shall I drop the table after running fuzz test ? PS1, Line 98: create = ("create table {0} stored as parquet as select * from functional.alltypes" : .format(fq_tbl_name)) > I think we need to verify that the right options are being set when we crea I introduced a check that compression_codec == none is only used here line #93 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1649 Pass precision to Print TCounterType::DOUBLE VALUE
Pranay Singh has posted comments on this change. Change subject: IMPALA-1649 Pass precision to Print TCounterType::DOUBLE_VALUE .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/7725/2/be/src/util/pretty-printer.h File be/src/util/pretty-printer.h: PS2, Line 142: case TUnit::DOUBLE_VALUE: { : double output = static_cast(value); : ss << std::setprecision(precision) << output << " "; > I think TUnit::DOUBLE_VALUE is so similar to TUnit::UNIT that we should be I thought of removing TUnit::DOUBLE_VALUE from the code and replacing it with TUnit::UNIT but was perplexed by two cases; a) TUnit::UNIT only obeys precision for larger values: It conditionally sets the precision for large values greater than 1000, so how should a caller display a smaller value in floating point representation restricted to two decimal places(or any determined by precision). Say this can be done by setting the precision argument but the second issue is more confounding. b) Printing different variables of TUnit::UNIT type with different precision: Use of common function like RuntimeProfile::PrintChildCounters() to display variables of same type with different precision makes it hard to use TUnit::UNIT type, it appears that TUnit::DOUBLE is a way to Print some variables with smaller value to be restricted to 2 decimal places (or any precision) and the others to be displayed without any restriction. Hence I thought that TUnit::DOUBLE_VALUE cannot be removed, so made changes for it. -- To view, visit http://gerrit.cloudera.org:8080/7725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icc63d8a59bdf175341097df143798fb1a957d93f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1649 Pass precision to Print TCounterType::DOUBLE VALUE
Pranay Singh has uploaded a new patch set (#2). Change subject: IMPALA-1649 Pass precision to Print TCounterType::DOUBLE_VALUE .. IMPALA-1649 Pass precision to Print TCounterType::DOUBLE_VALUE Fix: Current code uses reinterpret_cast for printing TCounterType::DOUBLE_VALUE, which is unsafe. It also uses a fixed precision, 2 for printing which may not be desirable. This change removes the usage of reinterpret_cast and provides an option to specify precision as an argument type to Print TCounterType::DOUBLE_VALUE. Change-Id: Icc63d8a59bdf175341097df143798fb1a957d93f --- M be/src/util/pretty-printer-test.cc M be/src/util/pretty-printer.h M be/src/util/runtime-profile.cc 3 files changed, 20 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/7725/2 -- To view, visit http://gerrit.cloudera.org:8080/7725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icc63d8a59bdf175341097df143798fb1a957d93f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has uploaded a new change for review. http://gerrit.cloudera.org:8080/8056 Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 13 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/1 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Pranay Singh has uploaded a new patch set (#2). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch uses existing dictionary_pool_ to track the memory used by std::vector in DictDecoder class, whereas an existing memory pool for DictEncoder class is used when allocating memory for std::vector in DictEncoder class. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/parquet-column-readers.cc M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 3 files changed, 42 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/2 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Pranay Singh has posted comments on this change. Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/8034/1/be/src/exec/exec-node.h File be/src/exec/exec-node.h: PS1, Line 215: MemTracker* decoder_mem_tracker() { return decoder_mem_tracker_.get(); } > The dictionary is only used for Parquet, so I think the MemPool should not I also think using existing memory pool, dictionary_pool_ is a better option, I missed it. I'll use dictionary_pool_ instead http://gerrit.cloudera.org:8080/#/c/8034/1/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: PS1, Line 213: dict_decoder_(new MemPool(parent->scan_node_->decoder_mem_tracker())), > It is important not to create objects on a per-column basis unless truly ne I'll use dictionary_pool_ instead of creating a new memory pool for every column reads http://gerrit.cloudera.org:8080/#/c/8034/1/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: PS1, Line 234: *val_ptr = *dict_[index]; > dict_ needs to return T's directly rather than T*'s. This is a performance I'll use memcpy to copy the contents of dictionary to the buffer passed. -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Pranay Singh has uploaded a new patch set (#2). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch uses existing dictionary_pool_ to track the memory used by std::vector in DictDecoder class, whereas an existing memory pool for DictEncoder class is used when allocating memory for std::vector in DictEncoder class. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/parquet-column-readers.cc M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 3 files changed, 42 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/2 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-5108: Followup change to do the cleanup.
Pranay Singh has uploaded a new patch set (#3). Change subject: IMPALA-5108: Followup change to do the cleanup. .. IMPALA-5108: Followup change to do the cleanup. This change involves giving names to constant values to improve readability of test program. Change-Id: I521e7e493e4ae26e1955861637cd31f5f678dd0b --- M be/src/service/session-expiry-test.cc 1 file changed, 10 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/7801/3 -- To view, visit http://gerrit.cloudera.org:8080/7801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I521e7e493e4ae26e1955861637cd31f5f678dd0b Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan Hecht
[Impala-ASF-CR] IMPALA-5108: Followup change to do the cleanup.
Pranay Singh has uploaded a new patch set (#3). Change subject: IMPALA-5108: Followup change to do the cleanup. .. IMPALA-5108: Followup change to do the cleanup. This change involves giving names to constant values to improve readability of test program. Change-Id: I521e7e493e4ae26e1955861637cd31f5f678dd0b --- M be/src/service/session-expiry-test.cc 1 file changed, 10 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/7801/3 -- To view, visit http://gerrit.cloudera.org:8080/7801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I521e7e493e4ae26e1955861637cd31f5f678dd0b Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan Hecht
[Impala-ASF-CR] IMPALA-5108: Followup change to do the cleanup.
Pranay Singh has restored this change. Change subject: IMPALA-5108: Followup change to do the cleanup. .. Restored -- To view, visit http://gerrit.cloudera.org:8080/7801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: restore Gerrit-Change-Id: I521e7e493e4ae26e1955861637cd31f5f678dd0b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan Hecht
[Impala-ASF-CR] IMPALA-5108: Followup change to do the cleanup.
Pranay Singh has uploaded a new patch set (#2). Change subject: IMPALA-5108: Followup change to do the cleanup. .. IMPALA-5108: Followup change to do the cleanup. This change involves giving names to constant values to improve readability of test program. Change-Id: I521e7e493e4ae26e1955861637cd31f5f678dd0b --- M be/src/service/session-expiry-test.cc 1 file changed, 10 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/7801/2 -- To view, visit http://gerrit.cloudera.org:8080/7801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I521e7e493e4ae26e1955861637cd31f5f678dd0b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan Hecht
[Impala-ASF-CR] IMPALA-5108: Followup change to do the cleanup.
Pranay Singh has abandoned this change. Change subject: IMPALA-5108: Followup change to do the cleanup. .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/7801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I521e7e493e4ae26e1955861637cd31f5f678dd0b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan Hecht
[Impala-ASF-CR] IMPALA-5108: Followup change to do the cleanup.
Pranay Singh has abandoned this change. Change subject: IMPALA-5108: Followup change to do the cleanup. .. Abandoned Need to use the old commit id -- To view, visit http://gerrit.cloudera.org:8080/7807 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I46b4de40794c2b29668f6a41a50a757e4950daf6 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh
[Impala-ASF-CR] IMPALA-5108: Followup change to do the cleanup.
Pranay Singh has uploaded a new change for review. http://gerrit.cloudera.org:8080/7807 Change subject: IMPALA-5108: Followup change to do the cleanup. .. IMPALA-5108: Followup change to do the cleanup. This change involves giving names to constant values to improve readability of test program. Change-Id: I46b4de40794c2b29668f6a41a50a757e4950daf6 --- M be/src/service/session-expiry-test.cc 1 file changed, 10 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/7807/1 -- To view, visit http://gerrit.cloudera.org:8080/7807 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I46b4de40794c2b29668f6a41a50a757e4950daf6 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh
[Impala-ASF-CR] IMPALA-5108: Followup change to do the cleanup.
Pranay Singh has uploaded a new change for review. http://gerrit.cloudera.org:8080/7801 Change subject: IMPALA-5108: Followup change to do the cleanup. .. IMPALA-5108: Followup change to do the cleanup. This change involves giving names to constant values to improve readability of test program. Change-Id: I521e7e493e4ae26e1955861637cd31f5f678dd0b --- M be/src/service/session-expiry-test.cc 1 file changed, 10 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/7801/1 -- To view, visit http://gerrit.cloudera.org:8080/7801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I521e7e493e4ae26e1955861637cd31f5f678dd0b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh
[Impala-ASF-CR] IMPALA-5108: idle session timeout kicks in later than expected
Pranay Singh has abandoned this change. Change subject: IMPALA-5108: idle_session_timeout kicks in later than expected .. Abandoned This change was merged so I need to abandon the change I'll address the review comments as a part of another change/commit# -- To view, visit http://gerrit.cloudera.org:8080/7729 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I827a5fa2246e7b05b718fb8b4563b8488c91a1d9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan HechtGerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5108: idle session timeout kicks in later than expected
Pranay Singh has abandoned this change. Change subject: IMPALA-5108: idle_session_timeout kicks in later than expected .. Abandoned The previous commit was merged so I have to abandon this change -- To view, visit http://gerrit.cloudera.org:8080/7791 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I72625dcc72764c1b89eb4ecb8725c51c9ca7835b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan HechtGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5108: idle session timeout kicks in later than expected
Pranay Singh has posted comments on this change. Change subject: IMPALA-5108: idle_session_timeout kicks in later than expected .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/7791/1//COMMIT_MSG Commit Message: PS1, Line 9: The issue was caused because sessions could get closed : some time after their timeout expires, because the session : expiration thread only woke up every session-timeout/2 : seconds. This fix changes the logic, now the session expiration : thread will wakeup every 1 sec to check whether sessions can : be expired. : : The test changes in session-expiry-test is based on introducing : variable names for the numeric constants and to reduce the time : of max_idle_timeout_ms > given that the original change was already merged (I wasn't aware of that s I had reduced MAX_IDLE_TIMEOUT_MS from 5 sec to 4 sec because the session timeout is 1 sec for all the five sessions, so the delay of 5 sec appeared to be a longer delay, waiting for sessions to expire. Since the expiration thread is checking for the sessions to be expired every sec. -- To view, visit http://gerrit.cloudera.org:8080/7791 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I72625dcc72764c1b89eb4ecb8725c51c9ca7835b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan HechtGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Pranay Singh Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5108: idle session timeout kicks in later than expected
Pranay Singh has uploaded a new change for review. http://gerrit.cloudera.org:8080/7791 Change subject: IMPALA-5108: idle_session_timeout kicks in later than expected .. IMPALA-5108: idle_session_timeout kicks in later than expected The issue was caused because sessions could get closed some time after their timeout expires, because the session expiration thread only woke up every session-timeout/2 seconds. This fix changes the logic, now the session expiration thread will wakeup every 1 sec to check whether sessions can be expired. The test changes in session-expiry-test is based on introducing variable names for the numeric constants and to reduce the time of max_idle_timeout_ms Change-Id: I72625dcc72764c1b89eb4ecb8725c51c9ca7835b --- M be/src/service/impala-server.cc M be/src/service/session-expiry-test.cc 2 files changed, 11 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/7791/1 -- To view, visit http://gerrit.cloudera.org:8080/7791 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I72625dcc72764c1b89eb4ecb8725c51c9ca7835b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh
[Impala-ASF-CR] IMPALA-5108: idle session timeout kicks in later than expected
Pranay Singh has posted comments on this change. Change subject: IMPALA-5108: idle_session_timeout kicks in later than expected .. Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/7729/1//COMMIT_MSG Commit Message: PS1, Line 17: reduce > what's the reason to reduce this time? Since the frequency of polling is session-timeout/2 where session-timeout is 1sec, 4 sec should be good enough a tight upper bound so I used 4 sec as idle session timeout Line 18: of max_idle_timeout_ms > let's generally keep the messages left justified (exception inside formatte OK http://gerrit.cloudera.org:8080/#/c/7729/1/be/src/service/session-expiry-test.cc File be/src/service/session-expiry-test.cc: PS1, Line 68: gets one session eac > since we're really trying to create multiple sessions (and just using clien Done PS1, Line 83: 10 > that looks like it should be expressed in terms of NUM_CLIENTS. Done PS1, Line 87: 5s > 5s should be updated if you're changing that value. Done PS1, Line 88: 5L > isn't that NUM_CLIENTS? Done -- To view, visit http://gerrit.cloudera.org:8080/7729 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I827a5fa2246e7b05b718fb8b4563b8488c91a1d9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan HechtGerrit-Reviewer: Pranay Singh Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5108: idle session timeout kicks in later than expected
Pranay Singh has uploaded a new patch set (#3). Change subject: IMPALA-5108: idle_session_timeout kicks in later than expected .. IMPALA-5108: idle_session_timeout kicks in later than expected Fix: The issue was caused because sessions could get closed some time after their timeout expires, because the session expiration thread only woke up every session-timeout/2 seconds. This fix changes the logic, now the session expiration thread will wakeup every 1 sec to check whether sessions can be expired. The test changes in session-expiry-test is based on introducing variable names for the numeric constants and to reduce the time of max_idle_timeout_ms. Change-Id: Ic62f1ece6c128a80676c6eb331ef190a038bee76 --- M be/src/service/impala-server.cc M be/src/service/session-expiry-test.cc 2 files changed, 9 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/7709/3 -- To view, visit http://gerrit.cloudera.org:8080/7709 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic62f1ece6c128a80676c6eb331ef190a038bee76 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bharath VissapragadaGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5108: idle session timeout kicks in later than expected
Pranay Singh has uploaded a new change for review. http://gerrit.cloudera.org:8080/7729 Change subject: IMPALA-5108: idle_session_timeout kicks in later than expected .. IMPALA-5108: idle_session_timeout kicks in later than expected Fix: The issue was caused because sessions could get closed some time after their timeout expires, because the session expiration thread only woke up every session-timeout/2 seconds. This fix changes the logic, now the session expiration thread will wakeup every 1 sec to check whether sessions can be expired. The test changes in session-expiry-test is based on introducing variable names for the numeric constants and to reduce the time of max_idle_timeout_ms Change-Id: I827a5fa2246e7b05b718fb8b4563b8488c91a1d9 --- M be/src/service/impala-server.cc M be/src/service/session-expiry-test.cc 2 files changed, 9 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/7729/1 -- To view, visit http://gerrit.cloudera.org:8080/7729 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I827a5fa2246e7b05b718fb8b4563b8488c91a1d9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh
[Impala-ASF-CR] IMPALA-5108: idle session timeout kicks in later than expected
Hello Henry Robinson, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7709 to look at the new patch set (#3). Change subject: IMPALA-5108: idle_session_timeout kicks in later than expected .. IMPALA-5108: idle_session_timeout kicks in later than expected Fix: The issue was caused because sessions could get closed some time after their timeout expires, because the session expiration thread only woke up every session-timeout/2 seconds. This fix changes the logic, now the session expiration thread will wakeup every 1 sec to check whether sessions can be expired. The test changes in session-expiry-test is based on introducing variable names for the numeric constants and to reduce the time of max_idle_timeout_ms. Change-Id: Ic62f1ece6c128a80676c6eb331ef190a038bee76 --- M be/src/service/impala-server.cc M be/src/service/session-expiry-test.cc 2 files changed, 9 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/7709/3 -- To view, visit http://gerrit.cloudera.org:8080/7709 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic62f1ece6c128a80676c6eb331ef190a038bee76 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bharath VissapragadaGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-1649 Pass precision to Print TCounterType::DOUBLE VALUE
Pranay Singh has uploaded a new change for review. http://gerrit.cloudera.org:8080/7725 Change subject: IMPALA-1649 Pass precision to Print TCounterType::DOUBLE_VALUE .. IMPALA-1649 Pass precision to Print TCounterType::DOUBLE_VALUE Fix: Current code uses reinterpret_cast for printing TCounterType::DOUBLE_VALUE, which is unsafe. It also uses a fixed precision, 2 for printing which may not be desirable. This change removes the usage of reinterpret_cast and provides an option to specify precision as an argument type to Print TCounterType::DOUBLE_VALUE. Change-Id: Icc63d8a59bdf175341097df143798fb1a957d93f --- M be/src/util/pretty-printer-test.cc M be/src/util/pretty-printer.h 2 files changed, 9 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/7725/1 -- To view, visit http://gerrit.cloudera.org:8080/7725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Icc63d8a59bdf175341097df143798fb1a957d93f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh
[Impala-ASF-CR] IMPALA-5108:idle session timeout kicks in later than expected
Pranay Singh has posted comments on this change. Change subject: IMPALA-5108:idle_session_timeout kicks in later than expected .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/7712/1//COMMIT_MSG Commit Message: PS1, Line 20: Change-Id: I81505654d796b7cba59206e40cbe54b6e946794f > please remove this line and resubmit - Gerrit uses this as its unique ID fo Done -- To view, visit http://gerrit.cloudera.org:8080/7712 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I81505654d796b7cba59206e40cbe54b6e946794f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Henry RobinsonGerrit-Reviewer: Pranay Singh Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5108:idle session timeout kicks in later than expected
Pranay Singh has uploaded a new patch set (#2). Change subject: IMPALA-5108:idle_session_timeout kicks in later than expected .. IMPALA-5108:idle_session_timeout kicks in later than expected Fix: The issue was caused because polling frequency to check for the session to be expired was half of session timeout, which is around ~1 min. The polling frequency has been increased, it's changed to be around 1 sec with this change. The test changes in session-expiry-test is based on introducing variable names for the numeric constants and to reduce the time of max_idle_timeout_ms. Change-Id: Ic62f1ece6c128a80676c6eb331ef190a038bee76 --- M be/src/service/impala-server.cc M be/src/service/session-expiry-test.cc 2 files changed, 10 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/7709/2 -- To view, visit http://gerrit.cloudera.org:8080/7709 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic62f1ece6c128a80676c6eb331ef190a038bee76 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bharath VissapragadaGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5108:idle session timeout kicks in later than expected
Pranay Singh has uploaded a new patch set (#2). Change subject: IMPALA-5108:idle_session_timeout kicks in later than expected .. IMPALA-5108:idle_session_timeout kicks in later than expected Fix: The issue was caused because polling frequency to check for the session to be expired was half of session timeout, which is around ~1 min. The polling frequency has been increased, it's changed to be around 1 sec with this change. The test changes in session-expiry-test is based on introducing variable names for the numeric constants and to reduce the time of max_idle_timeout_ms. Change-Id: Ic62f1ece6c128a80676c6eb331ef190a038bee76 --- M be/src/service/impala-server.cc M be/src/service/session-expiry-test.cc 2 files changed, 10 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/7709/2 -- To view, visit http://gerrit.cloudera.org:8080/7709 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic62f1ece6c128a80676c6eb331ef190a038bee76 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bharath VissapragadaGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-5108:idle session timeout kicks in later than expected
Pranay Singh has uploaded a new change for review. http://gerrit.cloudera.org:8080/7712 Change subject: IMPALA-5108:idle_session_timeout kicks in later than expected .. IMPALA-5108:idle_session_timeout kicks in later than expected Fix: The issue was caused because polling frequency to check for the session to be expired was half of session timeout, which is around ~1 min. The polling frequency has been increased, it's changed to be around 1 sec with this change. The test changes in session-expiry-test is based on introducing variable names for the numeric constants and to reduce the time of max_idle_timeout_ms. Change-Id: Ic62f1ece6c128a80676c6eb331ef190a038bee76 Change-Id: I81505654d796b7cba59206e40cbe54b6e946794f --- M be/src/service/impala-server.cc M be/src/service/session-expiry-test.cc 2 files changed, 10 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/7712/1 -- To view, visit http://gerrit.cloudera.org:8080/7712 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I81505654d796b7cba59206e40cbe54b6e946794f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh
[Impala-ASF-CR] IMPALA-5108:idle session timeout kicks in later than expected
Pranay Singh has posted comments on this change. Change subject: IMPALA-5108:idle_session_timeout kicks in later than expected .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/7709/1/be/src/service/impala-server.cc File be/src/service/impala-server.cc: PS1, Line 1817: that > whether that replaced by whether, done PS1, Line 1817: > double space Double space Removed PS1, Line 1819: int64_t session_timeout_min_ms = 1000; > Rename this to something like sleep_time_ms , and make it a constant - e.g. Made it a constant and changed it to uppercase PS1, Line 1844: FLAGS_use_local_tz_for_unix_timestamp_conversions = true; > This is very dangerous: what if some other thread reads FLAGS_use_local_tz_ Removed it http://gerrit.cloudera.org:8080/#/c/7709/1/be/src/service/session-expiry-test.cc File be/src/service/session-expiry-test.cc: PS1, Line 45: num_clients > constants should be ALL_CAPS Done -- To view, visit http://gerrit.cloudera.org:8080/7709 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic62f1ece6c128a80676c6eb331ef190a038bee76 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bharath VissapragadaGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5108:idle session timeout kicks in later than expected
Pranay Singh has uploaded a new change for review. http://gerrit.cloudera.org:8080/7709 Change subject: IMPALA-5108:idle_session_timeout kicks in later than expected .. IMPALA-5108:idle_session_timeout kicks in later than expected Fix: The issue was caused because polling frequency to check for the session to be expired was half of session timeout, which is around ~1 min. The polling frequency has been increased, it's changed to be around 1 sec with this change. In addition to the above change, the time that displays the Expiry time of session is displayed in local time zone, this is to better co-relate with the time detail in the logs. The test changes in session-expiry-test is based on introducing variable names for the numeric constants and to reduce the time of max_idle_timeout_ms. Change-Id: Ic62f1ece6c128a80676c6eb331ef190a038bee76 --- M be/src/service/impala-server.cc M be/src/service/session-expiry-test.cc 2 files changed, 14 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/7709/1 -- To view, visit http://gerrit.cloudera.org:8080/7709 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ic62f1ece6c128a80676c6eb331ef190a038bee76 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh
[Impala-ASF-CR] IMPALA-4861: READ WRITE warning on CREATE TABLE LIKE PARQUET.
Pranay Singh has uploaded a new change for review. http://gerrit.cloudera.org:8080/7671 Change subject: IMPALA-4861: READ_WRITE warning on CREATE TABLE LIKE PARQUET. .. IMPALA-4861: READ_WRITE warning on CREATE TABLE LIKE PARQUET. Fix: If the source URI does not have write permissions we get a "READ WRITE warning" on CREATE TABLE LIKE PARQUET. During analyze() in the class CreateTableLikeFileStmt we should be checking only for "READ" permission for the source URI not for READ and WRITE as the source URI should be readable. Change-Id: I56799c4da482fb634ce440f8764dd44234dc22ab --- M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/7671/1 -- To view, visit http://gerrit.cloudera.org:8080/7671 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I56799c4da482fb634ce440f8764dd44234dc22ab Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh
[Impala-ASF-CR] IMPALA-1478: Improve error message when subquery is used in the ON clause
Pranay Singh has uploaded a new patch set (#4). Change subject: IMPALA-1478: Improve error message when subquery is used in the ON clause .. IMPALA-1478: Improve error message when subquery is used in the ON clause Fix: Print the error stating that "Suquery not allowed in the ON clause" Add test case for testing the failure when a subquery is used in the ON clause. Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2 --- M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java 2 files changed, 9 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/7588/4 -- To view, visit http://gerrit.cloudera.org:8080/7588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Alex BehmGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-1478: Improve error message when subquery is used in the ON clause
Hello Bharath Vissapragada, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7588 to look at the new patch set (#4). Change subject: IMPALA-1478: Improve error message when subquery is used in the ON clause .. IMPALA-1478: Improve error message when subquery is used in the ON clause Fix: Print the error stating that "Suquery not allowed in the ON clause" Add test case for testing the failure when a subquery is used in the ON clause. Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2 --- M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java 2 files changed, 9 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/7588/4 -- To view, visit http://gerrit.cloudera.org:8080/7588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Alex BehmGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Pranay Singh
[Impala-ASF-CR] IMPALA-1478: Improve error message when subquery is used in the ON clause
Pranay Singh has posted comments on this change. Change subject: IMPALA-1478: Improve error message when subquery is used in the ON clause .. Patch Set 3: (4 comments) Addressed the comments http://gerrit.cloudera.org:8080/#/c/7588/1//COMMIT_MSG Commit Message: PS1, Line 7: IMPALA- > Could you replace it with the jira ID? Typically we include jira ID in the Done PS1, Line 10: Add test case for testing the failure when a subquery is used in th > Move above the change-Id. Done http://gerrit.cloudera.org:8080/#/c/7588/3//COMMIT_MSG Commit Message: Line 10: Add test case for testing the failure when a subquery is used in the ON clause. > nit: limit to 70chars per line in commit messages. Done PS3, Line 12: Change-Id: I05f4223fe6862fdb22fb2343ccb1f9070e32c5ff > Remove. (Gerrit's commit hook uses the change-Id to map the commit to the c Done -- To view, visit http://gerrit.cloudera.org:8080/7588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Alex BehmGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Pranay Singh Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1478: Improve error message when subquery is used in the ON clause
Pranay Singh has uploaded a new patch set (#3). Change subject: IMPALA-1478: Improve error message when subquery is used in the ON clause .. IMPALA-1478: Improve error message when subquery is used in the ON clause Fix: Print the error stating that "Suquery not allowed in the ON clause" Add test case for testing the failure when a subquery is used in the ON clause. Change-Id: I05f4223fe6862fdb22fb2343ccb1f9070e32c5ff Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2 --- M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java 2 files changed, 9 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/7588/3 -- To view, visit http://gerrit.cloudera.org:8080/7588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bharath Vissapragada
[Impala-ASF-CR] IMPALA-1478: Improve error message when subquery is used in the ON clause
Pranay Singh has uploaded a new patch set (#2). Change subject: IMPALA-1478: Improve error message when subquery is used in the ON clause .. IMPALA-1478: Improve error message when subquery is used in the ON clause Fix: Print the error stating that "Suquery not allowed in the ON clause" Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2 --- M fe/src/main/java/org/apache/impala/analysis/TableRef.java 1 file changed, 4 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/7588/2 -- To view, visit http://gerrit.cloudera.org:8080/7588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bharath Vissapragada
[Impala-ASF-CR] IMPALA-1478: Improve error message when subquery is used in the ON clause
Pranay Singh has uploaded a new change for review. http://gerrit.cloudera.org:8080/7603 Change subject: IMPALA-1478: Improve error message when subquery is used in the ON clause .. IMPALA-1478: Improve error message when subquery is used in the ON clause Fix: Add test case for testing the failure when a subquery is used in the ON clause. Change-Id: I05f4223fe6862fdb22fb2343ccb1f9070e32c5ff --- M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java 1 file changed, 5 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/7603/1 -- To view, visit http://gerrit.cloudera.org:8080/7603 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I05f4223fe6862fdb22fb2343ccb1f9070e32c5ff Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pranay Singh