[
https://issues.apache.org/jira/browse/IMPALA-8063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16739518#comment-16739518
]
Joe McDonnell commented on IMPALA-8063:
---------------------------------------
Ran end-to-end tests in exhaustive mode with a 0.1 second sleep in the
wait_for_state loop, and that takes the logging back close to normal:
{noformat}
$ ls -l TEST-impala-parallel.xml
-rw-rw-r-- 1 joe joe 34454144 Jan 10 00:47 TEST-impala-parallel.xml
$ grep "getting state for operation" TEST-impala-parallel.xml | wc -l
3640
{noformat}
I will go ahead and disable running the tests as well (i.e. xfail with
run=False) while IMPALA-8059 is in progress.
[~poojanilangekar] For this particular loop, the loop conditions are what we
care about, so it tolerates an inexact sleep. If the sleep runs long, the test
is just slower. If it is cut short, the loop simply goes around again.
Interrupts are usually rare enough that the loop won't spin excessively.
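To illustrate the point about the loop conditions, here is a minimal sketch of a throttled polling loop. It is not the actual patch: the standalone function signature, the get_state callable, and the poll_interval, clock, and sleep parameters are assumptions made so the example runs on its own.

```python
import time


class Timeout(Exception):
    """Raised when the expected state is not reached in time."""


def wait_for_state(get_state, expected_state, timeout, poll_interval=0.1,
                   clock=time.time, sleep=time.sleep):
    """Poll get_state() until it returns expected_state or 'timeout' seconds pass.

    Hypothetical standalone version: 'get_state' is any zero-argument callable,
    and 'clock'/'sleep' are injectable so the loop can be exercised without
    real waiting.
    """
    start_time = clock()
    actual_state = get_state()
    while actual_state != expected_state and clock() - start_time < timeout:
        # The sleep only throttles polling (and therefore logging). The loop
        # conditions decide when to stop, so a short or long sleep is harmless.
        sleep(poll_interval)
        actual_state = get_state()
    if actual_state != expected_state:
        raise Timeout("did not reach expected state %r, last known state %r"
                      % (expected_state, actual_state))
    return actual_state
```

With a 0.1 second interval, a query that takes 10 seconds to reach its state produces on the order of 100 get_state calls instead of one per loop iteration as fast as the client can issue them.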
> Excessive logging from BeeswaxConnection::get_state() bloats JUnitXML output
> ----------------------------------------------------------------------------
>
> Key: IMPALA-8063
> URL: https://issues.apache.org/jira/browse/IMPALA-8063
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 3.2.0
> Reporter: Joe McDonnell
> Assignee: Joe McDonnell
> Priority: Blocker
> Labels: broken-build
>
> BeeswaxConnection has logging for each call of get_state:
>
> {code:java}
> def get_state(self, operation_handle):
>   LOG.info("-- getting state for operation: %s" % operation_handle)
>   return self.__beeswax_client.get_state(operation_handle.get_handle())
> {code}
> With IMPALA-7625, ImpalaTestSuite::wait_for_state() calls this more
> frequently:
>
>
> {code:java}
> def wait_for_state(self, handle, expected_state, timeout):
>   """Waits for the given 'query_handle' to reach the 'expected_state'. If it does not
>   reach the given state within 'timeout' seconds, the method throws an AssertionError.
>   """
>   start_time = time.time()
>   actual_state = self.client.get_state(handle)
>   while actual_state != expected_state and time.time() - start_time < timeout:
>     actual_state = self.client.get_state(handle)
>   if actual_state != expected_state:
>     raise Timeout("query '%s' did not reach expected state '%s', last known state '%s'"
>                   % (handle.get_handle().id, expected_state, actual_state))
> {code}
> When running our tests in exhaustive mode, that increases the size of the
> logging significantly. For example:
>
> {noformat}
> Before this change:
> $ ls -l TEST-impala-parallel.xml
> -rw-rw-r-- 1 joe joe 34254745 Jan 6 23:23 TEST-impala-parallel.xml
> $ grep "getting state for operation" TEST-impala-parallel.xml | wc -l
> 1044
> After this change:
> $ ls -l TEST-impala-parallel.xml
> -rw-rw-r-- 1 joe joe 159187682 Jan 9 14:51 TEST-impala-parallel.xml
> $ grep "getting state for operation" TEST-impala-parallel.xml | wc -l
> 1167084
> {noformat}
>
>
> We should reduce this logging. Bloated JUnitXML files add burden to developer
> workstations and any Jenkins infrastructure parsing them.