[
https://issues.apache.org/jira/browse/HBASE-23779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033372#comment-17033372
]
Bharath Vissapragada commented on HBASE-23779:
----------------------------------------------
{quote}The average file count is reported by yetus, and it's usually around
5k. Perhaps the -T has us aggregate file counts?
{quote}
I don't think it is always around 5k. In fact, I suspected ulimits because the
failed jobs I looked at were running dangerously close to the proc limit of
10k enforced by yetus (example: 9901 vs. a ulimit of 10000). But I do agree
that it could be a memory issue too, as Mark mentioned.
It looks like yetus gets this data by polling the OS in a loop [1], so I'd
assume it is accurate. For some reason this report shows up only in the
precommits and not in the nightly builds (am I wrong?).
[1]
[https://github.com/apache/yetus/blob/b3a402b012773c94e2ade0797e893d9a14e9f0ed/precommit/src/main/shell/coprocs.d/process_counter.sh#L34]
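To illustrate the idea of polling the OS in a loop and keeping the peak, here is
a minimal sketch. It is not the yetus script linked above; the names
count_processes, peak_count and headroom are hypothetical, and counting numeric
entries in /proc is an assumption that only holds on Linux and only counts
processes, not threads.

```python
import os
import time
from typing import Callable


def count_processes() -> int:
    """Count live processes by listing numeric entries in /proc (Linux only)."""
    return sum(1 for name in os.listdir("/proc") if name.isdigit())


def peak_count(sample: Callable[[], int], samples: int,
               interval_s: float = 0.0) -> int:
    """Call sample() repeatedly and return the largest value seen."""
    peak = 0
    for _ in range(samples):
        peak = max(peak, sample())
        time.sleep(interval_s)
    return peak


def headroom(peak: int, limit: int) -> int:
    """How far the observed peak is below the configured limit."""
    return limit - peak
```

On a Linux box, peak_count(count_processes, 60, 1.0) would sample once a second
for a minute. With the numbers from the failed job, headroom(9901, 10000) leaves
only 99 before the limit is hit.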
> Up the default fork count to make builds complete faster; make count relative
> to CPU count
> ------------------------------------------------------------------------------------------
>
> Key: HBASE-23779
> URL: https://issues.apache.org/jira/browse/HBASE-23779
> Project: HBase
> Issue Type: Bug
> Components: test
> Reporter: Michael Stack
> Assignee: Michael Stack
> Priority: Major
> Fix For: 3.0.0, 2.3.0
>
> Attachments: addendum2.patch, test_yetus_934.0.patch
>
>
> Tests take a long time. Our fork counts when running all tests are
> conservative -- 1 (small) for the first part and 5 for the second part (medium
> and large). Rather than hardcoding, we should set the fork count relative to
> machine size. The suggestion here is 0.75C, where C is the CPU count. This ups
> the CPU use on my box.
> Looking at jenkins, it seems like the boxes have 24 cores... at least going
> by my random survey. The load reported on a few seems low, though this is not
> representative (looking at machine/uptime).
> More parallelism will probably mean more test failures. Let me take a look and
> see.
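As a rough illustration of the 0.75C suggestion in the description above, the
sketch below scales a multiplier by the core count. The function name
fork_count is hypothetical, and the rounding (truncate, but never below one
fork for a positive multiplier) is an assumption about how such a value would
be resolved, not a statement of what the build actually does.

```python
import os
from typing import Optional


def fork_count(multiplier: float, cores: Optional[int] = None) -> int:
    """Scale a 'C'-style multiplier (e.g. 0.75 for 0.75C) by the CPU count."""
    if cores is None:
        # os.cpu_count() may return None when the count cannot be determined.
        cores = os.cpu_count() or 1
    scaled = multiplier * cores
    if scaled <= 0:
        return 0
    # Assumed rounding: truncate, but keep at least one fork.
    return max(int(scaled), 1)


print(fork_count(0.75, 24))  # 18 forks on a 24-core box
```

Under these assumptions a 24-core jenkins box would run 18 forks, compared with
the current hardcoded 1 and 5.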
--
This message was sent by Atlassian Jira
(v8.3.4#803005)