osmo-pcu[master]: fix PACCH paging: don't return early in case of NULL TBF
Patch Set 3: Code-Review+2 No comments happening, I think we need to resolve it and fix the PCU. I was going to revert, but lynxis' "looks good to me" encourages me to merge this. -- To view, visit https://gerrit.osmocom.org/2420 To unsubscribe, visit https://gerrit.osmocom.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 Gerrit-PatchSet: 3 Gerrit-Project: osmo-pcu Gerrit-Branch: master Gerrit-Owner: Neels Hofmeyr Gerrit-Reviewer: Harald Welte Gerrit-Reviewer: Holger Freyther Gerrit-Reviewer: Jenkins Builder Gerrit-Reviewer: Max Gerrit-Reviewer: Neels Hofmeyr Gerrit-Reviewer: lynxis lazus Gerrit-Reviewer: sivasankari Gerrit-HasComments: No
[MERGED] osmo-pcu[master]: fix PACCH paging: don't return early in case of NULL TBF
Neels Hofmeyr has submitted this change and it was merged. Change subject: fix PACCH paging: don't return early in case of NULL TBF .. fix PACCH paging: don't return early in case of NULL TBF Commit b78a4a6dfef217c538d45949a6ae725e22a36b05 tried to fix a NULL dereference error, but apparently was overly eager to return, because it looked like all code paths would dereference the tbf. In fact the code path further above, for msg != NULL, has "always" dereferenced the tbf, but the lower code path, the one effecting the paging, has only started to dereference tbf since shortly before the overly eager fix: in da7250ad2c1cd5ddc7d3c6e10435a00b357ef8f7, to "update the dl ctrl msg counter for ms". It seems that this tbf dereference in the paging path is bogus and the cause for the segfault that made me write the early exit fix. Fix that fix: Do not exit early if tbf == NULL, stay in there to be able to reach the paging path below. In case of a message to be sent, assume that tbf is present, and verify: print an error message and abort if there is a msg but no tbf, so that we will see the error if I'm wrong there. If a tbf is missing, free the msg. In case of no message, go on to send pending pagings, but do not attempt to count ctrl messages for a tbf -- IIUC there will never be a tbf if we're paging. This should avoid segfaults while keeping PACCH paging intact. Tweak a comment for and add a blank line above the paging section. Related: OS#2176 CID#158969 Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 --- M src/gprs_rlcmac_sched.cpp 1 file changed, 8 insertions(+), 7 deletions(-) Approvals: lynxis lazus: Looks good to me, but someone else must approve Neels Hofmeyr: Looks good to me, approved Jenkins Builder: Verified diff --git a/src/gprs_rlcmac_sched.cpp b/src/gprs_rlcmac_sched.cpp index 3b940f4..97ee53e 100644 --- a/src/gprs_rlcmac_sched.cpp +++ b/src/gprs_rlcmac_sched.cpp @@ -178,11 +178,14 @@ } } - if (!tbf) - return NULL; - /* any message */ if (msg) { + if (!tbf) { + LOGP(DRLCMACSCHED, LOGL_ERROR, +"Control message to be scheduled, but no TBF (TRX=%d, TS=%d)\n", trx, ts); + msgb_free(msg); + return NULL; + } tbf->rotate_in_list(); LOGP(DRLCMACSCHED, LOGL_DEBUG, "Scheduling control " "message at RTS for %s (TRX=%d, TS=%d)\n", @@ -191,14 +194,12 @@ tbf->ms()->update_dl_ctrl_msg(); return msg; } - /* schedule PACKET PAGING REQUEST */ + + /* schedule PACKET PAGING REQUEST, if any are pending */ msg = pdch->packet_paging_request(); if (msg) { LOGP(DRLCMACSCHED, LOGL_DEBUG, "Scheduling paging request " "message at RTS for (TRX=%d, TS=%d)\n", trx, ts); - - /* Updates the dl ctrl msg counter for ms */ - tbf->ms()->update_dl_ctrl_msg(); return msg; } -- To view, visit https://gerrit.osmocom.org/2420 To unsubscribe, visit https://gerrit.osmocom.org/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 Gerrit-PatchSet: 4 Gerrit-Project: osmo-pcu Gerrit-Branch: master Gerrit-Owner: Neels Hofmeyr Gerrit-Reviewer: Harald Welte Gerrit-Reviewer: Holger Freyther Gerrit-Reviewer: Jenkins Builder Gerrit-Reviewer: Max Gerrit-Reviewer: Neels Hofmeyr Gerrit-Reviewer: lynxis lazus Gerrit-Reviewer: sivasankari
osmo-pcu[master]: fix PACCH paging: don't return early in case of NULL TBF
Patch Set 3: Code-Review+1 LGTM -- To view, visit https://gerrit.osmocom.org/2420 To unsubscribe, visit https://gerrit.osmocom.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 Gerrit-PatchSet: 3 Gerrit-Project: osmo-pcu Gerrit-Branch: master Gerrit-Owner: Neels Hofmeyr Gerrit-Reviewer: Harald Welte Gerrit-Reviewer: Holger Freyther Gerrit-Reviewer: Jenkins Builder Gerrit-Reviewer: Max Gerrit-Reviewer: Neels Hofmeyr Gerrit-Reviewer: lynxis lazus Gerrit-Reviewer: sivasankari Gerrit-HasComments: No
osmo-pcu[master]: fix PACCH paging: don't return early in case of NULL TBF
Patch Set 3: https://osmocom.org/issues/2241 -- To view, visit https://gerrit.osmocom.org/2420 To unsubscribe, visit https://gerrit.osmocom.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 Gerrit-PatchSet: 3 Gerrit-Project: osmo-pcu Gerrit-Branch: master Gerrit-Owner: Neels Hofmeyr Gerrit-Reviewer: Harald Welte Gerrit-Reviewer: Holger Freyther Gerrit-Reviewer: Jenkins Builder Gerrit-Reviewer: Max Gerrit-Reviewer: Neels Hofmeyr Gerrit-Reviewer: sivasankari Gerrit-HasComments: No
[PATCH] osmo-pcu[master]: fix PACCH paging: don't return early in case of NULL TBF
Hello Harald Welte, Jenkins Builder, Holger Freyther, I'd like you to reexamine a change. Please visit https://gerrit.osmocom.org/2420 to look at the new patch set (#3). fix PACCH paging: don't return early in case of NULL TBF Commit b78a4a6dfef217c538d45949a6ae725e22a36b05 tried to fix a NULL dereference error, but apparently was overly eager to return, because it looked like all code paths would dereference the tbf. In fact the code path further above, for msg != NULL, has "always" dereferenced the tbf, but the lower code path, the one effecting the paging, has only started to dereference tbf since shortly before the overly eager fix: in da7250ad2c1cd5ddc7d3c6e10435a00b357ef8f7, to "update the dl ctrl msg counter for ms". It seems that this tbf dereference in the paging path is bogus and the cause for the segfault that made me write the early exit fix. Fix that fix: Do not exit early if tbf == NULL, stay in there to be able to reach the paging path below. In case of a message to be sent, assume that tbf is present, and verify: print an error message and abort if there is a msg but no tbf, so that we will see the error if I'm wrong there. If a tbf is missing, free the msg. In case of no message, go on to send pending pagings, but do not attempt to count ctrl messages for a tbf -- IIUC there will never be a tbf if we're paging. This should avoid segfaults while keeping PACCH paging intact. Tweak a comment for and add a blank line above the paging section. Related: OS#2176 CID#158969 Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 --- M src/gprs_rlcmac_sched.cpp 1 file changed, 8 insertions(+), 7 deletions(-) git pull ssh://gerrit.osmocom.org:29418/osmo-pcu refs/changes/20/2420/3 diff --git a/src/gprs_rlcmac_sched.cpp b/src/gprs_rlcmac_sched.cpp index 3b940f4..97ee53e 100644 --- a/src/gprs_rlcmac_sched.cpp +++ b/src/gprs_rlcmac_sched.cpp @@ -178,11 +178,14 @@ } } - if (!tbf) - return NULL; - /* any message */ if (msg) { + if (!tbf) { + LOGP(DRLCMACSCHED, LOGL_ERROR, +"Control message to be scheduled, but no TBF (TRX=%d, TS=%d)\n", trx, ts); + msgb_free(msg); + return NULL; + } tbf->rotate_in_list(); LOGP(DRLCMACSCHED, LOGL_DEBUG, "Scheduling control " "message at RTS for %s (TRX=%d, TS=%d)\n", @@ -191,14 +194,12 @@ tbf->ms()->update_dl_ctrl_msg(); return msg; } - /* schedule PACKET PAGING REQUEST */ + + /* schedule PACKET PAGING REQUEST, if any are pending */ msg = pdch->packet_paging_request(); if (msg) { LOGP(DRLCMACSCHED, LOGL_DEBUG, "Scheduling paging request " "message at RTS for (TRX=%d, TS=%d)\n", trx, ts); - - /* Updates the dl ctrl msg counter for ms */ - tbf->ms()->update_dl_ctrl_msg(); return msg; } -- To view, visit https://gerrit.osmocom.org/2420 To unsubscribe, visit https://gerrit.osmocom.org/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 Gerrit-PatchSet: 3 Gerrit-Project: osmo-pcu Gerrit-Branch: master Gerrit-Owner: Neels Hofmeyr Gerrit-Reviewer: Harald Welte Gerrit-Reviewer: Holger Freyther Gerrit-Reviewer: Jenkins Builder Gerrit-Reviewer: Max Gerrit-Reviewer: Neels Hofmeyr Gerrit-Reviewer: sivasankari
osmo-pcu[master]: fix PACCH paging: don't return early in case of NULL TBF
Patch Set 2: So, da7250ad2c1cd5ddc7d3c6e10435a00b357ef8f7 introduced a segfault, and b78a4a6dfef217c538d45949a6ae725e22a36b05 as well as this patch fix the segfault, while still having problems. To cut this short, we could also revert all of the patches. Opinions/better patches welcome. -- To view, visit https://gerrit.osmocom.org/2420 To unsubscribe, visit https://gerrit.osmocom.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 Gerrit-PatchSet: 2 Gerrit-Project: osmo-pcu Gerrit-Branch: master Gerrit-Owner: Neels Hofmeyr Gerrit-Reviewer: Harald Welte Gerrit-Reviewer: Holger Freyther Gerrit-Reviewer: Jenkins Builder Gerrit-Reviewer: Max Gerrit-Reviewer: Neels Hofmeyr Gerrit-HasComments: No
osmo-pcu[master]: fix PACCH paging: don't return early in case of NULL TBF
Patch Set 2: (2 comments) Thanks for the review; the thing is, I don't really know what this code is doing and why, this issue jumped at my knees and now I'm trying to throw stones at it to not be broken. I would very much welcome a more qualified person to take this patch over from me; might be easier than telling me what exactly should happen here, probably in numerous iterations of patch review too. Re how was this found: a live installation of a GSM network saw 80% drop in CS service reliability, hunted down to missing pagings for phones that were using GPRS. In a pcap, before the mentioned patch you see a lot of PACCH pagings, with it you see none. The point being that CS hands down pagings to the PCU in case of GPRS being active, and the regression is that we skip all of those pagings. https://gerrit.osmocom.org/#/c/2420/2/src/gprs_rlcmac_sched.cpp File src/gprs_rlcmac_sched.cpp: Line 186: return NULL; > msg is leaked now? right... Line 202: return msg; > okay but we might still have a tbf here? I kind of assumed that we'd either have a TBF and a message or neither... right, the code is not checking that... -- To view, visit https://gerrit.osmocom.org/2420 To unsubscribe, visit https://gerrit.osmocom.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 Gerrit-PatchSet: 2 Gerrit-Project: osmo-pcu Gerrit-Branch: master Gerrit-Owner: Neels Hofmeyr Gerrit-Reviewer: Harald Welte Gerrit-Reviewer: Holger Freyther Gerrit-Reviewer: Jenkins Builder Gerrit-Reviewer: Max Gerrit-Reviewer: Neels Hofmeyr Gerrit-HasComments: Yes
osmo-pcu[master]: fix PACCH paging: don't return early in case of NULL TBF
Patch Set 2: Code-Review+1 (2 comments) How long did the fix take? https://gerrit.osmocom.org/#/c/2420/2/src/gprs_rlcmac_sched.cpp File src/gprs_rlcmac_sched.cpp: Line 186: return NULL; msg is leaked now? Line 202: return msg; okay but we might still have a tbf here? -- To view, visit https://gerrit.osmocom.org/2420 To unsubscribe, visit https://gerrit.osmocom.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 Gerrit-PatchSet: 2 Gerrit-Project: osmo-pcu Gerrit-Branch: master Gerrit-Owner: Neels Hofmeyr Gerrit-Reviewer: Harald Welte Gerrit-Reviewer: Holger Freyther Gerrit-Reviewer: Jenkins Builder Gerrit-Reviewer: Max Gerrit-HasComments: Yes
osmo-pcu[master]: fix PACCH paging: don't return early in case of NULL TBF
Patch Set 2: Out of curiosity - how this was found? I mean in what way PACCH paging was broken as a result of incorrect fix? Is there a way this can be formulated as a unit test? -- To view, visit https://gerrit.osmocom.org/2420 To unsubscribe, visit https://gerrit.osmocom.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 Gerrit-PatchSet: 2 Gerrit-Project: osmo-pcu Gerrit-Branch: master Gerrit-Owner: Neels Hofmeyr Gerrit-Reviewer: Harald Welte Gerrit-Reviewer: Jenkins Builder Gerrit-Reviewer: Max Gerrit-HasComments: No
osmo-pcu[master]: fix PACCH paging: don't return early in case of NULL TBF
Patch Set 2: Code-Review+1 -- To view, visit https://gerrit.osmocom.org/2420 To unsubscribe, visit https://gerrit.osmocom.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 Gerrit-PatchSet: 2 Gerrit-Project: osmo-pcu Gerrit-Branch: master Gerrit-Owner: Neels Hofmeyr Gerrit-Reviewer: Harald Welte Gerrit-Reviewer: Jenkins Builder Gerrit-HasComments: No
osmo-pcu[master]: fix PACCH paging: don't return early in case of NULL TBF
Patch Set 2: Code-Review+2 -- To view, visit https://gerrit.osmocom.org/2420 To unsubscribe, visit https://gerrit.osmocom.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 Gerrit-PatchSet: 2 Gerrit-Project: osmo-pcu Gerrit-Branch: master Gerrit-Owner: Neels Hofmeyr Gerrit-Reviewer: Harald Welte Gerrit-Reviewer: Jenkins Builder Gerrit-HasComments: No
[PATCH] osmo-pcu[master]: fix PACCH paging: don't return early in case of NULL TBF
Hello Jenkins Builder, I'd like you to reexamine a change. Please visit https://gerrit.osmocom.org/2420 to look at the new patch set (#2). fix PACCH paging: don't return early in case of NULL TBF Commit b78a4a6dfef217c538d45949a6ae725e22a36b05 tried to fix a NULL dereference error, but apparently was overly eager to return, because it looked like all code paths would dereference the tbf. In fact the code path further above, for msg != NULL, has "always" dereferenced the tbf, but the lower code path, the one effecting the paging, has only started to dereference tbf since shortly before the overly eager fix: in da7250ad2c1cd5ddc7d3c6e10435a00b357ef8f7, to "update the dl ctrl msg counter for ms". It seems that this tbf dereference in the paging path is bogus and the cause for the segfault that made me write the early exit fix. Fix that fix: Do not exit early if tbf == NULL, stay in there to be able to reach the paging path below. In case of a message to be sent, assume that tbf is present, and verify: print an error message and abort if there is a msg but no tbf, so that we will see the error if I'm wrong there. In case of no message, go on to send pending pagings, but do not attempt to count ctrl messages for a tbf -- IIUC there will never be a tbf if we're paging. This should avoid segfaults while keeping PACCH paging intact. Tweak a comment for and add a blank line above the paging section. Related: OS#2176 CID#158969 Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 --- M src/gprs_rlcmac_sched.cpp 1 file changed, 7 insertions(+), 7 deletions(-) git pull ssh://gerrit.osmocom.org:29418/osmo-pcu refs/changes/20/2420/2 diff --git a/src/gprs_rlcmac_sched.cpp b/src/gprs_rlcmac_sched.cpp index 3b940f4..b40ff9a 100644 --- a/src/gprs_rlcmac_sched.cpp +++ b/src/gprs_rlcmac_sched.cpp @@ -178,11 +178,13 @@ } } - if (!tbf) - return NULL; - /* any message */ if (msg) { + if (!tbf) { + LOGP(DRLCMACSCHED, LOGL_ERROR, +"Control message to be scheduled, but no TBF (TRX=%d, TS=%d)\n", trx, ts); + return NULL; + } tbf->rotate_in_list(); LOGP(DRLCMACSCHED, LOGL_DEBUG, "Scheduling control " "message at RTS for %s (TRX=%d, TS=%d)\n", @@ -191,14 +193,12 @@ tbf->ms()->update_dl_ctrl_msg(); return msg; } - /* schedule PACKET PAGING REQUEST */ + + /* schedule PACKET PAGING REQUEST, if any are pending */ msg = pdch->packet_paging_request(); if (msg) { LOGP(DRLCMACSCHED, LOGL_DEBUG, "Scheduling paging request " "message at RTS for (TRX=%d, TS=%d)\n", trx, ts); - - /* Updates the dl ctrl msg counter for ms */ - tbf->ms()->update_dl_ctrl_msg(); return msg; } -- To view, visit https://gerrit.osmocom.org/2420 To unsubscribe, visit https://gerrit.osmocom.org/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 Gerrit-PatchSet: 2 Gerrit-Project: osmo-pcu Gerrit-Branch: master Gerrit-Owner: Neels Hofmeyr Gerrit-Reviewer: Jenkins Builder
[PATCH] osmo-pcu[master]: fix PACCH paging: don't return early in case of NULL TBF
Review at https://gerrit.osmocom.org/2420 fix PACCH paging: don't return early in case of NULL TBF Commit b78a4a6dfef217c538d45949a6ae725e22a36b05 tried to fix a NULL dereference error, but apparently was overly eager to return, because it looked like all code paths would dereference the tbf. In fact the code path further above, for msg != NULL, has "always" dereferenced the tbf, but the lower code path, the one effecting the paging, has only started to dereference tbf since shortly before the overly eager fix: in da7250ad2c1cd5ddc7d3c6e10435a00b357ef8f7, to "update the dl ctrl msg counter for ms". It seems that this tbf dereference in the paging path is bogus and the cause for the segfault that made me write the early exit fix. Fix that fix: Do not exit early if tbf == NULL, stay in there to be able to reach the paging path below. In case of a message to be sent, assume that tbf is present, and verify: print an error message and abort if there is a msg but no tbf, so that we will see the error if I'm wrong there. In case of no message, go on to send pending pagings, but do not attempt to count ctrl messages for a tbf -- IIUC there will never be a tbf if we're paging. This should avoid segfaults while keeping PACCH paging intact. Tweak a log message for and add a blank line above paging section. Related: OS#2176 CID#158969 Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 --- M src/gprs_rlcmac_sched.cpp 1 file changed, 7 insertions(+), 7 deletions(-) git pull ssh://gerrit.osmocom.org:29418/osmo-pcu refs/changes/20/2420/1 diff --git a/src/gprs_rlcmac_sched.cpp b/src/gprs_rlcmac_sched.cpp index 3b940f4..b40ff9a 100644 --- a/src/gprs_rlcmac_sched.cpp +++ b/src/gprs_rlcmac_sched.cpp @@ -178,11 +178,13 @@ } } - if (!tbf) - return NULL; - /* any message */ if (msg) { + if (!tbf) { + LOGP(DRLCMACSCHED, LOGL_ERROR, +"Control message to be scheduled, but no TBF (TRX=%d, TS=%d)\n", trx, ts); + return NULL; + } tbf->rotate_in_list(); LOGP(DRLCMACSCHED, LOGL_DEBUG, "Scheduling control " "message at RTS for %s (TRX=%d, TS=%d)\n", @@ -191,14 +193,12 @@ tbf->ms()->update_dl_ctrl_msg(); return msg; } - /* schedule PACKET PAGING REQUEST */ + + /* schedule PACKET PAGING REQUEST, if any are pending */ msg = pdch->packet_paging_request(); if (msg) { LOGP(DRLCMACSCHED, LOGL_DEBUG, "Scheduling paging request " "message at RTS for (TRX=%d, TS=%d)\n", trx, ts); - - /* Updates the dl ctrl msg counter for ms */ - tbf->ms()->update_dl_ctrl_msg(); return msg; } -- To view, visit https://gerrit.osmocom.org/2420 To unsubscribe, visit https://gerrit.osmocom.org/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib79f4a945e211a13ac7d1e511cc37b0940ac6202 Gerrit-PatchSet: 1 Gerrit-Project: osmo-pcu Gerrit-Branch: master Gerrit-Owner: Neels Hofmeyr