Patch Set 2:

(4 comments)

I've placed some comments, but.

My reflection on the conn FSM as I wrote it makes me come to the conclusion 
that the root cause for the need for this patch is that I lack a state.

Currently it's like

1. INIT (initial state on allocation)
2. NEW (we're busy with auth and ciph? are we?)
3. ACCEPTED (allowed to do stuff)
4. COMMUNICATING (a trans is ongoing)
5. RELEASED (we noticed that all transactions are done.

Currently, when we don't go into auth and ciphering code paths, the FSM remains 
in NEW and eventually times out, only then is the conn closed. We should cut 
that short at the end of msc_compl_l3().

So I think we also need a dedicated AUTH_CIPH state:

1. INIT
2. NEW (we're still evaluating the compl_l3)
3. AUTH_CIPH (*actually* busy with auth and ciph)
4. ACCEPTED (allowed to do stuff)
...

Because then, subscr_conn_release_when_unused() can distinguish between "what, 
we're still NEW after evaluating compl_l3 so something went wrong" and "ah yes, 
busy with auth and ciph, keep it open".

As a result, at the end of msc_compl_l3(), when we call 
subscr_conn_release_when_unused(), we notice that we're neither in AUTH_CIPH 
nor ACCEPTED nor COMMUNICATING, and hence the conn must be closed. Currently 
subscr_conn_release_when_unused() still sees the NEW state as "ah yes, busy 
with auth and ciph" even though we might not be.

That saves us from triggering close events in numerous places, all we need to 
do is *not* advance to AUTH_CIPH during compl_l3 evaluation and automatically 
we've caught all wrong situations, for all the compl_l3 types (LU, CM Service 
Request, Paging) at the same time.

So it's all my fault, sorry for the complication, and I think the above is how 
we should fix it.

https://gerrit.osmocom.org/#/c/6503/2/src/libmsc/gsm_04_08.c
File src/libmsc/gsm_04_08.c:

Line 680:               osmo_fsm_inst_dispatch(conn->conn_fsm, 
SUBSCR_CONN_E_CN_CLOSE, &cause);
rather call

  msc_subscr_conn_close(conn, cause);

which should take the right action for any situation.

Closing the conn like this would also actually work without an FSM.


Line 744:                                               
GSM48_REJECT_INCORRECT_MESSAGE);
One problem I still see here is that we kind of imply that we'd accept multiple 
CM Service Request messages on the same conn, by cm_serv_reuse_conn() above. If 
we want to allow that properly, then we can't kill the conn just because the 
second service request failed; i.e.

  if (!msc_subscr_conn_is_accepted())
      msc_subscr_conn_close();
  else
      LOGP("keeping it open");

We'd also need to skip launching below vlr_proc_acc_req().

But I'm not sure whether we should actually support several CM Service Requests 
in the first place?? Is that a dedicated compl_l3 message or can it also happen 
apart from a compl_l3?


Line 825:       return msc_gsm48_tx_mm_serv_rej_and_close(conn, 
GSM48_REJECT_SRV_OPT_NOT_SUPPORTED);
does an FSM exist at this point? When doing subscr_conn_close() above closing 
the conn also works without an FSM.


Line 3627:      return msc_gsm48_tx_mm_serv_rej_and_close(conn, cause);
Hmm, I see that I was re-using this function for two purposes. Its primary 
intention is to be a callback for the VLR. See the msc_vlr_ops struct.

But I'm also using it in the code here to send out a service reject in failure 
paths, directly.

I doubt that when called from the VLR as a callback, that wes should trigger a 
close event. It's just the VLR asking us to send a message. The conn should 
then end up unused and be closed by the "normal" mechanisms.


-- 
To view, visit https://gerrit.osmocom.org/6503
To unsubscribe, visit https://gerrit.osmocom.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idfa39fdbe5bb764f8ea2bbf8c5442e15e01cadbb
Gerrit-PatchSet: 2
Gerrit-Project: osmo-msc
Gerrit-Branch: master
Gerrit-Owner: daniel <[email protected]>
Gerrit-Reviewer: Harald Welte <[email protected]>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: Neels Hofmeyr <[email protected]>
Gerrit-HasComments: Yes

Reply via email to