Re: uthum dropping out [was re ugold]

2013-05-04 Thread Stuart Henderson
On 2013/05/04 01:49, Stuart Henderson wrote:
 On 2013/05/04 01:40, Stuart Henderson wrote:
  --- uthum.c	15 Apr 2013 09:23:02 -0000	1.19
  +++ uthum.c	4 May 2013 00:19:28 -0000
  @@ -515,7 +515,7 @@ uthum_ntc_getdata(struct uthum_softc *sc
   		return EIO;
   
   	/* get sensor value */
  -	if (uthum_read_data(sc, CMD_GETDATA_NTC, buf, sizeof(buf), 10) != 0) {
  +	if (uthum_read_data(sc, CMD_GETDATA_NTC, buf, sizeof(buf), 1000) != 0) {
   		DPRINTF(("uthum: data read fail\n"));
   		return EIO;
   	}
  @@ -600,6 +600,7 @@ uthum_ntc_tuning(struct uthum_softc *sc,
   		}
   		ostate = state;
   	}
  +	tsleep(&sc->sc_sensortask, 0, "uthum", hz * 10);
   
   	DPRINTF(("uthum: ntc tuning done. state change: 0x%.2x->0x%.2x\n",
   	    s->cur_state, state));
  
 
 ...of course, as soon as I send the diff out, I get a handful of
 messages even with the huge delays
 
 uthum_ntc_getdata: broken ntc data 0x16 0x00 0x31
 uthum_refresh_temperntc: data read fail
 uthum_ntc_getdata: broken ntc data 0x16 0x00 0x31
 uthum_refresh_temperntc: data read fail
 uthum_ntc_getdata: broken ntc data 0x16 0x00 0x31
 uthum_refresh_temperntc: data read fail
 
 but so far it has not got stuck, and the sensors stay attached.
 I'll re-check later...
 

so... each time this happens, the sensor device disappears, which
some userland monitoring programs don't cope with particularly well.
This happens fairly often, e.g. at

11:38:28
11:38:44
11:39:18
11:52:43
11:53:00
11:53:17
12:05:42
12:07:16 ...

I noticed that uthum_ntc_tuning() calls uthum_ntc_getdata() and permits
it to retry 3 times. New diff below takes a different approach: leave
timeouts as they were, and move this retry code up into uthum_ntc_getdata().

With this I still hit some "broken ntc data" DPRINTFs; however, after
retrying, the read succeeds, the sensor is updated, and things are more robust.
I've also changed some DPRINTF() to make it clear which function
they're called from.
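
As a rough illustration of the approach described above (a bounded retry
inside the read routine itself), here is a self-contained sketch. It is not
the driver code: ntc_read_retry(), read_fn, fake_read(), struct fake_dev and
the EOF2_MARKER value are all made up for the example, and the way the two
data bytes are combined into a value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's CMD_GETDATA_EOF2 trailer byte. */
#define EOF2_MARKER	0xaa

/* Hypothetical transport callback: fills buf, returns 0 on success. */
typedef int (*read_fn)(void *ctx, uint8_t *buf, size_t len);

/*
 * Try up to max_tries reads. A read counts as good only if the transfer
 * succeeded and the third byte carries the expected trailer marker.
 * Returns 0 and stores the value on success, EIO if every attempt failed.
 */
int
ntc_read_retry(read_fn rd, void *ctx, int max_tries, int *val)
{
	uint8_t buf[8];
	int tries;

	assert(max_tries > 0);
	if (rd == NULL || val == NULL)
		return EIO;

	for (tries = 0; tries < max_tries; tries++) {
		if (rd(ctx, buf, sizeof(buf)) != 0)
			continue;	/* transfer failed, try again */
		if (buf[2] != EOF2_MARKER)
			continue;	/* trailer byte wrong, try again */
		/* Assumed layout: big-endian 16-bit value in buf[0..1]. */
		*val = (buf[0] << 8) + buf[1];
		return 0;
	}
	return EIO;			/* too many failures */
}

/* Fake device: hands out `bad` broken frames, then good ones. */
struct fake_dev {
	int bad;	/* broken frames still to be returned */
	int calls;	/* number of reads performed so far */
};

int
fake_read(void *ctx, uint8_t *buf, size_t len)
{
	struct fake_dev *d = ctx;
	size_t i;

	for (i = 0; i < len; i++)
		buf[i] = 0;
	d->calls++;
	buf[0] = 0x15;
	buf[1] = 0x30;
	buf[2] = (d->bad-- > 0) ? 0x31 : EOF2_MARKER;
	return 0;
}
```

With a fake device that returns two broken frames before a good one,
ntc_read_retry(fake_read, &dev, 3, &val) succeeds on the third attempt; with
three broken frames in a row it gives up and returns EIO.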

May  4 14:31:16 slate /bsd: uhidev0 at uhub0 port 2 configuration 1 interface 0 "Ten X Technology, Inc. TEMPer sensor" rev 1.10/1.50 addr 2
May  4 14:31:16 slate /bsd: uhidev0: iclass 3/1
May  4 14:31:16 slate /bsd: uthum0 at uhidev0
May  4 14:31:16 slate /bsd: uhidev1 at uhub0 port 2 configuration 1 interface 1 "Ten X Technology, Inc. TEMPer sensor" rev 1.10/1.50 addr 2
May  4 14:31:16 slate /bsd: uhidev1: iclass 3/0
May  4 14:31:16 slate /bsd: uthum1 at uhidev1
May  4 14:31:16 slate /bsd: uthum1: type ds75/12bit (temperature), calibration offset -1.0 degC
May  4 14:31:16 slate /bsd: uthum1: type NTC (temperature), calibration offset 1.0 degC
May  4 14:31:16 slate /bsd: uthum_attach: complete
May  4 14:31:16 slate /bsd: uthum: ntc tuning start. cur state = 0x61, val = 0x83d4
May  4 14:31:17 slate /bsd: uthum: ntc tuning done. state change: 0x61->0x65
May  4 14:34:39 slate /bsd: uthum_ntc_getdata: broken ntc data 0x18 0x80 0x31
May  4 14:38:53 slate /bsd: uthum_ntc_getdata: broken ntc data 0x18 0x80 0x31
May  4 14:39:16 slate /bsd: uthum_ntc_getdata: broken ntc data 0x18 0x90 0x31
May  4 14:39:38 slate /bsd: uthum_ntc_getdata: broken ntc data 0x18 0x80 0x31
May  4 14:40:01 slate /bsd: uthum_ntc_getdata: broken ntc data 0x18 0x80 0x31
May  4 14:40:24 slate /bsd: uthum_ntc_getdata: broken ntc data 0x18 0x80 0x31

any comments? OK?

Index: uthum.c
===================================================================
RCS file: /cvs/src/sys/dev/usb/uthum.c,v
retrieving revision 1.20
diff -u -p -r1.20 uthum.c
--- uthum.c	4 May 2013 12:22:14 -0000	1.20
+++ uthum.c	4 May 2013 13:44:54 -0000
@@ -489,21 +489,31 @@ uthum_setup_sensors(struct uthum_softc *
 int
 uthum_ntc_getdata(struct uthum_softc *sc, int *val)
 {
+	int retry = 3;
 	uint8_t buf[8];
 
 	if (val == NULL)
 		return EIO;
 
-	/* get sensor value */
-	if (uthum_read_data(sc, CMD_GETDATA_NTC, buf, sizeof(buf), 10) != 0) {
-		DPRINTF(("uthum: data read fail\n"));
-		return EIO;
+	while (retry) {
+		/* get sensor value */
+		if (uthum_read_data(sc, CMD_GETDATA_NTC,
+		    buf, sizeof(buf), 10) != 0) {
+			DPRINTF(("%s: data read fail\n", __FUNCTION__));
+			retry--;
+			continue;
+		}
+		/* check data integrity */
+		if (buf[2] !=  CMD_GETDATA_EOF2) {
+			DPRINTF(("%s: broken ntc data 0x%.2x 0x%.2x 0x%.2x\n",
+			    __FUNCTION__, buf[0], buf[1], buf[2]));
+			retry--;
+			continue;
+		}
+		break;
 	}
-
-	/* check data integrity */
-	if (buf[2] !=  CMD_GETDATA_EOF2) {
-		DPRINTF(("uthum: broken ntc data 0x%.2x 0x%.2x 0x%.2x\n",
-		    buf[0], buf[1], buf[2]));
+	if (retry <= 0) {
+   DPRINTF((%s: too many failures, 

Re: uthum dropping out [was re ugold]

2013-05-04 Thread Yojiro UO
Hi, 

All I can remember is that the NTC sensor calibration mechanism is
very complicated and was hard to reverse engineer.

To fix (or discuss) the problem, I have to find my notes on the device.
Could you wait until next week? (All the TEMPer devices are in my office,
and the notes are probably there too.)

And if you can capture the USB bus traffic, could you send it to me?
It would be very helpful for checking.

-- Yojiro UO



Re: uthum dropping out [was re ugold]

2013-05-04 Thread Stuart Henderson
On 2013/05/04 23:32, Yojiro UO wrote:
 Hi, 
 
 All I can remember is that the NTC sensor calibration mechanism is
 very complicated and was hard to reverse engineer.
 
 To fix (or discuss) the problem, I have to find my notes on the device.
 Could you wait until next week? (All the TEMPer devices are in my office,
 and the notes are probably there too.)

Yes, this is totally fine with me.

Since posting, I have now hit the "too many failures" case with my diff,
right after tuning; so the diff helps my device a lot, but isn't perfect.
It recovered afterwards, and there have been many other times when it
needed 1 or 2 retries.

May  4 16:05:28 symphytum /bsd: uthum1: type ds75/12bit (temperature), calibration offset -1.0 degC
May  4 16:05:28 symphytum /bsd: uthum1: type NTC (temperature), calibration offset 1.0 degC
May  4 16:05:28 symphytum /bsd: uthum_attach: complete
May  4 16:05:28 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0xff 0xff 0xff
May  4 16:05:28 symphytum /bsd: uthum: ntc tuning start. cur state = 0x65, val = 0x83bb
May  4 16:05:28 symphytum /bsd: uthum: ntc tuning done. state change: 0x65->0x66
May  4 16:05:28 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0x15 0x30 0x31
May  4 16:05:28 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0x15 0x70 0x31
May  4 16:05:28 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0x15 0x60 0x31
May  4 16:05:28 symphytum /bsd: uthum_ntc_getdata: too many failures
May  4 16:05:28 symphytum /bsd: uthum_refresh_temperntc: data read fail

...

May  4 16:07:43 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0x15 0x60 0x31
May  4 16:08:35 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0x15 0x50 0x31
May  4 16:11:33 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0x15 0x30 0x31
May  4 16:11:33 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0x15 0x30 0x31

^^ so here, it retried 2 times

May  4 16:20:58 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0x15 0x30 0x31
May  4 16:26:55 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0x15 0x60 0x31
May  4 16:28:17 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0x15 0x50 0x31
May  4 16:29:02 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0x15 0x40 0x31
May  4 16:29:17 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0x15 0x40 0x31
May  4 16:31:46 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0x15 0x50 0x31
May  4 16:31:46 symphytum /bsd: uthum_ntc_getdata: broken ntc data 0x15 0x50 0x31

^^ and here

 And if you can capture the USB bus traffic, could you send it to me?
 It would be very helpful for checking.

I have no hardware bus analyser, but maybe I can find a Windows machine
with a software analyser to see if ThermoHID / UTAC do something different.



Re: uthum dropping out [was re ugold]

2013-05-03 Thread Stuart Henderson
On 2013/05/04 01:40, Stuart Henderson wrote:
 --- uthum.c	15 Apr 2013 09:23:02 -0000	1.19
 +++ uthum.c	4 May 2013 00:19:28 -0000
 @@ -515,7 +515,7 @@ uthum_ntc_getdata(struct uthum_softc *sc
  		return EIO;
  
  	/* get sensor value */
 -	if (uthum_read_data(sc, CMD_GETDATA_NTC, buf, sizeof(buf), 10) != 0) {
 +	if (uthum_read_data(sc, CMD_GETDATA_NTC, buf, sizeof(buf), 1000) != 0) {
  		DPRINTF(("uthum: data read fail\n"));
  		return EIO;
  	}
 @@ -600,6 +600,7 @@ uthum_ntc_tuning(struct uthum_softc *sc,
  		}
  		ostate = state;
  	}
 +	tsleep(&sc->sc_sensortask, 0, "uthum", hz * 10);
  
  	DPRINTF(("uthum: ntc tuning done. state change: 0x%.2x->0x%.2x\n",
  	    s->cur_state, state));
 

...of course, as soon as I send the diff out, I get a handful of
messages even with the huge delays

uthum_ntc_getdata: broken ntc data 0x16 0x00 0x31
uthum_refresh_temperntc: data read fail
uthum_ntc_getdata: broken ntc data 0x16 0x00 0x31
uthum_refresh_temperntc: data read fail
uthum_ntc_getdata: broken ntc data 0x16 0x00 0x31
uthum_refresh_temperntc: data read fail

but so far it has not got stuck, and the sensors stay attached.
I'll re-check later...