Hi Alex, Paul,

[Sorry, long email below. Feel free to skip if you are not Paul, or if 
you are not interested in or affected by this problem.]

On 05/03/2011 06:03 PM, Alex Shepherd wrote:

> I don't have a solution but a similar observation. I have also been trying
> to get a LinkUSB going on my NSLU2 running OpenWRT 10.03 and owfs 2.8p1 and
> I can't start owserver with:
>
>       owserver --LINK /dev/ttyUSB0
>
> Like I can on my Debian 6.0.1 Linux box. I have to use:
>
>       owserver -d /dev/ttyUSB0

I actually ran owserver from owfs 2.8p1 for several months using my 
LinkUSB bus master in emulation mode, so I used -d instead of --link. I 
don't remember if I ever tried --link but I can certainly try and report 
back.

I now have a better understanding of the issue:

ow_link.c:LINK_detect() has this:

                         SOC(in)->flow = flow_none ;
                         RETURN_GOOD_IF_GOOD( LINK_detect_serial(in) ) ;

                         LEVEL_DEBUG("Second attempt at serial LINK setup");
                         SOC(in)->flow = flow_hard ;
                         RETURN_GOOD_IF_GOOD( LINK_detect_serial(in) ) ;

                         LEVEL_DEBUG("Third attempt at serial LINK setup");
                         SOC(in)->flow = flow_hard ;
                         RETURN_GOOD_IF_GOOD( LINK_detect_serial(in) ) ;

Before continuing, it is important to note that my LinkUSB will not work 
at all unless hardware flow control is used.

Anyway, let's keep looking at the sequence of events... in the first 
detection attempt, LINK_detect_serial() will use no flow control. I know 
that this will not work for my LinkUSB master, but let's keep looking so 
we can understand the reason for the nasty delays:

ow_link.c:LINK_detect_serial() does this:

* Call LINK_write(LINK_string(" "), 1, in) to elicit a response from the 
Link.

* After sending that space character to the link, it enters the 
following loop:

// need to read 1 char at a time to get a short string
for ( version_index=0 ; version_index<MAX_LINK_VERSION_LENGTH ; ++version_index ) {
     if ( BAD( LINK_read_true_length( (BYTE *) &(version_string[version_index]), 1, in)) ) {

Then ow_link.c:LINK_read_true_length() does this:

* Call ow_com_read.c:COM_read( buf, size, in ), which in turn, calls 
ow_com_read.c:COM_read_size_low(), which in turn calls 
ow_tcp_read.c:tcp_read().

ow_tcp_read.c:tcp_read() uses select() to read from the file descriptor 
that is associated with the bus master. The file descriptor in my case 
is actually tied to a serial port and not a network connection (this is 
just an abstraction and, while a bit confusing in my opinion, there is 
nothing wrong with this). However, since my LinkUSB will not send 
anything back to the serial port it is connected to, this select() times 
out. The default timeout is 5 seconds.
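
For anyone not familiar with the pattern, here is a minimal sketch of a 
select()-based read with a timeout. This is not the owfs code: 
read_with_timeout() and its signature are names I made up for 
illustration, and only the return conventions (-EAGAIN on timeout with 
the byte count left at 0, -EBADF on select() error) follow what is 
described in this email.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not the owfs tcp_read(): wait up to timeout_sec
 * seconds for fd to become readable, then read at most size bytes.
 * Returns 0 and stores the byte count in *chars_in on success,
 * -EAGAIN on timeout (with *chars_in left at 0), -EBADF if select()
 * itself fails, or -errno if read() fails. */
static int read_with_timeout(int fd, void *buf, size_t size,
                             long timeout_sec, size_t *chars_in)
{
    fd_set readset;
    struct timeval tv;
    int rc;
    ssize_t n;

    *chars_in = 0;

    FD_ZERO(&readset);
    FD_SET(fd, &readset);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    rc = select(fd + 1, &readset, NULL, NULL, &tv);
    if (rc < 0) {
        return -EBADF;      /* select() error (simplified handling) */
    }
    if (rc == 0) {
        return -EAGAIN;     /* nothing arrived before the timeout */
    }

    n = read(fd, buf, size);
    if (n < 0) {
        return -errno;
    }
    *chars_in = (size_t) n;
    return 0;
}
```

The important detail is that a timeout is reported only through the 
return value. A caller that ignores -EAGAIN and looks just at the byte 
count sees "0 bytes read" and nothing else.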

On select() timeout, ow_tcp_read.c:tcp_read() returns -EAGAIN *but* it 
also leaves input parameter *chars_in set to 0.

After ow_tcp_read.c:tcp_read() returns to 
ow_com_read.c:COM_read_size_low() in this timeout scenario, 
ow_com_read.c:COM_read_size_low() checks only for -EBADF (returned by 
tcp_read() in the case of a select() error) and since that is not the 
case in the timeout scenario, ow_com_read.c:COM_read_size_low() ends up 
returning actual_size, which was actually passed to tcp_read() as the 
chars_in parameter and is now zero.

Upon returning from ow_com_read.c:COM_read_size_low(), 
ow_com_read.c:COM_read() returns gbGOOD since "ssize_t actual" is 0.

We are now back at ow_link.c:LINK_read_true_length(), which just returns 
the gbGOOD that COM_read() returned.
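
To make the chain of return values easier to follow, here is a 
stripped-down model of it. This is not the owfs source: every name 
ending in _model is made up, and the "non-negative count means good" 
rule in com_read_model() is my reading of the behavior described above, 
not a copy of the real check.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified model of the call chain described above; NOT the owfs code.
 * All names ending in _model are made up for illustration. */
enum result_model { GOOD_MODEL = 0, BAD_MODEL = 1 };

/* Stands in for tcp_read() when select() times out: reports -EAGAIN
 * and leaves *chars_in at 0. */
static int tcp_read_timeout_model(size_t *chars_in)
{
    *chars_in = 0;
    return -EAGAIN;
}

/* Stands in for COM_read_size_low(): only -EBADF is treated as an
 * error, so -EAGAIN falls through and the byte count (0) is returned. */
static long read_size_low_model(void)
{
    size_t actual_size = 0;
    int rc = tcp_read_timeout_model(&actual_size);

    if (rc == -EBADF) {
        return -EBADF;
    }
    return (long) actual_size;
}

/* Stands in for COM_read(): a non-negative count is reported as good
 * (assumption), even when it is 0 and the caller asked for 1 byte. */
static enum result_model com_read_model(void)
{
    long actual = read_size_low_model();

    return (actual < 0) ? BAD_MODEL : GOOD_MODEL;
}
```

In this model com_read_model() returns GOOD_MODEL even though the lowest 
layer reported a timeout, which is the situation the version-reading 
loop ends up in.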

And finally, we are back at the "for ( version_index=0 ; 
version_index<MAX_LINK_VERSION_LENGTH; ++version_index)" loop, with a 
gbGOOD result after the call to LINK_read_true_length(). This causes the 
check:

if ( BAD( LINK_read_true_length() ) ...

to be false, so we stay in the loop. However, nothing was read from the 
serial port (again, because it needs to be configured for hardware flow 
control, which it was not during the first attempt), so the rest of the 
code in this "for" loop will fail to make progress in identifying a Link 
bus master.

The end result is that we stay in this "for" loop for 
MAX_LINK_VERSION_LENGTH iterations, and each iteration lasts as long as 
it takes select() to time out, which by default is 5 seconds.

So, from the moment owserver is run and a first attempt to communicate 
with the LinkUSB with no flow control is made, to the second attempt 
with the hardware flow control setting that the LinkUSB needs, 
5*MAX_LINK_VERSION_LENGTH = 5*36 = 180 seconds = 3 minutes will have 
elapsed.

If you look at my logs, this is exactly the time it took for my LinkUSB 
to finally be recognized:

> ow_link.c:LINK_version(298) Checking LINK version
> May  3 14:22:38 altamira OWFS[21715]:   DEBUG:

[...]

> May  3 14:25:38 altamira OWFS[21715]:   DEBUG:
> ow_link.c:LinkVersion_knownstring(133) Link version Found 1.4

Personally, and no offense to Paul ;-), I never liked the approach of 
trying first with no flow control, followed by attempts with hardware 
flow control. I didn't like it because I thought that it would result in 
unnecessary delays. Instead of that approach I would have preferred to 
see a command-line switch that forced hardware flow control, just as 
there is a command-line switch to set the baud rate. This would have 
resulted in immediate success during serial port communications provided 
that the right command-line argument was specified.
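
As a rough idea of what such a switch would have to do underneath, here 
is a sketch using plain termios. The enum and apply_flow_choice() are 
hypothetical names, not existing owfs options or functions, and 
CRTSCTS is a BSD/Linux extension rather than POSIX, so this assumes a 
platform that provides it.

```c
#define _DEFAULT_SOURCE   /* CRTSCTS is a BSD/Linux extension, not POSIX */
#include <termios.h>

/* Hypothetical flow-control choices for an imagined command-line
 * option; owfs has its own flow_none / flow_hard values, which this
 * does not reproduce. */
enum flow_choice { FLOW_CHOICE_NONE, FLOW_CHOICE_HARD };

/* Set or clear RTS/CTS hardware flow control in a termios structure.
 * The caller is expected to fill tio with tcgetattr() beforehand and
 * apply it with tcsetattr() afterwards. */
static void apply_flow_choice(struct termios *tio, enum flow_choice flow)
{
    if (flow == FLOW_CHOICE_HARD) {
        tio->c_cflag |= CRTSCTS;
    } else {
        tio->c_cflag &= ~(tcflag_t) CRTSCTS;
    }
    /* Software (XON/XOFF) flow control stays off in both cases. */
    tio->c_iflag &= ~(tcflag_t) (IXON | IXOFF | IXANY);
}
```

With something like this wired to a command-line argument, the port 
would be opened with the right flow control on the first attempt and 
there would be no failed attempt to wait for.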

While the above LinkUSB detection delay would not have been a problem if 
I had had a way of forcing hardware flow control, I did not implement an 
alternative to the three-attempt approach that Paul implemented (no flow 
control first, then hardware flow control twice), so I don't get to 
complain ;-)

So, how to fix this (I hope there is agreement that taking 3 minutes to 
detect a LinkUSB is a bug, even though the end result is successful 
detection)? I haven't tried yet, but I think we need to propagate gbBAD 
from tcp_read() all the way back to the "for" loop in 
ow_link.c:LINK_version() so the "if ( BAD( LINK_read_true_length() ..." 
check is true and we leave LINK_version() right after the first timeout, 
instead of iterating MAX_LINK_VERSION_LENGTH times.
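
To illustrate the difference this would make, here is a small simulation 
of the loop. It is only a model: the function names are made up, no 
serial port is involved, and the 36 and 5 are the 
MAX_LINK_VERSION_LENGTH value and default timeout used in the 
arithmetic above.

```c
#define MODEL_VERSION_LENGTH 36   /* MAX_LINK_VERSION_LENGTH as used above */
#define MODEL_TIMEOUT_SEC     5   /* default select() timeout as used above */

/* Hypothetical stand-in for LINK_read_true_length() on a silent port.
 * Every call "costs" one timeout. If propagate_timeout is nonzero the
 * timeout is reported as a failure (1); otherwise it is swallowed and
 * reported as success (0), as in the behavior described above. */
static int read_one_byte_model(int propagate_timeout, int *elapsed_sec)
{
    *elapsed_sec += MODEL_TIMEOUT_SEC;
    return propagate_timeout ? 1 : 0;
}

/* Models the version-reading loop and returns the simulated number of
 * seconds spent before the loop gives up. */
static int version_loop_seconds_model(int propagate_timeout)
{
    int elapsed_sec = 0;
    int i;

    for (i = 0; i < MODEL_VERSION_LENGTH; ++i) {
        if (read_one_byte_model(propagate_timeout, &elapsed_sec)) {
            break;   /* leave right after the first timeout */
        }
    }
    return elapsed_sec;
}
```

version_loop_seconds_model(0) gives 180, matching the 3 minutes in my 
logs, while version_loop_seconds_model(1) gives 5, i.e. a single 
timeout before moving on to the attempt with hardware flow control.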

Note that the above analysis applies to --link. However, using the 
LinkUSB in emulation mode (-d /dev/ttyUSBxx) should result in a similar 
problem. Perhaps not on the order of 5 seconds x 
MAX_LINK_VERSION_LENGTH, but there will be some 5-second delays too, 
since -d ends up calling tcp_read() as well, which will time out after 5 
seconds when communications without hardware flow control are attempted.

My band-aid solution for now has been to do a global search and replace 
of flow_none with flow_hard, leaving everything else the same. This 
causes the first communication attempt to be done with hardware flow 
control, which causes communications to succeed right away. I am 
currently using -d, i.e. emulation mode.

Hope this helps.

Cheers,

Eloy Paris.-
