Hi Alex, Paul,
[Sorry, long email below. Feel free to skip if you are not Paul, or if
you are not interested or affected by this problem.]
On 05/03/2011 06:03 PM, Alex Shepherd wrote:
> I don't have a solution but a similar observation. I have also been trying
> to get a LinkUSB going on my NSLU2 running OpenWRT 10.03 and owfs 2.8p1 and
> I can't start owserver with:
>
> owserver --LINK /dev/ttyUSB0
>
> Like I can on my Debian 6.0.1 Linux box. I have to use:
>
> owserver -d /dev/ttyUSB0
I actually ran owserver from owfs 2.8p1 for several months using my
LinkUSB bus master in emulation mode, so I used -d instead of --link. I
don't remember if I ever tried --link but I can certainly try and report
back.
I now have a better understanding of the issue:
ow_link.c:LINK_detect() has this:
SOC(in)->flow = flow_none ;
RETURN_GOOD_IF_GOOD( LINK_detect_serial(in) ) ;
LEVEL_DEBUG("Second attempt at serial LINK setup");
SOC(in)->flow = flow_hard ;
RETURN_GOOD_IF_GOOD( LINK_detect_serial(in) ) ;
LEVEL_DEBUG("Third attempt at serial LINK setup");
SOC(in)->flow = flow_hard ;
RETURN_GOOD_IF_GOOD( LINK_detect_serial(in) ) ;
Before continuing, it is important to note that my LinkUSB will not work
at all unless hardware flow control is used.
Anyway, let's keep looking at the sequence of events... in the first
detection attempt, LINK_detect_serial() will use no flow control. I know
that this will not work for my LinkUSB master, but let's keep looking so
we can understand the reason for the nasty delays:
ow_link.c:LINK_detect_serial() does this:
* Call LINK_write(LINK_string(" "), 1, in) to elicit a response from the
Link.
* After sending that space character to the link, it enters the
following loop:
// need to read 1 char at a time to get a short string
for ( version_index=0 ; version_index<MAX_LINK_VERSION_LENGTH ; ++version_index ) {
    if ( BAD( LINK_read_true_length( (BYTE *) &(version_string[version_index]), 1, in)) ) {
Then ow_link.c:LINK_read_true_length() does this:
* Call ow_com_read.c:COM_read( buf, size, in ), which in turn, calls
ow_com_read.c:COM_read_size_low(), which in turn calls
ow_tcp_read.c:tcp_read().
ow_tcp_read.c:tcp_read() uses select() to read from the file descriptor
that is associated with the bus master. The file descriptor in my case
is actually tied to a serial port and not a network connection (this is
just an abstraction and, while a bit confusing in my opinion, there is
nothing wrong with this). However, since my LinkUSB will not send
anything back to the serial port it is connected to, this select() times
out. The default timeout is 5 seconds.
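To make the timeout behavior concrete, here is a minimal sketch of a select()-based single-byte read. It is not the owfs tcp_read() code; the helper name read_byte_with_timeout() and its return convention are made up for illustration.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (not part of owfs): wait up to timeout_ms for one
 * byte on fd. Returns 1 if a byte was read, 0 if select() timed out,
 * and a negative errno value on error or end-of-file. */
int read_byte_with_timeout(int fd, unsigned char *out, int timeout_ms)
{
    fd_set readset;
    struct timeval tv;
    ssize_t n;
    int rc;

    FD_ZERO(&readset);
    FD_SET(fd, &readset);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &readset, NULL, NULL, &tv);
    if (rc < 0) {
        return -errno;      /* select() itself failed */
    }
    if (rc == 0) {
        return 0;           /* nothing arrived before the timeout */
    }

    n = read(fd, out, 1);
    if (n < 0) {
        return -errno;      /* read error */
    }
    if (n == 0) {
        return -EIO;        /* end-of-file, reported as an error here */
    }
    return 1;
}
```

With a 5000 ms timeout and a device that never answers, every call to such a helper blocks for the full 5 seconds before returning 0.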
On select() timeout, ow_tcp_read.c:tcp_read() returns -EAGAIN *but* it
also leaves input parameter *chars_in set to 0.
After ow_tcp_read.c:tcp_read() returns to
ow_com_read.c:COM_read_size_low() in this timeout scenario,
ow_com_read.c:COM_read_size_low() checks only for -EBADF (returned by
tcp_read() in the case of a select() error). Since that is not the case
in the timeout scenario, ow_com_read.c:COM_read_size_low() ends up
returning actual_size, which was passed to tcp_read() as the chars_in
parameter and is now zero.
Upon returning from ow_com_read.c:COM_read_size_low(),
ow_com_read.c:COM_read() returns gbGOOD since "ssize_t actual" is 0.
We are now back at ow_link.c:LINK_read_true_length(), which just returns
the gbGOOD that COM_read() returned.
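The call chain described above can be modeled with a few stand-in functions. This is a simplified simulation, not the owfs source: all sim_* names are made up, and the assumption that the outermost layer treats any non-negative count as good is mine.

```c
#include <errno.h>
#include <stddef.h>

enum sim_result { SIM_GOOD = 0, SIM_BAD = 1 };

/* Stand-in for tcp_read() when select() times out: it reports -EAGAIN
 * and sets the number of characters read to zero. */
int sim_tcp_read_timeout(unsigned char *buf, size_t size, size_t *chars_in)
{
    (void)buf;
    (void)size;
    *chars_in = 0;
    return -EAGAIN;
}

/* Stand-in for COM_read_size_low(): only -EBADF is treated as an error,
 * so the -EAGAIN from a timeout is dropped and the byte count (0) is
 * returned instead. */
long sim_read_size_low(unsigned char *buf, size_t size)
{
    size_t actual_size = 0;
    int rc = sim_tcp_read_timeout(buf, size, &actual_size);

    if (rc == -EBADF) {
        return -EBADF;
    }
    return (long)actual_size;
}

/* Stand-in for COM_read(): assumed to report good for any non-negative
 * count, which includes the 0 produced by a timeout. */
enum sim_result sim_com_read(unsigned char *buf, size_t size)
{
    long actual = sim_read_size_low(buf, size);

    return (actual < 0) ? SIM_BAD : SIM_GOOD;
}
```

Calling sim_com_read() here yields SIM_GOOD even though no byte was read, which is the situation the caller in ow_link.c ends up in.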
And finally, we are back at the "for ( version_index=0 ;
version_index<MAX_LINK_VERSION_LENGTH; ++version_index)" loop, with a
gbGOOD result after the call to LINK_read_true_length(). This causes the
check:
if ( BAD( LINK_read_true_length() ) ...
to be false, so we stay in the loop. However, nothing was read from the
serial port (again, because it needs to be configured for hardware flow
control, which it was not during the first attempt), so the rest of the
code in this "for" loop will fail to make progress in identifying a Link
bus master.
The end result is that we stay in this "for" loop for
MAX_LINK_VERSION_LENGTH iterations, and each iteration lasts as long as
it takes select() to time out, which by default is 5 seconds.
So, from the moment owserver is run and a first attempt to communicate
with the LinkUSB with no flow control is made, to the second attempt
with the hardware flow control setting that the LinkUSB needs,
5*MAX_LINK_VERSION_LENGTH = 5*36 = 180 seconds = 3 minutes will have
elapsed.
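The arithmetic can be checked with a tiny model of the loop. This is only an illustration: the function name is made up, and it counts seconds instead of actually waiting.

```c
/* Hypothetical model of the version-reading loop: every iteration costs
 * one full select() timeout, and a timed-out read is (wrongly) reported
 * as good, so the loop never exits early. Returns the total number of
 * seconds spent waiting. */
int sim_detection_delay_seconds(int max_version_length, int timeout_seconds)
{
    int waited = 0;

    for (int i = 0; i < max_version_length; ++i) {
        waited += timeout_seconds;  /* read blocks until select() times out */
        /* the read reports good, so there is no break here */
    }
    return waited;
}
```

With max_version_length = 36 and timeout_seconds = 5 this returns 180, matching the 3 minutes observed.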
If you look at my logs, this is exactly the time it took for my LinkUSB
to finally be recognized:
> ow_link.c:LINK_version(298) Checking LINK version
> May 3 14:22:38 altamira OWFS[21715]: DEBUG:
[...]
> May 3 14:25:38 altamira OWFS[21715]: DEBUG:
> ow_link.c:LinkVersion_knownstring(133) Link version Found 1.4
Personally, and no offense to Paul ;-), I never liked the approach of
trying first with no flow control, followed by attempts with hardware
flow control. I didn't like it because I thought that it would result in
unnecessary delays. Instead of that approach I would have preferred to
see a command-line switch that forced hardware flow control, just as
there is a command-line switch to set the baud rate. This would have
resulted in immediate success during serial port communications provided
that the right command-line argument was specified.
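For reference, forcing hardware (RTS/CTS) flow control on a serial port comes down to one termios flag. The sketch below is not owfs code; set_hw_flow() is a made-up helper that such a command-line switch could end up calling.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes CRTSCTS and IXANY on glibc */
#endif
#include <termios.h>

/* Hypothetical helper (not part of owfs): turn RTS/CTS hardware flow
 * control on or off in a termios structure. The caller is expected to
 * obtain the structure with tcgetattr() and apply it with tcsetattr(). */
void set_hw_flow(struct termios *t, int enable)
{
    if (enable) {
        t->c_cflag |= CRTSCTS;                  /* RTS/CTS on */
        t->c_iflag &= ~(IXON | IXOFF | IXANY);  /* software flow control off */
    } else {
        t->c_cflag &= ~CRTSCTS;                 /* RTS/CTS off */
    }
}
```

Typical usage would be tcgetattr(fd, &t), then set_hw_flow(&t, 1), then tcsetattr(fd, TCSANOW, &t) on the file descriptor opened for /dev/ttyUSB0.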
While the above LinkUSB detection delay would not have been a problem if
I had had a way of forcing hardware flow control, I did not implement an
alternative to the three attempts with no flow control/flow control/flow
control approach that Paul implemented, so I don't get to complain ;-)
So, how to fix this (I hope there is agreement that taking 3 minutes to
detect a LinkUSB is a bug, even though the end result is successful
detection)? I haven't tried yet, but I think we need to propagate gbBAD
from tcp_read() all the way back to the "for" loop in
ow_link.c:LINK_version() so the "if ( BAD( LINK_read_true_length() ..."
check is true and we leave LINK_version() right after the first timeout,
instead of iterating MAX_LINK_VERSION_LENGTH times.
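A rough sketch of that idea, using stand-in functions rather than the real owfs ones (all fix_* names are made up, and I have not tested this against the actual code): the low-level read passes any negative return through, so the loop sees a bad result on the first timeout.

```c
#include <errno.h>
#include <stddef.h>

enum fix_result { FIX_GOOD = 0, FIX_BAD = 1 };

/* Stand-in for tcp_read() on a select() timeout. */
int fix_tcp_read_timeout(size_t *chars_in)
{
    *chars_in = 0;
    return -EAGAIN;
}

/* Stand-in for COM_read_size_low() with the proposed change: every
 * negative return (-EAGAIN as well as -EBADF) is propagated. */
long fix_read_size_low(void)
{
    size_t actual_size = 0;
    int rc = fix_tcp_read_timeout(&actual_size);

    if (rc < 0) {
        return rc;
    }
    return (long)actual_size;
}

/* Stand-in for COM_read(): a negative count now maps to bad. */
enum fix_result fix_com_read(void)
{
    return (fix_read_size_low() < 0) ? FIX_BAD : FIX_GOOD;
}

/* Stand-in for the version-reading loop: returns how many iterations
 * ran before a bad read ended it. */
int fix_version_loop_iterations(int max_version_length)
{
    int iterations = 0;

    for (int i = 0; i < max_version_length; ++i) {
        ++iterations;
        if (fix_com_read() == FIX_BAD) {
            break;  /* leave after the first timeout */
        }
    }
    return iterations;
}
```

In this model the loop runs once, so the failed no-flow-control attempt would cost one 5-second timeout instead of 36. Whether other callers rely on a timeout being reported as a zero-length good read would need checking before changing the real code.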
Note that the above analysis applies to --link. However, using the
LinkUSB in emulation mode (-d /dev/ttyUSBxx) should result in a similar
problem. Perhaps not on the order of 5 seconds x
MAX_LINK_VERSION_LENGTH, but there will be some 5-second delays too,
since -d ends up calling tcp_read() as well, which will time out after 5
seconds when communications without hardware flow control are attempted.
My band-aid solution for now has been to do a global search and replace
of flow_none with flow_hard, leaving everything else the same. This
causes the first communication attempt to be done with hardware flow
control, which causes communications to succeed right away. I am
currently using -d, i.e. emulation mode.
Hope this helps.
Cheers,
Eloy Paris.-