Hello Ken

So I apologise for not being clear enough - the request for debug was
not on the "telnet>" prompt,
but an actual katcp request, so sending a

?log-level debug

or even a

?log-level trace

once you have connected, no typing in of the telnet escape sequence
required or desired.

It might be worth re-running this to see if you can get more detail

But, provisionally, it looks like the problem is somewhere here:

#log info 1653629003.095 raw awaiting\_transfer\_on\_port\_2157
#log error 1653629003.121 kcpfpg
unable\_to\_create\_client\_connection\_to\_server\_localhost:\_Connection\_refused
#log info 1653629003.122 raw job\_858\_completed\_with\_code\_2

A part of the fpg file is transferred using a separate port - in this
instance 2157. Connection refused suggests
nothing is listening there - perhaps a timeout or another error which
causes tcpborphserver to
not even get as far as binding the port to receive the bitstream ?

The programming step is unfortunately a bit complicated - the start of
an fpg file is actually a set of
katcp commands you can copy into the telnet session ("?register xxx
...") while the bitstream is sent across afterwards
on a separate port using the "?uploadbin" request using the content of
the fpg file after the "?quit" and where it turns into binary.

The idea was to have "kcpfpg" be able to do all of that across the
network, so with a
kcpfpg installed on a remote machine in the correct location, the
correct KATCP_SERVER environment
variable, it is possible to mark the fpg file executable and "run" it
to perform the upload, across the network, like
a shell script  - emulating what was possible on the original "borph"
system, but using tcp. Though that seems to not have
caught on (a pity).

The complication is that this method is also used "internally" to run
a stored fpg file on the board itself, which means
there is also a kcpfpg connecting to localhost, and that seems to
fail. I suspect that something goes wrong
before that - but more information would be needed.

If you'd like to partition the problem a bit more, perhaps building
kcpfpg on a remote machine (or even locally on the board), and
then using  "./kcpfpg -s xxx.177.143.srv:7130 yourfile.fpg" to do the
programming could be helpful, as that would
remove the "nested" uploads.

And perhaps even an "strace -o logfile ./kcpfpg -s
xxx.177.143.srv:7130 yourfile.fpg" to see what the steps are
and where they might fail.

Perhaps the fpga is not properly deprogrammed by killing the existing
tcpborphserver process ? At first
glance it does not seem firewall related, though I am wrong often enough.

regards

marc

On Mon, Dec 9, 2024 at 11:49 PM Ken Semanov <shapkiqua...@gmail.com> wrote:
>
> Hello Marc,The output shown below is an attempt to send a bit stream using 
> fpga.upload_to_ram_and_program( ) . This is being sent to a tcpborphserver 
> running on port 7130. On the board side, only this server is running by 
> itself. (no multiple running servers). It appears to conne ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ 
> ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌
> CGBANNERINDICATOR
> Hello Marc,
>
> The output shown below is an attempt to send a bit stream using 
> fpga.upload_to_ram_and_program( ) . This is being sent to a tcpborphserver 
> running on port 7130.  On the board side, only this server is running by 
> itself. (no multiple running servers).   It appears to connect to the board, 
> but an attempt to upload a bitstream raises an exception in code file 
> transport_katcp.py
>
> I can sudo kill the tcpborphserver, and restart another one on the original 
> port 7147. When I do that,  everything connects perfectly and the bitstream 
> works fine.  My current hypothesis is that port 7130 is being blocked by a 
> firewall.  everyone is welcome to try to replicate this error on your own.
>
>
> [START] connect to zcu216 [xxx.177.143.srv] . . .
> KatcpTransport class created on port  7130
> [DONE] connect to zcu216
> [START] fpga.upload_to_ram_and_program( )
> self.sendfile  started on thread.
> obtained an upload_result. Okay.
> 2024-12-09 18:24:24.54 ERROR xxx.177.143.srv transport_katcp.py:662 - 
> xxx.177.143.srv: no programming informs yet. Odd?
> ERROR:xxx.177.143.srv:xxx.177.143.srv: no programming informs yet. Odd?
> Traceback (most recent call last):
>   File 
> "/home/xpswhite/cfpga/casper_venv/lib/python3.8/site-packages/casperfpga-0.4.4.dev1336+py38.276ee44-py3.8-linux-x86_64.egg/casperfpga/transport_katcp.py",
>  line 660, in upload_to_ram_and_program
>     inf = unhandled_informs_queue.get(block=True, timeout=timeout)
>   File "/usr/lib/python3.8/queue.py", line 178, in get
>     raise Empty
> _queue.Empty
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "acqworker.py", line 223, in <module>
>     main( which_buf )
>   File "acqworker.py", line 122, in main
>     fpga.upload_to_ram_and_program(fullpath)
>   File 
> "/home/xpswhite/cfpga/casper_venv/lib/python3.8/site-packages/casperfpga-0.4.4.dev1336+py38.276ee44-py3.8-linux-x86_64.egg/casperfpga/casperfpga.py",
>  line 375, in upload_to_ram_and_program
>     rv = self.transport.upload_to_ram_and_program(
>   File 
> "/home/xpswhite/cfpga/casper_venv/lib/python3.8/site-packages/casperfpga-0.4.4.dev1336+py38.276ee44-py3.8-linux-x86_64.egg/casperfpga/transport_katcp.py",
>  line 663, in upload_to_ram_and_program
>     raise RuntimeError('%s: no programming informs yet. '
> RuntimeError: xxx.177.143.srv: no programming informs yet. Odd?
>
>
> The telnet  debug log during the above,
>
> telnet> set debug
> Will turn on socket level debugging.
> telnet> open xxx.177.143.srv 7130
> Trying xxx.177.143.srv...
> Connected to xxx.177.143.srv.
> Escape character is '^]'.
> #version-connect katcp-library v0.2.0-285-gcb88d0b-dirty 2021-05-06T22:29:36
> #version-connect katcp-protocol 5.0-M
> #version-connect kernel 5.4.0-xilinx-v2020.2 
> #1\_SMP\_Thu\_Jul\_15\_16:49:16\_UTC\_2021
> #log info 1653629001.060 raw new\_client\_connection\_xxx.177.143.cli:58576
> #client-connected xxx.177.143.cli:58576
> #log info 1653629001.067 raw 
> received\_end\_of\_file\_from\_xxx.177.143.cli:58576
> #client-disconnected xxx.177.143.cli:58576
> #log info 1653629001.070 raw new\_client\_connection\_xxx.177.143.cli:58578
> #client-connected xxx.177.143.cli:58578
> #log info 1653629001.073 raw 
> received\_end\_of\_file\_from\_xxx.177.143.cli:58578
> #client-disconnected xxx.177.143.cli:58578
> #log info 1653629001.077 raw new\_client\_connection\_xxx.177.143.cli:58592
> #client-connected xxx.177.143.cli:58592
> #log error 1653629001.079 raw fpga\_not\_programmed
> #log error 1653629001.581 raw fpga\_not\_programmed
> #log error 1653629002.084 raw fpga\_not\_programmed
> #log error 1653629002.085 raw fpga\_not\_programmed
> #log error 1653629002.588 raw fpga\_not\_programmed
> #log error 1653629003.090 raw fpga\_not\_programmed
> #log info 1653629003.095 raw awaiting\_transfer\_on\_port\_2157
> #log error 1653629003.121 kcpfpg 
> unable\_to\_create\_client\_connection\_to\_server\_localhost:\_Connection\_refused
> #log info 1653629003.122 raw job\_858\_completed\_with\_code\_2
> #log info 1653629013.202 raw 
> received\_end\_of\_file\_from\_xxx.177.143.cli:58592
> #client-disconnected xxx.177.143.cli:58592
>
> I made a few edits to transport_katcp.py starting below line 624.  You can 
> see the prints corresponding to the output above.
>
>
> if port == -1:
>     port = random.randint(2000, 2500)
> # start the request thread and join
> request_queue = queue.Queue()
> request_thread = threading.Thread(target=makerequest,
>                                   args=(request_queue, ))
> old_timeout = self._timeout
> self._timeout = timeout
> request_thread.start()
> request_thread.join()
> request_result = request_queue.get()
> if request_result != '':
>     raise RuntimeError('progremote request(%s) on host %s failed' %
>                        (request_result, self.host))
> # start the upload thread and join
> upload_queue = queue.Queue()
> unhandled_informs_queue = queue.Queue()
> print("self.sendfile  started on thread.")
> upload_thread = threading.Thread(target=self.sendfile, args=(
>     filename, self.host, port, upload_queue, ))
> self.unhandled_inform_handler = \
>     lambda msg: unhandled_informs_queue.put(msg)
> upload_thread.start()
> if not wait_complete:
>     self.unhandled_inform_handler = None
>     self._timeout = old_timeout
>     return True
> upload_thread.join()  # wait for the file upload to complete
> upload_result = upload_queue.get()
> if upload_result != '':
>     raise RuntimeError('upload(%s)' % upload_result)
> print("obtained an upload_result. Okay.")
> # wait for the '#fpga ready' inform
> done = False
> while not done:
>     try:
>         inf = unhandled_informs_queue.get(block=True, timeout=timeout)
>     except queue.Empty:
>         self.logger.error('%s: no programming informs yet. Odd?' % self.host)
>         raise RuntimeError('%s: no programming informs yet. '
>                            'Odd?' % self.host)
>     if (inf.name == 'fpga') and (inf.arguments[0].decode() == 'ready'):
>         done = True
> On Monday, December 9, 2024 at 12:45:32 AM UTC-5 Marc wrote:
>>
>>
>>
>> Hello
>>
>> Try doing a "telnet xxx.177.143.xxx 7130", issuing a "log-level debug" and 
>> then re-running your code, while keeping that telnet session open.
>>
>> With a bit of luck it should provide some feedback.
>>
>> Running multiple instances of tcpborphserver can be problematic - for one 
>> the register names are local to each process, and so could get out of sync. 
>> Other problems may occur too.
>>
>> regards
>>
>> marc
>>
>>
>>
>> On Mon, Dec 9, 2024 at 3:27 AM Ken Semanov <shapki...@gmail.com> wrote:
>>>
>>> A call to read_raw( )  eventually flows execution to within the method 
>>> called  def katcprequest( )  located in the following file,  (236)
>>>
>>> https://github.com/casper-astro/casperfpga/blob/master/src/transport_katcp.py
>>>
>>> Therein, self.blocking_request()  is called at 261.   The returned 'reply' 
>>> raises the exception on line 266.
>>>
>>> File 
>>> "/home/xpswhite/cfpga/casper_venv/lib/python3.8/site-packages/casperfpga-0.4.4.dev1336+py38.276ee44-py3.8-linux-x86_64.egg/casperfpga/transport_katcp.py",
>>>  line 297, in katcprequest
>>>     raise KatcpRequestFail(
>>> casperfpga.transport_katcp.KatcpRequestFail: Request write on host 
>>> xxx.177.143.xxx failed.
>>>         Request: ?write counted_capt_buffer_ctrl 0 \0\0\0\0
>>>         Reply: !write fail
>>>
>>>
>>> I am currently using a tcpborphserver3 listening on port 7130.   This 
>>> differs from the default port of 7147
>>>
>>> $ ps aux | grep tcp
>>> root         497  1.0  0.0   5804  3108 ?        Ss   May25   5:24 
>>> /bin/tcpborphserver3 -b /lib/firmware
>>> root        1087  0.0  0.0   3692  1896 ?        Ss   00:45   0:00 
>>> /bin/tcpborphserver3 -b /lib/firmware -p 7130
>>> casper      1319  0.0  0.0   5832   684 ttyPS0   S+   03:24   0:00 grep 
>>> --color=auto tcp
>>>
>>>
>>> If the above write error is suppressed,  the code errors in the same place 
>>> with reads.  The client is able to connect just fine to port 7130,  but 
>>> these reads and writes to software registers fail in the following way 
>>> shown above.  The content of the request "looks" to be just fine.   During 
>>> reads, it actually returns the number 4 in the reply, which is correct.  So 
>>> something else is populating errors in re
>>>
>>> A possible fix here is to inform the blocking_request() call with the new 
>>> port I am using.  Is there a way to inform this call with the particular 
>>> port?
>>>
>>> Another possible fix is that port 7130 must be let through a firewall on 
>>> the SoC board.
>>>
>>> Your thoughts?
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups 
>>> "http://cas...@lists.berkeley.edu"; group.
>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to casper+un...@lists.berkeley.edu.
>>> To view this discussion visit 
>>> https://groups.google.com/a/lists.berkeley.edu/d/msgid/casper/79eb0468-fe39-44a4-86f2-4fd9dcef47d5n%40lists.berkeley.edu.
>>
>>
>>
>> --
>> The use of google poses numerous grave privacy risks
>>
>>
>> Disclaimer
>>
>> The information contained in this communication from the sender is 
>> confidential. It is intended solely for use by the recipient and others 
>> authorized to receive it. If you are not the recipient, you are hereby 
>> notified that any disclosure, copying, distribution or taking action in 
>> relation of the contents of this information is strictly prohibited and may 
>> be unlawful.



-- 
The use of google poses numerous grave privacy risks

Disclaimer

The information contained in this communication from the sender is 
confidential. It is intended solely for use by the recipient and others 
authorized to receive it. If you are not the recipient, you are hereby notified 
that any disclosure, copying, distribution or taking action in relation of the 
contents of this information is strictly prohibited and may be unlawful.

-- 
You received this message because you are subscribed to the Google Groups 
"casper@lists.berkeley.edu" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to casper+unsubscr...@lists.berkeley.edu.
To view this discussion visit 
https://groups.google.com/a/lists.berkeley.edu/d/msgid/casper/CAGrhWaQB9s%2BikEeW6yFv0i9AVeMEz2X31G9fCGHNY6ojCTYZJw%40mail.gmail.com.

Reply via email to