Re: [lldb-dev] [RFC] Improving protocol-level compatibility between LLDB and GDB

Jason Molenda via lldb-dev Sun, 25 Apr 2021 13:27:11 -0700

I was looking at lldb-platform and I noticed I implemented the A packet in it, 
and I was worried I might have the same radix error as lldb in there, but this 
code I wrote made me laugh:


    const char *p = pkt.c_str() + 1;   // skip the 'A'
    std::vector<std::string> packet_contents = get_fields_from_delimited_string 
(p, ',');
    std::vector<std::string> inferior_arguments;
    std::string executable_filename;

    if (packet_contents.size() % 3 != 0)
    {
        log_error ("A packet received with fields that are not a multiple of 3: 
 %s\n", pkt.c_str());
    }

    unsigned long tuples = packet_contents.size() / 3;
    for (int i = 0; i < tuples; i++)
    {
        std::string length_of_argument_str = packet_contents[i * 3];
        std::string argument_number_str = packet_contents[(i * 3) + 1];
        std::string argument = decode_asciihex (packet_contents[(i * 3) + 
2].c_str());

        int len_of_argument;
        if (ascii_to_i (length_of_argument_str, 16, len_of_argument) == false)
            log_error ("Unable to parse length-of-argument field of A packet: 
%s in full packet %s\n",
                       length_of_argument_str.c_str(), pkt.c_str());

        int argument_number;
        if (ascii_to_i (argument_number_str, 16, argument_number) == false)
            log_error ("Unable to parse argument-number field of A packet: %s 
in full packet %s\n",
                       argument_number_str.c_str(), pkt.c_str());

        if (argument_number == 0)
        {
            executable_filename = argument;
        }
        inferior_arguments.push_back (argument);
    }


These A packet fields give you the name of the binary and the arguments to pass 
on the cmdline.  My guess is at some point in the past the arguments were not 
asciihex encoded, so you genuinely needed to know the length of each argument.  
But now, of course, and you could write a perfectly fine client that mostly 
ignores argnum and arglen altogether.


I wrote a fix for the A packet for debugserver using a 'a-packet-base16' 
feature in qSupported to activate it, and tested it by hand, works correctly.  
If we're all agreed that this is how we'll request/indicate these protocol 
fixes, I can put up a phab etc and get this started.

diff --git a/lldb/tools/debugserver/source/RNBRemote.cpp 
b/lldb/tools/debugserver/source/RNBRemote.cpp
index 586336a21b6..996ce2f96cf 100644
--- a/lldb/tools/debugserver/source/RNBRemote.cpp
+++ b/lldb/tools/debugserver/source/RNBRemote.cpp
@@ -176,7 +176,7 @@ RNBRemote::RNBRemote()
       m_extended_mode(false), m_noack_mode(false),
       m_thread_suffix_supported(false), m_list_threads_in_stop_reply(false),
       m_compression_minsize(384), m_enable_compression_next_send_packet(false),
-      m_compression_mode(compression_types::none) {
+      m_compression_mode(compression_types::none), m_a_packet_base16(false) {
   DNBLogThreadedIf(LOG_RNB_REMOTE, "%s", __PRETTY_FUNCTION__);
   CreatePacketTable();
 }
@@ -1530,8 +1530,9 @@ void RNBRemote::NotifyThatProcessStopped(void) {
 
  6,0,676462,4,1,2d71,10,2,612e6f7574
 
- Note that "argnum" and "arglen" are numbers in base 10.  Again, that's
- not documented either way but I'm assuming it's so.  */
+ lldb would use base 10 for "argnum" and "arglen" but that is incorrect.
+ Default behavior is currently still base10, but when m_a_packet_base16 is
+ via the qSupported packet negotiation, use base16. */
 
 rnb_err_t RNBRemote::HandlePacket_A(const char *p) {
   if (p == NULL || *p == '\0') {
@@ -1548,6 +1549,7 @@ rnb_err_t RNBRemote::HandlePacket_A(const char *p) {
    2nd arg has to be non-const which makes it problematic to step
    through the string easily.  */
   char *buf = const_cast<char *>(p);
+  const char *end_of_buf = buf + strlen(buf);
 
   RNBContext &ctx = Context();
 
@@ -1557,7 +1559,7 @@ rnb_err_t RNBRemote::HandlePacket_A(const char *p) {
     char *c;
 
     errno = 0;
-    arglen = strtoul(buf, &c, 10);
+    arglen = strtoul(buf, &c, m_a_packet_base16 ? 16 : 10);
     if (errno != 0 && arglen == 0) {
       return HandlePacket_ILLFORMED(__FILE__, __LINE__, p,
                                     "arglen not a number on 'A' pkt");
@@ -1569,7 +1571,7 @@ rnb_err_t RNBRemote::HandlePacket_A(const char *p) {
     buf = c + 1;
 
     errno = 0;
-    argnum = strtoul(buf, &c, 10);
+    argnum = strtoul(buf, &c, m_a_packet_base16 ? 16 : 10);
     if (errno != 0 && argnum == 0) {
       return HandlePacket_ILLFORMED(__FILE__, __LINE__, p,
                                     "argnum not a number on 'A' pkt");
@@ -1582,6 +1584,10 @@ rnb_err_t RNBRemote::HandlePacket_A(const char *p) {
 
     c = buf;
     buf = buf + arglen;
+
+    if (buf > end_of_buf)
+      break;
+
     while (c < buf && *c != '\0' && c + 1 < buf && *(c + 1) != '\0') {
       char smallbuf[3];
       smallbuf[0] = *c;
@@ -3651,8 +3657,12 @@ rnb_err_t RNBRemote::HandlePacket_qSupported(const char 
*p) {
     snprintf(numbuf, sizeof(numbuf), "%zu", m_compression_minsize);
     numbuf[sizeof(numbuf) - 1] = '\0';
     strcat(buf, numbuf);
-  } 
+  }
 
+  if (strstr(p, "a-packet-base16")) {
+    m_a_packet_base16 = true;
+    strcat(buf, ";a-packet-base16+");
+  }
   return SendPacket(buf);
 }
 
diff --git a/lldb/tools/debugserver/source/RNBRemote.h 
b/lldb/tools/debugserver/source/RNBRemote.h
index b0535372a32..e3daa6e1a5c 100644
--- a/lldb/tools/debugserver/source/RNBRemote.h
+++ b/lldb/tools/debugserver/source/RNBRemote.h
@@ -405,6 +405,8 @@ protected:
   size_t m_compression_minsize; // only packets larger than this size will be
                                 // compressed
   bool m_enable_compression_next_send_packet;
+  bool m_a_packet_base16; // A packet arguments are base 16 (correct) versus
+                          // (old buggy) base 10
 
   compression_types m_compression_mode;
 };



> On Apr 22, 2021, at 12:03 AM, Jason Molenda <[email protected]> wrote:
> 
> 
> 
>> On Apr 20, 2021, at 11:39 PM, Pavel Labath via lldb-dev 
>> <[email protected]> wrote:
>> 
>> I am very happy to see this effort and I fully encourage it.
> 
> 
> Completely agree.  Thanks for Cc:'ing me Pavel, I hadn't seen Michał's thread.
> 
> 
>> 
>> On 20/04/2021 09:13, Michał Górny via lldb-dev wrote:
>>> On Mon, 2021-04-19 at 16:29 -0700, Greg Clayton wrote:
>>>>> I think the first blocker towards this project are existing
>>>>> implementation bugs in LLDB. For example, the vFile implementation is
>>>>> documented as using incorrect data encoding and open flags. This is not
>>>>> something that can be trivially fixed without breaking compatibility
>>>>> between different versions of LLDB.
>>>> 
>>>> We should just fix this bug in LLDB in both LLDB's logic and lldb-server 
>>>> IMHO. We typically distribute both "lldb" and "lldb-server" together so 
>>>> this shouldn't be a huge problem.
>>> Hmm, I've focused on this because I recall hearing that OSX users
>>> sometimes run new client against system server... but now I realized
>>> this isn't relevant to LLGS ;-).  Still, I'm happy to do things
>>> the right way if people feel like it's needed, or the easy way if it's
>>> not.
>> 
>> The vFile packets are, used in the "platform" mode of the connection (which, 
>> btw, is also something that gdb does not have), and that is implemented by 
>> lldb-server on all hosts (although I think apple may have some custom 
>> platform implementations as well). In any case though, changing flag values 
>> on the client will affect all servers that it communicates with, regardless 
>> of the platform.
>> 
>> At one point, Jason cared enough about this to add a warning about not 
>> changing these constants to the code. I'd suggest checking with him whether 
>> this is still relevant.
>> 
>> Or just going with your proposed solution, which sounds perfectly reasonable 
>> to me....
> 
> 
> The main backwards compatibility issue for Apple is that lldb needs to talk 
> to old debugservers on iOS devices, where debugserver can't be updated.  I 
> know of three protocol bugs we have today:
> 
> vFile:open flags
> vFile:pread/pwrite base   https://bugs.llvm.org/show_bug.cgi?id=47820
> A packet base   https://bugs.llvm.org/show_bug.cgi?id=42471
> 
> debugserver doesn't implement vFile packets.  So for those, we only need to 
> worry about lldb/lldb-server/lldb-platform.
> 
> 
> lldb-platform is a freestanding platform packets stub I wrote for Darwin 
> systems a while back.  Real smol, it doesn't link to/use any llvm/lldb code.  
> I never upstreamed it because it doesn't really fit in with llvm/lldb 
> projects in any way and it's not super interesting, it is very smol and 
> simple.  I was tired of tracking down complicated bugs and wanted easier 
> bugs.  It implements the vFile packets; it only does the platform packets and 
> runs debugserver for everything else.
> 
> Technically a modern lldb could need to communicate with an old 
> lldb-platform, but it's much more of a corner case and I'm not super worried 
> about it, we can deal with that inside Apple (that is, I can be responsible 
> for worrying about it.)
> 
> For vFile:open and vFile:pread/pwrite, I say we just change them in 
> lldb/lldb-server and it's up to me to change them in lldb-platform at the 
> same time.
> 
> 
> For the A packet, debugserver is using base 10,
> 
>    errno = 0;
>    arglen = strtoul(buf, &c, 10);
>    if (errno != 0 && arglen == 0) {
>      return HandlePacket_ILLFORMED(__FILE__, __LINE__, p,
>                                    "arglen not a number on 'A' pkt");
>    }
> [..]
>    errno = 0;
>    argnum = strtoul(buf, &c, 10);
>    if (errno != 0 && argnum == 0) {
>      return HandlePacket_ILLFORMED(__FILE__, __LINE__, p,
>                                    "argnum not a number on 'A' pkt");
>    }
> 
> as does lldb,
> 
>    packet.PutChar('A');
>    for (size_t i = 0, n = argv.size(); i < n; ++i) {
>      arg = argv[i];
>      const int arg_len = strlen(arg);
>      if (i > 0)
>        packet.PutChar(',');
>      packet.Printf("%i,%i,", arg_len * 2, (int)i);
>      packet.PutBytesAsRawHex8(arg, arg_len);
> 
> 
> and lldb-server,
> 
>    // Decode the decimal argument string length. This length is the number of
>    // hex nibbles in the argument string value.
>    const uint32_t arg_len = packet.GetU32(UINT32_MAX);
>    if (arg_len == UINT32_MAX)
>      success = false;
>    else {
>      // Make sure the argument hex string length is followed by a comma
>      if (packet.GetChar() != ',')
>        success = false;
>      else {
>        // Decode the argument index. We ignore this really because who would
>        // really send down the arguments in a random order???
>        const uint32_t arg_idx = packet.GetU32(UINT32_MAX);
> 
> uint32_t StringExtractor::GetU32(uint32_t fail_value, int base) {
>  if (m_index < m_packet.size()) {
>    char *end = nullptr;
>    const char *start = m_packet.c_str();
>    const char *cstr = start + m_index;
>    uint32_t result = static_cast<uint32_t>(::strtoul(cstr, &end, base));
> 
> where 'base' defaults to 0 which strtoul treats as base 10 unless the number 
> starts with 0x.
> 
> 
> The A packet one is the trickiest to clean up IMO.  We have two signals that 
> can be useful.  debugserver response to the qGDBServerVersion packet,
> 
> (lldb) process plugin packet send qGDBServerVersion
>  packet: qGDBServerVersion
> response: name:debugserver;version:1205.2;
> 
> which hilariously no one else does.  This can tell us definitively that we're 
> talking to debugserver.  And we can add a feature request to qSupported, like
> 
> send packet:  "qSupported:xmlRegisters=i386,arm,mips,arc;a-packet-base16;"
> read packet:  "qXfer:features:read+;PacketSize=20000;qEcho+;a-packet-base16+"
> 
> This tells us that we're talking to a debugserver that can handle base16 
> numbers in A, and it will expect them.  And we can test if the remote stub is 
> debugserver.  if it's debugserver and it did not say it supports this, then 
> we need to send base10.
> 
> 
> 
> 
> 
> 
>> 
>>>> The other main issue LLDB has when using other GDB servers is the dynamic 
>>>> register information is not enough for debuggers to live on unless there 
>>>> is some hard coded support in the debugger that can help fill in register 
>>>> numberings. The GDB server has its own numbers, and that is great, but in 
>>>> order to truly be dynamic, we need to know the compiler register number 
>>>> (such as the reg numbers used for .eh_frame) and the DWARF register 
>>>> numbers for debug info that uses registers numbers (these are usually the 
>>>> same as the compiler register numbers, but they do sometimes differ (like 
>>>> x86)). LLDB also likes to know "generic" register numbers like which 
>>>> register it the PC (RIP for x86_64, EIP for x86, etc), SP, FP and a few 
>>>> more. lldb-server has extensions for this so that the dynamic register 
>>>> info it emits is enough for LLDB. We have added extra key/value pairs to 
>>>> the XML that is retrieved via "target.xml" so that it can be complete. See 
>>>> the function in 
>>>> lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp:
>>>> 
>>>> bool ParseRegisters(XMLNode feature_node, GdbServerTargetInfo &target_info,
>>>>                    GDBRemoteDynamicRegisterInfo &dyn_reg_info, ABISP 
>>>> abi_sp,
>>>>                    uint32_t &reg_num_remote, uint32_t &reg_num_local);
>>>> 
>>>> There are many keys we added: "encoding", "format", "gcc_regnum", 
>>>> "ehframe_regnum", "dwarf_regnum", "generic", "value_regnums", 
>>>> "invalidate_regnums", "dynamic_size_dwarf_expr_bytes"
>>>> 
>>> Yes, this is probably going to be the hardest part.  While working
>>> on plugins, I've found LLDB register implementation very hard to figure
>>> out, especially that the plugins seem to be a mix of new, old and older
>>> solutions to the same problem.
>>> We will probably need more ground-level design changes too.  IIRC lldb
>>> sends YMM registers as a whole (i.e. with duplication with XMM
>>> registers) while GDB sends them split like in XSAVE.  I'm not yet sure
>>> how to handle this best -- if we don't want to push the extra complexity
>>> on plugins, it might make sense to decouple the packet format from
>>> the data passed to plugins.
>> 
>> Yes, this is definitely going to be the trickiest part, and probably 
>> deserves its own RFC. However, I want to note that in the past discussions, 
>> the consensus (between Jason and me, at least) has been to move away from 
>> this "rich" register information transfer. For one, because we have this 
>> information coded into the client anyway (as people want to communicate with 
>> gdb-like stubs).
> 
> 
> Yes agree.  The remote stub should tell us a register name, what register it 
> wants to use to refer to it.  Everything else is gravy (unnecessary).  If we 
> want to support the g/G packets, I would like to get the offset into the g/G 
> packet.
> 
> The rest of the register numbers -- eh_frame, dwarf, ABI argument register 
> convenience names -- comes from the ABI.  We can use the register names to 
> match these up - the stub says "I've got an 'r12' and I will refer to it as 
> register number 53" and lldb looks up r12 and gets the rest of the register 
> information from that.  It assumes we can all agree on register names, but 
> otherwise I think it's fine.
> 
> As for xmm/ymm/zmm, Greg has a scheme where we can specify registers that 
> overlap in the target.xml returned.  This allows us to say we have register 
> al/ah/ax/eax/rax and that they're all the same actual register, so if any one 
> of them is modified, they're all modified.  e.g.
> 
>  <reg name="rax" regnum="0" offset="0" bitsize="64" group="general" 
> group_id="1" ehframe_regnum="0" dwarf_regnum="0" 
> invalidate_regnums="0,21,37,53,57"/>
> 
>  <reg name="eax" regnum="21" offset="0" bitsize="32" group="general" 
> group_id="1" value_regnums="0" invalidate_regnums="0,21,37,53,57"/>
> 
>  <reg name="ax" regnum="37" offset="0" bitsize="16" group="general" 
> group_id="1" value_regnums="0" invalidate_regnums="0,21,37,53,57"/>
> 
>  <reg name="ah" regnum="53" offset="1" bitsize="8" group="general" 
> group_id="1" value_regnums="0" invalidate_regnums="0,21,37,53,57"/>
> 
>  <reg name="al" regnum="57" offset="0" bitsize="8" group="general" 
> group_id="1" value_regnums="0" invalidate_regnums="0,21,37,53,57"/>
> 
> 
> debugserver sends up the value of rax at a stop along with all the GPRs,
> 
> <  19> send packet: $vCont;s:183b166#8d
> < 626> read packet: 
> $T05thread:183b166;threads:183b166;thread-pcs:100003502;00:f034000001000000;...
> 
> 
> and I can print any of these variants without any more packets being sent, 
> because lldb knows it already has all of them.
> 
> (lldb) reg read al ah ax eax rax
>      al = 0xf0
>      ah = 0x34
>      ax = 0x34f0
>     eax = 0x000034f0
>     rax = 0x00000001000034f0  a.out`main at b.cc:3
> (lldb) 
> 
> (I had packet logging turned on here, you'll have to take my word that no 
> packets were sent ;)
> 
> debugserver describes xmm/ymm/zmm the same, so when I go to read one, it gets 
> the full register contents -
> 
> (lldb) reg read xmm0
> <  23> send packet: $p63;thread:183b166;#9c
> < 132> read packet: 
> $ffff0000000000000000000000ff0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000#00
>    xmm0 = {0xff 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 
> 0xff 0x00 0x00}
> 
> (lldb) reg read ymm0
>    ymm0 = {0xff 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 
> 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 
> 0x00 0x00 0x00 0x00}
> 
> (lldb) reg read zmm0
>    zmm0 = {0xff 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 
> 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 
> 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 
> 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 
> 0x00 0x00 0x00 0x00 0x00 0x00}
> (lldb) 
> 
> 
> So my request to read xmm0 actually fetched zmm0.
> 
> J

_______________________________________________
lldb-dev mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev

Re: [lldb-dev] [RFC] Improving protocol-level compatibility between LLDB and GDB

Reply via email to