Re: [lldb-dev] PATCH for REVIEW - Fix [Bug 15038] LLDB does not support printing wide-character variables on Linux

Enrico Granata Mon, 01 Apr 2013 15:50:50 -0700

On Apr 1, 2013, at 2:50 PM, "Thirumurthi, Ashok" <[email protected]> 
wrote:


> Hi Enrico,
>  
> Ø If we go this route, the same thing might have to be done for 
> ValueObject::ReadPointedString() which does the same kind of “unbounded read”
> I see.  Looking through the code base for methods that call 
> GetMaximumSizeOfStringSummary, I see that there are a number of code paths to 
> handle.  The attached patch lowers the fix so that this functionality can be 
> used as needed via a helper method in Process to validate an unbounded read.
>  

We already have a Process::ReadCStringFromMemory - that does read&validation
You might instead want to consider making a Process::ReadWStringFromMemory
The mere act of validating that a buffer is 0-terminated is not a Process 
specific operation at all

> Ø  But I am inclined to believe the best option would be to adopt your 
> ProcessMonitor fix that reads smaller and smaller chunks and then returns the 
> number of bytes read and no error
> Well, the unbounded read is a characteristic of the use case (i.e. string 
> reads), not the implementation.  The behavior of the plug-in under this type 
> of request is implementation specific, and an error is always reasonable 
> since there was a failure to read the requested size in bytes.  My point is 
> that the corner case of null terminator in the last 8-byte read needs to be 
> tested and fixed.
>  
> Ø What troubles me a little with the overall idea of this patch is that we 
> are putting a workaround for a lower-level issue in higher-level code.
> Ø I would prefer, if we do this, that at least we put the code around #if 
> defined (__linux__), as in:
> Ø What we have here is a Linux-specific implementation issue
> Note that the issue can appear with any ptrace implementation.  So, I lowered 
> the fix to the ProcessPOSIX plugin, which is consistent with the case of 
> eErrorTypePOSIX.

I don’t think that is a Process responsibility to validate a buffer
If you want to make it a new type of request, ReadWString, and pass it a size 
so it knows how many consecutive bytes must be 0 to mean EndOfData, that would 
be fine.
It would look like
size_t Process::ReadWStringFromInferior(const void* buffer, size_t max_size, 
Error& error, size_t sizeofitem);
and there you could check the end-of-string
Then the WString data formatter can be moved to use this new read request, but 
you will have to ensure that partial strings are accepted :-)

Thanks,
Enrico Granata
✉ egranata@.com
✆ 27683

>  
> Thanks again for the feedback,
> -        Ashok
>  
>  
> From: Enrico Granata [mailto:[email protected]] 
> Sent: Thursday, March 28, 2013 2:39 PM
> To: Thirumurthi, Ashok
> Cc: [email protected]
> Subject: Re: [lldb-dev] PATCH for REVIEW - Fix [Bug 15038] LLDB does not 
> support printing wide-character variables on Linux
>  
> Hi,
> now I pretty much see where you are coming from.
> What we have here is a Linux-specific implementation issue (which of course 
> is not going to get fixed anytime soon :-)
>  
> What troubles me a little with the overall idea of this patch is that we are 
> putting a workaround for a lower-level issue in higher-level code.
> If we go this route, the same thing might have to be done for 
> ValueObject::ReadPointedString() which does the same kind of “unbounded read”
> I would prefer, if we do this, that at least we put the code around #if 
> defined (__linux__), as in:
>      if (error.Fail() || data_read == 0)
>      {
> +        bool found = false;
> #if defined (__linux__)
> +        if (data_read && error.GetType() == eErrorTypePOSIX) {
> +            // Determine if a null terminator was found in spite of an 
> out-of-bounds read
> +            if (error.GetError() == EIO || error.GetError() == EFAULT) {
> +                char* buffer = reinterpret_cast<char 
> *>(buffer_sp->GetBytes());
> +                char terminator[4] = {'\0', '\0', '\0', '\0'};
> +                assert(sizeof(terminator) >= type_size && "Atempting to read 
> a wchar with more than 4 bytes per character!");
> +
> +                for (size_t i = 0; !found && (i + type_size <= data_read); i 
> += type_size)
> +                    if (::strncmp(&buffer[i], terminator, type_size) == 0)
> +                        found = true; // null terminator
> +            }
> +        }
> #endif
> +        if (!found) {
> +            stream.Printf("unable to read data");
> +            return true;
>          }
> }
>  
> This will maintain a clean behavior for non-Linux systems (and other systems 
> with a similarly broken API can opt-in this behavior by adding themselves to 
> the ifdef)
>  
> But I am inclined to believe the best option would be to adopt your 
> ProcessMonitor fix that reads smaller and smaller chunks and then returns the 
> number of bytes read and no error - that would make all unbounded-read-based 
> calls just work without having to add Linux-specific patches around (main 
> issue being that now we all have to remember that anything that reads a 
> string-like thing has to add a similar workaround or cause Linux unhappiness)
>  
> If that is not going to be viable, or will take some time, I don’t want to 
> keep you blocked, so I would suggest:
> - commit this patch with #ifdef markers to keep it Linux-only
> - file a bugzilla issue to make process reads as functional as possible on 
> Linux
>  
> Thoughts?
>  
> Enrico Granata
> ✉ egranata@.com
> ✆ 27683
>  
> On Mar 28, 2013, at 8:39 AM, "Thirumurthi, Ashok" 
> <[email protected]> wrote:
> 
> 
> Thanks for the feedback, Enrico,
>  
> I’m relying on the fact that when the wstring is larger than the default size 
> (1GB in your example), we won’t get a ptrace error like EIO or EFAULT.  So, 
> there will be no search for a null terminator.
>  
> If we can’t rely on the null terminator, it’s hard to know if a different 
> error occurred.  The trouble is that there is inconsistency in the 
> PTRACE_PEEK implementation on Linux that is more or less arbitrary (sigh), so 
> the out-of-bounds read can generate EIO or EFAULT.  In particular, EIO can 
> mean a number of things:
>   “request is invalid, or an attempt was made to read from or write to an 
> invalid area in the parent's or child's memory, or there was a word-alignment 
> violation, or an invalid signal was specified during a restart request.” 
> –http://linux.die.net/man/2/ptrace            
>  
> In addition, the Linux ProcessMonitor reads in 8-byte chunks, so 
> ReadUTFBufferAndDumpToStream will probably fail (for the right reasons) in 
> the case where at least part of the null terminator was in that block.  I see 
> that as an implementation detail that can be resolved in ProcessMonitor by 
> reading smaller chunks until PTRACE_PEEK succeeds (i.e. a good follow-up 
> patch).
>  
> Do let me know if you feel this is okay to commit as is.  Cheers,
>  
> -        Ashok
>  
> From: Enrico Granata [mailto:[email protected]] 
> Sent: Wednesday, March 27, 2013 6:18 PM
> To: Thirumurthi, Ashok
> Cc: [email protected]
> Subject: Re: [lldb-dev] PATCH for REVIEW - Fix [Bug 15038] LLDB does not 
> support printing wide-character variables on Linux
>  
> Hi Ashok,
> thanks for submitting an LLDB patch :)
>  
> I have looked at the code and while I see where the patch is coming from, I 
> see a potential issue with it.
>  
> The code in ReadUTFBufferAndDumpToStream, which is what you are editing, is 
> meant to work on a partial string (i.e. one that does not have a NULL 
> terminator at the end). The reason for that is that you might have a wstring 
> that is 1GB long and we would not want to try and read all of it and then 
> display it. What we do is pick a size and only extract that much data. For 
> obvious reasons, your string might be longer than our upper boundary, so you 
> would get a chunk of valid bytes and then no end-of-buffer marker.
>  
> It looks like your code would fail in that case and produce no summary for a 
> string if an EIO was received before \0.
>  
> Is there any reason why you can’t just check if (error == EIO and data_read > 
> 0) and if so treat this as a “partial string” condition and keep going?
> Would that break/crash anything?
>  
> Best,
> Enrico Granata
> ✉ egranata@.com
> ✆ 27683
>  
> On Mar 27, 2013, at 2:52 PM, "Thirumurthi, Ashok" 
> <[email protected]> wrote:
> 
> 
> 
> The root cause for Bug 15038 is a ptrace EIO that can occur because we don't 
> know the size of a (UTF) string and so read a fixed number of characters for 
> strings.  The attached fix accepts EIO in the case where data has been read 
> and a null terminator of the correct size and alignment was found.
> 
> - Ashok
> 
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On 
> Behalf [email protected]
> Sent: Tuesday, January 22, 2013 10:13 AM
> To: [email protected]
> Subject: [lldb-dev] [Bug 15038] New: LLDB does not support printing 
> wide-character variables on Linux
> 
> http://llvm.org/bugs/show_bug.cgi?id=15038
> 
>             Bug #: 15038
>           Summary: LLDB does not support printing wide-character
>                    variables on Linux
>           Product: lldb
>           Version: unspecified
>          Platform: PC
>        OS/Version: Linux
>            Status: NEW
>          Severity: enhancement
>          Priority: P
>         Component: All Bugs
>        AssignedTo: [email protected]
>        ReportedBy: [email protected]
>    Classification: Unclassified
> 
> 
> Printing a variable of wchar_t type does not behave as expected and results 
> in garbage being printed on the screen.
> 
> To reproduce, remove the @expectedFailureLinux decorator from 
> TestChar1632T.py and TestCxxWCharT.py and run:
> 
> python dotest.py --executable <path-to-lldb> lang/cpp/char1632_t 
> lang/cpp/wchar_t
> 
> --
> Configure bugmail: http://llvm.org/bugs/userprefs.cgi?tab=email
> ------- You are receiving this mail because: ------- You are the assignee for 
> the bug.
> _______________________________________________
> lldb-dev mailing list
> [email protected]
> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> <read-strings.diff>_______________________________________________
> lldb-dev mailing list
> [email protected]
> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>  
> <read-strings-2.diff>

_______________________________________________
lldb-dev mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev

Re: [lldb-dev] PATCH for REVIEW - Fix [Bug 15038] LLDB does not support printing wide-character variables on Linux

Reply via email to