Re: 8220175: serviceability/dcmd/framework/VMVersionTest.java fails with a timeout

2019-06-21 Thread Daniil Titov
I don't think so, the stack trace for JDK-8226367 shows the different problem.

Best regards,
Daniil

On 6/21/19, 10:02 AM, "Gary Adams"  wrote:

Any chance this will also fix JDK-8226367 ?

On 6/20/19, 10:39 PM, Chris Plummer wrote:
> Thanks for looking into this.
>
> Chris
>
> On 6/20/19 6:42 PM, Daniil Titov wrote:
>> Thank you, Chris and Serguei, for reviewing this change.
>>
>> I did more testing with a test program on Linux that repeats 
>> get_namespace_pid() method and reads from the real file ( the copy of 
>> /proc//status).
>> I paused the program after the first several lines ( but not "NSpid:" 
>> line)  where processed and deleted all lines in the status file 
>> before " NSpid: ..." line in order
>> to this line became the first in the edited file. After that the 
>> program continues in the while loop but it seems as the original file 
>> content was
>> already buffered and the program just continues over the original 
>> unedited lines and successfully found the match.
>>
>> Best regards,
>> Daniil
>>
>> On 6/19/19, 9:13 PM, "Chris Plummer"  wrote:
>>
>>  Hi Daniil,
>>   I think your fix is good, although I wonder about the 
>> robustness of this
>>  function given that the status file can change while it is 
>> reading it. I
>>  wonder if it can return a false negative because it never saw the
>>  matching line. This could happen if a line gets deleted, causing 
>> the
>>  matching line to suddenly be earlier in the file, possibly 
>> before the
>>  current location being read. Anyway, that's not really related 
>> to the
>>  current issue or fix, but if you think it might be an issue 
>> maybe a bug
>>  should be filed for it.
>>   thanks,
>>   Chris
>>   On 6/19/19 9:02 PM, Daniil Titov wrote:
>> > Please review the change that fixes an intermittent failure of 
>> serviceability/dcmd/framework/* tests on Linux platform.
>> >
>> > The problem here is that get_namespace_pid() method, that is called 
>> by mmap_attach_shared () that in turn is called by PerfMemory::attach(),
>> > tries to read the namespace pid information from /proc//status 
>> file. However, it doesn't check that the error indicator associated with
>> > stream is set that results in the endless  loop (lines 664-677) if 
>> the process terminates after /proc//status was opened (line 659)
>> > and checked for null (line 661).
>> >
>> >658  snprintf(fname, sizeof(fname), "/proc/%d/status", vmid);
>> > 659  FILE *fp = fopen(fname, "r");
>> > 660
>> > 661  if (fp) {
>> > 662int pid, nspid;
>> > 663int ret;
>> > 664while (!feof(fp)) {
>> > 665  ret = fscanf(fp, "NSpid: %d %d", , );
>> > 666  if (ret == 1) {
>> > 667break;
>> > 668  }
>> > 669  if (ret == 2) {
>> > 670retpid = nspid;
>> > 671break;
>> > 672  }
>> > 673  for (;;) {
>> > 674int ch = fgetc(fp);
>> > 675if (ch == EOF || ch == (int)'\n') break;
>> > 676  }
>> > 677}
>> > 678fclose(fp);
>> > 679  }
>> >
>> > The fix adds the check for the error indicator to ensure that the 
>> "while" loop terminates properly if the file no longer exists.
>> >
>> > Issues [3] and [4] have the same cause and will be closed as 
>> duplicates of this issue.
>> >
>> > Testing: Mach5 hotspot_serviceability tests succeeded, tier1,tier2, 
>> and tier3 tests are in progress.
>> >
>> > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8220175/webrev.01/
>> > [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8220175
>> > [3] https://bugs.openjdk.java.net/browse/JDK-8223600
>> > [4] https://bugs.openjdk.java.net/browse/JDK-8217351
>> >
>> > Thanks!
>> > -Daniil
>> >
>> >
>>
>>
>
>






Re: 8220175: serviceability/dcmd/framework/VMVersionTest.java fails with a timeout

2019-06-21 Thread Gary Adams

Any chance this will also fix JDK-8226367 ?

On 6/20/19, 10:39 PM, Chris Plummer wrote:

Thanks for looking into this.

Chris

On 6/20/19 6:42 PM, Daniil Titov wrote:

Thank you, Chris and Serguei, for reviewing this change.

I did more testing with a test program on Linux that repeats 
get_namespace_pid() method and reads from the real file ( the copy of 
/proc//status).
I paused the program after the first several lines ( but not "NSpid:" 
line)  where processed and deleted all lines in the status file 
before " NSpid: ..." line in order
to this line became the first in the edited file. After that the 
program continues in the while loop but it seems as the original file 
content was
already buffered and the program just continues over the original 
unedited lines and successfully found the match.


Best regards,
Daniil

On 6/19/19, 9:13 PM, "Chris Plummer"  wrote:

 Hi Daniil,
  I think your fix is good, although I wonder about the 
robustness of this
 function given that the status file can change while it is 
reading it. I

 wonder if it can return a false negative because it never saw the
 matching line. This could happen if a line gets deleted, causing 
the
 matching line to suddenly be earlier in the file, possibly 
before the
 current location being read. Anyway, that's not really related 
to the
 current issue or fix, but if you think it might be an issue 
maybe a bug

 should be filed for it.
  thanks,
  Chris
  On 6/19/19 9:02 PM, Daniil Titov wrote:
> Please review the change that fixes an intermittent failure of 
serviceability/dcmd/framework/* tests on Linux platform.

>
> The problem here is that get_namespace_pid() method, that is called 
by mmap_attach_shared () that in turn is called by PerfMemory::attach(),
> tries to read the namespace pid information from /proc//status 
file. However, it doesn't check that the error indicator associated with
> stream is set that results in the endless  loop (lines 664-677) if 
the process terminates after /proc//status was opened (line 659)

> and checked for null (line 661).
>
>658  snprintf(fname, sizeof(fname), "/proc/%d/status", vmid);
> 659  FILE *fp = fopen(fname, "r");
> 660
> 661  if (fp) {
> 662int pid, nspid;
> 663int ret;
> 664while (!feof(fp)) {
> 665  ret = fscanf(fp, "NSpid: %d %d", , );
> 666  if (ret == 1) {
> 667break;
> 668  }
> 669  if (ret == 2) {
> 670retpid = nspid;
> 671break;
> 672  }
> 673  for (;;) {
> 674int ch = fgetc(fp);
> 675if (ch == EOF || ch == (int)'\n') break;
> 676  }
> 677}
> 678fclose(fp);
> 679  }
>
> The fix adds the check for the error indicator to ensure that the 
"while" loop terminates properly if the file no longer exists.

>
> Issues [3] and [4] have the same cause and will be closed as 
duplicates of this issue.

>
> Testing: Mach5 hotspot_serviceability tests succeeded, tier1,tier2, 
and tier3 tests are in progress.

>
> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8220175/webrev.01/
> [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8220175
> [3] https://bugs.openjdk.java.net/browse/JDK-8223600
> [4] https://bugs.openjdk.java.net/browse/JDK-8217351
>
> Thanks!
> -Daniil
>
>









Re: 8220175: serviceability/dcmd/framework/VMVersionTest.java fails with a timeout

2019-06-20 Thread Chris Plummer

Thanks for looking into this.

Chris

On 6/20/19 6:42 PM, Daniil Titov wrote:

Thank you, Chris and Serguei, for reviewing this change.

I did more testing with a test program on Linux that repeats get_namespace_pid() 
method and reads from the real file ( the copy of /proc//status).
I paused the program after the first several lines ( but not "NSpid:" line)  where 
processed and deleted all lines in the status file before " NSpid: ..." line in order
to this line became the first in the edited file. After that the program 
continues in the while loop but it seems as the original file content was
already buffered and the program just continues over the original unedited 
lines and successfully found the match.

Best regards,
Daniil

On 6/19/19, 9:13 PM, "Chris Plummer"  wrote:

 Hi Daniil,
 
 I think your fix is good, although I wonder about the robustness of this

 function given that the status file can change while it is reading it. I
 wonder if it can return a false negative because it never saw the
 matching line. This could happen if a line gets deleted, causing the
 matching line to suddenly be earlier in the file, possibly before the
 current location being read. Anyway, that's not really related to the
 current issue or fix, but if you think it might be an issue maybe a bug
 should be filed for it.
 
 thanks,
 
 Chris
 
 On 6/19/19 9:02 PM, Daniil Titov wrote:

 > Please review the change that fixes an intermittent failure of 
serviceability/dcmd/framework/* tests on Linux platform.
 >
 > The problem here is that get_namespace_pid() method, that is called by 
mmap_attach_shared () that in turn is called by PerfMemory::attach(),
 > tries to read the namespace pid information from /proc//status 
file. However, it doesn't check that the error indicator associated with
 > stream is set that results in the endless  loop (lines 664-677) if the process 
terminates after /proc//status was opened (line 659)
 > and checked for null (line 661).
 >
 >658  snprintf(fname, sizeof(fname), "/proc/%d/status", vmid);
 > 659 FILE *fp = fopen(fname, "r");
 > 660   
 > 661 if (fp) {
 > 662   int pid, nspid;
 > 663   int ret;
 > 664   while (!feof(fp)) {
 > 665 ret = fscanf(fp, "NSpid: %d %d", , );
 > 666 if (ret == 1) {
 > 667   break;
 > 668 }
 > 669 if (ret == 2) {
 > 670   retpid = nspid;
 > 671   break;
 > 672 }
 > 673 for (;;) {
 > 674   int ch = fgetc(fp);
 > 675   if (ch == EOF || ch == (int)'\n') break;
 > 676 }
 > 677   }
 > 678   fclose(fp);
 > 679 }
 >
 > The fix adds the check for the error indicator to ensure that the 
"while" loop terminates properly if the file no longer exists.
 >
 > Issues [3] and [4] have the same cause and will be closed as duplicates 
of this issue.
 >
 > Testing: Mach5 hotspot_serviceability tests succeeded, tier1,tier2, and 
tier3 tests are in progress.
 >
 > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8220175/webrev.01/
 > [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8220175
 > [3] https://bugs.openjdk.java.net/browse/JDK-8223600
 > [4] https://bugs.openjdk.java.net/browse/JDK-8217351
 >
 > Thanks!
 > -Daniil
 >
 >
 
 








Re: 8220175: serviceability/dcmd/framework/VMVersionTest.java fails with a timeout

2019-06-20 Thread Daniil Titov
Thank you, Chris and Serguei, for reviewing this change.

I did more testing with a test program on Linux that repeats 
get_namespace_pid() method and reads from the real file ( the copy of 
/proc//status).
I paused the program after the first several lines ( but not "NSpid:" line)  
where processed and deleted all lines in the status file before " NSpid: ..." 
line in order 
to this line became the first in the edited file. After that the program 
continues in the while loop but it seems as the original file content was 
already buffered and the program just continues over the original unedited 
lines and successfully found the match.  

Best regards,
Daniil

On 6/19/19, 9:13 PM, "Chris Plummer"  wrote:

Hi Daniil,

I think your fix is good, although I wonder about the robustness of this 
function given that the status file can change while it is reading it. I 
wonder if it can return a false negative because it never saw the 
matching line. This could happen if a line gets deleted, causing the 
matching line to suddenly be earlier in the file, possibly before the 
current location being read. Anyway, that's not really related to the 
current issue or fix, but if you think it might be an issue maybe a bug 
should be filed for it.

thanks,

Chris

On 6/19/19 9:02 PM, Daniil Titov wrote:
> Please review the change that fixes an intermittent failure of 
serviceability/dcmd/framework/* tests on Linux platform.
>
> The problem here is that get_namespace_pid() method, that is called by 
mmap_attach_shared () that in turn is called by PerfMemory::attach(),
> tries to read the namespace pid information from /proc//status file. 
However, it doesn't check that the error indicator associated with
> stream is set that results in the endless  loop (lines 664-677) if the 
process terminates after /proc//status was opened (line 659)
> and checked for null (line 661).
>
>658  snprintf(fname, sizeof(fname), "/proc/%d/status", vmid);
> 659 FILE *fp = fopen(fname, "r");
> 660
> 661 if (fp) {
> 662   int pid, nspid;
> 663   int ret;
> 664   while (!feof(fp)) {
> 665 ret = fscanf(fp, "NSpid: %d %d", , );
> 666 if (ret == 1) {
> 667   break;
> 668 }
> 669 if (ret == 2) {
> 670   retpid = nspid;
> 671   break;
> 672 }
> 673 for (;;) {
> 674   int ch = fgetc(fp);
> 675   if (ch == EOF || ch == (int)'\n') break;
> 676 }
> 677   }
> 678   fclose(fp);
> 679 }
>
> The fix adds the check for the error indicator to ensure that the "while" 
loop terminates properly if the file no longer exists.
>
> Issues [3] and [4] have the same cause and will be closed as duplicates 
of this issue.
>
> Testing: Mach5 hotspot_serviceability tests succeeded, tier1,tier2, and 
tier3 tests are in progress.
>
> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8220175/webrev.01/
> [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8220175
> [3] https://bugs.openjdk.java.net/browse/JDK-8223600
> [4] https://bugs.openjdk.java.net/browse/JDK-8217351
>
> Thanks!
> -Daniil
>
>






Re: RFR: 8220175: serviceability/dcmd/framework/VMVersionTest.java fails with a timeout

2019-06-20 Thread serguei.spit...@oracle.com

Hi Daniil,

It looks good to me.

Thanks,
Serguei


On 6/19/19 21:02, Daniil Titov wrote:

Please review the change that fixes an intermittent failure of 
serviceability/dcmd/framework/* tests on Linux platform.

The problem here is that get_namespace_pid() method, that is called by 
mmap_attach_shared () that in turn is called by PerfMemory::attach(),
tries to read the namespace pid information from /proc//status file. 
However, it doesn't check that the error indicator associated with
stream is set that results in the endless  loop (lines 664-677) if the process 
terminates after /proc//status was opened (line 659)
and checked for null (line 661).

   658snprintf(fname, sizeof(fname), "/proc/%d/status", vmid);
659   FILE *fp = fopen(fname, "r");
660 
661   if (fp) {
662 int pid, nspid;
663 int ret;
664 while (!feof(fp)) {
665   ret = fscanf(fp, "NSpid: %d %d", , );
666   if (ret == 1) {
667 break;
668   }
669   if (ret == 2) {
670 retpid = nspid;
671 break;
672   }
673   for (;;) {
674 int ch = fgetc(fp);
675 if (ch == EOF || ch == (int)'\n') break;
676   }
677 }
678 fclose(fp);
679   }

The fix adds the check for the error indicator to ensure that the "while" loop 
terminates properly if the file no longer exists.

Issues [3] and [4] have the same cause and will be closed as duplicates of this 
issue.

Testing: Mach5 hotspot_serviceability tests succeeded, tier1,tier2, and tier3 
tests are in progress.

[1] Webrev: http://cr.openjdk.java.net/~dtitov/8220175/webrev.01/
[2] Bug: https://bugs.openjdk.java.net/browse/JDK-8220175
[3] https://bugs.openjdk.java.net/browse/JDK-8223600
[4] https://bugs.openjdk.java.net/browse/JDK-8217351

Thanks!
-Daniil






Re: RFR: 8220175: serviceability/dcmd/framework/VMVersionTest.java fails with a timeout

2019-06-19 Thread Chris Plummer

Hi Daniil,

I think your fix is good, although I wonder about the robustness of this 
function given that the status file can change while it is reading it. I 
wonder if it can return a false negative because it never saw the 
matching line. This could happen if a line gets deleted, causing the 
matching line to suddenly be earlier in the file, possibly before the 
current location being read. Anyway, that's not really related to the 
current issue or fix, but if you think it might be an issue maybe a bug 
should be filed for it.


thanks,

Chris

On 6/19/19 9:02 PM, Daniil Titov wrote:

Please review the change that fixes an intermittent failure of 
serviceability/dcmd/framework/* tests on Linux platform.

The problem here is that get_namespace_pid() method, that is called by 
mmap_attach_shared () that in turn is called by PerfMemory::attach(),
tries to read the namespace pid information from /proc//status file. 
However, it doesn't check that the error indicator associated with
stream is set that results in the endless  loop (lines 664-677) if the process 
terminates after /proc//status was opened (line 659)
and checked for null (line 661).

   658snprintf(fname, sizeof(fname), "/proc/%d/status", vmid);
659   FILE *fp = fopen(fname, "r");
660 
661   if (fp) {
662 int pid, nspid;
663 int ret;
664 while (!feof(fp)) {
665   ret = fscanf(fp, "NSpid: %d %d", , );
666   if (ret == 1) {
667 break;
668   }
669   if (ret == 2) {
670 retpid = nspid;
671 break;
672   }
673   for (;;) {
674 int ch = fgetc(fp);
675 if (ch == EOF || ch == (int)'\n') break;
676   }
677 }
678 fclose(fp);
679   }

The fix adds the check for the error indicator to ensure that the "while" loop 
terminates properly if the file no longer exists.

Issues [3] and [4] have the same cause and will be closed as duplicates of this 
issue.

Testing: Mach5 hotspot_serviceability tests succeeded, tier1,tier2, and tier3 
tests are in progress.

[1] Webrev: http://cr.openjdk.java.net/~dtitov/8220175/webrev.01/
[2] Bug: https://bugs.openjdk.java.net/browse/JDK-8220175
[3] https://bugs.openjdk.java.net/browse/JDK-8223600
[4] https://bugs.openjdk.java.net/browse/JDK-8217351

Thanks!
-Daniil






RFR: 8220175: serviceability/dcmd/framework/VMVersionTest.java fails with a timeout

2019-06-19 Thread Daniil Titov
Please review the change that fixes an intermittent failure of 
serviceability/dcmd/framework/* tests on Linux platform.

The problem here is that get_namespace_pid() method, that is called by 
mmap_attach_shared () that in turn is called by PerfMemory::attach(),  
tries to read the namespace pid information from /proc//status file. 
However, it doesn't check that the error indicator associated with 
stream is set that results in the endless  loop (lines 664-677) if the process 
terminates after /proc//status was opened (line 659) 
and checked for null (line 661).

  658 snprintf(fname, sizeof(fname), "/proc/%d/status", vmid);
   659FILE *fp = fopen(fname, "r");
   660  
   661if (fp) {
   662  int pid, nspid;
   663  int ret;
   664  while (!feof(fp)) {
   665ret = fscanf(fp, "NSpid: %d %d", , );
   666if (ret == 1) {
   667  break;
   668}
   669if (ret == 2) {
   670  retpid = nspid;
   671  break;
   672}
   673for (;;) {
   674  int ch = fgetc(fp);
   675  if (ch == EOF || ch == (int)'\n') break;
   676}
   677  }
   678  fclose(fp);
   679}

The fix adds the check for the error indicator to ensure that the "while" loop 
terminates properly if the file no longer exists.

Issues [3] and [4] have the same cause and will be closed as duplicates of this 
issue.

Testing: Mach5 hotspot_serviceability tests succeeded, tier1,tier2, and tier3 
tests are in progress.

[1] Webrev: http://cr.openjdk.java.net/~dtitov/8220175/webrev.01/
[2] Bug: https://bugs.openjdk.java.net/browse/JDK-8220175 
[3] https://bugs.openjdk.java.net/browse/JDK-8223600 
[4] https://bugs.openjdk.java.net/browse/JDK-8217351 

Thanks!
-Daniil