Hi,

 

>From my experience, I think the problem is more the libc, or the combination 
>of kernel and libc. I had huge problems with the libc and the kernel that was 
>shipped originally with Ubuntu 16.04. A recent update fixed a lot of those 
>issues on Ubuntu. One thing that completely broke was running of 32 bit 
>programs (Java crashed with strange errors as soon as you started it), so you 
>were not even able to compile Lucene with a 32 bit JDK. Maybe they broke 
>something in thread handling.

 

In addition: keep in mind that newer CPU models differ in what they guarantee 
about visibility in cache/RAM when different cores/CPUs do concurrent work. In 
lots of cases, the problem is just missing/wrong 
synchronization/happens-before/… and therefore wrong usage of the Java memory 
model in those tests. As the listed tests also fail from time to time on 
other CPUs, I’d check them and fix their concurrency; I am quite sure that 
something is fishy in them.

 

Uwe

 

-----

Uwe Schindler

Achterdiek 19, D-28357 Bremen

http://www.thetaphi.de <http://www.thetaphi.de/> 

eMail: [email protected]

 

From: Ishan Chattopadhyaya [mailto:[email protected]] 
Sent: Monday, October 16, 2017 2:34 AM
To: [email protected]
Subject: Re: 6.6.2 Release

 

Hi Gus,

I encountered those failures on Fedora 26. It seems everything is fine on 
Fedora 25, but there are a bunch of failures on Fedora 26. Most of these 
failures are Solr losing ZK connections and hence timing out.

Initially, I thought these were related to the kernel version, but now I think 
this is distribution specific. What is a bit baffling to me is that I remember 
these tests running well with Fedora 26 about a month back (so, maybe some 
recent update broke it?). I'm looking into what could be the underlying reason.

Interesting that you could reproduce these on Ubuntu 17.04. I'll take a look at 
that distro version as well. I'm also using the AMD Threadripper 1950X these 
days, and I see that -Dtests.jvms=24 gives me the best overall times.

Regards,

Ishan

 

On Mon, Oct 16, 2017 at 1:36 AM, Gus Heck <[email protected] 
<mailto:[email protected]> > wrote:

@Ishan, re failures.. I'm seeing very common test failures on an Ubuntu 17.04 
box that I built very recently, but that apparently has a lower kernel version 
than your successful run (4.10.0-37-generic x86_64)... would be interested to 
know what distro you are using... and what the diff between the versions you 
used was. Failures were more common with things cranked up to 30 processors 
(success 1 in 10 times, note: box has a 32 thread AMD processor). Failures are 
less common with "auto" which yields 4 (about 50/50 chance of success), am now 
working on figuring out how common failures are with it tuned down to 1 thread 
(but that's very slow).

 

On Sun, Oct 15, 2017 at 3:01 PM, Ishan Chattopadhyaya 
<[email protected] <mailto:[email protected]> > wrote:

Thanks Steve, it was indeed the problem!

 

On Sun, Oct 15, 2017 at 8:42 PM, Ishan Chattopadhyaya 
<[email protected] <mailto:[email protected]> > wrote:

Thanks a lot, Steve! I'll take a look :-)

 

On Sun, Oct 15, 2017 at 8:37 PM, Steve Rowe <[email protected] 
<mailto:[email protected]> > wrote:

Hi Ishan,

(I see you pinged me on #solr-dev IRC, but I was AFK for a while, sorry.)

I think the change I made to buildAndPushRelease.py, which fixed a problem I 
had with building the 7.0.1 RC that sounds suspiciously like what you’re 
encountering, might help?  I didn’t commit to branch_6_6, but here’s the 
branch_7_0 commit: 
<https://git1-us-west.apache.org/repos/asf?p=lucene-solr.git;a=commit;h=8d6c3889>

Here’s the branch_6_6 version:

  result = p.poll()
  if result is not None:
    msg = '    FAILED: %s [see log %s]' % (command, LOG)

None is returned by poll() to indicate that the process has not terminated.  So 
what’s AFAICT happening to you is that the process *is* terminating in time, 
and is returning 0 (for success), which is not None, which triggers failure.  
This is wrong.  My patch switches this code to use wait() instead of poll():

  try:
    result = p.wait(timeout=120)
    if result != 0:
      msg = '    FAILED: %s [see log %s]' % (command, LOG)
      print(msg)
      raise RuntimeError(msg)
  except TimeoutExpired:
    msg = '    FAILED: %s [timed out after 2 minutes; see log %s]' % (command, LOG)
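
For anyone who wants to see the difference outside the release script, here is a minimal standalone sketch. It is not the code from buildAndPushRelease.py: the helper names old_check_fails and run_checked are made up for illustration, and it assumes Python 3.3+ (where Popen.wait() accepts a timeout). The first helper mimics the branch_6_6 test; the second follows the wait()-based approach described above, and additionally kills the child on timeout, which is my own assumption about what you would want.

```python
import subprocess
import sys


def old_check_fails(p):
    """Mimics the branch_6_6 test: any non-None poll() result counts as failure."""
    return p.poll() is not None


def run_checked(args, timeout=120):
    """Wait for the command; raise on a nonzero exit code or on timeout."""
    p = subprocess.Popen(args)
    try:
        result = p.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        p.kill()   # assumption: do not leave the child running after a timeout
        p.wait()   # reap the killed child
        raise RuntimeError('FAILED: %s [timed out after %s seconds]' % (args, timeout))
    if result != 0:
        raise RuntimeError('FAILED: %s [exit code %s]' % (args, result))
    return result


if __name__ == '__main__':
    quick = [sys.executable, '-c', 'pass']   # exits with code 0 almost immediately
    p = subprocess.Popen(quick)
    p.wait()
    print(old_check_fails(p))   # True: a successful exit is misreported as failure
    print(run_checked(quick))   # 0
```

The point is only that poll() returns the exit code once the process has finished, so 0 must be compared against 0, not against None.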


--
Steve
www.lucidworks.com <http://www.lucidworks.com> 


> On Oct 15, 2017, at 10:45 AM, Ishan Chattopadhyaya <[email protected] 
> <mailto:[email protected]> > wrote:
>
> Update on the RC: I'm trying to build one for some time now. The latest 
> situation is that all the steps seem to be going well, but still the script 
> fails: https://gist.github.com/chatman/fa307c3e8253d2014d0e7bb381328396
>
> Looking into what could be going wrong. Any help is most welcome.
>
> @Shalin, I remember you mentioned that you found a way to build the artifacts 
> separately and sign them separately. Can you please share how to do so? It 
> will save me a lot of time; currently each of my attempts is building 
> artifacts from scratch.
>
> Thanks,
> Ishan
>
> On Sat, Oct 14, 2017 at 11:09 PM, Erick Erickson <[email protected] 
> <mailto:[email protected]> > wrote:
> Thanks! I ran precommit and test after the commit and all's well....
>
> On Sat, Oct 14, 2017 at 8:27 AM, Ishan Chattopadhyaya 
> <[email protected] <mailto:[email protected]> > wrote:
> No problem, I'll pick up your commit. :-)
>
> On Sat, Oct 14, 2017 at 8:51 PM, Erick Erickson <[email protected] 
> <mailto:[email protected]> > wrote:
> Committed now.
>
>
>
> On Sat, Oct 14, 2017 at 8:19 AM, Erick Erickson <[email protected] 
> <mailto:[email protected]> > wrote:
> Michael: Good catch. Have I mentioned lately that Git and I don't get along? 
> Apparently I was in some weird state when I tried to push.
>
> Ishan: Many apologies, but I'll have to push again. Is it too late to re-spin?
>
> On Sat, Oct 14, 2017 at 7:56 AM, Ishan Chattopadhyaya 
> <[email protected] <mailto:[email protected]> > wrote:
> Here are the logs of two failed runs, FYI.
> http://textsearch.io/tests.log.gz (kernel: 4.13.5-200.fc26.x86_64)
> http://textsearch.io/tests2.log.gz (kernel: 4.13.5-200.fc26.x86_64)
>
> On Sat, Oct 14, 2017 at 8:15 PM, Ishan Chattopadhyaya 
> <[email protected] <mailto:[email protected]> > wrote:
> FYI, I've been struggling to run tests for the past 4-5 hours. About 10-15 of 
> them failed on every run; I tried all the branches and a variety of different 
> machines (Intel i7 Haswell-E, Ryzen 1700, Threadripper 1950X). My JDK version 
> on all of these is 8u144.
>
> Finally, figured out that all my machines had the latest 
> 4.12.14-300.fc26.x86_64 or 4.13.5-200.fc26.x86_64 kernels. When I downgraded 
> the kernel to 4.11.6-201.fc25.x86_64, the tests started running as usual. 
> Now, I'll try to build the RC for 6.6.2 on this kernel. Is this a known issue?
>
> On Sat, Oct 14, 2017 at 6:26 AM, Erick Erickson <[email protected] 
> <mailto:[email protected]> > wrote:
> Done both for 6.6 and 6x
>
> On Fri, Oct 13, 2017 at 5:16 PM, Ishan Chattopadhyaya 
> <[email protected] <mailto:[email protected]> > wrote:
> Sure Erick, please go ahead.
> I'll start the release later today.
> Thanks,
> Ishan
>
> On Sat, Oct 14, 2017 at 5:44 AM, Erick Erickson <[email protected] 
> <mailto:[email protected]> > wrote:
> Ishan:
>
> I have 11297 ready to rock-n-roll, it's just a matter of pushing it. Give me 
> a few.
>
> The thing I'm not clear on is what to do with CHANGES.txt. Currently it's in 
> 7.0.1 and 7.1.
>
> I propose adding a 6.6.2 section to 6x and including it there and leaving it 
> in the 7.0.1 and 7.1 sections of master.
>
> I'll do it that way, you can change it if you want unless I hear back from 
> you sooner.
>
> Erick
>
> On Fri, Oct 13, 2017 at 4:59 PM, Allison, Timothy B. <[email protected] 
> <mailto:[email protected]> > wrote:
> Sounds good.  Thank you!
>
>
>
> From: Ishan Chattopadhyaya [mailto:[email protected] 
> <mailto:[email protected]> ]
> Sent: Friday, October 13, 2017 5:25 PM
> To: [email protected] <mailto:[email protected]> 
> Subject: Re: 6.6.2 Release
>
>
>
> > Any chance we could get SOLR-11450 in?  I understand if the answer is no. 😊
>
> Currently, I want to have this release out as soon as possible so as to 
> mitigate the risk exposure of the security vulnerability. Since this is not 
> committed yet, I'd vote for leaving this out and possibly having it included 
> in a later release, if needed.
>
> +1 to SOLR-11297.
>
>
>
>
>
> On Sat, Oct 14, 2017 at 2:32 AM, David Smiley <[email protected] 
> <mailto:[email protected]> > wrote:
>
> Suggested criteria for bug-fix release issues:
>
> * fixes a bug :-)     and doesn't harm backwards-compatibility in the process
>
> * helps users upgrade to later versions
>
> * documentation
>
>
>
> +1 to SOLR-11297
>
>
>
> I'm not sure on SOLR-11450.  Seems it might introduce a back-compat issue?
>
>
>
> On Fri, Oct 13, 2017 at 4:40 PM Erick Erickson <[email protected] 
> <mailto:[email protected]> > wrote:
>
> I'd also like to get SOLR-11297 in if there are no objections. Ditto if the 
> answer is no....
>
>
>
> It's quite a safe fix though.
>
>
>
>
>
>
>
> On Fri, Oct 13, 2017 at 1:26 PM, Allison, Timothy B. <[email protected] 
> <mailto:[email protected]> > wrote:
>
> Any chance we could get SOLR-11450 in?  I understand if the answer is no. 😊
>
>
>
> Thank you!
>
>
>
> From: Ishan Chattopadhyaya [mailto:[email protected] 
> <mailto:[email protected]> ]
> Sent: Friday, October 13, 2017 4:23 PM
> To: [email protected] <mailto:[email protected]> 
> Subject: 6.6.2 Release
>
>
>
> Hi,
>
> In light of [0], we need a 6.6.2 release as soon as possible.
>
> I'd like to volunteer to RM for this release, unless someone else wants to do 
> so or has an objection.
>
> Regards,
>
> Ishan
>
>
>
> [0] - 
> https://lucene.apache.org/solr/news.html#12-october-2017-please-secure-your-apache-solr-servers-since-a-zero-day-exploit-has-been-reported-on-a-public-mailing-list
>
>
>
> --
>
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book: 
> http://www.solrenterprisesearchserver.com
>
>
>
>
>
>
>
>
>
>
>
>
>



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected] 
<mailto:[email protected]> 
For additional commands, e-mail: [email protected] 
<mailto:[email protected]> 

 

 





 

-- 

http://www.the111shift.com

 
