Re: Increasing ASF Jenkins bandwidth

2017-07-11 Thread Steve Rowe

> On Jul 11, 2017, at 3:41 AM, Uwe Schindler wrote:
> 
> Thanks Steve!
> 
> How come we got a second machine? Is it a VM, or some other
> (sponsored) one?

It is a VM.  See .

INFRA-14004 was closed as “Won’t Fix” a month ago, and then suddenly yesterday 
the request on that issue was honored.

I don’t understand why (hence my comment about it being bizarre), but we were 
maybe grandfathered out of Greg Stein’s kiboshing of new project-specific VMs 
(see 
).

--
Steve
www.lucidworks.com


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: Increasing ASF Jenkins bandwidth

2017-07-11 Thread Uwe Schindler
Thanks Steve!

How come we got a second machine? Is it a VM, or some other
(sponsored) one?

Uwe

-
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Steve Rowe [mailto:sar...@gmail.com]
> Sent: Tuesday, July 11, 2017 7:06 AM
> To: dev@lucene.apache.org
> Subject: Re: Increasing ASF Jenkins bandwidth
> 
> Well that’s bizarre - as Pono mentioned on thread “New Jenkins build node”,
> we now have a second Jenkins node named “lucene2”.  Both the “lucene”
> and the “lucene2” node are under the “lucene” label, which should cause the
> jobs to be distributed across the two nodes without any configuration
> changes, since all our jobs are tied to the “lucene” label.
> 
> Many Jenkins jobs have failed since “lucene2” came online an hour ago,
> because “ant ivy-bootstrap” hasn’t been run on “lucene2”.
> 
> On Jun 28, 2017, at 5:47 PM, Uwe Schindler <u...@thetaphi.de> wrote:
> > The second question was whether there is anything against using the other
> > nodes for running Lucene tests. My answer to that is:
> > […]
> > - You can easily run Lucene jobs on other nodes, just not with a 24/7
> > schedule. The only thing you have to do is: (a) invoke "ant ivy-bootstrap"
> > once per node (see the manual job on our node to trigger this);
> 
> I manually started the “Lucene-Ivy-Bootstrap” job after reconfiguring it to
> run on the “lucene2” node.  Hopefully that will allow jobs to succeed there.
> 
> > (b) ideally place a lucene.build.properties file in the node's home
> > directory with node-specific config. The point is that Lucene's automatic
> > CPU assignment can be tuned for Jenkins nodes (you can use all CPUs, but
> > if a node has multiple executors, go lower). As this is not job-specific,
> > it's better to deploy it as a config file in the node's home directory.
> > The lucene node has a lucene.build.properties, and so does Policeman
> > Jenkins.
> 
> I copied the contents of /home/jenkins/lucene.build.properties from the
> “lucene” node into a shell script build task on the “Lucene-Ivy-Bootstrap”
> job, then ran the job again to populate the same file on the “lucene2” node.
> 
> So the “lucene2” node (and jobs running on it) should be good now.
> 
> --
> Steve
> www.lucidworks.com





Re: Increasing ASF Jenkins bandwidth

2017-07-10 Thread Steve Rowe
Well that’s bizarre - as Pono mentioned on thread “New Jenkins build node”, we
now have a second Jenkins node named “lucene2”.  Both the “lucene” and the
“lucene2” node are under the “lucene” label, which should cause the jobs to be
distributed across the two nodes without any configuration changes, since all
our jobs are tied to the “lucene” label.

Many Jenkins jobs have failed since “lucene2” came online an hour ago, because
“ant ivy-bootstrap” hasn’t been run on “lucene2”.

On Jun 28, 2017, at 5:47 PM, Uwe Schindler wrote:
> The second question was whether there is anything against using the other
> nodes for running Lucene tests. My answer to that is:
> […]
> - You can easily run Lucene jobs on other nodes, just not with a 24/7
> schedule. The only thing you have to do is: (a) invoke "ant ivy-bootstrap"
> once per node (see the manual job on our node to trigger this);

I manually started the “Lucene-Ivy-Bootstrap” job after reconfiguring it to run
on the “lucene2” node.  Hopefully that will allow jobs to succeed there.

> (b) ideally place a lucene.build.properties file in the node's home directory
> with node-specific config. The point is that Lucene's automatic CPU assignment
> can be tuned for Jenkins nodes (you can use all CPUs, but if a node has
> multiple executors, go lower). As this is not job-specific, it's better to
> deploy it as a config file in the node's home directory. The lucene node has
> a lucene.build.properties, and so does Policeman Jenkins.

I copied the contents of /home/jenkins/lucene.build.properties from the 
“lucene” node into a shell script build task on the “Lucene-Ivy-Bootstrap” job, 
then ran the job again to populate the same file on the “lucene2” node.

So the “lucene2” node (and jobs running on it) should be good now.
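
For reference, the two one-time steps described above can be sketched in script
form. This is only an illustration under assumptions, not the actual
configuration of the “Lucene-Ivy-Bootstrap” job: the function names, the
checkout path argument, and the example property passed in are hypothetical,
and the real contents of the file on the “lucene” node are not reproduced here.

```python
import subprocess
from pathlib import Path
from typing import Mapping


def write_build_properties(home: Path, properties: Mapping[str, object]) -> Path:
    """Write key=value lines to <home>/lucene.build.properties."""
    target = home / "lucene.build.properties"
    # Sort the keys so repeated runs produce an identical file.
    lines = [f"{key}={properties[key]}" for key in sorted(properties)]
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


def bootstrap_node(checkout: Path, home: Path,
                   properties: Mapping[str, object]) -> None:
    """One-time setup for a new node: config file first, then Ivy."""
    write_build_properties(home, properties)
    # "ant ivy-bootstrap" is the target named in this thread; it has to run
    # inside a Lucene/Solr checkout on the node being prepared.
    # check=True raises CalledProcessError if ant exits with an error.
    subprocess.run(["ant", "ivy-bootstrap"], cwd=checkout, check=True)
```

Calling bootstrap_node(checkout, Path.home(), {...}) on a new node would write
the properties file and then run the Ivy bootstrap, failing loudly if ant
reports an error.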

--
Steve
www.lucidworks.com





RE: Increasing ASF Jenkins bandwidth

2017-06-28 Thread Uwe Schindler
Hi,

> > I don't understand the background: we currently have a Lucene node, so why
> > a second one?
> 
> Our randomized testing regime means the more bandwidth, the better.  Are
> you seriously arguing against increasing testing frequency?

Of course not! When reading your original e-mail, I was under the impression
that Infra wanted to turn off our lucene-slave node!

The second question was whether there is anything against using the other nodes
for running Lucene tests. My answer to that is:
- I talked with Infra several times about this, and the general problem is: as
the other nodes are shared by all projects, it would not be good to block them
by running Lucene/Solr tests 24/7 in a loop. This is what the current lucene
node is doing. It triggers builds by a "fake" schedule, which ensures that
something is running all the time. This already led to confusion at Infra,
because it is atypical. Lucene is "the" user of randomized testing, and it is
not yet used elsewhere to this extent. Because of this, Infra does not allow
"scheduled" builds with high frequencies and disables them from time to time.
I worked out with them that all jobs running on the lucene node are exempt
from that limitation, so we can use it 100% for randomization.
- You can easily run Lucene jobs on other nodes, just not with a 24/7 schedule.
The only thing you have to do is: (a) invoke "ant ivy-bootstrap" once per node
(see the manual job on our node to trigger this); (b) ideally place a
lucene.build.properties file in the node's home directory with node-specific
config. The point is that Lucene's automatic CPU assignment can be tuned for
Jenkins nodes (you can use all CPUs, but if a node has multiple executors, go
lower). As this is not job-specific, it's better to deploy it as a config file
in the node's home directory. The lucene node has a lucene.build.properties,
and so does Policeman Jenkins.
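
To illustrate the CPU point in (b), here is a minimal sketch of the arithmetic.
It assumes the relevant setting is the build's tests.jvms property (the number
of forked test JVMs) and that dividing CPUs evenly across executors is a
reasonable starting point. The function names are made up for illustration, and
the values actually used on the lucene node are not shown here.

```python
def jvms_per_executor(cpu_count: int, executors: int) -> int:
    """Split a node's CPUs evenly across its Jenkins executors."""
    if cpu_count < 1 or executors < 1:
        raise ValueError("cpu_count and executors must be positive")
    # Never go below one test JVM, even on an oversubscribed node.
    return max(1, cpu_count // executors)


def properties_line(cpu_count: int, executors: int) -> str:
    """Render the corresponding line for lucene.build.properties."""
    return f"tests.jvms={jvms_per_executor(cpu_count, executors)}"


if __name__ == "__main__":
    # A node with 8 CPUs and a single executor can use all of them;
    # with two executors, each job gets half.
    print(properties_line(8, 1))
    print(properties_line(8, 2))
```

Keeping this in the node's home directory rather than in each job means a job
picks up the right value no matter which node it lands on.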

> There has been discussion of automatically testing patches on JIRA issues
> (SOLR-10912).  As far as I’m concerned, we won’t be able to consider that
> until/unless we have more Jenkins bandwidth.

I think that's fine, because it's not running 24/7, only once per JIRA patch 
upload.

> Also, it would be nice to be able to run regular performance benchmarks, but
> without more bandwidth, that would not be a prudent use of the currently
> limited resources.

That again has the same problem as described above; a separate node would be
good!

> > BTW, the Policeman server now has new hardware like NVMe RAID drives
> > and a faster CPU. See it as a donation to the project. It is not linked to
> > ASF because of custom build behavior regarding JDK randomization, which is
> > not possible to set up at ASF.
> 
> Thanks Uwe for running your Jenkins.  It’s great to have regular testing on
> all the platforms you support.

Thanks!

About donating nodes: Maybe we can get a second sponsor for another node
running Lucene jobs in our usual 24/7 loop. The difference from Policeman's
machine would be that it can be included in ASF's Jenkins infrastructure, as
it would be a non-random-JVM slave and would only run Linux. So we just need to
buy/rent the hardware and install the OS (e.g., Ubuntu 16.04) on it; everything
else would be handled by ASF. The node would get the "lucene" label and would
be part of the rotation.

FYI, the Policeman Jenkins node is a rented datacenter node, hosted at Hetzner
in Germany. I recently moved it to this one (previously I was at another
hosting provider, but this one is way better because of better network,
hardware, native IPv6, ...): https://www.hetzner.com/dedicated-rootserver/px61-nvme

For 59.00 EUR per month you get 2x512 GB NVMe drives and a - ok, well, good
enough - CPU with 8 hypercores (it runs the main Jenkins and 3 slave VMs at the
same time). I am very happy with that one. Maybe some Solr-related company,
like my own sponsor SD DataSolutions GmbH, could sponsor the second one? I can
take over the initial installation and some admin tasks. The machine gets a
single IPv4 address by default and a whole IPv6 /64 network. Policeman Jenkins
mainly uses the IPv6 network; only for the not-so-equipped countries without
IPv6 in every household did I add extra IPv4 addresses for the virtual machines
on it.

What's interesting: Apache's European webservers are running in the same
datacenter, and all IPv6 Apache webserver visitors use this server (also when
coming from the US; I think it's the only one at ASF with IPv6 - which is
incredibly lame...). Here is the traceroute from Policeman Jenkins to the ASF
webserver (somehow Apache neglected to add the reverse IPv6 lookup):

root@serv1 ~ # ifconfig eth0
eth0  Link encap:Ethernet  HWaddr 90:1b:0e:c4:3f:e9
  inet addr:88.99.242.108  Bcast:88.99.242.127  Mask:255.255.255.192
  inet6 addr: 2a01:4f8:10b:2ab::2/64 Scope:Global
  inet6 addr: fe80::921b:eff:fec4:3fe9/64 Scope:Link
[...]

root@serv1 ~ # traceroute6 lucene.apache.org

Re: Increasing ASF Jenkins bandwidth

2017-06-28 Thread Mike Drob
> There has been discussion of automatically testing patches on JIRA issues
> (SOLR-10912).  As far as I’m concerned, we won’t be able to consider that
> until/unless we have more Jenkins bandwidth.

I don't think we need to dedicate resources to Yetus patch testing. It runs
on generic ASF nodes.

On Wed, Jun 28, 2017 at 4:03 PM, Steve Rowe wrote:

> Hi Uwe,
>
> > On Jun 28, 2017, at 4:10 PM, Uwe Schindler wrote:
> >
> > I don't understand the background: we currently have a Lucene node, so why
> > a second one?
>
> Our randomized testing regime means the more bandwidth, the better.  Are
> you seriously arguing against increasing testing frequency?
>
> There has been discussion of automatically testing patches on JIRA issues
> (SOLR-10912).  As far as I’m concerned, we won’t be able to consider that
> until/unless we have more Jenkins bandwidth.
>
> Also, it would be nice to be able to run regular performance benchmarks,
> but without more bandwidth, that would not be a prudent use of the currently
> limited resources.
>
> > BTW, the Policeman server now has new hardware like NVMe RAID drives and
> > a faster CPU. See it as a donation to the project. It is not linked to ASF
> > because of custom build behavior regarding JDK randomization, which is not
> > possible to set up at ASF.
>
> Thanks Uwe for running your Jenkins.  It’s great to have regular testing
> on all the platforms you support.
>
> --
> Steve
> www.lucidworks.com


Re: Increasing ASF Jenkins bandwidth

2017-06-28 Thread Steve Rowe
Hi Uwe,

> On Jun 28, 2017, at 4:10 PM, Uwe Schindler wrote:
> 
> I don't understand the background: we currently have a Lucene node, so why a
> second one?

Our randomized testing regime means the more bandwidth, the better.  Are you 
seriously arguing against increasing testing frequency?

There has been discussion of automatically testing patches on JIRA issues 
(SOLR-10912).  As far as I’m concerned, we won’t be able to consider that 
until/unless we have more Jenkins bandwidth.

Also, it would be nice to be able to run regular performance benchmarks, but
without more bandwidth, that would not be a prudent use of the currently limited
resources.

> BTW, the Policeman server now has new hardware like NVMe RAID drives and
> a faster CPU. See it as a donation to the project. It is not linked to ASF
> because of custom build behavior regarding JDK randomization, which is not
> possible to set up at ASF.

Thanks Uwe for running your Jenkins.  It’s great to have regular testing on all 
the platforms you support.

--
Steve
www.lucidworks.com





Re: Increasing ASF Jenkins bandwidth

2017-06-28 Thread Cassandra Targett
I'm not up on the background of the original request entirely, but
just to be clear the 2nd node is not for the Ref Guide. AFAIUI, Ref
Guide builds are already running on a separate node previously
reserved for any ASF project documentation builds. This came up as
part of a separate conversation Steve and I were having about the
state of Solr testing.

On Wed, Jun 28, 2017 at 3:10 PM, Uwe Schindler wrote:
> Hi,
>
> I don't understand the background: we currently have a Lucene node, so why a
> second one?
>
> The ref guide stuff can easily run on other nodes. The Lucene node was added
> so we are able to run randomized tests all day long without blocking other
> projects.
>
> So as long as we have the current Lucene node, I see no reason to panic.
>
> BTW, the Policeman server now has new hardware like NVMe RAID drives and
> a faster CPU. See it as a donation to the project. It is not linked to ASF
> because of custom build behavior regarding JDK randomization, which is not
> possible to set up at ASF.
>
> Uwe
>
> On June 28, 2017 at 21:20:23 CEST, Steve Rowe wrote:
>>
>> In an offline discussion, Cassandra Targett pointed out to me that the
>> INFRA issue set up to provision an additional Jenkins node for the Lucene
>> project has been closed as Won’t Fix, because:
>>
>>>  Per [~gstein] we will not be provisioning any more project-specific
>>> build nodes on our infrastructure. If they wish to provide resources, we can
>>> connect them to our master like Cassandra, etc.
>>
>>
>> I think that if no organization is willing to provide Jenkins hardware, we
>> should consider figuring out how to run Lucene/Solr tests on ASF’s
>> non-project-specific nodes.
>>
>> Uwe (or anybody else), do you have any thoughts about this?
>>
>> --
>> Steve
>> www.lucidworks.com
>>
>>
>
> --
> Uwe Schindler
> Achterdiek 19, 28357 Bremen
> https://www.thetaphi.de




Re: Increasing ASF Jenkins bandwidth

2017-06-28 Thread Uwe Schindler
Hi,

I don't understand the background: we currently have a Lucene node, so why a
second one?

The ref guide stuff can easily run on other nodes. The Lucene node was added so 
we are able to run randomized tests all day long without blocking other 
projects.

So as long as we have the current Lucene node, I see no reason to panic.

BTW, the Policeman server now has new hardware like NVMe RAID drives and a
faster CPU. See it as a donation to the project. It is not linked to ASF
because of custom build behavior regarding JDK randomization, which is not
possible to set up at ASF.

Uwe

On June 28, 2017 at 21:20:23 CEST, Steve Rowe wrote:
>In an offline discussion, Cassandra Targett pointed out to me that the
>INFRA issue set up to provision an additional Jenkins node for the
>Lucene project has been closed as Won’t Fix, because:
>
>> Per [~gstein] we will not be provisioning any more project-specific
>> build nodes on our infrastructure. If they wish to provide resources,
>> we can connect them to our master like Cassandra, etc.
>
>I think that if no organization is willing to provide Jenkins hardware,
>we should consider figuring out how to run Lucene/Solr tests on ASF’s
>non-project-specific nodes.
>
>Uwe (or anybody else), do you have any thoughts about this?
>
>--
>Steve
>www.lucidworks.com
>
>

--
Uwe Schindler
Achterdiek 19, 28357 Bremen
https://www.thetaphi.de

Increasing ASF Jenkins bandwidth

2017-06-28 Thread Steve Rowe
In an offline discussion, Cassandra Targett pointed out to me that the INFRA
issue set up to provision an additional Jenkins node for the Lucene project
has been closed as Won’t Fix, because:

> Per [~gstein] we will not be provisioning any more project-specific build 
> nodes on our infrastructure. If they wish to provide resources, we can 
> connect them to our master like Cassandra, etc. 

I think that if no organization is willing to provide Jenkins hardware, we 
should consider figuring out how to run Lucene/Solr tests on ASF’s 
non-project-specific nodes.

Uwe (or anybody else), do you have any thoughts about this?

--
Steve
www.lucidworks.com

