[this post is available online at https://s.apache.org/InsideInfra-Chris ]

"Inside Infra" is a new interview series with members of the ASF Infrastructure 
team. The series opens with an interview with Chris Thistlethwaite, who shares 
his experience with Sally Khudairi, ASF VP Marketing & Publicity.

- - -
"I get very attached to the technology that I'm working with and the 
communities that I'm working with, so if a server goes down or a site's acting 
wonky, I take that very personally. That reflects on how I do my job."
- - -

- Let’s start with you telling us your name --how is it pronounced?

It’s “Chris Thistle-wait” --I don’t correct people who say “thistle-th-wait”-- 
that’s also correct, but our branch of the family doesn’t pronounce the second 
“th”.

- What’s your handle if people are trying to find you? I know you’re "christ" 
(pronounced "Chris T") on the internal ASF Slack channel.

Yeah --anything ASF-related is all under "christ".

- Do people call you "Christ"?

They do! I first started in IT around Christmastime and was doing desktop 
support and office-type IT. When people started putting in tickets, and my 
username was "christ" there, they were asking "why is Christ logging into my 
computer right now?" and it became a thing. When I was hired at the ASF I told 
Greg (Stein; ASF Infrastructure Administrator) about that story, and he said 
"you gotta go with that for your Apache username."

- When and how did you get involved with the ASF?

A long time ago I started getting into Linux and Open Source, and naturally 
progressed to httpd (Apache HTTP Server). Truth be told, that’s where it 
started and stopped, but I’ve always been interested in Open Source and working 
with projects and within communities. Three years ago I was looking for a new 
job and stumbled across the infra blog post for a job opening. I fired up an 
email, sent it off to VP Infrastructure, and that's how everything started. The 
ramp-up of the job was diving deep into everything there is with the ASF and 
Open Source --which I'm still doing. I don't think I've found the bottom of the 
ASF yet.

- How long have you been a member of the Infrastructure team?

This November will be my fourth year.

- What are you responsible for in ASF Infrastructure?

Infrastructure has a whole bunch of different services that are used by both 
Apache projects as well as the Foundation itself: the Infrastructure team 
builds, monitors, supports, and keeps all those things running. Anything from 
Jenkins to mailing lists to Git and SVN repositories; and on the back end of 
things we keep everything working for the Foundation itself within, say, SVN or 
mailing lists: keeping archives of those things, keeping your standard security 
and permissions set up and split out. Anyone you ask on the Infra team will 
say: "I do everything!" It's too hard to explain --it's quite possibly a little 
bit of everything that has anything to do with technology --as broad as it can 
possibly be.

- So you really have to be a jack-of-all-trades. Do you have a specialty, or 
does everybody literally do everything?

Everyone on the team generally does everything --for the most part any one of 
us can jump into the role of anyone else on the team. Everyone has a deep 
knowledge of a particular service, or a handful of services, that they'll take 
care of 
--like, Gavin (McDonald; ASF Infrastructure team member) knows more about 
Jenkins and the buildbot and build services than most people on the team. At 
any one given point we’re on call and need to be able to fix something or take 
a look at something, so everyone needs to be versed enough in how to 
troubleshoot Jenkins. That can also be said for not just services that we 
offer, but also parts of technology, like MySQL or Postgres or our mail system 
or DNS: we do have actual physical hardware in some places, and we have VMs 
everywhere too, so sometimes we’re troubleshooting a bad backplane on a server 
or why a VM is acting the way it is. There's a very broad knowledge base that 
all of us have but there are specifics that some people know more about than 
others.

- How does ASF Infrastructure differ from other organizations?

There are a lot of similarities but a ton of differences. A big part of how 
Infra is different is, to use a "Sally-ism": if you look at it on paper, it 
wouldn't work --I've heard you describe the ASF that way. If you explained the 
way things work at the Foundation to somebody, they would literally think that 
you're making it up and there's no way that it would possibly be working the 
way that it does. There's a lot of that with the Infrastructure team too. There 
are many people I keep in contact with that I've worked with over the years, 
from my first job where we would buy servers, unbox them, rack them, wire them 
up, set them up, and run them from the office next door to us. I'd be impressed 
whenever I had 25 servers running in our little "data center" at that job, and 
now I talk to these guys about what we do at the ASF: we have 200 servers in 
10+ different data centers, we're vendor-agnostic, and we make it all work. 
They ask: "how the heck do you do that?!" We just do --it's an interesting 
thing as to how it all works together because we solve problems that others 
have as well, but their problems are often centralized to one thing, or a data 
center that they control and own, or one cloud provider that they control and 
own, where they deal with a single vendor and possibly at most have to talk 
with the same vendor in two different geographical areas. We're having to deal 
with stuff with one cloud vendor that's a VM and other stuff on the other side 
of the world that's actual hardware running in a co-location or data center, 
and the only thing that makes them the same is that they're on the Internet.

It's a good summation of the team too, in that we're all based in locations 
around the world; we're not all in one spot doing something.

- Describe your typical workday. Since you're all working on different things 
on such a huge scale, what's it like to be you?

"It's amazing" [laughs]. Everyone on the team generally has some project or 
projects that they are working on --long-running things for Infra. 

I'm currently working on rewriting a script for Apache ID creations. The 
process of putting your ICLA in, sending off to the Secretary, the Secretary 
says, "OK good," puts in all your data, and that gets put into a file in SVN 
...currently, we have a script that we manually run that does a bunch of checks 
on the account and whatnot, and then creates it, sends off a welcome email, 
whatever. I'm rewriting that because it's an old script and it's in several 
different languages --it's actually six scripts that all run off of one script. 
I'm consolidating that into one script, in a language that we support, and then 
moving forward with it into something that we could potentially automate, 
versus me having to run a script manually a couple of times a day.
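
As a rough, hypothetical sketch of that kind of consolidation --this is not the 
actual Infra tooling; the field names and helper functions here are invented 
purely for illustration-- a single script can chain the checks, the account 
creation, and the welcome email that previously lived in separate scripts:

    import re
    import sys

    def validate(record):
        """Run all pre-creation checks in one place (formerly separate scripts)."""
        problems = []
        if not re.fullmatch(r"[a-z][a-z0-9]{2,}", record.get("uid", "")):
            problems.append("invalid or missing requested user id")
        if not record.get("icla_on_file"):
            problems.append("no ICLA recorded")
        if not record.get("email"):
            problems.append("no contact email address")
        return problems

    def create_account(record):
        # Placeholder for the provisioning step (e.g. writing to a directory service).
        print("would create account %r" % record["uid"])

    def send_welcome(record):
        # Placeholder: a real script would hand this off to the mail system.
        print("would send welcome email to %s" % record["email"])

    def process(record):
        problems = validate(record)
        if problems:
            print("refusing to create account: " + "; ".join(problems))
            return 1
        create_account(record)
        send_welcome(record)
        return 0

    if __name__ == "__main__":
        sys.exit(process({"uid": "newcommitter",
                          "email": "new@example.org",
                          "icla_on_file": True}))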

Fluxo (the ID/handle for Apache Infra team member Chris Lambertus) was working 
on some mail archive stuff in our mail servers. Gavin (Apache Infra team member 
Gavin McDonald) is working on some actual build stuff. Everyone has kind of 
"one-two punch" tasks that they work on during the day, and then the rest of 
the time is (Jira) tickets or staying on top of Slack, if people are asking 
questions in the Infra channel or in our team channel or something like that. 
The rest of it is bouncing around inside the ASF and checking things out, or 
finding out new projects to work on, or ways to improve such-and-such process. 

- How many requests does Infra usually receive a day, in general?

Over the past three years, we've resolved an average of 6 Jira tickets a day, 
year-round. We've had 213 commits to puppet repositories in the last 30 days. 
We handle thousands of messages on our #asfinfra Slack channel, and have had 
659 different email topics in the last year.

- Dovetailing that, how do you keep your workload organized?

Everyone on the team does it their own personal way. I have a whiteboard and a 
Todoist list. We also have Jira to keep our actual tickets prioritized and 
running. We have a weekly team meeting/call to talk about things that are 
going on, which is also the more social aspect of what we do week-to-week.

- How do you get things done? You're juggling a lot of requests --what's the 
structure of the team? How do you prioritize when things are coming in? Is 
there a go-to person for certain things? If you're sharing everything, how do 
you balance it and who structures it? How does that work? 

To one end, the funnel to us starts with Greg and David (ASF Infrastructure 
Administrator Greg Stein and VP Infrastructure David Nalley). It's different 
from other places that I've worked, where I'm on a team of other systems 
administrator/engineering people, and we have a singular, customer-facing site. 
Someone says, "Hey, this should be blue instead of red," there's a ticket and 
we make the change and then it goes to production.

There are many different ways to get a hold of the Infrastructure team. Everyone 
gets emails about Jira tickets and gets updated as soon as one of those comes 
in. If it's something that you know about --say, the Windows nodes that we 
handle-- those all fall into my wheelhouse because I'm the last one to work 
with Windows extensively. Everyone else knows how to work with them, but it 
makes more sense for me to pick it up in some cases. 

Most of the stuff in Jira is very "break-fix" kind of stuff. A lot of the 
requests on Slack are too, for example: "DNS is busted," and we fix DNS. It's a 
very quick, conversational, "Let me go change that," or, "I'm going to go fix 
that real quick." Of course, some of the Jira tickets are very long-running, 
but the end result is they're fixing something that used to work. 

We were originally running git.apache.org, and Git WIP, so we hosted our own 
internal Git servers and we would read-only mirror those out to GitHub. 
Somewhere along the line, Humbedooh (the ID/handle for Apache Infrastructure 
team member Daniel Gruno) started writing out Gitbox or building Gitbox based 
on the need to have writable GitHub repositories. He built Gitbox and set up 
with the help of some other people on the team, got it going, and that became 
our replacement for git.apache.org. While we still host our own Git 
repositories, people are free to either write to ours or write to GitHub, and 
the changes are instantaneously mirrored between the two.

We had Git hosted at the ASF, and had GitHub as a read-only resource. The need 
arose to have writes on both sides: Humbedooh went and built out MATT (Merge 
All The Things), which does all of the sync between GitHub and our Git 
instance. 

MATT started a while ago, and Humbedooh added on to that to enable writes to 
GitHub. Basically what all that does is once your Apache ID is created, or if 
you have one already, you go on ID.apache.org, you add your GitHub username in 
there and then MATT --there's another part of that called Grouper-- MATT and/or 
Grouper will run periodically, pull data from our LDAP system and say, "Oh, 
ChrisT at apache.org has ChrisT as his GitHub ID. I'll pull those down." It 
says, "ChrisT is in the Infrastructure group. Hey look, there's an 
Infrastructure group in GitHub. I'll give ChrisT write access to the GitHub 
project." In a nutshell, that's what that does.

There's a ton of other house cleaning things, if you get removed from the LDAP 
group ... we run LDAP and keep all this stuff straight. If you get removed from 
the Infrastructure group at LDAP then MATT/Grouper will go and say, "Oh, this 
person's not in this LDAP group but they do have access in GitHub. Let me pull 
that so that they don't have access to that any more." It does housekeeping of 
everything as well as additions to groups and that kind of thing. There's a ton 
of technical backend to that, and that's what Humbedooh's doing. 
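
To make the sync concrete --this is a simplified, hypothetical sketch of the 
reconciliation Chris describes, not MATT or Grouper themselves; the group 
names, usernames, and stubbed data sources are invented-- the periodic job 
boils down to comparing LDAP group membership with GitHub team membership and 
correcting the difference in both directions:

    # Stub data: LDAP group -> GitHub usernames (as mapped via id.apache.org records).
    ldap_groups = {
        "infrastructure": {"christ", "humbedooh", "example-user"},
    }

    # Stub data: GitHub team -> usernames that currently have write access.
    github_teams = {
        "infrastructure": {"christ", "former-member"},
    }

    def reconcile(group):
        """Grant access to LDAP members missing on GitHub; revoke access that is stale."""
        wanted = ldap_groups.get(group, set())
        current = set(github_teams.get(group, set()))

        for user in sorted(wanted - current):
            print("add %s to GitHub team %s" % (user, group))    # real job: GitHub API call
            current.add(user)

        for user in sorted(current - wanted):                    # the housekeeping pass
            print("remove %s from GitHub team %s" % (user, group))
            current.discard(user)

        github_teams[group] = current

    if __name__ == "__main__":
        for group in sorted(ldap_groups):
            reconcile(group)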

At first when Git and GitHub were set up, it was fine: the ASF has to keep a 
canonical record of everything that goes into each project, so you could only 
write to our Git repos. Then it was conveniently mirrored out to GitHub because there's 
a lot of tools that GitHub has that we didn't have or weren't prepared to set 
up. GitHub has a very familiar way of doing things for a lot of developers. 
Once GitHub Writable came along with Gitbox and the changes to MATT, that 
opened up a whole other world of tools for people on projects to use. If they 
wanted to use pull requests on GitHub, they could start using pull requests on 
GitHub to manage code. They could wire up their build systems to GitHub with 
Jenkins so that whenever a PR was submitted and got approved, it would kick off 
a build in Jenkins and go through unit tests and do all the lovely things that 
Jenkins does.

It was really an evolution of, "Here's the service that we have. Someone, 
somewhere, be it infrastructure or otherwise, once they have writable GitHub 
access, here we go." And here's the swath of things that now opens up to 
projects inside the ASF: a project could come and set up with us and then 
never, ever actually commit code directly to the ASF --it would always go to 
GitHub but still be safe and saved on our own Git servers for ASF project 
reasons.

At the same point, we saw a need and said, "Let's build this out and go." 
Another funnel that comes into us is when we're on-call, something breaks and 
we ask, "Why do we do it this way? We should be doing it a different way." We 
then come up with a project to fix that or build it. It's a very interesting 
process of how work gets into the Infrastructure team.

It's been an interesting ride with that one.

There's always stuff that we're working on and fixing and making better. For 
the most part, Gitbox as it is now is kind of in a state of "It's getting 
worked on". If there are bugs that need to be fixed, they get fixed, but I don't know what 
the next feature request is on Gitbox. There's talk of other services ...like 
GitLab. If someone wanted to write code and put it in GitLab as opposed to  
GitHub, then someone would need to come in and write the connector from Gitbox 
to GitLab. So it's possible. I don't know if that's necessarily an 
Infrastructure need as much as it is a volunteer need for infra. But it's a 
system that can be set up to any other Git service as long as someone goes in 
and writes that.

- You brought up an interesting point here, which is volunteers. Do volunteers 
contribute to Infra also? 

We sometimes have volunteers, yes. We have a lot of people on the infra mailing 
lists that will bounce ideas back to us or they'll work on a ticket or put in a 
pull request.

- Well, the need is not as critical because you have a paid team, versus Apache 
projects. 

Right. That's exactly true. There's a bit of a wall that we have to have 
because we work with Foundation data, which not everyone has access to. 
Granted, we're a non-profit, Open Source company and everything's out there to 
begin with, but usernames and passwords of databases and things that we have 
encrypted that the team has access to aren't necessarily something that you 
would want any volunteer to have access to.

- How do you stay ahead of demand? This is a really interesting thing because 
part of it is you're saying, "Necessity is the mother of invention." You guys 
are doing stuff because you've got those binary, "break-fix" types of 
scenarios. In an ideal situation, do you even have enough runway to be able to 
optimize your processes? How do you have the opportunity to fix things and 
improve things as you're going along if you're firefighting pretty much all day 
long?

That's a really good question about just how our workflow is. In other 
companies that I've been in, there's the operations people that are doing the 
"break-fix", and then there's the development people that are doing "the next 
big thing". The break-fix folks are spinning the plates and keeping them spun 
without breaking, and that's a lot of firefighting. That's literally all that 
job is. Even when you're not firefighting, you're sitting around thinking about 
firefighting in a sense of, “when is this going to fall over again? If it does 
fall over, what can we do to fix it so it doesn't do that anymore?" And in the 
past, the break-fix guys, the firefighters, would end up saying, "Hey, there's 
this thing that needs fixed." And it would fall over the wall to the 
developers. They would develop the fix for it, and then it would go back into 
production and then the cycle continues. 

To some extent, that's kind of where DevOps came from: if you merge the two of 
those together, then while you're firefighting you can also write the fix for 
the problem, and then you don't have to wait for the lag between the two. We 
don't have that split here. Everyone on the team is firefighting with one hand 
and typing out the solution with another. And a lot of the times our project 
work, like getting a new mail server spun up or my task to rewrite the workflow 
for new Apache ID creations, I've been working on that for a very long time 
because it will keep falling off ... it gets put on the backburner while we're 
like, "Hey, we found out that our TLP servers are getting hammered with 
downloads from apps and people trying to use them instead of the mirror 
servers." So, let's set up downloads.apache.org and we can funnel stuff over to 
that so that that server can get hammered and do whatever it needs to do so 
that our www. site and all the Apache Project websites stay up and running in a 
more reliable way.

- What's the size of the teams that you were dealing with before that had a 
firefighting team and a dev team versus ASF infra?

The last "big" corporate job I had was ...six ops people that kept the site 
going, four database people, another eight technical operations-type people… 
all told, it was about thirty.

There were technically thirty firefighting people and we had a NOC (network 
operations center) that was literally people that only watched dashboards and 
watched for alerts. Whenever those went off, they'd call the firefighting people. 
The NOC was another 20 people. And then the development teams were ... twenty 
to fifty people.

- What kind of consumer base were they accommodating? Does it match the volume 
that ASF has? Was it more of a direct, enterprise type of, "We have a customer 
that's paying, we have to respond to them" situation? Or is it different?

This was at a financial services company that transacted on their Website: 
completely different from the type of stuff we're dealing with here at the ASF. 
Volume-wise, they were much smaller, but it was much more ...visible, as their 
big times were at the start of market and end of market. After end-of-market 
came all the processing for the day to get done before markets started the next 
day. The site had to be up 100% of the time. We had SLAs of five minutes. If 
you got paged or something broke, you had to get the page and respond to it in 
a way of, "Hey, this is what's going on and these are the people that I need 
involved with it," all within five minutes of it going off. That was the way 
the management structure was. It was intense.

In scale, Apache probably does way more: we do way more traffic across all of 
our services in any given day. If someone doesn't get mail for a little bit, 
then they come and tell us or we get alerted of it by our systems, and we go 
and we fix it and we take care of it. But with the financial services group, 
people were losing money: dealing with people and money is just a very 
stressful situation for anyone working in technology because you have to get it 
right and it has to be done as fast as possible before someone's kids can’t go 
to college anymore. It was a completely different minefield to navigate.

- The type of stress that's involved or the type of demand or the pressure is 
different, but you also have the responsibility with ASF that systems have to 
be up and running. I understand it's not mission critical if something goes 
down for more than five minutes, which is different in the financial sector, 
but do you feel that same type of pressure? Is it there or is it completely 
different for you? 

No, it's not completely different --I think I do feel that pressure, because we 
also have SLAs here: they're just not five minutes. 
We have structure around that and the way that we handle uptime and that kind 
of thing. I get very attached to the technology that I'm working with and the 
communities that I'm working with, so if a server goes down or a site's acting 
wonky, I take that very personally. That reflects on how I do my job. If a 
server's not working or if something's broken either because of me or something 
externally that's going on, I want to get that up and running as fast as 
possible because that's how I would expect anyone to work in a field that has 
...any technology field, for that matter. And generally, that's the same 
attitude the rest of the team has as well.

- How has ASF Infra changed over the years?

It's matured quite a bit. When I first started, it was Gavin, Fluxo, Humbedooh, 
Pono (former ASF Infrastructure team member Daniel Takamori), and me. There 
were five of us. The amount of stuff that we got done, I'm like, "Man, there's 
no way that five people can do this."

- That's kind of what I'm pointing at. If you're a team of eight or five or 
twelve or whatever, compared to the other thing that you did with the other job 
that had maybe a core team of twenty, thirty --that in itself is insane.

We were five people, everything was very, "Here's the shiny thing we're working 
on," and then something else would come up and we'd have to jump on that. Then 
something else would come up and we'd have to jump on that. We were very ...I 
don't want to say we were stretched thin, but there wasn't necessarily ...time 
for improvement.

There was a lot of stuff we still had on physical hardware, and a couple of 
vendors that we no longer use. But things were moving more towards a 
configuration-based infrastructure with Puppet instead of one person building a 
machine, setting up all the configs themselves, installing everything and then 
letting it go off into the ether to run and do its job. We were moving 
everything towards Puppet to where you configure Puppet to configure the 
server. So then if the server breaks, or goes down or goes away or we need to 
move vendors or whatever, all you need to do is spin up a new server somewhere 
else, give it the Puppet config, it configures itself and then goes off into 
the ether to run and do whatever it needs to do.
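
As a toy illustration of that declarative, configuration-first idea --this is 
hypothetical Python, not Puppet, and the managed file below is invented for the 
example-- the point is that you describe the desired end state and let the tool 
converge the machine to it, no matter how many times it runs:

    import os

    # Desired state: files that must exist with specific contents. (A real
    # configuration-management tool also handles packages, services, users, etc.)
    desired_files = {
        "/tmp/example-motd": "managed by configuration, not by hand\n",
    }

    def ensure_file(path, contents):
        """Make the file match the desired contents; change nothing if it already does."""
        if os.path.exists(path):
            with open(path) as handle:
                if handle.read() == contents:
                    return "unchanged"
        with open(path, "w") as handle:
            handle.write(contents)
        return "updated"

    if __name__ == "__main__":
        for path, contents in desired_files.items():
            print(path, ensure_file(path, contents))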

- That's great. More automation.

Right. We were automating a lot more stuff right when I first started. Over the 
course of the next year, the team kind of ebbed and flowed a little bit until 
we were eight in the last year. We started to get to the point of "where can we 
point the gun to next? What can we target next to get it taken care of and 
done?" That's where we started taking on more specific infra projects, for 
instance, mail. Our mail server has been around since the dawn of time, and 
it's virtualized so it moves servers every now and then, but the same base of 
it is quite old by technology standards.

Fluxo started moving this on to newer stuff and he got that going. We started 
taking care of projects that were not broken, but needed to be worked on. 
Instead of waiting for it to break, we're fixing and upgrading and moving down 
that path versus firefighting, break-fix, that kind of thing. We were moving 
more towards, "Hey, I see a problem. I have time. I'm going to take care of 
that and make that into a more serviceable system." 

Automation has helped quite a bit with that. I also think that as the team 
grew, it got to a point where tickets, emails, and chat were getting responded 
to quicker. And then we also could focus more on the tools that we use for the 
Foundation. Like, HipChat was going away. We needed a new chat platform, so we 
chose Slack. And then we updated and moved everything over to Slack, and that's 
where we are with that. It started coming into its own, with workflows of, 
"Oh, okay. How do we get this done? Let's go do that."

- What areas are you experiencing your biggest growth? Is it a technical area? 
Like, "Hey, all of a sudden mail's out of control"? Or, "Hey, we need to 
satiate the demand for more virtual machines," or is it a geographic influence 
that's coming in in terms of draw? Where are you guys pointing all your guns to?

Currently we're trying to get more out to the projects and talk to people more 
often. Not that we didn't do that before, because at ApacheCons and any Meetups 
that we had, Infra would always have a table. We were always accessible, but we 
were always passively accessible. We weren't really going out and talking to 
projects proactively to say, "Hey. What do you guys need from us? What are we 
doing with this?" So I think that's one part of it, something that we're 
moving towards a little bit. It's not at all technical, but more of a 
foundation-broadening, community-broadening thing that we're doing.

That's one part of it. The other thing that we're doing, from a more technical 
or infrastructure standpoint, is really trying to get our arms around all of 
the services we provide, and then really take a look at those and say: how is 
this used inside the ASF? How is it used in the industry as a whole? Do we need 
to put more time and energy towards those things in order to make the offerings 
of the Infrastructure team a little bit more of a solid platform? Generally, 
that ... and on top of any other automation and that kind of stuff, I think 
those are really the two spots that I see Infra growing in a lot in the next 
year-ish: really boiling down our services to, "Hey, we've seen a lot of people 
using this. And a lot more projects are using this. It's not just a flash in 
the pan. We need to build out more infra around blah service, so let's really 
do that and make that a solid platform to use."

- What do you think people would be surprised to know about ASF Infra? When you 
tell someone something about your job and they go, "Whoa, I had no idea" or, 
"That's crazy." What would people be surprised to know?

That Apache has an infrastructure team. [laughs]

- Why are you saying that?

Because honestly, I don't think a lot of people know about the Infrastructure 
team. Those that do, have used us for something, not used us for something, 
have talked to us about something, and worked with us on something. Those that 
don't are like, "Oh, I didn't know the ASF paid people to be here," --that kind 
of thing. Those are kind of the two reactions I've gotten from people. It's like, 
"Oh, that's cool. You work for the infrastructure team." Shrug. And then the 
other people are like, "Oh, sweet. Yeah, that's great. I know Gav. I've worked 
with him on blah, blah, blah." But that's not necessarily surprising. I mean, 
it is in a sort of way. 

- When people ask, "What are you doing for work?" and you say you work for ASF, 
do people even know what that is? Do they know what you're doing? Do they care? 
Are they like, "Oh, okay. Whatever"?

There are literally three types of people that I've run into that ask, "Oh, what 
are you doing for work?" One person is the person that has no idea what the ASF 
is, not even the vaguest hint of Apache, and they're like, "Oh, okay. That's 
cool." There's that next person that does, and may or may not know about the 
ASF but knows of Apache, the Web server, or some other lineage of that.  
They're like, "Oh, whoa. That's super cool. It's impressive.” That's wild. Then 
the third people ask "Why are ‘Indians’/Native Americans running software? That 
doesn't make any sense to me" and "Are you on a reserve?" I swear to God I've 
gotten that question before. I don't even know how to answer that. I'm like, 
"No, buddy."

- Are these technologists or are these just guys off the street? Are they in 
the industry?

Guys off the street. I say Apache Software Foundation, and to them "Apache" 
and "software" together don't make sense. Actually I've gotten mean tweets 
too whenever I've been tweeting about being at ApacheCon. Things like I'm 
"taking away" from Native Americans and whatever...

- We also get that on Twitter, on the Foundation side: we get included in 
tweets about some kind of violation along the lines of, "Stand up for the ..." 
I get it. From time to time we also get sent these "How dare you?" letters, 
that sort of thing. It's an interesting challenge, the whole "why do Native 
Americans run this thing?" misinterpretation. Let's move on. 
What's your favorite part of your job?

The whole job is my favorite part of the job.

- That's funny because everyone at Infra ... You know how people have bad days 
or may be grumpy or whatever, in general you guys seem to all like each other. 
You all have a great camaraderie. You all get along. You work really closely 
together. It's a very interesting thing to see from the outside. Is that true? 
Or are you just playing it up? Does it really work that way?

That's absolutely true. I've found that generally speaking, when you get a 
bunch of nerds together, they either really like each other and everything 
works or they really don't like each other and nothing gets done. The team is 
great, and it's like no other team I've ever worked with before. But it's very 
odd because you go through the interview process, and the interviews are 
interviews. I mean, you get to know people in interviews, but not really. Then 
you start working with people, and at some point you start getting below the 
surface. And at some point you get deep enough to where you find out whether or 
not ...how you gel with all these people. 

It's very odd that all of us have the same general sense of humor. We'll talk 
about food non-stop in the channel, and recipes and cooking, and different 
beers or different whatevers. It's nice to get to that point with a team that 
you're comfortable enough with everybody to ... like I said, I've been here 
three years and there is still so much that I don't know, both technical and 
non-technical, about the ASF. I ask very dumb questions in channel and say, "I 
have no idea why this is doing this this way," or, "Can someone else take a 
look?" or, "I don't know what I'm doing here." And never in the entire time 
I've been here, from the day one until now, has anyone ever chastised me for 
not knowing something or said anything about the way that I work or something 
like that. Well, at least not in channel. At least not publicly. 

Everyone's very supportive. It doesn't matter if you know everything there 
possibly is to know about one singular product or thing you're working on, or 
don't know anything about it. You can ask questions and really learn about why 
it was done the way it was done, or figure out how to fix a problem. No problem 
on the team. It's just like, "Okay, yeah. This is what you have to do." Or, 
"Here's a document. Read up on it." Or, "I don't know either." And then out of 
that comes an hour of conversation and then a document pops out, and then the 
next person that asks, we can say, "Here, go read the doc." Yeah. I mean, we're 
all very happy. Very happy.

- Which is really good. Looking back when you first started, what was your 
biggest challenge when you came onto the team?

Oh man. I look back at that and I feel like the learning curve was ... It 
wasn't a curve. It was a wall. I've used Linux, I've used Ubuntu for a while 
and various other flavors of Debian and whatnot, so getting spun up on all of 
...expanding my Linux knowledge was a big deal, expanding everything about the 
ASF and how it works. Which I'm still trying to figure out. If you know, send 
me something to read to figure out how that all works. I mean, I don't want to 
sound like I was completely out of my depth and I have no idea what I'm doing, 
but I feel like I was completely out of my depth and I had no idea what I was 
doing. 

There's a lot about the ASF that is just tribal knowledge, and there's a lot 
about Infra that's tribal knowledge. It's just no one has anything written down 
--"the server's been running under Jim's desk for the last 15 years in a 
basement that has battery backups and redundant Internet, so it's never gone 
down. But don't ever touch that server, because if it goes down, then all of 
our mail goes down" or whatever. There was a lot of figuring all that out for 
myself and digging around. Which, frankly, is one of the parts that I really 
enjoy: just, "Hey, this thing broke. I've no idea what that thing is. I've 
no idea where it lives," and just diving in and trying to figure out what's 
going on with it and how it's built, and then the hair trigger that sets it off 
to crash and never work again. Yeah. That's an interesting question too.

- What are you most proud of in your Infra career to date? You're talking about 
overcoming these challenges, I'm always curious just to see what people are 
like, "Yeah, I'm patting myself on the back for that one" or, "Ta-da. That's my 
ta-da moment."

I did lightning talks at ApacheCon Las Vegas and didn't get a phone call from 
you when I was done. [laughs]

- I wasn't at lightning talks --what did you say? What would make me call you?

I didn't say it. We were on stage, and it's John (former ASF Infrastructure 
team member John Andrunas), Drew (ASF Infrastructure team member Drew Foulks), 
and I, and we figured we'd do lightning talks: "Hey, we're the new guys: ask us 
infrastructure questions." A week or two before ApacheCon, there was a massive 
outage at a particular vendor. It wasn't: "Oh, our server's down for a while," 
the server went down and then it was *gone*. It got erased from the vendor 
side. I can't remember what service it was. There was something that 
disappeared two weeks before Vegas and never came back. 

It wasn't just us, though: tons of companies had this issue. So we're on stage 
answering questions, and someone asks where this service went: "What happened 
to XYZ?" And John has the mic and he goes, "You should probably go ask [vendor 
name]." At that point it was very widely published that the vendor"s response 
was like, "Whoops, someone tripped over the cord that powered the data center. 
And when it came back up, then deleted all of your VMs.” They totally 
acknowledged it and they didn't give refunds for it, so it was a little bit of 
a PR kerfuffle for them. The vendor is in the other room handing out buttons 
and stickers, and John was like, "Oh yeah, go ask the [vendor] guys what 
happened to your server. That's their fault," he said it jokingly but my jaw 
dropped. 

- [laughs] No one told me this story. No one said anything. Someone's trying to 
protect you. I had no idea this happened ...oh my gosh.

Well, David Nalley was in the back of the room, and he's screaming with his 
hands cupped around his mouth, "Don't badmouth the vendor and the sponsors." I 
deflected and quickly moved onto something else. [laughs]

But yes, that's another good question that I haven't actually reflected on. 
Looking back and seeing where Infra was when I first started and where it is 
now, it was a very runnable and very good team then, and it's a very runnable 
and very good team now. I feel like a lot of the work that I've done and 
a lot of the work that the team has done over the last three years has been 
getting from a spot of "everything's on fire, who's holding up what this 
weekend?" to things being stable and us nitpicking on whether or not something 
needs to be updated or not. That's huge. That's a big step from like starting a 
company and treading water to being profitable and having resources to do other 
things versus just keeping your employees paid. I mean, it's a big step for a 
company and it's a big step for Infrastructure.

- I love your talking about how you guys are tightly-knit and all that. How 
would your co-workers describe you?

The other odd part about that too is being completely remote and not having 
day-to-day, face-to-face interactions with people. You get a very odd sense of 
people through text for a 24-hour period that you're online reading stuff. It's 
a different perspective than if I was in the office every day, working on 
something and interacting with people. Even though every day, except for the 
weekends, I'm online talking to these guys and doing stuff. How would they 
describe me? Dashingly good looking and ... I don't know. [laughs]

- I know that Infra's "just Infra," right --you guys are all under the Infra 
umbrella. Do you have a title? When you got hired, what do they call you?

We're all systems administrators. The only person that actually has a title is 
Greg, and he's Infrastructure Administrator.

- What are the biggest threats you face? For infra folks or systems 
administrators or infrastructure administrators even, what do you need to watch 
out for these days? What's big in the industry? Is everyone saying, "Oh, XYZ's 
coming"? In terms of your role in the job: is there something that you need to 
keep your eye on? Is there something that you would advise other people, "If 
you're in this job keep an eye out for blah, this is a new threat" or anything 
along those lines?

General scope stuff. 16 years ago, everything was hardware: you bought hardware 
and you had to physically put it somewhere. And virtual machines came along 
about the same time. People were starting to do virtual stuff to where you 
could have a physical machine and then multiple machines running on that, 
sharing resources. Then cloud and infrastructure as a service, and everything's 
been moving more and more towards that over the years.

Of course, there are still people that work in office IT, doing desk support 
stuff or office infrastructure type things. Those are still a majority of how 
things run at companies. As everything is moved more towards the cloud or 
hosted services, more systems administrators are becoming more like software 
engineers. And software engineers are becoming more like systems 
administrators. They're kind of melding into one, big group of people. Now of 
course, there are still people that only write software. But gone are the days 
where it used to be someone would write some code and say, "I need to deploy it 
and get it out to all these computers." They would write the code, they'd hand 
it off to a systems person. Systems would go and configure whatever server 
to get it out to however many machines and hit the button and go. The software 
developer never really needed to know hardware specifics of the systems that it 
was going to run on. And the systems people never really needed to know what 
software packages this was being put together from. There are exceptions to 
that, but for the most part ... 

Over the years, it's fallen into a thing now where the software developer knows 
exactly what systems this is going to run on and how it's going to run there, 
so it's more efficient and things work better and they're releasing less buggy 
code based on the fact that they know they're closer to the hardware. And the 
systems people, they want to troubleshoot it more and work with it and fix 
problems because they're closer to the software and know more about its 
internal workings and how it's going to run on systems. Everything is getting 
more and more chunked down: first it was VMs, then it's cloud, then it's 
containers with Docker and things like that, and it's going to get more 
virtualized down into that --knowing about Docker orchestration and things like 
Kubernetes and Apache Mesos. The reality is people run Kubernetes, people run 
Docker, people run everything. That's the interesting thing in terms of how we 
do it at the ASF: we don't require folks to do just one thing.

In terms of where the industry's going ... everything's getting pushed down to 
"a developer can work in a container on a set of systems, write software for 
that and then deploy that to a machine themselves, never involving a systems 
engineer at all, and build a product using that." It's getting stuff out the 
door faster, and it's also chasing what's been the unicorn of the industry for 
a while now ... even today: I developed this thing, it works on my machine. If 
I move it over to another computer, it stops working. Why? What's the problem 
with that? Containers fix that problem. The container you run on my system runs 
the same way as it does on every system everywhere. It takes the "runs on my 
machine" thing out of the equation. 

- What's your greatest piece of advice? What would you tell aspiring sysadmins?

Part of the ASF is the community behind it, and a giant part of that is what 
makes it work. I mean, you could say all of it. That's what makes everything 
work with this. Right when I first started the sysadmin kind of thing, I didn't 
get into Meetups and Linux Users Groups and any of that stuff. I didn't get 
into the network. I didn't go into the community that I had around me. And 
honestly, I don't know if that's because it didn't exist or because I didn't 
know about it or what, but now that I'm older and wiser, the community part of 
it is really ...there's a massive benefit to that. Aside from socialization, or 
networking and how to get a better job through networking, getting together 
with like-minded people and talking through your problems is an amazing tool to 
use. And I didn't do that enough when I was a sysadmin starting out. Looking 
back, what I sort of regret not doing was really sharing knowledge with other 
people in the community and building a group of people 
that I could ping ideas off of, or help with other ideas, or share in the 
knowledge of, "Hey, this is what's going on in the industry" or, "Hey, I saw 
this at work the other day. How do we work around that?" or that kind of thing. 
It's much easier these days with social media: the never-ending amounts of 
social media. But it's a big, important part of my day-to-day now, that I wish 
I had 16 years ago.

- That's powerful. OK, If you had a magic wand, what would you see happen with 
ASF infra?

If I had a magic wand, I'd update our mail server instantly or maybe magic wand 
a few other projects.

- Wait. I know you're joking, but what is the problem with the mail server?

It's running on an older version of FreeBSD that doesn't play well with our 
current tools. Some form of that server has been upgraded, patched, moved, 
migrated, etc. for the last 20 years. We want to bring it up to more modern 
standards. Mail runs fine for the most part, but it's probably the most 
critical service we have at the ASF and we want to make sure everything 
continues to hum along. Because of that, it's a huge project that touches a ton 
of different parts of our infrastructure.

- How big is it?

It's all of our email. Every email that goes through an apache.org address.

This is a huge project and Chris (Lambertus) has been working on it for a while 
--it's not a simple thing to fix. It's very, very complicated. We couldn’t do 
it without him.

Back to the magic wand thing: I'd wish for more wands. 

- - -

Chris is based in Pennsylvania on UTC -4. His favorite thing to eat during the 
workday is chicken ramen.

# # #

