Re: OT: Tomcat on AWS for Dummies

Mark Eggers Sat, 20 Jul 2019 09:39:45 -0700

Chris,

> Jerry,
> 
>> On 7/19/19 13:38, Jerry Malcolm wrote:
>>>>> I have had a dedicated hosted environment with WAMP and
>>>>> Tomcat for over 15 years.  I'm very familiar with everything
>>>>> related to that environment... apache http, mysql, dns
>>>>> server, the file system, JAMES, and all of my management
>>>>> scripts that I've accumulated over the years. Everything is
>>>>> in the same box and basically on the same desktop. But now I
>>>>> have a client that has needs that are best met in an AWS
>>>>> environment.
>> Can you explain that in a little more depth? What is it about AWS
>> that meets their needs better?
> 
>> I ask because you can provision a one-box wonder in AWS just like
>> you do on a physical space with a single server. You just have to
>> use remote-desktop to get into it, and then it's all the same.
> 
>> But if they want to use RDS, auto-scaling, and other
>> Amazon-provided services then things can get confusing.
>>> Unfortunately, that is the precise reason we need to go AWS.... 
>>> Extremely high availability and scalability / load-balancing
>>> across multiple instances.  There will need to be at least one
>>> instance running at all times. Even when doing
>>> maintenance/upgrades on other instances.
> 
> It's not "unfortunate" necessarily. At least it makes it clear why
> they want to migrate to AWS.
> 
>> So the answer to your question really depends upon what the client 
>> thinks they'll be getting by you taking your existing product "to
>> the cloud".
> 
>>>>> I understand just enough AWS to be dangerous, which is not 
>>>>> much.... I do know that it's a bunch of different modules,
>>>>> and I believe I lose the direct file system access.
>> That heavily depends upon how you do things. You can get yourself
>> a server with a disk and everything, just like you are used to
>> doing.
>>> Do you mean AWS offers a 'file server' module that I can
>>> basically access directly as a drive from TC?  If so, that eases
>>> my mind a bunch. I manage and serve gigabytes of videos and
>>> photos.  I don't really want a full CMS implementation.  Just
>>> want a big hard drive I can get to.
> 
> No, AWS doesn't really have a "file server" module that you can
> enable. Do you need a large disk for bulk storage? What are you
> storing? Perhaps switching over to a key-value store (which can act
> like a filesystem) or a document-store database (e.g. CouchDB) if you
> have fairly regular documents that you want to store. All of those
> technologies are quite cloud-friendly. You can even use them
> single-node if you want to make your application available to either
> AWS-based clients OR your more traditional one-box-wonder clients. Or,
> you can abstract your "write a file somewhere" process so that you can
> swap implementations at run-time: configuration says local-disk? Use
> FileWriter. Using CouchDB? Push the file to CouchDB through its APIs.
>


What about using EFS (NFS store) in this environment? For Windows, an
NFS client would have to be installed, but that doesn't seem like much
of a barrier.
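
A rough sketch of the "swap implementations at run-time" idea quoted
above, in Java. Everything here is an assumption made for illustration:
BlobStore, DiskBlobStore, MemoryBlobStore, BlobStores and the
"local-disk" / "memory" configuration values are hypothetical names, and
the in-memory class only stands in for a remote store such as CouchDB.
If an NFS store like EFS were mounted on the instance, the disk-backed
version would presumably just be pointed at the mount point.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical abstraction over "write a file somewhere". */
interface BlobStore {
    void put(String key, byte[] data) throws IOException;
    byte[] get(String key) throws IOException;
}

/** Stores blobs under a root directory (a local disk or a mounted share). */
final class DiskBlobStore implements BlobStore {
    private final Path root;

    DiskBlobStore(Path root) {
        this.root = root.toAbsolutePath().normalize();
    }

    /** Maps a key to a path and refuses keys that leave the root. */
    private Path resolve(String key) throws IOException {
        Path p = root.resolve(key).normalize();
        if (p.equals(root) || !p.startsWith(root)) {
            throw new IOException("invalid key: " + key);
        }
        return p;
    }

    @Override
    public void put(String key, byte[] data) throws IOException {
        Path target = resolve(key);
        Files.createDirectories(target.getParent());
        Files.write(target, data);
    }

    @Override
    public byte[] get(String key) throws IOException {
        return Files.readAllBytes(resolve(key));
    }
}

/** In-memory stand-in for a remote store; not a real CouchDB client. */
final class MemoryBlobStore implements BlobStore {
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    @Override
    public void put(String key, byte[] data) {
        blobs.put(key, data.clone());
    }

    @Override
    public byte[] get(String key) throws IOException {
        byte[] data = blobs.get(key);
        if (data == null) {
            throw new FileNotFoundException(key);
        }
        return data.clone();
    }
}

/** Picks an implementation from a (hypothetical) configuration value. */
final class BlobStores {
    static BlobStore fromConfig(String kind, Path root) {
        switch (kind) {
            case "local-disk":
                return new DiskBlobStore(root);
            case "memory":
                return new MemoryBlobStore();
            default:
                throw new IllegalArgumentException("unknown store: " + kind);
        }
    }
}
```

The application code would only ever see BlobStore, so moving from one
box to several instances becomes a configuration change rather than a
rewrite of every place that touches the file system.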

> 
>>>>> I've watched an AWS intro video and a couple of youtube
>>>>> videos on setting up TC in AWS. But they always start with
>>>>> "now that you have your AWS environment set up....".   I am
>>>>> looking for something that explains the big picture of
>>>>> migrating an existing WAMP+TC to AWS.  I am not so naive as to
>>>>> think that there won't be significant rip-up to what I have
>>>>> now. But I don't want to do unnecessary rip-up just because I
>>>>> don't understand where I'm heading. Basically, I don't know
>>>>> enough to know what I don't know.... But I need to start
>>>>> planning ahead and learning soon if I'm going to have any
>>>>> disasters in my code where I might have played it too loose 
>>>>> with accessing the file system directly in my dedicated 
>>>>> environment.
>>>>>
>>>>> Has anyone been down this path before and could point me to
>>>>> some tutorials targeted to migrating WAMP+TC to AWS? Or
>>>>> possibly hand-hold me just a little...? I'm a pretty quick
>>>>> learner.  I just don't know where to start.
>> As usual, start with your requirements :)
> 
>>> Requirements are what I have now in a single box, but with the
>>> addition of multiple instances of TC (and HTTPD and/or mySQL?)
>>> for HA and load balancing.
> 
> One box with multiple instances is ... not HA. Sorry. That allows you
> to do things like upgrade the application without taking it down
> completely. But it does not allow you to perform maintenance on the OS
> because everything has to come down.
> 
>>> Day-1 launch won't be massive traffic and theoretically could be
>>> handled by my single dedicated server I have today.  But if this
>>> takes off like the client predicts, I don't want to get caught 
>>> flat-footed and have to throw together an emergency redesign to
>>> begin clustering TC to handle the traffic. Rather go live
>>> initially with single instance AWS, but with a thought-out (and
>>> tested/verified) plan to easily begin clustering when the need
>>> hits.
> 
> One of the first things I'd take a look at is what it would take to
> switch from vanilla MySQL to one of the databases available through
> RDS. This sounds stupid: MySQL is available via RDS, so you're done,
> right? Well, it's not so simple. First of all, RDS is distributed by
> definition. So anything that affects e.g. MySQL when you go
> "distributed" means that your application has to handle it.
> 
> For example -- and we're doing this right now with MariaDB -- when you
> program a simple JDBC transaction, you usually expect that an INSERT
> is going to fail if a record with the same PK already exists, and then
> you roll-back that transaction and you are done. The user has to try
> again maybe. You don't like that, so you do this:
> 
> BEGIN
> SELECT id, foo, bar, baz FROM table WHERE id=? FOR UPDATE
> 
> if(present)
>   update row with new foo, bar, baz values
>   update row
> else
>   move to insert row
>   update row with new foo, bar, baz values
>   insert row
> 
> COMMIT
> 
> That's great, and it works perfectly on a single server. However, on
> multiple servers, you can get a failure on COMMIT because that's when
> the transaction is sent-around to the rest of the cluster.
> 
> So if you want to have semantics like the above (create-or-update
> record), then you have to wrap pretty much every transaction in a "try
> it once; if you fail, rollback and try the exact same thing again".
> You need to do this because if you don't, you'll give your users way
> more unnecessary errors because simply re-trying the transaction will
> (likely) succeed because the contention has passed.
> 

My understanding of Multi-AZ RDS is different. There's a primary
instance and a single endpoint, and that endpoint is switched to a
standby if there's a failure. From the documentation that I've read, the
endpoint name itself doesn't change, just where it points to.

Is this not your experience?
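
For reference, the "try it once; if you fail, rollback and try the exact
same thing again" wrapper quoted above could be sketched like this in
plain JDBC. This is only a sketch under assumptions: TxBody and
RetryingTx are hypothetical names, and it retries on any SQLException,
whereas real code would probably retry only on errors the driver reports
as transient.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical callback holding the body of one transaction. */
@FunctionalInterface
interface TxBody<T> {
    T run(Connection conn) throws SQLException;
}

final class RetryingTx {
    /**
     * Runs the body in a transaction. If the body or the COMMIT fails,
     * rolls back and runs the exact same body again, up to maxAttempts.
     */
    static <T> T execute(Connection conn, int maxAttempts, TxBody<T> body)
            throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try {
            SQLException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    T result = body.run(conn);
                    // On a cluster, a conflict may only show up here.
                    conn.commit();
                    return result;
                } catch (SQLException e) {
                    conn.rollback();
                    last = e;
                }
            }
            // Every attempt failed: report the most recent error.
            throw last;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

A create-or-update like the one quoted would then be passed in as the
body, e.g. RetryingTx.execute(conn, 2, c -> saveOrUpdate(c, record)),
where saveOrUpdate is whatever method already does the SELECT ... FOR
UPDATE followed by the update or insert.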


> I'm assuming that there isn't anything problematic about running
> multiple web application instances because you said you already run
> multi-instance. Don't run those on the same EC2 instance. Instead, run
> multiple EC2 instances.
> 
> Load-balancing is easy, although I have to admit that I haven't
> figured out yet how to properly do "application load balancers" (I use
> old-school "Classic" ELB).
> 
> Once you can (a) use ELB (b) have multiple EC2 instances for your
> applications and (c) use RDS, I think you'll be able to go into
> production like that and figure everything else out from there.
> 
> I've really been hoping that someone can come to an ApacheCon and give
> a presentation about deploying Tomcat-based web applications to AWS.
> Especially talking about auto-scaling and stuff like that.
> 
> Jean-Frederic has been showing his "to the cloud" presentation but
> it's not nearly to the level of detail that I'd like to see. Yes, I
> can see that he has been able to do it, but I have no idea HOW it has
> been done. What is the role of e.g. Kubernetes? How does one provision
> those services to get started? What do you need to start with one
> node? How do you move to 2 nodes? What about arbitrary auto-scaling
> with AWS? These are the kinds of things I want to see.
> 
> -chris

I'm slowly working through all of those issues. I hope to have something
documented by the end of this year.

I'm trying to do this without Docker, without Netflix's Spinnaker, etc.
However, a suitably configured Docker image (application, Tomcat, Redis
client) works really well locally. I can spin up lots of containers and
it all just works (as long as the application is well-behaved). I'm just
using swarms right now. I'll try it with Kubernetes soon.

I've tried Elastic Beanstalk, but I was not impressed with deployment
speed. I also make use of versioned deployments in a more traditional
environment, and I find that to be useful.

What I'm looking at now is Elastic Groups and deployment with a CI/CD
platform such as Jenkins. How do you tell how many EC2 instances you
have in your group, and how do you target them? One possible solution
would be to deploy to an S3 bucket, trigger an event to send information
into a message queue, and then have each EC2 instance poll the queue to
do the actual deployment.

That sounds like a LOT of plumbing, and sort of counter to the
philosophy of AWS. I'm still thinking about it.
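
To make the bucket-plus-queue idea a bit more concrete, here is a rough
sketch of what the polling side on each instance might look like. None
of this uses a real AWS API: DeployNotice, NoticeInbox, Deployer,
DeployAgent and InMemoryInbox are all hypothetical names, and the inbox
is an in-memory stand-in for a message queue. It also assumes every
instance gets its own copy of each notice, since a single shared queue
typically hands a given message to only one consumer.

```java
import java.util.ArrayDeque;
import java.util.Optional;
import java.util.Queue;

/** Hypothetical notice published when a new artifact is uploaded. */
final class DeployNotice {
    final String artifactKey; // object key of the artifact in the bucket
    final int version;        // monotonically increasing build number

    DeployNotice(String artifactKey, int version) {
        this.artifactKey = artifactKey;
        this.version = version;
    }
}

/** Hypothetical per-instance inbox; stands in for a real queue. */
interface NoticeInbox {
    Optional<DeployNotice> poll();
}

/** Hypothetical hook that fetches the artifact and deploys it. */
interface Deployer {
    void deploy(DeployNotice notice);
}

/** In-memory inbox, useful for trying the agent out locally. */
final class InMemoryInbox implements NoticeInbox {
    private final Queue<DeployNotice> queue = new ArrayDeque<>();

    void publish(DeployNotice notice) {
        queue.add(notice);
    }

    @Override
    public Optional<DeployNotice> poll() {
        return Optional.ofNullable(queue.poll());
    }
}

/** Runs on each instance and deploys the newest version it has seen. */
final class DeployAgent {
    private final NoticeInbox inbox;
    private final Deployer deployer;
    private int deployedVersion;

    DeployAgent(NoticeInbox inbox, Deployer deployer, int deployedVersion) {
        this.inbox = inbox;
        this.deployer = deployer;
        this.deployedVersion = deployedVersion;
    }

    /**
     * Drains the inbox and deploys at most once: only the newest
     * notice, and only if it is newer than what is already running.
     * Returns the version deployed on this instance afterwards.
     */
    int pollOnce() {
        DeployNotice newest = null;
        Optional<DeployNotice> next;
        while ((next = inbox.poll()).isPresent()) {
            DeployNotice n = next.get();
            if (newest == null || n.version > newest.version) {
                newest = n;
            }
        }
        if (newest != null && newest.version > deployedVersion) {
            deployer.deploy(newest);
            deployedVersion = newest.version;
        }
        return deployedVersion;
    }
}
```

Skipping stale or duplicate notices keeps an instance from redeploying
an old build if messages arrive late or out of order. How instances
that start later would learn the current version is not covered here.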

Hacking around with application load balancers and multiple Elastic
Beanstalk groups might be another solution, but you have to play some
undocumented games with AWS. That always raises alarm bells.

I move at my client's pace, but hopefully this will happen sooner rather
than later.

. . . just my two cents
/mde/

