I guess I'm a little confused as to what this means.

I have a simple WAR-based web application that uses Lucene-created
indexes to provide search results in an XML format.

Especially given the following context:

And a supplementary question: how do I modify my pom file to do this
with Maven?

I was under the impression that Paul was building a separate
application using Lucene during the build stage to create the
indexes, but then using an application-specific mechanism to use
those indexes.

That's what I thought, too.
Yes, correct; let me explain a bit further. I'm trying to deploy an application that serves results from a Lucene index in response to user requests. Deploying it manually to my own server is fine: first I copy the index files to a location on disk, then I deploy my application. Its web.xml has a servlet parameter that defines where the indexes are, so the servlet's init() method initializes the indexes from there. The problem is that I'm trying to deploy the application to Amazon Web Services using autoscaled Elastic Beanstalk. This means the application has to be able to initialize itself purely from what is in the war, because Elastic Beanstalk automatically starts new servers as load requires and terminates those instances when they are no longer needed.

I do seem to have a solution, but I detail it here because it doesn't seem quite right and might be useful to others.

Short Answer:
Originally I tried putting the index files (unzipped) into the src/main/resources folder of my Maven project and referring to the WEB-INF/classes/index_dir location in my web.xml, and Tomcat didn't start. It didn't seem right for files other than Java classes to be in that folder anyway, so I discarded the idea. However, I've just tried it again locally and it worked, so if it also works on EB, that is the solution I'm going to use for now, unless there are any better suggestions.

It does mean that the resulting .war file is rather large, far too large to upload from my local machine. But as I build the code and indexes on another AWS EC2 instance, I can just dump the file into S3 and deploy from S3 to EB. You don't seem to be able to redeploy from S3, but I've realised that when I need to redeploy I would do it to a new EB configuration and then swap the DNS from EB1 to EB2 to minimize downtime, so that is not really a problem.

A supplementary question:
Is there a system property I can use to refer to WEB-INF as a relative directory rather than by its full path?
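
One possible way to avoid a full path without relying on a system property, as a minimal sketch only: because the index files end up under WEB-INF/classes, they are on the webapp classpath and can be looked up by name. The class name IndexLocator and its method are hypothetical, and this only works when the war is unpacked to disk, since the resource must be a real directory. Inside a servlet, ServletContext.getRealPath("/WEB-INF/classes/index_dir") is the other usual route, with the same caveat about unpacked wars.

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: resolves a classpath directory to a filesystem path. */
final class IndexLocator {

    private IndexLocator() {}

    /**
     * Looks up resourceName (e.g. "index_dir") on the given class loader
     * and converts it to a Path. Fails if the resource is missing or is
     * not a plain file or directory on disk (e.g. an unexpanded archive).
     */
    static Path locate(ClassLoader loader, String resourceName) {
        URL url = loader.getResource(resourceName);
        if (url == null) {
            throw new IllegalStateException("Not on classpath: " + resourceName);
        }
        if (!"file".equals(url.getProtocol())) {
            throw new IllegalStateException("Not a plain file or directory: " + url);
        }
        try {
            return Paths.get(url.toURI());
        } catch (URISyntaxException e) {
            throw new IllegalStateException("Bad resource URL: " + url, e);
        }
    }
}
```

In a servlet this might be called as IndexLocator.locate(Thread.currentThread().getContextClassLoader(), "index_dir"), with "index_dir" coming from the web.xml parameter instead of an absolute path.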

Long Answer:
Since originally posting this question I have looked at a few other possible solutions but none were satisfactory.

1. Deploy the war without indexes, and in my servlet's init() method write code to grab the compressed indexes from S3 and unzip them to the location specified in web.xml. This worked with a single-instance EB, but unfortunately AWS does not wait for the init() method (which takes 20 minutes) to finish before declaring the instance ready. Because the instance was busy unzipping indexes and could not serve requests, AWS monitoring declared it too busy and opened another two instances. Once all three instances had finished their init() methods they were all up and working, and a few minutes later two were terminated because they were not needed. But this means that if the server is genuinely busy, newly started instances will be declared ready by AWS yet fail to service requests during the init() period. This seems like a bug in AWS, but it is not going to change anytime soon.

2. Deploy the war without indexes and use AWS .ebextensions files to grab and unzip the indexes. This might work, but as a general rule I really dislike having to write custom deployment code/configuration. And because the disk provided by the AWS instance is limited in size, unzipping is not so simple. For example, instead of creating a tar.gz file, I had to gzip the files first and then tar them, so that once untarred I could decompress one file at a time, which required less temporary space. This would make the EB code more complex.

3. Create a custom Amazon image that can be used by EB. This seems theoretically possible, but it quickly got very messy and seemed very much a hack.

4. Use Docker; AWS now supports the Docker framework. This might be a good solution, but having spent far too much time on understanding AWS, I wasn't keen to spend more time on yet another framework to solve one problem.
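
To illustrate the "decompress one file at a time" step from option 2, here is a rough sketch in Java (rather than as .ebextensions commands). It assumes the tar has already been unpacked into a directory of individually gzipped files; the class name GzipExpander is hypothetical. Each archive is deleted as soon as its contents are written, so the extra disk space needed at any moment is roughly one uncompressed file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.GZIPInputStream;

/** Hypothetical helper: expands *.gz files in a directory one by one. */
final class GzipExpander {

    private GzipExpander() {}

    /**
     * Decompresses every "name.gz" in dir to "name" in the same directory,
     * deleting each archive once its contents have been written.
     * Returns the number of files expanded.
     */
    static int expandAll(Path dir) throws IOException {
        // Snapshot the archive list first so the directory is not being
        // modified while it is still being listed.
        List<Path> archives;
        try (Stream<Path> entries = Files.list(dir)) {
            archives = entries
                    .filter(p -> {
                        String n = p.getFileName().toString();
                        return n.endsWith(".gz") && n.length() > ".gz".length();
                    })
                    .collect(Collectors.toList());
        }
        for (Path archive : archives) {
            String name = archive.getFileName().toString();
            Path target = dir.resolve(name.substring(0, name.length() - ".gz".length()));
            try (InputStream in = new GZIPInputStream(Files.newInputStream(archive))) {
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
            // Free the compressed copy before moving on to the next file.
            Files.delete(archive);
        }
        return archives.size();
    }
}
```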

If the Lucene API is used, then writing a servlet context listener
that digs out the initial indexes and places them in java.io.tmpdir
in a known subdirectory is probably the way to go. This ensures
that even if a WAR file is not exploded, the Lucene DirectoryReader
API can get to the files.
That's precisely what I was suggesting.

So this is what I did with option 1, but because of the AWS issue it didn't work as well as hoped.
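
For anyone wanting to try the listener idea with indexes bundled in the war, the copying step could be sketched roughly as below. This is an illustration under assumptions, not the code I actually ran: the class name IndexExtractor is hypothetical, and the index file names are passed in explicitly because a classpath directory cannot be listed portably when the war is not exploded. In a webapp, extract() would be called from a ServletContextListener's contextInitialized() method, and the returned directory handed to Lucene (for example via FSDirectory.open, whose argument type depends on the Lucene version).

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;

/** Hypothetical helper: copies bundled index files out of the classpath. */
final class IndexExtractor {

    private IndexExtractor() {}

    /** A known subdirectory of java.io.tmpdir, e.g. defaultTarget("index_dir"). */
    static Path defaultTarget(String subdirectory) {
        return Paths.get(System.getProperty("java.io.tmpdir"), subdirectory);
    }

    /**
     * Copies each named file from resourceDir on the classpath into
     * targetDir, creating targetDir if needed. Returns targetDir.
     */
    static Path extract(ClassLoader loader, String resourceDir,
                        List<String> fileNames, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        for (String name : fileNames) {
            String resource = resourceDir + "/" + name;
            try (InputStream in = loader.getResourceAsStream(resource)) {
                if (in == null) {
                    throw new FileNotFoundException("Not on classpath: " + resource);
                }
                Files.copy(in, targetDir.resolve(name), StandardCopyOption.REPLACE_EXISTING);
            }
        }
        return targetDir;
    }
}
```

Note that copying from inside the war avoids the S3 download, but any long-running work at startup would still run into the readiness problem described under option 1.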

Paul

