Actually, let me withdraw the question for now. If I call an unfiltered
(describe-images) on my account I'll get ~27,900 images. It takes 70
seconds to retrieve them using the Java API (from Clojure).
If I then print (str image) for all those images to a file, that adds
another 153 seconds for a total of 223 seconds. Presumably that's the
normal Java toString() method invocation.
If I print out the Amazonica version of it, it takes 195 seconds,
presumably because we're sharing keyword references internally and so
abusing memory less overall (just a wild guess).
So if I do the native calls and cherry pick the information I want (like
the java EC2 CLI does), then I can get the time down significantly.
Otherwise Amazonica is probably doing a reasonable job given what I'm
asking of it.
And, in the wisdom gained department, never do unfiltered (describe-images)
requests if you can help it :-)
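To make the "cherry pick the information I want" idea concrete, here is a minimal sketch in plain Clojure. It assumes the result has the shape quoted later in this thread ({:images [{...} ...]}); the sample data, image IDs, and the helper names image-summaries and summary-lines are made up for illustration and are not part of Amazonica. It only trims what gets printed; it does nothing about the time spent retrieving the images in the first place.

```clojure
(require '[clojure.string :as str])

;; Hypothetical sample shaped like the Amazonica output quoted in this thread.
;; The IDs and names are invented for illustration.
(def sample-result
  {:images [{:image-id "ami-00000001" :name "example-a" :state "available"
             :hypervisor "xen" :virtualization-type "paravirtual"
             :root-device-type "instance-store"}
            {:image-id "ami-00000002" :name "example-b" :state "pending"
             :hypervisor "xen" :virtualization-type "hvm"
             :root-device-type "ebs"}]})

(defn image-summaries
  "Keep only the keys in `ks` for each image in a describe-images style result."
  [result ks]
  (map #(select-keys % ks) (:images result)))

(defn summary-lines
  "Render one tab-separated line per image: values only, in the order of `ks`."
  [result ks]
  (map (fn [image] (str/join "\t" (map image ks)))
       (:images result)))

;; Usage: print one short line per image instead of the full nested map.
(doseq [line (summary-lines sample-result [:image-id :state])]
  (println line))
```

Since both helpers return lazy sequences, the lines can be written to a file one at a time rather than building one huge string first.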
On Fri, Mar 28, 2014 at 12:52 PM, Michael Cohen mcohe...@gmail.com wrote:
time ec2-describe-images -a > ec2-cli-images.txt
real 1m26.401s
user 0m6.551s
sys 0m1.159s
and writes a 7.5MB file to disk. Note the -a flag, to list all of the
available public images.
In a REPL,
(time (spit "clj-awz-images.txt" (describe-images)))
Elapsed time: 90258.47 msecs
and writes an 18MB file to disk containing all the available public
images.
Am I missing something?
You can also pass a list of filters to the call to narrow the result.
On Friday, March 28, 2014 7:59:48 AM UTC-7, Dave Tenny wrote:
I'm trying to code some amazonica based solutions in a nontrivial AWS
environment.
I work with many AWS accounts and it isn't unusual to see a thousand
instances running on one account, and similar excesses in other types of
AWS resources. So if you're doing an ec2-describe-instances (or amazonica
equivalent), it needs not to choke in this environment.
I like the way amazonica does all the bean marshalling for me so I can
express queries simply. But the returned datasets need to be more
pragmatic/performant.
The problem for me is that Amazonica doesn't seem up to the task of
dealing with queries that return large volumes of data.
It has nothing to do with reflection I suspect, and more to do with
unwieldy amounts of duplicate information in the result unmarshalling
process.
The Clojure-all-the-way-down philosophy results in a lot of duplicated
information, and just printing the result to a file takes a long time.
If I accidentally let the output go to an emacs cider repl buffer, then
things get so wedged that I may as well kill -9 emacs.
(Known cider repl issues here, it isn't all amazonica).
For example: here's how long it takes to run the Java-based EC2 CLI to
describe images on an account:
$ time ec2-describe-images > /tmp/ec2-cli-images.out
real 0m11.484s
user 0m2.564s
sys 0m0.129s
And here's how long it takes from a 'lein repl' to run the same query on
the same account:
(time (with-output ["/tmp/clj-awz-images.out"] (println
(ec2/describe-images))))
Elapsed time: 194685.552683 msecs
Now the amount of data being printed by the EC2 CLI is of course much
different from the output from Amazonica. Amazonica is returning
everything in gory duplicate map detail and the EC2 CLI is not, as
evidenced by the relative output sizes:
-rw-rw-r--. 1 dave dave 17201290 Mar 28 10:35 clj-awz-images.out
-rw-rw-r--. 1 dave dave    99342 Mar 28 10:26 ec2-cli-images.out.11.5s
Where the amazonica output starts with:
{:images [{:hypervisor xen, :state available, :virtualization-type
paravirtual, :root-device-type instance-store,
... and goes on like that with duplicate keywords all the way down.
Anyway, my goal isn't to turn amazonica into ec2 cli. But even the most
trivial operations in amazonica (especially the most trivial, i.e. those
lacking filters against large data sets), pretty much whack me left and
right with CPU-wedged tools and (completely unacceptable) long waits for
results.
Any suggestions on how to use amazonica in a way where the output is ...
different, and minimal/workable?
Or am I left with going to another package or writing my own code against
the Java SDK APIs directly?
I'm pretty sure the results need to be structures whose relationship to
data values is implicit (and not explicit in map keys). I don't see any
options with amazonica to change this however.
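As a rough illustration of what "implicit" structure could look like, here is a sketch in plain Clojure that turns a sequence of maps into one header of column keys plus plain value vectors, so each keyword is written once rather than once per record. The names maps->table and table->maps and the sample data are hypothetical; this is not an Amazonica option, just a post-processing step one could apply to its output.

```clojure
;; Hypothetical sample records; IDs are invented for illustration.
(def images
  [{:image-id "ami-00000001" :state "available" :hypervisor "xen"}
   {:image-id "ami-00000002" :state "pending"   :hypervisor "xen"}])

(defn maps->table
  "Turn a seq of maps into {:columns [k ...] :rows [[v ...] ...]}.
  `ks` must be a non-empty sequence of keywords; keys not in `ks` are dropped."
  [ks maps]
  {:columns (vec ks)
   :rows    (mapv (apply juxt ks) maps)})

(defn table->maps
  "Inverse of maps->table: rebuild one map per row from the shared columns."
  [{:keys [columns rows]}]
  (mapv #(zipmap columns %) rows))

(def table (maps->table [:image-id :state :hypervisor] images))

;; Each row carries values only; the keys live once in :columns.
(println (:columns table))
(doseq [row (:rows table)]
  (println row))
```

The saving in printed size grows with the number of records, since the per-record cost is only the values. The trade-off is that consumers have to look up positions via :columns, or call table->maps when they want maps back.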
Thanks for suggestions, forgive me if I've missed something obvious. I'm
just trying to see what's out there and at the same time move along quickly
enough that I can get some usable tools for work (so I can lose all my
python and bash scripts for various interfaces, I want clojure!).
- Dave