Hi Guys,

We have an opportunity to test VXQuery on AWS. I wanted to get feedback on the tasks needed to prepare a scale test on AWS. What do we need to create in AWS, and what are the coding tasks we need to complete first?
I think the test would be a great opportunity to try out the YARN and HDFS code (from GSoC). Also, now that we have a handful of XMark queries working (also from GSoC), they could be used for the test. XMark includes an XML generator for a single machine.

What types of scale tests would be good in this environment? Typically we use scale-up and speed-up tests (scale-up grows the data and the cluster together; speed-up keeps the data fixed while adding nodes). What AWS architecture would the test require? I am new to AWS, so please post your suggestions. One requirement for our test will be to exceed local memory by five times.

Testing requirements (suggested):
- AWS architecture
- XMark benchmark
- HDFS as data storage
- Data size 5 times local memory for each node (at least for the largest scale-up test)
- Scale-up test (how big can we go?)

Here is a list of tasks that I can think of right now...

Coding:
- Finish the GSoC projects.
- Update the XMark XML generator to work in a cluster environment. (Create local node data in parallel that is unique across the cluster.)
- Benchmark scripts for the XMark query tests.

AWS:
- Determine the architecture for the test.
- Scripts/configuration for the cluster build-out.

Previous tests only used eight local server nodes. The AWS test will exercise Apache VXQuery in a cloud environment and could scale to a much larger cluster size.

Thanks for your feedback.

Preston
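P.S. For the "unique data across the cluster" coding task, one possible approach is to give each node a disjoint slice of the total document range, so nodes can generate in parallel without overlap. The sketch below is only an illustration of that partitioning idea: the `slice` helper and the `nodeId`/`nodeCount` parameters are hypothetical (in practice they would come from the cluster configuration), and the actual xmlgen invocation is left out.

```java
// Sketch (assumption, not existing VXQuery code): divide a cluster-wide
// XMark generation job into per-node slices so each node creates a unique,
// non-overlapping portion of the data set.
public class PartitionPlan {
    // Returns {start, end} (end exclusive) for the documents that node
    // nodeId should generate out of totalDocs, spreading any remainder
    // across the first nodes so slice sizes differ by at most one.
    static long[] slice(long totalDocs, int nodeCount, int nodeId) {
        long base = totalDocs / nodeCount;
        long rem = totalDocs % nodeCount;
        long start = nodeId * base + Math.min(nodeId, rem);
        long end = start + base + (nodeId < rem ? 1 : 0);
        return new long[]{start, end};
    }

    public static void main(String[] args) {
        long total = 10; // e.g. 10 XMark documents across the cluster
        int nodes = 3;   // hypothetical cluster size
        for (int n = 0; n < nodes; n++) {
            long[] s = slice(total, nodes, n);
            System.out.println("node" + n + ": docs " + s[0] + "-" + (s[1] - 1));
        }
    }
}
```

Each node would then run the generator locally for its own range, giving cluster-unique data without any coordination beyond knowing its node ID.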
