Hi all,

The objective of this mail is to summarise the results of the recently
conducted performance test round for DAS 3.1.0.

These tests were intended to measure the throughput of the batch and
interactive analytics capabilities of DAS under different conditions;
namely data persistence, Spark analytics job execution and indexing. For
this purpose, we've used DAS 3.1.0 RC3 instances backed by an Apache HBase
cluster running on HDFS as the data store, tuned for writes.

This test round was conducted on Amazon EC2 nodes, in the following
configuration:

3 DAS nodes (variable roles: publisher, receiver, analyzer and indexer):
c4.2xlarge
1 HBase master + Hadoop Namenode: c3.2xlarge
9 HBase Regionservers + Hadoop Datanodes: c3.2xlarge

*1. Persisting 1 billion events from the Smart Home DAS sample*

This test exercised the data layer under sustained event publication.
During testing, throughput hovered around the 150K TPS mark, with the HBase
cluster's memstore flushes (which suspend all writes) and minor compaction
operations pulling it down in bursts. Overall, we achieved a mean of 96K
TPS; with publisher-side flow control in place of the current uncontrolled
publishing, a steady rate of around 100-150K TPS should be achievable.
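The kind of publisher-side flow control mentioned above could be sketched as a simple token bucket. This is a generic illustration only, not DAS's actual publisher API; the `rate_per_s` and `burst` values and the commented-out `publisher.publish` call are hypothetical:

```python
import time

class TokenBucket:
    """Token-bucket throttle: callers block until a publish 'slot'
    is available, smoothing bursts down to a steady target rate."""

    def __init__(self, rate_per_s, burst):
        self.rate = rate_per_s        # tokens replenished per second
        self.capacity = burst         # maximum burst size
        self.tokens = burst
        self.last = time.monotonic()

    def acquire(self, n=1):
        while True:
            now = time.monotonic()
            # Refill tokens in proportion to elapsed time, capped at capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= n:
                self.tokens -= n
                return
            # Not enough tokens yet: sleep just long enough for the deficit.
            time.sleep((n - self.tokens) / self.rate)

# e.g. cap the publisher at a steady 120K events/s:
bucket = TokenBucket(rate_per_s=120_000, burst=10_000)
# for event in events:
#     bucket.acquire()
#     publisher.publish(event)   # hypothetical publish call
```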

The published data took around 950GB on the Hadoop filesystem, taking
HDFS-level replication into account.

Events      1000000000
Time (s)    10391.768
Mean TPS    96230.01591

*2. Analyzing 1 billion events through Spark*

Spark queries from the Smart Home DAS sample were executed against the
published data, with the analyzer node count set to 2 and 3 for two
separate tests. The Spark JVM was given 6 dedicated processor cores and
12GB of memory during this test, and we were able to get a throughput of
over 1M TPS on Spark with 2 analyzers and about 1.3M TPS with 3.

DAS read operations from the HBase cluster also leverage HBase data
locality, which would have made the read process more efficient compared to
random reads.

The mean throughput readings from 3 tests in each case, with a query
involving aggregate functions and GROUP BY, are as follows:

INSERT OVERWRITE TABLE cityUsage
SELECT metro_area,
       avg(power_reading) AS avg_usage,
       min(power_reading) AS min_usage,
       max(power_reading) AS max_usage
FROM smartHomeData
GROUP BY metro_area;


            2 Analyzer Nodes    3 Analyzer Nodes
Records     1000000000          1000000000
Time (s)    958.802             741.152
Mean TPS    1042968.204         1349250.896
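From the figures above, the 2-to-3 node scaling factor works out as follows. The "scaling efficiency" metric is our own derived figure for illustration, not part of the test output:

```python
# Derived throughput figures for the GROUP BY query above
# (records and elapsed times taken from the results table).
records = 1_000_000_000
tps_2_nodes = records / 958.802   # ~1.04M TPS with 2 analyzers
tps_3_nodes = records / 741.152   # ~1.35M TPS with 3 analyzers

speedup = tps_3_nodes / tps_2_nodes   # ~1.29x from adding one node
efficiency = speedup / (3 / 2)        # ~0.86 of perfectly linear scaling

print(round(tps_2_nodes), round(tps_3_nodes),
      round(speedup, 2), round(efficiency, 2))
```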
*3. Persisting the entire Wikipedia corpus*

This test involved publishing the entirety of the Wikipedia dataset, where
a single event comprises one Wikipedia article (16.8M articles in total).
Events vary greatly in size, with the mean being ~3.5KB; hence the
throughput also varies greatly, as expected. Here, we saw a mean
throughput of around 9K TPS:



Events      16753779
Time (s)    1862.901
Mean TPS    8993.381291
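Combining the mean TPS with the quoted ~3.5KB mean event size gives a rough byte-level throughput. This is an approximation, since the 3.5KB figure is itself only a mean over widely varying article sizes:

```python
# Rough sustained byte-level throughput for the Wikipedia persist test.
events = 16_753_779
seconds = 1862.901
mean_event_kb = 3.5                    # approximate mean from the text

tps = events / seconds                 # ~8993 events/s
mb_per_s = tps * mean_event_kb / 1024  # ~30.7 MB/s sustained

print(round(tps), round(mb_per_s, 1))
```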
*4. Indexing the full Wikipedia dataset*

In this test, the data from the Wikipedia dataset was indexed so that the
articles support full-text search through Lucene. Index worker counts of 2
and 4 were tested, with 2 dedicated indexer nodes running the indexing
jobs independently of each other.

The TPS vs. time graph of the first indexer node with 4 dedicated index
worker threads is as below:

[TPS vs. time graph: indexer node 1, 4 index worker threads]
The overall results from both indexer nodes can be summarised as below:

Records 16753779

Node        2 worker threads    4 worker threads
Indexer 1   2198.66 TPS         2268.62 TPS
Indexer 2   4230.75 TPS         3048.91 TPS
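Summing the two indexer nodes' figures gives the aggregate indexing throughput for each worker-thread configuration. This is a derived figure from the table above, not a separately measured result:

```python
# Aggregate indexing throughput across both indexer nodes,
# per worker-thread configuration (per-node figures from the table).
two_workers = 2198.66 + 4230.75    # Indexer 1 + Indexer 2, 2 workers each
four_workers = 2268.62 + 3048.91   # Indexer 1 + Indexer 2, 4 workers each

print(round(two_workers, 2), round(four_workers, 2))
```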


*5. Analyzing the Wikipedia dataset*

Similar to the Smart Home dataset, Spark queries were run against the
published Wikipedia dataset, using analyzer clusters of 2 and 3 nodes
respectively. The results of one of these tests are as follows:

INSERT INTO TABLE wikiContributorSummary
SELECT contributor_username, COUNT(*) AS page_count
FROM wiki
GROUP BY contributor_username;

Records 16753779

            2 Analyzer Nodes    4 Analyzer Nodes
Time (s)    236.107             181.419
TPS         70958.41716         92348.53571

The full findings of this test round may be found in the attached
spreadsheet.

Best regards,

Testing DAS 3.1.0 Performance on a 10-node HBas...
<https://docs.google.com/a/wso2.com/spreadsheets/d/1Ng7pTR0MpSg3Asn02idBIZaq8AhdNd24HySKp37GzK8/edit?usp=drive_web>
-- 
Gokul Balakrishnan
Senior Software Engineer,
WSO2, Inc. http://wso2.com
M +94 77 5935 789 | +44 7563 570502
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
