[jira] [Comment Edited] (HBASE-20188) [TESTING] Performance

stack (JIRA) Thu, 22 Mar 2018 06:48:15 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409537#comment-16409537
 ]


stack edited comment on HBASE-20188 at 3/22/18 1:47 PM:
--------------------------------------------------------

[~eshcar] YCSB_load is the YCSB 'load' phase.

One column in the table only. I let the table split naturally; it ends up w/ 
10 regions on a single server. I do not set distribution or value size, i.e. 
defaults. Below are the guts of the script, which I robbed from Sean IIRC.

{code}
RECORD_COUNT=100000000
MAX_TIME=1200
BINDING_ARGS=(-threads 128 -cp ~/conf_hbase -p columnfamily=family)
EXEC_ARGS=(-s -p maxexecutiontime=${MAX_TIME} -jvm-args='-Xmx8192m -Djava.security.egd=file:/dev/./urandom ')
WORKLOAD_ARGS=(-p table=ycsb "${BINDING_ARGS[@]}" "${EXEC_ARGS[@]}")

perf stat "${YCSB}" load "${BINDING}" -P "${WORKLOADS}/workloada" "${WORKLOAD_ARGS[@]}" -p recordcount=${RECORD_COUNT} \
  -p exportfile="${LOGS}/ycsb-load-measurements-${HERE}-${date}.json" -p exporter="${EXPORTER}" \
  >"${LOGS}/ycsb-load-${HERE}-${date}.out" 2>"${LOGS}/ycsb-load-${HERE}-${date}.err"

echo "`date` 50/50 workload a"
time "${YCSB}" run "${BINDING}" -P "${WORKLOADS}/workloada" "${WORKLOAD_ARGS[@]}" -p recordcount=0 -p operationcount=${INSERT_COUNT} \
  -p exportfile="${LOGS}/ycsb-workloada-measurements-${HERE}-${date}.json" -p exporter="${EXPORTER}" \
  >"${LOGS}/ycsb-workloada-${HERE}-${date}.out" 2>"${LOGS}/ycsb-workloada-${HERE}-${date}.err"

echo "`date` 95% read workload b"
perf stat "${YCSB}" run "${BINDING}" -P "${WORKLOADS}/workloadb" "${WORKLOAD_ARGS[@]}" -p recordcount=0 -p operationcount=${INSERT_COUNT} \
  -p exportfile="${LOGS}/ycsb-workloadb-measurements-${HERE}-${date}.json" -p exporter="${EXPORTER}" \
  >"${LOGS}/ycsb-workloadb-${HERE}-${date}.out" 2>"${LOGS}/ycsb-workloadb-${HERE}-${date}.err"

echo "`date` 80% writes"
time "${YCSB}" run "${BINDING}" -P "${WORKLOADS}/workloadw" "${WORKLOAD_ARGS[@]}" -p recordcount=0 -p operationcount=${INSERT_COUNT} \
  -p exportfile="${LOGS}/ycsb-workloadw-measurements-${HERE}-${date}.json" -p exporter="${EXPORTER}" \
  >"${LOGS}/ycsb-workloadw-${HERE}-${date}.out" 2>"${LOGS}/ycsb-workloadw-${HERE}-${date}.err"
{code}
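
The excerpt uses several variables it never sets (YCSB, BINDING, WORKLOADS, LOGS, HERE, date, EXPORTER, INSERT_COUNT); presumably the full script defines them earlier. As a rough sketch only, a guard along these lines could fail fast when one is missing. The function name require_vars is hypothetical and is not part of the original script.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT from the original script: report which of the
# named variables are unset or empty, and return non-zero if any are.
require_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable
    # whose name is stored in $name. ":-" keeps this safe under "set -u".
    if [ -z "${!name:-}" ]; then
      echo "missing required variable: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

It would be called before the first YCSB invocation, for example as: require_vars YCSB BINDING WORKLOADS LOGS HERE date EXPORTER INSERT_COUNT || exit 1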

Let me know what you'd suggest. Going to try w/o in-memory compactions next.

Looking at emitted stats:

||phase||metrics||1.2.7||2.0.0||
|load|Throughput(ops/sec)|19160|16282|
|load|Operations|2.2998601E7|1.9542989E7|
|load|AverageLatency(us)|6668|7847|
|a(50/50)|Throughput(ops/sec)|40717|50867|
|b(95r/5)|Throughput(ops/sec)|88877|52659|
|w(20/80w)|Throughput(ops/sec)|28955|46452|
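
A quick cross-check of the load rows, offered as a sketch rather than as part of the original run: dividing Operations by Throughput gives the implied elapsed time. If that lands on MAX_TIME (1200, taken from the script above), the load phase most likely stopped at the maxexecutiontime cap rather than inserting all RECORD_COUNT rows. The function name implied_seconds is made up for this illustration; the numbers are copied from the table.

```shell
#!/usr/bin/env bash
# Implied elapsed seconds for a phase: operations / throughput (ops/sec).
# Hypothetical helper for illustration; inputs are taken from the table.
implied_seconds() {
  # $1 = operations, $2 = throughput in ops/sec
  awk -v ops="$1" -v tput="$2" 'BEGIN { printf "%.0f\n", ops / tput }'
}

MAX_TIME=1200  # from the script excerpt
# 2.2998601E7 and 1.9542989E7 from the table, written out as integers.
echo "1.2.7 load: $(implied_seconds 22998601 19160)s (cap ${MAX_TIME}s)"
echo "2.0.0 load: $(implied_seconds 19542989 16282)s (cap ${MAX_TIME}s)"
```

Both versions come out at roughly 1200 seconds, so the Operations figures (about 23M and 19.5M) appear to be what fit inside the time cap, well short of the 100M RECORD_COUNT.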

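Checked failures; about the same in both cases. Seems like the write-heavy 
mixes are better on 2.0.0 (workload w: 46452 vs 28955 ops/sec) while reads are 
worse (workload b: 52659 vs 88877 ops/sec). The load phase is also about 15% 
slower on 2.0.0 (16282 vs 19160 ops/sec).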
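
To put the throughput rows on a common footing, here is a small sketch that computes the relative change from 1.2.7 to 2.0.0 for each phase. The function name pct_change is invented for this example; the figures are the Throughput(ops/sec) values from the table.

```shell
#!/usr/bin/env bash
# Signed percent change from the 1.2.7 value to the 2.0.0 value.
# Hypothetical helper for illustration only.
pct_change() {
  # $1 = 1.2.7 throughput, $2 = 2.0.0 throughput
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%+.1f%%\n", (new - old) / old * 100 }'
}

# phase, 1.2.7 ops/sec, 2.0.0 ops/sec (copied from the table)
while read -r phase old new; do
  printf '%-10s %s\n' "$phase" "$(pct_change "$old" "$new")"
done <<'EOF'
load 19160 16282
a(50/50) 40717 50867
b(95r/5) 88877 52659
w(20/80w) 28955 46452
EOF
```

This prints -15.0% for load, +24.9% for workload a, -40.8% for workload b, and +60.4% for workload w.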




> [TESTING] Performance
> ---------------------
>
>                 Key: HBASE-20188
>                 URL: https://issues.apache.org/jira/browse/HBASE-20188
>             Project: HBase
>          Issue Type: Umbrella
>          Components: Performance
>            Reporter: stack
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: ITBLL2.5B_1.2.7vs2.0.0_cpu.png, 
> ITBLL2.5B_1.2.7vs2.0.0_gctime.png, ITBLL2.5B_1.2.7vs2.0.0_iops.png, 
> ITBLL2.5B_1.2.7vs2.0.0_load.png, ITBLL2.5B_1.2.7vs2.0.0_memheap.png, 
> ITBLL2.5B_1.2.7vs2.0.0_memstore.png, ITBLL2.5B_1.2.7vs2.0.0_ops.png, 
> ITBLL2.5B_1.2.7vs2.0.0_ops_NOT_summing_regions.png, YCSB_CPU.png, 
> YCSB_GC_TIME.png, YCSB_MEMSTORE.png, YCSB_OPs.png, YCSB_load.png, 
> flamegraph-1072.1.svg, flamegraph-1072.2.svg, tree.txt
>
>
> How does 2.0.0 compare to old versions? Is it faster, slower? There is rumor 
> that it is much slower, that the problem is the asyncwal writing. Does 
> in-memory compaction slow us down or speed us up? What happens when you 
> enable offheaping?
> Keep notes here in this umbrella issue. Need to be able to say something 
> about perf when 2.0.0 ships.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
