[ 
https://issues.apache.org/jira/browse/METRON-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15957203#comment-15957203
 ] 

Nick Allen commented on METRON-822:
-----------------------------------

h2. Performance Test

h3. Test Environment

The test used two hosts, y137 and y138, both Cisco UCS boxes with 10G Cisco 
VNICs.  A point-to-point fiber link connects `enp9s0f0` on the two hosts, and a 
traditional switched link connects them on `enp10s0f0`.  Packets are generated 
on y137 and sent out `enp9s0f0`, then captured on y138 on `enp9s0f0` by 
Fastcapa.  Fastcapa then uses `enp10s0f0` to batch those packets and send them 
to Kafka running on y137.

h3. Results

The following shows the Fastcapa probe successfully capturing ~1.1 Gbps for 
roughly 10 minutes.  All metrics were in line with expectations and no 
significant backlog built up.  Effectively, no packets were dropped.  The small 
deltas shown below (Fastcapa actually counted 351 more packets than were 
generated) are rounding error and noise on the line caused by a misconfigured 
NIC.  The test was successful.

||Event||Packet Count||Delta vs. previous row||
| Packets Generated | 126,925,885 | - |
| Received @ Fastcapa | 126,926,236 | -351 |
| Received @ Kafka | 126,926,235 | ~0 |
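
The deltas in the table can be reproduced with simple arithmetic.  The sketch below assumes each delta is the previous row's count minus the current row's count, which matches the -351 shown.  The function names `delta` and `delta_pct` are illustrative only and are not part of Fastcapa.

```shell
# Delta between two pipeline stages: upstream count minus downstream count.
# A negative value means the downstream stage counted MORE packets.
delta() {
  echo $(( $1 - $2 ))
}

# The same delta as a percentage of the upstream count, to 6 decimal places.
delta_pct() {
  awk -v up="$1" -v down="$2" 'BEGIN { printf "%.6f\n", (up - down) * 100 / up }'
}

# Counts taken from the results table.
generated=126925885
fastcapa=126926236
kafka=126926235

echo "generated -> fastcapa: $(delta "$generated" "$fastcapa") ($(delta_pct "$generated" "$fastcapa")%)"
echo "fastcapa  -> kafka:    $(delta "$fastcapa" "$kafka") ($(delta_pct "$fastcapa" "$kafka")%)"
```

Expressed as a percentage, both deltas are well under a thousandth of a percent, which is why they are treated as noise rather than loss.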

h3. Steps

Create a Kafka topic with 256 partitions on the single broker (replication factor 1).
{code}
[root@y137 ~]# kafka-topics.sh --zookeeper y137:2181 --create --topic pcap256 --partitions 256 --replication-factor 1
Created topic "pcap256".
{code}

Count the packets in the topic before the test.  The count is simply the sum 
of the latest offsets across all partitions.
{code}
[root@y137 ~]# kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list y137:9092 --topic pcap256 --time -1 | tail -n +2 | awk -F":" '{s+=$3} END {print s}'
  0
{code}
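
As a rough illustration of the offset-based count, the sketch below sums the third field of `topic:partition:offset` lines, the format GetOffsetShell prints.  The `sum_offsets` helper is hypothetical, and the sample offsets are made up for the example rather than taken from the test.  Unlike the `tail -n +2` in the command above, it filters lines by shape instead of by position.

```shell
# Sum the latest offsets from GetOffsetShell-style output on stdin.
# Only lines shaped like "<topic>:<partition>:<offset>" are counted, so any
# other line (for example a warning) is ignored rather than skipped by position.
sum_offsets() {
  awk -F":" 'NF == 3 && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ { s += $3 } END { print s + 0 }'
}

# Illustrative input only; these offsets are invented for the example.
printf '%s\n' 'pcap256:0:10' 'pcap256:1:20' 'pcap256:2:12' | sum_offsets
```

Printing `s + 0` makes an empty topic list report 0 instead of an empty line.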

Start Fastcapa
* Rx descriptors limited to 512 due to VNIC
* Rx queues limited to 2 due to VNIC
* All other parameters as shown in log
{code}
[root@y138 fastcapa]# fastcapa -l 0,1,2,3,4,5,6,7 --huge-dir /mnt/huge_1GB -- -t pcap256 -c /etc/fastcapa.y137 -r 512 -q 2 -x 32768
EAL: Detected 32 lcore(s)
EAL: Probing VFIO support...
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL:   probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:01:00.1 on NUMA socket 0
EAL:   probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:09:00.0 on NUMA socket 0
EAL:   probe driver: 1137:43 net_enic
PMD: rte_enic_pmd: Advanced Filters not available
PMD: rte_enic_pmd: vNIC MAC addr 00:35:1a:cd:6c:ad wq/rq 256/512 mtu 1500, max mtu:9004
PMD: rte_enic_pmd: vNIC csum tx/rx yes/yes rss yes intr mode any type min timer 125 usec loopback tag 0x0000
PMD: rte_enic_pmd: vNIC resources avail: wq 1 rq 4 cq 5 intr 8
EAL: PCI device 0000:0a:00.0 on NUMA socket 0
EAL:   probe driver: 1137:43 net_enic
[ -t KAFKA_TOPIC ] defined as pcap256
[ -c KAFKA_CONFIG ] defined as /etc/fastcapa.y137
[ -r NB_RX_DESC ] defined as 512
[ -q NB_RX_QUEUE ] defined as 2
[ -x TX_RING_SIZE ] defined as 32768
[ -p PORT_MASK ] undefined; defaulting to 0x01
[ -b BURST_SIZE ] undefined; defaulting to 32
USER1: Initializing port 0
PMD: rte_enic_pmd: Rq 0 Scatter rx mode enabled
PMD: rte_enic_pmd: Rq 0 Scatter rx mode not being used
PMD: rte_enic_pmd: Using 512 rx descriptors (sop 512, data 0)
PMD: rte_enic_pmd: Rq 1 Scatter rx mode enabled
PMD: rte_enic_pmd: Rq 1 Scatter rx mode not being used
PMD: rte_enic_pmd: Using 512 rx descriptors (sop 512, data 0)
PMD: rte_enic_pmd: TX Queues - effective number of descs:32
PMD: rte_enic_pmd: vNIC resources used:  wq 1 rq 4 cq 3 intr 0
USER1: Device setup successfully; port=0, mac=00 35 1a cd 6c ad
USER1: Launching receive worker; worker=0, core=1, queue=0
USER1: Receive worker started; core=1, socket=0, queue=0 attempts=32
USER1: Launching receive worker; worker=1, core=2, queue=1
USER1: Receive worker started; core=2, socket=0, queue=1 attempts=32
USER1: Launching transmit worker; worker=0, core=3 ring=0
USER1: Launching transmit worker; worker=1, core=4 ring=1
USER1: Transmit worker started; core=3, socket=0
USER1: Transmit worker started; core=4, socket=0
USER1: Launching transmit worker; worker=2, core=5 ring=0
USER1: Launching transmit worker; worker=3, core=6 ring=1
USER1: Transmit worker started; core=5, socket=0
USER1: Launching transmit worker; worker=4, core=7 ring=0
USER1: Transmit worker started; core=6, socket=0
USER1: Starting to monitor workers; core=0, socket=0
USER1: Transmit worker started; core=7, socket=0


      ----- in -----  --- queued --- ----- out ----- ---- drops ----
[nic]               2               -               -               -
[rx]                2               -               2               0
[tx]                2               -               2               0
[kaf]               2               0               1               0

...
{code}
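
The worker layout in the log above (core 0 monitoring, cores 1-2 receiving, cores 3-7 transmitting with rings alternating 0 and 1) can be reproduced with a small sketch.  This is only inferred from the log output and is not Fastcapa's actual code.  The `plan_workers` function and its assignment rule are assumptions for illustration.

```shell
# Print a worker layout like the one seen in the Fastcapa log.
# ASSUMPTION (inferred from the log, not from the Fastcapa source):
#   - the first core monitors,
#   - the next NB_RX_QUEUE cores each receive from one queue,
#   - every remaining core transmits, with rings assigned round-robin.
# Usage: plan_workers NB_RX_QUEUE CORE [CORE...]   (NB_RX_QUEUE must be >= 1)
plan_workers() {
  nb_rx_queue=$1; shift
  echo "monitor core=$1"; shift
  rx=0
  while [ "$rx" -lt "$nb_rx_queue" ] && [ "$#" -gt 0 ]; do
    echo "rx worker=$rx core=$1 queue=$rx"
    rx=$((rx + 1)); shift
  done
  tx=0
  for core in "$@"; do
    echo "tx worker=$tx core=$core ring=$((tx % nb_rx_queue))"
    tx=$((tx + 1))
  done
}

# Same inputs as the test run: -q 2 and -l 0,1,2,3,4,5,6,7
plan_workers 2 0 1 2 3 4 5 6 7
```

With these inputs the sketch yields two receive workers and five transmit workers, matching the worker, core, queue and ring numbers in the log.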

Start packet generator.
{code}
[root@y137 ~]# time tcpreplay -i enp9s0f0 --loop=0 --stats=5 --preload-pcap --mbps 1100 example.pcap
File Cache is enabled
Actual: 1055541 packets (687500724 bytes) sent in 5.00 seconds.
Rated: 137499900.0 Bps, 1099.99 Mbps, 211107.94 pps
...
{code}

Allow test to proceed for roughly 10 minutes, then stop packet generator.
{code}
...

Actual: 126663236 packets (82500059550 bytes) sent in 600.00 seconds.
Rated: 137499900.0 Bps, 1099.99 Mbps, 211105.24 pps
^C User interrupt...
sendpacket_abort
Actual: 126925885 packets (82671152125 bytes) sent in 601.02 seconds.
Rated: 137499900.0 Bps, 1099.99 Mbps, 211105.16 pps
Flows: 68 flows, 0.11 fps, 126926429 flow packets, 0 non-flow
Statistics for network device: enp9s0f0
        Successful packets:        126925885
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0

real    10m1.533s
user    7m17.506s
sys     2m44.173s
{code}
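
The reported rate can be sanity-checked from the byte and packet counts.  The sketch below uses the figures from the 600-second "Actual:" line.  The helper names `mbps` and `avg_packet_bytes` are illustrative.  Because tcpreplay prints rounded elapsed times, values recomputed this way may differ slightly from its own "Rated:" line.

```shell
# Throughput in Mbps from a byte count and elapsed seconds.
mbps() {
  awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", bytes * 8 / secs / 1000000 }'
}

# Average packet size in bytes from a byte count and a packet count.
avg_packet_bytes() {
  awk -v bytes="$1" -v pkts="$2" 'BEGIN { printf "%.2f\n", bytes / pkts }'
}

# Figures from the 600-second "Actual:" line in the tcpreplay output.
mbps 82500059550 600.00
avg_packet_bytes 82500059550 126663236
```

The average packet size explains the packet rate: roughly 137.5 MB/s divided by about 651 bytes per packet is on the order of 211,000 packets per second.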

Stop Fastcapa
{code}
...
      ----- in -----  --- queued --- ----- out ----- ---- drops ----
[nic]       126926236               -               -               -
[rx]        126926236               -       126926236               0
[tx]        126926236               -       126926236               0
[kaf]       126926236               1       126926235               0
^CUSER1: Exiting on signal '2'
USER1: Transmit worker finished; core=5, socket=0
USER1: Finished monitoring workers; core=0, socket=0
USER1: Receive worker finished; core=2, socket=0, queue=1
USER1: Receive worker finished; core=1, socket=0, queue=0
USER1: Transmit worker finished; core=4, socket=0
USER1: Transmit worker finished; core=6, socket=0
USER1: Transmit worker finished; core=7, socket=0
USER1: Transmit worker finished; core=3, socket=0
USER1: Closing all Kafka connections
USER1: '0' message(s) queued on rdkafka#producer-1
USER1: '0' message(s) queued on rdkafka#producer-2
USER1: '0' message(s) queued on rdkafka#producer-3
USER1: '0' message(s) queued on rdkafka#producer-4
USER1: '1' message(s) queued on rdkafka#producer-5
USER1: All messages cleared on rdkafka#producer-1
USER1: All messages cleared on rdkafka#producer-2
USER1: All messages cleared on rdkafka#producer-3
USER1: All messages cleared on rdkafka#producer-4
USER1: Waiting for '1' message(s) on rdkafka#producer-5
USER1: All messages cleared on rdkafka#producer-5
{code}

Count the packets in the topic after the test.
{code}
[root@y137 ~]# kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list y137:9092 --topic pcap256 --time -1 | tail -n +2 | awk -F":" '{s+=$3} END {print s}'
  126926243
{code}

h3. Configuration

Code under test
{code}
[root@y138 fastcapa]# git log
commit 324b487a98f1d4efa99b737a29c3150d2f3b1d53
Author: Nick Allen <[email protected]>
Date:   Wed Apr 5 14:34:02 2017 +0000

    Added multi-burst for receive workers

commit 2492bc70833f40512b7bfbac34d03ef3f862242c
Author: Nick Allen <[email protected]>
Date:   Tue Apr 4 21:00:42 2017 -0400

    Updated README

commit 6410229e6a6a031f470d01ced2eb316d43bae29a
Author: Nick Allen <[email protected]>
Date:   Tue Mar 21 13:51:55 2017 -0400

    METRON-822 Improve Fastcapa Performance
{code}

Fastcapa configuration
{code}
[root@y138 fastcapa]# cat /etc/fastcapa.y137 | grep -e "^[^#;]"
[kafka-global]
metadata.broker.list = y137.l42scl.hortonworks.com:9092
queue.buffering.max.messages = 5000000
compression.codec = snappy
batch.num.messages = 500000
message.max.bytes = 1000000000
statistics.interval.ms = 5000
socket.timeout.ms = 6000
[kafka-topic]
request.required.acks = 1
{code}

Kafka broker configuration
* 1 Broker
* 4 disks
* Mostly Kafka defaults
{code}
[root@y137 ~]# cat /usr/hdp/current/kafka-broker/conf/server.properties | grep -e "^[^#]"
broker.id=0
advertised.listeners=PLAINTEXT://y137:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/metron1/kafka-logs,/metron2/kafka-logs,/metron3/kafka-logs,/metron4/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
{code}

> Improve Fastcapa Performance
> ----------------------------
>
>                 Key: METRON-822
>                 URL: https://issues.apache.org/jira/browse/METRON-822
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Nick Allen
>            Assignee: Nick Allen
>
> Improve the performance and scalability of the Fastcapa probe.



