400Ping commented on PR #796:
URL: https://github.com/apache/mahout/pull/796#issuecomment-3710225556
Before (dev-qdp):
```
$ python ./qdp-python/benchmark/benchmark_throughput.py
Generating 12800 samples of 16 qubits...
Batch size : 64
Vector length: 65536
Batches : 200
Prefetch : 16
Frameworks : pennylane, qiskit, mahout
Generated 12800 samples
PennyLane/Qiskit format: 6400.00 MB
Mahout format: 6400.00 MB
======================================================================
DATALOADER THROUGHPUT BENCHMARK: 16 Qubits, 12800 Samples
======================================================================
[PennyLane] Full Pipeline (DataLoader -> GPU)...
/home/jay/work/mahout/qdp/./qdp-python/benchmark/benchmark_throughput.py:170:
UserWarning: Casting complex values to real discards the imaginary part
(Triggered internally at /pytorch/aten/src/ATen/native/Copy.cpp:309.)
state_gpu = state_cpu.to("cuda", dtype=torch.float32)
Total Time: 6.7562 s (1894.6 vectors/sec)
[Qiskit] Full Pipeline (DataLoader -> GPU)...
Total Time: 848.2128 s (15.1 vectors/sec)
[Mahout] Full Pipeline (DataLoader -> GPU)...
IO + Encode Time: 9.7979 s
Total Time: 9.7979 s (1306.4 vectors/sec)
======================================================================
THROUGHPUT (Higher is Better)
Samples: 12800, Qubits: 16
======================================================================
PennyLane 1894.6 vectors/sec
Mahout 1306.4 vectors/sec
Qiskit 15.1 vectors/sec
----------------------------------------------------------------------
Speedup vs PennyLane: 0.69x
Speedup vs Qiskit: 86.57x
```
After:
```
$ python ./qdp-python/benchmark/benchmark_throughput.py
Generating 12800 samples of 16 qubits...
Batch size : 64
Vector length: 65536
Batches : 200
Prefetch : 16
Frameworks : pennylane, qiskit, mahout
Generated 12800 samples
PennyLane/Qiskit format: 6400.00 MB
Mahout format: 6400.00 MB
======================================================================
DATALOADER THROUGHPUT BENCHMARK: 16 Qubits, 12800 Samples
======================================================================
[PennyLane] Full Pipeline (DataLoader -> GPU)...
/home/jay/work/mahout/qdp/./qdp-python/benchmark/benchmark_throughput.py:169:
UserWarning: Casting complex values to real discards the imaginary part
(Triggered internally at /pytorch/aten/src/ATen/native/Copy.cpp:309.)
state_gpu = state_cpu.to("cuda", dtype=torch.float32)
Total Time: 6.7298 s (1902.0 vectors/sec)
[Qiskit] Full Pipeline (DataLoader -> GPU)...
Total Time: 854.9839 s (15.0 vectors/sec)
[Mahout] Full Pipeline (DataLoader -> GPU)...
IO + Encode Time: 3.6776 s
Total Time: 3.6776 s (3480.5 vectors/sec)
======================================================================
THROUGHPUT (Higher is Better)
Samples: 12800, Qubits: 16
======================================================================
Mahout 3480.5 vectors/sec
PennyLane 1902.0 vectors/sec
Qiskit 15.0 vectors/sec
----------------------------------------------------------------------
Speedup vs PennyLane: 1.83x
Speedup vs Qiskit: 232.48x
```
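For anyone cross-checking the numbers, the throughput and speedup lines can be re-derived from the logged "Total Time" values. This is a minimal sketch, assuming throughput is samples divided by total time and speedup is the ratio of total times. The helper names `throughput` and `speedup_vs` are illustrative and not taken from `benchmark_throughput.py`.

```python
# Recompute throughput and speedups from the "Total Time" values logged above.
# Assumption: the benchmark derives both figures from total wall-clock time.
SAMPLES = 12800

# Total times in seconds, copied from the two runs.
before = {"PennyLane": 6.7562, "Qiskit": 848.2128, "Mahout": 9.7979}
after = {"PennyLane": 6.7298, "Qiskit": 854.9839, "Mahout": 3.6776}


def throughput(total_seconds: float, samples: int = SAMPLES) -> float:
    """Vectors processed per second for one full pipeline run."""
    return samples / total_seconds


def speedup_vs(times: dict, baseline: str, target: str = "Mahout") -> float:
    """How many times faster `target` is than `baseline` (ratio of total times)."""
    return times[baseline] / times[target]


for label, times in (("before", before), ("after", after)):
    print(
        f"{label}: Mahout {throughput(times['Mahout']):.1f} vectors/sec, "
        f"{speedup_vs(times, 'PennyLane'):.2f}x vs PennyLane, "
        f"{speedup_vs(times, 'Qiskit'):.2f}x vs Qiskit"
    )

# Improvement of the Mahout pipeline itself between the two runs.
print(f"Mahout before -> after: {before['Mahout'] / after['Mahout']:.2f}x")
```

The ratios reproduce the logged figures (0.69x and 86.57x before, 1.83x and 232.48x after). The Mahout pipeline itself comes out roughly 2.66x faster between the two runs, while the PennyLane and Qiskit times stay within about 1% of each other.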