Hello,
* This post is mainly for the attention of any V8 Developers who might be
monitoring this group. However, all thoughts and suggestions are welcome!
First, some background. I'm developing an interface to a database
specifically for the Node.js environment. The database in question can
operate as an embedded entity, so I am using the published Node.js/V8 API
to communicate with it via its own C API. I acknowledge that this is a
slightly unusual approach in that the interface between JavaScript and
other databases is often created over TCP infrastructure. However, with the
embedded architecture there is, quite understandably, an expectation of
high performance - certainly much higher than what could otherwise be
achieved over TCP.
The requirement for high performance has led me to spend some time
analyzing the throughput of various aspects of the V8 API. As a result, I
have found that there is a particular problem/bottleneck in marshaling
string-based data between the JavaScript environment and C/C++ - which of
course is an essential part of establishing close-coupled lines of
communication at this level.
The following simple benchmark illustrates this performance issue.
Basically, I use the following simple Node.js/JavaScript code to call the
'db.benchmark()' method 1,000,000,000 times and record the time taken.
Although the code implies that a connection to the database is made, no
database is used, or even loaded, in these tests. The source code for the
various incarnations of the 'db.benchmark()' method is included together
with the timing results obtained.
JavaScript Benchmark Code:
var my_database = require('db_api');
var db = new my_database.db_api();
var max = 1000000000;
var d1 = new Date();
var d1_ms = d1.getTime();
console.log("d1: " + d1.toISOString());
for (var n = 0; n < max; n++) {
   db.benchmark("Input String");
}
var d2 = new Date();
var d2_ms = d2.getTime();
var diff = d2_ms - d1_ms;
console.log("\nd2: " + d2.toISOString());
console.log("diff: " + diff + " secs: " + (diff / 1000));
First, let’s create a baseline by commenting out the benchmark call in the
JavaScript code …
//db.benchmark("Input String");
Results:
d1: 2016-05-16T09:55:01.919Z
d2: 2016-05-16T09:55:06.589Z
diff: 4670 secs: 4.67
Now let’s create a second baseline by reinstating the JavaScript call
'db.benchmark("Input String")' but making the C++ implementation of the
benchmark method an empty stub that does absolutely nothing ...
// (assumes the usual Node.js addon boilerplate: #include <node.h> and
// 'using namespace v8;')
static void benchmark(const FunctionCallbackInfo<Value>& args)
{
   // interact with a database via its C API
   return;
}
Results:
d1: 2016-05-16T09:59:18.915Z
d2: 2016-05-16T09:59:38.933Z
diff: 20018 secs: 20.018
This tells us that calling a C/C++ method/function that does absolutely
nothing (no inputs or outputs to process) is moderately expensive on its
own.
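Subtracting the empty-loop baseline, that is (20018 - 4670) ms spread over
1,000,000,000 calls, i.e. roughly 15 ns per no-op native call.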
Next, let’s accept a single string argument and copy it to a C character
buffer for use with the database API …
static void benchmark(const FunctionCallbackInfo<Value>& args)
{
   char c_input[256];
   // copy the JavaScript string argument into a C buffer (no length check
   // needed here: the benchmark always passes the same short, fixed input)
   Local<String> input = args[0]->ToString();
   input->WriteUtf8(c_input);
   // interact with a database via its C API
   return;
}
Results:
d1: 2016-05-16T10:02:54.355Z
d2: 2016-05-16T10:04:26.832Z
diff: 92477 secs: 92.477
We see a significant performance hit simply from marshaling a JavaScript
string into a C buffer.
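Relative to the no-op baseline, that is (92477 - 20018) ms over
1,000,000,000 calls, i.e. roughly 72 ns per call just for the
ToString()/WriteUtf8() step.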
Next, let’s generate an output for JavaScript derived from a string held in
a C buffer …
static void benchmark(const FunctionCallbackInfo<Value>& args)
{
   char c_output[256];
   Isolate* isolate = args.GetIsolate();
   HandleScope scope(isolate);
   // interact with a database via its C API
   strcpy(c_output, "Output String");
   Local<String> output = String::NewFromUtf8(isolate, c_output);
   args.GetReturnValue().Set(output);
   return;
}
Results:
d1: 2016-05-16T10:14:30.399Z
d2: 2016-05-16T10:17:16.905Z
diff: 166506 secs: 166.506
Creating a JavaScript string resource from a C buffer is also expensive.
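Relative to the no-op baseline, that is (166506 - 20018) ms over
1,000,000,000 calls, i.e. roughly 146 ns per call for NewFromUtf8() plus
setting the return value.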
Finally, let's put it all together by accepting a string argument and
returning a string output …
static void benchmark(const FunctionCallbackInfo<Value>& args)
{
   char c_input[256], c_output[256];
   Isolate* isolate = args.GetIsolate();
   HandleScope scope(isolate);
   Local<String> input = args[0]->ToString();
   input->WriteUtf8(c_input);
   // interact with a database via its C API
   strcpy(c_output, "Output String");
   Local<String> output = String::NewFromUtf8(isolate, c_output);
   args.GetReturnValue().Set(output);
   return;
}
Results:
d1: 2016-05-16T10:28:47.164Z
d2: 2016-05-16T10:32:58.709Z
diff: 251545 secs: 251.545
This last experiment is fairly representative of the 'real world':
string-based data sent to the database (update operations) and string-based
data returned (retrieval operations).
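Relative to the no-op baseline this works out at roughly 232 ns per call,
which is broadly consistent with the sum of the input (~72 ns) and output
(~146 ns) costs measured above.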
As you can see, there is a significant cost in marshaling data between C
string buffers and internal V8 string data types/constructs (and vice
versa). Is there an alternative way of doing this? On the input side,
simply getting a pointer to the raw input data would probably work fine for
the purpose of interacting with an outgoing C API. Likewise, on the output
side, is there a faster way to generate V8 strings from C character buffers?
Alternatively, are there plans to improve the performance of the existing
functionality?
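For what it's worth, here is a rough sketch of the kind of alternative I have
in mind, using only calls that exist in the V8 headers shipped with recent
Node.js. It assumes the payloads are plain ASCII/Latin-1 (true for this
benchmark) and I haven't verified how much it actually saves; the intent is
simply to avoid the UTF-8 transcoding on input and to give V8 an explicit
length on output so it doesn't have to walk the buffer itself …
static void benchmark(const FunctionCallbackInfo<Value>& args)
{
   char c_input[256], c_output[256];
   Isolate* isolate = args.GetIsolate();
   HandleScope scope(isolate);
   // Input: if the string is one-byte (Latin-1/ASCII), copy the raw
   // characters with WriteOneByte() rather than transcoding to UTF-8.
   Local<String> input = args[0]->ToString();
   int len = input->Length();
   if (len < (int) sizeof(c_input) - 1 && input->ContainsOnlyOneByte()) {
      input->WriteOneByte((uint8_t *) c_input, 0, len, String::NO_NULL_TERMINATION);
      c_input[len] = '\0';
   }
   else {
      input->WriteUtf8(c_input);
   }
   // interact with a database via its C API
   strcpy(c_output, "Output String");
   int out_len = (int) strlen(c_output);  // in practice the C API would supply this
   // Output: build the result as a one-byte string with an explicit length,
   // so V8 performs neither a strlen() nor a UTF-8 decode.
   Local<String> output = String::NewFromOneByte(
      isolate, (const uint8_t *) c_output,
      NewStringType::kNormal, out_len).ToLocalChecked();
   args.GetReturnValue().Set(output);
   return;
}
Even so, this still involves one full copy in each direction, which is why
I'm wondering whether there is a sanctioned way to get at the raw string
storage directly, or to hand V8 an external string resource, without going
through these copies.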
When developing this software I was expecting calls to the database
(particularly update operations) to be the rate-limiting part. However,
differential benchmarks have demonstrated that operations on the database
are, on their own, several orders of magnitude faster than the mechanics of
marshaling data between the V8/JavaScript and C/C++ environments. This was
a bit of a surprise, to be honest. Finally, for the curious, what's the
database? It's InterSystems Caché.
Many thanks for reading this and thanks in advance for any thoughts or
suggestions!
Chris.
Director
M/Gateway Developments Ltd
http://www.mgateway.com