Many thanks for the suggestions, Jochen. However, I'm not sure that this resolves the issue at the Node.js/V8 interface. I did read up on External String resources a while back. The problem, as I understand it, is that strings passed into a C++ method from JavaScript will be bona fide JavaScript strings over which I have no control. Besides, I'm not sure how you would create an External String resource in the JavaScript environment anyway. The same applies in reverse on the output side: the JavaScript environment will be expecting the C++ method to create a bona fide JavaScript string to pass out as the return value.
Of course, feel free to correct my analysis (preferably with an example or two!) if there is a way to use some other string handling approach that doesn't merely push the same performance issue into a different part of the system.

*** I'm still hopeful that the V8 Development team might take a look at this, as the performance of the standard string handling functions at the JS/C++ interface is absolutely dreadful (as my benchmarks show) and will negatively impact on anyone creating C++ based add-ons for what is otherwise a brilliant JS engine. ***

Thanks again,

Chris.

On Friday, 10 June 2016 09:54:45 UTC+1, Jochen Eisinger wrote:
>
> Thanks for the detailed analysis, very interesting. Some comments inline:
>
> On Fri, Jun 10, 2016 at 10:45 AM <[email protected]> wrote:
>
>> Hello,
>>
>> * This post is mainly for the attention of any V8 Developers who might be
>> monitoring this group. However, all thoughts and suggestions are welcome!
>>
>> First some background. I'm developing an interface to a database
>> specifically for the Node.js environment. The database in question can
>> operate as an embedded entity, so I am using the published Node.js/V8 API
>> to communicate with it via its own C API. I acknowledge that this is a
>> slightly unusual approach in that the interface between JavaScript and
>> other databases is often created over TCP infrastructure. However, with the
>> embedded architecture there is, quite understandably, an expectation of
>> high performance - certainly much higher than what could otherwise be
>> achieved over TCP.
>>
>> The requirement for high performance has led me to spend some time
>> analyzing the throughput of various aspects of the V8 API. As a result, I
>> have found that there is a particular problem/bottleneck in marshaling
>> string based data between the JavaScript environment and C/C++ - which of
>> course is an essential part of establishing close-coupled lines of
>> communication at this level.
>>
>> The following simple benchmark illustrates this performance issue.
>> Basically, I use the following simple Node.js/JavaScript code to call the
>> 'db.benchmark()' method 1000000000 times and record the time taken.
>> Although the code implies that a connection to the database is made, no
>> database is used, or even loaded, in these tests. The source code for the
>> various incarnations of the 'db.benchmark()' method is included together
>> with the timing results obtained.
>>
>> JavaScript Benchmark Code:
>>
>>    var my_database = require('db_api');
>>    var db = new my_database.db_api();
>>    var max = 1000000000;
>>    var d1 = new Date();
>>    var d1_ms = d1.getTime()
>>    console.log("d1: " + d1.toISOString());
>>    for (n = 0; n < max; n ++) {
>>       db.benchmark("Input String");
>>    }
>>    var d2 = new Date();
>>    var d2_ms = d2.getTime()
>>    var diff = Math.abs(d1_ms - d2_ms)
>>    console.log("\nd2: " + d2.toISOString());
>>    console.log("diff: " + diff + " secs: " + (diff / 1000));
>>
>>
>> First, let’s create a baseline by removing the benchmark from the
>> JavaScript code …
>>
>>    //db.benchmark("Input String");
>>
>> Results:
>>    d1: 2016-05-16T09:55:01.919Z
>>    d2: 2016-05-16T09:55:06.589Z
>>    diff: 4670 secs: 4.67
>>
>>
>> Now let’s create a second baseline by adding a benchmark call that does
>> absolutely nothing. The JavaScript call 'db.benchmark("Input String")' is
>> reinstated but the C++ code of the benchmark method does absolutely nothing
>> ...
>>
>>    static void benchmark(const FunctionCallbackInfo<Value>& args)
>>    {
>>
>>       // interact with a database via its C API
>>
>>       return;
>>    }
>>
>> Results:
>>    d1: 2016-05-16T09:59:18.915Z
>>    d2: 2016-05-16T09:59:38.933Z
>>    diff: 20018 secs: 20.018
>>
>> This tells us that calling a C/C++ method/function that does absolutely
>> nothing (no inputs or outputs to process) is moderately expensive on its
>> own.
>>
>>
>> Next, let’s accept a single string argument and copy it to a C character
>> buffer for use at the database API …
>>
>>    static void benchmark(const FunctionCallbackInfo<Value>& args)
>>    {
>>       char c_input[256];
>>       Local<String> input = args[0]->ToString();
>>       input->WriteUtf8(c_input);
>>
>>       // interact with a database via its C API
>>
>>       return;
>>    }
>>
>>
>> Results:
>>    d1: 2016-05-16T10:02:54.355Z
>>    d2: 2016-05-16T10:04:26.832Z
>>    diff: 92477 secs: 92.477
>>
>> We see a significant performance hit in simply marshaling a simple JS
>> string into a C buffer.
>>
>>
>> Next, let’s generate an output for JavaScript derived from a string held
>> in a C buffer …
>>
>>    static void benchmark(const FunctionCallbackInfo<Value>& args)
>>    {
>>       char c_output[256];
>>       Isolate* isolate = args.GetIsolate();
>>       HandleScope scope(isolate);
>>
>>       // interact with a database via its C API
>>
>>       strcpy(c_output, "Output String");
>>       Local<String> output = String::NewFromUtf8(isolate, c_output);
>>       args.GetReturnValue().Set(output);
>>       return;
>>    }
>>
>> Results:
>>    d1: 2016-05-16T10:14:30.399Z
>>    d2: 2016-05-16T10:17:16.905Z
>>    diff: 166506 secs: 166.506
>>
>> Creating a JavaScript string resource from a C buffer is also expensive.
>>
>>
>> Finally, let's put it all together by accepting a string argument and
>> returning a string output …
>>
>>    static void benchmark(const FunctionCallbackInfo<Value>& args)
>>    {
>>       char c_input[256], c_output[256];
>>       Isolate* isolate = args.GetIsolate();
>>       HandleScope scope(isolate);
>>       Local<String> input = args[0]->ToString();
>>       input->WriteUtf8(c_input);
>>
>>       // interact with a database via its C API
>>
>>       strcpy(c_output, "Output String");
>>       Local<String> output = String::NewFromUtf8(isolate, c_output);
>>       args.GetReturnValue().Set(output);
>>       return;
>>    }
>>
>> Results:
>>    d1: 2016-05-16T10:28:47.164Z
>>    d2: 2016-05-16T10:32:58.709Z
>>    diff: 251545 secs: 251.545
>>
>> This last experiment is fairly representative of the 'real world': string
>> based data sent to the database (update operations) and string based data
>> returned (retrieval operations).
>>
>>
>> As you can see, there is a significant cost in marshaling data between C
>> string buffers and internal V8 string data types/constructs (and vice
>> versa). Is there an alternative way of doing this? On the input side,
>> simply getting a pointer to the raw input data would probably work fine for
>> the purpose of interacting with an outgoing C API. Likewise, on the output
>> side, is there a faster way to generate V8 strings from C character buffers?
>>
>
> You can get a pointer to the underlying string using v8::String::Value.
> Note that the string might have to be flattened first, and it's not safe
> to store the pointer across calls to v8.
>
> In the other direction, you can implement a
> v8::String::ExternalStringResource to expose a string to v8 without copying
> it.
>
>
>
>>
>> Alternatively, are there plans to improve the performance of the existing
>> functionality?
>>
>> When developing this software I was expecting calls to the database
>> (particularly update operations) to be the rate limiting part.
>> However, differential benchmarks have demonstrated that operations on the
>> database are, on their own, several orders of magnitude faster than the
>> mechanics of marshaling data between the V8/JavaScript and C/C++
>> environments. This was a bit of a surprise, to be honest. Finally, for the
>> curious, what's the database? It's InterSystems Caché.
>>
>> Many thanks for reading this and thanks in advance for any thoughts or
>> suggestions!
>>
>> Chris.
>>
>> Director
>> M/Gateway Developments Ltd
>> http://www.mgateway.com
>>
>> --
>> --
>> v8-dev mailing list
>> [email protected]
>> http://groups.google.com/group/v8-dev
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "v8-dev" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>
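
As a rough sketch, the elapsed times quoted in this thread can be converted into approximate per-call costs by subtracting the relevant baseline and dividing by the 1000000000 iterations. The figures are copied from the quoted results; the names `elapsedMs`, `perCallNs` and `costs` are illustrative only and are not part of the original benchmark code. The subtraction assumes that loop and call overheads are identical in every run, which the thread does not verify.

```javascript
// Number of iterations: the value of `max` in the quoted benchmark loop.
const CALLS = 1000000000;

// Elapsed times in milliseconds, copied from the quoted "diff:" results.
const elapsedMs = {
  emptyLoop: 4670,        // db.benchmark() call commented out
  emptyCall: 20018,       // C++ method with an empty body
  inputOnly: 92477,       // ToString() + WriteUtf8()
  outputOnly: 166506,     // HandleScope + String::NewFromUtf8() + return value
  inputAndOutput: 251545  // both directions in one call
};

// Extra nanoseconds per call of `variant` relative to `baseline`
// (1 ms = 1e6 ns). Assumes the baseline cost is unchanged between runs.
function perCallNs(variant, baseline) {
  return ((elapsedMs[variant] - elapsedMs[baseline]) * 1e6) / CALLS;
}

const costs = {
  callOverhead: perCallNs("emptyCall", "emptyLoop"),
  inputMarshal: perCallNs("inputOnly", "emptyCall"),
  outputMarshal: perCallNs("outputOnly", "emptyCall"),
  roundTrip: perCallNs("inputAndOutput", "emptyCall")
};

for (const [name, ns] of Object.entries(costs)) {
  console.log(name + ": " + ns.toFixed(1) + " ns per call");
}
```

Under those assumptions, the empty call adds roughly 15 ns over the bare loop, while input marshaling, output marshaling and the combined round trip add roughly 72 ns, 146 ns and 232 ns per call respectively over the empty call.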
