Hi, Will:

Glad to hear you had success with it -- thanks for working through the hiccups.

In the next release, require may change to work out of the box for a *.js 
library (that is, a module with the application/javascript mime type).

You have to use an absolute path to the library in the Modules database because 
there's no equivalent to a global or local node_modules directory at present. 
The db.config.extlibs.write() API gives you that absolute path for free, but it 
is opinionated (rooted at /ext). Writing directly to the Modules database is 
possible but a sharp tool.  An arbitrary write could conflict with 
configuration managed by the REST API (such as transforms, resource extensions, 
and so on).
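
As a rough sketch of that path convention: the helper below is hypothetical (it is not part of any MarkLogic API) and simply maps the path handed to db.config.extlibs.write() onto the absolute path you would then pass to require(), assuming the library lands under /ext as described above. The call shape shown in the comments is an assumption and should be checked against the client API documentation.

```javascript
// Hypothetical helper, not part of the MarkLogic client API.
// Assumes extlibs.write() roots installed libraries at /ext.
function extlibRequirePath(libPath) {
  // Strip leading slashes so the result has exactly one slash after /ext.
  const trimmed = String(libPath).replace(/^\/+/, '');
  return '/ext/' + trimmed;
}

// Assumed call shape on the client side (not executed here):
//   db.config.extlibs.write({
//     path: '/lodash.sjs',
//     contentType: 'application/vnd.marklogic-javascript',
//     source: fs.createReadStream('lodash.js')
//   });
// The transform on the server would then use the absolute path:
//   var _ = require('/ext/lodash.sjs');

console.log(extlibRequirePath('/lodash.sjs'));
```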

Anyway, please keep the feedback coming.


Erik Hennum

________________________________
From: [email protected] 
[[email protected]] on behalf of Will Lawrence 
[[email protected]]
Sent: Monday, July 13, 2015 8:48 PM
To: [email protected]
Subject: Re: [MarkLogic Dev General] Can node libraries be installed 
server-side?

Thanks, Erik.

It helped me get in the right frame of mind when thinking critically about where 
certain ingestion logic should reside. And thanks for digging into the example 
of node-xlsx and pointing out that it's an async wrapper built on an underlying 
sync library. I definitely looked at the binary extract for xlsx and the Open 
Office pipeline, but these seem to allow only coarse-grained text searches. I 
need to be able to create indexes and run fine-grained queries on the data. 
Plus, xlsx has the nasty behavior of putting any repeated strings into a 
separate sharedStrings.xml file, and there didn't seem to be any MarkLogic 
server-side solution to remedy this. I also need to automate, or at least 
control, the shredding process from an external tier as much as possible 
because there will be a lot of different sets of xlsx files. I'm thinking of 
massaging the xlsx into JSON, sending it to MarkLogic, and using CPF to split 
each "row" into a document, since the transform function can't do an 
xdmp.documentInsert().
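
To make the row-splitting idea concrete, here is a minimal sketch in plain JavaScript. Everything in it is an assumption for illustration: the function name, the input shape (an array of rows with the header row first), and the URI scheme are all hypothetical, and it only builds descriptors without writing anything to the database.

```javascript
// Hypothetical helper: turns a sheet (array of rows, header row first)
// into one { uri, content } descriptor per data row.
function rowsToDocuments(sheetName, rows) {
  const [header, ...dataRows] = rows;
  return dataRows.map((row, i) => {
    const content = {};
    header.forEach((column, c) => {
      // Missing cells become null so every document has the same keys.
      content[column] = row[c] === undefined ? null : row[c];
    });
    // The URI scheme is an assumption; row numbers are 1-based.
    return {
      uri: '/sheets/' + sheetName + '/row-' + (i + 1) + '.json',
      content: content
    };
  });
}

const docs = rowsToDocuments('inventory', [
  ['sku', 'qty'],
  ['A-1', 3],
  ['B-2']
]);
console.log(JSON.stringify(docs, null, 2));
```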

OK, back to the node/npm/JavaScript libraries. I just came across a 
knowledgebase page that offers additional explanation, much of which you 
already nailed:
<https://help.marklogic.com/knowledgebase/article/View/222/0/server-side-javascript-implementation-and-module-reuse>
I've also included my troubleshooting steps for requiring a library server 
side, using 'lodash.js' as the example.

I tried to send lodash.js to the modules database and then use it in a 
transform with a `require("lodash.js")` statement, but it failed with:

"message": "JS-JAVASCRIPT: var _ = require('lodash.js'); -- Error running 
JavaScript request: XDMP-NOEXECUTE: Document is not of executable mimetype. 
URI: lodash.js

So, I needed to write it as lodash.sjs and use require("lodash.sjs"). But then 
this failed with:

"message": "JS-JAVASCRIPT: var _ = require('lodash.sjs'); -- Error running 
JavaScript request: XDMP-MODNOTFOUND: Module lodash.sjs not found

To fix this, I sent it with the URI "/lodash.sjs" and used it with 
require("/lodash.sjs").

Note: I used contentType: "application/vnd.marklogic-javascript" when sending 
lodash.sjs to the server, and I used the Node.js client API's 
modulesDb.documents.write instead of the more specialized 
db.config.extlibs.write because I couldn't get the transform's require 
statement to work with the latter. Plus, the former feels like it gives more 
flexibility without having to learn a special set of write and read calls. 
Maybe my perspective will change on this with time.
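
Putting those steps together, a minimal sketch of the descriptor I ended up writing might look like the following. The helper name is hypothetical; the .sjs rename, the absolute URI, and the content type come from the steps above, and the final documents.write call is shown only as a comment since it needs a live connection.

```javascript
// Hypothetical helper: builds a write descriptor for a library module.
function moduleDescriptor(fileName, source) {
  // Server-side JavaScript needs the .sjs extension (avoids XDMP-NOEXECUTE).
  const sjsName = fileName.replace(/\.js$/, '.sjs');
  // An absolute URI avoids XDMP-MODNOTFOUND when the module is required.
  const uri = sjsName.startsWith('/') ? sjsName : '/' + sjsName;
  return {
    uri: uri,
    contentType: 'application/vnd.marklogic-javascript',
    content: source
  };
}

const descriptor = moduleDescriptor('lodash.js', '/* library source */');
console.log(descriptor.uri);

// Assumed usage against a client connected to the modules database:
//   modulesDb.documents.write(descriptor);
// and then, in the transform:
//   var _ = require('/lodash.sjs');
```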


Regards,

Will

------------------------------

Message: 2
Date: Mon, 13 Jul 2015 02:55:54 +0000
From: Erik Hennum <[email protected]>
Subject: Re: [MarkLogic Dev General] Can node libraries be installed
        server-side?
To: MarkLogic Developer Discussion <[email protected]>
Message-ID:
        <dfdf2fd50bf5aa42adaf93ff2e3ca185070ea...@exchg10-be01.marklogic.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi, Will:

There are some significant differences between Node.js and MarkLogic as a 
JavaScript runtime environment (even though both make use of v8).

First and foremost, Node.js emphasizes asynchronous IO. As a transactional 
database, MarkLogic emphasizes synchronous IO. You can execute asynchronous 
actions in MarkLogic (via the task server), but when you do an 
xdmp.documentInsert(), the call blocks until the operation succeeds or 
fails.

Stepping back, the tier where you implement an action is not arbitrary. In the 
database, it's best to write short actions (similar to stored procedures) for 
query expansion, query composition, inbound or outbound data transformation, 
and so on. The middle tier is great for information bus operations, business 
logic, and so on.

With that perspective, the libraries that make sense to use as dependencies for 
server-side JavaScript actions are those that finish synchronous actions 
quickly.

For that reason, in this particular case, my guess would be that js-xlsx (the 
core library wrapped by node-xlsx) might be a better fit for server-side 
processing than node-xlsx (which adds asynchronous IO conveniences that would 
not work in the server).

At present, you would need to either modify the mimetypes configuration to 
identify *.js as an extension for server-side JavaScript (so the server knows 
that it's not static JavaScript to send to the client) or rename the library 
extension to sjs.

You could put the library in the modules database as described in:

    http://docs.marklogic.com/guide/rest-dev/extensions#id_55309

Then, require the library in your transform or main module.

The speculations about package management for such dependencies are very 
interesting.

By the way, the server can extract metadata from spreadsheets without 
installing an external library:

    http://docs.marklogic.com/guide/search-dev/binary-document-metadata#id_74790


Hoping that helps,



Erik Hennum

------------------------------

Message: 1
Date: Sun, 12 Jul 2015 22:19:41 -0400
From: Will Lawrence <[email protected]>
Subject: [MarkLogic Dev General] Can node libraries be installed
        server-side?
To: [email protected]
Message-ID:
        <cagehxqseol3dqogobk-t6fze6fx-m8dhylbnlw3lc6t1c0m...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

I tried but couldn't find any examples or guidance for using node libraries
within .sjs files on the MarkLogic server. How could we use, for example,
the npm module 'node-xlsx' in a transform?

It would be great to be able to leverage the power of the npm and node
micro-library ecosystem within .sjs files.

Perhaps there could be an .npmrc file controlled via the MarkLogic admin to
specify whether the server is allowed to talk to registry.npmjs.com, an
enterprise npm registry, or none at all. Then, a REST API could be exposed to
write dependencies to MarkLogic's package.json, which would automatically
do an 'npm install' so that when an .sjs file is installed, it can execute
the line:

```
spreadsheetShredder = require('node-xlsx');
```

Regards,
Will




--
William Lawrence
703-873-7035
_______________________________________________
General mailing list
[email protected]
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general